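Blurring in several passes, one direction per pass, is almost always faster than doing it all in one pass, and it usually gives a better result too. For a kernel that is N pixels wide, two one-dimensional passes need roughly 2N samples per pixel instead of N*N, so you can afford a much wider blur.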
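To make the pass-per-direction idea concrete, here is a minimal CPU-side sketch in Python rather than real shader code. It assumes a plain box blur with clamped edges on a list-of-lists image. All function names are made up for illustration.

```python
def clamp(v, lo, hi):
    return max(lo, min(v, hi))

def blur_rows(img, radius):
    """One-dimensional box blur along each row, edges clamped."""
    h, w = len(img), len(img[0])
    taps = 2 * radius + 1
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dx in range(-radius, radius + 1):
                total += img[y][clamp(x + dx, 0, w - 1)]
            row.append(total / taps)
        out.append(row)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def blur_two_pass(img, radius):
    """Horizontal pass, then vertical pass (done as a row blur on the transpose)."""
    horizontal = blur_rows(img, radius)
    return transpose(blur_rows(transpose(horizontal), radius))

def blur_single_pass(img, radius):
    """Full two-dimensional box blur in one pass, edges clamped."""
    h, w = len(img), len(img[0])
    taps = (2 * radius + 1) ** 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    total += img[clamp(y + dy, 0, h - 1)][clamp(x + dx, 0, w - 1)]
            row.append(total / taps)
        out.append(row)
    return out

def samples_per_pixel(radius):
    """Samples read per output pixel: (two passes, single pass)."""
    width = 2 * radius + 1
    return 2 * width, width * width
```

Both versions produce the same image up to rounding, but for a radius of 4 the two-pass version reads 18 samples per pixel where the single pass reads 81.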
The if statements in shaders are also something one has to be very careful with, because with a shader model below 3.0 both sides of the branch will always be executed. 3.0 and above supports dynamic branching, which means that the shader kind of compiles to several code paths, one of which is then chosen at runtime. While dynamic branching can be a great thing when used wisely, it can make shader compilation take very long and execution very slow, especially if it is used a lot.
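Here is roughly what "both sides are always executed" means, as a small Python sketch and not real shader code. The shading formulas and function names are invented for the example. `lerp` mirrors the HLSL intrinsic of the same name.

```python
def lerp(a, b, t):
    """Linear blend, like the HLSL intrinsic: a when t is 0, b when t is 1."""
    return a + (b - a) * t

def shade_branching(brightness):
    """Only one side is evaluated, as with a real runtime branch."""
    if brightness >= 0.5:
        return brightness * 2.0 - 1.0
    else:
        return brightness * 0.5

def shade_flattened(brightness):
    """Both sides are evaluated every time; the condition only selects the result."""
    bright_side = brightness * 2.0 - 1.0
    dark_side = brightness * 0.5
    mask = float(brightness >= 0.5)  # 1.0 or 0.0
    return lerp(dark_side, bright_side, mask)
```

The two functions return the same value, but the flattened one always pays for both sides, so an expensive branch costs you even for pixels that never need it.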
Another thing which can be very bad are loops. In most cases it is faster to unroll them, which can basically be done by the compiler, but doing it yourself can offer some more freedom.
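A small Python sketch (not shader code) of what unrolling by hand can look like, and of the kind of freedom it gives. The 5-tap weights and function names are just assumptions for the example.

```python
WEIGHTS = (0.1, 0.2, 0.4, 0.2, 0.1)

def weighted_sum_loop(samples):
    """The loop as you would normally write it."""
    total = 0.0
    for i in range(5):
        total += samples[i] * WEIGHTS[i]
    return total

def weighted_sum_unrolled(samples):
    """Straight unroll: same operations, no loop counter."""
    return (samples[0] * 0.1 + samples[1] * 0.2 + samples[2] * 0.4
            + samples[3] * 0.2 + samples[4] * 0.1)

def weighted_sum_symmetric(samples):
    """Hand-unrolled using the symmetric weights: 3 multiplies instead of 5."""
    return ((samples[0] + samples[4]) * 0.1
            + (samples[1] + samples[3]) * 0.2
            + samples[2] * 0.4)
```

The last version is the sort of rearrangement you can only do once the loop is written out by hand. Its result can differ from the others in the last few bits because the additions happen in a different order.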
When using shaders, there are instruction limits as well as some other limits. I don't see a way to check whether a shader stays within those limits when it uses loops with a varying counter, especially as this check has to happen at compile time.
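One common workaround, which is my addition and not something stated above, is to loop to a fixed maximum and mask out the iterations you don't need, so the size of the loop is known up front. A minimal Python sketch of the idea, where `MAX_TAPS` and the function name are hypothetical:

```python
MAX_TAPS = 8  # fixed upper bound, known before the code runs

def sum_first_n(samples, count):
    """Sum the first `count` samples using a loop with a fixed trip count.

    Every iteration runs; the ones at or past `count` are multiplied by 0.
    """
    total = 0.0
    for i in range(MAX_TAPS):
        mask = float(i < count)  # 1.0 while i < count, else 0.0
        total += samples[i] * mask
    return total
```

The price is that all `MAX_TAPS` iterations are always paid for, and `samples` must hold at least `MAX_TAPS` entries.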
The reason for all these restrictions, I guess, is basically that shaders are meant to run extremely parallel for the best performance possible.