In practice, however, this detection process is rather efficient, especially if the shader was compiled from HLSL. As a result, NVIDIA's pixel processors do not spend several clocks on vector normalization, as ATI's do (though it is important not to forget the format limitation: FP16).
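To make the cost and the FP16 caveat concrete, here is a minimal numerical sketch. It assumes the usual three-instruction expansion of a normalize (dot product, reciprocal square root, multiply) and emulates half precision by rounding values through Python's `struct` format `e`. The function names are hypothetical, and the code does not model either vendor's hardware or driver; it only illustrates the arithmetic.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a float to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def normalize_three_ops(v):
    """Normalize a 3-vector as three separate shader-style operations."""
    d = v[0] * v[0] + v[1] * v[1] + v[2] * v[2]  # dp3: squared length
    r = 1.0 / math.sqrt(d)                       # rsq: reciprocal square root
    return tuple(c * r for c in v)               # mul: scale each component


def normalize_fp16(v):
    """Same normalization, with inputs and outputs rounded to FP16.

    Assumption for illustration only: the intermediate math runs at
    higher precision and only the operands and results are FP16.
    """
    h = tuple(to_fp16(c) for c in v)
    return tuple(to_fp16(c) for c in normalize_three_ops(h))


if __name__ == "__main__":
    v = (3.0, 0.0, 4.0)
    print("full precision:", normalize_three_ops(v))
    print("FP16 rounded:  ", normalize_fp16(v))
```

In the FP16 variant, components between 0.5 and 1 are spaced 2^-11 apart, so each one can differ from the full-precision result by up to about 0.0005. That is usually acceptable for lighting vectors, but it is the limitation the text warns about.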