Using per-pixel diffuse shading means that you'll have to compute the distance from the light source to every pixel in the level, right? Won't it be slow?
It is slower than per-vertex shading, of course, but graphics chips are designed to do exactly this kind of calculation fast as hell. On modern hardware it only becomes a bottleneck if the calculations get very complex.
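To give an idea of how little work that is per pixel, here is a rough sketch of the diffuse term written as plain Python rather than shader code. The function name `diffuse_intensity` and the tuple-based vectors are just made up for illustration, and it assumes the surface normal is already unit length and ignores distance attenuation:

```python
import math

def diffuse_intensity(pixel_pos, normal, light_pos):
    """Lambert diffuse term for one surface point (hypothetical helper).

    Assumes `normal` is already normalized; no attenuation is applied.
    """
    # Vector from the surface point to the light source.
    lx = light_pos[0] - pixel_pos[0]
    ly = light_pos[1] - pixel_pos[1]
    lz = light_pos[2] - pixel_pos[2]

    # This is the distance the question asks about; here it is only
    # needed to normalize the light vector.
    dist = math.sqrt(lx * lx + ly * ly + lz * lz)
    if dist == 0.0:
        return 0.0

    # Cosine of the angle between normal and light direction,
    # clamped so surfaces facing away from the light get no light.
    n_dot_l = (normal[0] * lx + normal[1] * ly + normal[2] * lz) / dist
    return max(0.0, n_dot_l)
```

That is a handful of multiplications, one square root and one division per pixel, and no pixel depends on any other, so the GPU can run a huge number of them at the same time.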
Just a simple example:
It takes many seconds to calculate the Mandelbrot set on my CPU, even at fairly low resolutions. My GPU can do it at more than 60 fps at 1024*1024, with so many iterations that the floating point precision isn't enough anymore.
It is probably more than a million times faster in that case.
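For reference, the per-pixel work for the Mandelbrot set looks roughly like this. It is only a CPU-side sketch in Python, not the code I actually ran; the names `escape_time` and `render` and the chosen view window (x from -2 to 1, y from -1.5 to 1.5) are assumptions for illustration:

```python
def escape_time(cx, cy, max_iter):
    """Number of iterations until z = z*z + c leaves the radius-2 circle."""
    zx = zy = 0.0
    for i in range(max_iter):
        if zx * zx + zy * zy > 4.0:
            return i
        # Complex square plus c, written out in real arithmetic.
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
    return max_iter  # treated as "inside the set"

def render(width, height, max_iter):
    """Escape time for every pixel, returned as a list of rows."""
    rows = []
    for py in range(height):
        cy = -1.5 + 3.0 * (py + 0.5) / height
        row = []
        for px in range(width):
            cx = -2.0 + 3.0 * (px + 0.5) / width
            row.append(escape_time(cx, cy, max_iter))
        rows.append(row)
    return rows
```

On the CPU the two loops in `render` visit the pixels one after another. In a fragment shader the body of `escape_time` is the whole program, and the hardware runs it for many pixels in parallel, which is where the speedup comes from.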