Even if you could get access to the z-buffer (and I think that would be possible from a DLL), it would be useless: you need the depth as seen from the light's position, not the camera's. On top of that, precision would be an issue. I'm already using a floating-point texture and bit depth is still a major problem (huge artifacts on large scenes).
To get that depth, the scene is rendered with a shader that outputs the linearized depth of each pixel. You can use a render event to switch the material to this depth material whenever the current view is the light's view.
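In case "linearized" is unclear: instead of the nonlinear value a hardware z-buffer stores, the depth shader writes a value that scales linearly between the light's near and far planes. A minimal C++ sketch of that math (the function and parameter names are just illustrative, not from any particular engine):

```cpp
#include <cstdio>

// Maps a view-space distance from the light into [0, 1] between the light's
// near and far planes. This is the value the depth shader writes per pixel
// into the floating-point render target.
float LinearizeDepth(float viewSpaceDist, float nearPlane, float farPlane)
{
    return (viewSpaceDist - nearPlane) / (farPlane - nearPlane);
}

int main()
{
    // Example: a point 25 units in front of the light, near = 0.1, far = 100.
    float d = LinearizeDepth(25.0f, 0.1f, 100.0f);
    std::printf("linearized depth = %f\n", d); // ~0.249
    return 0;
}
```

In the actual depth pass the same computation lives in the fragment/pixel shader, with the result written out to the floating-point texture mentioned above.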