Did you try moving even further away?
I would expect the engine to test the view frustum against the bounding boxes. Some boxes may still overlap the frustum even when you think they shouldn't, but those overlaps should disappear once you are far enough away.
Shaders shouldn't really affect performance here, since the slow part is usually the pixel shader, which won't be executed if the object isn't visible.
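
To illustrate why a box can still count as "visible" when you think it shouldn't, here is a minimal sketch of a common plane-based frustum/bounding-box test. I don't know what the engine really does internally, so treat this as an assumption. The names `FRUSTUM` and `aabb_maybe_visible` are made up for the example, and the frustum is a simple 90-degree one looking down +z with near = 1 and far = 100.

```python
# Hypothetical frustum: each plane is (nx, ny, nz, d) with the normal pointing
# inwards, so a point p is on the inside when dot(n, p) + d >= 0.
FRUSTUM = [
    (1, 0, 1, 0),     # left:   x + z >= 0
    (-1, 0, 1, 0),    # right: -x + z >= 0
    (0, 1, 1, 0),     # bottom: y + z >= 0
    (0, -1, 1, 0),    # top:   -y + z >= 0
    (0, 0, 1, -1),    # near:   z >= 1
    (0, 0, -1, 100),  # far:    z <= 100
]


def aabb_maybe_visible(box_min, box_max, planes):
    """Return False only if the box lies completely behind one of the planes."""
    for nx, ny, nz, d in planes:
        # Pick the box corner that lies farthest along the plane normal.
        px = box_max[0] if nx >= 0 else box_min[0]
        py = box_max[1] if ny >= 0 else box_min[1]
        pz = box_max[2] if nz >= 0 else box_min[2]
        if nx * px + ny * py + nz * pz + d < 0:
            return False  # even the farthest corner is behind this plane
    # Conservative result: the box may still be just outside the frustum.
    return True


in_front = aabb_maybe_visible((-1, -1, 5), (1, 1, 6), FRUSTUM)
behind_camera = aabb_maybe_visible((-1, -1, -6), (1, 1, -5), FRUSTUM)
# This box sits outside the far-left corner of the frustum, but no single
# plane has the whole box behind it, so the test still reports it as visible.
corner_case = aabb_maybe_visible((-160, -1, 50), (-150, 1, 200), FRUSTUM)
```

The last box is the kind of overlap I mean: it is not in view, but a cheap test like this cannot reject it. Big boxes are more likely to end up in that situation than small ones.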

[quote]
These would mean in a MDL level I need to cut up models into smaller mdls that fit well within a square BBOX?
[/quote]
That's how you do it. The hard part is finding a good balance between effective culling (10k triangles more or less don't really matter for performance, but saving 100k might be worth it) and not having too many objects in your scene (let's say ~3000 in total, although this number is somewhat arbitrary).

I am not sure which shape is used for culling, btw. It might just be the bounding box, but it could also be something bigger like a bounding sphere (which makes the visibility check quite fast) or something else entirely.
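
For comparison, this is roughly what a sphere check could look like. Again, this is only a sketch under my own assumptions, not the engine's actual code: `FRUSTUM`, `bounding_sphere` and `sphere_maybe_visible` are hypothetical names, and the frustum is a simple 90-degree one looking down +z with near = 1 and far = 100.

```python
import math

# Hypothetical frustum: each plane is (nx, ny, nz, d) with the normal pointing
# inwards, so a point p is on the inside when dot(n, p) + d >= 0.
FRUSTUM = [
    (1, 0, 1, 0),     # left
    (-1, 0, 1, 0),    # right
    (0, 1, 1, 0),     # bottom
    (0, -1, 1, 0),    # top
    (0, 0, 1, -1),    # near (z >= 1)
    (0, 0, -1, 100),  # far  (z <= 100)
]


def bounding_sphere(box_min, box_max):
    """Smallest sphere around a box: centre of the box, half its diagonal."""
    center = tuple((lo + hi) / 2 for lo, hi in zip(box_min, box_max))
    diagonal = math.sqrt(sum((hi - lo) ** 2 for lo, hi in zip(box_min, box_max)))
    return center, diagonal / 2


def sphere_maybe_visible(center, radius, planes):
    """Return False only if the sphere lies completely behind one plane."""
    for nx, ny, nz, d in planes:
        length = math.sqrt(nx * nx + ny * ny + nz * nz)
        # Signed distance from the sphere centre to the plane.
        dist = (nx * center[0] + ny * center[1] + nz * center[2] + d) / length
        if dist < -radius:
            return False
    return True


# The same centre just left of the frustum, with two different radii:
small = sphere_maybe_visible((-12, 0, 10), 1.0, FRUSTUM)  # culled
large = sphere_maybe_visible((-12, 0, 10), 2.0, FRUSTUM)  # still "visible"
```

It is one dot product and one comparison per plane, which is why it is fast. The downside is that the sphere around a long, thin model is much bigger than the model itself, so it stays "visible" longer. That's another reason to cut models into roughly square chunks.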