Hi, I have already thought about this one, too. Though when I read the topic's name, I thought it was a typo ^^
Theoretically this is possible, since it is a variant of computer vision, where a common (and very, very basic) challenge is to capture a (coloured) ball on video (or in an image) and then find the position of the ball on the screen, or even better: predict its future position by estimating motion parameters with a simple Kalman filter. So why shouldn't it be possible to do this with computer-generated imagery?
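To make the ball example a bit more concrete, here is a minimal sketch in plain Python. It assumes you already have a binary mask of "ball-coloured" pixels per frame and a constant-velocity motion model. The names (Kalman1D, ball_centroid, track_and_predict) and the noise values are made up for illustration, not taken from any library.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one screen axis.

    State is (position, velocity); only the position is measured.
    """

    def __init__(self, process_var=1e-3, meas_var=1.0):
        self.pos, self.vel = 0.0, 0.0
        # Large initial covariance: we know nothing about the ball yet.
        self.P = [[100.0, 0.0], [0.0, 100.0]]
        self.q = process_var  # assumed process noise, added to the diagonal
        self.r = meas_var     # assumed measurement noise variance

    def predict(self, dt=1.0):
        """Advance the state by dt frames and return the predicted position."""
        self.pos += dt * self.vel
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.pos

    def update(self, measured_pos):
        """Correct the state with a measured position (H = [1, 0])."""
        (a, b), (c, d) = self.P
        residual = measured_pos - self.pos
        s = a + self.r                 # innovation variance
        k_pos, k_vel = a / s, c / s    # Kalman gain
        self.pos += k_pos * residual
        self.vel += k_vel * residual
        # P <- (I - K H) P
        self.P = [[(1 - k_pos) * a, (1 - k_pos) * b],
                  [c - k_vel * a, d - k_vel * b]]


def ball_centroid(mask):
    """Return the (x, y) centroid of truthy pixels in a 2D mask, or None."""
    pts = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not pts:
        return None
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def track_and_predict(centroids):
    """Filter per-frame (x, y) centroids and predict the next position."""
    fx, fy = Kalman1D(), Kalman1D()
    for x, y in centroids:
        fx.predict()
        fx.update(x)
        fy.predict()
        fy.update(y)
    return fx.predict(), fy.predict()
```

Running one filter per axis keeps the matrices at 2x2, so no linear algebra library is needed. For real footage you would tune the two noise values to your own data.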
The problem is that you need a well-specified purpose (what do you want to achieve?) and a well-specified environment (what are you going to see?).
An interesting task could be shooting-zone selection dependent on visible skin and visible wounds - once the AI has selected an opponent to shoot, the AI's view could be evaluated against skin color and a wound overlay. The zone with the largest accessible skin area and the largest amount of wounds "wins".
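A rough sketch of how such a zone vote could look, in plain Python. It assumes the renderer can hand you, for each visible pixel of the target, a zone label plus "is skin" and "is wound" flags. That per-pixel format, the function name pick_zone and the way the two counts are combined into one weighted score are all my assumptions for illustration.

```python
from collections import defaultdict


def pick_zone(pixels, skin_weight=1.0, wound_weight=1.0):
    """Pick the body zone with the highest combined skin + wound score.

    `pixels` is an iterable of (zone, is_skin, is_wound) tuples, one per
    pixel of the target as seen from the AI's viewpoint. Occluded pixels
    are simply absent. Returns (best_zone, scores), or (None, {}) when no
    pixel shows skin or a wound.
    """
    scores = defaultdict(float)
    for zone, is_skin, is_wound in pixels:
        if is_skin:
            scores[zone] += skin_weight
        if is_wound:
            scores[zone] += wound_weight
    if not scores:
        return None, {}
    best = max(scores, key=scores.get)
    return best, dict(scores)
```

With six bare head pixels, twenty armoured torso pixels and nine arm pixels of which five are wounded, the arm would win (score 14 versus 6 for the head), and the torso would not be scored at all. Changing the weights shifts the AI between "go for exposed skin" and "go for existing wounds".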
Another option could be the scenario of the classic Thief game: the AI's view renders the scene similarly to how the player sees it, and difference maps based on lights and shadows are analysed to find likely locations of moving opponents.
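The difference-map idea can be sketched as simple frame differencing on brightness, again in plain Python. Frames here are assumed to be lists of rows of (r, g, b) tuples rendered from the AI's viewpoint. The threshold value and the function names are placeholders I picked for illustration.

```python
def luminance(rgb):
    """Approximate brightness of an (r, g, b) pixel using Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def difference_map(prev_frame, curr_frame, threshold=30.0):
    """Return a 2D boolean map of pixels whose brightness changed noticeably."""
    return [
        [abs(luminance(c) - luminance(p)) > threshold for p, c in zip(prow, crow)]
        for prow, crow in zip(prev_frame, curr_frame)
    ]


def changed_region(diff):
    """Return the bounding box (x_min, y_min, x_max, y_max) of changed pixels.

    Returns None when nothing changed, i.e. the AI "sees" no movement.
    """
    hits = [(x, y) for y, row in enumerate(diff) for x, changed in enumerate(row) if changed]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return min(xs), min(ys), max(xs), max(ys)
```

A shadow moving across a lit wall shows up as a cluster of changed pixels, and the bounding box gives the AI a screen region to investigate. A single box is of course the crudest possible analysis. Several moving things would need some kind of connected-component grouping instead.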
Combined with the global domain knowledge (entities, coordinates, meshes, animation frames, health data, etc.), a believable "Sight AI" can be simulated that is more realistic than an AI that relies entirely on the global domain knowledge and traces.
As for performance... it depends. During my B.Sc. thesis I wrote an app that did several things in real time entirely on the CPU at 24 fps, including, but not limited to, skin color detection, face detection and tracking, fast Fourier transform and Gabor wavelet transform - so what you want to do can be achieved in real time, no doubt, even on the CPU. As for using the GPU... ah, mmm... regular pixel shaders might not be suited for this; I think dedicated GPU programming APIs like CUDA are a better fit. But for experiments, I would advise you to try the CPU first. I worked with OpenCV the whole time, a free computer vision library which is super fast, reliable, stable and feature-rich.
Sadly, I have no time at the moment for free stuff and experiments, except for the 1 or 2 minutes I spend on SSAO on sleepless nights.

Best regards,
-Christian
P.S.: Neural networks are definitely not suited for this task.