Excellent testing technique.
But I'm sorry.... my bad. I only skimmed over your get-heights,
so I didn't see the ent_getvertex calls in there.
They are a well-known SERIOUS speed issue when called on a pixel-by-pixel basis.
As you are certainly aware, when you just call ent_getvertex(?,?,xx) you get
just ONE vertex worth of data. Obviously..
But were you aware, if you call ent_getvertex(?,?,1) then ent_setvertex(?,?,1),
it gives you access to ALL the vertices, without any more ent_getvertex calls?
(At least until the next wait(), and then the buffer gets flushed away.)
I know your code, to my unfamiliar eye, doesn't look friendly to the idea,
but is it PLAUSIBLE to fit the following concept into a SINGLE frame...?
[because if you hit any wait(), you need to re-call ent_getvertex(?,?,1) & ent_setvertex(?,?,1)]
1> Call ent_getvertex(?,?,1) then ent_setvertex(?,?,1) to access the vertex buffer.
2> Set all your heights DIRECTLY into this buffer. Do an ent_setvertex to lock it in.
[note: even though you did an ent_setvertex(?,?,1) to save, the buffer is STILL THERE until the next wait()]
3> Paint your skin using the data from the buffer rather than individual ent_getvertex calls.
Does that sound within the realms of possibility?
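To show the shape of what I mean, here is a rough sketch in plain C. It is NOT engine code: Vert, mesh, fetch_vertex, lock_buffer, height_at, shade and skin are all made-up stand-ins, and the real ent_getvertex/ent_setvertex calls are deliberately left out. All it demonstrates is the difference between one call per vertex and grabbing the buffer once per frame.

```c
#include <stddef.h>

#define GRID 4                       /* hypothetical tiny terrain: GRID x GRID vertices */
#define NUM_VERTS (GRID * GRID)

typedef struct { float height; } Vert;    /* stand-in for one engine vertex */

static Vert mesh[NUM_VERTS];              /* stand-in for the entity's vertex data */
static unsigned char skin[NUM_VERTS];     /* stand-in skin: one shade per vertex */
static int fetch_calls = 0;               /* counts simulated engine calls */

/* Slow path: stand-in for one engine call per vertex. */
static Vert fetch_vertex(int index)
{
    fetch_calls++;
    return mesh[index];
}

/* Fast path: stand-in for step 1, getting at the whole buffer with one call. */
static Vert *lock_buffer(void)
{
    fetch_calls++;
    return mesh;
}

/* Hypothetical height source and height-to-shade mapping. */
static float height_at(int i) { return (float)(i % GRID) * 0.5f; }
static unsigned char shade(float h) { return (unsigned char)(h * 100.0f); }

/* Per-vertex version: every skin pixel costs one simulated engine call. */
static int paint_per_vertex(void)
{
    int i;
    fetch_calls = 0;
    for (i = 0; i < NUM_VERTS; i++)
        mesh[i].height = height_at(i);
    for (i = 0; i < NUM_VERTS; i++)
        skin[i] = shade(fetch_vertex(i).height);
    return fetch_calls;
}

/* Buffered version: steps 1 to 3, all inside one frame (no wait() in between). */
static int paint_buffered(void)
{
    int i;
    Vert *buf;
    fetch_calls = 0;
    buf = lock_buffer();                    /* step 1: one call for the whole buffer */
    for (i = 0; i < NUM_VERTS; i++)
        buf[i].height = height_at(i);       /* step 2: write heights directly */
    for (i = 0; i < NUM_VERTS; i++)
        skin[i] = shade(buf[i].height);     /* step 3: paint from the buffer */
    return fetch_calls;
}
```

With this toy 4x4 grid, paint_per_vertex() makes 16 simulated calls and paint_buffered() makes 1, and both produce the same skin. On a real terrain the gap grows with the vertex count.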
And YES, I am aware that's a LOT of changes...
I am CERTAIN this will give you a serious improvement, but I GUESSTIMATE
it will actually give a HUGE improvement, based on my rough understanding
of the sheer number of times your get_height gets called, which looks large.
BUT, will it be enough? And will the improvement be worth that much re-coding?
I certainly HOPE so.... But only time and MUCH effort will tell...
Best of luck...