bmap_blit() is not implemented in Lite-C but in C++, so it can profit from the C++ compiler's optimizations.
Considering that the Lite-C compiler probably does little to no optimization of your code, it's not really a surprise that bmap_blit() is faster than memcpy().

(I assume that Lite-C's memcpy is an actual, non-inlined function.)
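
To make the point a bit more concrete, here is a rough sketch in plain C (not Lite-C) of the two kinds of copy being compared. The names naive_copy and library_copy are made up for illustration, and nothing here claims to show how bmap_blit() or Lite-C's memcpy actually work internally. The first function is what a copy looks like when every byte is moved in a plain loop and the compiler does nothing clever with it; the second just forwards to the standard memcpy, which an optimizing C or C++ compiler is free to inline or replace with wide moves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies one byte per iteration.
   Without compiler optimization this stays a plain byte loop. */
void *naive_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Hypothetical helper: same result, but through the standard memcpy,
   which an optimizing compiler may inline or vectorize. */
void *library_copy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}
```

Both produce the same bytes in the destination; the only difference is how much room the compiler has to speed things up, which is the whole argument above.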

