bmap_blit() is not implemented in Lite-C but in C++, so it can profit from the compiler's optimizations. Considering that the Lite-C compiler probably does little to no optimization of your code, it's not really a surprise that bmap_blit() is faster than memcpy().
(I assume that Lite-C's memcpy() is an actual, non-inlined function.)
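If you want to get a feel for the effect, here is a rough sketch in plain C (not Lite-C, and not the engine's actual code). naive_copy() and compare_copies() are names I made up for illustration: the first is a stand-in for a copy routine built without optimization, the second times it against the C library's memcpy() and checks that both produce the same bytes. Build it without optimization (e.g. -O0), since an optimizing compiler may well turn the loop into a memcpy() call itself.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for an unoptimized copy: a plain byte-by-byte
   loop behind a real (non-inlined) function call. */
void naive_copy(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Copies n bytes (n > 0) once with naive_copy() and once with memcpy().
   Stores the elapsed CPU time in seconds in *naive_s and *lib_s.
   Returns 1 if both destinations match the source, 0 on mismatch or
   allocation failure. */
int compare_copies(size_t n, double *naive_s, double *lib_s)
{
    unsigned char *src = (unsigned char *)malloc(n);
    unsigned char *a = (unsigned char *)malloc(n);
    unsigned char *b = (unsigned char *)malloc(n);
    clock_t t0, t1, t2;
    size_t i;
    int ok = 0;

    if (src && a && b) {
        /* Fill the source with a simple repeating pattern. */
        for (i = 0; i < n; i++)
            src[i] = (unsigned char)(i & 0xFF);

        t0 = clock();
        naive_copy(a, src, n);
        t1 = clock();
        memcpy(b, src, n);
        t2 = clock();

        *naive_s = (double)(t1 - t0) / CLOCKS_PER_SEC;
        *lib_s = (double)(t2 - t1) / CLOCKS_PER_SEC;

        /* Both copies must be byte-identical to the source. */
        ok = (memcmp(a, src, n) == 0) && (memcmp(b, src, n) == 0);
    }

    free(src);
    free(a);
    free(b);
    return ok;
}
```

Call compare_copies() with a buffer of a few hundred megabytes and print the two timings. A single run with clock() is a crude measurement, so treat the numbers as a hint only.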
Shitlord by trade and passion. Graphics programmer at Laminar Research. I write blog posts at feresignum.com