memcpy() is not a lite-C function; it's a standard C library function, like malloc() and free(). If bmap_blit is implemented in C++, then it's working at the same level as memcpy(), so of course it's going to be just as fast. That would explain it.

I wonder if all bmap_blit is doing is just a memcpy itself? I find the nearly identical run times for the two different commands to be a little suspicious.
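
For what it's worth, here is roughly what I'd picture if a blit really is just memcpy underneath. This is only a guess written in plain C, not the engine's actual code: blit_rows and its parameters are names I made up, and I'm assuming 32-bit pixels, matching formats on both bitmaps, and no scaling.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, NOT an engine function: copies a w x h block of
   32-bit pixels from src to dst. Pitches are given in pixels per row. */
void blit_rows(uint32_t *dst, int dst_pitch,
               const uint32_t *src, int src_pitch,
               int w, int h)
{
    int y;
    for (y = 0; y < h; y++) {
        /* One memcpy per row, because the two bitmaps may have
           different row lengths. */
        memcpy(dst + (size_t)y * dst_pitch,
               src + (size_t)y * src_pitch,
               (size_t)w * sizeof(uint32_t));
    }
}
```

If the two bitmaps have the same width, the loop collapses into a single memcpy of the whole buffer, which would fit with the run times being nearly identical.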

I'm also still wondering why the manual categorizes the speed of this command as "slow". It's obviously not slow.