It actually can work; you just forgot a step in your algorithm. Your code has only one pixel_for_bmap call, but you need two, one for each bitmap. In your inner loop you should:

1. Read the RGB color from bitmap A.
2. Read the alpha value (A) from bitmap B.
3. Combine these values into a single RGBA pixel.
4. Write the combined pixel back to bitmap A (or bitmap B).
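
To make the four steps concrete, here is a rough sketch in plain C. It does not use the engine's bitmap functions; instead it assumes both bitmaps are already available as arrays of 32-bit pixels in 0xAARRGGBB layout, and the names combine_rgba and merge_alpha are made up for this example. The two array reads stand in for the two pixel_for_bmap calls, so adapt the reads and the write to whatever your real pixel format and bitmap calls are.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: each pixel is packed as 0xAARRGGBB (alpha in the top byte). */

/* Step 3: take RGB from one pixel and alpha from another. */
static uint32_t combine_rgba(uint32_t color_pixel, uint32_t alpha_pixel)
{
    uint32_t rgb   = color_pixel & 0x00FFFFFFu; /* keep only R, G, B */
    uint32_t alpha = alpha_pixel & 0xFF000000u; /* keep only A */
    return alpha | rgb;
}

/* Hypothetical helper: merges the alpha channel of b into a, in place.
   Both buffers must hold width * height pixels. */
static void merge_alpha(uint32_t *a, const uint32_t *b,
                        size_t width, size_t height)
{
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            size_t i = y * width + x;
            uint32_t color_pixel = a[i]; /* step 1: read RGB from bitmap A */
            uint32_t alpha_pixel = b[i]; /* step 2: read A from bitmap B */
            /* steps 3 and 4: combine, then write the result back to A */
            a[i] = combine_rgba(color_pixel, alpha_pixel);
        }
    }
}
```

The point is simply that the inner loop does two reads per pixel, one from each bitmap, before the single write. With only one read you only ever have half of the information you need.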

