It actually can work; you just forgot a step in your algorithm. Your code has only one pixel_for_bmap instruction, but you need two of them, one for each bitmap. In your inner loop you should:
1. Read RGB from bitmap A.
2. Read A from bitmap B.
3. Combine these values to RGBA.
4. Write this newly created value to A (or B).
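The four steps above can be sketched in plain C. This is only an illustration under some assumptions, not engine code: the `Bitmap` struct, `read_pixel`, `write_pixel` and `merge_alpha` are hypothetical names, and the pixels are assumed to be packed as 32-bit 0xAARRGGBB values. In your script the two `read_pixel` calls correspond to the two pixel_for_bmap calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an engine bitmap: packed 0xAARRGGBB pixels. */
typedef struct {
    int width;
    int height;
    uint32_t *pixels;
} Bitmap;

/* Hypothetical helper: plays the role of reading one pixel from a bitmap. */
static uint32_t read_pixel(const Bitmap *b, int x, int y) {
    return b->pixels[(size_t)y * (size_t)b->width + (size_t)x];
}

/* Hypothetical helper: plays the role of writing one pixel to a bitmap. */
static void write_pixel(Bitmap *b, int x, int y, uint32_t pixel) {
    b->pixels[(size_t)y * (size_t)b->width + (size_t)x] = pixel;
}

/* Keeps the RGB of `color` and takes the alpha from `alpha_src`.
   Both bitmaps are assumed to have the same dimensions. */
static void merge_alpha(Bitmap *color, const Bitmap *alpha_src) {
    assert(color->width == alpha_src->width);
    assert(color->height == alpha_src->height);

    for (int y = 0; y < color->height; y++) {
        for (int x = 0; x < color->width; x++) {
            /* 1. Read RGB from bitmap A (first read). */
            uint32_t rgb = read_pixel(color, x, y) & 0x00FFFFFFu;
            /* 2. Read alpha from bitmap B (second read). */
            uint32_t alpha = read_pixel(alpha_src, x, y) & 0xFF000000u;
            /* 3. Combine to RGBA, 4. write the result back to A. */
            write_pixel(color, x, y, rgb | alpha);
        }
    }
}
```

The important part is that the loop body reads twice, once per bitmap, before it writes. With a single read you only ever have the RGB or the alpha, never both.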