The section I'm talking about starts at 5:20 on Day 3.
I don't see how this helps, considering RGB only needs 24 bits. Apparently, it has something to do with the CPU being faster when memory accesses fall on aligned boundaries. He states that if you are operating on a 32-bit value, then that value should be aligned on a multiple of 4 bytes. I can see that, but if we imagine the bitmap as an array in memory, like so:
| R | G | B | R |
that seems aligned, even without a padding byte, since each element in the array is 8 bits, and so they all start on 8-bit boundaries.
Is he perhaps planning to access the memory as 4-byte unsigned integers, as opposed to chars, in order to set RGB all together?
Aligning pixels on 32-bit (4-byte) boundaries helps. Code usually has a specialized blitter for the 32-bit case that is more efficient than the 24-bit one, because each pixel can be loaded as a uint32, which is a single load from memory. With 24-bit pixels you need to be careful not to load the last pixel with a 32-bit load, because that may access memory that is not available (one byte past the end). Having "uint32" pixels also helps SIMD code, because exactly 4 pixels fit into a 128-bit register.
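To make the difference concrete, here is a rough sketch of filling one row of pixels both ways. The function names, the 0xXXRRGGBB layout, and the B, G, R byte order for the 24-bit case are my own assumptions for illustration, not code from the stream.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 8-bit channels into one 32-bit pixel, assumed layout 0xXXRRGGBB.
   The top byte is the padding byte and is left as zero here. */
static uint32_t pack_pixel(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* 32-bit case: one store per pixel, and every pixel starts on a
   4-byte boundary as long as the row itself does. */
static void fill_row_32(uint32_t *row, int width, uint32_t color)
{
    assert(width >= 0);
    for (int x = 0; x < width; ++x) {
        row[x] = color;
    }
}

/* 24-bit case: three byte stores per pixel. A single 4-byte store would
   overwrite the first byte of the next pixel, and for the last pixel it
   would write one byte past the end of the row. */
static void fill_row_24(uint8_t *row, int width,
                        uint8_t r, uint8_t g, uint8_t b)
{
    assert(width >= 0);
    for (int x = 0; x < width; ++x) {
        row[3 * x + 0] = b; /* assumed B, G, R byte order */
        row[3 * x + 1] = g;
        row[3 * x + 2] = r;
    }
}
```

A 4-pixel row takes 16 bytes in the 32-bit layout and 12 bytes in the 24-bit one, so the padding byte costs memory but buys the simpler, one-store-per-pixel loop.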
Alpha is needed only for the source bitmap, not for the framebuffer. The blending equation used in HH takes the alpha value only from the source bitmap. Technically the framebuffer could be an RGB-only bitmap, but for various reasons (better alignment, and being able to use the same code as for blitting regular bitmaps) the framebuffer is also RGBA.
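As a sketch of what "alpha only from the source" means, here is a plain linear blend of one 32-bit pixel. This is not claimed to be the exact equation used in HH; the 0xAARRGGBB layout and the names blend_channel and blend_pixel are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Linear blend of one 8-bit channel: dest + (src - dest) * alpha / 255. */
static uint8_t blend_channel(uint8_t dest, uint8_t src, uint8_t alpha)
{
    int d = dest, s = src, a = alpha;
    int out = d + ((s - d) * a) / 255;
    assert(out >= 0 && out <= 255); /* result always lies between dest and src */
    return (uint8_t)out;
}

/* Blend a source pixel over a framebuffer pixel, assumed layout 0xAARRGGBB.
   Only the source's top byte is read as alpha; the framebuffer's top byte
   is carried through untouched and never used in the math. */
static uint32_t blend_pixel(uint32_t dest, uint32_t src)
{
    uint8_t a = (uint8_t)(src >> 24);
    uint8_t r = blend_channel((uint8_t)(dest >> 16), (uint8_t)(src >> 16), a);
    uint8_t g = blend_channel((uint8_t)(dest >> 8), (uint8_t)(src >> 8), a);
    uint8_t b = blend_channel((uint8_t)dest, (uint8_t)src, a);
    return (dest & 0xFF000000u) |
           ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}
```

Since the framebuffer's top byte never enters the calculation, it is effectively just padding there, which is why a 24-bit framebuffer would work in principle.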