CreateDIBSection + BitBlt vs StretchDIBits

Hey guys, I'm on Episode 4 of Handmade Hero, and I was wondering about Casey's comment that CreateDIBSection + BitBlt is potentially faster than what he ended up using, which was StretchDIBits directly from a buffer we allocated ourselves. Why is that? He didn't go into much depth. From what I understood, one reason is that the memory is managed by the OS, so presumably it can perform some optimizations, like aligning the memory to a particular boundary. It just seems odd to me, because StretchDIBits seems like a much more direct path, so I don't understand why it would be less efficient. Sorry if this is a bad question; I'm quite new to this stuff.

Edited by Draos on
I would be pretty surprised if that were faster than allocating the memory yourself.

But I would say BitBlt should be faster than StretchDIBits, because the former does not resize the image, so it needs to do fewer and simpler operations when it's blitting pixels.
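
To make the "fewer and simpler operations" point concrete, here is a minimal sketch in portable C. It is not how GDI implements BitBlt or StretchDIBits internally; `blit_copy` and `blit_stretch` are hypothetical helper names, and the pixel format (32 bits per pixel, tightly packed rows) and nearest-neighbour sampling are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 1:1 blit: every destination row is a straight copy of a source row.
   Assumes both buffers are width * height 32-bit pixels with no row padding. */
static void blit_copy(uint32_t *dst, const uint32_t *src, int width, int height)
{
    for (int y = 0; y < height; ++y) {
        memcpy(dst + (size_t)y * width,
               src + (size_t)y * width,
               (size_t)width * sizeof(uint32_t));
    }
}

/* Stretching blit (nearest neighbour): every destination pixel needs its
   own source coordinate computed before the pixel can be fetched. */
static void blit_stretch(uint32_t *dst, int dst_w, int dst_h,
                         const uint32_t *src, int src_w, int src_h)
{
    assert(dst_w > 0 && dst_h > 0 && src_w > 0 && src_h > 0);
    for (int y = 0; y < dst_h; ++y) {
        int sy = (int)((int64_t)y * src_h / dst_h);     /* source row */
        for (int x = 0; x < dst_w; ++x) {
            int sx = (int)((int64_t)x * src_w / dst_w); /* source column */
            dst[(size_t)y * dst_w + x] = src[(size_t)sy * src_w + sx];
        }
    }
}
```

The copy path moves whole rows at a time, while the stretch path does index arithmetic per pixel. Whether that difference is measurable in a real GDI call is a separate question.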

Anyway, comparing the performance of BitBlt, the Stretch* functions, or any other GDI function is kind of useless. You can get far better performance if you blit the whole buffer through OpenGL or Direct3D.
I guess that makes sense, but then Windows has a StretchBlt function, which basically does what StretchDIBits does, except that the source bitmap first has to be selected into a DC. So I feel like there has to be some reason for both families of functions to exist. Why would we ever use DCs to blit if we can just blit directly from our own memory?
Because what a DC refers to is not always your own memory. It can be a loaded bitmap, which you load with GDI functions rather than your own code. Or it can be an existing window, and you are copying one portion of it to another. In Windows, a DC is not only about your own managed "image". It can be many things.

Also, when I said I doubt there are performance differences, I meant on modern computers and modern Windows. If you go back 15 or 20 years, it probably mattered whether you were blitting from your own custom DIB section or from an existing DC, because you can create a more "optimized" DC. Basically, if you know the target DC you will be blitting to, you can preconvert all source DCs to the same format, which makes blitting faster when you reuse source DCs multiple times. If you use them only once, like HH does, then this probably does not matter.
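
The preconversion idea can be sketched in portable C as well. This is only an illustration under stated assumptions, not GDI code: the source format (tightly packed 24-bit pixels in B, G, R byte order), the target format (32-bit 0x00RRGGBB), and the helper names `convert_bgr24_to_xrgb32` and `blit_same_format` are all hypothetical. Real DIB rows are padded to 4-byte boundaries, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One-time conversion from the assumed source format (packed B, G, R bytes)
   to the assumed target format (32-bit 0x00RRGGBB).
   Returns a malloc'd buffer the caller must free, or NULL on failure. */
static uint32_t *convert_bgr24_to_xrgb32(const uint8_t *src, size_t pixel_count)
{
    uint32_t *out = malloc(pixel_count * sizeof *out);
    if (!out) {
        return NULL;
    }
    for (size_t i = 0; i < pixel_count; ++i) {
        uint32_t b = src[3 * i + 0];
        uint32_t g = src[3 * i + 1];
        uint32_t r = src[3 * i + 2];
        out[i] = (r << 16) | (g << 8) | b;
    }
    return out;
}

/* Once the source already matches the target format, each blit is a plain
   copy with no per-pixel conversion. */
static void blit_same_format(uint32_t *dst, const uint32_t *src, size_t pixel_count)
{
    memcpy(dst, src, pixel_count * sizeof *dst);
}
```

The conversion cost is paid once, so it only helps when the same source is blitted many times. A buffer that is regenerated every frame and drawn once gains nothing from it.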

Edited by Mārtiņš Možeiko on
that cleared it up for me. thanks a lot man!