Hey Everyone,
I wanted to make a post to get some feedback and concrete information about the drawbacks of DLL reloading as Casey does it on Handmade Hero (at least the kind of DLL reloading he did near the beginning; I'm not sure if it has changed much past the first 50 or so episodes). I have adopted this platform layer + application DLL structure for most of my projects and it has mostly been great. However, there are some subtle problems with this setup that I didn't know about beforehand. I'm not sure if Casey explained these on the stream anywhere, but I figured it would be nice to put together a list of things to keep in mind when considering the DLL approach, so people have a better idea of what to avoid or work around.
So far I have run into 4 scenarios where DLL reloading causes problems. If you have used this approach and run into other problems, or if I have misdiagnosed one of these, let me know. For each one I will try to describe how I have worked around it and the scenario where I first ran into it.
1. Function pointers cannot be stored across the DLL reload boundary. The address of a function inside the DLL is not guaranteed to stay the same from one build to the next (and the new DLL may not even load at the same base address), so you cannot keep a pointer to a function inside the application DLL across a reload. This often becomes an issue if you want some sort of "callback" structure, where a task runs over multiple frames and calls a function of your choice when it finishes. If a DLL reload happens during those frames, the callback function pointer becomes invalid. One way to work around this is to re-fill your function pointers whenever the application DLL reloads. Or you can have a lookup table of functions that gets refilled at the top of AppUpdate. Or just avoid function pointers altogether unless they have a lifetime shorter than a single frame.
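Here is a rough sketch of the lookup table idea. All of the names (CallbackId, RefillCallbackTable, PendingTask, TickTask) are made up for illustration and are not from the show. The persistent state stores a small ID instead of a function pointer, and the table that maps IDs to functions is rebuilt by whichever DLL is currently loaded.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; none of this comes from Handmade Hero.
// IDs must keep their numeric values between builds, so only append new ones.
enum CallbackId
{
    Callback_None,
    Callback_OnLoadFinished,
    Callback_OnFadeDone,
    Callback_Count
};

typedef void CallbackFn(int *userValue);

static void OnLoadFinished(int *userValue) { *userValue += 1; }
static void OnFadeDone(int *userValue) { *userValue += 10; }

// A global inside the DLL: it is lost on reload, which is fine because
// the new DLL refills it with its own (possibly moved) function addresses.
static CallbackFn *callbackTable[Callback_Count];

// Call at the top of AppUpdate, or once after every reload.
void RefillCallbackTable()
{
    callbackTable[Callback_None] = nullptr;
    callbackTable[Callback_OnLoadFinished] = OnLoadFinished;
    callbackTable[Callback_OnFadeDone] = OnFadeDone;
}

// Lives in memory that survives the reload. It holds an ID, not a pointer.
struct PendingTask
{
    CallbackId onFinished;
    int framesLeft;
};

// Returns true on the frame the task finishes and its callback runs.
bool TickTask(PendingTask *task, int *userValue)
{
    assert(task->onFinished < Callback_Count);
    if (task->framesLeft > 0 && --task->framesLeft == 0)
    {
        CallbackFn *fn = callbackTable[task->onFinished];
        if (fn) { fn(userValue); }
        return true;
    }
    return false;
}
```

The catch is that the IDs themselves have to stay stable across builds, so reordering the enum during a reload would break things in the same way a layout change does.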
2. Memory allocated using malloc by the DLL cannot be passed to free after a DLL reload. In fact it's probably best to avoid malloc/free altogether in your application DLL, if I'm understanding this correctly. As far as I can tell, each DLL gets its own standard heap, and undefined behavior can result from freeing memory in one DLL that was allocated by another. Whatever the cause, I would always run into problems if I allocated memory, had the application DLL reload, and then tried to free that memory. One way to get around this is to have the platform layer provide allocation and deallocation functions for the application to use. The platform layer's standard heap never changes, so malloc/free is fine to use there. Or you can make sure you always use your own MemoryArenas and preallocated memory regions provided by the platform layer at the beginning of the application (like he does in the show). NOTE: mmozeiko pointed out that this only happens when you have the CRT statically linked. A dynamically linked CRT will solve this problem and also reduce code duplication, since the entire process would only have a single copy of the CRT.
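A minimal sketch of the "platform layer provides the allocator" workaround, assuming the platform hands the application a struct of function pointers when it calls into it. PlatformApi, AppState and the function names are hypothetical, and the live allocation counter is only there to make the example easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical API struct that the platform layer (the EXE) passes to the DLL.
typedef void *PlatformAllocFn(size_t size);
typedef void PlatformFreeFn(void *memory);

struct PlatformApi
{
    PlatformAllocFn *Allocate;
    PlatformFreeFn *Free;
};

// ---- Platform layer side: its heap is the same for the whole run ----
static int platformLiveAllocations = 0; // illustration only

static void *PlatformAllocate(size_t size)
{
    void *memory = malloc(size);
    if (memory) { ++platformLiveAllocations; }
    return memory;
}

static void PlatformFree(void *memory)
{
    if (memory)
    {
        --platformLiveAllocations;
        free(memory);
    }
}

PlatformApi MakePlatformApi()
{
    PlatformApi api = { PlatformAllocate, PlatformFree };
    return api;
}

// ---- Application DLL side: never calls malloc/free directly ----
struct AppState
{
    int *scores;
    size_t scoreCount;
};

void AppCreateScores(PlatformApi *platform, AppState *state, size_t count)
{
    state->scores = (int *)platform->Allocate(count * sizeof(int));
    state->scoreCount = state->scores ? count : 0;
}

void AppDestroyScores(PlatformApi *platform, AppState *state)
{
    platform->Free(state->scores);
    state->scores = nullptr;
    state->scoreCount = 0;
}
```

The pointers in PlatformApi point into the EXE, not the DLL, so they stay valid across a reload as long as the platform layer passes the struct in again.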
3. Static and global variables in the DLL get reset whenever the DLL reloads. This one isn't normally a problem for me, since it makes sense that the static variables defined in one application DLL don't persist into the next one. The best way to work around this is to not use static/global variables unless they are for debugging something or you are okay with them being reset on any reload. However, I recently ran into problems when trying to set up Box2D in my application. Their code has a few places that use static variables in their classes, and it cost me a couple of days of debugging to figure out why the application was crashing after a DLL reload. (This is actually the inspiration for writing this post.) So the moral of the story is to be aware of these limitations of DLL reloading, and to check through the code-base of any library/external code you use to make sure you can get around these problems.
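For your own code, the usual way around this is to keep everything that must persist in the memory block the platform layer owns, roughly like the show does. The sketch below uses made-up names (AppMemory, AppState, AppUpdate) and assumes the platform passes the same block in every frame. The one global is just a convenience pointer that gets re-pointed on every call, so losing it on reload is harmless.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical block owned by the platform layer and passed in every frame.
struct AppMemory
{
    void *permanent;
    size_t permanentSize;
    bool initialized;
};

// Everything that must survive a reload lives here, not in globals.
struct AppState
{
    int frameCount;
    float playerX;
};

// Reset to null by every reload, then re-pointed at the top of AppUpdate.
static AppState *app = nullptr;

void AppUpdate(AppMemory *memory)
{
    assert(sizeof(AppState) <= memory->permanentSize);
    app = (AppState *)memory->permanent;

    if (!memory->initialized)
    {
        *app = AppState{};
        memory->initialized = true;
    }

    app->frameCount += 1;
}
```

This doesn't help with third-party code like Box2D that has its own statics, which is why that case took so long to track down.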
4. String literal pointers can't be stored across the DLL reload boundary. If you are using string literals (e.g. "Hello World") in the code, pointers to them should not be stored across multiple frames. This is similar to the function pointer problem: the literal lives inside the DLL, and the next application DLL has no reason to place the same string literal at the same memory location. I often ran into this when I had constant text for buttons or UI elements and felt I didn't need to allocate memory for it in the MemoryArena, since I could just store a pointer to the string literal for each button. The best way to get around this is to copy the string literal into your own MemoryArena and store a pointer into that arena instead of a pointer directly to the string literal.
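Something like the following is what I mean by copying the literal into an arena. This is a simplified arena for illustration (no alignment handling), and MemoryArena, ArenaPush, ArenaPushString and Button here are my own stand-ins rather than the exact code from the show.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simplified stand-in for an arena backed by platform-owned memory.
struct MemoryArena
{
    unsigned char *base;
    size_t size;
    size_t used;
};

void *ArenaPush(MemoryArena *arena, size_t size)
{
    assert(arena->used + size <= arena->size);
    void *result = arena->base + arena->used;
    arena->used += size;
    return result;
}

// Copies the text (including the null terminator) into the arena.
char *ArenaPushString(MemoryArena *arena, const char *text)
{
    size_t length = strlen(text) + 1;
    char *copy = (char *)ArenaPush(arena, length);
    memcpy(copy, text, length);
    return copy;
}

struct Button
{
    const char *label; // points into the arena, never at a literal in the DLL
};

Button MakeButton(MemoryArena *arena, const char *label)
{
    Button button = { ArenaPushString(arena, label) };
    return button;
}
```

Calling MakeButton(&arena, "Play") is fine because the literal is only read during the call; what gets stored is the arena copy.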
Submitted by mmozeiko
5. Threads running code in the currently loaded DLL must be stopped before the current DLL is unloaded and then started again when the new DLL is loaded. This becomes especially hard when external libraries are running their own threads that you don't have as much control over. You can solve this by cycling threads that you own. For external code that runs on multiple threads you might have to move the library into the platform layer so that the library isn't actually part of the application DLL.
6. The virtual function table for C++ classes holds pointers to functions that might move when the DLL is reloaded. If you use virtual functions in C++, every object gets a hidden pointer (__vfptr in MSVC) that is set when the object is constructed (via new or placement new) and points to a table of the overridden functions appropriate for the class you just instantiated. Both that table and the functions it points to live inside the DLL. Because of point 1, objects that survive a reload can easily point to invalid memory where the table used to be, and a crash will happen when you try to call a virtual function on them. One way to avoid this is to not use virtual functions in objects that persist across a reload.
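One alternative to virtual functions that I know is safe across reloads is a plain type tag plus a switch, since the object then contains only data. This is just a sketch with made-up names (EntityType, Entity, UpdateEntity), not a claim about how anyone else solves it.

```cpp
#include <cassert>

// Hypothetical entity with a type tag instead of a vtable pointer.
// As with any persistent enum, keep the tag values stable between builds.
enum EntityType
{
    Entity_Player,
    Entity_Enemy
};

struct Entity
{
    EntityType type;
    float x;
    float speed;
};

// The dispatch lives in code, so a freshly loaded DLL always uses its own
// current function addresses. Nothing inside Entity points into the DLL.
void UpdateEntity(Entity *entity, float dt)
{
    switch (entity->type)
    {
        case Entity_Player:
            entity->x += entity->speed * dt; // moves right
            break;
        case Entity_Enemy:
            entity->x -= entity->speed * dt; // moves left
            break;
        default:
            assert(!"Unknown entity type");
            break;
    }
}
```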
Submitted by ratchetfreak
7. The size and layout of structures has to be the same across the DLL reload so that the memory block passed from one DLL to the next is interpreted in the same manner. This one may be obvious, and I believe it was explained in the show, but I included it here for completeness' sake. To avoid problems you simply have to not add, remove, or reorder members of persistent structures between reloads. If you want to add a variable or change the layout of some memory, you need to restart the program from the beginning.
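If you want some protection against doing this by accident, one cheap guard is to store sizeof(AppState) at the start of the persistent block and compare it after each reload. This is only a sketch under my own assumptions (the names AppState, AppStateInit and AppOnReload are hypothetical, and stateSize must stay the first member), and it only catches size changes, not members being reordered or swapped for something of the same size.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical persistent state. stateSize must stay the first member.
struct AppState
{
    uint32_t stateSize; // sizeof(AppState) as seen by the DLL that wrote it
    int score;
    float playerX;
};

void AppStateInit(AppState *state)
{
    *state = AppState{};
    state->stateSize = (uint32_t)sizeof(AppState);
}

// Reads only the leading size field, so it is safe even if the rest of the
// layout no longer matches what this DLL expects.
bool AppStateIsCompatible(const void *memory)
{
    uint32_t storedSize;
    memcpy(&storedSize, memory, sizeof(storedSize));
    return storedSize == sizeof(AppState);
}

// Call once after each reload. Returns true if the old state was kept,
// false if it had to be thrown away and reinitialized.
bool AppOnReload(void *memory)
{
    if (AppStateIsCompatible(memory))
    {
        return true;
    }
    AppStateInit((AppState *)memory);
    return false;
}
```

Resetting the state on a mismatch is just one option; you could also refuse the reload and keep running the old DLL.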
Let me know what you guys think. Hopefully this can be helpful for anyone getting into, or considering using, the application DLL scheme. Let me know if you have any problems that you have run into in your own projects that I missed and I will add them to this list.
Also I have to be clear that despite these problems the ability to reload my code quickly has saved me many hours of work over the last couple of years and I would highly recommend this style as long as you have a clear idea of what you're getting into.