Hotloaded Application DLL Drawbacks

Hey Everyone,

I wanted to make a post to get some feedback and concrete information about the drawbacks/problems with DLL reloading like Casey has done on Handmade Hero (at least the kind of DLL reloading he did near the beginning; I'm not sure if it has changed much past the first 50 or so episodes). I have adopted this platform layer + application DLL structure for most of my projects and it has mostly been great. However, there are some subtle problems with this format that I didn't know about beforehand. I'm not sure if Casey explained these on the stream anywhere, but I figured it would be nice to put together a list of things to keep in mind when considering this DLL approach. That way people have a better idea of what kinds of things they should avoid or work around when using this setup.

So far I have run into 4 scenarios where the DLL reloading causes problems. I'd like to also get feedback from anyone here that has used this approach if you have run into any other problems. Or if I have misdiagnosed one of these problems let me know. I will also try to provide some ideas of how I have worked around the problem as well as example scenarios where I first ran into the problem.

1. Function pointers cannot be stored across the DLL reload boundary. Since the exact memory location of a function inside the DLL is not guaranteed by the linker in any way you cannot store a function pointer that points to a function inside the application DLL. This often becomes an issue if you want to use some sort of "callback" structure where some task is being performed over multiple frames and you want it to call a function of your choice when it's finished. If a DLL reload happens during these frames then the callback function pointer will become invalid. One way to work around this is to try and re-fill your function pointers whenever the application DLL reloads. Or you can have a lookup table of functions that gets refilled at the top of AppUpdate. Or just avoid the use of function pointers altogether unless they have a lifetime shorter than a single frame.
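
As a minimal sketch of the lookup-table workaround from point 1: the long-lived task stores a small ID instead of a raw function pointer, and the table that maps IDs to functions is refilled after every reload. All names here (CallbackId, CallbackTable, PendingTask, FillCallbackTable, TickTask) are made up for illustration and are not from Handmade Hero.

```cpp
#include <cassert>

// Stable IDs that mean the same thing in every build of the DLL.
enum CallbackId { Callback_None = 0, Callback_OnLoadFinished, Callback_Count };

typedef void CallbackFunc(int* userValue);

// Rebuilt after every reload, so it never holds stale addresses.
struct CallbackTable { CallbackFunc* funcs[Callback_Count]; };

// Lives in memory that persists across reloads: stores an ID, not a pointer.
struct PendingTask { CallbackId onDone; int framesLeft; };

static void OnLoadFinished(int* userValue) { *userValue += 1; }

// Call at the top of AppUpdate (or right after a reload is detected).
static void FillCallbackTable(CallbackTable* table)
{
    table->funcs[Callback_None] = nullptr;
    table->funcs[Callback_OnLoadFinished] = OnLoadFinished;
}

// Advances the task by one frame. Returns true on the frame it finishes,
// after looking up and calling the callback through the current table.
static bool TickTask(PendingTask* task, const CallbackTable* table, int* userValue)
{
    if (task->framesLeft > 0) { task->framesLeft--; }
    if (task->framesLeft > 0) { return false; }
    assert(task->onDone >= 0 && task->onDone < Callback_Count);
    CallbackFunc* func = table->funcs[task->onDone];
    if (func) { func(userValue); }
    task->onDone = Callback_None; // make sure the callback fires only once
    return true;
}
```

The cost is one extra indirection per call and an enum entry per callback, which seems like a fair trade for tasks that outlive a frame.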

2. Memory allocated using malloc by the DLL cannot be passed to free after a DLL reload. In fact it's probably best to avoid using malloc/free altogether in your application DLL, if I'm understanding this correctly. As far as I can tell, each DLL gets its own standard heap, and undefined behavior can result from freeing memory in one DLL that was allocated by another. Whatever the cause, I would always run into problems if I allocated memory, had the application DLL reload, and then tried to free that memory. One way to get around this is to have the platform layer provide allocation and deallocation functions for the application to use, since the platform layer's standard heap never changes and therefore malloc/free is fine to use in the platform layer. Or you can just make sure you always use your own MemoryArenas and preallocated memory regions provided by the platform layer at the beginning of the application (like he does in the show). NOTE: mmozeiko pointed out that this only happens when you have the CRT statically linked. A dynamically linked CRT will solve this problem and also reduce code duplication, since the entire process would only have a single copy of the CRT.
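
Here is a rough sketch of the "platform layer provides the allocator" workaround from point 2. The PlatformApi struct and the Platform_/App_ function names are hypothetical. The idea is just that malloc/free are only ever called from code compiled into the executable, and the application DLL reaches them through pointers it is handed each frame. Those pointers are safe to hold because the executable itself is never reloaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef void* PlatformAllocateFunc(std::size_t numBytes);
typedef void PlatformFreeFunc(void* memory);

// Handed from the platform layer to the application DLL.
struct PlatformApi
{
    PlatformAllocateFunc* Allocate;
    PlatformFreeFunc* Free;
};

// --- Platform layer (the executable, never reloaded) ---
static void* Platform_Allocate(std::size_t numBytes) { return std::malloc(numBytes); }
static void Platform_Free(void* memory) { std::free(memory); }

static PlatformApi MakePlatformApi()
{
    PlatformApi api = { Platform_Allocate, Platform_Free };
    return api;
}

// --- Application DLL side: never calls malloc/free directly ---
static int* App_MakeCounter(const PlatformApi* platform)
{
    assert(platform->Allocate != nullptr);
    int* counter = (int*)platform->Allocate(sizeof(int));
    if (counter) { *counter = 0; }
    return counter;
}

static void App_DestroyCounter(const PlatformApi* platform, int* counter)
{
    assert(platform->Free != nullptr);
    platform->Free(counter); // always ends up in the executable's heap
}
```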

3. Static and global variables in the DLL get reset whenever the DLL reloads. This one isn't normally a problem for me since it makes a lot of sense that the static variables defined in one application DLL don't persist into the next application DLL. The best way to work around this is to not use static/global variables unless they are for debugging something or you are okay with them getting reset randomly. However, recently I ran into problems when trying to set up Box2D in my application. Their code has a few places that use static variables in their classes and it caused me a couple days of debugging in order to figure out why the application was crashing after a DLL reload. (This is actually the inspiration for writing this post). So the moral of the story is to be aware of these limitations of the DLL reloading format and check through the code-base of any library/external code you use to make sure that you can get around these problems.

4. String literal pointers can't be stored across the DLL reload boundary. If you are using string literals (e.g. "Hello World") in the code, pointers to them should not be stored across multiple frames. This is similar to the function pointer problem: the next application DLL doesn't have any reason to place the same string literal at the same memory location. I often ran into this problem when I had text for buttons or UI elements that was constant, and I felt like I didn't need to allocate memory for it in the MemoryArena since I could just store a pointer to the string literal for each button. The best way to get around this is to copy the string literal into your own MemoryArena and then store the pointer into that arena instead of a pointer directly to the string literal.
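
A small sketch of the copy-into-the-arena fix from point 4. The MemoryArena here is a deliberately tiny stand-in (not the one from the show), and ArenaPush, ArenaCopyString and Button are hypothetical names; it assumes the arena's backing memory comes from the platform layer and so survives reloads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct MemoryArena
{
    unsigned char* base; // backing memory owned by the platform layer
    std::size_t size;
    std::size_t used;
};

// Bump-allocates numBytes, or returns nullptr if the arena is full.
static void* ArenaPush(MemoryArena* arena, std::size_t numBytes)
{
    if (numBytes > arena->size - arena->used) { return nullptr; }
    void* result = arena->base + arena->used;
    arena->used += numBytes;
    return result;
}

// Copies str (including the null terminator) into the arena.
static char* ArenaCopyString(MemoryArena* arena, const char* str)
{
    assert(str != nullptr);
    std::size_t numBytes = std::strlen(str) + 1;
    char* result = (char*)ArenaPush(arena, numBytes);
    if (result) { std::memcpy(result, str, numBytes); }
    return result;
}

struct Button
{
    const char* label; // points into the arena, never at a literal in the DLL
};
```

With this, something like `button.label = ArenaCopyString(&arena, "Play");` stays valid after a reload, whereas `button.label = "Play";` would point into the old DLL's image.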

Submitted by mmozeiko
5. Threads running code in the currently loaded DLL must be stopped before the current DLL is unloaded and then started again when the new DLL is loaded. This becomes especially hard when external libraries are running their own threads that you don't have as much control over. You can solve this by cycling threads that you own. For external code that runs on multiple threads you might have to move the library into the platform layer so that the library isn't actually part of the application DLL.

6. The virtual function table for C++ classes stores pointers to functions that might move when the DLL is reloaded. If a class has virtual functions, every object of that class carries a hidden pointer (shown as __vfptr in the Visual Studio debugger) that is set when the object is constructed with new (or placement new) and points at the table of overridden functions appropriate for the class you just instantiated. That table lives inside the DLL, so because of point 1 an object that survives a reload can easily point to invalid space where the table and its functions used to be, and a crash will happen when you try to call a virtual function on it. One way to avoid this is to not use virtual functions on objects that persist across a reload.

Submitted by ratchetfreak
7. The size and layout of structures has to be the same across the DLL reload so that the memory block passed from one DLL to the next is interpreted in the same manner. This one may be obvious, and I believe it was explained in the show, but I included it here for completeness' sake. To avoid the problem you simply have to not add, remove, or reorder variables in persistent structures between reloads. If you want to add a variable or change the layout of some memory, you just need to restart the program from the beginning.
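
Point 7 can be partly guarded against in code. The following is only a sketch under my own assumptions (AppState, structSize and AppValidateState are invented names): the persistent state records its own sizeof, and after a reload the new DLL compares it against what it expects and reinitializes on a mismatch instead of reading garbage. It only catches changes that alter the struct's size; swapping two same-sized members would slip through, so restarting is still the safe option.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Persistent application state stored in memory owned by the platform layer.
struct AppState
{
    std::size_t structSize; // keep this as the first member in every version
    int score;
    float playerX;
};

// Call at the top of AppUpdate. Returns true if the state had to be
// (re)initialized because it was empty or written by a different layout.
static bool AppValidateState(void* memory, std::size_t memorySize)
{
    assert(memory != nullptr && memorySize >= sizeof(AppState));
    AppState* state = (AppState*)memory;
    if (state->structSize == sizeof(AppState)) { return false; }
    std::memset(state, 0, sizeof(AppState)); // throw away incompatible data
    state->structSize = sizeof(AppState);
    return true;
}
```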

Let me know what you guys think. Hopefully this can be helpful for anyone getting into, or considering using, the application DLL scheme. Let me know if you have any problems that you have run into in your own projects that I missed and I will add them to this list.

Also I have to be clear that despite these problems the ability to reload my code quickly has saved me many hours of work over the last couple of years and I would highly recommend this style as long as you have a clear idea of what you're getting into.


Edited by Taylor Robbins on Reason: Added point 6 and 7
Point 1 is really a subset of point 3. Functions are kind of global "variables", so their address potentially changes when recompiling. Same with point 4: string literals are global variables.

Point 2 is valid only if you use the static C runtime. If you use the dynamic runtime you won't have this problem, as there will be only one heap shared across all the DLLs in your process. That is something you should really be doing anyway, as there will be less code duplication (so less cache pressure).

Extra caution for point 3: C++ virtual tables. They are also global variables, but in a hidden way. So if you use C++ virtual functions in code defined in the DLL, you'll get the same problems as with global variables.


Another disadvantage of hot reloading is that you need to stop all threads whose code is in the DLL (if they are running). Maybe this is not an issue, but if you are using libraries that run some of their code in the background, it could require adding extra code.

Edited by Mārtiņš Možeiko on
I agree that some of these points are symptoms of the same cause, however I think it's easier to understand if I point them out separately like this. I added your note about the static CRT to point 2 and added a 5th point for threads running code in the application DLL. I will add your point about virtual tables, but I think I would need to do some experimentation to fully articulate when you might run into problems with your virtual table getting corrupted. If you have some examples or a better explanation, that would also help.
Something like this won't work. DllFunction is in the DLL that gets reloaded, and the main executable passes it a pointer to sizeof(void*) bytes of initially zeroed memory that persists across reloads:

struct Base
{
  void nonvirtual() {}
  virtual void run() = 0;
};

struct Derived : Base
{
  void run() {}
};

void DllFunction(void* memory)
{
  Base** obj = (Base**)memory;

  // first time create new object
  if (!*obj) *obj = new Derived();

  // this will always work, regardless of dll reloads,
  // because a non-virtual call is bound to the currently loaded code
  (*obj)->nonvirtual();

  // this will work until the dll is reloaded; after that the object's
  // vtable pointer still refers to the old, unloaded dll
  (*obj)->run();
}
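
For comparison, one way to get similar per-type behavior without any vtable pointer stored in persistent memory is a plain type tag plus a switch. This is just an illustrative sketch (Shape, ShapeType and ShapeArea are made-up names, not part of the example above): the object holds only data, and the dispatch happens in whatever version of the code is currently loaded.

```cpp
#include <cassert>

enum ShapeType { Shape_Circle, Shape_Rect };

// Plain data only: nothing in here points into the DLL.
struct Shape
{
    ShapeType type;
    float a; // radius for circles, width for rects
    float b; // unused for circles, height for rects
};

// Dispatches on the tag; the switch is recompiled with every reload,
// so there is no stored function address that can go stale.
static float ShapeArea(const Shape* shape)
{
    switch (shape->type)
    {
        case Shape_Circle: return 3.14159265f * shape->a * shape->a;
        case Shape_Rect:   return shape->a * shape->b;
    }
    assert(!"unknown ShapeType");
    return 0.0f;
}
```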

Perhaps an obvious one but layout for any data that persists across a reload must remain the same.

Such a nice post. Thanks!
Agreed, very nice post. Maybe it could go in the wiki as a reference?
After testing some stuff with virtual functions in classes I think I understand the problem a little bit more. I added point 6 to explain this. Let me know if you feel like this is a decent explanation.

Also added point 7 about consistent memory layout suggested by ratchetfreak.