At the most fundamental level, everything in a computer is represented using bits and bytes, including the code we write and the data structures in our running programs.
But we usually don’t think of our code as a bunch of bytes (just like we usually don’t think of our fellow human friends as piles of atoms). Instead, we have various sophisticated tools like IDEs and language servers that let us work with the code at higher levels of abstraction.
When it comes to the contents of our program’s memory, however, our tools barely help us beyond the bits and bytes stage. Most of us only have vague notions of what RAM contains. “Variables go on the stack, allocations are on the heap.” Okay, but where exactly? And what does that look like?
To illustrate this discrepancy between code and runtime structures, let’s take a look at what our current tools can do for us:
Instead of working with bits and bytes directly, we have editors that decode the bytes as text and render it as glyphs, laid out on separate lines:
More advanced text editors and IDEs recognize program entities like types and functions, which they may list in an outline:
And to reason about the relationships between program entities, some tools provide features like “go to definition” or “list all references” (the references of the function fib are shown below):
Now, let’s take a look at a typical debugger’s visualization of memory:
Clearly, there's some structure to these values. But what do they mean? Are we looking at the stack, the heap, something else? How do these bytes relate to other bytes somewhere else in RAM?
My goal for this jam was to create a tool to answer these questions, and here are my results:
For context, my application visualizes the memory of the following C program, paused inside the malloc call in the 6th iteration of the for loop:
So, what does the memory look like? Like in the memory visualizer from earlier, we can still see the raw byte values, 16 of them per row.
But hey, look, we found the stack! The brown region is the main function's stack frame. And sure enough, i = 5.
In fact, the way I found the stack was using the jump list, which lists program entities, akin to the "outline" from earlier:
Finally, for the third level of abstraction, entity relationships: Pointer values like List.next can be clicked to jump to the referenced value (similar to "go to definition"). And the inspector shows incoming and outgoing pointers for the selected object (similar to "list all references"). To see these in action, watch the last demo video post from Sunday below.
All in all, I'm very happy with my results. Yes, the UI could look better (the contrast is especially bad, sorry), and yes, it could be faster. But those things weren't the goal.
The goal was to be able to see what memory looks like and to be able to explore it by following pointers. Going into this, I wasn't sure how useful such a tool would be, as the individual data structures would probably be better visualized by other means like lists and graphs. But to my surprise, I found the "annotated hex viewer" approach quite fruitful. I've already improved my mental model of memory. And I'm sure this tool would be invaluable for debugging memory safety issues (for example, you'd see what a dangling pointer was actually pointing to).
But before this tool can be integrated into a debugger, several open problems remain:
- It currently doesn't support unions or "inconsistent" memory mappings, where two pointers of different types point to the same memory location.
- It currently doesn't understand how large dynamic arrays are (the dark green byte down below is the first byte of an array). And, relying on DWARF debug info, it can't understand more complicated memory layouts like joined allocations (the orange Group is the start of a swiss table hashmap buffer; immediately after the Group is an array of Slot<K, V>, the hashmap entries).
- Other languages like Rust, which use a lot of type abstraction, could use some (automatic and/or manual) type simplification, as the resulting type names are not exactly readable:
Alright, that's it from me, thanks for reading and take care ✌️