Linux debugger

Ho there folks. As some of you may already be aware, I have been working for some time on a debugger project for Linux: partly for fun, partly to investigate what can be done if you do not use gdb as a back end, and partly as a potential kick-off point for a serious effort to improve the experience of debugging handmade-type projects on Linux.

I have not settled on a name for the debugger; the working name is "drake", but perhaps something like "owldb" or "piggy" would be more appropriate.

So far I have:
* a very hacky exploration of ELF+DWARF decoding which is somewhat like readelf in function, written from scratch with no libraries beyond the C runtime.
* a basic ptrace-based debugger using SDL2 for graphics, udis86 for disassembly, libelf+libdwarf for binaries and the C runtime.

The debugger has the following functionality at the moment:
* Standalone launch directly on executable (curiously, I've not actually done command-line parsing for this yet)
* Spawn process for debugging
* Single instruction stepping
* Resume/continue debuggee
* Software breakpoints
* Watch pane showing all in-scope variables, and registers
* Disassembly pane (can disassemble from binary or debuggee memory)
* Threads pane with basic stack walking and symbol resolution
* Source pane
* Source file select (can only select source files referred to by the executable at the moment)
* Highlight of breakpoints and currently executing line in source and disassembly panes
* Partial multi-threading support (select "current" thread, resume all threads, single step one thread, threads view lists all threads)
* Support for inspecting information from dynamic libraries (from both the runtime link step and dlopen())
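For anyone wondering how the software breakpoints in that list usually work on x86: the common technique is to overwrite the first byte of the target instruction with INT3 (0xCC) and put the original byte back when the breakpoint is removed. The sketch below shows only that byte patching. The `SoftBreakpoint` struct and function names are made up for illustration and are not taken from the debugger's source; the words are assumed to be read and written with something like PTRACE_PEEKTEXT / PTRACE_POKETEXT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 INT3 opcode: executing it raises SIGTRAP in the debuggee. */
#define INT3_OPCODE 0xCCu

/* Hypothetical record for one software breakpoint (illustrative names). */
typedef struct {
    uint64_t address;    /* debuggee address of the patched instruction */
    uint8_t  saved_byte; /* original first byte of that instruction */
} SoftBreakpoint;

/* Given the word read from the debuggee at bp->address, remember the
   original byte and return the word to write back. x86 is little-endian,
   so the byte at the address is the low byte of the word. */
uint64_t breakpoint_arm(SoftBreakpoint *bp, uint64_t word)
{
    assert(bp != NULL);
    bp->saved_byte = (uint8_t)(word & 0xFFu);
    return (word & ~(uint64_t)0xFF) | INT3_OPCODE;
}

/* Return the word with the original byte restored in place of INT3. */
uint64_t breakpoint_disarm(const SoftBreakpoint *bp, uint64_t word)
{
    assert(bp != NULL);
    return (word & ~(uint64_t)0xFF) | bp->saved_byte;
}
```

After the trap is hit, the debuggee's instruction pointer sits one byte past the breakpoint address, so a debugger using this scheme would typically rewind it by one before resuming.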

I'm currently working on implementing step over/step in to next line (currently steps are per-instruction only).

Other functionality I want to implement (some mundane and some more aspirational):
* Watch specific variables, memory locations and expressions
* More stepping types and targetable step-into
* Thread controls (e.g. step and resume single/multiple threads, freeze and thaw threads)
* Memory explorer
* Hardware breakpoints (mainly to allow data breakpoints, and probably for performance wins on some operations)
* Conditional breakpoints
* Call debuggee functions
* C lexer/parser to enhance the debug information and provide better feedback on the UI (debug info doesn't tell you everything about how to tie things to source code, and generally seems to work on a line rather than sub-line granularity)
* Integration with an editor (4coder if mr4thdimention is willing)
* Data observation breakpoints (collect values of variables at certain points, tabulate and visualize)
* Customisation via a C layer (explore the possibilities here, like custom visualisations, render debug info to game (or game to debugger controlled surface), understand more about user types/data structures and systems).
* Deterministic record (to provide reversible debugging and improved bug recreation)

I'm sure there's a bunch of things I've forgotten from that list.

I'm not really at my first milestone yet, which is to run the Linux version of HandmadeHero and perform some real debugging task on it (perhaps reenact solving some bug from the stream).
Most excellent! I look forward to demonstration. :) Which distributions are you planning to support?
I don't think I'm relying on anything that I'd expect to differ between distros. I think there may be a minimum kernel level of 2.6 for some of the features though.

At the moment I've been solely testing against gcc with default debug info (-g option), but I want to support clang as well (I read an article implying the debug information is not so good for clang, though that may be outdated).

Which reminds me, add debug info for macros to the list of things I want to support.

I'm kind of interested to know what efficiencies could be gained by closer links to the compiler and editor as well.
It's great that you're making steady progress on your debugger! Here are my two cents:

Clang won't give you macro information and it also won't tell you which symbols were defined at compilation time, according to my explorations and also to this link:

That means that you not only need a preprocessor, but also need to integrate the debugger in the building process in order to provide some simple functionality, like highlighting for #ifdef blocks.

The advantage clang's debug info has over gcc is that it reports column numbers, so you get sub-line resolution. However, it's not perfect, see:
Quick update:

I finished the first pass on step-over-line functionality. Which is to say it functions correctly for simple scenarios.
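For readers curious what step-over-line involves, one common approach is to keep single-stepping (or running to a return address) and to check after each stop whether the thread has reached a different source line in the original frame or an outer one. The sketch below covers only that stop decision. It is a guess at the general shape, not the code from this debugger; `StepPosition` and `step_over_should_stop` are hypothetical names, and it assumes a downward-growing stack as on x86-64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of where a thread is (illustrative names). */
typedef struct {
    uint64_t frame_base; /* canonical frame address of the current frame */
    uint32_t line;       /* source line for the current pc, 0 if unknown */
} StepPosition;

/* Decide whether a "step over line" that began at `start` should stop at
   `now`. With a downward-growing stack, a callee's frame base is
   numerically lower than its caller's. */
bool step_over_should_stop(const StepPosition *start, const StepPosition *now)
{
    assert(start != NULL && now != NULL);
    if (now->frame_base < start->frame_base)
        return false;                /* inside a callee: keep going */
    if (now->line == 0)
        return false;                /* no line info here: keep going */
    if (now->frame_base > start->frame_base)
        return true;                 /* returned to a caller */
    return now->line != start->line; /* same frame: stop on a new line */
}
```

Comparing frame bases rather than just line numbers keeps a recursive call that lands on the same line from being mistaken for progress in the original frame.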

I got distracted from stepping when I spotted that I was being dumb and not using all the stack unwind information that was available -- libraries often have unwind info even if they don't have debug information. I've been fixing that oversight (and doing some cleanup), but it has revealed that loading all the unwind information up front is total overkill, since even quite small programs can easily link to large libraries (like ld-linux and libc) with lots of unwind information. As a result, the memory usage in the BinaryFileArena increased from ~3MB to ~25MB on one of my test programs, and start-up time from ~1 second to ~10 seconds.

Next I'm going to look at implementing a more intelligent cache for unwind information to improve the memory usage and start-up time.
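One possible shape for such a cache is to parse unwind information per module and only when a program counter first lands inside that module. The sketch below is purely illustrative: `UnwindModule`, `unwind_module_load` and `unwind_cache_lookup` are hypothetical names, and the loader is a stand-in for the expensive step (for example parsing a module's .eh_frame section).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-module record; every name here is illustrative. */
typedef struct {
    uint64_t lo, hi;     /* address range [lo, hi) the module is mapped at */
    bool     loaded;     /* has unwind info been parsed for this module? */
    int      load_count; /* how many times the (expensive) parse ran */
} UnwindModule;

/* Stand-in for the expensive step of parsing one module's unwind info. */
static void unwind_module_load(UnwindModule *m)
{
    m->loaded = true;
    m->load_count++;
}

/* Return the module covering pc, parsing its unwind info on first use
   only. Returns NULL if no module covers pc. */
UnwindModule *unwind_cache_lookup(UnwindModule *modules, size_t count,
                                  uint64_t pc)
{
    assert(modules != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        UnwindModule *m = &modules[i];
        if (pc >= m->lo && pc < m->hi) {
            if (!m->loaded)
                unwind_module_load(m);
            return m;
        }
    }
    return NULL;
}
```

With this layout, a library that never shows up in a stack trace never pays the parsing cost, which is the point of deferring the work.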

EDIT: Some corrections on the numbers (blame a bad memory)

I got to see a live demo of this at HandmadeCon this year.
fierydrake, thanks for sharing! It was super motivational to see the work you've done. I've been interested in building a Linux debugger for a couple of years now but really lacked any kind of ground to stand on. Thanks to the various discussions with you, I now feel confident enough that I can start to explore the space on my own.

Also great work on the debugger UI itself. Just building the application user interface itself is a large chunk of work and you've got a very nice looking one already.
It was great to meet you Aaron, I look forward to seeing what you make!

Just to update:
* Fixed slow start-up time; it was due to a mismatch between the libdwarf API and how I wanted to access the register unwind information
* Implemented step-return and at the same time collapsed and made a more general stepping function
* Started work on a watch window which kicked off updates to the input system

Next:
* Finish input system updates
* Implement a _simple_ watch system for addresses and variables (full expression evaluation will come later)
* Consolidation and bug fixing passes
* Update library version of debugger API to integrate new changes
* Many tests
Got the keyboard input working much better this evening. Time to deal with mouse input (I have been putting it off until now).
Mouse input is in, and to test it I added scrolling to the source and disassembly panes with the mouse wheel. Feels pretty nice.

Next: Need to restructure my input handling to happen next to rendering for UI elements.
Man, I saw you sitting there working on your debugger during the lunch break on Sunday but never found the time to head over and meet you. I would have loved to see a little demo of the current functionality.

Out of all the projects I've seen spawn from the Handmade Hero community this is definitely the one I look forward to the most. Even if it's half as functional as gdb, if you have a nice user interface and everything is solid then it's a huge step forward in my book. With how annoyed I get with Windows nowadays, I can't wait for the day that I can really develop on a Linux-based platform. The last thing I really lack is a good debugger.

I can't say I know all that much about debugging or disassembly but I wouldn't mind putting a decent amount of time and effort towards this project if it would help. Let me know if you have anything that I might be able to help with!
Thanks Taylor.

Due to a busy period at work and now some problems with my machine at home, I haven't made much progress of late. I had been hoping to get down to serious work over Christmas, but alas the replacement parts I have ordered are almost as unstable as those they replaced.

Nevertheless, I hope to forge ahead just as soon as I get things running smoothly again.
Current status:
Awesome job man!

Looks very cool. Looking forward to trying it out first hand some time!
Progress in the last two weeks:

* Added initial support for:
- multiple compilation units
- non-contiguous code ranges in scopes
- location lists as specifiers of variable location
- C++ constructs (classes, references, etc) -- a lot more still to do on this one
* First pass refactor of DWARF expression evaluator
* Improved handling of invalid pointers when evaluating variable values
* Cleaned up code for:
- calculating unwind information
- type modifiers
* Moved to file offsets as the primary representation of instruction location
* Bug fixes
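To illustrate two of the items above: in DWARF, a scope with non-contiguous code is described by a list of half-open address ranges, and a location list says which location description applies to a variable for each range of program counter values. The sketch below shows the lookups this implies. The types and function names are hypothetical, and `expr_id` is just a placeholder for whatever the debugger uses to refer to a DWARF location expression.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [lo, hi), as in DWARF range lists. */
typedef struct { uint64_t lo, hi; } AddrRange;

/* Hypothetical location-list entry: while pc is inside `range`, the
   variable is described by the expression identified by `expr_id`. */
typedef struct { AddrRange range; int expr_id; } LocEntry;

static bool range_contains(const AddrRange *r, uint64_t pc)
{
    return pc >= r->lo && pc < r->hi;
}

/* A scope made of several disjoint ranges contains pc if any range does. */
bool scope_contains_pc(const AddrRange *ranges, size_t count, uint64_t pc)
{
    assert(ranges != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        if (range_contains(&ranges[i], pc))
            return true;
    return false;
}

/* Return the expr_id that applies at pc, or -1 if the variable has no
   location there (for example, it has been optimised away). */
int loclist_find(const LocEntry *entries, size_t count, uint64_t pc)
{
    assert(entries != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        if (range_contains(&entries[i].range, pc))
            return entries[i].expr_id;
    return -1;
}
```

A watch pane built on this would only list a variable when `scope_contains_pc` is true for its enclosing scope, and would show it as unavailable when `loclist_find` returns -1.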

The result is more robust debugging of HMH code -- can now line-step all the way through UpdateAndRender() without error.