We are a community of programmers producing quality software through deeper understanding.

Originally inspired by Casey Muratori's Handmade Hero, we have grown into a thriving community focused on building truly high-quality software. We're not low-level in the typical sense. Instead we realize that to write great software, you need to understand things at more than a surface level.

Modern software is a mess. The status quo needs to change. But we're optimistic that we can change it.

Latest News

Happy New Year, Handmade Network! I hope you all had an amazing holiday season, and I hope you're all looking forward to 2022 as much as I am.

The Handmade movement had a huge year in 2021. The Handmade Seattle conference, hosted by former Handmade Network staff member Abner Coimbre, brought the community together once again, even in the face of the many changing circumstances that 2021 held in store. We on the Handmade Network staff team hosted the Wheel Reinvention Jam, and we had a number of very well-done project submissions. Ben Visness and Asaf Gartner rewrote the entire Handmade Network website backend in preparation for some huge features that are slowly making their way onto the website.

This is, of course, only an incomplete list. The real list of all of the great things that happened for the Handmade movement in 2021 is not enumerable. But let's dig into these.

Handmade Seattle 2021

Abner Coimbre successfully hosted Handmade Seattle for a third year in the face of constantly changing circumstances - local COVID-19 regulations, travel restrictions for international Handmade community members, and many other things I don't know about or understand. Handmade Seattle 2021 hosted 25 different presentations - each a talk, a demo of a Handmade project, or a podcast episode - and the conference happened both live in person and virtually.

I'm not exactly sure how he manages it, but each year Abner has produced a better conference, in the face of COVID-19 restrictions, all while staying independent without any corporate sponsorships. I had the privilege of attending in person, and I was amazed at how smoothly everything went.

I'm here to tell you that the media - talk recordings, demo videos, podcast episodes - from Handmade Seattle 2021 is starting to roll out, so you can enjoy all of the content from the conference for free, even if you missed it originally. To check it out, go here.

Wheel Reinvention Jam

As I mentioned, the Handmade Network staff hosted our "Wheel Reinvention Jam" back in October. This was like a game jam, but instead of being for games, it was all about taking existing "boring" software - the kind of stuff that we regularly depend on as users, like file explorers, word processors, and more - and innovating on it. To put it shortly, the jam was a huge success, and we got a number of amazing project entries.

We on the admin team also did a stream to show off some of our favorite projects. And then, something horrible happened: I lost the recording. Long story short, I downloaded the recording, then unknowingly and permanently deleted it while cleaning off my desktop one day. In other words, I learned a very important lesson about backing up important data on my machine.

Hope was, as far as I was concerned, lost.

But I was mistaken - hope is never lost in a community with a hero protecting it.

Martins Mozeiko, the resident programming wizard in the community, has an automated system set up that records stream VODs so he can enjoy them at his own pace after they air. He had a copy of the stream's recording!!!

So, I'm here to report that - because of Martins - you all have the ability to watch the Wheel Reinvention Jam recap stream. Here it is:

Personal Projects Update & Discord Integration

Ben and Asaf have been hard at work on the admin team's plan for projects on the Handmade Network website. We realized that the #project-showcase channel on the Discord server was so popular because many projects didn't fit the original model of projects on Handmade Network. These projects were smaller, sometimes not intended to be finished, experimental and not well-established, but they nevertheless demonstrated great results that people love to see. We figured that this prevalent force in the community - those working on such projects - should be reflected in the services we offer on the website.

Now, Handmade Network has two tiers of projects: Personal Projects, and Featured Projects. Personal Projects can be created by anyone, and do not require approval by the admin team. They give you a place to host media for your project.

Featured Projects are similar to the old Handmade Network website projects; they are like personal projects, but are upgraded with a number of features (forums, episode guide, etc.), and they are presented as being featured (and thus hand-picked by the admin team).

The most awesome part of all this comes with how media is managed for these projects. For a while now, we've had the Community Showcase section of the website - these are posts from the #project-showcase Discord channel of people showing off work on their projects. We've upgraded this system to also allow associating some of those posts with projects (and other things) through tagging.

On each post, you can add tags. For example: "Here is a screenshot of my awesome work on &hero for the &cooljam." Each word that follows a & symbol refers to one of these tags. Each project gets one tag name that it can use to pull in certain resources. Other things can have tags too (like jam events), which can categorize posts in other ways.

Posts that are tagged will not only go to the regular Community Showcase section of the site, but they'll also be pulled onto their respective project pages (and, yes, a post can be tagged to multiple projects).

Ben is writing a more in-depth post about this. It'll be released shortly, so if you're looking for more details about how these new features work, you'll want to check that out when it's ready.

I am extremely excited for this new upgrade to the site's project features. Go and try it out!

State of the Network Podcast 2021

I've just released a new podcast episode I did with community member Rudy Faile, who interviewed me about the current state of Handmade Network, all of the things in the Handmade movement in 2021, and where we are planning to go in 2022. Check it out here!

Closing Remarks

One final remark I want to make is announcing a temporary hiatus for myself on my side-projects, including Handmade Network. In short, I'm taking at least a few months for self-improvement, to focus on some very important personal matters. Thank you to all of those who have expressed support - I'll be back at some point during this year, and I will keep my finger on the pulse of the community, because I'm so excited for what is yet to come.

That's all for the news, for now. Here's to a prosperous 2022, and let's continue moving the ball forward!

Best wishes, Ryan

Around the Network

Macoy Madson

This article is mirrored on my blog

My linker in development was crashing on free. It was calling one malloc but then freeing with a free associated with a different malloc. This subsequently caused a segmentation fault because the free expected a metadata structure that didn't exist in the other malloc (at least, not at the same size).

This took a lot of sleuthing. I knew it was a problem with my linker/loader, but couldn't step-debug it because my loader doesn't create a program image the way GDB expects.[^1]

This article goes over how I found the issue. It might help you if you are encountering strange things with your work-in-progress linker (ha ha!), or like reading about someone debugging something.[^2]

Finding the problem

The problem manifested in my attempt to statically link[^3] the following simple Cakelisp program:

(add-c-search-directory-module "/home/macoy/musl/include")
(c-import "stdio.h" "stdlib.h")

(defun main (&return int)
  (fprintf stderr "Hello, C runtime!\n")
  (var data (* char) (type-cast (malloc (* (sizeof (type (* char))) 10))
                                (* char)))
  (fprintf stderr "Allocated and got %p!\n" data)
  (set (at 0 data) 0)
  (fprintf stderr "Accessed %p, it's now %d\n" data (at 0 data))
  (free data)
  (fprintf stderr "Freed!\n")
  (return 0))

This program first proved that musl libc was at least partially functional by successfully printing to stderr. However, the program segfaulted in free.

I used a combination of a signal handler and rudimentary stack printing via backtrace.h to discover that I was calling a malloc that didn't match the later free.

I discovered this by noticing that I could successfully set the data returned by malloc without encountering a segmentation fault, so the memory was at least valid.
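For anyone who wants to try this approach in plain C, here is a minimal sketch using glibc's execinfo.h, which provides similar functionality to the backtrace.h interface mentioned above (illustrative only, not the code I actually used):

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

static void on_segv(int sig)
{
	(void)sig;
	void *frames[64];
	int num_frames = backtrace(frames, 64);
	/* backtrace_symbols_fd writes straight to a file descriptor,
	   avoiding calling malloc inside the signal handler */
	backtrace_symbols_fd(frames, num_frames, STDERR_FILENO);
	_exit(1);
}

int main(void)
{
	signal(SIGSEGV, on_segv);
	/* ... run the code under test ... */
	return 0;
}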

I then hacked together a damn simple "interactive debugger", which gets triggered when SIGSEGV is caught by my signal handler[^4]:

;; Very minimal!
(defun-local interactive-debugger ()
  (fprintf stderr "Commands:\n
\tquit\n
\tprint-symbol [symbol-name]\n")
  (var print-symbol-tag-length (const int) (strlen "print-symbol"))
  (while 1
    (fprintf stderr "> ")
    (var input ([] 256 char) (array 0))
    ;; Note: We need to request stdin before running this in a signal handler!
    (fgets input (sizeof input) stdin)
    (cond
      ((or (= 0 (strcmp "quit\n" input))
           (= 0 (strcmp "q\n" input)))
       (break))
      ((= 0 (strncmp "print-symbol " input print-symbol-tag-length))
       (var symbol-name-buffer ([] 128 char) (array 0))
       (strcpy symbol-name-buffer (+ input print-symbol-tag-length 1))
       (set (at (- (strlen symbol-name-buffer) 1) symbol-name-buffer) 0)
       (fprintf stderr "Searching for '%s'\n" symbol-name-buffer)
       ;; This prints where it finds it for us
       (var symbol (* void) (find-symbol-address-in-allocated-sections
                             symbol-name-buffer))))))

The print-symbol command alerted me to the fact that malloc was resolving to the lite_malloc.c implementation, but free was resolving to the mallocng implementation.

I then started looking at lite_malloc.c and untangling the mess.

Let's walk through the issue.

musl's malloc implementation

musl has a "lite" or simple malloc that is defined as a fallback when e.g. mallocng isn't included in your musl build.

It is defined like so:

static void *__simple_malloc(size_t n)
{
    // [Implementation omitted by article author]
}

weak_alias(__simple_malloc, __libc_malloc_impl);

void *__libc_malloc(size_t n)
{
    return __libc_malloc_impl(n);
}

static void *default_malloc(size_t n)
{
    return __libc_malloc_impl(n);
}

weak_alias(default_malloc, malloc);

After reading several pages on this relatively obscure feature, I came to understand what these weak_alias macros accomplish.
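For reference, musl defines the weak_alias macro in its internal headers; paraphrased from musl 1.2.x (the exact header location varies between versions):

/* Declare `new` as a weak symbol that aliases the definition of `old`. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

So weak_alias(default_malloc, malloc) declares malloc as a weak alias for default_malloc.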

If the user defines their own malloc, that creates a strong definition, thereby overriding musl libc's malloc.

If the user does not define their own malloc, references to malloc will be resolved by the linker to default_malloc through its weak alias.

This I deduce is accomplished like so:

  • GCC sees __attribute__((weak, alias("default_malloc"))) on a declaration of malloc (either via #pragma or the direct attribute that the weak_alias macro expands to)
  • GCC generates the code for default_malloc, and puts it in the object file under the ELF symbol names malloc and default_malloc, which we can confirm with nm:
~/Repositories/linker-loader $ nm --defined /home/macoy/Downloads/musl-1.2.3/obj/src/malloc/lite_malloc.lo
0000000000000000 b brk.2119
0000000000000000 D __bump_lockptr
0000000000000000 b cur.2120
0000000000000000 t default_malloc
0000000000000000 b end.2121
0000000000000000 T __libc_malloc
0000000000000000 W __libc_malloc_impl
0000000000000000 b lock
0000000000000000 W malloc
0000000000000000 b mmap_step.2122
0000000000000000 t __simple_malloc

The W denotes a weak symbol. Note that you would be able to reference default_malloc directly, i.e. the alias isn't an override, but in this case default_malloc is marked static, so it will not be exposed to the linker by its true name.

The other weak_alias on __simple_malloc is the one that broke my loader. This alias accomplishes a different goal. In case the user has not defined their own malloc, default_malloc will be called, which references __libc_malloc_impl. The weak alias on __simple_malloc says "If __libc_malloc_impl is not defined, then use __simple_malloc instead."

I believe the intent of this alias is to allow users, when building musl, to choose which malloc implementation musl will use internally and by default. This is corroborated by the configure --help prompt:

Optional packages:
  --with-malloc=...       choose malloc implementation [mallocng]

The problem with my linker

I end up finding a weak definition of malloc, which resolves to calling default_malloc. This is actually the right behavior, since there is no strong definition of malloc - I never override it in the user program.

However, __libc_malloc_impl resolves to the weak __simple_malloc, when it should instead resolve to the strong __libc_malloc_impl provided by musl libc's mallocng implementation:

~/Repositories/linker-loader $ nm --defined /home/macoy/Downloads/musl-1.2.3/obj/src/malloc/mallocng/malloc.lo
0000000000000000 t alloc_slot
0000000000000000 r debruijn32.3106
0000000000000000 t enframe
0000000000000000 t get_stride
0000000000000000 T __libc_malloc_impl
0000000000000000 T __malloc_alloc_meta
0000000000000000 T __malloc_allzerop
0000000000000000 T __malloc_atfork
0000000000000000 B __malloc_context
0000000000000004 C __malloc_lock
0000000000000000 R __malloc_size_classes
0000000000000000 r med_cnt_tab
0000000000000000 t queue
0000000000000000 t rdlock
0000000000000000 t size_to_class
0000000000000000 r small_cnt_tab
0000000000000000 t step_seq
0000000000000000 t wrlock

Here are the relevant lines adjacent to each other, for comparison:

# lite_malloc.lo:
0000000000000000 W __libc_malloc_impl

# malloc.lo:
0000000000000000 T __libc_malloc_impl

W denotes a weak symbol definition while T denotes a strong public/global symbol defined in the text section of the object file.

In the ELF specification (PDF), the issue becomes quite clear (emphasis mine):

When the link editor combines several relocatable object files, it does not allow multiple definitions of STB_GLOBAL symbols with the same name. On the other hand, if a defined global symbol exists, the appearance of a weak symbol with the same name will not cause an error. The link editor honors the global definition and ignores the weak ones. Similarly, if a common symbol exists (i.e., a symbol whose st_shndx field holds SHN_COMMON), the appearance of a weak symbol with the same name will not cause an error. The link editor honors the common definition and ignores the weak ones.
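To see the quoted rule in isolation, here is a minimal standalone sketch mimicking musl's pattern (hypothetical file and function names):

/* default.c - provides greet as a weak alias, like musl's weak_alias */
#include <stdio.h>

static void default_greet(void)
{
	puts("default greet");
}

extern __typeof(default_greet) greet
	__attribute__((__weak__, __alias__("default_greet")));

/* user.c - a strong definition of the same symbol */
#include <stdio.h>

void greet(void)
{
	puts("user greet");
}

/* main.c */
void greet(void);

int main(void)
{
	greet();
	return 0;
}

Linking main.c with only default.c prints "default greet", but linking main.c with both user.c and default.c prints "user greet": the link editor honors the strong (global) definition and ignores the weak one.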

I was resolving __libc_malloc_impl to the weak definition in lite_malloc.lo instead of the strong definition in malloc.lo.

Later, when I try to free, I end up referencing the free I find in malloc/free.lo, which just calls __libc_free, which is only defined in malloc/mallocng/free.c. If instead there was a corresponding __simple_free, I never would have realized that I was calling __simple_malloc.
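In code form, the situation described above looks roughly like this (paraphrased from the behavior described; not quoted verbatim from musl's sources):

/* malloc/free.c, roughly: free simply forwards to __libc_free... */
void __libc_free(void *);

void free(void *p)
{
	__libc_free(p);
}

/* ...and __libc_free is only defined in malloc/mallocng/free.c;
   lite_malloc.c provides no __simple_free counterpart. */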

Of course, I had the TODO item to implement this all properly before I began debugging:

*** TODO Symbol resolution needs to be addressed, especially once I
         load programs that override "weak" functions

I had put it off not knowing whether it would become an issue. It isn't a straightforward implementation, which is why I didn't do it immediately.

Now, after not doing it and seeing why it is important, I have gained a better understanding of how it is supposed to work.
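For illustration, the resolution rule boils down to something like the following sketch (the Definition table and add_definition helper are hypothetical, not my linker's actual code):

#include <elf.h>
#include <stdio.h>
#include <string.h>

typedef struct
{
	const char *name;
	void *address;
	unsigned char binding; /* STB_GLOBAL or STB_WEAK */
} Definition;

/* Returns 0 on success, -1 on a duplicate strong definition. */
static int add_definition(Definition *definitions, int *num_definitions,
                          const char *name, void *address, unsigned char binding)
{
	for (int i = 0; i < *num_definitions; ++i)
	{
		if (strcmp(definitions[i].name, name) != 0)
			continue;
		/* An existing definition, weak or strong, beats a new weak one */
		if (binding == STB_WEAK)
			return 0;
		/* A strong definition overrides a previously recorded weak one */
		if (definitions[i].binding == STB_WEAK)
		{
			definitions[i].address = address;
			definitions[i].binding = binding;
			return 0;
		}
		fprintf(stderr, "duplicate strong definition of '%s'\n", name);
		return -1;
	}
	definitions[*num_definitions] = (Definition){name, address, binding};
	*num_definitions += 1;
	return 0;
}

Had I applied this rule while loading lite_malloc.lo and malloc.lo, the strong __libc_malloc_impl from mallocng would have replaced the weak alias from lite_malloc.c.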

Takeaways

If you do not implement the specification to a T[^5], you may end up debugging tricky things like this without any debugger or a good idea of what's going wrong.

The advantage is that if you persist, you can learn new tools for understanding the problem and investigating the data.

If you're interested in other linker adventures, read about my "linker-loader" project, which talks about why I even bother with all this work.

You can also read the much simpler Know What Your Linker Knows article where I explain how objdump can be useful when debugging link errors. Around the time I wrote that article was when I started learning more about linkers, which are something you don't really have to think too hard about during regular program development.

[^1]: It does not meet the assumptions required by GDB's add-symbol-file command, so I couldn't use GDB even by manually adding objects one by one with specified offsets in memory

[^2]: There must be like, a dozen people in the world that would meet that criteria, right? Right?!

[^3]: I wanted to statically link to musl libc because dynamic linking to e.g. glibc seemed much more complicated, especially because I did not have any dynamic linking support yet in my linker/loader.

[^4]: Yes, I am aware I am calling functions which aren't safe to call in signal handlers. In my case I am not expecting to "ship" this debugger, it's only a means to an end, so it ended up being fine.

[^5]: Why hadn't I? Well, I find implementing things piece-by-piece and testing as I go to result in higher success rates than all-or-nothing pushes. Sometimes it bites you when the missing pieces are essential to the next test. Also, sometimes you're not really sure how to make things until you're halfway down the road making it, because it's unique/experimental and/or you don't fully understand the purpose of the specification.

New blog post: Imports and modules
Christoffer Lernö

When talking about packages / modules, I think it's useful to start with Java. As a language similar to C/C++ but with an import / module system from the beginning, it ended up being very influential.

Importing a namespace or a graph

Interestingly, the import statement in Java doesn't actually import anything. It's a simple namespace folding mechanism, allowing you to use something like java.util.Random as just Random. The fact that you can use a fully qualified name somewhere later in the source code to implicitly use another package means that the imports do not fully define the dependencies of a Java source file.

In Java, given a collection of source files, all must be compiled to determine the actual dependencies. However, we can imagine a different model where the import statements create a dependency graph, starting from the source file that is the main entry point. In this model we may have N source files, but not all are even compiled, since only a subset M is reachable through the import graph.

This latter model allows some extra features. For example, we can build a feature where importing a source file may also implicitly cause a dynamic or static library to be linked. Because only the source code in the graph is compiled, we only get the extra link parameter if the imports reach the source file that carries it.

The disadvantage is that the imports need to have a clear way of finding the additional dependencies. This is typically done with a file hierarchy or strict naming scheme, so that importing foo.bar allows the compiler to easily find the file or files that define that particular module.

Folding the import

For module systems that allow sub modules, so that there's both foo.bar and foo.baz, the problem with verbosity appears: do we really want to type std.io.net.Socket everywhere? I think the general consensus is that this is annoying.

The two common ways to solve this are namespace folding and namespace renaming, but I'm going to present one more which I term namespace shortening.

Namespace folding is the easiest. You import std.io.net and now you can use Socket unqualified. This is how it works in Java. However, we should note that in Java any global or function is actually prefixed with the class name, which means that even when folding the namespace, your globals and "functions" (static methods) end up having a prefix.

To overcome collisions and shortcomings of namespace folding, there's namespace renaming, where the import explicitly renames the module name in the file scope, so std.io.net might become n and you now use n.Socket rather than the fully folded or fully qualified name. The downside is naming this namespace alias. Naming things well is known to be one of the harder things in programming, and it can also add to the confusion if the alias is chosen to be different in different parts of the program, e.g. n.Socket in one file and netio.Socket in another.

A way to address the renaming problem is to recognize that usually only the last namespace element is sufficient to distinguish one function from another, so we can allow an abbreviated namespace, allowing the shortened namespace to be used in place of the full one. With this scheme std.io.net.open_socket(), io.net.open_socket() and net.open_socket() are all valid as long as there is no ambiguity (for example, if an import made foo.net.open_socket() available in the current scope, then net.open_socket() would be ambiguous and a longer path, like io.net.open_socket() would be required). C3 uses this scheme for all globals, functions and macros and it seems successful so far.

Lots of imports

In Java, imports quickly became fairly onerous to write, since using a class foo.bar.Baz often meant also using another class like foo.bar.Bar, and now both needed to be imported. While wildcard imports helped a bit, those would pull in more classes than necessary, and so inspecting the import statements would obfuscate the actual dependencies.

As a workaround, languages like D added the concept of re-exported imports (D calls this feature "public imports"). In our foo.bar.Baz case, Baz could import foo.bar.Bar and re-export it, so that an import of foo.bar.Baz implicitly imports foo.bar.Bar as well. The downside, again, is that it's not possible to see the actual dependencies just by looking at the imports.

A related feature is implicit imports determined by the namespace hierarchy. For example, in Java any source file in the package foo.bar.baz has all the classes of foo.bar implicitly folded into its namespace. This folding goes bottom-up, but not the other way around: while foo.bar.baz.AbcClass sees foo.bar.Baz, Baz can't access foo.bar.baz.AbcClass without an explicit import.

An experiment: no imports

For C3 I wanted to try going completely without imports. This was feasible mainly due to two observations: (1) type names tend to be fairly universally unique, and (2) methods and globals are usually unique with a shortened namespace. So given Foo and foo::some_function(), these should mostly be unique without the need for imports. This is a completely implicit import scheme.

This is complemented by the compiler requiring the programmer to explicitly say which libraries should be used for compilation. So imports could be said to be done globally for the whole program in the build settings.

This certainly works, but it has a drawback: let's say a program relies on a library like Raylib. Raylib will itself create a lot of types and functions, and while it's no problem to resolve them, it could confuse a casual reader ("Oh, a Vector2, is this part of the C3 standard library?"), whereas having an import raylib; at the top would immediately hint to the reader where Vector2 might be found.

Wildcard imports for all?

The problem with zero imports suggests an alternative: wildcard imports as the default. import raylib; would be the standard type of import and would recursively import everything in raylib, and similarly import std; would get the whole standard library. This would be more for the reader of the code to find the dependencies than a necessity for the compiler.

One problem with this design is the sub-module visibility rules: what do foo::bar::baz and foo::bar see?

Java would allow foo::bar::baz to see the foo::bar parent module, but not vice versa. However, looking at the actual usage patterns, it seems to make sense to make this bidirectional, so that all are visible to each other.

But if parent and child modules are visible to each other, what about sibling modules? E.g. does foo::bar::baz see foo::bar::abc? In actual use cases there are arguments both for and against. And if we have sibling visibility, what about foo::def and foo::bar::abc? Could they be visible to each other? And if not, would such rules get complicated?

To create a more practical scenario, imagine that we have the following:

  1. std::io::file::open_filename_for_read() a function to open a file for reading
  2. std::io::Path representing a general path.
  3. std::io::OpenMode a distinct type for a mask value for file or resource opening
  4. std::io::readoptions::READ_ONLY a constant of type OpenMode

Let's say this is the implementation of (1):

fn File* open_filename_for_read(char[] filename)
{
  Path* p = io::path_from_string(filename);
  defer io::path_free(p);
  return file::open_file(p, readoptions::READ_ONLY);
}

Here we see that std::io::file must be able to use std::io and std::io::readoptions. The readoptions sub module needs std::io but not the file sub module. Note how C3 uses functions in sub modules where other languages would typically use static methods. If we want to avoid excessive imports in this case, then file needs sibling and parent visibility, whereas readoptions only requires parent visibility.

Excessive rules around visibility are hard to implement well, hard to test, and hard to remember, so it might be preferable to simply say that a module has visibility into any other module under the same top module. The downside would of course be that visibility is much wider than what's probably desired (e.g. std::math having visibility into std::io).

Conclusions and further research for C3

Like everything in language design, imports and modules involve a lot of trade-offs. Import statements may be used to narrow down the dependency graph, but at the same time a language with a lot of imports doesn't necessarily use them in that manner. For namespace folding, it matters a lot whether functions are usually grouped as static methods or free functions. Imports can be used to implicitly determine things like linking arguments, in which case the actual import graph matters.

For C3, the scheme with implicit imports works thanks to library imports also being restricted by build scripts, but high-level imports could still improve readability. However, such a scheme would probably need recursive imports, which raises the question of implicit imports between sub modules. For C3 in particular this is an important usability concern, as sub modules are used to organize functions and constants more than is common in many other languages. This is the area I'm currently researching, but I hope that within a few weeks I can have a design candidate.


Community Showcase

This is a selection of recent work done by community members. Want to participate? Join us on Discord.