This article is mirrored on my blog.
I recently released File Helper,
a file organization application I wrote using Cakelisp.
This application had only two external files that were necessary for it
to fully function:
- A font
- An application icon
I packaged File Helper as a separate archive (such as a .tar.gz) for
each of Windows and Linux. Each archive contains the platform executable
as well as a license file, the font, and the icon.
However, wouldn't it be nice if instead I shipped a single executable,
thereby eliminating the extract step?
It might sound trivial, but eliminating that extra step has many
benefits:
- Less technical users won't get confused. Double-clicking an archive
usually opens it in an archive browser rather than extracting it, which
might confuse them and cause them to give up on my product.
- The application has no risk of breaking if the executable is moved.
- The user doesn't have to delete or move the archive after they
extract it.
Bundling files into executables
An executable is just a file format which your operating system
understands. It is essentially a header and a whole bunch of sections
filled with binary data.
Typically, a linker converts a collection of object files into a
single executable. Because executables are containers which can hold
various kinds of data, we can package data only our application
understands in the same container as the application code.
The operating system is fine with this because it only needs to map the
executable into memory and start executing code at a designated entry
point. It is then up to the program to decide how to interpret the
various executable sections.
There are many different file formats for executables. Usually, an
operating system only supports one executable file format. On Windows,
it's the Win32 Portable Executable (PE) format, typically with the
extension .exe. On Linux, it's usually the Executable and Linkable
Format (ELF).
I am only targeting those two platforms, so I can add code to
specifically support those formats when building Cakelisp programs.
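To make the two formats a little more concrete, here is a minimal sketch in C that guesses the format of a file image from its leading bytes. The function and enum names are hypothetical and chosen only for illustration. It relies on two facts about the formats: ELF files start with the bytes 0x7F 'E' 'L' 'F', and PE files start with an "MZ" DOS stub whose field at offset 0x3C points to a "PE\0\0" signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { EXE_UNKNOWN, EXE_ELF, EXE_PE } ExeFormat;

/* Inspect the first bytes of a file image and guess its executable format.
   Hypothetical helper, not part of Cakelisp or File Helper. */
ExeFormat identify_executable(const unsigned char* data, size_t size)
{
    assert(data != NULL);

    /* ELF files begin with the four bytes 0x7F 'E' 'L' 'F'. */
    if (size >= 4 && memcmp(data, "\x7f" "ELF", 4) == 0)
        return EXE_ELF;

    /* PE files begin with a DOS stub ("MZ"). The 32-bit little-endian value
       at offset 0x3C locates the real header, which starts with "PE\0\0". */
    if (size >= 0x40 && data[0] == 'M' && data[1] == 'Z') {
        uint32_t pe_offset = (uint32_t)data[0x3C] |
                             ((uint32_t)data[0x3D] << 8) |
                             ((uint32_t)data[0x3E] << 16) |
                             ((uint32_t)data[0x3F] << 24);
        if (pe_offset <= size - 4 &&
            memcmp(data + pe_offset, "PE\0\0", 4) == 0)
            return EXE_PE;
    }

    return EXE_UNKNOWN;
}
```

Everything after those headers is a set of sections, which is where bundled data ends up living.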
On Windows, data is added to executables via Resource Files. I wrote a
tutorial on how to do that.
On Linux, data can be added by dumping the data to an object file which
defines a couple of symbols. There is a great tutorial
on how to do that.
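As a rough sketch of the Linux side, the C below shows how a program might refer to data that objcopy placed in an object file. It assumes a file named font.ttf (a made-up name) was converted with a command along the lines of `objcopy -I binary -O elf64-x86-64 -B i386:x86-64 font.ttf font.o`; the exact flags depend on your target. objcopy derives the `_binary_..._start` and `_binary_..._end` symbol names from the input path. The `HAVE_FONT_OBJECT` switch and the stand-in data are my own additions so the sketch builds without the generated object.

```c
#include <assert.h>
#include <stddef.h>

/* A bundled file is described by where its bytes begin and end. */
typedef struct {
    const char* start;
    const char* end; /* one past the last byte */
} BundledFile;

/* The size is simply the distance between the two symbols. */
size_t bundled_file_size(BundledFile file)
{
    assert(file.start != NULL && file.end >= file.start);
    return (size_t)(file.end - file.start);
}

#ifdef HAVE_FONT_OBJECT
/* Defined by font.o; the names follow from the input path "font.ttf". */
extern const char _binary_font_ttf_start[];
extern const char _binary_font_ttf_end[];

BundledFile get_font(void)
{
    BundledFile font = {_binary_font_ttf_start, _binary_font_ttf_end};
    return font;
}
#else
/* Stand-in so this sketch builds without the generated object file. */
static const char stand_in_font[] = {'f', 'o', 'n', 't'};

BundledFile get_font(void)
{
    BundledFile font = {stand_in_font, stand_in_font + sizeof stand_in_font};
    return font;
}
#endif
```

When built with `-DHAVE_FONT_OBJECT` and linked against font.o, `get_font` returns the real data with no file I/O at runtime.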
Good and bad ways
Like everything in programming, you'll hear different advice on how to
do this.
The most common alternative method is to convert your data to a C-style
array definition. This has many limitations, and in my opinion should be
avoided:
- Some compilers (MSVC included) limit the number of elements in an
array, which therefore limits the size of the bundled data.
- Your compiler has to do extra processing (tokenization, parsing,
etc.) to that data which it should actually just treat as a giant
binary blob. Extra unnecessary processing means longer build times.
- An extra stage has to be created and compiled as part of your build
system, which adds complexity.
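For reference, this is roughly what the array approach involves. The sketch below is a hypothetical generator (the function name is made up) that turns raw bytes into the text of a C array definition. Every byte becomes several characters of source that the compiler must later tokenize and parse, which is the overhead described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write `size` bytes as a C array definition named `name` into `out`.
   Returns the number of characters written (excluding the terminator),
   or -1 if `capacity` is too small. Hypothetical helper for illustration. */
int write_c_array(char* out, size_t capacity, const char* name,
                  const unsigned char* data, size_t size)
{
    assert(out != NULL && name != NULL && strlen(name) > 0);
    assert(data != NULL && size > 0); /* empty initializers are not valid C */

    int n = snprintf(out, capacity, "const unsigned char %s[] = {", name);
    if (n < 0 || (size_t)n >= capacity)
        return -1;
    size_t used = (size_t)n;

    for (size_t i = 0; i < size; ++i) {
        /* Each byte costs four to six characters of generated source. */
        n = snprintf(out + used, capacity - used, "%s0x%02x",
                     i ? ", " : "", (unsigned)data[i]);
        if (n < 0 || (size_t)n >= capacity - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, capacity - used, "};\n");
    if (n < 0 || (size_t)n >= capacity - used)
        return -1;
    used += (size_t)n;
    return (int)used;
}
```

The generated text then has to be written to a source file and compiled, which is the extra build stage mentioned in the last bullet.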
We are going to proceed with the platform-dependent but much more robust
approach, which is to convert our data to object files without using a
compiler.
Integrated build system
Whether we are on Windows or Linux, we need to process our data file
into some other form in order for the linker to properly understand the
data package. This means adding a step to our build to process the data,
because we want it to automatically stay up-to-date when linked into the
executable.
Cakelisp includes a simple C/C++ build system as well as compile-time
code execution. We need to create a new build step to process our binary
data into object files. In order to do that, we use a compile-time build
hook to execute a function which performs the conversion.
The full code is
The end-user interface is simply:
(bundle-file data-start data-end (const char)
This generates two variables, data-start and data-end, which are
pointers to the symbols associated with our data.
The bundle-file invocation is a macro that adds the data file to a
list. It also generates the variables we can use to refer to the data.
Finally, a compile-time function, convert-all-bundle-files, calls
objcopy (or the Resource Compiler on Windows[^1]) to generate the actual
object file for each bundle-file. It only does this if the data files
have changed or the object files don't already exist.
It also adds each object file to the linker command, so the generated
objects are linked into the executable alongside our code object files.
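The "only if changed or missing" rule can be sketched in a few lines of C. This is not Cakelisp's actual implementation, which isn't shown here. The sketch assumes "changed" means the data file has a newer modification time than the object file, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* What we know about a file on disk. */
typedef struct {
    bool exists;
    time_t modified; /* last modification time; only meaningful if exists */
} FileInfo;

/* Decide whether the object file must be regenerated from the data file. */
bool needs_conversion(FileInfo data_file, FileInfo object_file)
{
    assert(data_file.exists); /* there is nothing to bundle otherwise */

    /* No object file yet: we have to generate one. */
    if (!object_file.exists)
        return true;

    /* Assumption: the data "changed" if it is newer than the object. */
    return difftime(data_file.modified, object_file.modified) > 0.0;
}
```

A real build step would fill in `FileInfo` from the file system before deciding whether to invoke objcopy or the Resource Compiler.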
The convert-all-bundle-files function is integrated into the Cakelisp
build sequence like so:
(add-compile-time-hook-module pre-build convert-all-bundle-files)
This is pretty great: we extended our build system to support bundling
arbitrary data files, all without touching Cakelisp's internals itself.
Not only that, we extended the system in the same language we write our
application code, and within the same invocation---we didn't need to
create some other phase. We were also able to provide the user with an
extremely simple interface to bundling files.
[^1]: On Windows, we need to generate a .rc file with a list of all
the resources that should be compiled into a single object file.
Because Cakelisp allows arbitrary compile-time code execution, we
can easily do this by writing the filenames out to the generated
.rc file, then invoking the Resource Compiler on that file. This
platform-specific step can be completely automated!
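To illustrate the footnote, here is a minimal C sketch of generating the text of such a .rc file. It uses the standard `RCDATA` resource type, which embeds a file as raw data. The struct, the function name, and the resource identifiers in the usage are hypothetical, and the real Cakelisp code may lay the file out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One resource to embed: a made-up identifier plus the file to read. */
typedef struct {
    const char* name; /* e.g. FONT_DATA */
    const char* path; /* e.g. font.ttf; backslashes would need escaping */
} BundleEntry;

/* Write one `NAME RCDATA "path"` line per entry into `out`.
   Returns the number of characters written (excluding the terminator),
   or -1 if `capacity` is too small. */
int write_rc_script(char* out, size_t capacity,
                    const BundleEntry* entries, size_t count)
{
    assert(out != NULL && capacity > 0);
    assert(entries != NULL || count == 0);

    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < count; ++i) {
        assert(strlen(entries[i].name) > 0); /* resources need a name */
        int n = snprintf(out + used, capacity - used, "%s RCDATA \"%s\"\n",
                         entries[i].name, entries[i].path);
        if (n < 0 || (size_t)n >= capacity - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

The resulting text would be saved to disk and passed to the Resource Compiler, which produces the single object file containing every listed resource.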