
Jam Submission - Git importer for Darke Files

TL;DR:

Download at bottom.
Backup data and then try.
Report bugs; have fun!

Darke Files is the version control / file synchronization system I've been working on for the last few months. I've described it a bit in my introduction post.

The CLI comes with two executables, darke and daf; the latter is a handy shortcut for any command starting with darke files.

I've also built a bare-bones server-side application.


For the jam, I've built a Git importer for Darke Files allowing you to transform a Git repository into a Darke repository.

On the command line you can just do

$ git fast-export --all | darke files import-git

Piping Git's fast-export output directly into the Darke CLI auto-generates dummy Darke identities for your Git committers (see Identity Handling below). To set them manually, first save the fast-export output to a file and let the Darke CLI generate a mailmap from it. Edit that mailmap file by hand, then pass it to the actual import-git call:

$ git fast-export --all > my-project-git-export
$ darke files import-git --generate-mailmap < my-project-git-export > mailmap.txt
$ nano mailmap.txt
$ darke files import-git --use-mailmap mailmap.txt < my-project-git-export

The mailmap file contains three columns: the Darke identity (a string of the form user@host, with host being darke.invalid by default), your Git username, and your Git email address. Edit the first column to match your user name and Darke server host.
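Before running the real import, it can help to check which entries still carry the default host. The following is only a sketch: the column separator of the mailmap is not specified in this post, so the tab-separated layout below is an assumption, and MailmapEntry, parse_mailmap and still_placeholder are hypothetical helper names, not part of the Darke CLI.

```python
from typing import NamedTuple


class MailmapEntry(NamedTuple):
    darke_identity: str  # e.g. "alice@darke.invalid"
    git_name: str
    git_email: str


def parse_mailmap(text: str) -> list[MailmapEntry]:
    """Parse a three-column mailmap.

    ASSUMPTION: columns are tab-separated; the real file format may differ.
    """
    entries = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(parts)}")
        identity, name, email = (p.strip() for p in parts)
        if identity.count("@") != 1:
            raise ValueError(f"line {line_no}: identity must look like user@host")
        entries.append(MailmapEntry(identity, name, email))
    return entries


def still_placeholder(entries: list[MailmapEntry]) -> list[MailmapEntry]:
    """Return entries whose identity still uses the default darke.invalid host."""
    return [e for e in entries if e.darke_identity.endswith("@darke.invalid")]
```

Running still_placeholder over the parsed file gives a quick list of committers that have not been mapped to a real identity yet.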

I've converted my Darke CLI Git repository like this:

At this point, copy the generated .darke directory into a new, empty directory to avoid messing up your Git repo. In there, use the Darke CLI to list all refs in your repository (generated from your Git branches and tags; see below for more):

$ darke files refs

Choose one of them to switch to (checkout in Git terms), which will extract all the files of that state into your directory. You probably have one named master or main:

$ darke files switch main

At this point, a darke files status should tell you that you have a clean repository. For my Darke CLI Git import, it worked like this:

Congratulations, you now have a fully functional Darke repository.

And after pushing the repository to the server, you'll be able to browse it and look at the history in all its glory:

[Screenshot: image.png]


In my introduction post, I identified two features in particular that might not be completely straightforward to implement:

Git's branches and tags vs Darke's refs

A ref in Darke is a pointer to a certain state in your version history. It fulfills the jobs of both Git's branches and tags while being drastically easier to understand. During import, I record the latest commit found for each branch and each tag and then create a ref for every one of them. This does not work if you have a branch and a tag with the same name, but I don't believe that to be a common case. I only consider local branches and tags, since otherwise there would be a lot of name conflicts (so pull them all before creating a fast-export).
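As a rough illustration of that mapping, here is a sketch that folds Git's refs/heads/* and refs/tags/* names into one flat namespace. It is not the importer's actual code: build_refs is a hypothetical name, and raising an error on a branch/tag name clash is a choice made for this example only.

```python
def build_refs(git_refs: dict[str, str]) -> dict[str, str]:
    """Map full Git ref names to short ref names.

    git_refs maps a full Git ref name (e.g. "refs/heads/main") to the
    latest commit id recorded for it. Only local branches and tags are
    considered; everything else (e.g. refs/remotes/*) is ignored.
    """
    prefixes = ("refs/heads/", "refs/tags/")
    refs: dict[str, str] = {}
    origin: dict[str, str] = {}  # short name -> full Git name, for error messages
    for full_name, commit in git_refs.items():
        for prefix in prefixes:
            if full_name.startswith(prefix):
                short = full_name[len(prefix):]
                break
        else:
            continue  # neither a local branch nor a tag
        if short in refs:
            # A branch and a tag share a name: one flat namespace cannot hold both.
            raise ValueError(f"{origin[short]} and {full_name} both map to {short!r}")
        refs[short] = commit
        origin[short] = full_name
    return refs
```

Calling it with {"refs/heads/main": "c3", "refs/tags/v0.1.0": "c2"} yields one ref per name, while a repository with both refs/heads/release and refs/tags/release would be rejected.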

Identity Handling

This was easier than expected. I built the mailmap feature so you can either build that mapping yourself or let Darke auto-generate placeholder identities. There is one bit of data that gets lost in the process: commits are attributed to their Git committer and not their Git author, since the former is guaranteed to exist. Darke doesn't have a way to attach multiple authors to a commit (yet; Git's way of attaching people to a commit is lacking for many use cases and I intend to build something that'll work better).
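To make the committer handling concrete, here is a small sketch that reads a committer line as it appears in a git fast-export stream and derives a placeholder identity from it. The line layout follows Git's fast-import stream format; how Darke actually derives the user part of a placeholder is not described in this post, so placeholder_identity (slugified name, falling back to the email's local part) is purely an assumption for illustration.

```python
import re

# Fast-export committer line: "committer [<name> ]<email-in-angle-brackets> <when> <tz>"
COMMITTER_RE = re.compile(r"^committer (?:(.*?) )?<([^>]*)> (\d+) ([+-]\d{4})$")


def parse_committer(line: str) -> tuple[str, str]:
    """Return (name, email) from a fast-export committer line."""
    match = COMMITTER_RE.match(line)
    if match is None:
        raise ValueError(f"not a committer line: {line!r}")
    name, email, _when, _tz = match.groups()
    return (name or "", email)  # the name part is optional in the stream


def placeholder_identity(name: str, email: str, host: str = "darke.invalid") -> str:
    """Build a user@host placeholder (hypothetical derivation, see lead-in)."""
    source = name or email.split("@", 1)[0]
    user = re.sub(r"[^a-z0-9]+", "-", source.lower()).strip("-") or "unknown"
    return f"{user}@{host}"
```

With this sketch, a committer named "Alice Example" becomes alice-example@darke.invalid, which is the kind of entry you would then replace in the mailmap.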

One thing that came to bug me a lot, however, was something I had not considered previously:

Directory Handling

Git doesn't handle directories at all, only files. (That's why people use these annoying .gitkeep or just .keep files in otherwise empty directories.) Darke was built differently from the ground up: for it, directories are just normal entities to store in a repository. The import now takes the Darke repo directory from the previous commit and transforms it into an internal data structure. It then updates the files in that structure depending on what happened in the Git commit. Finally, it re-hashes the data structure, updates the directories that changed, and passes the root directory hash along to build the Darke commit.
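The load, update, re-hash cycle can be sketched with a nested dictionary standing in for the internal data structure. Everything here is an assumption made for illustration: Darke's real hash function, on-disk format and function names are not described in this post, and SHA-256 over a sorted listing is simply a common way to hash a directory tree.

```python
import hashlib


def _digest(kind: str, payload: bytes) -> str:
    # Prefix the payload with its kind so a file and a directory never collide.
    return hashlib.sha256(kind.encode() + b"\0" + payload).hexdigest()


def insert_file(tree: dict, path: str, content: bytes) -> None:
    """Add or overwrite a file, creating parent directories as needed."""
    *dirs, name = path.split("/")
    node = tree
    for d in dirs:
        node = node.setdefault(d, {})
    node[name] = content


def delete_file(tree: dict, path: str) -> None:
    """Remove a file. This sketch leaves now-empty parent directories in place."""
    *dirs, name = path.split("/")
    node = tree
    for d in dirs:
        node = node[d]
    del node[name]


def hash_tree(tree: dict) -> str:
    """Hash a directory from the sorted names and hashes of its children."""
    lines = []
    for name in sorted(tree):
        child = tree[name]
        if isinstance(child, dict):
            lines.append(f"dir {name} {hash_tree(child)}")
        else:
            lines.append(f"file {name} {_digest('file', child)}")
    return _digest("dir", "\n".join(lines).encode())
```

Because a directory is hashed as its own entry, a tree containing an empty src directory hashes differently from one without it, which is the property Git's file-only model cannot express. A real implementation would also cache the hashes of unchanged subtrees instead of recomputing everything.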


I'm happy with the outcome of the jam. On Sunday it turned into an open-ended task, allowing me to stop and call it done whenever I felt like it. I managed to complete the task I set for myself to an acceptable level without too many surprises coming up.

Concerning programming in general, I've learnt a few tricks about reading stdin and debugging programs that read stdin. I've built a nice timing-tracking system to inspect the longer-running sub-tasks of the import (run with --debug for the output) that I'll include in other parts of the CLI and server. As for Git importing, I wouldn't say I've learned anything new; rather, I've been painfully reminded of some of Git's idiosyncrasies.
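For readers curious what such a timing tracker can look like, here is a minimal sketch. It is not the code used in the Darke CLI; the Timings class and its track and report methods are hypothetical names, and the approach (accumulating wall-clock time per labelled sub-task) is just one common way to do it.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Timings:
    """Accumulate wall-clock time per labelled sub-task."""

    def __init__(self) -> None:
        self.totals = defaultdict(float)  # label -> total seconds
        self.counts = defaultdict(int)    # label -> number of runs

    @contextmanager
    def track(self, label: str):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the time even if the tracked block raises.
            self.totals[label] += time.perf_counter() - start
            self.counts[label] += 1

    def report(self) -> list[str]:
        """Return one line per label, slowest first."""
        ordered = sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
        return [
            f"{label}: {total:.3f}s over {self.counts[label]} call(s)"
            for label, total in ordered
        ]
```

Wrapping each sub-task in a with timings.track("label"): block and printing report() at the end, only when a debug flag is set, gives a quick picture of where the time goes.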

Overall I don't think I've written typical jam code; all this is usable and not code I'll have to rip out as soon as possible. So from my vantage point, this was me doing this for real.

And therefore I've decided to release a very alpha version 0.1.0 now so that all of you can play around with it. Try out the import-git command on a Git repository you have and let me know how it goes. Spin up the server and push some changes there. Play around with the way I've built version control. Check out the quick start guide in each package for a few more pointers on what to do and how stuff works. And please do let me know of bugs you find or anything else you think I should take a second look at.

// IMPORTANT: Keep backups and don't use this for production data.
darke-cli_0.1.0_linux-amd64.zip | darke-cli_0.1.0_windows-amd64.zip
darke-server_0.1.0_linux-amd64.zip | darke-server_0.1.0_windows-amd64.zip

(I might be able to get you a build for your OS and architecture if these won't run on your box. Ping me on Discord.)


Edited by abec on Reason: update title