Introducing the Gordon programming language

Hello,

Have never posted here before and just registered yesterday, although I have browsed the site a few times before. Love the handmade manifesto!

This seems like the right place to post about my language, called Gordon. It's a compiled imperative language similar to C. Here are some of its features:

  • module system
  • variadic functions
  • simple struct inheritance
  • simple polymorphism
  • handy expression macros
  • compile time execution

The compiler is 34k lines of code, including blanks and comments, and it's self-hosting (using LLVM). The standard library, on the other hand, has almost nothing in it yet; it's a little over 600 lines. There are currently no known bugs because I have fixed every bug I've run into along the way. Only x86-64/Linux is currently supported. Oh, and there is no website :P

Here is some Gordon code to look at: https://gist.github.com/tjhann/f26dac0b7c99e850c210a56d61193620

One thing I wanted to make sure of is that the compiler doesn't get too tightly coupled to LLVM, which is the primary reason it uses an intermediate representation (IR). The IR makes it fairly easy to write other backends for Gordon, and I hope to eventually have a C backend and be able to bootstrap from C.
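To illustrate the kind of decoupling described here, the following is a rough sketch in C, not Gordon's actual IR. The three-instruction IR, the `Backend` struct, and both emitters are invented for this example. The driver knows only the IR and the backend interface, so adding a backend means adding one function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stack-machine IR; Gordon's real IR is not shown in this thread. */
typedef enum { IR_PUSH, IR_ADD, IR_RET } IrOp;

typedef struct {
    IrOp op;
    int  imm;   /* only meaningful for IR_PUSH */
} IrInst;

/* A backend turns one IR instruction into text and returns what snprintf returns. */
typedef struct {
    const char *name;
    int (*emit)(const IrInst *inst, char *out, size_t cap);
} Backend;

/* Backend 1: emits C-like statements. */
int emit_c(const IrInst *inst, char *out, size_t cap) {
    switch (inst->op) {
    case IR_PUSH: return snprintf(out, cap, "stack[sp++] = %d;\n", inst->imm);
    case IR_ADD:  return snprintf(out, cap, "sp--; stack[sp-1] += stack[sp];\n");
    case IR_RET:  return snprintf(out, cap, "return stack[sp-1];\n");
    }
    return -1;
}

/* Backend 2: emits assembly-like text (illustrative only, not a complete program). */
int emit_asm(const IrInst *inst, char *out, size_t cap) {
    switch (inst->op) {
    case IR_PUSH: return snprintf(out, cap, "push %d\n", inst->imm);
    case IR_ADD:  return snprintf(out, cap, "pop rax\nadd [rsp], rax\n");
    case IR_RET:  return snprintf(out, cap, "pop rax\nret\n");
    }
    return -1;
}

/* Backend-agnostic driver. Returns the number of bytes written, or -1 if out is too small. */
int lower(const Backend *be, const IrInst *code, size_t n, char *out, size_t cap) {
    assert(be != NULL && code != NULL && out != NULL);
    if (cap == 0) return -1;
    out[0] = '\0';
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        int w = be->emit(&code[i], out + used, cap - used);
        if (w < 0 || (size_t)w >= cap - used) return -1;  /* error or truncated */
        used += (size_t)w;
    }
    return (int)used;
}
```

Calling `lower` with `{ "c", emit_c }` or `{ "asm", emit_asm }` on the same `IrInst` array gives two different outputs without touching the driver.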

I might need a lot of help with that, though, because I don't actually have a background in C. I come from D, which clearly shows in the syntax and feel. I always liked the core of D, but D never provided the quality and stability I wanted. I stuck to a very small subset of it, too; I'm not a big fan of metaprogramming.

The compiler and std: https://github.com/tjhann/gordon

Binaries: https://github.com/tjhann/gordon-boot

Edit: I've got a little website now: https://tjhann.github.io/gordon-web/. For new users posting here: you may be shadow banned, in which case nobody else can see your posts. The forums page indicates a post from James, but I can't see the post itself.


Edited by Tero Hänninen on Reason: site, invisible reply

Replicating C++ features is a fine way to learn how to make compilers, but if you want the language to gain popularity in the long run, you need some major advantages over C++ so that people want to make the switch after having used C++ for decades. Many new languages add features on top of C that already exist in C++.

Some of the C++ pitfalls that can be addressed in your language as a start:

  • You can experiment with more original ways of writing. People are often confused by =, == and != when learning C for the first time, because these operations don't follow math convention.
  • Compilation times growing quadratically. I hope that your import keyword is not just a wrapper over header files, which are being phased out in newer versions of C++ to avoid having to parse the same class structure for each compilation unit using them.
  • Linker problems and no standard build system. People are driven to madness by trying to maintain CMake files or their own build systems that are not compatible with other libraries.
  • Unsafe pointers and uninitialized memory. You can make abstractions that have memory protection built into the debug binary to catch more errors with better messages.
  • Theoretically undefined behaviour from two's complement and endianness, which makes developers waste time discussing whether they should support old systems or not.
  • C++ has too many features that do practically the same thing, which increases the time needed to understand other people's code. This is the easiest part to fix: find a powerful subset of features that can solve the same problems with a slightly different approach, ideally in a way that leads to faster and safer code, such as optimizing the use of indices or making pointers safe instead of having iterators.
  • Unmatched parentheses over multiple lines can be a time waster to track down, so a good syntax and parsing strategy can find where the incorrect character is located. In C/C++, the typo might even be from another module because of how headers are just pasted text with no checks about ending parentheses before end of file. Python solved it using white-space sensitivity. Basic solved it by being line-break sensitive.

Then look at the worst coding nightmares people have and try to find a new language feature that can solve many of them without being a do-everything code generator. Most programmers cannot handle macros and code generators safely, but they should still have a language that lets them solve their real-life problems.


Edited by Dawoodoz on

I'm not really replicating C++ features, nor was I thinking about C++ at all while I created this. I simply made Gordon based on my experiences with D and primarily for myself, to support my way of programming without trying to please anybody else. Besides that, it's only my gift to whoever knows how to appreciate it.

One big point about Gordon is that it's small and simple. That is an advantage that I love perhaps most about it.

If I could choose which group of programmers Gordon would attract the most, it would be C coders.

  • I like the usual operators myself and will keep them. There's also less friction for those who come from C/C++/D/etc.
  • Oh no, there's a real module system and each file is parsed just once, imports are not wrappers.
  • Linker problems: please expand on this. I feel that my understanding of linking is still fairly weak and I want to learn more. I've been wanting to read the old "Linkers and Loaders" book for a long time but so far haven't. One thing I've thought of is encoding package versions in mangled names to avoid conflicts. So far it's already possible to pass a custom package name to the compiler to affect mangling.
  • Building: I'd like there to be a standard build tool but so far haven't thought about it much. I know though, that it's going to be a separate program and if people want a unified interface to it and the compiler and possibly other tools, then that interface should be a script/program that commands the compiler, etc. I absolutely detest integrating those tools into the same binary.
  • Pointers & memory: Indexing and slicing are checked in dev builds. Memory is always initialized to zero except if you call malloc or do something like that. No smart pointers unless you count the built in dynamic array type.
  • What do you mean by "theoretically" UB from two's complement and endianness? Integer operations are specified to wrap around in Gordon (like -fwrapv).
  • Feature set: There's not much overlap; one of the main reasons I began Gordon was to have a small and simple language. I'm planning on pretty much sticking to the current feature set and only adding things like support for atomic ops, SIMD and that sort of thing (which I so far have very little experience with).
  • Unmatched parenthesis: Hmm, haven't had much trouble with this myself... I guess the parser could be improved in this regard, but it's not a high priority. Edit: I decided to just do this because it was trivial. Parse error inside parens will include: "note: inside parens opened on line 8".
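On the wrap-around point above: in C, signed overflow is undefined unless the compiler is told otherwise (GCC and Clang accept `-fwrapv`). Below is a minimal sketch, in portable C rather than Gordon, of what "specified to wrap" means for 32-bit integers. The helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a uint32_t as an int32_t. int32_t is required to be
   two's complement with no padding bits, so this is fully defined. */
int32_t bits_to_i32(uint32_t u) {
    int32_t r;
    memcpy(&r, &u, sizeof r);
    return r;
}

/* Wrapping add: do the arithmetic in unsigned, where overflow is defined modulo 2^32. */
int32_t wrapping_add_i32(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    return bits_to_i32(sum);
}

/* Wrapping subtract, same idea. */
int32_t wrapping_sub_i32(int32_t a, int32_t b) {
    uint32_t diff = (uint32_t)a - (uint32_t)b;
    return bits_to_i32(diff);
}

/* Usage: the extremes wrap around instead of invoking undefined behaviour. */
void wrapping_examples(void) {
    assert(wrapping_add_i32(INT32_MAX, 1) == INT32_MIN);
    assert(wrapping_sub_i32(INT32_MIN, 1) == INT32_MAX);
}
```

With `-fwrapv`, plain `a + b` on `int` gives the same results on GCC and Clang. The helpers just make the behaviour independent of compiler flags.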

Yeah, I generally don't much like features that generate code. That's why Gordon's macros only ever yield a single expression, for instance, and there is no preprocessor. My way of working has never involved much metaprogramming and my experience is that it's often a huge mistake to go for it. People think they need it when they don't.

That's one reason why I wanted to leave D behind: it provides lots of metaprogramming and just about every D coder seems to love that stuff. I was an exception.

A couple features to prevent mistakes:

  • exhaustive switches on enum values
  • switches without an explicit default case trap (with error message in dev builds).
  • exhaustive struct constructors, where you can use '...' to zero initialize fields without an explicit init value.
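For comparison, the closest C gets to the first two items is a compiler warning plus a hand-written trap. This is a sketch, assuming GCC or Clang: `-Wswitch` flags a missing enumerator when the switch has no `default` (GCC enables it with `-Wall`, Clang by default). The `Shape` enum and `sides` function are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { SHAPE_TRIANGLE, SHAPE_SQUARE, SHAPE_PENTAGON } Shape;

/* No default label on purpose: adding a new enumerator without a case
   triggers -Wswitch, which is the C approximation of an exhaustive switch. */
int sides(Shape s) {
    switch (s) {
    case SHAPE_TRIANGLE: return 3;
    case SHAPE_SQUARE:   return 4;
    case SHAPE_PENTAGON: return 5;
    }
    /* Only reachable with a value outside the enum. */
    assert(!"sides: unhandled Shape value");  /* dev builds: message, then abort */
    abort();                                  /* NDEBUG builds: bare trap */
}
```

The difference is that in C this is an opt-in warning and a convention, whereas the list above describes it as built into the language.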

Thanks a lot for your input!


Edited by Tero Hänninen on Reason: See "Edit:"
Replying to Dawoodoz (#25356)

Let me expand on the module system and how it works with conditional compilation.

It's influenced by Rust in that there are index modules, one per directory. Unlike other modules, they can contain module declarations. One might look like this:

// One can define custom conditional symbols like this.
// They are automatically visible to all submodules.
#define cond = abc || def;

// Modules can be single files or directories of files.  If the condition evaluates
// to false, the compiler won't try to open the files or directories at all.
#setup[macos && cond] {
    module math;
    module xyz;
} #else {
    // ...
}

int main() { return 0; }