Introducing the LOC language

Hi there,

I've just set up a new 'Project' on Handmade Network for my new imperative programming language: LOC.

https://handmade.network/p/403/loc-language/

The project is still in its early stages. I've finally decided that I want to go public with it, since I'm now confident in my ability to deliver, even though I'll still probably fail at PR.

If you like the idea, or are simply curious about it, feel free to post in that thread. Any feedback will be greatly appreciated.

A small snippet of how it may feel:

HandmadeLOC.png

Is there a download, some docs and examples ?

I'm not really searching for a new language but I would give it a go to see how well it works.

Hey, thanks for the reply.

Well, unfortunately atm there is no download. I want to set up a GitHub repo for it at some point... but until now it's mostly local SVN on my machine and a few backups in the cloud.

To be fair, I wasn't planning any release any time soon when I posted that here on an impulse. As I stated on the project page, many basic features are still missing... e.g. I introduced enums just yesterday, and I don't have unions, or inlining, or any of the planned, more advanced features. And since the backend is now direct, by-hand emission of x64 machine code, even the floating point runtime is currently disabled until I get my hands in that oil again (and by "disabled", I mean you'd get an error if you're lucky, or an assert if you're not.)

So, it was just ready for a few foreign calls to win32 DLLs, basic integer stuff, basic structs and pointers... and I gave it a shot on HandmadeHero days 2 to 6... and well, it finally worked. I also have a few "learning from scratch" samples up my sleeve, with which I tried to introduce my kids to programming... but it doesn't go very far, yet.

Now, I'm aware of the mantra "release early, release often"... but I have to constantly fight my inner perfectionist to even consider that possible. And I'm not that proud of the state of the code itself.

All that being said, if you're genuinely interested in giving it a shot, I'll do my best to set up that github in the following days.

Until then, and before I come up with some actual doc, here is the handmade hero demo, somewhere around day 006, written in LOC (barring the includes in which we define MS-Windows types, and kernel32/user32 foreigns), as your de-facto 'hello world':

sample_main.loc

print.loc


Replying to mrmixer (#29385)

All that being said, if you're genuinely interested in giving it a shot, I'll do my best to set up that github in the following days.

Don't do that for me. As I said, I would test it, but I'm not looking for a new language and it's unlikely that I'd invest a lot of time in your language. But I can test it a bit to at least give you some feedback.

I'm aware of the mantra "release early, release often"...

Not everybody thinks that's a good idea, so don't do it just because somebody said it was great.

Here are a few thoughts about the syntax:

  • I prefer braces over indentation, personal preference I guess;
  • The as keyword feels off; I don't think there are other keywords like it in the examples.
  • The as= is a bit weird too. What is the difference with := ? I don't follow new languages that much, but I thought a common declaration/initialization syntax was name : type = value, and that if you omit the type you get type inference: name := value means name will have the type of value;
  • The dot in pBuffer as ^.u8 feels weird. (^.r8)(&tAsciiChars) also.
  • ^ is annoying to type (requires ^ key followed by space on my keyboard layout). Is it used to both declare and de-reference pointers ?
  • u8.#trunc(uRemainingValue) feels a bit weird too. I'm assuming # is used for "compiler" directives so the fact that there is u8. in front makes it feel like it's a member of u8;
  • u8(#"0") is also a bit weird;
  • What's the meaning of %% and /%= ?
  • Dereferencing with ^ after the variable name seems a bit off. I don't know if it's because I'm not used to it or if reading direction is involved. ^some_var reads as "deref some_var", whereas some_var^ reads as "some_var deref".

In C, I've started to use macros to write deref( some_var ) when dereferencing pointers. The goal was to remove the ambiguity of operator precedence, but I now use it even when there are no other operators, and I like that syntax. I've also been using cast( type, expression ) for similar reasons. It's sometimes annoying, e.g. cast( f32, cast( s16, something ) ) vs ( f32 )( s16 ) something, but it may help keep the syntax the same for ^.r8: cast( ^.r8, &tAsciiChars );
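For readers who haven't seen this style, here is a minimal sketch of what such macros might look like in plain C. The macro bodies and the two helper functions are assumptions made for illustration (the post doesn't show the actual definitions), and float / int16_t stand in for the f32 / s16 typedefs mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical definitions, modeled on the macros described in the post. */
#define deref( pointer )            ( *( pointer ) )
#define cast( type, expression )    ( ( type )( expression ) )

/* Narrow to 16 bits, then widen to float: the nested form from the post. */
static float sample_to_float( int32_t something ) {
    return cast( float, cast( int16_t, something ) );
}

/* Swap two bytes through pointers without writing a bare '*'. */
static void swap_bytes( uint8_t* a, uint8_t* b ) {
    assert( a && b );
    uint8_t tmp = deref( a );
    deref( a ) = deref( b );
    deref( b ) = tmp;
}
```

Because both macros wrap their result in parentheses, precedence questions like *p++ versus (*p)++ have to be answered explicitly at the call site, which seems to be the point of the style.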


Edited by Simon Anciaux on
Replying to GuillaumeMirey (#29387)

Thanks again for the feedback, very much appreciated.

[deleting a long story]

Okay so, about the syntax... with Odin and Zig (and Jai?) already there, what I thought LOC could bring to the table is a different sort of appeal for the (allegedly?) HMN-preferred style of imperative, close-to-the-machine programming: an appeal to people who were introduced to programming through the likes of Python (I believe there are more of those than there are HMN members...), and to total newcomers as well. My target sample for total newcomers being... my kids.

I've thought hard about introducing them to programming in C, as Casey could do... but I couldn't bring myself to teach them programming with curly braces all over the place, or to explain why the star in

const int* the_variable = hello[world];
is where it is... so here you have the final pitch for LOC: close to the machine, but being my language (scratching all kinds of itches, there), +control, +appeal in the form of a clearer, more inviting syntax.

So, I haven't programmed much Python myself; I'm of the C/C++[/Java/C#] brand by trade, and C++ by heart. With curlys. So, C++-frustration aside, it also feels weird to me atm, given that I haven't programmed that much in LOC yet, to forego curlys and semicolons. But I believe I can get used to it at some point. Even more so since programming the LOC compiler (in C++), with a few more macros than I would dare to write at work, made it perfectly clear that I do not want to be hunting for a closing brace missing its opening brace in my files, ever again.

About the '::' and ':=' and ':'

This syntax is used by both Jai and Odin. To be fair, I kinda like their choice, and initially went for it. But I also remember Casey saying once, somewhere, that when he was young he was very puzzled in math class by the use of the '=' symbol, after having used it as the assignment operator all over the place. Well, he has a point, there. A very valid point. And given that I want to teach my kids, I did not want to expose them too much to an assignment-'equals' before they ever had a chance to see an equation at school. So, I chose the Pascal ':=' for assignment.

Now, because of that choice, ':' cannot represent the var declaration any more. And ':=' certainly cannot be used as the type-inferred var decl. And thus, 'as', it is... taking over the role of the previous ':'. So myvar as u8 = 5 is the LOC equivalent of Jai's myvar : u8 = 5, and myvar as= 5 is the equivalent of Jai's myvar := 5. It also feels a little weird to me atm, but I'm getting used to it as time goes. Constant decls, on the other hand, are restricted to the double '::' and cannot use the separate-token form, like myconst : u8 : 5 (we'd use myconst :: u8(5) instead for the same effect). I'll maybe change all of that at some point... (and I'm open to suggestions for an alternative declaration syntax that would let me keep ':=' specifically for assignment).

The dot between ^ and the type in a pointer-type declaration is indeed peculiar, but it does two things: it gives my parser a distinct representation, and it sidesteps the "issue" of ^ being a dead key on some European keyboards, since caret-dot is not "dead" any more. And ^ for dereference comes at the end, as you noticed, so it's unlikely to produce î or ê by mistake either.

As for the why: it's for symmetry, and consistency with my_table as [16]u8 for declaration and my_table[i] for usage: you have my_ptr as ^.u8 for declaration, and my_ptr^ for deref usage. Maybe we're not so far apart, and it's what you're saying too when mentioning your deref macros, but it always feels a little weird to me, years later, to have the deref star come before the whole thing in C... I'm always tempted to leave parens around it if there's any dot-descent in there... but I have the same problem with the address-of operator '&', mind you... which I still haven't addressed in LOC.

#"0" is currently my syntax for the '0' character (instead of the "0" string). I want to leave the single quote available for something else entirely. This may be subject to change.

%% and /% are currently for integer remainder and quotient, respectively. % and / are fmod and fdiv. This "may" become different once I iron-out the allowed implicit casts, but for now it makes very clear that 1/2 shall be 0.5
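For contrast with C, where a single '/' means different things depending on operand types, here is a small sketch. The function names are made up for illustration, and the mapping to LOC operators in the comments is taken from the description above, not checked against the LOC compiler.

```c
#include <assert.h>

/* Integer quotient and remainder: what the post says LOC spells /% and %%. */
static int int_quotient( int a, int b )  { assert( b != 0 ); return a / b; }
static int int_remainder( int a, int b ) { assert( b != 0 ); return a % b; }

/* Floating-point division: what the post says LOC spells with a plain '/'. */
static double float_divide( double a, double b ) { return a / b; }
```

In C, int_quotient( 1, 2 ) gives 0 while float_divide( 1.0, 2.0 ) gives 0.5, and the only thing distinguishing the two at the call site is the operand types. The floating-point remainder has no operator in C at all; it lives in the standard library as fmod from math.h.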

Regarding your comment:

u8.#trunc(uRemainingValue) feels a bit weird too. I'm assuming # is used for "compiler" directives so the fact that there is u8. in front makes it feel like it's a member of u8;

Yup, it feels like it's a member. This "was" by design, although I might reconsider if that weird feeling is shared by many others, or persists... Atm I'm okayish with it. A bare u8(something) has invocation-syntax, and would default-cast something to u8. u8.somecastspec(something) would cast something to u8, according to a cast-spec which is not 'default'. Now, assuming the type could be a struct, I didn't want the possible cast specs to conflict with members, thus they're prefixed with # so that you cannot use them as your user-specified identifiers anyway (and they stand out in code as something peculiar you're doing there). As for '#' being compiler directives, well... it's not as if LOC had a C preprocessor, so it's more of a marker for something special, yeah. And preventing name-collisions in a more enforced manner than "user shall not use double underscore prefixes".

Many thanks again for your comments.


Edited by guillaume.mirey on
Replying to mrmixer (#29388)

To reiterate:

I prefer braces over indentation, personal preference I guess;
I'm not dismissive of that feeling, at all. I'd very much like a real debate, however, especially with people who have embraced the handmade manifesto, about the possible added value of the curlyless syntax.

The realization, for some people, that the indentation syntax they came to like in Python is not what prevents a language from being a compiled, close-to-the-metal language, is maybe worth exploring.

Much like what Casey did with handmade hero in its first days had impact because it made people simply realize it was... just... possible.

And once you understand that computers are so fast that you can draw pixels by hand 30 times a second (provided you did not model a "pixel" as a freakin class through a shared pointer), using a compiled language, which is not even optimized yet (next big stepping stone in LOC is to add some basic register allocation "strategy", if you see what I mean...)

... and that it didn't take that long to code...

... maybe we could attract more people to that way-of-doing things.

SecondLesson.jpg

... right ?

SecondLessonCode.jpg

... thoughts ?


Edited by guillaume.mirey on
Replying to mrmixer (#29388)

And once you understand that computers are so fast that you can draw pixels by hand 30 times a second (provided you did not model a "pixel" as a freakin class through a shared pointer), using a compiled language, which is not even optimized yet (next big stepping stone in LOC is to add some basic register allocation "strategy", if you see what I mean...)

... and that it didn't take that long to code...

... maybe we could attract more people to that way-of-doing things.

I find these claims very strange. I am not sure what you are trying to demonstrate, because the code you posted with graphics.DrawXYZ calls can be written in any slow, non-native, interpreted, unoptimized language you can imagine (like Python), and it will easily run at the same 30 fps, or at a much, much higher frame rate. There's no need for a new language just for that.


Edited by Mārtiņš Možeiko on
Replying to GuillaumeMirey (#29390)

Well, they're written pixel by pixel in that same language, as in the early days of Handmade Hero. It's not the DrawRectangle call per se. Dunno about Python perfs tbh, I could be surprised.

I ain't trying to demonstrate per se, I'm asking if you think this goes in the right direction.

Dunno if it sounds like I'm being vindictive here; that wasn't my intent. The advantage of being able to show the implementation of DrawRectangle readily, as a simple loop setting the color of each pixel, once you've introduced procedures and the student asks 'but what's behind DrawRectangle()?', was clear in my head, but perhaps I'm mistaken.
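To make "a simple loop setting the color of each pixel" concrete, here is a rough sketch in C of what such a rectangle fill could look like. The Framebuffer struct, the function name and the 32-bit pixel layout are assumptions for illustration; this is not the LOC sample's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical framebuffer description: 32-bit pixels, rows stored top to bottom. */
typedef struct {
    uint32_t* pixels;
    int width;
    int height;
} Framebuffer;

/* Fill the rectangle [x0, x1) x [y0, y1), clipped to the buffer, one pixel at a time. */
static void draw_rectangle( Framebuffer* fb, int x0, int y0, int x1, int y1, uint32_t color ) {
    assert( fb && fb->pixels );
    if ( x0 < 0 ) x0 = 0;
    if ( y0 < 0 ) y0 = 0;
    if ( x1 > fb->width )  x1 = fb->width;
    if ( y1 > fb->height ) y1 = fb->height;
    for ( int y = y0; y < y1; y++ ) {
        uint32_t* row = fb->pixels + ( size_t )y * ( size_t )fb->width;
        for ( int x = x0; x < x1; x++ ) {
            row[ x ] = color;
        }
    }
}
```

Nothing here is hidden behind a library: two clamps per axis and two nested loops are the whole answer to "what's behind DrawRectangle()?".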


Edited by guillaume.mirey on
Replying to mmozeiko (#29392)

Okay, so I had not seen this

https://guide.handmade-seattle.com/s/2022/languages-are-banned/#627

Now I have.

I guess I won't bother you guys with LOC.

That video is about Handmade Seattle conferences, not about handmade network. You're welcome to present and discuss languages here or on the discord, no problem.

About indentation vs curly braces: the idea behind those is to delimit a scope. For me it's better to be explicit about the start and end of the scope, and visual "tokens" are more explicit in my opinion. And curly braces (in C) are only used for that and initializer (which looks a bit like a scope but not quite); while indentation can be used for other things, like aligning parameters of a function when you write them on multiple lines. Chasing curly braces is not something that happens often in my opinion, at least it was never something I worried about.

You seem to make some decisions to make the language accessible to kids (or beginners) while trying to keep everything possible (which is good in my opinion). Could it be that instead of having "symbols" you might want to use "words" ? Like you use as instead of :. For example:

/*
"as" to declare a type
"assign", to assign a value
"reference", "ref" to declare a pointer
"address_of", "addr" to get the address of something
"dereference", "deref" to dereference a pointer
*/
a as s32 assign 5
b as ref s32 assign addr a
c as s32 assign deref b

a as s32 assign 5
b as reference s32 assign address_of a
c as s32 assign dereference b

/* Maybe with optional parenthesis if it removes some ambiguity. */
a as s32 assign( 5 )
b as ref( s32 ) assign addr( a )
c as s32 assign deref( b )

revert_digits_in as proc(pBuffer as ref u8, uDigitCount as u8) {

    uDigitRev as u8 assign 0
    uDigitMid as u8 assign uDigitCount >> 1
    uDigitRevRev as u8 assign uDigitCount - 1

    while uDigitRev < uDigitMid {
        tmp as u8 assign deref(pBuffer + uDigitRev)
        deref(pBuffer + uDigitRev) assign deref(pBuffer + uDigitRevRev)
        deref(pBuffer + uDigitRevRev) assign tmp
        uDigitRev assign uDigitRev + 1
        uDigitRevRev assign uDigitRevRev - 1
    }
}
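For comparison, the same in-place reversal can be sketched in plain C. The parameter and variable names below are illustrative, not taken from print.loc; the thing to notice is that the front index walks up while the back index walks down until they meet in the middle.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the first digit_count bytes of buffer in place. */
static void revert_digits_in( uint8_t* buffer, uint8_t digit_count ) {
    assert( buffer || digit_count == 0 );
    uint8_t front = 0;
    uint8_t mid = digit_count >> 1;                 /* number of swaps needed */
    uint8_t back = ( uint8_t )( digit_count - 1 );  /* unused when digit_count is 0 */
    while ( front < mid ) {
        uint8_t tmp = buffer[ front ];
        buffer[ front ] = buffer[ back ];
        buffer[ back ] = tmp;
        front++;
        back--;
    }
}
```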

For u8.#trunc( something ), would #trunc( u8, something ) make more sense? (I don't have an argument for that.)


Edited by Simon Anciaux on Reason: syntax for proc
Replying to GuillaumeMirey (#29394)

That video is about Handmade Seattle conferences, not about handmade network. You're welcome to present and discuss languages here or on the discord, no problem.
Nice of you to tell me that. I'm aware Abner is mostly responsible for the Seattle Conference, nowadays but... IIRC he was most influential in starting all this here, and he somehow has a point in that video.

I'm not a very socially adept person. Yet I somehow "care", dunno really why, about the future of programming and what the current and next generation will be exposed to. Maybe I worry about that for my own kids, or maybe I feel that I can't even begin to express HMN values to my coworkers, and I'm pissed enough about it.

If there is a legitimate concern that we may become too divided by those endeavours, well... I don't really want LOC to "compete" against Odin, or some of the other new languages mentioned there that I didn't even know existed.

Tough call.

Lemme think some more about all that... In the meantime, I'll still answer on the subject of increasing the accessibility, since whatever the answer to the above concern, I'll probably continue delving into that.

You seem to make some decisions to make the language accessible to kids (or beginners) while trying to keep everything possible (which is good in my opinion). Could it be that instead of having "symbols" you might want to use "words" ? Like you use as instead of :. For example:[...]
That's indeed a very vivid goal in my mind, and I would seriously consider replacing symbols with words where it makes sense (yes, I already have 'as', even if it's not perfect... also the boolean not, and, or are words). Yet I'm striving for a balance between this accessibility and some still-familiar usability. So, I won't go too far there: I do not see replacing '+' with 'plus' as an advantage, for example; and I won't replace a short and ubiquitous assignment symbol with the word 'assign' either. Besides, a few short English keywords (e.g. if, while...) are perfectly learnable by a person/child of any culture, but you do not want hundreds of them, for the sake of internationalization: '+' or '=' are far more universal (and known to kids) than the English 'plus' or 'equals', for example.

After that, the more advanced a concept, the less likely I am to compromise for accessibility. If you're starting to really use bit-ops and transmutes... well, I'm assuming you're mature enough to also handle the syntax for those. The only 'constitutional' requirement for my current philosophy in that regard is that advanced tricks are made explicit. Every bit of additional "control" over the end-result in machine code shall also fall into that "advanced" category. One would need to make sure the underlying algorithm is not cluttered by the perf annotations... either as a language thing, or an IDE thing...

I won't change your mind about indentation, I believe ^^'. If I ever deliver on that associated IDE, we may talk with real examples about the things I have in mind, about clarity and all that. I need those examples myself. I'm 'also' a curly-trained person.


Edited by guillaume.mirey on
Replying to mrmixer (#29395)

Apparently TortoiseGit must talk to a PuttyKeyAgent with my private key, but that can't be encoded as a file with ppk version 3 since it's too recent. So I simply have to encode that to v2. Yup. Good to go. Exactly. Oh, wait... err... that agent must also know of my key beforehand, or maybe the ssh client should be aware of it. Or have some allowed whatever in whatever system file? Be wary that the ssh configuration may be overridden by the configured ssh client in TortoiseGit; by the way, do you have <someothercrypticexe> instead of /usr/bin/ssh set up in a <freakingsystemwideconfigoption>? No? Maybe in the 4-deep settings tree somewhere in TortoiseGit, then. Oh, before I forget, yes, right, password logins over HTTPS are no longer supported by GitHub since 2021, so I don't even know what it does but surely I have to find a way to configure that good ol' ssh protocol correctly as of 2023. Best of luck to me. Kiss Kiss.

Twice a 'config-' word in same sentence is never good news to me. No matter the length of said sentence.

so...

I'll do my best to set up that github in the following days
my best don't seem Gudinov atm... man, does all of that feel so tedious. That's my very own impostor syndrome right there. Still trying, though.


Edited by guillaume.mirey on
Replying to mrmixer (#29385)

There you go

I'll do my best to set up that github in the following days
https://github.com/gmirey/LOC

publishing that code has made that long overdue cleanup-pass almost feel like it is aching now... Working on that right away.


Edited by guillaume.mirey on
Replying to mrmixer (#29385)

I compiled it and was able to compile a simple program. I would add -nologo to the compile options of MSVC to remove a few lines from the output.

It would be nice to have some docs about basic types, and a simple sample like hello world (to know how to create the entry point). An "include" or "lib" folder with basic functions (more or less the kernel32.loc file) would be useful.

kernel32 :: #load "kernel32.loc"

print :: proc( s as string )
	stdout as= kernel32.GetStdHandle( kernel32.STD_OUTPUT_HANDLE )
	_ := kernel32.WriteConsole( stdout, s.ptr_to_bytes, s.length_in_bytes, 0, 0 )

main :: proc ( argc as i32, argv as ^.^.u8 ) -> i32
	print( "Hello Dave." )
	return 0

Replying to GuillaumeMirey (#29399)

I compiled it and was able to compile a simple program. I would add -nologo to the compile options of MSVC to remove a few lines from the output.
Nice!

okay.

And btw thank you for having taken the time !

It would be nice to have some docs about basic types, and a simple sample like hello world
Next TODO after the current cleanup pass would be to add just that.

(to know how to create the entry point)
That, however, would fall into the 'not yet implemented' category. The current entry point is whatever proc is called 'main', atm, with no signature check yet... and to get C-like functionality such as command-line args passed as the main func parameters... well, I'd need to check how win32 C does it... could it be that such functionality is directly programmable in LOC with a few calls to the kernel? I have not dug into this atm, at all.
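On that last question: on Windows the process can ask kernel32 for its whole command line as one string (GetCommandLineA / GetCommandLineW), and the C runtime splits that string into argv before calling main. So a language runtime can do the same splitting itself. Below is a portable sketch in C of such a splitter. The function name and the simplified quoting rules are assumptions for illustration; they are not the exact rules of the MSVC runtime or of CommandLineToArgvW.

```c
#include <assert.h>
#include <stddef.h>

/* Split a command line in place into at most max_args arguments.
   Simplified rules (an assumption, not the MSVC runtime's exact ones):
   spaces and tabs separate arguments, double quotes group text containing
   whitespace, and backslash escapes are not interpreted.
   Returns the number of arguments stored in argv. */
static int split_command_line( char* command_line, char** argv, int max_args ) {
    assert( command_line && argv );
    int argc = 0;
    char* read = command_line;
    while ( *read && argc < max_args ) {
        while ( *read == ' ' || *read == '\t' ) read++;   /* skip separators */
        if ( !*read ) break;
        char* write = read;          /* the argument is compacted in place */
        argv[ argc++ ] = write;
        int in_quotes = 0;
        while ( *read && ( in_quotes || ( *read != ' ' && *read != '\t' ) ) ) {
            if ( *read == '"' ) in_quotes = !in_quotes;   /* drop the quote itself */
            else *write++ = *read;
            read++;
        }
        if ( *read ) read++;         /* step past the separator before terminating */
        *write = '\0';
    }
    return argc;
}
```

Fed a copy of the string returned by GetCommandLineA, this would yield the program name as argv[0] followed by the arguments, which is roughly the shape a 'main' with argc/argv parameters expects.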

An "include" or "lib" folder with basic functions (more or less the kernel32.loc file) would be useful.
Of course. I was more or less planning, however, to flesh out basic language features before any major work on libs. Currently in kernel32.loc are only those features required for the first round of checking that it's currently more or less WAD (and the most basic of 'print' procedures in print.loc before I get some implementation going on for va_args and an 'any' type).


Edited by guillaume.mirey on
Replying to mrmixer (#29402)