When do libraries go sour?
The Handmade community is often opposed to using libraries. But let's get more specific about why that can be, and whether that's reasonable. What do we look for in a library? When do libraries go sour? How do we evaluate libraries before using them? How can the libraries we make avoid these problems?
This is a fishbowl: a panel conversation held on the Handmade Network Discord where a select few participants discuss a topic in depth. We host them on a regular basis, so if you want to catch the next one, join the Discord!
Welcome everyone to another fishbowl.
Our topic for today is responsible code reuse, specifically as it relates to library usage.
Some points we might cover today:
* When do you reach for an existing library? What do you look for in a library?
* When do libraries go sour? When does using a library become irresponsible? What are the telltale signs of issues with a library?
* How to evaluate a library before using it.
* When making a library for others, what do you do to ensure it can be used responsibly?
* The community's general stance regarding using libraries vs DIY.
* What does all of the above mean for package managers and today's culture around library usage?
Participants:
@NeGate
@demetrispanos
@raysan5
@ryanfleury
@AsafGartner
We'll start with a question from @ryanfleury:
Which libraries do we use in our day-to-day projects? Why? Could we replace them easily if we needed to?
Hi everyone! The reason I wanted to start with the concrete is so that we can focus on real cases where the traditional form of code reuse (using libraries) worked out well, because I've seen the abstract route of the discussion be a bit more unproductive. Instead we can see what has been both responsible and useful, and build up the more abstract rules from there.
I use raylib with most of my projects and raylib only uses single-file header-only libraries, most of them are the stb libraries.
One example I can give is pgx: https://pkg.go.dev/github.com/jackc/pgx/v4
It's a postgres library for Go that we use for the Handmade Network website. The main thing that it solves for us is a knowledge-acquisition problem. It talks to postgres over its binary protocol, which to the best of my knowledge is undocumented. So doing it ourselves would require reverse engineering libpq (the official C library).
For me personally, I've had the most success with "data-in, data-out" libraries - a few examples of which are well-known in the community. stb_image, meow_hash, stb_truetype
(They are much easier to evaluate when they are open source like this)
for Cuik, it's mostly just stbds (im phasing it out), then TB has luajit and tbbmalloc (which i also wanna phase out lol)
for game projects it's stb_image, FreeType, sokol, raylib, cgltf and a few others
for me, common ones are stb_image, zlib, sqlite, lua/luajit (though not sure if this should "count" even if embedded), and then various specialist scientific computing libraries e.g. fftw
AsafGartner
One example I can give is pgx: https://pkg.go.dev/github.com/jackc/pgx/v4
It's a postgres library for Go that we use for the Handmade Network website. The main thing that it solves for us is a knowledge-acquisition problem. It talks to postgres over its binary protocol, which to the best of my knowledge is undocumented. So doing it ourselves would require reverse engineering libpq (the official C library).
Interesting connection here, one important aspect of code is the fact that it captures knowledge. It's not explicit, direct knowledge - it's baked down quite a bit - but the knowledge is really the most useful part. So for stb_image, the useful part is not quite the API (you could imagine tweaking it), it's rather that decoding a PNG in a fairly good way is difficult enough that unpacking all the knowledge is fairly high-effort. Same with meow_hash or for that matter any other hash - producing a hash that has fairly good results for a given problem is a difficult process that requires expertise and care, and it's often quite orthogonal to the problem in question
ryanfleury
For me personally, I've had the most success with "data-in, data-out" libraries - a few examples of which are well-known in the community. stb_image, meow_hash, stb_truetype
You also refer to them as "leaf libraries", right?
AsafGartner
You also refer to them as "leaf libraries", right?
Yeah. I don't think libraries are only possibly useful as "leafs", but I think they have a higher chance of being useful when they are, if that makes sense.
Notably because as you get "closer" to the "knot" of a codebase, you have to start making more assumptions. Leaf libraries are in the privileged position of making very few assumptions - "here's the data transform for this problem. We studied it and know it well - pass in the source data, and we'll use our knowledge to compute the output data"
I'm not sure, actually. Maybe someone else here knows?
It's interesting that we came up with the same terminology.
To me leaf just means that it's replaceable in a bounded way without affecting the rest of the tree.
Right yeah. That is maybe a more succinct way of putting it :)
a related concept I use, similar to "leaf", is whether or not using the library forces me into using their type system
if the library just accepts arrays of primitives (as an example) then any library could do that
AsafGartner
To me leaf just means that it's replaceable in a bounded way without affecting the rest of the tree.
so a replacement would need to be compatible only in terms of API? it doesn't necessarily imply anything about the size correct?
demetrispanos
if the library just accepts arrays of primitives (as an example) then any library could do that
For sure. I think that this also becomes much easier as an API designer when it's a leaf-like scenario
I wouldn't even say that it needs to be compatible. It's just that it doesn't force deep changes in the entire codebase.
(Not counting find/replace of a typename or simple things like that)
demetrispanos
a related concept I use, similar to "leaf", is whether or not using the library forces me into using their type system
this is a weird spot in C because the type system is limited enough that everyone and their moms must redefine a slice or other common structures
basically library design is sorta held back by the limitations of the language
It's partly limitations of the language and it's partly the language not being opinionated about very common things. So for example, in Metadesk and the debug info stuff we did at work, the library defines a "string type" (just a ptr + size), and then to start parsing, you have to package up your primitives in that form. You can obviously define an API that just takes a char * or something of the sort, but then that function necessarily becomes non-composable with the rest of the library, which uses the length-based string type.
NeGate
this is a weird spot in C because the type system is limited enough that everyone and their moms must redefine a slice or other common structures
personally I rarely have a problem designing library interfaces that only rely on primitives, or on extremely shallow structs
if you do something DOD-like or relational-like to model your problem, you can work very close to the primitive type level
so usually i just avoid passing around slices in my libraries because it's weird, i'll just unravel it
C APIs are usually clunky in this way - e.g. the user must first know to call NeGate_MakeSlice(...) but I think the composition effects are worth it
Otherwise you get these weird "user-facing" APIs floating around, that you eventually want to reuse, that aren't using the composable types that the rest of the library knows to use
(And this is tied in with another API design aspect, which Casey has spoken about before, which is that an API consists of low-level, composable, granular APIs that are precise, and higher level APIs that bundle functionality together, but all of those APIs are user-facing - there aren't "internal" APIs, for the most part)
Anyways I suppose this is going down an API design rabbit hole, which is maybe a bit off-topic
ryanfleury
Anyways I suppose this is going down an API design rabbit hole, which is maybe a bit off-topic
yeah I agree we shouldn't dig too deep on library design as such, but I think it highlights some of the aspects that contribute to code reuse decisions
so Ryan, wrt "Could we replace them easily if we needed to?", i think we've reached a spot where most of our foundational libraries probably can't (sometimes won't) be replaced by us but there's also a lot to be said that we don't necessarily need all of stb_image to get work done
So I guess we could talk about when using a library goes sour, since that connects with @demetrispanos' point regarding a library forcing you to buy into a type system - I have personally had issues with libraries that expose an incomplete set of extracted information, and (sort of a subset of this problem) with libraries that provide only high-level APIs that let me get at only certain data.
A good example is a query-style API vs. a "report an entire batch" API. SearchPoint(...) vs. GrabArray(...). In many cases, you need the latter, but many libraries fit a high-level use case too early with the former, and it makes the library sometimes unusable for a given problem
Oh, we spun up two threads
NeGate
so Ryan, wrt "Could we replace them easily if we needed to?", i think we've reached a spot where most of our foundational libraries probably can't (sometimes won't) be replaced by us but there's also a lot to be said that we don't necessarily need all of stb_image to get work done
Hmmm so what do you mean by "foundational libraries"?
stuff like stb_image or stb_truetype really
like most people dont know or care about the details inside of stb_truetype, they just use it and avoid the garbage world of font rasterization
With stb_truetype it becomes inadequate pretty quickly, I think - I haven't had as many issues with stb_image I suppose (although for some cases it also is inadequate)
but rarely does anyone replace stb_truetype with something that isn't a bigger and weirder library
NeGate
but rarely does anyone replace stb_truetype with something that isn't a bigger and weirder library
like for all the text editor memeing we've had around here, it's rare for someone to make an stb_truetype replacement, they just use FreeType or something
stb_image is a low-cost, decent-gain library. Not sure about stb_truetype.
Loading images isn't usually a program's main thing.
NeGate
like for all the text editor memeing we've had around here, it's rare for someone to make an stb_truetype replacement, they just use FreeType or something
but FreeType2 is a very big library...
exactly on both of those statements
Text rendering is another issue where there's a big knowledge acquisition gap.
One point I wanted to mention that's related is the idea that code is always written given a set of assumptions (or constraints). Properly analyzing a library requires understanding those assumptions. So for example, stb_image assumes that you trust your input data. (Same with stb_truetype). They assume a few other things (which are related to our earlier discussion on API design), and basically libraries become more problematic when the diff between the library's assumptions and your assumptions grows large.
Furthermore I think this gets at one aspect of the modern code reuse culture - if you pull in a gigantic DAG of libraries just by trying to grab one, you've introduced a gigantic bundle of assumptions that you have not vetted, and basically cannot vet
Yes. And that diff tends to grow larger as your project grows.
it's ok to have assumptions, it's just that it's also ok not to pick a library because yours dont line up with theirs
ryanfleury
One point I wanted to mention that's related is the idea that code is always written given a set of assumptions (or constraints). Properly analyzing a library requires understanding those assumptions. So for example, stb_image assumes that you trust your input data. (Same with stb_truetype). They assume a few other things (which are related to our earlier discussion on API design), and basically libraries become more problematic when the diff between the library's assumptions and your assumptions grows large.
well, I'm afraid that's the price you pay for the library
raysan5
well, I'm afraid that's the price you pay for the library
sometimes you'll either compromise or pull a handmade or something
but yea this will always be true in a sense
NeGate
it's ok to have assumptions, it's just that it's also ok not to pick a library because yours dont line up with theirs
Yeah it's not only okay, it's required - you cannot write code given no constraints
On this subject & connecting with Asaf's original prompt, I think clearly documenting the assumptions you make for a given API is probably a good place to start, w.r.t. making a library easy to use responsibly. (And flipping it, when you're a user, trying to fish out the assumptions, and clearly documenting your project's)
ryanfleury
Furthermore I think this gets at one aspect of the modern code reuse culture - if you pull in a gigantic DAG of libraries just by trying to grab one, you've introduced a gigantic bundle of assumptions that you have not vetted, and basically cannot vet
yea this is a big problem when it comes to threading especially in the types of languages we work in, like in our lovely imperative languages you need to understand data flow and mutability to be able to actually schedule things to happen in parallel, but once you lug around 20 libraries it only takes one of them being nasty about data flow for you to lose a bunch of potential performance gains
if Foo can't be called while Bar is being used because they both initialize a table but not atomically or something... you can get fucked over that easily until you put a mutex over it and cope
NeGate
yea this is a big problem when it comes to threading especially in the types of languages we work in, like in our lovely imperative languages you need to understand data flow and mutability to be able to actually schedule things to happen in parallel, but once you lug around 20 libraries it only takes one of them being nasty about data flow for you to lose a bunch of potential performance gains
For sure
ryanfleury
On this subject & connecting with Asaf's original prompt, I think clearly documenting the assumptions you make for a given API is probably a good place to start, w.r.t. making a library easy to use responsibly. (And flipping it, when you're a user, trying to fish out the assumptions, and clearly documenting your project's)
i like the idea of making certain things the default:
* functions are pure (given the same input, it produces the same output)
* you must pass in a non-NULL pointer
* functions which take in isolated inputs are therefore thread-safe.
these are some simple ones but you can probably imagine more, and whenever things don't match this we must document it.
On "assumptions" - when I was iterating on The Melodist and pulling out things I wanted to reuse in later projects (not as libraries, but as tools within my codebase), I wanted to reuse my operating system abstraction layer. The problem is that my entire operating system abstraction layer was written with the assumption that I always opened one window, and initialized a graphics API. But that assumption makes the layer unusable for, e.g. a simple terminal application, or a multiwindow application. So that makes that "library" unusable for those problems, and if I had somehow hacked around it instead, that would've been less responsible in my mind.
So that makes that "library" unusable for those problems, and if I had somehow hacked around it instead, that would've been less responsible in my mind.
I find that forcing libraries to fit your code (or vice versa) tends to happen quite a lot later in the project.
It's something that people should pay attention to when vetting a library, but it's not something that's often talked about.
Oh, this touches on yet another thing - sorry, I'm sort of throwing all of my thoughts in and hoping one will stick - "code reuse" =/= "library that you cannot edit". There is another form of code reuse that arises, which is "I wrote a renderer in my last project, but I want to have a renderer in my new project - instead of trusting my old assumptions, or throwing the old renderer away, I can just duplicate it and mutate it as needed". This can be an invaluable time-saver, without the common drawbacks of overly-assumptive libraries.
ryanfleury
Oh, this touches on yet another thing - sorry, I'm sort of throwing all of my thoughts in and hoping one will stick - "code reuse" =/= "library that you cannot edit". There is another form of code reuse that arises, which is "I wrote a renderer in my last project, but I want to have a renderer in my new project - instead of trusting my old assumptions, or throwing the old renderer away, I can just duplicate it and mutate it as needed". This can be an invaluable time-saver, without the common drawbacks of overly-assumptive libraries.
the problem I see with external libraries that you need to edit for your project is maintenance
raysan5
the problem I see with external libraries that you need to edit for your project is maintenance
for this I think it's worth somehow distinguishing evolving vs stable libraries
there isn't good terminology for this, but there are libraries like stb_image where you can use a fixed version for years and it almost never matters
(rare exceptions if you're using untrusted inputs)
there are others (anything that touches the network, probably) where you want to keep tracking the latest version
demetrispanos
for this I think it's worth somehow distinguishing evolving vs stable libraries
libraries can always change or need to change for specific projects needs
I mean, stability of a library is very relative
Yeah, one third-party library I forgot to mention that I use and modify is stb_sprintf (another leaf!). I added my own format specifiers for my own string types. But that library almost never changes. If I did want to upgrade, it would indeed be slightly painful, but it wouldn't take very much actual time or resources - and it would be very low amortized across time (because the library changes so infrequently)
You can't know what will be added/fixed in the future, so you'd generally expect the library to be good enough as is, no?
AsafGartner
You can't know what will be added/fixed in the future, so you'd generally expect the library to be good enough as is, no?
this is not my experience with how most people use libraries
I think most people use libraries with the specific intent to benefit from rolling updates
"it keeps getting better, and I don't have to do anything"
And then there's no adaptability.
Or more accurately, there's explicit anti-adaptability.
demetrispanos
"it keeps getting better, and I don't have to do anything"
I imagine it depends on the type of library and the functionality provided, for example stb_image functionality is very focused, chances to have problems with it are low (with trusted input data)
I think that does depend on the style of library - with certain pure-functional data transforms on rock-solid, unchanging problems - e.g. parsing a well-established data format, doing a mathematical transform, producing a hash - the upgrades will be more tangential (performance, reliability, etc.)
raysan5
libraries can always change or need to change for specific projects needs
sometimes you end up making changes to the internals of a library, and since those aren't in the trunk you're kinda fucked even when just small changes come by, because modifying the inside of a library doesn't have the same stability guarantees as the API. this is where i'd just say "lmao dont update your tools" or "you might wanna consider just branching off into your own thing if the library is already doing what it needs to do"
NeGate
sometimes you end up making changes to the internals of a library, and since those aren't in the trunk you're kinda fucked even when just small changes come by, because modifying the inside of a library doesn't have the same stability guarantees as the API. this is where i'd just say "lmao dont update your tools" or "you might wanna consider just branching off into your own thing if the library is already doing what it needs to do"
you dont need to completely drop the library at this point - it's sour enough that you stop following their updates, but you still keep your modifications and the library... mostly
demetrispanos
I think most people use libraries with the specific intent to benefit from rolling updates
this is true but in practice that promise isn't always kept, sometimes things get deprecated and after a while they just get removed
Just for concrete reference, @bvisness and I had to modify a markdown library because of an issue when compiling it to WASM. It was literally not workable without the change.
I agree with @demetrispanos, though, that this is how most people use libraries, and I think that is one place where the "Handmade narrative" explicitly disagrees with the general programming culture, at least in my mind. If code inside of your project is swept out from under your feet, and you have not tested with it, then in my mind it's quite unethical to ship that to users. Even if it's ostensibly "an improvement", that is really just a prediction, and that prediction is often wrong.
So connecting with what @AsafGartner said w.r.t. "adaptability", I would posit that upgrades should always be explicit & directly adapted for, in a given project
yeah this is my point really, obviously I'm aware of the ways it can go wrong but this is by far the dominant mode of use
I'd suggest this is one reason why single-header libraries are so popular in the Handmade sphere: they fit this pattern of use - explicit upgrades
(And also why things like NPM get a bad rap, because they fit this anti-adaptability pattern)
I'd also suggest that explicit upgrading, and considering a library as relatively stable, does offer you some better options (e.g. modifying the library without worry)
single header libraries put a special kind of limitation on you: in practice they limit just how much code a sane person is willing to pack into one header, usually that's like 10 thousand lines of code on the higher end (im going to neglect the automatic stuff and the other messy stuff). but because of this it's far easier to drop the library, it's not that big
it also doesn't have the room to make as many overarching assumptions
because it's not that big it's probably something a person could in theory learn and maybe you dont need to drop it and we can approach with my based strat of forking stuff
there's not really a binary to keeping or dropping libs
it's just like any other piece of code
sometimes a library isn't even fully atomic, my experience for anything big is that it's really not, people just don't know how to slice it because it's hard
just chopping out the details you need and copying those into your project or rewriting them to accommodate feels like a relatively handmade-y narrative
single header just makes the chopping process simpler... usually
Yeah, I think sometimes this is necessary, especially when the "knowledge acquisition" aspect is important, but the offered API or other characteristics of a library fail to meet your constraints. I mean, obviously, always check the license on a library, but if it's MIT or something like that, then a very effective strategy is reusing & readapting the code that the library-author wrote for a given difficult/opaque data transform that deals with a difficult subject (reverse-engineering, undocumented formats, a subtle mathematical transform, etc.)
Sometimes documentation for a data format, for example, is very abstract or removed from the concrete problem, and there are popular projects that both parse or output that format correctly, and simultaneously have been partly responsible for the evolution of that data format over time
At that point, that codebase is where the knowledge lies, so "code reuse" is really just studying the problem in that case
But then you'd make your own?
AsafGartner
But then you'd make your own?
a new standard!
I mean your own implementation for parsing DWARF for instance.
ryanfleury
(need a DWARF emoji)
yea i was thinking you were talking about LLVM and Codeview but that works too lol
also all their parsers seem to have the most dogshit validation stuff
"error: success" is about the worst thing you wanna see on a broken file
this is a spot where they got their stuff out of the way
NeGate
this is a spot where they got their stuff out of the way
but it's not a library most people might be willing to accept, i'd be willing to call it sour
AsafGartner
I mean your own implementation for parsing DWARF for instance.
Right
Because while the knowledge from e.g. Clang is useful, it's baked in with a bunch of other assumptions/constraints that are incorrect - e.g. assumptions that make things very slow, heavyweight, or unreliable
Would you rather have that knowledge in the form of code that can be run/debugged, or in the form of a spec document?
AsafGartner
Would you rather have that knowledge in the form of code that can be run/debugged, or in the form of a spec document?
It's hard to say. I guess it depends on what kind of code
ryanfleury
Because while the knowledge from e.g. Clang is useful, it's baked in with a bunch of other assumptions/constraints that are incorrect - e.g. assumptions that make things very slow, heavyweight, or unreliable
Clang can also be called sour, but depending on how much you use it, it doesn't matter
I'm thinking of the kind that you'd end up replacing, but that's usually all there is.
My question is should people push for more knowledge sharing in the form of documentation?
Or is there value in concrete examples, even if you're just effectively reverse engineering them.
internals? probably not, file formats should always be documented imo
and if it's an open source project then the documentation of the file format should also be open
regardless of anything, someone's gonna need to manage it, im ok with that being internal docs like in the codebase not the API but it should still be there somewhere
I think if the code is reasonable and if it's possible to run/debug, that is indeed often preferable - documentation writing is a skill, and your documentation doesn't have a typechecker and you can't run/test it. That being said, I think the line between the two is more artificial than it could be - but that is perhaps a subject for another day
AsafGartner
My question is should people push for more knowledge sharing in the form of documentation?
I think some documentation is important and useful but it could be redundant and verbose, a simple API cheatsheet and some commented code examples could work better
ryanfleury
I think if the code is reasonable and if it's possible to run/debug, that is indeed often preferable - documentation writing is a skill, and your documentation doesn't have a typechecker and you can't run/test it. That being said, I think the line between the two is more artificial than it could be - but that is perhaps a subject for another day
maybe we should invest in proof assistants and formal spec langs, then the comments are actually type checked :P
i got a question for you guys, how sour is too sour?
most of us use subpar tools all the time, at what point should a rewrite happen?
like we might be inclined to say "when shit goes wrong" but like you probably dont wanna rabbithole on a year long project for a one day workaround
Well, there's an emotional component to it. Like, if I really hate using it.
AsafGartner
Well, there's an emotional component to it. Like, if I really hate using it.
then there's that
But primarily it's a question of how many bugs or other trouble is this going to introduce to my codebase over time.
At what point does the subpar-ness compromise the project, or my well-being, or my time? Like it is also sort of a constraint-solving problem. If I had the time, I'd want to rewrite a lot more than I do - this is surely partly because I do not have any commercial/business constraints on any of my side-projects
i kinda have the view of "i really dont think X should be THE library for Y job" as in i dont like it when the space gets uncompetitive
AsafGartner
But primarily it's a question of how many bugs or other trouble is this going to introduce to my codebase over time.
Agree, I had that experience with one raylib external library, many related issues/missing features all the time
Unfortunately, the cost of a custom implementation, plus its maintenance, is even higher...
raysan5
Agree, I had that experience with one raylib external library, many related issues/missing features all the time
that brings up a fun one: as much as i'd hate to admit it, rewriting massive mature projects is an easy way to reintroduce bugs that got resolved a long time ago in the older product. it also means you dont have the ecosystem to engage with for testing, and potentially can't even share some testing suites... brings up the evil side of the question: "when is a rewrite too sour to be done?"
A corollary question: How do you determine the limits of a library ahead of time?
AsafGartner
A corollary question: How do you determine the limits of a library ahead of time?
you simply dont have the same resources usually which is a good type of limitation to a degree, ideally it means you focus on your use case because once you're rewriting it it doesn't need to be a library anymore, at least it doesn't if you dont want it to be one
AsafGartner
A corollary question: How do you determine the limits of a library ahead of time?
It depends, I usually check for some "elements" that I consider useful, like no memory allocations (or a memory-allocation replacement option), and the same for external file access: an option to provide the data from memory...
So how far do you go when evaluating a library?
Looking at the docs is one thing, but do you also read much of the code?
AsafGartner
Looking at the docs is one thing, but do you also read much of the code?
I check the code, the exposed API and the internal API, personally I also check the coding conventions
I also check the provided examples, the organization and comments, and I try to run them, I think it's a good indicator of the care put into the library
physically recoiling at the idea of the code probably means you shouldn't rewrite, physically recoiling at the code is probably a reason to rewrite it, just out of being a moral person trying to make the world not suck as much
the former means it's probably not something you should invest your time into
NeGate
physically recoiling at the idea of the code probably means you shouldn't rewrite, physically recoiling at the code is probably a reason to rewrite it, just out of being a moral person trying to make the world not suck as much
There's also the idea of contributing to the library to make it better, but we can touch on that later.
that's a good point i was hoping Martins could pop by for but if not I mostly agree with him until i dont so i'll Steelman that side
And on the subject of vetting, what can you do to make your library more easily vettable?
One thing that I find is often missing from library docs is the mental model behind the library. But as @ryanfleury said, documentation is a skill, and writing up the mental model is probably one of the harder aspects of it.
AsafGartner
And on the subject of vetting, what can you do to make your library more easily vettable?
Well, no allocator customization or no in-memory data access is something I don't like
when they force you to do file I/O via their stuff :((((
I say this like I haven't done this... I'm the problem sometimes
this one and memory allocators are usually big ones
but allocation schemes can get messy so i understand why they might not in certain cases
because not all allocators are compatible with all structures
though there's something to be said about being flexible or just not allocating for the user in the first place
in which case you can avoid such issues
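One hedged sketch of the "just don't allocate for the user" approach: the two-call size-query idiom (familiar from Win32 and Vulkan APIs), where the library reports how many bytes it needs and the caller brings the memory. All names below are invented for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical API: call once with out == NULL to learn the required
 * size, allocate however you like, then call again to fill the buffer. */
static size_t widget_serialize(const char *name, void *out, size_t cap)
{
    size_t needed = strlen(name) + 1;   /* payload + NUL terminator  */
    if (out == NULL)
        return needed;                  /* size query                */
    if (cap < needed)
        return 0;                       /* caller's buffer too small */
    memcpy(out, name, needed);          /* fill caller-owned memory  */
    return needed;
}
```

The caller stays in charge of where the memory comes from (arena, stack, malloc), at the cost of an extra round-trip per call.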
NeGate
when they force you to do file I/O via their stuff :((((
that's horrible... but it depends on the type of library...
NeGate
I say this like I haven't done this... I'm the problem sometimes
me too
but not allocating for the user is also complex because every time you need extra memory you gotta go back and talk to them
NeGate
but not allocating for the user is also complex because every time you need extra memory you gotta go back and talk to them
either via some callbacks or our favorite... offbrand iterators
you keep calling a function until it says to stop, and every time it returns it just asks you for shit...
well it's an event loop but offbrand iterator makes it stand out more as a concept
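That pull-style shape, where the library returns to the caller on every event instead of taking callbacks, can be sketched like this (the names below are invented for illustration, not from a real library):

```c
#include <stddef.h>

typedef enum { EV_NEED_DATA, EV_ITEM, EV_DONE } event_kind;

typedef struct {
    const char *data;  /* input the caller handed over                */
    size_t      len;
    size_t      pos;   /* cursor into the input                       */
    char        item;  /* valid only when parser_next returns EV_ITEM */
} parser;

/* The caller keeps calling this until it returns EV_DONE; on
 * EV_NEED_DATA the caller is being asked to supply more input. */
static event_kind parser_next(parser *p)
{
    if (p->data == NULL)  return EV_NEED_DATA;
    if (p->pos >= p->len) return EV_DONE;
    p->item = p->data[p->pos++];
    return EV_ITEM;
}
```

Because control returns to the caller at every step, the library makes no assumptions about threading, I/O, or allocation; the price is exactly the chattiness being complained about here.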
the important thing to note is that callbacks or event loops or whatever else are usually indicative of assumptions you might not wanna make in your code
they tally up to "i really dont wanna deal with this bs"
it's fine if a library isn't completely perfect, none is, it's a time saver
you sacrifice a little bit of personal peace for the sake of getting some shit done or not needing to have the mental overhead of dealing with another person's crap
especially not a person or people who you can't directly contact
this is actually one of the biggest problems when it comes to libraries and part of the reason i might promote rewrites
the maintainers are simply not my friends, they're not my colleagues, if i need LLVM fixed i cry, if i need any huge library fixed i need to go learn their BS along with the original problem space
if i need a feature added, i also cry
Yeah, that seems to be a problem with big libraries and especially frameworks.
On the other hand if you can make a contribution it will have a larger impact.
library maintenance is a hard topic....
When you buy into a big library, you sell some personal responsibility away which can be good... until things go wrong
I think there's a point in every library where it stops making things easier and starts making things harder.
And the skill to master is anticipating where those points are, and if your projects fits in before you hit those points.
AsafGartner
I think there's a point in every library where it stops making things easier and starts making things harder.
From my experience, that happens when you try to give too much control to the user
A small group of experienced users will love it but many other users won't... I think it's difficult to find the right balance
as much as we might meme about weird dependency stacks i feel like this might be a "justified stack", the low level users have one library and the high level people can wrap over it... sometimes
Wouldn't different levels of granularity help with that?
NeGate
as much as we might meme about weird dependency stacks i feel like this might be a "justified stack", the low level users have one library and the high level people can wrap over it... sometimes
or in the reuse case, you factor out the BS to just handle your case
maybe the library lets you choose custom allocators but you don't care, so you have a function like JustGiveMeTheShtuffs(...)
this is neutral code reuse, it could be good, could be bad
it helps you which is the point but it also means that the library wasn't helping you
but since it's small it's fine
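That factoring-out move might look like the sketch below; `lib_load_with_allocators` is a stand-in for a hypothetical flexible-but-noisy library entry point, not a real API:

```c
#include <stdlib.h>
#include <stddef.h>

/* Stand-in for a library call with knobs this project never touches. */
static void *lib_load_with_allocators(const char *path,
                                      void *(*alloc_fn)(size_t),
                                      void  (*free_fn)(void *))
{
    (void)path; (void)free_fn;
    return alloc_fn(16);  /* pretend this loads something */
}

/* The whole project calls this instead, with the defaults baked in. */
static void *just_load(const char *path)
{
    return lib_load_with_allocators(path, malloc, free);
}
```

The wrapper is trivial, but it means the library's flexibility only has to be understood in one place.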
because it's small it's also the type of thing that might fit into a pull request
not all library mods have to be "selfish"
not all changes need to be rewrites
sometimes we can simply increment on the project but this gets really complicated to plan out
when do you guys think it's a good time to really just increment vs rewrite?
It depends on the size of the rewrite.
And on the number of unknowns.
A lot of people are also worried about introducing bugs.
NeGate
that brings up a fun one: as much as I'd hate to admit it, rewriting massive mature projects is an easy way to introduce bugs that got resolved a long time ago in the older product; it also means you don't have the ecosystem to engage with for testing, and you potentially can't even share some testing suites... which brings up the evil side of the question: "when is a rewrite too sour to be done?"
(edited)
i brought up the side of "oh the rewrite might introduce bugs" but now i think i'll try to bring over the "a rewrite can be a chance to fix bugs" side
And one thing that we haven't explicitly discussed is the aspect of saving time by using libraries.
AsafGartner
A lot of people are also worried about introducing bugs.
It's a legit worry, because most contributions come with no maintenance
raysan5
It's a legit worry, because most contributions come with no maintenance
I was thinking more in terms of using "battle tested" vs writing your own from scratch.
AsafGartner
And one thing that we haven't explicitly discussed is the aspect of saving time by using libraries.
you usually save time... if everything goes as expected...
ideally any decently sized library has a process by which you contribute; if it doesn't... don't contribute, it's probably a mess or they're not ready
stuff like a testing suite and docs on the contrib process
it's usually not worth trying to increment on these projects just yet
might be better off with a rewrite or a fork
I think people have the impression that any decently-used library has had its bugs worked out.
AsafGartner
I think people have the impression that any decently-used library has had its bugs worked out.
this is true in our perfect world but a lot of projects will grow in stars and popularity far faster than the project can actually """scale to enterprise"""
AsafGartner
I think people have the impression that any decently-used library has had its bugs worked out.
it's not that they're all worked out, it's that many of them have been encountered
whereas your rewrite has never been put under stress at all
But that depends on age and the volume of contributions.
I've had "popular" libraries break just a few days after I started using them.
AsafGartner
I think people have the impression that any decently-used library has had its bugs worked out.
Completely agree, far from reality
yeah, that's one of the weird spots: a lot of the people who contribute to the popularity of a project aren't necessarily contributing to it or seriously engaging with it
sometimes they just think it "will" become cool so they hang around and star it
so you might see 5k stars and the project is still buggy and in prerelease
many/most libraries are bad / have bugs, but that's not really the relevant comparison point
your library will have bugs too
yes, I'm aware; the point is that your bugs are bugs you can fix
you might not be willing to go dig into a million line project
but you might be willing to deal with 10k lines you factored out of said project
NeGate
but you might be willing to deal with 10k lines you factored out of said project
I agree but I think this is kind of moving the goalposts... if you can replace a 1M-line library with 10kloc you control, obviously that's good
but the main effect there is you reduced the code by 99% and so all code-related problems are also reduced by 99%
demetrispanos
I agree but I think this is kind of moving the goalposts... if you can replace a 1M-line library with 10kloc you control, obviously that's good
not quite, I'm not saying rewrite 1M lines into 10kloc, I'm saying take out the relevant details and just have that in your rewrite
(edited)
simply less surface area for bugs; this is why the main reason I recommend a rewrite is if you have domain-specific needs
So another point for library evaluation: "how debuggable does this look?"
AsafGartner
So another point for library evaluation: "how debuggable does this look?"
i'd just poke it with a debugger and if i cry then it's bad, it's really abstract for me
a debugger letting you step around the code means you can build an intuition for the control flow of a program given some input
AsafGartner
So another point for library evaluation: "how debuggable does this look?"
That's a good reason to take a look at code formatting and conventions before using it, just in case... following some codebases can be really complex
I want to touch on the time saving aspect a bit more, as this is a primary reason people go for libraries in the first place.
I've had cases where libraries obviously saved a lot of time. Both in terms of knowledge acquisition, and in many cases just code writing time.
I also had cases where libraries turned out to be a time sink.
there's the age-old metric of lines of code, but it's not accurate, and this stuff is particularly abstract... a lot of libraries have you learn their stuff rather than learn the problem space. This is fine, it's part of why we buy into libraries, but there can be a disconnect where you needed to learn more to do their stuff than you would have just learning the problem space. Ideally, as Handmade-y people, we have some idea of what's behind the scenes, and from there you might be willing to say: "while I've learned to do things, I don't understand the problem and I haven't gotten any closer to solving my personal problems, so I should probably re-evaluate my time"
(edited)
I think it depends on the scope of the library, for example, a file-format loading library has a narrower scope than a multiplatform window/graphics initialization library
AsafGartner
I want to touch on the time saving aspect a bit more, as this is a primary reason people go for libraries in the first place.
I think it's worth making the strongest possible argument for preferring existing libraries, so we can frame our criticisms or reactions
- save time coding it yourself, possibly a huge amount of time if it's a lot of code or if you'd need to learn a lot to do it
- get something higher quality than you would make yourself, because it is used in many different situations and has been subjected to selection pressure; also useful if you are not an expert in the problem domain but the library author is
- get a built-in knowledge community you can ask for help with problems, since other people are using the library
- increase transferability of skills across teams, since the same external library can be used in many teams
The community aspect can be big for certain kinds of libraries.
We're at the two-hour mark, so I'd like to switch gears and talk about the community's stance on library usage.
There's a perception that we are all about DIY-only, from scratch, etc., but that's obviously not the case.
generally we view rewrites as a sort of moral good, there's always a reason that reinventing the wheel is valid considering how many mediocre wheels exist
When it comes to reinventing the wheel, I'm not sure I would even call it a rewrite.
we sure do conflate those things, which is something I'm sorta getting to
we sometimes treat a rewrite as a reinvention, which is generally why things like the Wheel Reinvention Jam have to clarify that it's more than just "X but good", or why we have so many memes about "Y but good"
I think it's probably good to clarify some common beliefs and non-beliefs
for example, it's easy for newcomers to get the message "never use external libraries"
(partly because that's an easy read of some of what is said, and partly because that's what Handmade Hero did)
I think it's not about the rewrite or reinventing the wheel, that's just a very small percentage of the work. When you create something new it requires a lot of side work
I think, at the very least, Handmade explicitly rejects the idea that you should rely on a library without having a fairly strong understanding of what it is doing, and being capable of at least learning how to replace it if necessary. Handmade promotes the idea of self-reliance and responsibility over what you ship to users, so if you use a library, you are responsible for what the library does
So it's very much opposed to a popular school of thought, which is that you only need to understand the abstraction, and the details will be taken care of by someone else
ryanfleury
I think, at the very least, Handmade explicitly rejects the idea that you should rely on a library without having a fairly strong understanding of what it is doing, and being capable of at least learning how to replace it if necessary. Handmade promotes the idea of self-reliance and responsibility over what you ship to users, so if you use a library, you are responsible for what the library does
I'd agree with this but I think it's a process not really an end-state
you can't just decide to be fully self-reliant
not instantaneously anyway
and no one can be fully self-reliant across all subjects
it's more about the philosophy than an exact practice
it builds a stronger character to be willing to replace things
rather than always spending time replacing things
Yeah. That's what I was trying to get at with "being capable of learning how to replace it"
It's not that you have to go study the PNG spec before using stb_image. It's that, if stb_image had some critical flaw for your purposes, you'd be capable of ditching it, when push comes to shove
(edited)
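One way that "able to ditch it" posture looks in practice: hide stb_image behind your own type and loader, so the rest of the codebase never touches `stbi_*` directly and swapping the backend means editing one file. The `stbi_load` call matches stb_image's real signature; the `image` type and `USE_STB_IMAGE` toggle are our own invention:

```c
#include <stddef.h>

typedef struct {
    unsigned char *pixels;  /* RGBA, 4 bytes per pixel, or NULL on failure */
    int width, height;
} image;

#ifdef USE_STB_IMAGE
#include "stb_image.h"

static image image_load(const char *path)
{
    image img = {0};
    int channels;
    img.pixels = stbi_load(path, &img.width, &img.height, &channels, 4);
    return img;
}
#else
/* The slot a replacement decoder drops into if stb_image has to go. */
static image image_load(const char *path)
{
    (void)path;
    image img = {0};  /* stub: reports failure for every file */
    return img;
}
#endif
```

Nothing about this requires understanding the PNG spec up front; it just keeps the door open for the day you have to walk through it.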
ryanfleury
It's not that you have to go study the PNG spec before using stb_image. It's that, if stb_image had some critical flaw for your purposes, you'd be capable of ditching it, when push comes to shove
(edited)
well, I created my own single-file header-only PNG loader...
raysan5
well, I created my own single-file header-only PNG loader...
Yeah, and that may have made sense for your purposes - and you could do it!
Main purpose was mostly learning and understanding, curiosity about how something I've been using for years works on the inside
Actually I'm not using it, at least just yet
I think for example if you get off the ground with stb_image and then someday decide you need to really be robust against possibly-hostile inputs, at that point you should be able to drop down and replace it. Or, I mean, hire someone who does, for that matter
(So even if you aren't willing to invest the time to go figure out PNG yourself, you're familiar with the nuances of the issue and what you'd do to avoid certain flaws in a library)
Because I agree that fundamentally, you can't know everything - and certainly every time you use code you haven't fully vetted, you could be relying on someone else's bad decisions. That's sort of unavoidable. But I guess Handmade's "take" - if there is one - would be just pointing out this fact, and so minimizing the drawbacks wherever possible. Not just shipping random software that you naively trust to users.
(edited)
probably a good idea to poke a bit at the masters of code reuse
aka the package managers of the world
This feels like one of those definitional problems, where both extremes are clearly wrong - "no I must write my own PNG parser! But first, I must write my own GPU driver! But first, I must write my own compiler!", vs. "I will gladly import 1000+ dependencies into my project. They'll just get better, faster, and more secure/reliable over time, and I don't have to change anything!"
The problem, to me, is that such a large portion of the programming sphere is very much on the latter side. In some cases, it is explicitly practiced and taught.
(edited)
And so even though the former is not correct, the correct gradient is actually in the direction of the former
It's of course easy to overcorrect, and I think everyone should be mindful of that
It's not so much that there's a sweet spot between them, it's more that currently we're very much off to one side.
There's a big range of reasonability between them that doesn't imply NIH or NPM.
I also think, to some degree, you need people at various sweet spots in that range.
Like, the fact that a lot of people in the community have chosen to work on programming tools, when they originally started working on games, is not obviously a net loss
Many people might go "lol wow you started by trying to ship a game but then you went and wrote a useful library for programmers? lol loser"
But, I mean, that is just so ridiculous that it is not really worth addressing - having more people iterate on difficult, low-level parts of ecosystems and tooling is actually a net win
At the same time, you'll have people who do just want to make games, and they'll be a bit closer to the "NPM" side, and that's okay too
But I think Handmade's goal is just to act as some kind of counter-balancing force to the NPM-extreme, which seems pervasive across many areas. Or, at least, to provide a community for people who are not on board with that extreme.
(edited)
Computers should be programmable, and they should be as programmable as possible to as many people as possible.
And that can mean different things depending on where you are on that range.
AsafGartner
Computers should be programmable, and they should be as programmable as possible to as many people as possible.
the problem that comes up is that if you make certain things easier it doesn't mean people will do them less, it means people can do them faster and will do them more often
this is the NPM problem imo
this is what we sorta fight against
the weird thing is I'm not really sure how to fight that without some meme-y level of gatekeeping or having shitty tooling for installing dependencies; it's mostly a cultural thing, so maybe there's something to be said about that
I don't think, rhetorically, we're ready to say "guys, libraries good actually" as a general statement, because the culture is still in NPM's hands
once we do push things back we can start to sound "normal"
Well I think it's not necessarily about changing everyone's minds by persuasion. Personally, I am sort of more in favor of forcing people to change their minds by competition. If Handmade ideas cannot reasonably outcompete everyone in an area, then hey, maybe people have just decided that the crappy drawbacks they're subject to are worth it, and the benefits from higher quality are not yet big enough.
(edited)
But, point is, you need people iterating on the quality part, because at some point, those wins will be big enough, and that point, everyone coasting on their crappy tech inertia will have a very difficult time competing. And so, just by market pressure & people working passionately on what they want, you'll see improvement over time.
(edited)
ryanfleury
Well I think it's not necessarily about changing everyone's minds by persuasion. Personally, I am sort of more in favor of forcing people to change their minds by competition. If Handmade ideas cannot reasonably outcompete everyone in an area, then hey, maybe people have just decided that the crappy drawbacks they're subject to are worth it, and the benefits from higher quality are not yet big enough.
(edited)
yeah, this is why I really like seeing new projects pop up and why I think we should invest some time into incremental change; there's a lot that exists in the culture, and if you start to show people that you're right, they can actually move towards you
this is where we get into a problem since we've built a sort of island in a lot of ways
which is probably why newcomers have such ideas of us
Right. Persuade by doing & outcompeting, not by arguing with people.
This is, I suppose, sort of a tangent for this fishbowl, so sorry for getting off topic again :)
"we're some Assembly, C and C++ nerds who like rewrites :P" we don't look that good in a lot of ways but if you show someone remedybg you might turn some heads
I think this is why it's also important to advertise when doing things differently. Like how raylib mentions that it uses no external dependencies, and how it clearly improves the build process.
NeGate
"we're some Assembly, C and C++ nerds who like rewrites :P" we don't look that good in a lot of ways but if you show someone remedybg you might turn some heads
NPM looks cool because you can do anything with it if you type the right install command, same with Python and some imports... we don't have that sort of package-management appeal, and we really don't need it. The solution to package management is just not to centralize it, in my opinion; there doesn't need to be only one way to grab packages. Another thing we run into is that we've separated stuff like cloning a repo from installing a package, and now it's more promoted to just use what you're being fed rather than fork and make your own changes, or even really look into the codebase
I wonder about that stuff but I don't have any concrete views on the matter
(edited)
Ease of distribution is always a win, and package managers win on that.
Unfortunately that can lead to excessive distribution.
And to bring it back to responsibility, how do you take responsibility for all that code?
I think expertise in using a library, or libraries, is its own skill
and really, that's the dominant skill the market is selecting for right now in most application domains
so in the same way someone might be a competent C programmer (and you expect that means something about practices for memory safety) then someone can be a competent user of various important libraries
(and that implies knowing tradeoffs, common bugs, etc.)
demetrispanos
I think expertise in using a library, or libraries, is its own skill
being good at a library, especially a big and popular one, is really just the first step to understanding the status quo, and the status quo is where you learn the problem and how we can move forward... it's rare that a big project has such deep flaws that someone with a surface-level understanding can levy complaints against it
Well, we've gone for nearly 3 hours, so I think it's a good time to call it.
Thank you all for participating.
Thanks for organizing @AsafGartner, was a fun conversation!