
The relationship of simplicity and performance

In the community, we talk a lot about performance. We also talk a lot about having simple code, and the two feel somewhat intertwined. What relationship is there between simplicity and performance? Are there better ways to reason about "simplicity" with this in mind?

This is a fishbowl: a panel conversation held on the Handmade Network Discord where a select few participants discuss a topic in depth. We host them on a regular basis, so if you want to catch the next one, join the Discord!
ryanfleury May 21, 2021 01:05 PM
Topic: The Relationship of Simplicity and Performance
13:05
Hello everyone! This fishbowl (topic: https://github.com/AsafGartner/hmn_fishbowl/discussions/30) was prompted by a conversation that occurred in the Dion Systems server. The topic arose when discussing the use of standard library containers, data structures, or other APIs, and comparing that with other approaches, like writing your own allocators and data structures.

My argument was that something in, say, a standard library, is not necessarily an ideal version of, say, an allocator or generic data structure implementation. Such things are often intended to span a very wide space of use-cases, and are therefore required to be more generic. This introduces cruft and performance problems in use-cases that do not require such generality. For instance, memory arenas versus a traditional malloc and free API. Memory arenas will always allocate faster than malloc, and it isn't because they have had more engineering time poured into optimizing them, but rather because they have a much simpler job to do. In this way, abstractions are, to some degree, coupled with their underlying implementations. I was attempting to use this point to make the case that Handmade is not anti-abstraction, it's anti-trusting-abstractions-without-evidence.

That kicked off the conversation regarding performance vs. simplicity. In this fishbowl I am hoping to discuss with you all a few questions, like "how do we define simplicity?", "what are characteristics of 'simple' code?", and "how does 'simpler' code, in the way that we mean it, perhaps lead to performance advantages?" So I guess we should start with the definition of 'simple', since that seems like it's overloaded on many fronts.
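[Editor's aside: to make the arena comparison concrete, here is a minimal sketch of the pattern being described. This is hypothetical illustrative code, not any particular library's arena: allocation is a single aligned pointer bump, and "freeing" is resetting one offset, whereas malloc/free must support arbitrary interleaved lifetimes, which is exactly the generality an arena refuses to pay for.]

```c
#include <stdint.h>
#include <stdlib.h>

// A bump-pointer arena over one backing block.
typedef struct {
    uint8_t *base;
    size_t   cap;
    size_t   used;
} Arena;

static Arena arena_make(size_t cap) {
    Arena a = { malloc(cap), cap, 0 };
    return a;
}

static void *arena_push(Arena *a, size_t size, size_t align) {
    // align is assumed to be a power of two
    size_t pos = (a->used + (align - 1)) & ~(align - 1);
    if (pos + size > a->cap) return NULL;  // out of space
    a->used = pos + size;
    return a->base + pos;
}

static void arena_clear(Arena *a) {
    a->used = 0;  // "free" every allocation at once
}
```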
13:05
Adding roles now...
bvisness May 21, 2021 01:06 PM
👋
13:08
I hope we don't spend the entire time discussing the definition of simplicity 🙂 but I do think it's probably good to clear up, yeah.
ryanfleury May 21, 2021 01:08 PM
Right---that alone is a rathole that could last hours 😄
bvisness May 21, 2021 01:09 PM
At the very least we have the Rich Hickey "Simple Made Easy" definitions, which are that simple means "not entangled", roughly, whereas easy means "close at hand". And those two things are pretty much completely independent. (edited)
13:10
And because this is HMN, I expect we all broadly have a decent shared understanding of what we mean by "simple". (edited)
jfs (audience) May 21, 2021 01:42 PM
Maybe someone should point out the etymology of complex. I'm at least thinking of these other words: simplex, duplex, complicated. The latter one especially; I think in French plier is "to fold"
demetrispanos (audience) May 21, 2021 01:42 PM
yeah this was hinted at by ben's comment about the rich hickey definitions, in which he does just that
13:42
but it's worth making it explicit
bvisness May 21, 2021 01:43 PM
yeah the terminology Hickey uses is "complect", a word meaning "to entangle" or "to braid together"
Allen4th May 21, 2021 01:10 PM
I was thinking about this definition of simplicity in anticipation of this conversation, and I do think it makes sense to outline some various ideas of what simplicity means to us. But I am going to propose we explore the way various kinds of simplicity relate to performance. That'll give us more to talk about and we don't have to agree on a single definition to get to good stuff.
👍 1
ryanfleury May 21, 2021 01:11 PM
Sounds good! I'm good with that route.
13:11
Yeah, I partly think that the word simple or simplicity is overloaded enough that it can be unhelpful or really muddied. But we often use it as a "hand-wavey" handle to certain ideas that, in my estimation, are more concrete.
Allen4th May 21, 2021 01:13 PM
My take on it is probably a bit of a "vulgar" way to define simple. I think some folks would like simple defined in purely abstract information-theoretic terms. Something you could write proofs or religions about...
13:15
To me I just work backwards from the reality that I want good software, and a lot of people who know what they're talking about agree that something called simple is relevant, and I arrive at something like this. Something is simple when it is easy to work with, easy to implement, easy to use, easy to understand, etc, especially when it also solves a problem that can seem hard without this "something".
👍 1
bvisness May 21, 2021 01:16 PM
The last bit might be the most important
demetrispanos (audience) May 21, 2021 01:13 PM
I think something similar to the hickey definition, but not quite the same, is "direct"
13:13
when you want to do X you do X, not X and several other things as well (edited)
this 2
ryanfleury May 21, 2021 01:19 PM
I see, interesting. Yeah, one concrete aspect I was going to bring up had to do with the number of codepaths, particularly ones that do non-duplicative and uniform work, and I was hoping to relate that to performance by pointing out that uniform work is easier to predict and easier to optimize than an equivalent large number of codepaths with many concepts muddied together. And I think that ties into easy to implement, because fewer codepaths => less implementation work. Easy to use, as fewer interfaces to fewer codepaths => fewer things to understand. Etc.
13:20
It's not just number of codepaths I am trying to get at, though, and I am not sure if you guys have ideas or if it's not fruitful to rathole on that. But I often think of what I am describing as "orthogonality" of codepaths. When each codepath does one slice of the problem, and can be composed with other codepaths to form all the points you need in the "space of features" that you require for your problem. In that sense, each codepath is like an axis, and you can build a "point" by composing use of those codepaths, or the lack thereof, e.g. (codepath1, codepath2, 0, codepath4). And of course, all codepaths are parameterized too (or can be). (edited)
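[Editor's aside: a toy sketch of this "orthogonal codepaths" idea. The names and problem are purely illustrative, not from any project discussed here: each function does one slice of the work, and a caller composes only the slices it needs, instead of one monolithic routine with a flag for every combination.]

```c
#include <ctype.h>
#include <string.h>

// Codepath 1: lowercase a string in place.
static void lowercase(char *s) {
    for (; *s; s++) *s = (char)tolower((unsigned char)*s);
}

// Codepath 2: strip a trailing newline in place.
static void strip_trailing_newline(char *s) {
    size_t n = strlen(s);
    if (n && s[n - 1] == '\n') s[n - 1] = '\0';
}

// One "point" in the feature space: (strip, lowercase). Another caller can
// pick (strip, 0) or (0, lowercase) without touching either codepath.
static void normalize_line(char *s) {
    strip_trailing_newline(s);
    lowercase(s);
}
```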
bvisness May 21, 2021 01:21 PM
easy to implement, because fewer codepaths => less implementation work
This is why we have @rxi here, because one of the things I've noticed is that his code is both what I would consider very simple, and probably as a result, very small (compared to what others, including myself, would probably come up with)
(edited)
👍 1
13:22
And that's something I'd like to probe into a bit at some point, because the design of e.g. microui goes beyond "solve only the problems you have" into "have fewer problems", somehow (edited)
Allen4th May 21, 2021 01:23 PM
Right. I think that draws out the line between where I agree with what @ryanfleury said and where I disagree, or at least am a bit wary.
13:23
Number of codepaths in a literal sense does seem like a useful way to make the notion of simplicity literal to me - that's the part where I'm on board.
13:26
The analogy to orthogonality makes me more uncomfortable. Because that sounds to me like the mental framework goes: I sit down and identify N effects I need to achieve and therefore write the N simplest codepaths I can think of. But I think that's just the beginning. Over time it might turn out that fewer codepaths actually could do it all.
13:27
If the idea is that a program is a set of points we want to bundle together, and our software is made up of some "spanning vectors", and as long as we can get to the points we want we've decomposed the problem, that would be awesome... but the intuition that kicks in seems misleading to me.
13:27
We don't have a test for "linear independence" between code paths.
13:28
(This feels only half related now that I finish the thought)
ryanfleury May 21, 2021 01:29 PM
But I think that's just the beginning. Over time it might turn out that fewer codepaths actually could do it all.
Yeah I agree with this quite strongly (and this exact process happens when I am doing lego-bricked code). I think that is why I like the number of codepaths as some kind of measurement. Because if you implement a smaller number of codepaths that produce the same space of effects, then you've found a simpler solution (because there are fewer codepaths, fewer axes, with the same number of points). Yeah I suppose the reason I don't like "number of codepaths" alone is that it feels like it is missing something. And thinking about it a bit more, I think what that "missing thing" is, is how this intersects with the set of features or properties desired by some software. So for example, if I only have one very specific feature in mind, and I don't want to parameterize it at all, then I just want a function that is like void DoTheThing(void) (or maybe it returns something). And that is the best solution for that particular space of features, right, since anything more would be wasting my time. But, if it turns out that I actually did want to parameterize it, or compose pieces of DoTheThing together, leave some out, etc., then this would not be adequate. I am not sure if that means it's "not simple", though. But I think it does mean that it's not always better to have fewer codepaths, because that could mean coupling of things you don't want coupled.
(edited)
Allen4th (quoted): The analogy to orthogonality makes me more uncomfortable. Because that sounds to me like the mental framework goes: I sit down and identify N effects I need to achieve and therefore write the N simplest codepaths I can think of. But I think that's just the beginning. Over time it might turn out that fewer codepaths actually could do it all.
bvisness May 21, 2021 01:31 PM
As a concrete example of this, the command list design of microui simplifies drawing down to just a few small primitives, where otherwise you might have to implement custom drawing functions for buttons, windows, scrollbars, etc. (edited)
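[Editor's aside: a rough sketch of the command-list idea, illustrative only; microui's actual API and types differ. Every widget lowers to a few primitive commands appended to one buffer, so the backend only ever implements those primitives rather than per-widget drawing routines.]

```c
#include <string.h>

// The few primitives everything lowers to.
typedef enum { CMD_RECT, CMD_TEXT } CmdKind;

typedef struct {
    CmdKind kind;
    int x, y, w, h;  // geometry for CMD_RECT
    char text[32];   // payload for CMD_TEXT
} Cmd;

typedef struct {
    Cmd cmds[256];
    int count;
} CmdList;

static void push_cmd(CmdList *list, Cmd c) {
    if (list->count < 256) list->cmds[list->count++] = c;  // toy bounds check
}

static void push_rect(CmdList *list, int x, int y, int w, int h) {
    Cmd c = { CMD_RECT, x, y, w, h, "" };
    push_cmd(list, c);
}

static void push_text(CmdList *list, int x, int y, const char *s) {
    Cmd c = { CMD_TEXT, x, y, 0, 0, "" };
    strncpy(c.text, s, sizeof(c.text) - 1);
    push_cmd(list, c);
}

// A "button" is not a special drawing routine, just a rect plus a label;
// scrollbars, windows, etc. decompose the same way.
static void button(CmdList *list, int x, int y, int w, int h, const char *label) {
    push_rect(list, x, y, w, h);
    push_text(list, x + 4, y + 4, label);
}
```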
ryanfleury May 21, 2021 01:32 PM
Yeah, actually, on this front, not to get OOP all wrapped up into this, but I think this is one huge problem that I think exists with OOP-style design or architecture, because it assumes the primitives, and gets them from a top-down design style (that is also mixed up with 'modelling the real world', or something that might superficially seem like a 'real world' thing). That is just an aside though, let's please not start talking about OOP, but I thought it was relevant. (edited)
Allen4th May 21, 2021 01:33 PM
But I think it does mean that it's not always better to have fewer codepaths, because that could mean coupling of things you don't want coupled.
I have more I want to say about this, but we'd be hashing out the role of simplicity in making good software and a protocol for measuring complexity via codepath count - which is not the topic at hand, so maybe we can put a pin in it for now?
(edited)
ryanfleury May 21, 2021 01:34 PM
Yeah that makes sense, I think this probably gives us a good feeling for the "sub-space" in which we have similar intuitions for simplicity.
13:34
Unless another participant has more to say?
13:35
If not we can start relating some of these ideas to performance.
bvisness May 21, 2021 01:36 PM
@rxi I'm curious to what end you have "simplicity" in mind as you're designing something (however you define it)
👍 1
rexollycounty May 21, 2021 01:36 PM
I think I am alright with that definition of simplicity; I think much of what I would consider simple would be to have the least amount of accidental complexity possible for the problem (which @gingerBill often mentions). I think few codepaths would be a good measure of that
👍 2
bvisness May 21, 2021 01:37 PM
I'm guessing it's not as heavy on the linear algebra as Ryan's thought process, but I could be wrong 🙂
ryanfleury May 21, 2021 01:37 PM
No, and to clarify, my mental model here is not what I am thinking about as I am programming, but it's rather after-the-fact trying-to-make-sense-of what I am doing. Or trying to digest the lessons that I encounter into a boiled-down form that explains them all. (edited)
13:38
And for whatever reason, the linear algebra is really intuitive for me when trying to do that 🙂
Allen4th May 21, 2021 01:40 PM
We can start talking about performance and sprinkle in additional simplicity-defining talk when we need to again, right?
ryanfleury May 21, 2021 01:41 PM
Yeah I think that's inevitable, especially if the topic title is accurate and there is indeed a "relationship" between them. We'll have to bounce back and forth (which is fine) (edited)
Allen4th May 21, 2021 01:42 PM
So I think I first realized all this a while back in a specific 4coder problem; I thought it might be interesting to break down that problem and show how I made performance worse by being complicated.
👍 1
4coder_icon 1
bvisness May 21, 2021 01:42 PM
that sounds very interesting
Allen4th May 21, 2021 01:43 PM
The most useful thing I took away from that moment was the irony that it was complicated because I was trying to worry about performance.
13:43
Alright, one minute to sketch it out...
🍿 4
13:48
In 4coder I have the contents of a buffer stored in UTF-8. I want to interpret that UTF-8 at least a little bit to get codepoints. I also need to parse that UTF-8 to get tokens. Then when I wanted to add virtual whitespace for the first time, I needed to run a light parser to determine indentation for a few cases, and emit wrap points. Once I had all of that I could render a buffer with virtual whitespace.
Allen4th May 21, 2021 01:52 PM
The first time I tried to do this, it was simply an accumulation of little layers over time. First it was a loop over ASCII that goes straight to the screen. After a bit it had the ability to wrap, and so the ASCII loop got just a bit more complex for tracking and implementing that. Then it got a bit more complex when tokens were added for syntax highlighting, and the position in the ASCII had to track with a position in a token buffer. Then it got a bit more complex when I was pulling one unicode codepoint from a buffer where I used to just read a single byte of ASCII. Then finally I worked in the parser that is running on that stream of tokens that is tracking with the UTF-8 reader loop and emitting smarter wrap points and indents. -- The point is it was all one big loop trying to avoid any intermediate passes and just pull everything out in one go.
13:53
The reason I did it that way was partially because it was always the least work thing to do from moment to moment (although my maintenance burden was getting out of control), but also because I thought that making code go fast meant tolerating no slop everywhere. So it was all setup to do exactly the amount of work that was needed to advance the state of the rendering logic across all layers, and no more than that.
13:55
In later iterations I have broken things down into phases. This means:
13:55
1. Things are simpler, I don't have to track down N locations where I might have gotten parse logic wrong, because it's not interleaved with anything else
13:56
2. I can do a better job of optimizing things, because I can actually take advantage of pipelining and wrap my head around something well enough to work on it
13:57
3. When I have bugs I can more easily track down which part of my solution first introduced the issue, because I can just examine intermediate values and walk through the code responsible for the first bad intermediate
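[Editor's aside: the phased approach can be sketched as a hypothetical two-pass toy, not 4coder's actual code. Pass 1 fully decodes UTF-8 into a codepoint buffer and pass 2 tokenizes that buffer, instead of interleaving both jobs in one loop; each pass can then be debugged and optimized on its own, and its output inspected as an intermediate.]

```c
#include <stddef.h>
#include <stdint.h>

// Pass 1: decode 1- and 2-byte UTF-8 sequences into out[]; returns the count.
// (A real decoder handles 3- and 4-byte sequences too; trimmed for brevity.)
static size_t decode_utf8(const uint8_t *s, size_t len, uint32_t *out) {
    size_t n = 0, i = 0;
    while (i < len) {
        if (s[i] < 0x80) {
            out[n++] = s[i]; i += 1;
        } else if ((s[i] & 0xE0) == 0xC0 && i + 1 < len) {
            out[n++] = ((uint32_t)(s[i] & 0x1F) << 6) | (s[i + 1] & 0x3F);
            i += 2;
        } else {
            out[n++] = 0xFFFD; i += 1;  // replacement char for anything else
        }
    }
    return n;
}

// Pass 2: count "tokens" (runs of non-space codepoints) in the decoded buffer.
// It never touches raw bytes; decoding mistakes show up in pass 1's output.
static size_t count_tokens(const uint32_t *cps, size_t n) {
    size_t tokens = 0;
    int in_token = 0;
    for (size_t i = 0; i < n; i++) {
        int is_space = (cps[i] == ' ' || cps[i] == '\n' || cps[i] == '\t');
        if (!is_space && !in_token) tokens++;
        in_token = !is_space;
    }
    return tokens;
}
```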
bumbread May 21, 2021 01:48 PM
hi
👋 4
🍞 1
bumbread May 21, 2021 01:59 PM
So it might "seem" like simplicity makes code slow, but what actually happens is that the simplicity helps you manage and optimize in the future. Maybe it's worth investing in simplicity instead of optimization that may turn out to be premature
Allen4th May 21, 2021 02:01 PM
Mhm. I think that is definitely a premature optimization story. I also think it highlights that doing the simple thing right now might be more work than hacking in one more thing, but it really doesn't take long to get to the point where you've spent more on maintenance than you would have on the simple thing.
bvisness May 21, 2021 02:02 PM
You mentioned that there was a performance issue with how you had it before?
14:02
Even though there was "no slop"?
Allen4th May 21, 2021 02:03 PM
Right. So for one thing it's a pipeline issue.
14:04
If I do all the UTF-8 to Unicode in one go, I can do smart stuff to get a good ratio on that.
14:05
Think of pulling one codepoint out at a time, between tracking tokens, tracking a parser state, emitting various channels of output, etc.
14:06
The other is that, since the maintenance burden was so high, my iteration time was very slow. In the less tightly wound version I have now a cache was much easier to add without causing bugs. I have no idea if I could have pulled off a cache in the first version. And it turns out a cache does a lot for this problem.
bvisness May 21, 2021 02:07 PM
So in what sense would you consider your new design "simpler" than your previous one? It does have more layers after all 🙂
ryanfleury May 21, 2021 02:07 PM
Yeah I was going to comment on that. It does seem a lot simpler, but it has a larger number of steps (and maybe that qualifies as a larger number of codepaths?).
Allen4th May 21, 2021 02:08 PM
There aren't really fewer layers. For instance, in pulling one unicode codepoint out at a time, I still needed to abstract that because this wasn't the only place I needed that translation.
14:08
So the way it got pulled out was a weird set of macros and structs that track the translation state.
14:09
It's "simpler" because if you just accept the idea "we're going to have some slop, first we'll do translation then we'll operate on it" it turns out you can then have a reusable translation that is cranked up to a way higher speed.
14:10
Instead of having like two or three composable macros you have one function (fewer codepaths).
ryanfleury May 21, 2021 02:10 PM
Ahhh okay interesting.
14:12
I was also going to comment on the caching side. One thing I've noticed is that there is usually not too large of a distance between recompute-everything-on-every-frame or every-call, and a cache. So framing things as "queries", over some mutable state, but not necessarily storing exactly what you need, leads to caching more easily. And, I've often heard the "recompute everything all the time" described as "simpler" than trying to store things, and I have that intuition as well.
14:14
Or, it is the kind of code that arises out of saying something like "I don't care about performance here yet, I am just going to do the simplest thing and get everything computing properly", and I think it's interesting that there seems to be some connection between this kind of "simple" approach and a path to layering good optimizations over it.
14:14
And more of an allergy to introducing state that needs to be tracked is also probably a good way to keep the space of possible state permutations smaller, which also feels "simpler" to me in a way.
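[Editor's aside: one way to sketch this "query over mutable state" framing, in hypothetical code. Callers only see the query, so the recompute-on-every-call version and a cached version are interchangeable; here a generation counter, bumped on every mutation, decides when the cache is stale.]

```c
#include <stddef.h>

typedef struct {
    int    values[64];
    size_t count;
    int    generation;  // bumped on every mutation

    // cache for the sum query
    int    cached_sum;
    int    cached_gen;  // generation the cache was computed at
} State;

static void state_push(State *s, int v) {
    s->values[s->count++] = v;
    s->generation++;
}

// Query: sum of all values. Semantically this is "recompute everything";
// the cache is a transparent layer that callers never see.
static int query_sum(State *s) {
    if (s->cached_gen != s->generation) {
        int sum = 0;
        for (size_t i = 0; i < s->count; i++) sum += s->values[i];
        s->cached_sum = sum;
        s->cached_gen = s->generation;
    }
    return s->cached_sum;
}
```

(Zero-initializing `State` makes the empty cache valid, since the sum of no values is 0 at generation 0.)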
bvisness May 21, 2021 02:17 PM
This is all interesting to me because it feels like it goes a bit against the advice that is usually given around these circles.
14:17
Or really, against how people often take that advice.
14:18
Like in your case Allen, it sounds like you were doing exactly the usual Handmade Approved Path™️ of just solving a very specific problem very specifically (edited)
14:18
When in fact the best solution still solves a pretty specific problem but in a way that is some ways less specific
14:19
The ultimate result in your case, though, was less code to maintain, a smaller surface area of your program, fewer "codepaths" whatever that might mean
14:20
And amusingly to me it feels like your final solution is not that far off a level of "abstraction" that many people might arrive at if they didn't go through the very specific thing first.
14:20
But, your abstraction works, and a naive attempt might not.
14:20
I realize I'm deviating from the topic of performance here 😬 (edited)
14:21
I need to get back to that
ryanfleury May 21, 2021 02:21 PM
Well Allen made a really important point here that I don't want to miss, which is that something that is easier to maintain implies that it requires less time and less effort to maintain, which implies a larger number of iterations and a more manageable set of tasks to layer optimizations into.
☝️ 1
bvisness May 21, 2021 02:21 PM
ah yeah
14:22
on that note, I suppose we could also explore our experience with systems that are hard to get good performance out of
Allen4th May 21, 2021 02:22 PM
This is it. If I can't get anything else across from that story, I would hope this still makes it. Your capacity for handling increasingly difficult problems is a finite resource.
👍 4
ryanfleury May 21, 2021 02:23 PM
We certainly can't expect humans to evolve past our current capabilities for dealing with complexity, at least any time soon! (And there is no obvious reason to suspect that such evolution would occur, at least in my eyes.) And we do not already have infinite capabilities there, so clearly there is some limit to the complexity we can manage reasonably well. So the hardest problems are about taking all that we know and condensing it into a simpler set of concepts, so that the "complexity ceiling" is a bit higher from where we start. (edited)
bvisness May 21, 2021 02:24 PM
sure, and even something as "simple" as very directly solving a specific problem might be locally more complex and difficult to deal with
Allen4th May 21, 2021 02:26 PM
So I guess my answer to how simplicity and performance can go hand in hand is that both of them are hard-earned refinements. Making something simpler without giving up performance isn't easy to accomplish, but if you can do it, you'll give yourself more room in your complexity budget; with the extra room in that budget you might find it's not so hard to get performance somewhere new.
bvisness (quoted): on that note, I suppose we could also explore our experience with systems that are hard to get good performance out of
ryanfleury May 21, 2021 02:31 PM
Trying to brainstorm about this. Frankly I don't have a whole lot of experience when it comes to this kind of thing, mostly because most of the programming problems I have dealt with in life have been invented by me, so I can conveniently change the constraints, or avoid nasty performance problems more easily. But let's say you're working with, you know, some file format designed by some people somewhere... hypothetically speaking... in some theoretical scenario. You're now locked into those constraints if you hope to be working with files stored in that format. The characteristics I think of here have to do with "distance", between some pattern of data extraction, and how that data is stored. Maybe what I am getting at is how much work goes into decompression.
bvisness May 21, 2021 02:31 PM
Well so I could talk directly about laughably slow things at work if we need to 🙂
ryanfleury May 21, 2021 02:31 PM
So for instance, let's say some VERY HYPOTHETICAL FILE FORMAT you're dealing with happens to be storing a tree of some kind.
ryanfleury May 21, 2021 02:32 PM
And in order to decode, say, the strings at each node in the tree, you have to do more-or-less all of the work required by a full decode.
14:32
In other words, there's a larger "distance" between the compressed form, and the decompression of just the strings and the hierarchy, but nothing else.
14:33
That makes it a lot harder to do something simpler, and therefore something that requires less work and is more performant, because the data was not compressed with this particular use-case in mind, it seems.
14:33
So now, to get good performance for the problem of "at light speed, decode this file and just do a real quick gather of all the strings and which things they relate to in the tree, so we can do a very quick search over massive files in this format", it's a whole lot more complicated.
bvisness (quoted): Well so I could talk directly about laughably slow things at work if we need to 🙂
ryanfleury May 21, 2021 02:34 PM
I am interested in this too 🙂
bvisness May 21, 2021 02:35 PM
sure, so in your case there's just a lot of friction in the way of doing a new thing that the designers didn't intend
14:35
I guess what I'd ask, though, is how much of that is because of "simplicity"
ryanfleury May 21, 2021 02:36 PM
Well, I suppose I was thinking of it as compression. The more you compress, the more distance you add between a given decompression and the compressed data. Compression is also not free, and in that sense is "less simple". It also requires a larger number of codepaths on either side, because it's adding more decompression code. (edited)
bvisness May 21, 2021 02:38 PM
Unfortunately I don't know enough about HYPOTHETICAL FILE FORMAT to really probe much further
14:38
(Ryan and Allen work at RAD on the new debugger, so you can imagine what we're talking about)
ryanfleury May 21, 2021 02:38 PM
I am personally referring to DWARF, yes
Allen4th May 21, 2021 02:39 PM
A part of what makes this an interesting example would start with picking apart how data pipelines influence performance.
bvisness May 21, 2021 02:40 PM
sure, sounds interesting
Allen4th May 21, 2021 02:41 PM
I'll give the quick outline, but here's the additional reading if you really want to understand: https://fgiesen.wordpress.com/2018/03/05/a-whirlwind-introduction-to-dataflow-graphs/
👍 1
14:42
The performance issue with DWARF is that it packs things together in such a way that the only way to "discover" things is by fully unpacking everything in sequence.
14:42
It's like if you store things in a linked list.
14:43
With a linked list you can't start doing work on node N until you've discovered node N - 1
14:43
With DWARF it's not links you need, but sizes of information that is only determined by decoding the compressed representation that @ryanfleury was talking about.
14:44
Either way the concept here is that there is some work which is not negligible to do, and it not only "blocks" you from getting the info for this element, but it also "blocks" you from discovering the next element.
14:45
Therefore the only way to find "all strings" as Ryan was saying is to do all of the work.
14:46
That's the performance issue. Is this a lack of simplicity? I haven't thought about that as much. You could certainly reduce the number of cases you have to unpack in DWARF and make it simpler, but it would still have this problem.
14:46
But what is pretty clear I think, is that you could get that performance back without increasing the complexity.
14:46
So it is at least another example that Complexity + RunningTime is not a conserved quantity. There are win-wins. (edited)
👍 1
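[Editor's aside: a toy model of this "discovery blocks discovery" problem. This is a deliberately simplified sketch, not real DWARF: records are packed back-to-back and each record's size is only known by decoding it, so finding record N forces decoding records 0..N-1, whereas a side table of offsets makes each lookup independent.]

```c
#include <stddef.h>
#include <stdint.h>

// Each record begins with one byte giving its total size (header included);
// payload bytes follow. The size lives *inside* the record, so you cannot
// locate record N without decoding every record before it.
static const uint8_t *find_record(const uint8_t *buf, size_t len, size_t n) {
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        if (off >= len || buf[off] == 0) return NULL;
        off += buf[off];  // serial dependency: each step needs the last decode
    }
    return (off < len) ? buf + off : NULL;
}

// With a precomputed offset table (an "acceleration" layer), the same lookup
// is one array read, and different records can be decoded independently.
static const uint8_t *find_record_indexed(const uint8_t *buf,
                                          const size_t *offsets, size_t n) {
    return buf + offsets[n];
}
```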
ryanfleury May 21, 2021 02:50 PM
I am not exactly sure right now how "coupling" relates to all of this, but I've found myself thinking about it as well (when describing how codepaths can be orthogonal, and how that relates to expected sets of features). What is sort of interesting is that, by having a dependency between one full task and another full task, you introduce a "dependency", or... "couple" those two tasks. So, in one use case, you actually are decoding everything. And the coupling for that is probably fine (although for pipelining reasons you are still pessimized to a degree). But in another, when you aren't, the "coupling" or "dependency chain" hurts you more, and ideally things were broken down further into more "orthogonal pieces". For example, within this example: one way of encoding hierarchies + strings, plus another layer describing the rest of the "full information" of each node (and maybe each "header" has a pointer into there). (edited)
14:52
That matters a lot less if you are just doing a full-parse, but if you want the task of getting a "breadth-first-scan" of everything to be in the "space of features you'd like", then ideally you have another "orthogonal codepath" that can be used for "header information". I don't know if that makes any sense, I am still trying to grapple with this!
Allen4th May 21, 2021 02:54 PM
See, I wouldn't take it that way. And this is where I think orthogonality isn't doing you any favors. Doing more shouldn't mean more code paths by default. (edited)
14:55
In this case, the thing that would solve the scan would also accelerate the full parse in one system.
14:56
The problem is you'd have to be willing to shake up the whole DWARF thing, and end up with something incompatible. A quick fix couldn't do both.
14:57
(Take all of the claims about DWARF that I make with an implicit "as far as I know")
ryanfleury May 21, 2021 02:58 PM
Well, when I say "another orthogonal codepath", I also mean subtracting from the existing codepath, with the full thing being a composition of both of them. So right now, you have one codepath that is like:
unparsed file --(full parse)--> decoded info
And, for instance, with an acceleration layer on top, you "decouple" certain pieces from (full parse), the full parse now being a composition of (header parse) -> (rest of info parse)
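[Editor's aside: in data terms, that decomposition might look something like this. Purely illustrative, not any real format: a fixed-size header region carries just the strings and hierarchy, with an offset into a separate body region that the fast string scan never touches; the full parse is the composition of the header pass and the body pass.]

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical on-disk node header: enough for the fast scan by itself.
typedef struct {
    char     name[16];    // string, available without decoding the body
    int32_t  parent;      // hierarchy; -1 for the root
    uint32_t body_offset; // where the node's full data lives; scan ignores it
} NodeHeader;

// (header parse): gather all names without touching the body region at all.
static size_t gather_names(const NodeHeader *headers, size_t count,
                           const char **out_names) {
    for (size_t i = 0; i < count; i++) out_names[i] = headers[i].name;
    return count;
}

// (rest of info parse) would follow body_offset only for the nodes you care
// about, so a caller can stop short of it entirely.
```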
bvisness May 21, 2021 02:59 PM
I think I am getting lost in the mines of moria, as it were
ryanfleury May 21, 2021 02:59 PM
But what that means is that you can stop short of (rest of info).
bvisness (quoted): I think I am getting lost in the mines of moria, as it were
ryanfleury May 21, 2021 02:59 PM
Yeah maybe this is too in the weeds :P
Allen4th May 21, 2021 03:00 PM
Yeah I think this is probably a bit out there. Hard to talk concretely about DWARF without first spending a really long time on the introduction to the problems.
bvisness May 21, 2021 03:00 PM
It sounds to me, though, like the complexity of DWARF and your specific performance issues with it aren't actually too closely related
15:01
at least not if Allen thinks that you could get performance gains without really changing how "simple" the thing is
Allen4th (quoted): Yeah I think this is probably a bit out there. Hard to talk concretely about DWARF without first spending a really long time on the introduction to the problems.
ryanfleury May 21, 2021 03:01 PM
Yeah, probably not the best example, I just brought it up because that is the most notable example I've had of "systems that seem complicated that have performance problems", and I was trying to brainstorm. All of my personal programming work has not really dealt with this nearly as much. (edited)
Allen4th May 21, 2021 03:01 PM
What I mean to say is that you don't have to increase complexity to get performance gains.
15:02
You could sort of just shuffle the complexity around, add some, take some away. But I think this statement raises an interesting general point:
the complexity of DWARF and your specific performance issues with it aren't actually too closely related
15:03
It's hard for me to put my finger on exactly the issue, but it's something like "arguing from lack of imagination" (not meant as an insult)
15:05
What I'm trying to get at is that if there is a relationship between these, we wouldn't necessarily see it now.
15:05
We would tend to understand it in light of a newer simpler format that completely sidesteps the performance problem.
Avatar
bvisness May 21, 2021 03:05 PM
sure, I can't say for sure that they are unrelated
15:06
but at least for now it's hard to say if they are
15:06
here's the sort of question I have in mind though
Avatar
Allen4th May 21, 2021 03:06 PM
Agreed.
Avatar
ryanfleury May 21, 2021 03:06 PM
Yeah I agree. Sorry to derail it with that, I was just searching my head for problems that might be related.
Avatar
bvisness May 21, 2021 03:07 PM
The question: For whatever problem you're working on, what is stopping you from making it more performant?
15:07
If you can tell that extra work is being done, what is stopping you from changing that
15:08
maybe in the DWARF case the answer is "the format isn't well-designed for my use case", and that's maybe related to simplicity, maybe not
15:08
I can tell you though that for some of what I've had to deal with, it's very related to simplicity
Avatar
ryanfleury May 21, 2021 03:10 PM
This is why I think "orthogonality" and "coupling" are actually important. If there is only one "axis" that produces some set of features, then if you wanted a subset of those features produced by a subset of that "code axis", you're out of luck. But if the format was designed with a few orthogonal pieces of information (requiring a few orthogonal codepaths), you can go along each "axis" independently, thus avoiding work implied by traversing along the other axes. And, you can still get to the full thing, by traversing all N orthogonal axes (whereas in the 1-axis case, that is the only possible path). (edited)
15:10
But I will set that aside (edited)
Avatar
Avatar
bvisness
I can tell you though that for some of what I've had to deal with, it's very related to simplicity
ryanfleury May 21, 2021 03:10 PM
I am interested to hear more on this
Avatar
bvisness May 21, 2021 03:11 PM
Really it just goes back to what Allen was talking about in his 4coder example
15:11
how difficult is it to a) conceptualize the problem, and b) actually apply a change
15:12
So if I find that something is slow, it might be hard to even find out what is going on (often the case in larger codebases made by many people over time, and especially in heavily OOP stuff (although I don't think OOP is exactly to blame))
15:13
But even when I do know what is going on, it might be really difficult to actually change something, either because it's huge, or because it's hard to change things without breaking unrelated things.
15:13
And both of these are issues of complexity.
Avatar
ryanfleury May 21, 2021 03:14 PM
Okay I know I said I'd set this aside, and I promise to stop, but hard to change things without breaking unrelated things sounds like N things that are not orthogonal!
15:14
(Okay now I am done, continue)
Avatar
bvisness May 21, 2021 03:15 PM
well in this case it's more like "hey let's just reuse this function because it seems convenient"
15:16
"oops I guess our use case is kinda different let's just special-case it"
15:16
etc.
Avatar
ryanfleury May 21, 2021 03:16 PM
It reminds me a lot of Casey's lecture he gave recently about the grass planting algorithm in The Witness, did you guys see that lecture? I don't think he uploaded it anywhere, unfortunately... (edited)
Avatar
bvisness May 21, 2021 03:17 PM
But as in the case Allen talked about, there's value to reusing things, especially from a maintenance perspective. I've just been going through the app adding a new permissions check to basically everything because it wasn't well-centralized before.
15:17
Nope, didn't see it unfortunately
Avatar
ryanfleury May 21, 2021 03:17 PM
Well, I'll give (my recollection of) the gist of what he was getting at.
15:19
The high level problem was that grass planting was taking a really long time, it took something like 30 seconds to place a "plant radius" (or whatever) and have the grass fully planted. And one of the big performance problems, I guess, was actually the raycasting, which was being used to project a ray from something in the up-direction onto the ground, so that it was easy to find out which up-axis position each grass thing needed to be planted at.
15:20
And digging into what was happening in that raycast function, it was a very generic codepath that did things like gather the list of entities (allocation) that could possibly be hit, test the ray against all of them, etc. etc.
15:21
And the way to make this fast for the problem at hand (grass planting) required decoupling a few of the pieces that made up the generic raycast.
15:22
So at first it was like:
```
// Z is up/down
for each planting X/Y position in this square {
    get ray from -up- to the ground
    Array(Entity) entities = find all entities that could possibly be hit (broad phase)
    for each entity in entities {
        test ray against entity (narrow phase)
        if hit {
            set planting Z as the ray endpoint's Z
            break;
        }
    }
}
```
15:24
But you can actually shift a bunch of work out of this loop, and have it only occur once---notably, finding the set of entities that can be intersected with. You aren't just casting a bunch of generic rays, you're actually casting a very specific set of rays, in one direction, against a uniform set of entities, in one particular place in the world, and you don't even necessarily want to care about all possible entities. Maybe you just want to care about the ground entities, or something like that. (edited)
15:25
So I think this is sort of a similar example of hey let's just reuse this function because it seems convenient
Avatar
bvisness May 21, 2021 03:26 PM
yeah, sure
15:27
That's pretty easy to keep in your head though, and easy to directly make a change to. Pretty simple overall.
Avatar
ryanfleury May 21, 2021 03:28 PM
(Well, the actual code that was really being hit originally was not very simple, there was a ton of stuff happening across several different codepaths, I am boiling it down quite a bit, from my recollection... But in effect the boiled down loop version is where the performance issue lies, and it required collapsing everything to notice) (edited)
Avatar
bvisness May 21, 2021 03:28 PM
Even so, it's direct, to use Demetri's term
15:30
What so often ends up happening at work, and I think elsewhere, is that there are many layers involved in the implementation, each with its own constraints and expectations, and they just don't play well with each other
15:31
and it's hard to bypass these layers (and doing so incurs a maintenance penalty down the line)
15:31
and it's hard to change what the layers expect or how many layers there are (because they're used everywhere)
15:33
and maybe each layer seems mostly fine in isolation (here's your ORM, and here's your models that expect to use the ORM features, and here's your "services" that expect access to all the models, and here's your "controllers" that expect all the services, etc.)
15:33
a lot of the time performance problems arise at the boundaries, or sometimes just because of the boundaries
15:34
maybe you need to fetch way more from the database than you need, because the next layer up can't handle missing data
15:34
Or maybe you have a diamond problem where something at the "service" layer can't get optimizations from lower down because they have to go through two different models along the way.
15:35
Even so, I'm not sure I would say this is because of the complexity; you could write equally wasteful stuff very directly.
15:35
But the complexity is what makes it hard to fix.
15:36
So that's kind of my whole thesis on the issue. I don't think simplicity and performance are directly related much at all. But I think simplicity is critical if you want to actually have any hope of addressing performance problems. (edited)
πŸ‘ 1
Avatar
ryanfleury May 21, 2021 03:38 PM
Yeah that all makes sense, and I agree with the thesis to a degree, although I think what I would say is that simplicity can often impact constraints (by removing some, for example), and fewer constraints can mean more performant code.
15:38
For example, the malloc and free vs. memory arenas question.
15:39
The issue here, in my estimation, is what I mentioned in #fishbowl-audience, which is basically that "simplicity" is a bundle of things and not really that helpful of a word frankly, so it doesn't make a lot of sense to me to say it is or is not related---but rather that the relationship is complicated, because it's comprised of many pieces.
15:41
On the malloc/free vs. arenas question, I think the "simple" part of arenas is that which removes constraints, and this is reflected in an API for an arena, and the implementation. The implementation can be super small before it is useful (and in my experience, remains nearly trivial). Usage of an arena is also simple, where you just push things onto it, and this is because the constraint of "the user must be able to free anything they have allocated individually", which is present in the design of malloc and free, has been removed.
15:43
And in that sense, the problem statement is simpler because there are fewer constraints, which leads to the implementation being simpler, which leads to the API being simpler, which ultimately means less work happening, which ultimately means better performance. (edited)
Avatar
ryanfleury May 21, 2021 03:53 PM
Alright looks like we've about exhausted #fishbowl talk, maybe time to digest the conversation! Thanks everyone for participating, was really interesting. 👋
Avatar
bvisness May 21, 2021 03:53 PM
Yeah this was really good to dig into!
Avatar
ryanfleury May 21, 2021 03:55 PM
[End of Fishbowl] The beginning of this conversation has been pinned, so if you'd like to read it from the start, go there! See you next time.
👋 1