Jonathan Blow array

spaskeasm

#23360

September 4, 2020

I have been reading braid cleanup articles from Jonh's blog:
http://number-none.com/blow/blog/...6/07/17/braid_code_cleanup_2.html

and noticed Array<char*> type.

Does anybody know the general implementation of Jonh's Array. Is like std::vector that reallocates when it get's full or something else?

Edited by spaskeasm on September 4, 2020, 8:42am Reason: Initial post

Abner Coimbre

#23362

September 4, 2020

That's likely not public, but chances are very high this is a simpler std::vector. You can find similar alternatives out there:

1. Sean's stretchy buffer

2. Write your own stretchy buffer.

3. For a structured interface, see Hanson's Array ADT.

Once you've tried a couple, you'll be able to better see how to implement stuff like Array<char*>, or decide to stick with a stretchy buffer (which is what I do.)

spaskeasm
reallocates when it get's full

Yeah. Most of these implementations typically reallocate and double the amount of space internally available. That's not set in stone though! To learn about all possibilities, custom allocators are a superpower. See Ginger Bill's lovely articles.

Edited by Abner Coimbre on September 4, 2020, 7:02pm Reason: Typo.

spaskeasm

#23363

September 4, 2020

Abner
1. Sean's stretchy buffer

2. Write your own stretchy buffer.

3. For a structured interface, see Hanson's Array ADT.

I'll take a closer look at that.

Looking for something to replace all the things that I don't need from std::vector.

The problems I have with std::vector:
1) Throws std::bad_alloc when it can't allocate more memory. I don't use exceptions, so this terminates.
2) Always allocates new buffer even though realloc can extend the current memory
3) Has a bunch of stuff to handle std::move, destructors, constructors... I don't need any of that since all my types are POD
4) Tanks compile times
$ echo "#include <vector>" | gcc-10 -std=c++20 -P -E -x c++ - | wc -l
$ 19831

$ echo "#include <vector>" | gcc-10 -std=c++17 -P -E -x c++ - | wc -l
$ 10262

Maybe you have something to add to these?

Abner Coimbre

#23364

September 4, 2020

1) Throws std::bad_alloc when it can't allocate more memory. I don't use exceptions, so this terminates.

Yeah, Hanson's Array version does that and it's not even C++! (It raises a memory "exception" by setting setjmp / longjmp.)

2) Always allocates new buffer even though realloc can extend the current memory

Indeed. The stretchy buffer does a realloc:

"Unlike C++ vector<>, the stretchy_buffer has the same semantics as an object that you manually malloc and realloc."
-Sean Barrett

3) Has a bunch of stuff to handle std::move, destructors, constructors... I don't need any of that since all my types are POD
4) Tanks compile times

Man, I hear you. Modern C++ tends to very casually bring in hidden complexity. There's a well-known graphics API tutorial that goes into de-duplicating vertices from a mesh by:

(a) Importing an unordered map
(b) Overloading several operators, and finally,
(c) Designing a template specialization for a custom hash function.

which lets it store unique vertices through a vertex data type acting as an index to the unordered map:

// Loop over mesh shapes
// ...

if (uniqueVertices.count(vertex) == 0) {
    uniqueVertices[vertex] = static_cast<uint32_t>(vertices.size());
    vertices.push_back(vertex);
}

// ...
// End loop

Meanwhile, I went with vanilla C: a loop and memory compare over structs. Runtime performance was rather excellent. Over time, simpler decisions made my program compile faster and the codebase is more readable.

I mean, we'll need to introduce complexity as needed, like if / when my models grow in the million-vertices range. That's the thing though: the complete tutorial series never once went into this range and the complexity detracted from teaching us the API.

However (!) When we switch to vanilla C a funny phenomenon is people immediately expect more out of us? For example, the tutorial from before uses TinyOBJ to load a 3D model, like so:

1
2
3

if (!tinyobj::LoadObj(&attrib, &shapes, &materials, &warn, &err, MODEL_PATH.c_str())) {
    throw std::runtime_error(warn + err);
}

Contrast it with the vanilla C version of TinyOBJ:

1
2
3

int result = tinyobj_parse_obj(&attrib, &shapes, &num_shapes, &materials, &num_materials, MODEL_PATH, get_file_data, flags);
if (result != TINYOBJ_SUCCESS)
    die("failed to parse obj file: '%s'!\n", MODEL_PATH);

Notice in particular the new get_file_data and flags. The first is a callback -- the C version requires us to register one to read model data without any parsing or modification. My implementation does an mmap over the file, which is OS-dependent.

The flags parameter is usually set to TINYOBJ_FLAG_TRIANGULATE to treat our shape's faces as triangles. I believe the C++ version always defaults to triangulation so LoadObj() has the value as a default argument.

Edited by Abner Coimbre on September 5, 2020, 2:59am Reason: Minor typos / fixes.

spaskeasm

#23365

September 5, 2020

Abner

1
2
3

if (!tinyobj::LoadObj(&attrib, &shapes, &materials, &warn, &err, MODEL_PATH.c_str())) {
    throw std::runtime_error(warn + err);
}

Dude, is this from that Vulkan tutorial?

I am going to be a TA in the Computer Graphics course in the fall semester.
Graduating in a few days.

I was looking at different tutorials online to get the sense of what I can teach, and when I saw that Vulkan tutorial decided it was not going to be that.

I am probably going to stick with OpenGL, because it seems to be the best option right now to convey the core principles behind graphics.

The original question is part of getting all the data structures that we are going to need in the course to be simple to read, understand, use, and have reasonable runtime performance. Jonh showed part of his hashmap implementation in the stream with Sean Barett, and I was wondering what he does for Array<>.

Any suggestion in terms of material to cover, useful data structures, anything that comes to mind is appreciated. Thanks :)

Edited by spaskeasm on September 5, 2020, 9:41am Reason: typos, grammar

Abner Coimbre

#23368

September 6, 2020

Congrats! You mean you're going into a graduate degree now?

spaskeasm
Dude, is this from that Vulkan tutorial?

You guessed it.

spaskeasm
I was looking at different tutorials online to get the sense of what I can teach, and when I saw that Vulkan tutorial decided it was not going to be that.

I agree it's too overwhelming for an introductory graphics course.

Things could change if there were decent helper libraries in the API's native language (C99). Unfortunately the market for that is small as Khronos' main "customers" are engine people who have rendering layers custom to them.

spaskeasm
I am probably going to stick with OpenGL, because it seems to be the best option right now to convey the core principles behind graphics.

For sure. OpenGL isn't going anywhere anytime soon, but I'd encourage you to stick to the modern core profile, as that will make a transition to Vulkan that much easier (assuming part of your teaching goal is priming students for Vulkan.)

spaskeasm
Any suggestion in terms of material to cover, useful data structures, anything that comes to mind is appreciated. Thanks :)

I'd be surprised if you don't know about Joey's LearnOpenGL. His C++ style is more sane to me too. Go with that, and maybe design your own example programs?

Edited by Abner Coimbre on September 6, 2020, 7:55am Reason: Typo.

spaskeasm

#23370

September 6, 2020

Abner
Congrats! You mean you're going into a graduate degree now?

Thank you! One more exam left. (Computational Topology, my favorite this semester! :))
I live in Europe so we have Bachelor -> Masters -> PhD. I am going into my master's degree.

Abner

For sure. OpenGL isn't going anywhere anytime soon, but I'd encourage you to stick to the modern core profile, as that will make a transition to Vulkan that much easier (assuming part of your teaching goal is priming students for Vulkan.)

My plan was to go through most of learnopeng lessons with custom made examples and ImGui so that students
can easily experiment with tweaking different parameters.
Lab exam is project-based, and there is not enough time to go through everything so some part of Advanced OpenGL will be on students to figure out for themselves and apply. I think it's a good way to learn. Someone shows you the basics and how it all works, and you then figure out some other part of it.

Abner Coimbre

#23371

September 7, 2020

Sounds like a reasonable plan.

Oliver Marsh

#23388

September 12, 2020

I was wondering what's the best way to avoid storing pointers as references into the array, as they point to wrong memory if the array reallocs?

Just be careful or avoid storing pointer references?

Do c# & java keep the reference updated in the case the array reallocs? Is this a smart pointer?

Thankyou

Mārtiņš Možeiko

#23389

September 12, 2020

You can store array as pointer to struct { T* data; int length; ...} - this way even if data changes, your pointer to whole struct doesn't. That's pretty much what managed languages like Java and C# do.

Edited by Mārtiņš Možeiko on September 12, 2020, 7:49am

Miles

#23393

September 12, 2020

OliverMarsh
Do c# & java keep the reference updated in the case the array reallocs? Is this a smart pointer?

I don't know about C#, but in Java, you can only ever have an array of pointers/references (or of primitives, but you can't take pointers/references to those). Thus every object referenced in an array is individually allocated on the heap and therefore never moves, so even if the array gets reallocated, the objects referenced by it don't. That's one of the strengths of Java's restrictive memory management - if you can only take references to individually heap-allocated objects, and objects are only ever freed by garbage collection, reference invalidation becomes impossible, at least until you start dealing with JNI and such.

Dawoodoz

#23395

September 12, 2020

In VB .NET, reallocating an array while one of the elements is passed by reference gives a run-time exception to allow holding value elements and still pass them by reference between reallocation. A more functional approach is to separate between value and reference types and only pass value types by value so that the change has to be assigned again (essentially banning pass by reference for value elements).

Mārtiņš Možeiko

#23404

September 12, 2020

notnullnotvoid
Thus every object referenced in an array is individually allocated on the heap and therefore never moves, so even if the array gets reallocated, the objects referenced by it don't.

That's only true for arrays with reference types (class'es). For primitive types like int or float the array will have one big contiguous memory, just like in C.

Edited by Mārtiņš Možeiko on September 12, 2020, 9:01pm

Miles

#23417

September 13, 2020

mmozeiko
That's only true for arrays with reference types (class'es). For primitive types like int or float the array will have one big contiguous memory, just like in C.

Yes, I literally said that immediately before the part you quoted:

notnullnotvoid
in Java, you can only ever have an array of pointers/references (or of primitives, but you can't take pointers/references to those)