Table-driven C++?

Tables in C++ has always been quite ugly and error-prone with separate enums and arrays. Matching these can become difficult when more rows are added.

Making a code generator might be overkill for something this simple, because it affects the whole build system and usually ends up becoming a new compiler taking more time to maintain than the project itself. Generating lists from macros looks worse than redundancy. Plain arrays and lists cannot handle opaque handles and indices, because side-effects are forbidden in global construction. Dynamically loaded files don't allow naming core members with identifiers (needs slow run-time search on every access). Using a function for each property is slow and makes it hard to add another table row across each attribute.

Maybe there's a better solution for simple non-redundant tables in C++?

Edited by Dawoodoz on Reason: Initial post
What is "table" in C++?
mmozeiko
What is "table" in C++?


A collection of data entries. For table-driven programming, both rows and columns should be accessible by identifiers. Usually an array where each index has a named value in an enumeration.

Some people just use an enum and start hard-coding if statements in the code while hoping that the enumeration never grows. "if (t == a || t == c..." Which later breaks when more cases are added. Others gather those derived attributes in functions. "If (property(t))" But even here one might forget to include new input values. Some use classes. "if (t->property())" But allocating heap memory is a lot more expensive than passing an integer around. Other developers use table-driven code generators creating the enumeration and array from a 2D text notation which allow seeing all attributes in one place, but this comes at a high maintenance cost on the build system. I tried writing a programming language for this, but then most of the C ++ libraries became inaccessible and everything had to be rebuilt. So a cleaner way to solve this problem in standard C++ would be neat.

I suspect that constexpr and template programming could be a part of a re-usable table solution.

Edited by Dawoodoz on
I guess you're talking about something like this ("ID" is kind of the key of the record and you access the other fields as offsets):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
enum ANIMATION_DATA
{
    anim_bub_idle = 0,
    anim_bub_idle_bubble = 4,
    anim_bub_jump = 8,
    anim_bub_dead = 12,
    anim_bub_run = 16,
    anim_bub_moving_bubble = 20,
    
    anim_bubble_spawn = 24,
    anim_bubble_normal = 28,
    anim_bubble_bounce = 32,
    anim_bubble_alert = 36,
    anim_bubble_alert_bounce = 40,
    anim_bubble_pop = 44,
    
    anim_banebou_normal = 48,
    anim_banebou_captured = 52,
    anim_banebou_captured_alert1 = 56,
    anim_banebou_captured_alert2 = 60,
    anim_banebou_angry = 64,
    
    anim_invader_normal = 68,
    anim_invader_angry = 72,
    anim_invader_captured = 76,
    anim_invader_captured_alert1 = 80,
    anim_invader_captured_alert2 = 84,
};

i32 animation_list[22*4] = // ID, first, last, duration
{
    0, 0, 1, 20, // bub_idle
    4, 2, 3, 10, // bub_idle_bubble
    8, 4, 7, 8, // bub_jump
    12, 8, 11, 20, // bub_dead
    16, 12, 18, 6, // bub_run
    20, 19, 25, 50, // bub_moving_bubble
    
    24, 9, 15, 6, // bubble_spawn
    28, 0, 1, 10, // bubble_normal
    32, 1, 2, 20, // bubble_bounce
    36, 3, 4, 30, // bubble_alert
    40, 5, 6, 60, // bubble_alert_bounce
    44, 7, 8, 40, // bubble_pop
    
    48, 0, 3, 20, // banebou_normal
    52, 4, 7, 20, // banebou_captured
    56, 8, 11, 20, // banebou_captured_alert1
    60, 12, 15, 20, // banebou_captured_alert2
    64, 16, 19, 20, // banebou_angry
    
    68, 0, 1, 20, // invader_normal
    72, 2, 3, 20, // invader_angry
    76, 4, 7, 20, // invader_captured
    80, 8, 11, 20, // invader_captured_alert1
    84, 12, 15, 20, // invader_captured_alert2
};


If that's the case, yeah it's difficult to maintain. But I'm not aware of another solution and would like to know myself if there's one.
immortalx
If that's the case, yeah it's difficult to maintain. But I'm not aware of another solution and would like to know myself if there's one.


In my table-driven language (Basic inspired), you simply write like this using pipes to separate columns and commas to add identifier aliases. An animation could then have first and last aliases for each sequence. But introducing tables to the C++ standard would probably never happen due to C++ being a customizable language.

If one could overload the dot operator (maybe a future C++ version) to take a string with the attribute name in compile time, we could let the enum type refer to its table using a compile-time search for the identifier by inheriting from a table meta type. Then one could write "myIndex.attribute" to get "table[myIndex].attribute" or "myIndexA.indexB.attribute" to get "tableB[tableA[myIndexA].indexB].attribute" like in my table-driven language.

Edited by Dawoodoz on
Oh, I see it supports multiple types. That's pretty cool! Would you mind sharing some more info about your language?
immortalx
Oh, I see it supports multiple types. That's pretty cool! Would you mind sharing some more info about your language?


Yes, and you use function pointers instead of virtual calls so that data and actions are separate. The name of a function is used as both the type and default value to prevent calling null.
https://dawoodoz.com/steamroller.html
http://uu.diva-portal.org/smash/get/diva2:752088/FULLTEXT01
I'm thinking about making a clean rewrite of the whole compiler before giving out a first 1.0, so that it doesn't end up like Python. The initial prototype is flashy but requires Direct3D 11 just to start, so using my software renderer and being cross-platform would probably be better for a first release.

I can send you the source code if you want to play around with it. It's just a proof of concept with a messy file format and major need for refactoring, but you can compile to C99.
That seems like a very well-thought and massive project. Hats off to you!
I'm nowhere near the level of yours or other people that frequent this forum to understand source code of a big project (I even have difficulty understanding mine :p), so I'll better be waiting for a release.
I've been a vb6.0 user so I love the syntax! The guys over at vbforums would definitely appreciate this (I certainly do), as there have been many attempts at a vb replacement.
I haven't seen a mention of a technique called X-macros, which I use frequently, and which covers most of my table needs.

What you do is you define a macro that makes itself a number of calls to functions that are not defined.

Then you use this macro at various places, by first defining the undefined functions as macros and then simply writing the macro's name.

As an example, here's an excerpt from my OpenGL loader code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
#define OpenGL_Procs \
OpenGL_Proc( PFNGLATTACHSHADERPROC,             glAttachShader ) \
OpenGL_Proc( PFNGLBINDBUFFERPROC,               glBindBuffer ) \
OpenGL_Proc( PFNGLBINDVERTEXARRAYPROC,          glBindVertexArray ) \
OpenGL_Proc( PFNGLBUFFERDATAPROC,               glBufferData ) \
OpenGL_Proc( PFNGLCOMPILESHADERPROC,            glCompileShader ) \
OpenGL_Proc( PFNGLCREATEPROGRAMPROC,            glCreateProgram ) \
OpenGL_Proc( PFNGLCREATESHADERPROC,             glCreateShader ) \

#define OpenGL_Proc(type, name) static type name;
OpenGL_Procs
#undef OpenGL_Proc

static const struct { const char *name; void (**funcptr)(void); } opengl_procs_to_load[] = {
#define OpenGL_Proc(type, name) { # name, (void(**)(void))&name },
OpenGL_Procs
#undef OpenGL_Procs
};


It's crufty, admittedly, but there's very little boilerplate involved and no extra build system integration is needed.

Edited by jstimpfle on
jstimpfle
I haven't seen a mention of a technique called X-macros, which I use frequently, and which covers most of my table needs.

What you do is you define a macro that makes itself a number of calls to functions that are not defined.

Then you use this macro at various places, by first defining the undefined functions as macros and then simply writing the macro's name.

As an example, here's an excerpt from my OpenGL loader code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
#define OpenGL_Procs \
OpenGL_Proc( PFNGLATTACHSHADERPROC,             glAttachShader ) \
OpenGL_Proc( PFNGLBINDBUFFERPROC,               glBindBuffer ) \
OpenGL_Proc( PFNGLBINDVERTEXARRAYPROC,          glBindVertexArray ) \
OpenGL_Proc( PFNGLBUFFERDATAPROC,               glBufferData ) \
OpenGL_Proc( PFNGLCOMPILESHADERPROC,            glCompileShader ) \
OpenGL_Proc( PFNGLCREATEPROGRAMPROC,            glCreateProgram ) \
OpenGL_Proc( PFNGLCREATESHADERPROC,             glCreateShader ) \

#define OpenGL_Proc(type, name) static type name;
OpenGL_Procs
#undef OpenGL_Proc

static const struct { const char *name; void (**funcptr)(void); } opengl_procs_to_load[] = {
#define OpenGL_Proc(type, name) { # name, (void(**)(void))&name },
OpenGL_Procs
#undef OpenGL_Procs
};


It's crufty, admittedly, but there's very little boilerplate involved and no extra build system integration is needed.


I used macro based lists for defining the DLL interface of my Direct3D 11 engine, but other programmers were overwhelmed by trying to figure out what it meant.

Maybe I can make a more powerful pre-processor for C++ that allow extending existing languages with new constructs such as tables defined in a separate parsing language with type safety and error messages. Then it would be a little more added value to justify the cost of setting it up.
Since C99, you can include a "designator" in array initialization literials. language reference

Missing entries will be zero initialized.
In the example below, OrchInputCommand_hint_box_strings[OrchInputCommand_nav] == NULL
Its a little verbose using the designator but it also self documents at the same time.
This also works for nested/multidimensional arrays.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
typedef uint8_t OrchInputCommand;
#define OrchInputCommand_nav 0
#define OrchInputCommand_replace 1
#define OrchInputCommand_insert 2
#define OrchInputCommand_move 3
#define OrchInputCommand_delete 4
#define OrchInputCommand_modify 5
#define OrchInputCommand_undo 6
#define OrchInputCommand_redo 7
#define OrchInputCommand_locate 8
#define OrchInputCommand_validate 9
#define OrchInputCommand_save 10
#define OrchInputCommand_close 11
#define OrchInputCommand_COUNT 12
extern char* OrchInputCommand_hint_box_strings[];

char* OrchInputCommand_hint_box_strings[] = {
    [OrchInputCommand_replace] = "replace",
    [OrchInputCommand_insert] = "insert",
    [OrchInputCommand_move] = "move",
    [OrchInputCommand_delete] = "delete",
    [OrchInputCommand_modify] = "modify",
    [OrchInputCommand_undo] = "undo",
    [OrchInputCommand_redo] = "redo",
    [OrchInputCommand_locate] = "locate",
    [OrchInputCommand_validate] = "validate",
    [OrchInputCommand_save] = "save",
    [OrchInputCommand_close] = "close",
};

Edited by HRose on
HRose
Since C99, you can include a "designator" in array initialization literials.


Nice! That takes care of synchronizing indices with data when you erase an element in the middle of a long list, and you don't have to run it as side-effects from main when it's all inside of an array initialization.
This should take care of a few more cases by replacing the enum with sparse indices that are generated from 12-character names at compile time using ID(name) as the index. The values can also be stored in an enum for the often used cases where typos should be avoided, but new enums can be added for dynamic tables. Not sure how to get safety against bad input, but this at least works for the cases where functions are used instead of a dense array.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
// Pre-conditions:
//   name contains at most 12 characters before the null terminator
// Codes:
//   1..26 is a..z or A..Z (case insensitive)
//   _ is 27
constexpr uint64_t getSparseIndex(const char* name) {
	uint64_t result = 0U;
	for (uint64_t i = 0U; i < 12; i++) {
		char c = name[i];
		if (c == '\0') {
			return result;
		} else if (c >= 'a' && c <= 'z') {
			result = (result << 5) | (c - 'a' + 1);
		} else if (c >= 'A' && c <= 'Z') {
			result = (result << 5) | (c - 'A' + 1);
		} else if (c == '_') {
			result = (result << 5) | 27U;
		}
	}
	return result;
}
#define ID(NAME) getSparseIndex(#NAME)

Printing the indices is just the inverse in whatever format your library is using.

Edited by Dawoodoz on