Entity system techniques

Hi! I have a nooby question about entity systems. Let's say I have a common entity struct to represent my various entities. How should I store data that only certain entity types will use?

A simple case is a game like Pac-Man: the pellets, the ghosts, and Pac-Man himself all share similiar data for physics, animation, etc. But one must also store data relevant only to ghosts, like who they are (e.g. Blinky, Pinky, etc.) and whether or not they are in that blue vulnerable state. Pac-Man and the pellets obviously don't need this data, so that's why I'm curious about this.

Some solutions:

1. Just add the ghost-specific data to the common entity struct regardless
  • Not much to say about this other than it works and is not very memory-aware

2. Use inheritance to derive specialized entity structs from the common entity struct
  • I cannot do this in C :(

3. Set up a composition(?) scheme wherein a ghost struct contains the ghost-specific data as well as a pointer to a common entity instance
  • I tried this and it worked well until I needed to reach ghost data when all that I had was an entity, which leads to the next solution..

4. Store the ghost data in a table of some sort, so that an entity can register itself as a ghost and query the ghost data from that table as needed
  • This is the inverse to (3) in a way, since it goes from common -> specialized. Similarly to (1), this would work but with enough specialized entities the number of tables would be very large


What techniques do you all use in this scenario? I know there are a couple of "this works for now" options up there, but I still thought it would be neat to see what alternatives you clever folk go with. Apologies if I overlooked anything too.

Edited by twelvefifteen on Reason: Typo
Multiple tables for multiple root types, then share variables
If two types have little overlap in variables and don't need to be in the same list, you can have different roots and store in separate collections, but there's a lot of room for variation within those root types using settings. The only drawback is redundancy in a few unused variables, but practice can make you find more generic names for the variables so that multiple types can reuse them. 80% variable use is good enough and the rest can have N/A values to make debugging easier. All weapons in Doom were created using only different images, sounds and variables. A shotgun just increased the number of projectiles per shot, made reloads slower, increased the spread varaible...

Why not try procedural C++ without classes? It has overloading, namespaces, references, array-based lists and lambdas for a clean and simple functional style. This will still give you fast and robust compilation by not exposing an entangled mess of classes in headers.

If you don't like std::vector, I have a wrapper that makes it faster and easier to use for beginners by pre-allocating a larger buffer by default, accessing elements by signed indices (to allow looping backwards without the dangers) and hiding the error-prone iterators.
https://github.com/Dawoodoz/DFPSR...er/Source/DFPSR/collection/List.h

Unless you absolutely must place unrelated types in the same list of entity classes (1990s OOP style), all you need is overloading for individual types. Overloading of global methods get the same perks as multiple inheritance, but without the entangled mess of having dependencies across unrelated classes. Define one type for static (standing still) items containing an index or pointer to the type. Define another type for dynamic (moving) items containing the type, team, health, et cetera. Then define interactions for pairs of dynamic items so that ghosts can eat the player and such. Another set of global functions define dynamic to static interactions so that the player can eat the items and receive perks.

Static-dynamic separation is much faster than having a single list of all types, because there's no point in handling (S+D)² possible interactions (all to all, including the impossible static to static interactions) instead of D + D². You can have a simpler broad-phase and still get much faster results than with polymorphism. Any special differences between the types of dynamic items can be expressed using variables in the type table or even lambda expressions for custom methods.

Pseudo C++:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
struct StaticItemType {
	// Physics properties
	CollisionEnum collisionType;
	// Lambdas defining what happens when interacting with it to be called from interact
	EffectFunction playerEffect;
	EffectFunction enemyEffect;

}

...

// Table describing different pick-ups and obstacles
List<StaticItemType> staticItemTypes;
// Table describing players and enemies using properties and lambda functions
List<DynamicItemType> dynamicItemTypes;

// All items that won't move
List<StaticItem> staticItems;
// All players, emenies, particles...
List<DynamicItem> dynamicItems;

void interact(DynamicItem &a, StaticItem &b) {
	...
}

void interact(DynamicItem &a, DynamicItem &b) {
	...
}

for (int item = 0; item < dynamicItems.length(); item++) {
	// Interact with any static items nearby using a fixed broad-phase (grid / quad-tree)
	...
	// Interact with any dynamic items nearby using a dynamic broad-phase (grid / sorting)
}

Edited by Dawoodoz on
Thank you for the reply! I'll be honest and say that it mostly went over my head, but that is okay because I can refer back to it. As a first step I think I'll take a look at the DOOM source code because that first part sounded interesting.
Any book about tables and normal forms in databases will also be a useful source of learning for table-driven programming. You may treat a 2D/3D vector as one value and disregard the first normal form's rule, because adding another dimension would not really be difficult for this little data. But when a weapon's projectile is describing the traits of a rocket, you refer to another table for particle types, so that an upgraded rocket launcher can be consistent about the ammunition properties without copying and pasting lots of child-settings. Then the rocket may refer to a third table of damage types using an index saying "explosive", which makes it easy to say that a certain enemy has a resistance or weakness to rockets and grenades at the same time.

Edited by Dawoodoz on
@twelvefifteen The answer to your question is most likely highly depending on what you are actually making and how many entities you have. I would suggest not trying to make a generic system that fit all needs (or future proof system), make something that fits your specific case at the time.

I prefer to start with a single struct with common things in the struct and type specific things in a union inside the struct. If I don't have performance issues than I don't think having a more complex system has advantages.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
typedef struct entity_t {
    
    vec2 position;
    vec2 velocity;
    
    u32 type;
    
    union {
        struct {
            u32 points;
        } player;
        
        struct {
            u8 behaviour;
        } ghost;
    };
    
} entity_t;


You seem to worry about the memory waste. If you measure the memory waste ( max entity size * max entity count ) - ( min entity size * max entity count ) depending on your game you can see if it's actually an issue. If you have 1024 entities, you'd need to waste 1024 per entity to waste 1Mo. Is that an issue ? Is wasting less memory more important that keeping the system simple ?

If you ran into performance issues, the issue might be about CPU cache usage. In that case, you may want to think about what is hot/cold in the entity instead of thinking of what is common/uncommon to all entities.

For example, you may use the name of a ghost once per frame so it's cold, but you use the position a lot per frame so it's hot.

You'll most likely iterate over all entity positions in the collision system, so you want to have positions tightly packed in memory so you have a better cache usage. So you would pull out positions into an array containing only the positions of all entities, and have some way to point to that in the entity struct. This could be a pointer, or having a intermediate array so the link can be both way or any other scheme. You could also keep the entity struct intact and make a local copy of all entities position when the collision system start its update, and copy out when it exits.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
typedef struct entity_t {
    
    vec2* position;
    vec2 velocity;
    
    u32 type;
    
    union {
        struct {
            u32 points;
        } player;
        
        struct {
            u8 behaviour;
        } ghost;
    };
    
} entity_t;

...
vec2 positions[ max_entity_count ];



Thank you, I found that very helpful. I really shouldn't try to preempt issues before I actually encounter them. I'll go with the tagged union approach.

Also, if you don't mind my asking, what does the two-way intermediate array scheme look like from your point on SoA? The other ones you mentioned make a lot of sense.
I depends on the relation of the different pieces. In the simple example above, the easiest way to have a way to get the entity from the position would be to always use the same index for the two arrays. In that case the pointer to position in the entity struct could be omitted because we always can get the index.
1
2
3
4
5
6
7
8
9
entity_t entities[ n ];
vec2 positions[ n ];

/* Index in position array is the same as in entities array */
entity_t* get_entity_from_position( vec2* position ) {
    umm index = position - positions;
    entity_t* result = entities + index;
    return result;
}


If you need to have one to many or many to many relations (I don't know if those are the actual terms in English) and have them go both way, You could use an table that links things together.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
typedef enum link_type_t {
    link_type_none,

    link_type_position,
    link_type_something,

    link_type_count
} link_type_t;

typedef struct link_t {
    u32 type;
    entity_t* entity;
    union {
        vec2* position;
        something_t* something;
    };
} link_t;

entity_t* get_entity_from_position( vec2* position ) {

    entity_t* result = 0;

    for ( umm i = 0; i < link_count; i++ ) {
        link_t* link = links + i;
        if ( link->type == link_type_position && link->position == position ) {
            result = link->entity;
            break;
        }
    }

    return result;
}


It might be better to have a different array for each type instead, so you iterate on the type you're interested in, have one less condition and better cache usage since you don't have the type field. (All this is conjecture, unless you measure it you don't know for sure what's best).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
typedef struct link_t {
    entity_t* entity;
    union {
        vec2* position;
        something_t* something;
    };
} link_t;

link_t position_links[ n ];
link_t something_links[ n ];

entity_t* get_entity_from_position( vec2* position ) {

    entity_t* result = 0;

    for ( umm i = 0; i < position_link_count; i++ ) {
        link_t* link = position_links + i;
        if ( link->position == position ) {
            result = link->entity;
            break;
        }
    }

    return result;
}

I would argue that pellets don't need to be entities at all. They are a tile state with the uneaten berry being a slightly special tile (read one flag to indicate pellet vs. berry and one flag to differentiate eaten-not eaten).

That shrinks your entity list for pacman to 5 At that point you may as well give each their own variable and struct.

The big thing to consider here: if you don't have many common operations on a set of entities, why shoehorn them into a single entity array?

There is no reason why you cannot have multiple entity arrays one for each fundamental type of entity.

I would argue that a particle system is an example of that, it's a separate array of state for the particle "entities" which uses separate logic from all the other entities to update and render.
I'm using the union of components technique as the entity system in my game as well, and for the most part I really like it. The one aspect of it I wrestled a bit with is how to best handle customization for each entity type. In my game, several entities need custom behaviour both in their update methods and on-collision methods. At first, I just had this custom behaviour as simple switch statements in the entity class' update/collision methods, but it started to get quite large and messy, so I wanted to find a way to separate out that specialized logic. And really the only way to do that while still keeping things relatively simple, is to use function pointers.

However, instead of putting the function pointers int the entity struct itself, I made an external table that hashes the entity ID to its corresponding update/collision methods.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
struct EntityDefinition {
    void(*update)(Entity*,GameState*,Input*,float);
    void(*collision)(Collision*,GameState*,float);
};

extern HashMap<EntityType, EntityDefinition> entityDefinitions;

...

void Entity::update(GameState *gameState, Input *input, float dt) {
    EntityDefinition def = entityDefinitions[entityType];
    if (def.update) {
        def.update(this, gameState, input, dt);
    }

    // Common entity update behaviour...
}


Then I just have a separate file in which I define/register the function pointers for each entity type that needs it:
1
2
3
4
5
6
HashMap<EntityType, EntityDefinition> entityDefinitions = {
    {EntityType_Something, {
        &update_something, &collision_something
    }},
    // etc.
}


So far this has been working pretty well for me, and it's very quick and easy to just update the table when I add new entities that require custom behaviour. One slight annoyance is that I can't use the syntax .update = &update_something for assigning the function pointers because MSVC 2015 doesn't yet support that. It's a very small number of functions though, so it's not really a big deal.
@mrmixer I appreciate the perspective, and I'll definitely consider structuring things like that should the need for a performance boost ever come up.

@Flyingsand Thanks for sharing! It seems like an interesting solution and I am glad that it has worked for you so far.