Performance penality of calling a function through function pointer

Hi all,

I'm currently developing a (very rudimentary) artificial intelligence for my game, and I choosed to use the "need-driven" method: every possible action is evaluated and then the best is picked.
In my game there will be something like a hundred of different actions, if not more... and for each of them I would like to have a different way of calculating its value.
For example, the value of attacking could be something like:

1
myLifePoints - opponentLifePoints * myDamage;


while the value of eating something could be:

1
myNeedOfEat * commestibilityOfObject + 1.0f;



the first thing I though of was some sort of struct, something like:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
enum OperationType
{
  Operation_sum,
  Operation_mul,
  ...
};

struct Operand
{
  real32 value;
  uint32 attributeID;
};

struct AttributeCalculation
{
  Operand startingValue;
  
  OperationType* operations;
  Operand* operationsValues;
};


This approach would work quite well I guess, the only problem is that its not very flexible: what if I want the value of attacking to be 0 if the target is of my same type?

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
inline real32 EvaluateAttackAction( Entity* actor, Entity* target)
{
   real32 result = 0;
   if( actor->type != target->type )
   {
       result = actor->lifepoints - target->lifePoints * actor->damage;
   }
   
   return result;
}



So I've decided to throw away that idea, and return to what I think are the most common solutions: switch statement or function pointers.

I've profiled the function pointer things, and its 216 cycles vs the 96 cycles that the direct version has. (Of course for now there is only one possible action, so the switch statement doesn't exist)
How bad is that? But how bad would it be to have that very big statement?
Of course the only way to know it for sure is to measure it, but maybe someone can avoid me the bother and tell me "don't use function pointers in this situation, they will not hold".

edit: thinking more about it, 216 cycles for the function pointer version is not too much looking at the 96 cycles at the other end: really the huge switch statement would cost me LESS then 120 cycles overall?

edit2: Yes it was totally dumb to do the question without making a decent test. With a "fake" switch statement of only 10 cases, the cycles have gone up to 238.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
uint32 value = 5;                     
switch (value)
{
  case 0:
  {} break;
  case 1:
  {} break;
  case 2:
  {} break;
                            
  ...
  
  case 5:
  {
    score = testing( regionEntity, entity );
  } break;

  ...                        
                            
                            
}


To give you an idea, the code I have in mind is something like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
internal void AnalyzeWorldAndSelectBestAction( World* world, Entity* e )
{
    entity* target = 0;
    uint32 targetAction = 0;
    
    real32 bestScore = 0;
    for( uint32 entityIndex = 0; entityIndex < world->countEntity; entityIndex++ )
    {
        entity* possibleTarget = world->entities + entityIndex;
        
        if( possibleTarget != e )
        {
            for( uint32 actionIndex = 0; actionIndex < Action_count; actionIndex++ )
            {
                   FunctionPointer* ActionEvaluatePointer = world->actionPointers + actionIndex;
                   score = ActionEvaluatePointer( e, possibleTarget );
                   if( score > bestScore )
                   {
                       bestScore = score;
                       target = possibleTarget;
                       targetAction = actionIndex;
                    }
                }
            }
        }
    }
    
    if( target )
    {
        Vec3 toDest = target->regionPosition - e->regionPosition;
        e->acceleration = toDest;
        e->destAction = targetAction;
        e->destEntity = target;
    }
}


I guess that the loop could be inverted to limit the cache misses for the function pointer?
How much could that influence the final result?

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
internal void AnalyzeWorldAndSelectBestAction( World* world, Entity* e )
{
    e* target = 0;
    uint32 targetAction = 0;
    
    real32 bestScore = 0;
    
    for( uint32 actionIndex = 0; actionIndex < Action_count; actionIndex++ )
    {
       FunctionPointer* ActionEvaluatePointer = world->actionPointers + actionIndex;
       for( uint32 entityIndex = 0; entityIndex < world->countEntity; entityIndex++ )
       {
          entity* possibleTarget = world->entities + entityIndex;
        
           if( possibleTarget != e )
           {
               score = ActionEvaluatePointer( e, possibleTarget );
                
               ...



Leonardo


Edited by erpeo93 on
Do you really need to change those formulas at runtime? Do they need to be so dynamic?
Why not simply put those calculations in C code? If you need to manage them easier or actual formulas look more complex than it is written (needs to access various data structures, etc), then you could simply have them in separate file that you can parse and generate C code for actual calculation.

What if I want the value of attacking to be 0
In such case you can create simple interpreter for evaluating arbitrary expressions. Here's the basic idea:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
enum ValueType
{
    ValueConstant,
    ValueVariable,
    ValueBinaryOp,
    ...
};

enum BinaryOp
{
    BinaryOpAdd,
    BinaryOpMul,
    // ...
};

struct Value
{
    ValueType Type;
    union
    {
        float Constant;
        float* Variable;
        struct
        {
            BinaryOp BinOp;
            Value* Left;
            Value* Right;
        };
        // ...
    };
};

float Eval(Value* value)
{
    switch (value->Type)
    {
    case ValueConstant:
        return value->Constant;
    case ValueVariable:
        return *value->Variable;
    case ValueBinaryOp:
        switch (value->BinOp)
        {
        case BinaryOpAdd:
            return Eval(value->Left) + Eval(value->Right);
        case BinaryOpMul:
            return Eval(value->Left) * Eval(value->Right);
        default: abort();
        };
    default: abort();
    }
}

int main()
{
    // myNeedOfEat * commestibilityOfObject + 1.0f;
    float myNeedOfEat = ...;
    float commestibilityOfObject = ...;

    Value values[5];

    values[0].Type = ValueBinaryOp;
    values[0].BinOp = BinaryOpAdd;
    values[0].Left = &values[1];
    values[0].Right = &values[2];

    values[1].Type = ValueBinaryOp;
    values[1].BinOp = BinaryOpMul;
    values[1].Left = &values[3];
    values[1].Right = &values[4];

    values[2].Type = ValueConstant;
    values[2].Constant = 1.0f;
    
    values[3].Type = ValueVariable;
    values[3].Variable = &myNeedOfEat;

    values[4].Type = ValueVariable;
    values[4].Variable = &commestibilityOfObject;

    float result = Eval(&values[0]);
    // ... use result
}


In a sense this is interpreter for mini scripting language.
You can add custom conditions/function types there to decide on specific extra things (like entity types).

Edited by Mārtiņš Možeiko on
No, I absolutely don't need them to be dynamic... I just want to define one time how the "attack action" value is calculated and store it somewhere.

Your idea is pretty nice, but I have a doubt:
Where does the "linking" between the Eval variables and my entities variables happens? and how?
Again, suppose my function is something like:

1
EvaluateAction( World* world, Entity* actor, uint32 actionID, Entity* target )


Let's assume the world has a pointer to a Value struct for every actionID... so it's constant to retrive the correct Value struct to use.
But know, let's suppose my Value has 2 "ValueVariable" operands inside it: where and how the link happens?

One thing I can image could help is to store all the entity attributes in an array, and then just store the "attributeID" instead of the real variable pointer... something like:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
enum Attribute
{
   Attribute_1,
   Attribute_2,
   

   Attribute_count,
};

struct Entity
{
  Vec3 position;
  Vec3 acceleration;
  
  real32 attributes[Attribute_count];
}


But know, what if I want to say "you can attack this entity only if your Lenght(acceleration) is greater than its Lenght(acceleration)?
I have to specially handled those cases right?

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
enum SpecialValue
{
   Special_acceleration,
   Special_velocity,
   Special_position,
   ...
};

enum ValueType
{
    ValueConstant,
    ValueAttribute,
    ValueSpecial,
    ValueBinaryOp,
    ...
};



Sorry, but I am still a bit confused...


Last year I worked on a card game (like Hearthstone) for a few month, and each card had a special ability. I tried to do a generic data system (a plain text file) with each card specifying its attribute and actions type (since several cards had "common but not exactly the same" behaviors). It worked ok but was hard to maintain or to add new cards. I think using a huge switch with each case being the exact code needed for a card would have worked better. So I would suggest you to try the switch method.

As for the perfs, you may want to make the game work before thinking about perfs since it will probably never be a problem.

Edited by Simon Anciaux on Reason: Typo
that change from 216 to 96 cycles may seem significant may seem significant but it's only part of what is being done,

You only save 0.06 microseconds (or 60 nanoseconds) on a 2 GHz machine per function call. Is that really worth the headache?
The loop is best inverted yeah. But I agree with ratchetfreak--it probably won't matter. If that code becomes a problem performance-wise, you'll probably ask "what can I do to select common actions quickly without evaluating every action every time?" and not "How can I shave off a few cycles?" And at that point you'll have to invert the loop again to make the new logic work

The difference to me at this point seems like the difference between 100 action evaluation functions and a 100 case switch. i.e. not that much
Yes, it seems to me that for now the best thing to do is to stay simple and go with the switch.

One thing that I'm thinking about is: let's say I have a particular Entity that can eat stones... at that point I would like to have a special "evaluating function" that works only for that Entity type... because all the other entities will not eat stone.

Considering this, now the choice between the two methods become more difficult in your opinion?

Imagine this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
enum EntityType
{
    Entity_standard,
    Entity_likesStone,
    Entity_count,
};

enum ActionType
{
    Action_move,
    Action_eat,
    
    Action_count,
};


real32 EvaluateEatActionStandard()
{
   ....
}

real32 EvaluateEatActionSpecial()
{
   ...
}


struct ActionFuctionPointers
{
    FunctionPointer pointers[Action_count];
};

ActionFunctionPointers entityTypePointers[Entity_count];


real32 EvaluateAction( Entity* e, ActionType desiredAction)
{
    ActionFunctionPointers* entityActions = entityTypePointers + e->type;
    FunctionPointer* toCall = entityActions->pointers + desiredAction;
    toCall();
}



The other option is to make special cases inside the switch cases:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
  ...

  case Action_eat:
  {
      if( entity->type == Entity_special )
      {
           //special calculation here
      }
      else
      {
           standard calculation here
      }
  } break;

  ...


But for the moment I can't think of a LOT of cases where I would like those pointers to be different from one entity to another, so I guess its best to stick with the switch and handle special cases like this inside the switch case itself.
graeme
The loop is best inverted yeah. But I agree with ratchetfreak--it probably won't matter. If that code becomes a problem performance-wise, you'll probably ask "what can I do to select common actions quickly without evaluating every action every time?" and not "How can I shave off a few cycles?" And at that point you'll have to invert the loop again to make the new logic work

The difference to me at this point seems like the difference between 100 action evaluation functions and a 100 case switch. i.e. not that much


yes, this is also something I planned on implementing :)

As an example, I want an "alchemyst" goblin to know that most of the time what he will do is craft potions and give them to other goblins. On the other end, a warrior goblin knows that the most common action for him is to attack every player as soon as they can.

So not only I would like to say that every entity has a "pool" of common actions that are evaluated first, but also some sort of "early out": if an alchemyst discover that crafing a certain potion retain a value greater than a fixed value, than do that and exit the loop immediately.


Edited by erpeo93 on
I was more thinking of not having function pointers at all.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
enum EntityType {
    Entity_standard,
    Entity_likesStone,
    Entity_goblin,
    Entity_goblinAlchemist,

    Entity_count,
};

void evaluate( Entity* e ) {

    switch ( e->type ) {
        
        case Entity_standard: {
            move( e );
            eat( e );
        } break;

        case Entity_likesStone: {
            move( e );
            eatStone( e );
        } break;

        case Entity_goblin: {
            move( e );
            attack( e );
        } break;

        case Entity_goblinAlchemist: {
            makePotion( e );
            sellPotion( e );
        } break;
    }
}
It sounds like your going for a utility-based AI system. There are some great talks and articles about a system that Dave Mark uses. Check out some of his talks. He calls it "Infinite Axis Utility AI." You might be able to find some articles about the performance of such a system as well.

Oh, and he wrote a book too.
mrmixer
I was more thinking of not having function pointers at all.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
enum EntityType {
    Entity_standard,
    Entity_likesStone,
    Entity_goblin,
    Entity_goblinAlchemist,

    Entity_count,
};

void evaluate( Entity* e ) {

    switch ( e->type ) {
        
        case Entity_standard: {
            move( e );
            eat( e );
        } break;

        case Entity_likesStone: {
            move( e );
            eatStone( e );
        } break;

        case Entity_goblin: {
            move( e );
            attack( e );
        } break;

        case Entity_goblinAlchemist: {
            makePotion( e );
            sellPotion( e );
        } break;
    }
}



Yes, it could work also that way. But I think the way you wrote it has a foundamental problem: if for every action considered (move, attack, ecc) the entity is going to "scan" the world around it, then this way it will scan the world multiple times.
Furthermore, I'm pretty sure I will have more then 100 hundreds different "entity types" (rocks, goblin, trees, ecc), but I'm not sure how many "actions" there will be. So for the moment I think the best loop is a loop that switch on the different actions, not on the different entity types.

I'm not totally sure about the switch vs pointers... I will start with the switch for sure, and lately if I found myself writing a lot of "switch inside switch" I will consider passing to function pointers.
CaptainKraft
It sounds like your going for a utility-based AI system. There are some great talks and articles about a system that Dave Mark uses. Check out some of his talks. He calls it "Infinite Axis Utility AI." You might be able to find some articles about the performance of such a system as well.

Oh, and he wrote a book too.


Man, I've read that book, it's just gold.
I'm totally going for a utility based AI, this discussion is just about the "how much this action satisfy my needs?"

At the end, the real formula will be something like:
1
2
3
4
5
6
7
8
real32 actionValue = EvaluateAction(); //this is the function we are talking about in this thread

real32 finalValue = actionValue * 0.3f + distance * 0.2f + abilityAtAciton * 2.0f + ...

if( finalValue < currentBestValue )
{
   ...
}


I'm going to instantly watch the talk, thank you very much for the links. :)

Edited by erpeo93 on