The history of OpenGL vertex data

Starting out

In the beginning there was glVertex. The model was simple: for every vertex the client called a function to set each attribute, and once there were enough vertices for a primitive, the graphics driver rasterized it and drew the triangle to the screen.

foreach(mesh: meshes){
    glBegin(GL_TRIANGLES);
    foreach(vertex: mesh.vertices){
        // one call per attribute; glVertex3f completes the vertex
        glVertex3f(vertex.x, vertex.y, vertex.z);
    }
    glEnd();
}

Why this was bad: A function had to be called for every attribute of every vertex. As meshes and memory grew, most applications ended up with a loop running over the mesh data, and that loop became a bottleneck.
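
To put a number on that overhead, here is a rough sketch that counts entry-point calls for one mesh. The `stub_*` functions and the `vertex` layout are hypothetical stand-ins made up for illustration; they only count calls and are not real GL functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GL entry points: each one only counts the call. */
static size_t g_calls = 0;
static void stub_begin(void) { g_calls++; }
static void stub_end(void)   { g_calls++; }
static void stub_normal3f(float x, float y, float z) { (void)x; (void)y; (void)z; g_calls++; }
static void stub_texcoord2f(float u, float v)        { (void)u; (void)v; g_calls++; }
static void stub_vertex3f(float x, float y, float z) { (void)x; (void)y; (void)z; g_calls++; }

/* Hypothetical vertex with three attributes. */
typedef struct { float pos[3]; float normal[3]; float uv[2]; } vertex;

/* Submits a mesh the immediate-mode way and returns how many calls it took. */
size_t count_immediate_calls(const vertex *v, size_t vertex_count)
{
    assert(v != NULL || vertex_count == 0);
    g_calls = 0;
    stub_begin();
    for (size_t i = 0; i < vertex_count; i++) {
        stub_normal3f(v[i].normal[0], v[i].normal[1], v[i].normal[2]);
        stub_texcoord2f(v[i].uv[0], v[i].uv[1]);
        /* the position call comes last because it completes the vertex */
        stub_vertex3f(v[i].pos[0], v[i].pos[1], v[i].pos[2]);
    }
    stub_end();
    return g_calls;
}
```

Under these assumptions a 10,000 vertex mesh with three attributes costs 30,002 calls every time it is drawn, before the driver has done any real work.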

Introducing vertex attribute arrays

So the next step was to let the client point the driver at a chunk of memory and specify how to read it. This is glVertexPointer, the fixed-function variant of glVertexAttribPointer.

glEnableClientState(GL_VERTEX_ARRAY);
foreach(mesh: meshes){
    glVertexPointer(3, GL_FLOAT, 0, mesh.vertices.data());
    glDrawArrays(GL_TRIANGLES, 0, mesh.vertices.length());
}

Why this was bad: The client is allowed to change the data pointed to at any time, which means the driver had to read all of the data during the glDraw* call to make sure it had its own consistent copy, even if the client never changed it. For glDrawElements this adds another step, as the driver has to read all of the indices to find the upper bound of the vertex data it needs to copy. This got worse when graphics cards gained their own memory and became more asynchronous.
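
The extra glDrawElements work can be sketched roughly like this. It is a simplified illustration, not how any particular driver does it: the function names are made up, and the vertices are assumed to be tightly packed so that `stride` is the real size of one vertex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Scan the whole index list to find the highest vertex that gets referenced. */
unsigned max_index(const unsigned *indices, size_t index_count)
{
    assert(indices != NULL && index_count > 0);
    unsigned max = indices[0];
    for (size_t i = 1; i < index_count; i++) {
        if (indices[i] > max)
            max = indices[i];
    }
    return max;
}

/* Snapshot vertices [0, max_index] so later client writes cannot affect the draw.
 * Returns a malloc'd copy (the caller frees it) and stores its size in *out_bytes.
 * Assumes tightly packed vertices: stride == size of one vertex. */
void *snapshot_client_array(const void *client_ptr, size_t stride,
                            const unsigned *indices, size_t index_count,
                            size_t *out_bytes)
{
    size_t bytes = ((size_t)max_index(indices, index_count) + 1) * stride;
    void *copy = malloc(bytes);
    if (copy != NULL)
        memcpy(copy, client_ptr, bytes);
    *out_bytes = bytes;
    return copy;
}
```

Both the index scan and the copy are proportional to the size of the mesh, and in this model they happen on every draw call even when nothing has changed.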

Introducing vertex buffer objects

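Then came VBOs (vertex buffer objects). These let the driver own the data, so every change to it goes through the API. They were jury-rigged onto the existing API by adding binding points such as GL_ARRAY_BUFFER that you bind a buffer to. The gl*Pointer calls then refer to the bound buffer, and their pointer argument is interpreted as a byte offset into it.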

//init
foreach(mesh: meshes){
    glGenBuffers(1, &mesh.vbo);
    glBindBuffer(GL_ARRAY_BUFFER, mesh.vbo);
    mesh_data d = load_mesh_data(mesh.id);
    // copy the data into driver-owned storage; the client copy can be freed afterwards
    glBufferData(GL_ARRAY_BUFFER, d.size, d.data, GL_STATIC_DRAW);
    free_mesh_data(d);
}

//draw
glEnableClientState(GL_VERTEX_ARRAY);
foreach(mesh: meshes){
    glBindBuffer(GL_ARRAY_BUFFER, mesh.vbo);
    // with a buffer bound, the last argument is a byte offset into it (NULL == 0)
    glVertexPointer(3, GL_FLOAT, 0, NULL);
    glDrawArrays(GL_TRIANGLES, 0, mesh.vertices.length());
}
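
Once a buffer is bound, the pointer argument of the gl*Pointer calls is a byte offset in disguise. For an interleaved buffer those offsets can be derived with offsetof. The sketch below assumes a made-up `vertex` layout and a helper called `buffer_offset`; neither is part of OpenGL.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uintptr_t */

/* Hypothetical interleaved vertex layout, used only for illustration. */
typedef struct {
    float position[3];
    float normal[3];
    float uv[2];
} vertex;

/* The gl*Pointer calls still take a pointer, so the byte offset into the
 * bound buffer is passed disguised as one. */
const void *buffer_offset(size_t bytes)
{
    return (const void *)(uintptr_t)bytes;
}

/* Sanity check on the layout assumption: no padding between attributes. */
void check_vertex_layout(void)
{
    assert(sizeof(vertex) == 8 * sizeof(float));
    assert(offsetof(vertex, normal) == 3 * sizeof(float));
    assert(offsetof(vertex, uv) == 6 * sizeof(float));
}

/* Intended use, with mesh.vbo bound to GL_ARRAY_BUFFER:
 *   glVertexPointer  (3, GL_FLOAT, sizeof(vertex), buffer_offset(offsetof(vertex, position)));
 *   glNormalPointer  (   GL_FLOAT, sizeof(vertex), buffer_offset(offsetof(vertex, normal)));
 *   glTexCoordPointer(2, GL_FLOAT, sizeof(vertex), buffer_offset(offsetof(vertex, uv)));
 */
```

The stride is the same for every attribute (the size of one whole vertex) and only the offset differs.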

Why this is bad: Every time you want to draw a different mesh you need to call glVertexAttribPointer (or its fixed-function gl*Pointer equivalent) for every attribute you use. Some GPUs had also evolved to use software vertex pulling, which means that every time the vertex format changes the driver needs to recompile the vertex shader. (BTW, this is also part of why OpenGL tends to get a bad reputation: the driver needs to patch the program for a lot of state changes that were "free" in older versions.)
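
To see why the format matters so much with software vertex pulling, here is a minimal sketch of a generic attribute fetch written on the CPU side. The `attr_type` enum and `attr_format` struct are hypothetical and cover only two types; a real shader prologue would be specialized for one exact format rather than branching at run time, which is where the recompiles come from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the attribute types GL supports. */
typedef enum { ATTR_FLOAT, ATTR_UBYTE } attr_type;

typedef struct {
    attr_type type;
    int       normalized;  /* only meaningful for integer types */
    size_t    stride;      /* bytes between consecutive vertices */
    size_t    offset;      /* byte offset of the attribute in the buffer */
} attr_format;

/* Fetch one component of one vertex and convert it to float. */
float fetch_component(const unsigned char *buffer, const attr_format *fmt,
                      size_t vertex_index, size_t component)
{
    const unsigned char *p = buffer + fmt->offset + vertex_index * fmt->stride;
    switch (fmt->type) {
    case ATTR_FLOAT: {
        float f;
        memcpy(&f, p + component * sizeof(float), sizeof f);
        return f;
    }
    case ATTR_UBYTE: {
        uint8_t b = p[component];
        /* unsigned normalized: map 0..255 to 0.0..1.0 */
        return fmt->normalized ? (float)b / 255.0f : (float)b;
    }
    default:
        assert(0 && "unknown attribute type");
        return 0.0f;
    }
}
```

Every field of `attr_format` changes the code path or the address arithmetic, so a driver that bakes the fetch into the shader has a different program for each distinct format.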

Introducing vertex array objects

So they came up with half the solution (with a horrible name): the Vertex Array Object. This caches the glVertexAttribPointer calls so that you can switch between meshes in a single call. When introduced in OpenGL 3.0 it contained something like:

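struct VAO{
    GLuint element_buffer;      // the GL_ELEMENT_ARRAY_BUFFER binding
    struct{
        bool enabled;
        bool normalized;
        GLenum type;
        GLint count;            // components per vertex (1 to 4)
        GLsizei stride;
        GLintptr offset;        // byte offset into buffer
        GLuint buffer;          // GL_ARRAY_BUFFER bound when glVertexAttribPointer was called
    } attributes[MAX_ATTRIBUTES];
};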

It is used something like:

//init
foreach(mesh: meshes){
    glGenVertexArrays(1, &mesh.vao);
    glBindVertexArray(mesh.vao);
    glGenBuffers(1, &mesh.vbo);
    glBindBuffer(GL_ARRAY_BUFFER, mesh.vbo);
    mesh_data d = load_mesh_data(mesh.id);
    glBufferData(GL_ARRAY_BUFFER, d.size, d.data, GL_STATIC_DRAW);
    free_mesh_data(d);
    // recorded in the bound VAO, together with the current GL_ARRAY_BUFFER
    glVertexAttribPointer(posLoc, 3, GL_FLOAT, GL_FALSE, 0, NULL);
    glEnableVertexAttribArray(posLoc);
}

//draw
foreach(mesh: meshes){
    glBindVertexArray(mesh.vao);
    glDrawArrays(GL_TRIANGLES, 0, mesh.vertices.length());
}

Why this is not enough: The driver still needs to match the vertex format against its cache of compiled programs. Depending on how smart the driver is, it can still be faster than calling glVertexAttribPointer again for every attribute (YMMV™). There is also no way to change just the buffer binding or the offset, in other words, the pointer to where the data is.
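
The last point follows from how the buffer ends up in the VAO: glVertexAttribPointer records whatever is bound to GL_ARRAY_BUFFER at the time of the call, and the GL_ARRAY_BUFFER binding itself is not VAO state. A toy model of that behaviour, with all `sim_*` names invented for this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ATTRIBUTES 16

typedef struct {
    bool     enabled;
    bool     normalized;
    int      type, count, stride;
    size_t   offset;
    unsigned buffer;
} sim_attribute;

typedef struct {
    unsigned      element_buffer;
    sim_attribute attributes[MAX_ATTRIBUTES];
} sim_vao;

typedef struct {
    unsigned array_buffer;  /* current GL_ARRAY_BUFFER binding: context state, not VAO state */
    sim_vao *vao;           /* currently bound VAO */
} sim_context;

/* Models glBindBuffer(GL_ARRAY_BUFFER, buffer): touches the context only. */
void sim_bind_array_buffer(sim_context *ctx, unsigned buffer)
{
    ctx->array_buffer = buffer;
}

/* Models glVertexAttribPointer: format and buffer are written in one go. */
void sim_vertex_attrib_pointer(sim_context *ctx, unsigned index, int count, int type,
                               bool normalized, int stride, size_t offset)
{
    assert(ctx->vao != NULL && index < MAX_ATTRIBUTES);
    sim_attribute *a = &ctx->vao->attributes[index];
    a->count      = count;
    a->type       = type;
    a->normalized = normalized;
    a->stride     = stride;
    a->offset     = offset;
    a->buffer     = ctx->array_buffer;  /* captured at call time */
}
```

In this model, binding another buffer afterwards changes nothing in the VAO. The only way to repoint an attribute is to call the pointer function again and respecify the whole format along with it.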

Separating vertex format from vertex buffer

The latest solution is the separated attribute format (ARB_vertex_attrib_binding, core in OpenGL 4.3). It lets the client change where the data is independently of the format of the data. The VAO structure is updated to allow this:

struct VAO{
    GLuint element_buffer;
    struct{
        bool enabled;
        bool normalized;
        GLenum type;
        GLint count;
        GLuint offset;          // relative offset within one vertex
        GLuint binding;         // index into bindings[]
    } attributes[MAX_ATTRIBUTES];

    struct{
        GLsizei stride;
        GLuint buffer;
        GLintptr offset;        // byte offset of the first vertex in buffer
    } bindings[MAX_ATTRIBUTE_BINDINGS];
};

It is used something like:

//init
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);
// format: 3 floats at relative offset 0, read from binding point 0
glVertexAttribFormat(posLoc, 3, GL_FLOAT, GL_FALSE, 0);
glVertexAttribBinding(posLoc, 0);
glEnableVertexAttribArray(posLoc);

foreach(mesh: meshes){
    glGenBuffers(1, &mesh.vbo);
    glBindBuffer(GL_ARRAY_BUFFER, mesh.vbo);
    mesh_data d = load_mesh_data(mesh.id);
    glBufferData(GL_ARRAY_BUFFER, d.size, d.data, GL_STATIC_DRAW);
    free_mesh_data(d);
}

//draw
glBindVertexArray(vao);

foreach(mesh: meshes){
    // only the buffer changes per mesh: binding point 0, buffer, offset 0, stride
    glBindVertexBuffer(0, mesh.vbo, 0, sizeof(vertex));
    glDrawArrays(GL_TRIANGLES, 0, mesh.vertices.length());
}
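
With the split in place, the location of an attribute for a given vertex is the binding's offset, plus the vertex index times the binding's stride, plus the attribute's relative offset. A toy model of that arithmetic follows; the `sim_*` names are invented for this sketch and only the fields needed for the address are kept.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ATTRIBUTES 16
#define MAX_ATTRIBUTE_BINDINGS 16

typedef struct { size_t relative_offset; unsigned binding; } sim_format;
typedef struct { unsigned buffer; size_t offset; size_t stride; } sim_binding;

typedef struct {
    sim_format  attributes[MAX_ATTRIBUTES];
    sim_binding bindings[MAX_ATTRIBUTE_BINDINGS];
} sim_vao43;

/* Models glVertexAttribFormat + glVertexAttribBinding (address-related part only). */
void sim_attrib_format(sim_vao43 *vao, unsigned attrib,
                       size_t relative_offset, unsigned binding)
{
    assert(attrib < MAX_ATTRIBUTES && binding < MAX_ATTRIBUTE_BINDINGS);
    vao->attributes[attrib].relative_offset = relative_offset;
    vao->attributes[attrib].binding = binding;
}

/* Models glBindVertexBuffer: touches the binding, never the format. */
void sim_bind_vertex_buffer(sim_vao43 *vao, unsigned binding,
                            unsigned buffer, size_t offset, size_t stride)
{
    assert(binding < MAX_ATTRIBUTE_BINDINGS);
    vao->bindings[binding].buffer = buffer;
    vao->bindings[binding].offset = offset;
    vao->bindings[binding].stride = stride;
}

/* Byte offset of an attribute of one vertex inside the bound buffer. */
size_t sim_attrib_byte_offset(const sim_vao43 *vao, unsigned attrib, size_t vertex_index)
{
    const sim_format  *f = &vao->attributes[attrib];
    const sim_binding *b = &vao->bindings[f->binding];
    return b->offset + vertex_index * b->stride + f->relative_offset;
}
```

Switching meshes only rewrites one `sim_binding` entry. The format entries stay untouched, so a driver has no reason to revisit the compiled vertex fetch.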