Pixel art fragment shader

I'm trying to implement the pixel shader Casey discusses in the episode "Pixel Art Games and nSight Shader Analysis" but I'm having some trouble:

#version 330 core
out vec4 fragColor;
  
in vec2 txC;
in vec4 col;

uniform sampler2D ourTexture;

float handleCoord(float c, float dc)
{
    float result = 0;

    // calculate span of texel
    float min = c - 0.5 * dc - 0.5;
    float max = c + 0.5 * dc - 0.5;

    // get floor values
    float floorMin = floor(min);
    float floorMax = floor(max);

    // if the span is inside a single texel
    if (floorMin == floorMax) min = 0;

    // casey's equation
    result = (floorMax - 1) + (floorMax + min) / dc;

    return result;
}

void main()
{
    vec2 atlasSize = vec2(512,512);

    // uv coords times atlas size to get texel space coords
    vec2 uv = txC * atlasSize;

    // calculate size of a texel
    vec2 duv = 1.0 / atlasSize;

    uv = vec2(handleCoord(uv.x, duv.x), handleCoord(uv.y, duv.y));

    // divide back the uv coords to normalized space
    fragColor = texture(ourTexture, uv / atlasSize) * col;
} 


That does render the textures somewhat correctly but I get fuzzy borders when there should be solid pixel borders.

Also, the calculations I have here are sampling a bit to the left and top, into the wrong texture.

All my textures have a 1px transparent border and they are packed together by an asset preprocessor that gets me the uv coords inside the atlas for each sprite.

So I'm thinking I have the numbers wrong in the shader, since there is enough transparent space between textures for this technique to work, as I understand it.

I can't figure out what I'm doing wrong in the shader though, so any help much appreciated!

Thanks

Edited by Rafael on
There was a good article describing this technique in detail here: https://jorenjoestar.github.io/post/pixel_art_filtering/
(follow the other articles it links to - they're good too).

Try replacing your shader with this:
void main()
{
    vec2 size = vec2(textureSize(ourTexture, 0)); // textureSize needs a lod argument

    vec2 uv = txC * size;
    vec2 duv = fwidth(uv); // derivative in texel space, to match fract(uv) below
    uv = floor(uv) + vec2(0.5) + clamp((fract(uv) - vec2(0.5) + duv)/duv, vec2(0,0), vec2(1,1));
    uv /= size;

    fragColor = texture(ourTexture, uv) * col;
}

I haven't tried this as is, just ported it from HLSL code I had around. Make sure the texture has bilinear sampling set and does not have mipmaps.
Trial and error by tweaking formulas won't solve the problem, because each OpenGL driver interprets the coordinate system differently with an error margin of half a texel. If you can calculate the expected result by hand and then get the same from the shader, you might however have something that will work on another computer.

While the divisions might be optimized away by your OpenGL driver because the divisor is a constant, it's best to avoid doing divisions per pixel and instead send the pre-divided reciprocal dimension as a uniform to multiply with. Otherwise someone else might get worse performance for no reason. This also allows using a dynamic atlas size at no significant added cost.
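
For illustration, a minimal sketch of that idea (the uniform name is made up, not from the shaders above):

// Sketch of the pre-divided reciprocal: upload 1.0 / atlasSize once from the
// CPU and multiply in the shader instead of dividing per fragment.
// "invAtlasSize" is a hypothetical name, not from the original shader.
uniform vec2 invAtlasSize; // CPU sets this to vec2(1.0/atlasWidth, 1.0/atlasHeight)

vec2 texelToUv(vec2 texelCoord)
{
    return texelCoord * invAtlasSize; // multiply instead of divide
}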

If you can upgrade to OpenGL 4.2, it's much better to use the imageLoad function with integer pixel coordinates to get rid of the multiplications too, and all the non-deterministic sampling in between. Texture coordinates can be represented in whole pixels from the atlas. Note that some OpenGL drivers will produce random noise if you try to do any math operations using integers, but converting directly from float to integers usually works on most graphics cards.
https://www.khronos.org/registry/...Refpages/gl4/html/imageLoad.xhtml

If you don't want to rely on vertices or triangulation at all, you can use a full quad with an integer clip region and only sample with screen coordinates transformed by a texture transform matrix passed directly to the pixel shader, then perform per-pixel clipping when outside of a rotated sprite of any shape, thus eliminating ugly seams of missing background pixels on bad drivers.

Edited by Dawoodoz on
Hi, first of all thank you mmozeiko for that blog post - it is very informative and I haven't yet read it all with the time and attention it needs.

I've been trying out a few solutions from there and they do work, but there are some things bothering me that may or may not be related.

My renderer takes pixel coordinates top down since I use the following projection matrix:

    // Pixel top down space
    float projection[16] =
    {
        2.0f / windowDim.x, 0, 0, 0,
        0, -2.0f / windowDim.y, 0, 0,
        0, 0, 1, 0,
        -1, 1, 0, 1
    };


I push render commands to a command list in the game side.

In the platform layer, there is a loop that goes through the command list and "builds" a vertex array that is then used to batch-render it all.

static void
opengl_sprite_list_output(uint32_t atlas, render_command_sprite_list_t* spriteList,
    sprite_t* spriteTable, sorter_t sorter, vec2_t windowDim, gl_context_t* glContext,
    memory_region_t* tMem)
{
    uint32_t verticesCount = spriteList->used * 32;
    uint32_t verticesSize = verticesCount * sizeof(float);
    float* vertices = push_array(tMem, float, verticesCount);

    uint32_t indicesCount = spriteList->used * 6;
    uint32_t indicesSize = indicesCount * sizeof(uint32_t);
    uint32_t* indices = push_array(tMem, uint32_t, indicesCount);

    // 1 px size in texture space
    // used to eliminate transparent 1px border around the sprite
    float dc = 1.0f / 512.0f;

    for (uint32_t index = 0;
        index < sorter.entryCount;
        ++index)
    {
        uint32_t sortIndex = sorter.entries[index].listIndex;

        render_command_sprite_t* command = &spriteList->array[sortIndex];
        transform_t t = command->transform;

        float r = command->color.r;
        float g = command->color.g;
        float b = command->color.b;
        float a = command->color.a;

        // default values just in case something went wrong and the sprite will not be found
        float u0 = 0;
        float v0 = 0;
        float u1 = 1;
        float v1 = 1;

        // If spriteTable is available get sprite data
        if (spriteTable)
        {
            sprite_t sprite = sprite_get(spriteTable, command->sprite);
            u0 = sprite.uv.x + dc;
            v0 = sprite.uv.y + dc;
            u1 = sprite.duv.x - dc;
            v1 = sprite.duv.y - dc;
        }

        // If passed uv and duv coords to this function
        // then, calculate new uv coords relative to the original uvs
        vec2_t uv = command->uv;
        vec2_t duv = command->duv;
        if (uv != V2_0 || duv != V2_0)
        {
            float originalU = u0;
            float originalV = v0;
            float newWidth = (u1 - u0);
            float newHeight = (v1 - v0);

            u0 = originalU + uv.x * newWidth;
            u1 = originalU + duv.x * newWidth;
            v0 = originalV + uv.y * newHeight;
            v1 = originalV + duv.y * newHeight;
        }

        // transformations
        mat4_t scale = scale_matrix(t.scale.x, t.scale.y);
        mat4_t translate = translate_matrix(t.position.x, t.position.y);

        mat4_t modelview = translate;

        if (t.rotation)
        {
            mat4_t zRotation = z_rotation_matrix(t.rotation);

            modelview = zRotation * modelview;
        }

        modelview = (scale * modelview);

        // final vertices
        vec4_t va = vec4_t{ -0.5f, -0.5f, 0, 1 } * modelview;
        vec4_t vb = vec4_t{ +0.5f, -0.5f, 0, 1 } * modelview;
        vec4_t vc = vec4_t{ +0.5f, +0.5f, 0, 1 } * modelview;
        vec4_t vd = vec4_t{ -0.5f, +0.5f, 0, 1 } * modelview;

        // build temp versions
        float tempV[] =
        {
            va.x, va.y, u0, v0, r, g, b, a,
            vb.x, vb.y, u1, v0, r, g, b, a,
            vc.x, vc.y, u1, v1, r, g, b, a,
            vd.x, vd.y, u0, v1, r, g, b, a
        };

        uint32_t tempI[] =
        {
            (index * 4) + 0, (index * 4) + 1, (index * 4) + 2,
            (index * 4) + 2, (index * 4) + 3, (index * 4) + 0
        };

        // copy temp versions into place
        memcpy((void*)(vertices + 32 * index), tempV, sizeof(tempV));
        memcpy((void*)(indices + 6 * index), tempI, sizeof(tempI));
    }

    // set up opengl
    gl_shader_t* shader = &glContext->shaders[0];
    opengl_shader_use(shader);

    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glViewport(0, 0, (int)windowDim.x, (int)windowDim.y);

    tex2d_bind(atlas);

    glUniformMatrix4fv(shader->uniforms[0], 1, GL_FALSE, projection);

    vbo_subdata(shader->vbo, vertices, verticesSize);

    ebo_subdata(shader->ebo, indices, indicesSize);

    // draw triangles
    triangles(indicesCount, 0);

    tex2d_bind(0);
}


It is all working fine - meaning it renders the sprites as expected - but there are a few artifacts, hence the pixel shader experimentation.

The pixel art I'm trying to work with has blocky tiles that need to be right next to each other, and when I'm using most of the solutions for the pixel shader, the objects that are right next to each other can let background through depending on the camera zoom value (very few "zoom positions" make the objects really sit right next to each other without letting any background show through).

That will go away when rounding vertices but that defeats the whole purpose of subpixel rendering if I understand it correctly:
   float tempV[] =
   {
       fast_round(va.x), fast_round(va.y), u0, v0, r, g, b, a,
       fast_round(vb.x), fast_round(vb.y), u1, v0, r, g, b, a,
       fast_round(vc.x), fast_round(vc.y), u1, v1, r, g, b, a,
       fast_round(vd.x), fast_round(vd.y), u0, v1, r, g, b, a
   };


So I guess I'm wondering what the solution is here.

Also, if there are other obvious problems with the renderer please tell me.

Dawoodoz

Trial and error by tweaking formulas won't solve the problem, because each OpenGL driver interprets the coordinate system differently with an error margin of half a texel.


That's the reason for the lengthy post here - I want to get the renderer issue out of the way first to make sure there isn't other underlying problems making the shader solution impossible to begin with.

Dawoodoz

If you can upgrade to OpenGL 4.2, it's much better to use the imageLoad function with integer pixel coordinates to get rid of the multiplications too, and all the non-deterministic sampling in between. Texture coordinates can be represented in whole pixels from the atlas. Note that some OpenGL drivers will produce random noise if you try to do any math operations using integers, but converting directly from float to integers usually works on most graphics cards.


Interesting stuff, I will use that function when it's available then.
Actually I am using OpenGL 4 already - the reason that shader had version 330 was just a mistake on my part; I have corrected that.
There is no need for imageLoad for that. imageLoad is meant for a different purpose.

a) you can simply use nearest sampling mode with floored coordinates (+0.5 if you want to be extra safe) - that will load the exact texel without any errors
b) on GL 3 you can simply use texelFetch, which gets a texel value from the texture at exact integer coordinates (see the sketch below)
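
To illustrate option (b), a minimal sketch (the helper name is made up; it assumes the txC/ourTexture style names from the shaders above):

// Sketch of option (b): fetch one exact texel with texelFetch, no filtering.
vec4 fetchExactTexel(sampler2D tex, vec2 uv)
{
    ivec2 size = textureSize(tex, 0);             // texture size in texels
    ivec2 texel = ivec2(floor(uv * vec2(size)));  // integer texel coordinate
    texel = clamp(texel, ivec2(0), size - 1);     // stay inside the texture
    return texelFetch(tex, texel, 0);             // exact texel value
}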

Anyway - what I want to say is that there is no need to worry about a magic "error margin of half a texel". That is not a real thing, except maybe in some old broken drivers. But in such cases there would be so many other broken things that I would not worry about it at all.

I don't know what the solution to your problem is, because I don't know what you are rendering. But using that pixel-art shader in the correct way should not give you any artifacts on borders or anything.
Grid
The pixel art I'm trying to work with has blocky tiles that need to be right next to each other, and when I'm using most of the solutions for the pixel shader, the objects that are right next to each other can let background through depending on the camera zoom value (very few "zoom positions" make the objects really sit right next to each other without letting any background show through).

That will go away when rounding vertices but that defeats the whole purpose of subpixel rendering if I understand it correctly.


It sounds like you might be rendering in a way that creates overlapping alpha edges, which are then composited together. It's a ubiquitous problem that alpha edges don't composite correctly. This isn't specific to the shaders you're using; it shows up all over the place:



The problem in this case is with using alpha to approximate coverage. With coverage, you know that for example, the left half of the pixel is covered by one sprite and the right half is covered by another sprite, and you can use that positional information to know that the entire pixel is covered and composite things correctly. When converting coverage to alpha, you lose the position information, so now all you know is that the pixel is "half covered", but not what half is covered. Alpha compositing does a best-effort approximation by assuming that an alpha value of 0.5 means the entire pixel is covered with half opacity, which in this case isn't exactly correct, and so gives a slightly wrong result.
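
To put made-up numbers on it: suppose a pixel is exactly half covered by the right edge of tile A and half covered by the left edge of tile B, drawn in that order over a white background. With real coverage information the pixel ends up fully covered. With alpha of 0.5 on each edge, drawing A gives 0.5*A + 0.5*white, and drawing B over that gives 0.5*B + 0.5*(0.5*A + 0.5*white) = 0.5*B + 0.25*A + 0.25*white - a quarter of the background still leaks through, which is exactly the faint line you see between adjacent tiles.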

There are at least two decent ways of solving this problem:

1. Instead of emitting alpha, use multisampling (MSAA) to get coverage. The downside is that you only have so many coverage samples available per pixel (usually 16 at most, and as few as 4 on some systems), so the blending of sprite edges will be somewhat "quantized", as you can see in this image:



2. Draw all the sprites at pixel-perfect 1:1 scale into a texture first, and then render that texture to the screen at a larger size with the fancy shader, instead of rendering the sprites directly to the screen. The downside is that this method only allows rendering things in perfect alignment with the pixel grid, though in my opinion pixel art games look best when they stick to the grid anyway. If you want some things to be rendered off-grid, you can also use a hybrid approach where the tile grid is rendered with this method first, and then other entities are rendered on top with either method #1, or with the original alpha-blended approach (which is less problematic for sprites whose edges you don't expect to be perfectly aligned).
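
To make method #2 concrete, here is a rough sketch of the setup (the resolution, variable names, and the elided draw steps are assumptions, not your renderer's actual code):

// Rough sketch of method #2: render sprites 1:1 into a small offscreen
// texture, then upscale it to the window with the pixel-art shader.
// The 480x270 size and the commented-out draw steps are placeholders.
GLuint fbo, lowResTex;
int lowResW = 480, lowResH = 270;

glGenTextures(1, &lowResTex);
glBindTexture(GL_TEXTURE_2D, lowResTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, lowResW, lowResH, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR); // bilinear, no mipmaps
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, lowResTex, 0);

// Pass 1: draw the tile grid at 1:1 scale, on the pixel grid, into lowResTex.
glViewport(0, 0, lowResW, lowResH);
// ... draw sprites here ...

// Pass 2: draw a single quad covering the window, sampling lowResTex with
// the pixel-art shader, so non-integer camera zoom/pan stays stable.
glBindFramebuffer(GL_FRAMEBUFFER, 0);
glViewport(0, 0, windowW, windowH);
// ... draw the upscaled quad here ...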

Edited by Miles on Reason: come to think of it, no reason to even consider supersampling, multisampling is just better for this use case
notnullnotvoid
It sounds like you might be rendering in a way that creates overlapping alpha edges, which are then composited together.


I think this is not the case but maybe I'm wrong so I'll try to elaborate with images.

There are two textures being used:

They are packed together in a texture atlas, but I can assure you they are rendered correctly in the atlas.

In the game, a "close to ideal" case:


You can see that in the image above, the first three red tiles to the left are right next to each other, but the last one lets a bit of a white line through (the cloud background).

A more aggravated case:


A very zoomed in case:


The cases above led me to believe I was doing something wrong in the renderer, but now I'm thinking it's perhaps the shader after all:


You can see that these dark lines on the edges of the cloud are also artifacts.

The shader used to render these images:
#version 450 core
out vec4 fragColor;
  
in vec2 txC;
in vec4 col;

uniform sampler2D ourTexture;

vec2 uv_iq( vec2 uv, ivec2 texture_size )
{
    vec2 pixel = uv * texture_size;

    vec2 seam = floor(pixel + 0.5);
    vec2 dudv = fwidth(pixel);
    pixel = seam + clamp( (pixel - seam) / dudv, -0.5, 0.5);
    
    return pixel / texture_size;
}

void main()
{
    vec4 texelCol = texture(ourTexture, uv_iq(txC, ivec2(512))) * col;

    fragColor = texelCol;
}


That said,

- to be sure - I should not round or floor the vertices in the renderer to integer pixel positions if I want subpixel rendering, right?

notnullnotvoid

Draw all the sprites at pixel-perfect 1:1 scale into a texture first, and then render that texture to the screen at a larger size with the fancy shader, instead of rendering the sprites directly to the screen. The downside is that this method only allows rendering things in perfect alignment with the pixel grid, though in my opinion pixel art games look best when they stick to the grid anyway.


I'm not sure if I understand it - are you saying to render the game in a tiny texture attached to a framebuffer, then scale that up and render it to the screen? Though if I do understand that correctly, doesn't that mean all the movement would happen in texel space, so the subpixel shader would be meaningless? Maybe I am confusing things.

notnullnotvoid

If you want some things to be rendered off-grid, you can also use a hybrid approach where the tile grid is rendered with this method first, and then other entities are rendered on top with either method #1, or with the original alpha-blended approach (which is less problematic for sprites whose edges you don't expect to be perfectly aligned).


So I could render the tiles using the tiny texture method, then whatever is off-grid is rendered "normally" on a second pass?

Again, I think I misunderstood what you said about rendering all the sprites at a pixel-perfect 1:1 scale - could you elaborate on that?

Thanks
Those screenshots confirm it's definitely an alpha issue. The dark outline around the clouds is also alpha-related, but it's a different issue, specifically related to the fact that you are using straight alpha (as opposed to premultiplied alpha) and the transparent areas are black. This article does a good job of explaining why that causes problems and how you can fix it (with either bleeding or premultiplication):

http://www.adriancourreges.com/bl.../09/beware-of-transparent-pixels/

To be clear, using premultiplied alpha will fix the dark outlines, but it will not fix the apparent gaps between adjacent tiles.
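
For reference, a minimal sketch of what the premultiply switch usually involves (where exactly this lives depends on your asset pipeline, so the loop and names below are illustrative assumptions, not your renderer's code):

// Sketch of switching to premultiplied alpha.
// 1) Premultiply RGB by A once, when loading/building the 8-bit RGBA atlas:
for (int i = 0; i < pixelCount; ++i)
{
    uint8_t a = pixels[i * 4 + 3];
    pixels[i * 4 + 0] = (uint8_t)((pixels[i * 4 + 0] * a + 127) / 255);
    pixels[i * 4 + 1] = (uint8_t)((pixels[i * 4 + 1] * a + 127) / 255);
    pixels[i * 4 + 2] = (uint8_t)((pixels[i * 4 + 2] * a + 127) / 255);
}

// 2) Blend with GL_ONE instead of GL_SRC_ALPHA, since the color has already
//    been multiplied by alpha:
glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);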

Grid
I'm not sure if I understand it - are you saying to render the game in a tiny texture attached to a framebuffer, then scale that up and render it to the screen? Though if I do understand that correctly, doesn't that mean all the movement would happen in texel space, so the subpixel shader would be meaningless?


The sprites themselves will be locked to a pixel grid with that method, but you will avoid crawling or distortion of pixels when the camera zooms and pans by non-integer scale factors, which is what the shader is for. This shadertoy example gives a side-by-side comparison:

https://www.shadertoy.com/view/ltBGWc
notnullnotvoid

Those screenshots confirm it's definitely an alpha issue. The dark outline around the clouds is also alpha-related, but it's a different issue, specifically related to the fact that you are using straight alpha (as opposed to premultiplied alpha) and the transparent areas are black.


You are right - now I am using premultiplied alpha and the sprites are looking much better! :D

notnullnotvoid

The sprites themselves will be locked to a pixel grid with that method, but you will avoid crawling or distortion of pixels when the camera zooms and pans by non-integer scale factors, which is what the shader is for.


I will try experimenting with rendering 1:1 and scaling it up as well, to try to solve the other issue.

Thanks a lot!
mmozeiko

Anyway - what I want to say is that there is no need to worry about a magic "error margin of half a texel". That is not a real thing, except maybe in some old broken drivers. But in such cases there would be so many other broken things that I would not worry about it at all.


It's quite a common bug in OpenGL.
Dawoodoz
It's quite a common bug in OpenGL.


If you want to be believed (especially given your track record of making similar incorrect claims on this site), feel free to point to specific drivers or hardware that implement this incorrectly, with evidence that they do so. Unless and until you can do that, readers of this thread should be advised that the way OpenGL interprets texture coordinates and screen coordinates is guaranteed by the standard, and has been correctly implemented in mainstream drivers for a long time.