Reading TTF files and rasterizing them using a handmade approach, Part 2: Rasterization
I streamed the writing of this entire post on twitch.tv/sharlock93; please do drop by if you want to see someone stumble their way through writing code and learning stuff. If you have any questions, don't hesitate to ask here or on twitter @Sharlock93; I'm more than happy to answer everyone.

Please let me know if you find any mistakes in this post, there are probably a few.

Introduction

This is the 2nd part of my previous article about font reading and rasterization; this part is about rasterizing the outlines.
The general topics covered in this part are:
  • creating an OpenGL context on Windows natively,
  • loading our needed functions for the OpenGL context,
  • learning some math about Bezier curves and how to draw font outlines
  • then for our final section we will be doing a scanline rasterization of our font outlines.


With that out of the way, let's begin.

Part 1: OpenGL Context Creation on Windows

Most platforms have their own approach to managing a window and initializing an OpenGL context. You could drop in a library like GLFW and it would make your life easier, but for the sake of not dealing with external dependencies we are going to simply do it ourselves, also because it's fun.

On Windows, to get OpenGL going you have to link against three libraries: user32.dll, gdi32.dll and opengl32.dll, all of which ship with Windows natively. Our application starts with including the Windows.h header;
here is the code snippet for that:
#include <Windows.h>
#pragma comment(lib, "user32.lib")
#pragma comment(lib, "gdi32.lib")
#pragma comment(lib, "opengl32.lib")


the #pragma here links against the libraries so we don't have to specify them on the command line.

1.1 Microsoft's Shenanigans

If it's your first time opening a window on Windows, it's going to be a bit complicated; if it's your nth time, you are copy-pasting the setup code right about now and having some PTSD moments.

Having to deal with OpenGL is going to add some extra complexity, so we are going to take this one chunk at a time. The overall steps are:

  • Create a Window
  • Choose a pixel format that is hardware accelerated
  • Make an OpenGL rendering context
  • use OpenGL commands.


However, the problem with those steps is that we won't get a modern version of OpenGL; we get version 1.1, as that is the only one Microsoft supports natively. For newer versions of OpenGL you need drivers from your GPU vendor, and most vendor drivers provide functions that do the last two steps and give you a context that supports newer versions of OpenGL. There is a weird caveat though: in order to load the GPU-vendor-provided API functions, you need to have an OpenGL context. So, in essence, you create a rendering context that is the native OpenGL 1.1 rendering context (a fake context), then using that context you load the API functions from your vendor, and those let you make a rendering context that supports newer versions of OpenGL. Weirdly enough, you only need the context to load the API functions; calling them doesn't require that context.

Note: All of this sounds weird, I know; blame Microsoft, because I think back in the old days they dropped support for OpenGL to help push DirectX, but don't quote me on that.


So now our last two steps become something like this

  • Make an OpenGL 1.1 rendering context.
  • Load the GPU vendor functions that are used to make a better context.
  • Destroy the 1.1 Context.
  • Use the loaded functions to make a new context that has modern OpenGL support.
  • Use newer OpenGL commands
  • ???
  • Some sort of financial gain.


Let's go through these steps (and the motions while we are at it)

1.1.1 Creating a window

Before we start with all of these steps, let's set up some of our code. So far we have the includes and library linking; we will add a few global variables and typedefs that are going to be used across the program. Here is our code so far; over the next few sections we are going to fill in the init_window function:

#include <Windows.h>
#pragma comment(lib, "user32.lib")
#pragma comment(lib, "gdi32.lib")
#pragma comment(lib, "opengl32.lib")


// these are just the normal datatypes redefined to make typing easier
typedef unsigned char u8;
typedef char i8;
typedef unsigned short u16;
typedef short i16;
typedef unsigned int u32;
typedef int i32;
typedef float f32;
typedef double f64;


#define WIDTH 500 // set to desired size for your window
#define HEIGHT 500 // set to desired size for your window
HWND global_window; //this is the handle to our window
HDC global_dc; //handle to the DC for our window

void init_window(i32 width, i32 height) {}

int main(int argc, char **argv) {
  init_window(WIDTH, HEIGHT);
}



Creating a window involves a few steps:

  • First, register a Window Class by filling in a struct called "WNDCLASSEXA", then call the RegisterClassEx function to register it.
  • A Window Class is basically a factory for making windows; it also defines the style that is going to be shared by all the windows you open under it.
  • During the Window Class definition you also define a function that will process the received events/inputs, such as keyboard and mouse events, from any of the windows that are made under that Window Class.
  • After defining & registering your Window Class you can make windows by calling the CreateWindowEx function.


Here is what the struct used to define a window class, WNDCLASSEXA, looks like (straight from MSDN):

typedef struct tagWNDCLASSEXA {
  UINT      cbSize;
  UINT      style;
  WNDPROC   lpfnWndProc;
  int       cbClsExtra;
  int       cbWndExtra;
  HINSTANCE hInstance;
  HICON     hIcon;
  HCURSOR   hCursor;
  HBRUSH    hbrBackground;
  LPCSTR    lpszMenuName;
  LPCSTR    lpszClassName;
  HICON     hIconSm;
} WNDCLASSEXA, *PWNDCLASSEXA, *NPWNDCLASSEXA, *LPWNDCLASSEXA;


We will ignore cbClsExtra, cbWndExtra, hIcon, lpszMenuName, hIconSm, hInstance as they are not needed for our purposes.

Here are some notes on the fields:

  • cbSize needs to be set to sizeof(WNDCLASSEXA)
  • style is a flag where you specify the class style/properties; we will set it to (CS_OWNDC | CS_HREDRAW | CS_VREDRAW)
  • CS_OWNDC tells Windows that each window under this class will have its own Device Context and not one shared between all of the windows,
  • CS_HREDRAW and CS_VREDRAW tell Windows to issue redraw events when our window size changes horizontally or vertically.
  • lpfnWndProc takes a WndProc function; every event/message that is issued for a window of that class will go through that function, so it basically handles input and such things.
  • hCursor sets how the cursor looks when it enters your window.
  • hbrBackground sets the color of the background for newly created windows.
  • lpszClassName is, aptly, the name of the class; it must be referenced when making a window under this class.


Once we have filled the struct we register it via RegisterClassEx, and then we can call CreateWindowEx using the class name as a reference.

Here is what the WndProc function prototype looks like; we can copy-paste this and just change the name of the function.

LRESULT CALLBACK MainWndProc(
    HWND hwnd,// handle to window
    
	// the message received; check the link below, under the Message Types table, in the last section (General) you will find all the possible WM messages categorized. We will need some of them.
    UINT uMsg,        
    
    WPARAM wParam,    // first message parameter
    LPARAM lParam     // more message info
) 

Window Messages

we will add this function to our code but leave the body empty for now:

LRESULT CALLBACK rasterize_proc( HWND hwnd, UINT uMsg, WPARAM wParam, LPARAM lParam) {
}



The function signature for CreateWindowEx looks like this

HWND CreateWindowEx(
  DWORD     dwExStyle,    // Extended Window Styles, possible values found in the link below
  LPCSTR    lpClassName,  // name of the Window Class we registered
  LPCSTR    lpWindowName, // the title of the window
  DWORD     dwStyle,      // a bunch of different styles for how the window should look, possible values in the link below
  int       X,            // initial position of the window
  int       Y,            // initial position of the window
  int       nWidth,       // width of the window
  int       nHeight,      // height of the window
  HWND      hWndParent,   // if this is a child window, pass its parent
  HMENU     hMenu,        // if you want your window to have a menu like (File, Edit, View..etc) you create a Menu and pass the handle here
  HINSTANCE hInstance,    // the instance of the application to associate with the window (an application can be run multiple times to create multiple windows; hInstance distinguishes these instances of the application)
  LPVOID    lpParam       // a pointer to a CREATESTRUCTURE struct, this is passed to the WndProc function through the lParam parameter
)


Window Extended Styles

Window Styles

Now lets register a Window Class and make a window, Here is the code:

WNDCLASSEXA wnd_class = {
		sizeof(WNDCLASSEXA),
		CS_OWNDC|CS_HREDRAW|CS_VREDRAW,
		rasterize_proc,
		0, 0,
		NULL, NULL,
		LoadCursor(NULL, IDC_ARROW),// loads a predefined cursor
		(HBRUSH)COLOR_BACKGROUND,//
		NULL,
		"rasterizer",
		NULL};

	RegisterClassEx(&wnd_class);

	RECT rect_size = {0, 0, width, height};

	//this function adjusts the rect passed in so that the resulting client rect (drawable area) is width x height; the style here must match the dwStyle passed to CreateWindowEx below
	AdjustWindowRect(&rect_size, WS_VISIBLE, 0);

	global_window =  CreateWindowEx(
			0, //No extended styles are needed
			"rasterizer", //the name of our registered class
			"rasterizer", //the name of our window title, riviting stuff.
			WS_VISIBLE, // because we want our window to be visible when its first created
			0, 0,// put our window at the top left corner of the monitor
			rect_size.right, rect_size.bottom, 
			NULL, NULL, NULL, NULL);



1.1.2 Choosing the pixel format

A pixel format is basically a description of a set of pixels/a drawing surface, i.e. think of it as describing a bitmap/picture: you specify how the pixels are laid out, their size in memory, etc.
In this context the pixel format is associated with OpenGL; it describes the area OpenGL will draw on, so it basically defines our monitor/screen in this case.

To begin this process we first get a Device Context (DC). A DC is a structure that describes attributes related to GDI drawing, and it also describes the region where we draw our stuff, so when we use OpenGL we also need a DC for our window to specify the region we are drawing on.

  • We get a (DC) for the current window by calling GetDC.
  • We then set the pixel format for the DC.
  • To set the Pixel Format properly, we fill a PIXELFORMATDESCRIPTOR structure and ask windows to give us the closest PixelFormat to the one we filled out.
  • We do this by calling ChoosePixelFormat with our filled struct and it will return the ID/Index of the closest pixel format and then we can pass that to SetPixelFormat to set it.



the PIXELFORMATDESCRIPTOR struct is this

PIXELFORMATDESCRIPTOR


typedef struct tagPIXELFORMATDESCRIPTOR {
  WORD  nSize;
  WORD  nVersion;
  DWORD dwFlags;
  BYTE  iPixelType;
  BYTE  cColorBits;
  BYTE  cRedBits;
  BYTE  cRedShift;
  BYTE  cGreenBits;
  BYTE  cGreenShift;
  BYTE  cBlueBits;
  BYTE  cBlueShift;
  BYTE  cAlphaBits;
  BYTE  cAlphaShift;
  BYTE  cAccumBits;
  BYTE  cAccumRedBits;
  BYTE  cAccumGreenBits;
  BYTE  cAccumBlueBits;
  BYTE  cAccumAlphaBits;
  BYTE  cDepthBits;
  BYTE  cStencilBits;
  BYTE  cAuxBuffers;
  BYTE  iLayerType;
  BYTE  bReserved;
  DWORD dwLayerMask;
  DWORD dwVisibleMask;
  DWORD dwDamageMask;
} PIXELFORMATDESCRIPTOR, *PPIXELFORMATDESCRIPTOR, *LPPIXELFORMATDESCRIPTOR;


The only fields you need to worry about are these

WORD  nSize; // must be set to the size of the struct, sizeof(PIXELFORMATDESCRIPTOR)
WORD  nVersion; // should be set to 1
DWORD dwFlags; // should be set to PFD_DRAW_TO_WINDOW | PFD_DOUBLEBUFFER | PFD_SUPPORT_OPENGL
BYTE  iPixelType; // should be set to PFD_TYPE_RGBA, basically saying each pixel is 4 bytes, red, green, blue, and alpha
BYTE  cColorBits; // should be 24, this sets the size of the pixels; it excludes the alpha bits as per MSDN, so it's 24 instead of 32 bits.


here is the code for these two steps:
// getting the device context for our window
global_dc  = GetDC(global_window);

// fill out our struct
PIXELFORMATDESCRIPTOR x = {0};
x.nSize = sizeof(PIXELFORMATDESCRIPTOR);
x.nVersion = 1;
x.dwFlags = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
x.iPixelType = PFD_TYPE_RGBA;
x.cColorBits = 24;

int pixel_format = ChoosePixelFormat(global_dc, &x); // this will give us the closest supported pixel format to the one we chose
SetPixelFormat(global_dc, pixel_format, &x); // set the pixel format


with a few more lines of code we will see our window show up

first we create an infinite loop to process events(msgs) that our window receives:


MSG _msg = {0};
while(!quit) {
    while(PeekMessage(&_msg, global_window, 0, 0, PM_REMOVE)) { 
		TranslateMessage(&_msg); 
		DispatchMessage(&_msg); 
	}
}
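
The quit variable used here is a new global flag; here is a minimal sketch of the declaration (placing it next to our other globals is my assumption):

i32 quit = 0; // set to 1 from the window procedure when we want to leave the loop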


We use this quit global to tell our program when to exit the infinite loop. At the start of every iteration we peek to see if we have received any events (msgs) and pass each msg to TranslateMessage, which translates "virtual keys" into "characters". Basically, when you press a key your keyboard produces KEYDOWN and KEYUP events, which your program receives as WM_KEYDOWN and WM_KEYUP messages; if these up & down events come from keys that can be mapped to ASCII, i.e. pressing the A key on your keyboard, then TranslateMessage generates a WM_CHAR message that tells you which character was pressed.

DispatchMessage is a lot easier: it basically sends the message to your window procedure, in our case the rasterize_proc function.


Once our function has received an event, we process it, and then we must tell Windows that we have handled the message; we do that by returning a value from our function, and you can return 0 to indicate that you handled it. We aren't going to handle every message, so to let Windows handle the ones we don't we use the function DefWindowProc. It has the same signature as our function, so we pass the parameters straight through; for now we will let Windows handle all the messages.

LRESULT CALLBACK rasterize_proc( HWND hwnd, UINT umsg, WPARAM wparam, LPARAM lparam) {
	return DefWindowProc(hwnd, umsg, wparam, lparam);
}


now if you compile your application and run it, you will see that you have a window going :D



You will realize that you cannot close this window unless you kill its task in Task Manager... that is because our loop never ends. We can fix it by setting quit to 1 when the ESC key is pressed.

LRESULT result = 0;
if(umsg == WM_KEYUP && wparam == VK_ESCAPE) {
	quit = 1;
} else {
	result = DefWindowProcA(hwnd, umsg, wparam, lparam);
}
return result;


Once we receive a "WM_KEYUP" event, we check whether it came from the Escape key by checking the wparam parameter; if it's VK_ESCAPE we set quit to 1, and that is it.

Have fun playing with a window (into your soul, that is why its empty)

1.1.3 Creating an OpenGL context

This is as good a time as any to discuss how we are going to load the vendor-specific functions: we do this by calling a function named wglGetProcAddress with the name of the function that we want.

Of course every function will have a different prototype, and because OpenGL is always changing, the governing body of the OpenGL standard (the Khronos Group) provides headers that include all the function prototypes you can use to load your functions, plus all the typedefs used in OpenGL.
here is where you can find these files:
  • khrplatform.h specifies data types and defines for specific platforms.
  • glcorearb.h this contains the function prototypes in the "core profile" for OpenGL
  • the "Core Profile" means that support for legacy code is not available, so you can't use the fixed-function pipeline.
  • wglext.h windows specific extensions that are needed.


All of these files are provided by the Khronos Group. Can you make these files yourself? Yes. Will we do it here? No.

Khronos Group Website

Using that link, you can get all three files.
Do note that we have to include khrplatform.h as <KHR\khrplatform.h> and glcorearb.h as <GL\glcorearb.h>;
wglext.h can just be included normally.

So let's start on this: first we include glcorearb.h, which in turn includes khrplatform.h, then we include wglext.h too.

here is the code:

#include <GL\glcorearb.h>
#include "wglext.h"


pretty impressive stuff don't you think?

Just as a note, I compile this on the command line and not through Visual Studio; be sure to add the proper locations to your compiler's include path to make sure it compiles.


Every OpenGL function prototype's data type has a similar structure, mainly this: PFN + the function name in uppercase + PROC. So in order to load the function wglCreateContextAttribsARB (which we will rename to wglCreateContextAttribs, basically removing the ARB) we do this:

 PFNWGLCREATECONTEXTATTRIBSARBPROC wglCreateContextAttribs = (PFNWGLCREATECONTEXTATTRIBSARBPROC) wglGetProcAddress("wglCreateContextAttribsARB");


However, if you go ahead and run that code right away, you will see that the function basically returns NULL, and the reason is that we don't have an active OpenGL rendering context. What the rendering context does is track the current configuration of the OpenGL state machine; it is implicit in all OpenGL-related calls and drawing commands, thus no OpenGL command will work if you don't have a context set. OpenGL has a lot of settings and parameters, and all of them are tracked via the rendering context.

To create one we call wglCreateContext, which is provided by Windows, and set it as the current context by passing the returned context to wglMakeCurrent. However, wglCreateContext is not flexible and won't allow us to request things like a specific OpenGL version, so once we have that context we will load the wglCreateContextAttribsARB function and use it to create a rendering context that supports OpenGL 4.5. Here is the code:

HGLRC dummy_context = wglCreateContext(global_dc);
wglMakeCurrent(global_dc, dummy_context);


Now if we load our function the result won't be NULL.



Alright, we have a function we can use to create a better context. To delete our current dummy context we first have to stop it being current by calling wglMakeCurrent with NULL parameters, and then call wglDeleteContext with our context handle dummy_context.
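
A minimal sketch of that teardown, using the dummy_context handle from the snippet above:

// stop the dummy context from being current, then delete it; the function
// pointers we loaded through it stay valid
wglMakeCurrent(NULL, NULL);
wglDeleteContext(dummy_context);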

In order to create a context with a specific configuration we fill out an array and pass it to our loaded function wglCreateContextAttribs.

The array's data type is GLint and we fill it with attribute/value pairs; to indicate the end of the array we set the final element to 0, and this way we avoid having to pass the length of the array to the function.

The attributes we are going to request for our context are simple. First we request OpenGL 4.5 using the attribute names WGL_CONTEXT_MAJOR_VERSION_ARB and WGL_CONTEXT_MINOR_VERSION_ARB. We will also ask for a "core profile" only (basically we don't want any backwards compatibility with older versions of OpenGL and only want the latest core functions); we do this with the attribute WGL_CONTEXT_PROFILE_MASK_ARB and the value WGL_CONTEXT_CORE_PROFILE_BIT_ARB. Here is what the array looks like for OpenGL 4.5 and a "Core Profile":

GLint attr[] = {
		WGL_CONTEXT_MAJOR_VERSION_ARB, 4,
		WGL_CONTEXT_MINOR_VERSION_ARB, 5,
		WGL_CONTEXT_PROFILE_MASK_ARB,
		WGL_CONTEXT_CORE_PROFILE_BIT_ARB,
		0
	};

Notice that this isn't a multidimensional array; we set an attribute by immediately following the attribute name with the value we want. We then pass this array to our function and et voila, we have the context we need. Here is the code:

GLint attr[] = {
	WGL_CONTEXT_MAJOR_VERSION_ARB, 4,
	WGL_CONTEXT_MINOR_VERSION_ARB, 5,
	WGL_CONTEXT_PROFILE_MASK_ARB,
	WGL_CONTEXT_CORE_PROFILE_BIT_ARB,
	0
};

HGLRC gl_context = wglCreateContextAttribs(global_dc, 0, attr);
wglMakeCurrent(global_dc, gl_context);


that was a lot of boilerplate code...unfortunately there is more.


1.1.4 Loading the OpenGL functions
As we go on and write the rest of our code, we will be using some more OpenGL functions, so we have to load each of them before they are usable (except a couple that are already provided by Windows). In the interest of not having to say "we need to load this function too" every time I use a new OpenGL function, I will list all of the used functions here:

//related to Uniforms
PFNGLUNIFORM4FVPROC glUniform4fv;
PFNGLUNIFORM1IPROC glUniform1i;
PFNGLUNIFORMMATRIX4FVPROC glUniformMatrix4fv;

//related to program creation
PFNGLCREATEPROGRAMPROC glCreateProgram;
PFNGLLINKPROGRAMPROC glLinkProgram;
PFNGLUSEPROGRAMPROC glUseProgram;
PFNGLGETPROGRAMIVPROC glGetProgramiv;
PFNGLGETPROGRAMINFOLOGPROC glGetProgramInfoLog;

//related to shaders
PFNGLCREATESHADERPROC glCreateShader;
PFNGLSHADERSOURCEPROC glShaderSource;
PFNGLCOMPILESHADERPROC glCompileShader;
PFNGLATTACHSHADERPROC glAttachShader;
PFNGLGETSHADERIVPROC glGetShaderiv;
PFNGLGETSHADERINFOLOGPROC glGetShaderInfoLog;

PFNGLVERTEXATTRIBPOINTERPROC glVertexAttribPointer;
PFNGLENABLEVERTEXATTRIBARRAYPROC glEnableVertexAttribArray;

//related to VAO
PFNGLGENVERTEXARRAYSPROC glGenVertexArrays;
PFNGLBINDVERTEXARRAYPROC glBindVertexArray;

//related to buffers
PFNGLISBUFFERPROC glIsBuffer;
PFNGLGENBUFFERSPROC glGenBuffers;
PFNGLBUFFERDATAPROC glBufferData;
PFNGLBINDBUFFERPROC glBindBuffer;
PFNGLDELETEBUFFERSPROC glDeleteBuffers;


I have separated them into groups according to how the functions go together.

Loading these functions works the same way as loading wglCreateContextAttribsARB, so I made one function called load_funcs and put all the function loading in it; here it is:

void load_funcs() {
	glUniform4fv = (PFNGLUNIFORM4FVPROC) wglGetProcAddress("glUniform4fv");
	glUniform1i = (PFNGLUNIFORM1IPROC) wglGetProcAddress("glUniform1i");
	glUniformMatrix4fv = (PFNGLUNIFORMMATRIX4FVPROC) wglGetProcAddress("glUniformMatrix4fv");

	glCreateProgram = (PFNGLCREATEPROGRAMPROC) wglGetProcAddress("glCreateProgram");
	glLinkProgram = (PFNGLLINKPROGRAMPROC) wglGetProcAddress("glLinkProgram");
	glUseProgram = (PFNGLUSEPROGRAMPROC) wglGetProcAddress("glUseProgram");
	glGetProgramInfoLog = (PFNGLGETPROGRAMINFOLOGPROC) wglGetProcAddress("glGetProgramInfoLog");
	glGetProgramiv = (PFNGLGETPROGRAMIVPROC) wglGetProcAddress("glGetProgramiv");

	glCreateShader = (PFNGLCREATESHADERPROC) wglGetProcAddress("glCreateShader");
	glShaderSource = (PFNGLSHADERSOURCEPROC) wglGetProcAddress("glShaderSource");
	glCompileShader = (PFNGLCOMPILESHADERPROC) wglGetProcAddress("glCompileShader");
	glAttachShader = (PFNGLATTACHSHADERPROC) wglGetProcAddress("glAttachShader");
	glGetShaderInfoLog = (PFNGLGETSHADERINFOLOGPROC) wglGetProcAddress("glGetShaderInfoLog");
	glGetShaderiv = (PFNGLGETSHADERIVPROC) wglGetProcAddress("glGetShaderiv");


	glVertexAttribPointer = (PFNGLVERTEXATTRIBPOINTERPROC) wglGetProcAddress("glVertexAttribPointer");
	glEnableVertexAttribArray = (PFNGLENABLEVERTEXATTRIBARRAYPROC) wglGetProcAddress("glEnableVertexAttribArray");

	glGenBuffers = (PFNGLGENBUFFERSPROC) wglGetProcAddress("glGenBuffers");
	glBindBuffer = (PFNGLBINDBUFFERPROC) wglGetProcAddress("glBindBuffer");
	glBufferData = (PFNGLBUFFERDATAPROC) wglGetProcAddress("glBufferData");
	glDeleteBuffers = (PFNGLDELETEBUFFERSPROC) wglGetProcAddress("glDeleteBuffers");
	glIsBuffer = (PFNGLISBUFFERPROC) wglGetProcAddress("glIsBuffer");

	glGenVertexArrays = (PFNGLGENVERTEXARRAYSPROC) wglGetProcAddress("glGenVertexArrays");
	glBindVertexArray = (PFNGLBINDVERTEXARRAYPROC) wglGetProcAddress("glBindVertexArray");
}


Very simple and straightforward.
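
One thing the snippets don't show is where load_funcs gets called from. A reasonable place (this is my assumption, the article doesn't spell it out) is right after the modern context is made current, for example at the end of init_window:

// assumed call site: once the OpenGL 4.5 context is current, resolve all the
// extra entry points before any other OpenGL code runs
wglMakeCurrent(global_dc, gl_context);
load_funcs();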

1.2 Setting Up OpenGL

As if that wasn't enough, we have some more to go to setup OpenGL (fun fact: Vulkan is a lot more verbose)

Modern GPUs are designed to be highly parallel processors, and they are really good at running lots of small programs in parallel. These small programs are called shaders; they are written in their own specific programming language, compiled, and run on the GPU. For OpenGL this language is the OpenGL Shading Language, GLSL. Think of modern OpenGL as a pipeline: each section of this pipeline is called a "stage" and each stage has a specific function; their order is predefined and you cannot add more stages.

Each stage has a particular name and a shader must be created for it, then these shaders are "connected" (linked) together to create a "program"; the output of one stage is fed into the next stage, and we tell OpenGL to use this program for rendering.
This makes OpenGL very flexible: we can create multiple programs that each treat our data differently and produce different outputs from the same input.

The two stages that are required for all programs are the "vertex shader" and the "fragment shader". The vertex shader runs on all our "vertex" (think of it like a point) inputs and generates "fragments", then the fragment shader runs on all the fragments generated and the output is displayed on screen. Our only focus will be those two stages, and we will be doing pretty simple stuff.

The vertex shader is the first stage and is necessary for a valid GPU program; the vertex shader is the one that gets our "input" data, and everything after that is passed from the vertex shader to the other stages.

Think of the vertex shader as a box with a number of "holes" in it, called generic vertex attributes; the number of generic vertex attributes depends on the GPU and OpenGL version, and we can assign specific data to each of these attributes. Usually the number of these inputs is 16; you can check yours with the following code (of course, you need an active OpenGL context first):
i32 attribs_count = 0;
glGetIntegerv(GL_MAX_VERTEX_ATTRIBS, &attribs_count);
printf("%d\n", attribs_count);


In client code (the program you write is the client and the GPU is the server) these inputs are referenced by numeric IDs, 0 to 15; on the GPU side you can reference them using variables. More on this in a bit.


All of our shader code will be in files (you can have them as strings in your C source code too), so here is what we will do:

  • read shader source code from files.
  • compile these shader source codes.
  • link them together to make a program.
  • use this program to render.


1.2.1 Writing & Reading shader source code from files.

Nothing too complex here: we will write a function to read an entire file and return its contents. This is very simple, as we already have the code from our previous article, called read_file.
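
In case you don't have Part 1 handy, here is a minimal sketch of what such a helper can look like (the version from Part 1 may differ in details; this one assumes <stdio.h> and <stdlib.h> are included):

// read a whole file into a malloc'd, NUL-terminated buffer;
// optionally report the file size through out_size
char *read_file(const char *filename, i32 *out_size) {
	FILE *f = fopen(filename, "rb");
	if(!f) return NULL;
	fseek(f, 0, SEEK_END);
	long size = ftell(f);
	fseek(f, 0, SEEK_SET);
	char *buffer = (char*) malloc(size + 1);
	fread(buffer, 1, size, f);
	buffer[size] = 0; // NUL-terminate so it can be used directly as a C string
	fclose(f);
	if(out_size) *out_size = (i32)size;
	return buffer;
}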

Vertex Shader: all shader programs must have a main function where execution starts, and each stage of the pipeline has specific functionality; here is the code for a very simple vertex shader:

#version 450
layout(location = 0) in vec2 vertex_points;

void main() {
	gl_Position = vec4(vertex_points.x, vertex_points.y, 0, 1);
}

The #version 450 specifies the version of GLSL and it's important to have here; version 450 means GLSL 4.50, which is used with OpenGL 4.5.
layout(location = 0) in vec2 vertex_points;

First, this is a variable in the shader program; the variable name is vertex_points and it is a vec2. GLSL has a set of built-in data types vec2, vec3, vec4; these are vector data types for 2, 3, and 4 dimensional vectors, and most graphics code heavily relies on them as everything is math oriented. The in indicates that data is coming into this shader; there is also an out, which indicates data going out of the shader and into the next stage.

layout(location = 0) basically specifies that we can reference this specific "input" at index zero in our client code; we can change this to any number we want. Usually you wouldn't write the number explicitly but rather ask OpenGL which index the variable vertex_points is located at, but for our use directly assigning the index ourselves is a lot easier.

void main() {} very straight forward, the entry point for the shader.

gl_Position = vec4(vertex_points.x, vertex_points.y, 0, 1);

The variable gl_Position is a special built-in variable in OpenGL that you assign the points of a polygon to; if you make a triangle you would assign the points of the triangle to this variable, and this is how OpenGL knows which of the data we have sent into the vertex shader actually represents a polygon.
The data type of this variable is vec4, so we need to convert our incoming vec2 data; we do this by assigning the x, y of the vec2 to the x, y of a vec4 and setting z, w of the vec4 to 0 and 1 respectively. z is set to zero because we aren't dealing with 3D, and w is set to 1 to indicate that this vertex data is actually a "point" and not a "direction" (the concept of points versus vectors and how they are represented in computers is outside the scope of this post).


alright now here is the Fragment Shader

#version 450
out vec4 fragment_color;
void main() {
	fragment_color = vec4(1,1,1,1);
}

This is called a "passthrough" shader as there is almost nothing in it, only one out variable that is assigned the value vec4(1,1,1,1). For now we will just have this one output variable, and this fragment_color variable is the color output for our fragments, in this case pure white.


Now that we have both of our shaders, we can go ahead and write them to files; I have vertex_shader.glsl and fragment_shader.glsl.

Before we go further, we will put all our initial OpenGL setup in a function setup_opengl, which we will call right after we have loaded all our OpenGL functions.

void setup_opengl() {}


So to load our two shaders, we pass their names to read_file and that is about it.

char *vertex_shader_source = read_file("vertex_shader.glsl", NULL);
char *fragment_shader_source = read_file("fragment_shader.glsl", NULL);


1.2.2 Creating the Shaders & Compiling them

Creating shaders is also very straightforward; here are the functions that we will need:

glCreateShader(GLenum shader_type);
glShaderSource(GLuint shader, GLsizei count, const GLchar **string, const GLint *length);
glCompileShader(GLuint shader);
glGetShaderiv(GLuint shader, GLenum pname, GLint *params);
glGetShaderInfoLog(GLuint shader, GLsizei maxLength,GLsizei *length, GLchar *infoLog);


We create the shader objects (they are just integer handles/indices) by calling glCreateShader with the shader type enum we need, mainly one of these two: GL_VERTEX_SHADER or GL_FRAGMENT_SHADER. After that we attach the source code of each shader using glShaderSource; because we have one giant null-terminated string as the source, we set the count parameter to 1 and length to NULL. Then we call glCompileShader to...you know...compile the shader, as the name suggests.

The shaders might contain errors, so to check for that we call glGetShaderiv with the parameter GL_COMPILE_STATUS and get the result through an int* passed as the 3rd parameter.

If the result is GL_TRUE then we have no error; otherwise we ask for the errors. Given that the errors are reported as strings, we need a buffer to put them in, so we first ask for the size of the generated error log by calling glGetShaderiv again, this time with the value GL_INFO_LOG_LENGTH. Once we have the size we can allocate an appropriately sized buffer and pass it to glGetShaderInfoLog. We can put all of this into a neato lil function that makes a shader for us and compiles it.
Here it is:

u32 create_shader_compile(char *shader_filename, GLenum shader_type) {
  char *shader_source = read_file(shader_filename, NULL);
  u32 shader = glCreateShader(shader_type);
  glShaderSource(shader, 1, &shader_source, NULL);
  glCompileShader(shader);
  
  i32 shader_iv_result = 0;
  glGetShaderiv(shader, GL_COMPILE_STATUS, &shader_iv_result);
  if(shader_iv_result != GL_TRUE) {
    glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &shader_iv_result);
    char *log = (char*) malloc(shader_iv_result);
    glGetShaderInfoLog(shader, shader_iv_result, &shader_iv_result, log);
    printf("%s\n", log);
    free(log);
    free(shader_source); // don't leak the source buffer on the error path either
    return 0;
  }
  free(shader_source);
  return shader;  
}


and now our shader creation is simply this:
u32 vs = create_shader_compile("vertex_shader.glsl", GL_VERTEX_SHADER);
u32 fs = create_shader_compile("fragment_shader.glsl", GL_FRAGMENT_SHADER);


if we have any errors we will know and it will be printed on the console.


1.2.3 Making a Program and linking shaders

Now that you know how to make shaders, guess what? You also know how to make a program: same basic idea, just slightly different functions. Here they are:
glCreateProgram();
glAttachShader(GLuint program, GLuint shader);
glLinkProgram(GLuint program);
glGetProgramiv(GLuint program, GLenum pname, GLint *params);
glGetProgramInfoLog(GLuint program, GLsizei maxLength, GLsizei *length, GLchar *infoLog);


same exact process as the shaders so we will make a function for this too:

u32 create_program_link_shader(u32 *shaders_to_link, i32 size) {
  u32 program = glCreateProgram();
  for(int i = 0; i < size; i++) {
    glAttachShader(program, shaders_to_link[i]);
  }
  
  glLinkProgram(program);
  
  i32 program_iv_result = 0;
  glGetProgramiv(program, GL_LINK_STATUS, &program_iv_result);
  
  if(program_iv_result != GL_TRUE) {
    glGetProgramiv(program, GL_INFO_LOG_LENGTH, &program_iv_result);
    char *log = (char*) malloc(program_iv_result);
    glGetProgramInfoLog(program, program_iv_result, &program_iv_result, log);
	printf("%s\n", log);
	free(log);
	return 0;
  }
  
  return program;
}


Straightforward, isn't it? And to use this program with OpenGL? Call glUseProgram(program).

this is what our setup_opengl looks like now:

void setup_opengl() {
	u32 vs = create_shader_compile("vertex_shader.glsl", GL_VERTEX_SHADER);
	u32 fs = create_shader_compile("fragment_shader.glsl", GL_FRAGMENT_SHADER);
	
	u32 shaders[] = {vs,fs};
	u32 program = create_program_link_shader(shaders, 2);
}


1.2.4 Miscellaneous & important OpenGL stuff

VBO: Vertex Buffer Objects. Think of these as pointers to memory on the GPU; you generate them and use them to upload data to the GPU. Here are the related functions we need for creating buffers and uploading data:
glGenBuffers( GLsizei n, GLuint * buffers);
glBindBuffer( GLenum target, GLuint buffer); 	
glBufferData( GLenum target, GLsizeiptr size, const GLvoid * data, GLenum usage);

we will use them in the next section


Once we upload some data to the GPU we have to tell the GPU how to interpret this data and what exactly the buffer contains; we do this with the following function:

glVertexAttribPointer( GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const GLvoid * pointer);

We basically call this function to tell the GPU: "hey, the array of data you are receiving goes to the index generic attribute, each entry has size components (2, 3 or 4, like vec2/3/4), it's of this data type, the stride between consecutive values is this many bytes, and the data we want starts at this pointer/offset". For all our needs we will set the normalized parameter to GL_FALSE.

Imagine you upload a colored triangle as one array that contains both the triangle vertices and their colors, something akin to this:
triangle_data = [point1, point2, point3, color1, color2, color3]


Then, to tell the GPU where the data for attribute index 1 comes from in the array, assuming point is a struct with 2 float members x, y, you would call the function like this:

glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 0, 0);

however if the color and point were interleaved such as this:
triangle_data = [point1, color1, point2, color2..etc]

Now we have a stride: the distance from the start of one point to the start of the next is sizeof(point) + sizeof(color) bytes, so the function call is this:
glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, sizeof(point) + sizeof(color), 0);
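
To make the interleaved case concrete, here is a small sketch; the struct and the attribute indices are made up for illustration (and offsetof needs <stddef.h>):

// hypothetical interleaved vertex: position followed by color
typedef struct vertex {
	f32 x, y;        // position, fed to attribute 0
	f32 r, g, b, a;  // color, fed to attribute 1
} vertex;

void describe_interleaved_layout() {
	// stride is the size of one whole vertex (start-to-start distance),
	// and the last argument is the byte offset of the member inside the vertex
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, sizeof(vertex), (void*)offsetof(vertex, x));
	glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, sizeof(vertex), (void*)offsetof(vertex, r));
}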



VAOs: Vertex Array Objects. Think of a VAO as storing the description of your data, i.e. all your calls to glVertexAttribPointer and its related functions. Every time you want to set up the vertex attributes and how the data inside them is laid out, you would have to call glVertexAttribPointer again; in our situation we only call it here and there, but bigger applications have many vertex attributes and need these calls a lot, so you create a VAO, make the calls once, and then just reuse that VAO, which stores all the configuration.

You generate a VAO by calling glGenVertexArrays and "bind" it by...you guessed it, glBindVertexArray. We will use this later.
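
In code that boils down to something like this (a minimal sketch; we make exactly these calls inside setup_opengl later):

u32 vao;                    // a VAO is just an integer handle
glGenVertexArrays(1, &vao); // create one VAO
glBindVertexArray(vao);     // make it current so the attribute setup that follows is recorded into it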


Uniforms: as you know, we send arrays of data to the GPU, whether that data is vertex colors or some other attribute, but what if we want to send one piece of data that is common to all the vertices? Think of filling a triangle with a single color: you shouldn't be sending 3 color values to the GPU if you can afford to send just one, right?
This is where uniforms come in; these are values you can set that will be shared across all stages and all data sent to the GPU.

The way you specify a uniform is by adding the keyword uniform to the variable declaration; for example, a vec4 uniform would look like this: uniform vec4 <variable_name> = <default_value>

And in order to provide data to these uniform variables you use the glUniformWXYZ functions:

  • W can be either "Matrix" or nothing, to indicate whether we are specifying data for a matrix or not.
  • X is the number of components the variable has; "1" indicates a scalar variable like int, float, etc., while 2, 3, 4 indicate a vector with that many components. If it's a matrix, this indicates the size of the matrix: 2 means a 2x2 matrix, etc.
  • Y indicates the data type; it can be one of (i, f, ui) for ("integer", "float", "unsigned integer") respectively.
  • Z can be the letter 'v' or nothing, to indicate whether we are specifying an array of these variables or a single one.


So in order to specify data for this example variable
layout(location = 1) uniform vec2 test = vec2(0, 0)
we would do it like this: glUniform2f(1, x, y). We can also use the 'v' variant if we set the array count to 1, like so:
float v[] = {1, 2};
glUniform2fv(1, 1, v);

the number of parameters changes based on the function signature of course, we will be using some of these uniform functions.



Coordinates & Orthographic projection: OpenGL's rasterizer expects vertex coordinates to be inside a cube that is centered on the origin and has sides of length 2 along each axis, so it goes from (-1,-1,-1) to (1,1,1). What this means is that if you send in any coordinate outside this range it will not be visible. However, we want to send in coordinates that are a bit more intuitive for us, i.e. if I have a window of 500x500 pixels, something in its center is at (250, 250). To fix this we will map our coordinates from the screen's range of 0 to 500 to the range -1 to 1. It's basic algebra, but the idea is this: if our x axis goes from 0 to 500 we have the inequality 0 < x < 500, and we need to transform both sides so it becomes -1 < x < 1.

  • we first divide all sides by 500 => 0 < x/500 < 1
  • then multiply all sides by 2 => 0 < x/250 < 2
  • we then subtract 1 => -1 < x/250 - 1 < 1
  • so if we divide every x value by 250 and then subtract 1, our x value will be mapped to -1 to 1


You can apply the same idea to the y axis too; however, on Windows, Y starts at the top left corner and increases downwards, so to make it increase upwards you have to multiply by -1,
making it something like this: -1 < -y/250 + 1 < 1
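
Doing the mapping by hand looks something like this (a minimal sketch, assuming the 500x500 window from the example):

// map a pixel coordinate in a 500x500 window to OpenGL's -1..1 range,
// flipping Y so it increases upwards
void to_ndc(f32 px, f32 py, f32 *out_x, f32 *out_y) {
	*out_x =  px / 250.0f - 1.0f;
	*out_y = -py / 250.0f + 1.0f;
}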

We can either apply these two mappings to every x and y ourselves, or we can shove them into a matrix, upload the matrix to the GPU, and let the GPU do the matrix multiplication. A matrix that maps the axes like this is called an orthographic projection; here is the code to construct such a matrix:

void ortho_projection(float* m, float left, float right, float bottom, float top, float pnear, float pfar) {
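	// m is assumed to point at a 4x4 (16 float) matrix that is already zeroed;
	// only the non-zero entries are written below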
	float width = (right-left);
	float height = (top - bottom);

	m[0 + 4*0] = 2.0/(width);
	m[1 + 4*1] = 2.0/(height);
	m[2 + 4*2] = -2.0/(pfar - pnear);
	m[3 + 4*3] = 1;

	m[3 + 4*0] = - (right + left) / (right - left);
	m[3 + 4*1] = - (top + bottom) / (top - bottom);
	m[3 + 4*2] = - (pfar + pnear) / (pfar - pnear);

}

You can find the derivation of this matrix here:
Projection Matrix Derivation

once we upload this matrix we can directly multiply it with the vec2 data right in the shader, so our vertex shader changes to something like this:

#version 450
layout(location = 0) in vec2 vertex_points;
layout(location = 1) uniform mat4 projection = mat4(1);

void main() {
	gl_Position = projection*vec4(vertex_points.x, vertex_points.y, 0, 1);
}


and we can upload our matrix like this
glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);

GL_TRUE is passed as the transpose parameter, meaning our matrix will be transposed on its way into the shader; why it must be transposed comes down to row-major versus column-major storage conventions, which is outside the scope of this post, and reading some linear algebra will help clarify it. I highly recommend this series on YouTube:
Essence of Linear Algebra by 3Blue1Brown
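
One loose end: the ortho_matrix global used in the glUniformMatrix4fv call above is never shown being set up. A plausible sketch (the function name and parameter choices here are my assumptions, not from the article):

float ortho_matrix[16] = {0}; // must start zeroed, ortho_projection only fills in the non-zero entries

void setup_projection() {
	// map window coordinates (0..WIDTH, 0..HEIGHT) onto OpenGL's -1..1 cube;
	// swap bottom and top (pass HEIGHT, 0) if you want Y to grow downwards like window coordinates
	ortho_projection(ortho_matrix, 0, WIDTH, 0, HEIGHT, -1, 1);
}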

1.2.5 Drawing a Triangle

We have all the parts in place, all we need is to combine everything into setup_opengl and we can draw our triangle,
here is the code for setup_opengl:

void setup_opengl() {

	glGenVertexArrays(1, &vao);
	glBindVertexArray(vao);

	u32 vs = create_shader_compile("vertex_shader.glsl", GL_VERTEX_SHADER);
	u32 fs = create_shader_compile("fragment_shader.glsl", GL_FRAGMENT_SHADER);

	u32 shaders[] = {vs,fs};
	u32 program = create_program_link_shader(shaders, 2);
	glUseProgram(program);
	
	glGenBuffers(1, &vbo);

}


I have added two global variables, u32 vao, vbo (OpenGL object names are unsigned ints); one of them holds the VAO and the other the VBO. Usually you will have more than one VBO and VAO in an application, but for us one of each is enough.

we are going to define some more data types to help us, here they are :

typedef struct point2 {
  float x; 
  float y;
} point2;

typedef struct color {
  i32 r,g,b,a;
} color;


the data for our triangle is this:

point2 triangle[] = {
  {0, 0},
  {100, 100},
  {200, 0}
};


In order to upload this to the GPU, we first store it in the VBO, then describe the layout of the data, then tell the GPU to draw it.

first the upload:

glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, 3*sizeof(point2), triangle, GL_STATIC_DRAW);


glBindBuffer binds a buffer, in our case a vertex buffer (there are other types), to something called a target; per the specification a "target" defines how the VBO is going to be used. GL_ARRAY_BUFFER means our VBO contains vertex-related data such as the "position" or "color" of a vertex. There are other targets with more advanced usage, but that is for another post.

We do the actual upload with glBufferData: we upload the data to the target our VBO is bound to, specify the size of the data in bytes, and also specify how our data is "used". This last parameter is a hint that lets OpenGL make some optimizations around the data; say, for example, you upload a big atlas of glyphs once and keep reusing it, for that you can specify GL_STATIC_DRAW. The other values are for different usage patterns.

After we upload the data we specify its "format" and which "generic attribute" it goes to in the shader, using glVertexAttribPointer like we discussed before. Our triangle points are only x and y, so 2 components; their type is float; the stride is 0 because there is no other data interleaved with them; and they start at offset 0. They are going into attribute 0 in our shader, so the function call is:
glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);


Finally we call glDrawArrays and specify what type of basic geometric primitive we want to draw with our data, at which vertex our data starts, and how many vertices to use; for our triangle we specify GL_TRIANGLES, starting from vertex 0 and going for 3 vertices. So:
glDrawArrays(GL_TRIANGLES, 0, 3);


Oh, lest I forget, we actually need to enable the "generic attribute" we want, i.e. attribute 0 must be enabled in our case; we do this by calling glEnableVertexAttribArray(0), very nice naming.

so all together here is our triangle code:

void draw_triangle() {

	point2 triangle[] = {
		{0, 0},
		{100, 100},
		{200, 0}
	};

	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, 3*sizeof(point2), triangle, GL_STATIC_DRAW);
	glEnableVertexAttribArray(0);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
	glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);

	glDrawArrays(GL_TRIANGLES, 0, 3);
}


We call this code inside the infinite loop in our main function, after the messages have been processed. After the rendering, because we are using a double buffer, we swap the buffers by calling SwapBuffers(global_dc); the global_dc variable is the Device Context we created earlier as part of the window creation.

so here is the main function:
int main(int argc, char **argv) {
	init_window(WIDTH, HEIGHT);
	setup_opengl();

	MSG _msg = {0}; 
	while(!quit) {
		glClear(GL_COLOR_BUFFER_BIT);

		while(PeekMessage(&_msg, global_window, 0, 0, PM_REMOVE)) { 
			TranslateMessage(&_msg); 
			DispatchMessage(&_msg); 
		} 

		draw_triangle();

		SwapBuffers(global_dc);
	}
}

et voila triangle.



there are other concepts such as textures that I will discuss later as this is enough OpenGL for now.



Part 2: The other side of the coin

All of that so far was just so we can draw on the screen; quite a lot, and my verbosity in writing doesn't help.

Now we come to the actual meat of the post (the previous part was just the skin so you can totally remove it).

Here are the steps we are going to take to rasterize the outlines:
  • get a glyph outline (done in previous post)
  • because the outlines have Bezier curves, we are going to generate actual points that are on the glyph's outline based on these curves.
  • make edges from the generated points; this is simply taking 2 consecutive points and making them into a line.
  • use a scanline algorithm to rasterize the glyph into a bitmap.
  • upload the bitmap as a texture to the GPU and draw it.
  • end this god damn post.



2.1 Bezier curves and how to draw them

This part covers Bezier curves, both quadratic and cubic, and how to generate points on said curves.


2.1.1 Basics of Bezier curves

Bezier curves are basically a set of points that together describe a curve; they are widely used in computer graphics and similar fields.

The most basic Bezier is a line going from point P0 to P1, described by the function B(t) = (1-t)P0 + tP1, which is just a linear interpolation between P0 and P1 over the variable t: when t = 0 you get P0 and when t = 1 you get P1. This is the basis of all higher-order Beziers; the only difference is the function.

For a Quadratic Bezier you have the points P0, P1, P2, where P0 is the start point, P1 is called the control point and P2 is the end point; it has the following function:
B(t) = (1-t)^2*P0 + 2(1-t)*t*P1 + t^2*P2


Cubic Bezier is the same but one order higher and has 4 points.
Wikipedia has nice animations and general formulas for the important Bezier Curves.
Bezier Curves on Wikipedia

Most of our code will deal with Cubic & Quadratic Beziers. One useful fact is that a Cubic Bezier can be approximated by 2 separate Quadratic Beziers; here is what a Cubic Bezier looks like.



Yours truly can't create anything remotely nice.

As you can see we have 4 points, P0 to P3; P0 and P3 are the start and end points, while P1 & P2 are the control points. The way we convert this Cubic Bezier to Quadratics is to find a "middle point" between P1 & P2; this new point Pmid is used as the end point of the Quadratic Bezier P0, P1, Pmid and as the start point of the 2nd Quadratic Bezier Pmid, P2, P3. In this way we have broken a Cubic Bezier down into Quadratics.

I'm touching on this idea here because in the glyph outlines we will encounter Cubic Bezier curves, and we will break them down into Quadratics to make life easier.
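
As a quick sketch of that split, using the point2 struct from the OpenGL section (the function name is mine, just for illustration):

// the midpoint of the two control points closes the first quadratic
// (P0, P1, Pmid) and starts the second one (Pmid, P2, P3)
point2 cubic_control_midpoint(point2 p1, point2 p2) {
	point2 pmid = { (p1.x + p2.x) / 2.0f, (p1.y + p2.y) / 2.0f };
	return pmid;
}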


2.1.2 Generating points on the Bezier Curve

In order to actually draw the curve you have to generate points on said curve and connect them with straight lines; this is basically tessellation. The more points you generate, the more line segments you produce and the better the approximation.

There are techniques out there that will let you approximate the curve better and more efficiently than what we will do, but ours is the easiest to follow.

We will simply take the formula for the Bezier, subdivide t into the number of steps we want, and generate points that way. Let's write some code for this:

 void tessellate_bezier(point2 *output, i32 *output_size, point2 p0, point2 p1, point2 p2) {
    i32 subdiv_into = 5;
    f32 step_per_iter = 1.0/subdiv_into;
    i32 out_size = 0;
    for(int i = 0; i <= subdiv_into; i++) {
      f32 t = i*step_per_iter;
      f32 t1 = (1.0 - t);
      f32 t2 = t*t;
      f32 x = t1*t1*p0.x + 2*t1*t*p1.x + t2*p2.x;
      f32 y = t1*t1*p0.y + 2*t1*t*p1.y + t2*p2.y;
      output[out_size].x = x;
      output[out_size].y = y;
      out_size++;
    }
    
    *output_size = out_size;
 }


Pretty straightforward code: we specify our desired subdivision (which is 5 here), then calculate how much t should step per iteration of the loop, and the rest is just the formula applied to both x and y. One thing to note is the output parameter: we must make sure we supply a big enough array to fit all our points. For our purposes and the font we are going to use, a size of 128 should be sufficient; of course you can change it according to your needs or even grow it inside the loop if you wanted to.

Let's draw a curve then; we will use the points we used for the triangle, namely:
p0 = {0, 0};
p1 = {100, 100};
p2 = {200, 0};


After we generate the points, we upload them to the GPU, and when we draw we change the primitive from triangles to consecutive line drawing, i.e. from GL_TRIANGLES to GL_LINE_STRIP, and use output_size as the count of our vertices.

here is the result:


Now let me show you what happens if we don't subdivide enough. If we use the following points:

point2 curve[] = {
	{0, 0},
	{50, 300},
	{200, 200},
};

but don't subdivide/step a good amount, here is the result:


that is subdivided only 5 times, if we increase it to 20, we get this:



There are algorithms out there that will continuously subdivide sections of the curve until they reach a particular "curvature" threshold.

here is the code for drawing the bezier lines

void draw_bezier() {
	point2 curve[] = {
		{0, 0},
		{50, 300},
		{200, 200},
	};

	point2 points[128] = {0};
	i32 point_count = 0;

	tessellate_bezier(points, &point_count, curve[0], curve[1], curve[2]);

	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, point_count*sizeof(point2), points, GL_STATIC_DRAW);
	glEnableVertexAttribArray(0);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
	glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);

	glDrawArrays(GL_LINE_STRIP, 0, point_count);

}


Nothing major here; the only thing to pay attention to is the glDrawArrays call with GL_LINE_STRIP, everything else is like the triangle drawing code.


2.2 Glyph outline tessellation

We already know how to extract the glyph outline, and as a refresher, each X, Y pair has a flag that gives us some information about the point. One of these flags is the on_curve bit, which indicates whether the point is on the actual glyph curve or not, i.e. whether the point is a control point or an actual point on the glyph that needs to be drawn.

Given all of this, we can loop through all the points of each contour of the glyph and try to create a closed line loop for each contour. Here is the general algorithm:

  • we begin by finding an on-curve point (on_curve is 1) on the glyph; this usually ends up being the first point of the glyph, and if not we skip ahead until we find one. Then the following rules apply.
  • if a point is on the curve, we add it directly to the list of generated points.
  • if the point is an off-curve point (on_curve is 0) then we have two possibilities:
  • if the next point is on curve, then this point together with the previous and next points forms a Quadratic Bezier; we tessellate using the previous point as p0, this point as p1 and the next point as p2.
  • if the next point is also off curve then it indicates a Cubic Bezier, so we find a point Pmid in the middle of the current point and the next point, and use this Pmid as p2 for our tessellation.
  • we might run into multiple off-curve points one after another, so we must be careful about what we choose as our previous point; we choose the latest generated point rather than the previous point on the glyph, because we have generated points that are not in the glyph outline.
  • if the point is the last point then:
  • if it's on curve, we add it to the generated points list.
  • if it's off curve, we tessellate it together with the first point in the outline, provided the first point is on the glyph.



Now onto writing the code: first we write a loop to go through all the points of the contours.

Each contour's end point index is in the endPtsOfContours field of the glyph_outline struct we made in the previous post; we also keep track of where each contour starts because we need it later on.
Here is the general loop structure:

void generate_points(glyph_outline *outline, point2 *generated_points, i32 capacity, i32 *point_count) {
	i32 j = 0;
	i32 index = 0; //keep count of current generated points
	for(i32 i = 0; i < outline->numberOfContours; i++) {
		i32 contour_start_index = j;
		for(;j < outline->endPtsOfContours[i]; j++) {
			glyph_flag *flag = outline->flags + j;
			...
		}
	}
	
	*point_count = index; //send back the data we have generated
}


I assume generated_points is going to be big enough to contain all of our points; we also pass in the capacity so we could check whether the array is full and, in principle, grow it.


Now, as I mentioned, if the point is on curve we add it and go to the next point. Remember, these code snippets go inside the 2nd loop.

i32 x = outline->xCoordinates[j];
i32 y = outline->yCoordinates[j];
if(flag->on_curve) {
	//point on curve so add directly 
	generated_points[index].x = x;
	generated_points[index].y = y;
	index++;
	continue;
} else {
	// handle off curve
}


and for off curve points, here is what we are going to do:

point2 p0 = generated_points[index - 1]; // the latest generated point
point2 p1 = {.x = x, .y = y};
point2 p2 = { //we will need this point either way
    .x = outline->xCoordinates[j+1],
    .y = outline->yCoordinates[j+1]
}; 
if(!flag[1].on_curve) { //check next point
	p2.x = p1.x + (p2.x - p1.x)/2.0;
	p2.y = p1.y + (p2.y - p1.y)/2.0;
}
i32 out_size = 0;
tessellate_bezier(generated_points + index, &out_size, p0, p1, p2);
index += out_size;


okay, I could just post the final code, but I think it's worthwhile to walk you through how I went about debugging this.
First let's start running this code: we need to get ourselves a glyph outline, so I will reuse the code I wrote in Part 1. Here is the initial font loading in the main function:

//globals
...
font_directory ft = {0}; //make this global so we can reuse
...
//everything here goes into main function
int file_size = 0;
char* file = read_file("envy.ttf", &file_size); //Envy Code R
char* mem_ptr = file;
read_font_directory(file, &mem_ptr, &ft);


now a function as the test bed for all our rendering of the glyph:

void render_glyph() {
	glyph_outline glyph = get_glyph_outline(&ft, get_glyph_index(&ft, 'A'));
	point2 temp_points[128];
	
	i32 index = 0;
	generate_points(&glyph, temp_points, 128, &index);
}

small note: I had originally set the subdivision count in tessellate_bezier to 20, which caused a crash because temp_points was too small to hold all the generated points.


now that we have our points, the best way to see if anything is broken is to draw them all; this function will be the same as the triangle and line functions, except we use GL_POINTS to draw the points.

also note that if you draw the glyph outline directly it is pretty big, as the grid it is defined on is big, so we will scale the points before drawing them.

here is the function for drawing an array of points:
void draw_points(point2 *points, i32 size) {

	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, size*sizeof(point2), points, GL_STATIC_DRAW);
	glEnableVertexAttribArray(0);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
	glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);
	glDrawArrays(GL_POINTS, 0, size);
}


I will scale the glyph down by 8 and draw it in the center of my screen, here is the code:

for(int i = 0; i < index; i++) {
	temp_points[i].x = temp_points[i].x/8 + 250;
	temp_points[i].y = temp_points[i].y/8 + 250;
}
draw_points(temp_points, index);

if you look at the points under a debugger, you will see a lot of repeated points. The main culprit here is the tessellation code: at t = 0 the curve evaluates to exactly p0, and because we take p0 to be the latest generated point we don't want to include it again, so we can start our tessellation loop at i = 1 instead of i = 0.

for(int i = 1; i < subdiv_into; i++) {
...
}


here is the result of our generated points:



this is pretty neato, however if you look closely you'll see that one point is missing: the last point of the last contour. That is because our loop needs to include endPtsOfContours (that is the actual last index of the contour), but due to the j+1 we would be doing an out of bounds access, and [m]flag[1][/m] would cause another out of bounds access.
If we note that the glyph contours actually form closed shapes, and instead of going "through all the points" we "go around the contour", then we can use the modulo operator to make sure our indices loop back properly.

We first need to calculate the number of points in each contour and then use that for the modulo; here is how we calculate it:

i32 contour_len = outline->endPtsOfContours[i] - contour_start_index + 1;
i32 cur_index = j;
i32 next_index = (j+1 - contour_start_index)%contour_len + contour_start_index;


we subtract contour_start_index because it maps j back into the "0 to contour_len" range, and afterwards we add contour_start_index back to get the proper index of the next point. To be honest this calculation is a bit too "smart"; you could instead special case the last index, but this works well enough.
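to make the wraparound concrete with some made-up numbers: say a contour starts at j = 10 and contour_len is 6, so its last point sits at j = 15; for j = 15 we get next_index = (16 - 10) % 6 + 10 = 10, i.e. the "next point" wraps back to the first point of that contour, while for j = 12 we simply get (13 - 10) % 6 + 10 = 13.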

We change our loop to [m]j <= outline->endPtsOfContours[i][/m] and change every instance of j+1 to next_index, now if we draw the points we get the last point properly:



now if we draw these points as lines we are supposed to get the A shape, so just changing GL_POINTS to GL_LINE_STRIP is enough, here is the result:



these are supposed to be closed shapes, but we can see from the picture (and your own code if you have come this far) that something is missing: we haven't closed the "contour outline". We can do that by adding the first point again to the list of generated points when we finish processing a contour, so we add this to our generation loop:

1
2
3
4
if(outline->flags[j-1].on_curve) {
    generated_points[index] = generated_points[generated_points_start_index];
    index++;
}

so why are we checking the previous point? Well, we put this code at the end of the loop, which means j is already at the start of the next contour and j-1 is the last point of the current contour. We check whether that last point is on curve or not: if it is, we need to add the first point manually; if it isn't, we must have tessellated at that point using the point before last as p0, the last point as p1 and the first point of the contour as p2, and since the tessellation at t = 1 evaluates to exactly p2, the tessellation would have already added the "first point" to the generated points, so in that specific case we don't need to add it manually. This sounds complicated, but just work through it with a debugger and you will get there.

now that we have added this code to add a last point, here is what the line strip looks like:



so why is there a random weird line connecting the two contours together? That is because GL_LINE_STRIP draws a line between every pair of consecutive points, so it inevitably connects the last point of one contour to the first point of the next. To fix this we have to create the line segments ourselves and be mindful of the points where one contour ends and the next begins, which the next section deals with.



2.3 Generating glyph edges from generated points.

So far we have the points we need; in this section we will generate the edges by connecting each two consecutive points on the "contour", and we will skip generating an edge between the two points going from one contour to the next.

In order to skip the edge between two contours, we need to keep track of where one ends and the next starts. We will follow the same approach TTF fonts use with their endPtsOfContours array, except ours keeps track of our own generated points, taking into account the extra points the tessellation produced.

This is pretty straightforward: we pass an array gen_pts_end_indices to generate_points; its size is the number of contours, which stays the same.

our function parameter becomes this:

void generate_points(glyph_outline *outline, point2 *generated_points, i32 capacity, i32 *point_count, i32* gen_pts_end_indices);


we will put the value index into this array at the end of each contour:
gen_pts_end_indices[i] = index;


very simple stuff, now that we have the contour end points, we can go ahead and generate our edges, here is the code for it.

line* generate_edges(point2 *pts_gen, i32 *edge_count, i32 *contour_ends, i32 contour_count) {
    i32 j = 0;
    i32 edge_index = 0;
    line *edges = (line*) malloc(sizeof(line)*contour_ends[contour_count-1]);
    for(i32 i = 0; i < contour_count; i++) {
      for(; j < contour_ends[i]-1; j++) {
        line *edge = edges + edge_index;
        edge->p1.x = pts_gen[j].x;
        edge->p1.y = pts_gen[j].y;
        
        edge->p2.x = pts_gen[j+1].x;
        edge->p2.y = pts_gen[j+1].y;
        edge_index++;
      }
      j++; //we are the end of the contour, skip over this point to the start of the next contour
    }
    
    *edge_count = edge_index;
    
    return edges;
}

we pass in the generated points, the contour end array and its length, and an edge_count out-parameter to get the number of edges back. We don't need to pass in the size of the generated points because it's already encoded in contour_ends. We return an array of line, which is simply a struct with two point2 fields, p1 & p2:
typedef struct line {
  point2 p1, p2;
} line;


now that we have generated the edges, we will make another small function to draw them, easy stuff:

void draw_lines(line *edges, i32 edge_count) {

	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, edge_count*sizeof(line), edges, GL_STATIC_DRAW);
	glEnableVertexAttribArray(0);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
	glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);
	glDrawArrays(GL_LINES, 0, edge_count*2);

}

note the edge_count*2 that is because each line is technically made of 2 vertices.

we will call the function with the generated and scaled points:

i32 edge_count = 0;
line *edges = generate_edges(temp_points, &edge_count, contour_end_pts, glyph.numberOfContours);
draw_lines(edges, edge_count);


here is the result:



pretty cool if I do say so myself.

now you would think that we are done generating edges, right? Wrong. Go ahead and change the letter 'A' to the number '6' or '9' and you will see a gap.



what might the problem be, you ask? Of course it's a bug in the generate_points code: we haven't accounted for the first point of a contour being off curve, we have always assumed it will be on curve. I mentioned this when describing the general algorithm but left the code out just so we could see it here.

2.3.1 First point starts off curve

If the first point is off curve, we keep track of that fact and, in order to find an on curve point to start us off, we handle 2 cases:

  • the next point is also off curve, which means there is an implied on curve point between the two (TrueType only uses quadratics), so we make a point Pmid halfway between them, which is guaranteed to be on the curve, and add it as the first point.
  • the next point is on curve, so we add that next point to the list and make sure to do j++ in order to skip over it.


here is the code:

void generate_points(glyph_outline *outline, point2 *generated_points, i32 capacity, i32 *point_count, i32* gen_pts_end_indices) {
	for(i32 i = 0; i < outline->numberOfContours; i++) {
		...
		i32 contour_start = 1;
		i32 contour_started_off = 0;
		for(;j <= outline->endPtsOfContours[i]; j++) {
			...
			if(flag->on_curve) {
			...
			} else {
				// handle off curve
				if(contour_start) {
					contour_started_off = 1;
					
					//next point is on curve, add it and skip
					if(outline->flags[next_index].on_curve) {
						generated_points[index].x = outline->xCoordinates[next_index];
						generated_points[index].y = outline->yCoordinates[next_index];
						index++;
						j++;
						continue;
					}
                    // next point is off curve, find pmid and add
					x = x + (outline->xCoordinates[next_index] - x)/2.0;
					y = y + (outline->yCoordinates[next_index] - y)/2.0;

					generated_points[index].x = x;
					generated_points[index].y = y;
					index++;
				}
				...
			}

			contour_start = 0;
		}
        ...
		if(contour_started_off) {
			//handle first point if its off
			point2 p0 = generated_points[index-1]; //our latest added point
			point2 p1 = {.x = outline->xCoordinates[contour_start_index], .y = outline->yCoordinates[contour_start_index]};// first point in the contour
			point2 p2 = generated_points[generated_points_start_index];

			i32 out_size = 0;
			tessellate_bezier(generated_points + index, &out_size, p0, p1, p2);
			index += out_size;
		}
	}
}


a bit too much code; this is the generate_points function with everything stripped out except for the new code. We have added two new variables, contour_start and contour_started_off: we use contour_start to tell us whether the point we are checking is the start of the contour, and if so we apply the specific cases I mentioned above.

then at the end of the contour we check whether the first point was off curve; if so we tessellate using the latest point we have added as p0, the "first point" as the control point p1 and the first generated point of the contour as p2, as you can see in the code. Maybe it takes a few tries to get, feel free to ask questions.

now if we check our '6' and '9' we get this:



2.4 Scanline algorithm and the final stretch

The scanline algorithm I'm going to write about here isn't "truly" a scanline algorithm; it's, as the kids say these days, a "scuffed" one. I will touch on this once we get to the end.

So first, the idea of "scanline algorithms" comes from old CRT TVs: they had something called an electron gun that would sweep left to right, top to bottom across the screen to draw the image, and people adapted this idea of going "left to right", "top to bottom" into an algorithm for drawing polygons.

Here is how you should see it:

  • we will be rasterizing the glyph on a bitmap with height H and width W
  • we scale the outline of the glyph so that it fits inside the bitmap's height
  • we loop through each row of the bitmap
  • at each iteration, we will get the intersection of the scanline with the edges of the glyph that fall on this scanline
  • a scanline here is just a horizontal straight line
  • horizontal lines can be represented by one number, i.e: f(x) = n,
  • we will sort the intersections by their X value, from smallest to biggest
  • fill in the pixels between each two consecutive intersection points.
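
To make the idea concrete before we dive into the real implementation, here is a tiny standalone toy (not the post's rasterizer, and the diamond coordinates are made up for illustration) that runs exactly this "intersect, sort, fill between pairs" loop over a 16x16 grid and prints the result as ASCII:

#include <stdio.h>
#include <stdlib.h>

typedef struct { float x1, y1, x2, y2; } demo_edge; //toy edge type, not the post's line struct

static int cmp_f(const void *a, const void *b) {
	float d = *(const float*)a - *(const float*)b;
	return (d > 0) - (d < 0);
}

int main(void) {
	//four edges forming a closed diamond inside a 16x16 grid
	demo_edge edges[4] = {
		{ 8,  1, 15,  8}, {15,  8,  8, 15},
		{ 8, 15,  1,  8}, { 1,  8,  8,  1},
	};

	for(int y = 0; y < 16; y++) {
		float scanline = y + 0.5f; //sample through the middle of the row so we never hit a vertex exactly
		float xs[8];
		int n = 0;

		for(int e = 0; e < 4; e++) {
			float ymin = edges[e].y1 < edges[e].y2 ? edges[e].y1 : edges[e].y2;
			float ymax = edges[e].y1 < edges[e].y2 ? edges[e].y2 : edges[e].y1;
			if(scanline <= ymin || scanline > ymax) continue; //this row doesn't cross the edge

			float dx = edges[e].x2 - edges[e].x1;
			float dy = edges[e].y2 - edges[e].y1;
			xs[n++] = (scanline - edges[e].y1)*(dx/dy) + edges[e].x1; //x of the crossing
		}

		qsort(xs, n, sizeof(float), cmp_f); //sort intersections left to right

		char row[17];
		for(int x = 0; x < 16; x++) row[x] = '.';
		row[16] = 0;

		for(int m = 0; m + 1 < n; m += 2) //fill between consecutive pairs of intersections
			for(int x = (int)xs[m]; x <= (int)xs[m+1]; x++) row[x] = '#';

		printf("%s\n", row);
	}
	return 0;
}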



Alright time for the implementation.


2.4.1 The Bitmap

Bitmaps are just a contiguous block of memory that we interpret as a 2D image. To show you how this works I will use Visual Studio's ImageWatch plugin, a pretty nice plugin for inspecting textures/bitmaps that are in memory.

ImageWatch for VS 2019

we begin by just making a block of memory that is 64x64:

u8 temp_texture[64*64] = {0};


we will index this memory like this: temp_texture[column + row*texture_width], where the texture width here is 64. Let's put a few pixels in the texture:

temp_texture[0 + 0*64] = 255;
temp_texture[32 + 32*64] = 255;

here is the result, one pixel at 0,0 and one pixel at the center



each "row" of this bitmap can be considered a "scanline", and that is exactly what we will iterate over: each "row" of the bitmap.

2.4.2 Main rasterization loop

Our main loop will go through each row of the bitmap and, for each row, check which of the edges we generated intersect with this "row"; we will store each of these intersections in an array and sort them.

after we have the intersections sorted we will pick two consecutive intersections and fill in the values between them.

the main loop looks like this

void rasterize_glyph(line *edges, i32 edge_count, u8 *bitmap, i32 bitmap_height, i32 bitmap_width) {
    f32 intersections[32] = {0};
    i32 intersection_count = 0;
    for(i32 i = 0; i < bitmap_height; i++) {
      f32 scanline = i;
      for(i32 j = 0; j < edge_count; j++) {
        line *edge = edges + j;
        //code for intersection and stuff goes here
      }
      //sort intersections
      if(intersection_count > 0) {
        for(i32 m = 0; m < intersection_count; m++) {
            // filling code
        }
      }
      
    }
}


for the intersection all we need is to find what X is when y = scanline (i.e. when y equals the i of our loop); using the equation of a line makes this pretty straightforward:

  • y - y1 = m*(x - x1), divide both sides by m
  • (y - y1)/m = x - x1, then add x1 to both sides
  • (y - y1)*(1/m) + x1 = x, where m is just the slope of the line, dy/dx, so 1/m is dx/dy
  • y in this equation is the scanline, and x1 & y1 are the coordinates of one of the edge's endpoints (we use p1)
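
as a quick sanity check with some made-up numbers: an edge going from (2, 0) to (6, 8) crossed by the scanline y = 4 gives x = (4 - 0)*(4/8) + 2 = 4, which is exactly halfway along that edge, as you would expect.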


putting this into code would be
f32 dx = edge->p2.x - edge->p1.x;
f32 dy = edge->p2.y - edge->p1.y;

f32 intersection = (scanline - edge->p1.y)*(dx/dy) + edge->p1.x;

you can see immediately that if dy is zero we get a division by zero; dy being zero means a horizontal line, which we will simply skip, and dx being zero means our intersection is just p1.x (or p2.x, as both are equal).

if(dy == 0) continue;

f32 intersection = -1;
if(dx == 0) {
    intersection = edge->p1.x;
} else {
    intersection = (scanline - edge->p1.y)*(dx/dy) + edge->p1.x;
}

we need to put our intersections into an array and after that we will sort them using qsort.

f32 intersections[32] = {0};
i32 intersection_count = 0;
for(i32 i = 0; i < bitmap_height; i++) {
	f32 scanline = i;
    for(i32 j = 0; j < edge_count; j++) {
        ...
        intersections[intersection_count++] = intersection;
    }
		
    qsort(intersections, intersection_count, sizeof(f32), cmp_floats);
}


cmp_floats is a simple function that compares two floats and returns -1, 0, or 1 depending on whether the first float is smaller than, equal to, or bigger than the second:

i32 cmp_floats(const void *a, const void *b) {
	f32 f_a = *(f32 *)a;
	f32 f_b = *(f32 *)b;
	if(f_a < f_b) return -1;
	else if (f_a > f_b) return 1;
	return 0;
}


now all that is left is to fill in the "blanks", as it were, which we do by taking each consecutive pair of intersections from the sorted array.

if(intersection_count > 1) {
	for(i32 m = 0; m < intersection_count; m += 2) {
		i32 start_index = intersections[m];
       	i32 end_index = intersections[m+1];

		for(i32 j = start_index; j <= end_index; j++) {
			bitmap[j + i*bitmap_width] = 255;	
       	}
	}
}


we can't run this code just yet, first we need to scale our glyph outline to fit inside our bitmap so in the render_glyph function we can calculate the proper scale

f32 scale = (f32)bitmap_height/(f32)(glyph.yMax - glyph.yMin);

bitmap_height is a global variable that contains the height of temp_texture, so it's 64 here; glyph.yMax and glyph.yMin tell us the full height of the glyph, so we scale this "full height" down to fit the entire glyph inside the bitmap.
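to put a made-up number on it: if the glyph's bounding box is 1536 font units tall (yMax - yMin = 1536), the scale comes out to 64/1536 ≈ 0.042, so a coordinate at yMax lands exactly on the top row of the 64 pixel bitmap after the shift described next.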

next, because some glyphs extend below the "baseline", i.e. they have points with negative coordinates, we have to "move" all the points such that none of the glyph's points end up below the baseline, and by extension no coordinate is under 0; we do this like so:

for(int i = 0; i < index; i++) {
    temp_points[i].x = ( temp_points[i].x  - glyph.xMin)*scale;
    temp_points[i].y = ( temp_points[i].y  - glyph.yMin)*scale;
}


xMin & yMin are the bottom left corner of the glyph's bounding box, so moving that corner to 0,0 shifts the entire glyph so that no point is below 0,0. We do this because our "scanline" starts at 0; with a bit of algebra you could skip this step and run your scanlines from the glyph's lowest point to its highest (yMin to yMax), but moving everything to 0,0 makes things easier to reason about.

now that we have scaled the glyph and moved it to 0,0 our rasterization can begin, here is the full code for the rasterizer so far, I have added a few more lines that I will explain:

void rasterize_glyph(line *edges, i32 edge_count, u8 *bitmap, i32 bitmap_height, i32 bitmap_width) {

	f32 intersections[32] = {0};
	i32 intersection_count = 0;

	for(i32 i = 0; i < bitmap_height; i++) {
		intersection_count = 0;

		f32 scanline = i;

		for(i32 j = 0; j < edge_count; j++) {
			line *edge = edges + j;
			
			f32 bigger_y = MAX(edge->p1.y, edge->p2.y);
			f32 smaller_y = MIN(edge->p1.y, edge->p2.y);

			if(scanline <= smaller_y) continue;
			if(scanline > bigger_y) continue;

			f32 dx = edge->p2.x - edge->p1.x;
			f32 dy = edge->p2.y - edge->p1.y;
			
			if(dy == 0) continue;

			f32 intersection = -1;
			if(dx == 0) {
				intersection = edge->p1.x;
			} else {
				intersection = (scanline - edge->p1.y)*(dx/dy) + edge->p1.x;
			}

			intersections[intersection_count] = intersection;
			intersection_count++;
		}

		qsort(intersections, intersection_count, sizeof(f32), cmp_floats);


		if(intersection_count > 1) {
			for(i32 m = 0; m < intersection_count; m += 2) {
				i32 start_index = intersections[m];
				i32 end_index = intersections[m+1];

				for(i32 j = start_index; j <= end_index; j++) {
					bitmap[j + i*bitmap_width] = 255;	
				}
			}
		}
	}
}


The only new addition here is this section:

f32 bigger_y = MAX(edge->p1.y, edge->p2.y);
f32 smaller_y = MIN(edge->p1.y, edge->p2.y);

if(scanline <= smaller_y) continue;
if(scanline > bigger_y) continue;

this just says that we skip any edge that lies completely above or below our scanline, i.e. there is no intersection with these edges.


One thing to also note is the scanline <= smaller_y check: smaller or equal, not just smaller. We are not only skipping an edge if it is entirely above the scanline, we also skip it if its lower vertex lies exactly on the scanline. The reason is that where two edges meet at a shared vertex, a scanline passing exactly through that vertex would otherwise produce 2 intersections at the same spot and throw off the pairing; skipping one of the two edges in that case avoids the problem.


Here is the result in Visual Studio's image watch:



that is a very nice and crisp letter 'A', yes its upside down..sometimes we all find our world upside down, just one of those days.

Now that we have the bitmap, we should upload it to the GPU, which means textures!


2.4.3 OpenGL Textures

OpenGL textures are straightforward, just a bit involved, as we have to change the vertex shader and fragment shader too.

Textures are just "images" that have been uploaded to the GPU and drawn onto a surface; most of the time these "textures" are rectangles and they are drawn onto "quad" surfaces. A "Quad" is just a rectangle, and for OpenGL this "Quad" is made of 2 triangles.



So we have a texture/image and a quad to draw it on. The quad is 2 triangles whose vertices we need to send to the GPU, and we also need to send a 2nd set of coordinates called "texture coordinates" or "uv" coordinates. The texture is a rectangle whose coordinates go from 0,0 at the lower left corner to 1,1 at the upper right corner, and each vertex of the quad has an associated "uv" coordinate that says which part of the texture goes there; basically we match the lower left of our quad to the lower left of the texture, and so on for the other corners.

Assume we have a "Quad" going from 0,0 to 64,64. Here are the two triangles, the points go clockwise:

point2 triangle[6] = {
  
  //triangle 1
  {0, 0},
  {0, 64},
  {64, 64},
  
  //triangle 2
  {0, 0}, 
  {64, 64},
  {64, 0}
};


and to map the texture to that "quad" we will upload the following array of "uv" coordinates:

point2 uv_coordinate[6] = {
  {0, 0},
  {0, 1},
  {1, 1},
  
  {0, 0},
  {1, 1},
  {1, 0}
};


as you can see we are mapping the full "texture" image onto the quad. Given that our quad and texture/image match in size (both are 64x64) we will get a nice crisp image; if our quad is bigger the "texture" will be scaled accordingly. Do note that even if we make the quad bigger the "uv" coordinates don't change, OpenGL takes care of the scaling.

Now for the OpenGL texture upload code. It's the same process as we had for buffers: generate a texture, bind the texture, upload the data, describe the data's layout, and then use the data in the shader. Here are the functions involved:

glGenTextures(GLsizei n, GLuint *textures);
glBindTexture(GLenum target, GLuint texture);
glTexImage2D(GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLint border, GLenum format, GLenum type, const GLvoid *data);


the only complex part is glTexImage2D:
  • target can be one of multiple texture types, we will use GL_TEXTURE_2D
  • level is the mipmap level; you can upload multiple versions of the image at different scales to help OpenGL when the texture is drawn at different sizes.
  • internalFormat specifies the components of the texture; we want one component since our image is greyscale, so we choose GL_RED here
  • width & height are self explanatory
  • border is legacy stuff, set it to 0.
  • format indicates the pixel data format; ours is greyscale so GL_RED.
  • type is the data type of our pixels; set it to GL_UNSIGNED_BYTE as our data is a u8 array.
  • data is a pointer to our data.



then on the shader side we have to make a couple of changes: first we need a variable for the "uv" coordinates, then we have to get these coordinates to the fragment shader, as that is where we sample the texture onto the quad. Remember, only the vertex shader can accept vertex "input", so we need to pass the data from the vertex shader to the fragment shader; here are the code changes:

vertex shader:
#version 450
layout(location = 0) in vec2 vertex_points;
layout(location = 1) uniform mat4 projection = mat4(1);
layout(location = 2) in vec2 uv;
out vec2 frag_uv;

void main() {
	frag_uv = uv;
	gl_Position = projection*vec4(vertex_points.x, vertex_points.y, 0, 1);
}

in order to pass data to the next shader stage, we must have a variable with the same name in both shaders, designated as "out" in one and "in" in the other. In our code "out" is on the vertex shader side, as it's passing the data out, and "in" is on the fragment shader side; here we have designated frag_uv as the output, and our "uv" input data comes in through in vec2 uv, which is at location "2".

fragment shader:
out vec4 fragment_color;
in vec2 frag_uv;
uniform sampler2D sam;

void main() {
    fragment_color = vec4(1,1,1, texture(sam, frag_uv).r);
}


as mentioned we have the frag_uv as input here.
In OpenGL, to access texture data you upload your textures to something called "texture units". The first and default texture unit is at position "0"; in the client code you can upload to different texture units with the glActiveTexture function, whose GL_TEXTUREX parameter takes X in the range 0 to some number N that is the maximum number of texture units (this varies per GPU). However, we are only dealing with one texture on the default unit, so we don't need anything else. Once you have uploaded the texture you access it through something called a sampler; this is a special type of variable and it must be a uniform. "sampler2D" is used to access 2D textures (there are also 1D and 3D textures). We pass the "uv" coordinates and the "sampler" to the texture function and it returns the data at that location of the texture. Most textures are rgba, but ours is greyscale and we specified that our data goes into the red channel of the texture, so only the red channel .r will have our texture data.
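we don't need any of this for a single texture, but for reference, explicitly selecting a texture unit and pointing the sampler at it would look roughly like the snippet below; sam_location is a hypothetical value you would get from glGetUniformLocation(program, "sam"), and glActiveTexture would have to be loaded through wglGetProcAddress just like our other functions:

glActiveTexture(GL_TEXTURE0);          //select texture unit 0 (the default one)
glBindTexture(GL_TEXTURE_2D, tex_id);  //bind our texture to that unit
glUniform1i(sam_location, 0);          //tell the "sam" sampler to read from unit 0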

now let's put this into code: we will create a function to draw our texture, and since we specify the "quad" as a rectangle we make a struct for it.

typedef struct rect {
  i32 x,y, width, height;
} rect;


here is our draw_texture function:

void draw_texture(rect r, u8 *texture_data, i32 width, i32 height) {
    point2 quad[12] = {
        //triangle 1
        {r.x, r.y},
        {r.x, r.y + r.height},
        {r.x + r.width, r.y + r.height},
    
        //triangle 2
        {r.x, r.y},
        {r.x + r.width, r.y + r.height},
        {r.x + r.width, r.y},
    
        // UV coordinates for triangle 1
        {0, 0},
        {0, 1},
        {1, 1},
        // UV coordinates for triangle 2
        {0, 0},
        {1, 1},
        {1, 0}
    };
  
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(point2)*12, quad, GL_STATIC_DRAW);
  
    glEnableVertexAttribArray(0);
    glEnableVertexAttribArray(2);
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
    glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, 0, (void *)(sizeof(point2)*6));
    glUniformMatrix4fv(1, 1, GL_TRUE, &ortho_matrix[0][0]);
  
  	glBindTexture(GL_TEXTURE_2D, tex_id);
	glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

	glTexImage2D(GL_TEXTURE_2D, 0, GL_RED, bitmap_width, bitmap_height, 0, GL_RED, GL_UNSIGNED_BYTE, texture_data);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 4);
	
    glDrawArrays(GL_TRIANGLES, 0, 6);
}

so what is happening here? Well, everything is what we have discussed: as you see we enable both vertex attributes 0 and 2; attribute 2 has the same format but starts sizeof(point2)*6 bytes into the buffer, because that is where our "uv" coordinates are. The new parts are these:
glBindTexture(GL_TEXTURE_2D, tex_id);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RED, bitmap_width, bitmap_height, 0, GL_RED, GL_UNSIGNED_BYTE, texture_data);
glPixelStorei(GL_UNPACK_ALIGNMENT, 4);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);


glBindTexture works just like the other binding functions; tex_id is a global variable I made to hold the texture handle. To properly generate the texture you need to call glGenTextures(1, &tex_id), which I have put in the setup_opengl function.

glPixelStorei(GL_UNPACK_ALIGNMENT, 1) tells OpenGL that the rows of the image we are uploading are tightly packed (1-byte aligned) instead of the default 4-byte alignment, which matters because our greyscale texture only has 1 byte per pixel.

glTexImage2D was explained before; it uploads the actual data.

then we restore the unpack alignment to its default of 4; not strictly necessary, but restoring state to its default is a good habit to have.

the next 4 calls set texture parameters: the first two, MIN_FILTER and MAG_FILTER, set the minifying and magnifying filters used when the texture is scaled down or up respectively.

the next two calls, GL_TEXTURE_WRAP_S and _T, tell OpenGL how to handle a "uv" coordinate that goes outside the 0 to 1 range; GL_CLAMP_TO_EDGE means a coordinate > 1 is clamped to 1 and a coordinate < 0 is clamped to 0.

that is all for textures, now let's draw ours. Here is how I have done it, putting this snippet right after the rasterize_glyph call:
rect r = {
    250, 250, 64, 64
};

draw_texture(r, temp_texture, bitmap_width, bitmap_height);


here is what it looks like so far:


your immediate reaction is "that is a square, where is our texture!". Well, if you check our fragment shader, we used the texture's data as the "alpha" of our color, but we haven't enabled "blending" in our code, so the alpha is ignored. We enable blending with glEnable(GL_BLEND) and we also have to provide a blending function, glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA). Due to the nature of "blending" and how complicated it can get, it's best if you read about it on its own; the gist of it is this:
when we draw multiple things, the blending function determines how overlapping colors are combined; with the proper settings this is how transparency is achieved in OpenGL.
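
concretely, these are the two lines (they also appear in the final code at the end of the post); with this blend function the color that ends up on screen is src.rgb*src.a + dst.rgb*(1 - src.a), i.e. standard alpha blending:

glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);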

we put these two lines in the setup_opengl function. Now if we run our code this is what we get:



that is basically all the code needed to rasterize a glyph. Let me just touch on AA a bit before ending the post.


2.4.4 Anti-Aliasing, a simple approach.

If you look at our drawn texture you see that the edges of the glyph are jagged; anti-aliasing smooths these edges by computing how much of the pixel the glyph's outline actually covers. Here is a picture to explain this:



As you can see, the glyph outline doesn't cover the entire pixel on the left, it only covers part of it. For example, if the intersection is at x = 0.75 the glyph only covers 0.25 of that pixel, so we should set the color value of pixel 0 to 0.25*255 and not the full 255. This coverage question only really matters at the edges, not in the pixels that are fully inside the glyph. There are better and more accurate ways to do this, but for our purpose this is good enough; here is the code for that:

f32 start_intersection = intersections[m];
i32 start_index = intersections[m];
f32 start_covered = (start_index + 1) - start_intersection ;

f32 end_intersection = intersections[m+1];
i32 end_index = intersections[m+1];
f32 end_covered = end_intersection - (end_index);

if(start_index == end_index) {
	bitmap[start_index + i*bitmap_width] = 255*start_covered;
} else {
	bitmap[start_index + i*bitmap_width] = 255*start_covered;
	bitmap[end_index + i*bitmap_width] = 255*end_covered;
}

for(i32 j = start_index+1; j < end_index; j++) {
    bitmap[j + i*bitmap_width] = 255;
}


first notice the loop: we start from "start_index+1" and go up to, but not including, "end_index", basically skipping the first and last pixels because we fill those ourselves.

start_covered = (start_index + 1) - start_intersection;

this takes the index of the next pixel and subtracts the intersection from it: start_index is the whole part of the intersection, so adding 1 to it and then subtracting the intersection gives us the covered amount, which is equivalent to 1.0 minus the fractional part of the intersection.
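for example, an intersection at x = 3.25 gives start_index = 3 and start_covered = (3 + 1) - 3.25 = 0.75, i.e. three quarters of pixel 3 are inside the shape, so that pixel gets 0.75*255 ≈ 191.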

we do something similar for end_covered; this time it's easier, as we can directly subtract end_index from end_intersection to get the covered amount. Here is a picture to visualize it; hopefully it shows why we have start_index + 1 but not end_index + 1:



after we have the covered parts we will just multiply this by 255.

here is the result:


it's a bit smoother than the straight up jaggedness of the non-AA version. We will make this even better with subpixel AA, but first go ahead and change the letter to Z or 0; here is what they look like:



if you look closely you will immediately see that the horizontal line is botched; it feels like some pixels are misaligned and looks completely wrong. If you have followed all the code so far you will have the same problem.

the reason for this is simple but a bit hard to track down (it was for me at least): the OpenGL "viewport" and the "window" size don't match up. Think of the viewport as the 2D area that OpenGL thinks is the size of the window. We set our window size using AdjustWindowRect, and the size of our window is in the rect_size variable; to get the OpenGL viewport size we use glGetIntegerv(GL_VIEWPORT, viewport_data). Here is the code:

i32 viewport[4] = {0};
glGetIntegerv(GL_VIEWPORT, viewport);
printf("rect: %d %d %d %d\n", rect_size.left, rect_size.top, rect_size.right, rect_size.bottom);
printf("vp: %d %d %d %d\n", viewport[0], viewport[1], viewport[2], viewport[3]);


here is the result on my end:

as you can see there is a difference between them, so we need to adjust the viewport accordingly; we just call glViewport with the values inside rect_size:

glViewport(rect_size.left, rect_size.top, rect_size.right, rect_size.bottom);

now this is what our zero looks like:



you can fix this another way too, which is to change the ortho projection matrix: we set it up to map things in the 0 to 500 range, and you can change it so it matches the glViewport values instead; that will also fix the problem.

I decided to keep this error in instead of going back and adding glViewport to the initial code mainly to emphasise that the smallest errors can show up in weird ways and this error would be something you will never forget.

We have one last thing to do

2.4.5 Subpixel AA

Subpixel AA just means that your anti-aliasing does its calculations at a finer granularity than the pixel, i.e. you divide the pixel into smaller units and do your calculations on those. Here we will do vertical-only subpixel AA: we divide each scanline into smaller steps, so instead of the scanline advancing by whole numbers it advances in fractions. This helps our AA because we take more samples when calculating the coverage of each pixel; in our normal AA each scanline contributes only one coverage value per pixel, but with subpixel AA we can divide each scanline (and by extension each pixel, vertically) into any number of sub-scanlines.
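to make that concrete with the numbers used below: with scanline_subdiv = 5, row i is sampled at heights i, i + 0.2, i + 0.4, i + 0.6 and i + 0.8, and each sub-scanline can contribute at most 255/5 = 51 to a pixel, so a pixel crossed by all five sub-scanlines still adds up to the full 255.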

What changes:
  • we will introduce another loop to subdivide each scanline.
  • when we fill our pixels, multiple of our subdivided scanlines will land on the same pixel, so we give each of them a weight equal to 255/scanline_subdiv; this makes sure that if a pixel is fully covered by all of its sub-scanlines the values add up to 255


here is what changes:

void rasterize_glyph(line *edges, i32 edge_count, u8 *bitmap, i32 bitmap_height, i32 bitmap_width) {
..
	i32 scanline_subdiv = 5;
	f32 alpha_weight = 255.0/scanline_subdiv;
	f32 step_per_scanline = 1.0/scanline_subdiv;
	for(i32 i = 0; i < bitmap_height; i++) {
		for(i32 x = 0; x < scanline_subdiv; x++) {
			intersection_count = 0;
			f32 scanline = i + x*step_per_scanline;
...
			if(intersection_count > 1) {
				for(i32 m = 0; m < intersection_count; m += 2) {
..
					if(start_index == end_index) {
						bitmap[start_index + i*bitmap_width] += alpha_weight*start_covered;
					} else {
						bitmap[start_index + i*bitmap_width] += alpha_weight*start_covered;
						bitmap[end_index + i*bitmap_width] += alpha_weight*end_covered;
					}

					for(i32 j = start_index+1; j < end_index; j++) {
						bitmap[j + i*bitmap_width] += alpha_weight;	
					}
				}
			}
		}
	}
}


go ahead and run the code, you will get this hilarious effect:



this happens because we keep adding to the temp_texture array every frame, so it overflows and wraps from 255 back to 0 over time; the fix is to clear temp_texture to 0 at the start of each frame.

while(!quit) {
	glClear(GL_COLOR_BUFFER_BIT);
	memset(temp_texture, 0, bitmap_width*bitmap_height);
...
	SwapBuffers(global_dc);
}

now the final result:



And that is about it really, that is all she wrote. Thank you for coming to my Ted talk.

Part 3 The End

Thank you for reading this far; this post took quite some time to finish, I delayed it a lot due to work and life.

Finally I really want to thank Sean Barrett; the articles he wrote about truetype fonts were fun to read and insightful, and the code was great to learn from too.

Here are some notes:
  • the scanline algorithm I used here is truly bad, please read up on how an actual scanline algorithm is supposed to work; what I wrote here is a simplified version that will help you learn the real algorithm more easily. This algorithm will hilariously break if any glyph you run through it has self intersecting contours.
  • read the original post by Sean Barrett and read the stb_truetype.h source code, it's quite a treat; if you keep at it you will learn a few optimization tricks, the code is truly ingenious in some parts, especially the way it avoids floating point numbers in certain places. At least that is what I think the intention is.
  • personally this whole post was quite great for me, especially learning everything along the way; now that you know how to manipulate all of this you might try your hand at making Signed Distance Fields and rendering on the GPU, it's fun.
  • I have parts 4, 5 and possibly 6 in mind, those being "Type Hinting and the font engine", "GPOS and Subbing, Arabic and other non-latin fonts", and finally "GPU rendering and fun". I might or might not write all or parts of these; I already have some parts of "Type hinting" working, GPOS and Subbing are easier for me because I know Arabic letters and how they are supposed to look, and I have a rough idea of how SDFs and GPU rendering work. If I fully learn how Slug works, I might write a post about implementing it.



here is the final code:

#include <Windows.h>
#pragma comment(lib, "user32.lib")
#pragma comment(lib, "gdi32.lib")
#pragma comment(lib, "opengl32.lib")

#include <GL\gl.h>
#include <GL\glcorearb.h>
#include "wglext.h"

#include "read_ttf.c"

#define MIN(x, y) ((x) > (y) ? (y) : (x))
#define MAX(x, y) ((x) < (y) ? (y) : (x))


//related to Uniforms
PFNGLUNIFORM4FVPROC glUniform4fv;
PFNGLUNIFORM1IPROC glUniform1i;
PFNGLUNIFORMMATRIX4FVPROC glUniformMatrix4fv;

//related to program creation
PFNGLCREATEPROGRAMPROC glCreateProgram;
PFNGLLINKPROGRAMPROC glLinkProgram;
PFNGLUSEPROGRAMPROC glUseProgram;
PFNGLGETPROGRAMIVPROC glGetProgramiv;
PFNGLGETPROGRAMINFOLOGPROC glGetProgramInfoLog;

//related to shaders
PFNGLCREATESHADERPROC glCreateShader;
PFNGLSHADERSOURCEPROC glShaderSource;
PFNGLCOMPILESHADERPROC glCompileShader;
PFNGLATTACHSHADERPROC glAttachShader;
PFNGLGETSHADERIVPROC glGetShaderiv;
PFNGLGETSHADERINFOLOGPROC glGetShaderInfoLog;

PFNGLVERTEXATTRIBPOINTERPROC glVertexAttribPointer;
PFNGLENABLEVERTEXATTRIBARRAYPROC glEnableVertexAttribArray;

//related to VAO
PFNGLGENVERTEXARRAYSPROC glGenVertexArrays;
PFNGLBINDVERTEXARRAYPROC glBindVertexArray;

//related to buffers
PFNGLISBUFFERPROC glIsBuffer;
PFNGLGENBUFFERSPROC glGenBuffers;
PFNGLBUFFERDATAPROC glBufferData;
PFNGLBINDBUFFERPROC glBindBuffer;
PFNGLDELETEBUFFERSPROC glDeleteBuffers;



// PFNGLGET



void load_funcs() {

	glUniform4fv = (PFNGLUNIFORM4FVPROC) wglGetProcAddress("glUniform4fv");
	glUniform1i = (PFNGLUNIFORM1IPROC) wglGetProcAddress("glUniform1i");
	glUniformMatrix4fv = (PFNGLUNIFORMMATRIX4FVPROC) wglGetProcAddress("glUniformMatrix4fv");


	glCreateProgram = (PFNGLCREATEPROGRAMPROC) wglGetProcAddress("glCreateProgram");
	glLinkProgram = (PFNGLLINKPROGRAMPROC) wglGetProcAddress("glLinkProgram");
	glUseProgram = (PFNGLUSEPROGRAMPROC) wglGetProcAddress("glUseProgram");
	glGetProgramInfoLog = (PFNGLGETPROGRAMINFOLOGPROC) wglGetProcAddress("glGetProgramInfoLog");
	glGetProgramiv = (PFNGLGETPROGRAMIVPROC) wglGetProcAddress("glGetProgramiv");
	printf("%p\n", glGetProgramiv);

	glCreateShader = (PFNGLCREATESHADERPROC) wglGetProcAddress("glCreateShader");
	glShaderSource = (PFNGLSHADERSOURCEPROC) wglGetProcAddress("glShaderSource");
	glCompileShader = (PFNGLCOMPILESHADERPROC) wglGetProcAddress("glCompileShader");
	glAttachShader = (PFNGLATTACHSHADERPROC) wglGetProcAddress("glAttachShader");
	glGetShaderInfoLog = (PFNGLGETSHADERINFOLOGPROC) wglGetProcAddress("glGetShaderInfoLog");
	glGetShaderiv = (PFNGLGETSHADERIVPROC) wglGetProcAddress("glGetShaderiv");


	glVertexAttribPointer = (PFNGLVERTEXATTRIBPOINTERPROC) wglGetProcAddress("glVertexAttribPointer");
	glEnableVertexAttribArray = (PFNGLENABLEVERTEXATTRIBARRAYPROC) wglGetProcAddress("glEnableVertexAttribArray");

	glGenBuffers = (PFNGLGENBUFFERSPROC) wglGetProcAddress("glGenBuffers");
	glBindBuffer = (PFNGLBINDBUFFERPROC) wglGetProcAddress("glBindBuffer");
	glBufferData = (PFNGLBUFFERDATAPROC) wglGetProcAddress("glBufferData");
	glDeleteBuffers = (PFNGLDELETEBUFFERSPROC) wglGetProcAddress("glDeleteBuffers");
	glIsBuffer = (PFNGLISBUFFERPROC) wglGetProcAddress("glIsBuffer");

	glGenVertexArrays = (PFNGLGENVERTEXARRAYSPROC) wglGetProcAddress("glGenVertexArrays");
	glBindVertexArray = (PFNGLBINDVERTEXARRAYPROC) wglGetProcAddress("glBindVertexArray");

	glBindVertexArray = (PFNGLBINDVERTEXARRAYPROC) wglGetProcAddress("glBindVertexArray");
}



// these are just the normal datatypes redefined to make typing easier
typedef unsigned char u8;
typedef char i8;
typedef unsigned short u16;
typedef short i16;
typedef unsigned int u32;
typedef int i32;
typedef float f32;
typedef double f64;

float ortho_matrix[16] = {0};


typedef struct point2 {
	float x,y;
} point2;

typedef struct color {
	float r,g,b,a;	
} color;

typedef struct line {
	point2 p1, p2;
} line;

typedef struct rect {
	i32 x,y,width,height;
} rect;


#define WIDTH 500 // set to desired size
#define HEIGHT 500 // set to desired size

//globals
i32 quit = 0;
HWND global_window; //this is the handle to our window
HDC global_dc; //handle to the DC for our window
font_directory ft = {0};
u8 temp_texture[64*64] = {0};
i32 bitmap_height = 64;
i32 bitmap_width = 64;

i32 vbo = 0;
i32 vao = 0;
i32 tex_id = 0;

LRESULT CALLBACK rasterize_proc( HWND hwnd, UINT umsg, WPARAM wparam, LPARAM lparam) {

	LRESULT result = 0;

	if(umsg == WM_KEYUP && wparam == VK_ESCAPE) {
		quit = 1;
	} else {
		result = DefWindowProcA(hwnd, umsg, wparam, lparam);
	}

	return result;
}

void ortho_projection(float* m, float left, float right, float bottom, float top, float pnear, float pfar) {
	float width = (right-left);
	float height = (top - bottom);

	m[0 + 4*0] = 2.0/(width);
	m[1 + 4*1] = 2.0/(height);
	m[2 + 4*2] = -2.0/(pfar - pnear);
	m[3 + 4*3] = 1;

	m[3 + 4*0] = - (right + left) / (right - left);
	m[3 + 4*1] = - (top + bottom) / (top - bottom);
	m[3 + 4*2] = - (pfar + pnear) / (pfar - pnear);

}


u32 create_shader_compile(char *shader_filename, GLenum shader_type) {
  char *shader_source = read_file(shader_filename, NULL);
  u32 shader = glCreateShader(shader_type);
  glShaderSource(shader, 1, &shader_source, NULL);
  glCompileShader(shader);
  
  u32 shader_iv_result = 0;
  glGetShaderiv(shader, GL_COMPILE_STATUS, &shader_iv_result);
  if(shader_iv_result != GL_TRUE) {
    glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &shader_iv_result);
    char *log = (char*) malloc(shader_iv_result);
    glGetShaderInfoLog(shader, shader_iv_result, &shader_iv_result, log);
    printf("%s\n", log);
    free(log);
	free(shader_source);
    return 0;
  }
  
  free(shader_source);

  return shader;  
}


u32 create_program_link_shader(u32 *shaders_to_link, i32 size) {
  u32 program = glCreateProgram();
  for(int i = 0; i < size; i++) {
    glAttachShader(program, shaders_to_link[i]);
  }
  
  glLinkProgram(program);
  
  i32 program_iv_result = 0;
  glGetProgramiv(program, GL_LINK_STATUS, &program_iv_result);
  
  if(program_iv_result != GL_TRUE) {
    glGetProgramiv(program, GL_INFO_LOG_LENGTH, &program_iv_result);
    char *log = (char*) malloc(program_iv_result);
    glGetProgramInfoLog(program, program_iv_result, &program_iv_result, log);
	printf("%s\n", log);
	free(log);
	return 0;
  }
  
  return program;
}

void init_window(i32 width, i32 height) {
	WNDCLASSEXA wnd_class = {
		sizeof(WNDCLASSEXA),
		CS_OWNDC|CS_HREDRAW|CS_VREDRAW,
		rasterize_proc,
		0, 0,
		NULL, NULL,
		LoadCursor(NULL, IDC_ARROW), // this sets how our cursor will look, we can choose one of the predefined values
		(HBRUSH)COLOR_BACKGROUND,
		NULL,
		"rasterizer",
		NULL};

	RegisterClassEx(&wnd_class);

	RECT rect_size = {0, 0, width, height};

	//this function adjusts the Rect passed to make sure we can get a client rect (drawable area) that is 500,500
	AdjustWindowRect(&rect_size, CS_OWNDC|CS_HREDRAW|CS_VREDRAW,  0);

	global_window =  CreateWindowEx(
			0, //No extended styles are needed
			"rasterizer", //the name of our registered class
			"rasterizer", //the name of our window title, riviting stuff.
			WS_VISIBLE, // because we want our window to be visible when its first created
			0, 0,// put our window at the top left corner of the monitor
			rect_size.right, rect_size.bottom, 
			NULL, NULL, NULL, NULL);


	printf("we\n");
	fflush(stdout);
	// getting the device context for our window
	global_dc  = GetDC(global_window);

	// fill out our struct
	PIXELFORMATDESCRIPTOR x = {0};
	x.nSize = sizeof(PIXELFORMATDESCRIPTOR);
	x.nVersion = 1;
	x.dwFlags = PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
	x.iPixelType = PFD_TYPE_RGBA;
	x.cColorBits = 24;

	int pixel_format = ChoosePixelFormat(global_dc, &x); // this will give us the closest supported pixel format to the one we chose
	SetPixelFormat(global_dc, pixel_format, &x); // set the pixel format

	HGLRC dummy_context = wglCreateContext(global_dc);
	wglMakeCurrent(global_dc, dummy_context);

	PFNWGLCREATECONTEXTATTRIBSARBPROC wglCreateContextAttribs = (PFNWGLCREATECONTEXTATTRIBSARBPROC) wglGetProcAddress("wglCreateContextAttribsARB");

	wglMakeCurrent(NULL, NULL);
	wglDeleteContext(dummy_context);

	GLint attr[] = {
		WGL_CONTEXT_MAJOR_VERSION_ARB, 4,
		WGL_CONTEXT_MINOR_VERSION_ARB, 5,
		WGL_CONTEXT_PROFILE_MASK_ARB,
		WGL_CONTEXT_CORE_PROFILE_BIT_ARB,
		0
	};

	HGLRC gl_context = wglCreateContextAttribs(global_dc, 0, attr);
	wglMakeCurrent(global_dc, gl_context);

	load_funcs();

	i32 viewport[4] = {0};
	glGetIntegerv(GL_VIEWPORT, viewport);
	printf("vp: %d %d %d %d\n", viewport[0], viewport[1], viewport[2], viewport[3]);
	printf("rect: %d %d %d %d\n", rect_size.left, rect_size.top, rect_size.right, rect_size.bottom);
	glViewport(rect_size.left, rect_size.top, rect_size.right, rect_size.bottom);
}

void setup_opengl() {

	glGenVertexArrays(1, &vao);
	glBindVertexArray(vao);


	u32 vs = create_shader_compile("vertex_shader.glsl", GL_VERTEX_SHADER);
	u32 fs = create_shader_compile("fragment_shader.glsl", GL_FRAGMENT_SHADER);

	u32 shaders[] = {vs,fs};
	u32 program = create_program_link_shader(shaders, 2);
	glUseProgram(program);
	
	glGenBuffers(1, &vbo);
	ortho_projection(ortho_matrix, 0, 500, 0, 500, -1, 1);

	glGenTextures(1, &tex_id);

	glEnable(GL_BLEND);
	glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);


}

void draw_triangle() {

	point2 triangle[] = {
		{0, 0},
		{100, 100},
		{200, 0}
	};

	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, 3*sizeof(point2), triangle, GL_STATIC_DRAW);
	glEnableVertexAttribArray(0);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
	glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);

	glDrawArrays(GL_TRIANGLES, 0, 3);
}


void draw_points(point2 *points, i32 size) {

	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, size*sizeof(point2), points, GL_STATIC_DRAW);
	glEnableVertexAttribArray(0);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
	glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);
	glDrawArrays(GL_POINTS, 0, size);

}

void draw_lines(line *edges, i32 edge_count) {

	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, edge_count*sizeof(line), edges, GL_STATIC_DRAW);
	glEnableVertexAttribArray(0);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
	glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);
	glDrawArrays(GL_LINES, 0, edge_count*2);

}




void tessellate_bezier(point2 *output, i32 *output_size, point2 p0, point2 p1, point2 p2) {
    i32 subdiv_into = 2;
    f32 step_per_iter = 1.0/subdiv_into;
    i32 out_size = 0;
    for(int i = 1; i <= subdiv_into; i++) {
      f32 t = i*step_per_iter;
      f32 t1 = (1.0 - t);
      f32 t2 = t*t;
      f32 x = t1*t1*p0.x + 2*t1*t*p1.x + t2*p2.x;
      f32 y = t1*t1*p0.y + 2*t1*t*p1.y + t2*p2.y;
      output[out_size].x = x;
      output[out_size].y = y;
      out_size++;
    }
    
    *output_size = out_size;
 }



void generate_points(glyph_outline *outline, point2 *generated_points, i32 capacity, i32 *point_count, i32* gen_pts_end_indices) {
	i32 j = 0;
	i32 index = 0;
	for(i32 i = 0; i < outline->numberOfContours; i++) {
		i32 contour_start_index = j;
		i32 generated_points_start_index = index;
		i32 contour_start = 1;
		i32 contour_started_off = 0;
		for(;j <= outline->endPtsOfContours[i]; j++) {
			glyph_flag *flag = outline->flags + j;

			i32 x = outline->xCoordinates[j];
			i32 y = outline->yCoordinates[j];

			i32 contour_len = outline->endPtsOfContours[i] - contour_start_index + 1;
			i32 cur_index = j;
			i32 next_index = (j+1 - contour_start_index)%contour_len + contour_start_index;

			if(flag->on_curve) {
				//point on curve so add directly 
				generated_points[index].x = x;
				generated_points[index].y = y;
				index++;
			} else {
				// handle off curve

				if(contour_start) {
					contour_started_off = 1;
					if(outline->flags[next_index].on_curve) {
						generated_points[index].x = outline->xCoordinates[next_index];
						generated_points[index].y = outline->yCoordinates[next_index];
						index++;
						j++;
						continue;
					}

					x = x + (outline->xCoordinates[next_index] - x)/2.0;
					y = y + (outline->yCoordinates[next_index] - y)/2.0;

					generated_points[index].x = x;
					generated_points[index].y = y;
					index++;
				}

				point2 p0 = generated_points[index - 1]; // the latest generated point
				point2 p1 = {.x = x, .y = y};
				point2 p2 = {.x = outline->xCoordinates[next_index], .y = outline->yCoordinates[next_index] }; //we will need this point either way

				if(!(outline->flags + next_index)->on_curve) { //check next point
					p2.x = p1.x + (p2.x - p1.x)/2.0;
					p2.y = p1.y + (p2.y - p1.y)/2.0;
				} else {
					j++;
				}

				i32 out_size = 0;
				tessellate_bezier(generated_points + index, &out_size, p0, p1, p2);
				index += out_size;
			}

			contour_start = 0;
		}

		if(outline->flags[j-1].on_curve) {
			generated_points[index] = generated_points[generated_points_start_index];
			index++;
		}

		if(contour_started_off) {
			//handle first point if its off
			point2 p0 = generated_points[index-1]; //our latest added point, this will before the first point
			point2 p1 = {.x = outline->xCoordinates[contour_start_index], .y = outline->yCoordinates[contour_start_index]};
			point2 p2 = generated_points[generated_points_start_index];

			i32 out_size = 0;
			tessellate_bezier(generated_points + index, &out_size, p0, p1, p2);
			index += out_size;
		}


		gen_pts_end_indices[i] = index;

	}

	*point_count = index;
}

line* generate_edges(point2 *pts_gen, i32 *edge_count, i32 *contour_ends, i32 contour_count) {
    i32 j = 0;
    i32 edge_index = 0;
    line *edges = (line*) malloc(sizeof(line)*contour_ends[contour_count-1]);
    for(i32 i = 0; i < contour_count; i++) {
      for(; j < contour_ends[i]-1; j++) {
        line *edge = edges + edge_index;
        edge->p1.x = pts_gen[j].x;
        edge->p1.y = pts_gen[j].y;
        
        edge->p2.x = pts_gen[j+1].x;
        edge->p2.y = pts_gen[j+1].y;
        edge_index++;
      }
      j++; //we are at the end of the contour, skip over this point to the start of the next contour
    }
    
    *edge_count = edge_index;

	return edges;
}


i32 cmp_floats(const void *a, const void *b) {
	f32 f_a = *(f32 *)a;
	f32 f_b = *(f32 *)b;

	if(f_a < f_b) return -1;
	else if (f_a > f_b) return 1;

	return 0;
}


void rasterize_glyph(line *edges, i32 edge_count, u8 *bitmap, i32 bitmap_height, i32 bitmap_width) {

	f32 intersections[32] = {0};
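	// fixed-size scratch buffer: this assumes no single scanline crosses more than 32 edges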
	i32 intersection_count = 0;
	i32 scanline_subdiv = 5;
	f32 alpha_weight = 255.0/scanline_subdiv;
	f32 step_per_scanline = 1.0/scanline_subdiv;

	for(i32 i = 0; i < bitmap_height; i++) {

		for(i32 x = 0; x < scanline_subdiv; x++) {
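			// sample several sub-scanlines inside each pixel row and accumulate partial coverage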
			intersection_count = 0;

			f32 scanline = i + x*step_per_scanline;

			for(i32 j = 0; j < edge_count; j++) {
				line *edge = edges + j;

				f32 bigger_y = MAX(edge->p1.y, edge->p2.y);
				f32 smaller_y = MIN(edge->p1.y, edge->p2.y);

				if(scanline <= smaller_y) continue;
				if(scanline > bigger_y) continue;

				f32 dx = edge->p2.x - edge->p1.x;
				f32 dy = edge->p2.y - edge->p1.y;

				if(dy == 0) continue;


				f32 intersection = -1;
				if(dx == 0) {
					intersection = edge->p1.x;
				} else {
					intersection = (scanline - edge->p1.y)*(dx/dy) + edge->p1.x;
					if(intersection < 0) intersection = edge->p1.x;
				}

				intersections[intersection_count] = intersection;
				intersection_count++;
			}


			qsort(intersections, intersection_count, sizeof(f32), cmp_floats);


			if(intersection_count > 1) {
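				// fill between successive pairs of sorted intersections, weighting the
				// first and last pixel of each span by how much of them is actually covered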
				for(i32 m = 0; m < intersection_count; m += 2) {
					f32 start_intersection = intersections[m];
					i32 start_index = intersections[m];
					f32 start_covered = (start_index + 1) - start_intersection ;

					f32 end_intersection = intersections[m+1];
					i32 end_index = intersections[m+1];
					f32 end_covered = end_intersection - (end_index);

					if(start_index == end_index) {
						bitmap[start_index + i*bitmap_width] += alpha_weight*start_covered;
					} else {
						bitmap[start_index + i*bitmap_width] += alpha_weight*start_covered;
						bitmap[end_index + i*bitmap_width] += alpha_weight*end_covered;
					}

					for(i32 j = start_index+1; j < end_index; j++) {
						bitmap[j + i*bitmap_width] += alpha_weight;	
					}
				}
			}
		}
	}
}


void draw_texture(rect r, u8 *texture_data, i32 width, i32 height) {
	point2 quad[12] = {
		//triangle 1
		{r.x, r.y},
		{r.x, r.y + r.height},
		{r.x + r.width, r.y + r.height},

		//triangle 2
		{r.x, r.y},
		{r.x + r.width, r.y + r.height},
		{r.x + r.width, r.y},

		// UV coordinates for triangle 1
		{0, 0},
		{0, 1},
		{1, 1},
		// UV coordinates for triangle 2
		{0, 0},
		{1, 1},
		{1, 0}
	};

	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, sizeof(point2)*12, quad, GL_STATIC_DRAW);

	glEnableVertexAttribArray(0);
	glEnableVertexAttribArray(2);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
	glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, 0, (void *)( sizeof(point2)*6 ));
	glUniformMatrix4fv(1, 1, GL_TRUE, ortho_matrix);

	glBindTexture(GL_TEXTURE_2D, tex_id);

	// temp_texture is one byte per pixel, so its rows are not necessarily 4-byte
	// aligned; drop the unpack alignment to 1 for the upload and restore it afterwards
	glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

	glTexImage2D(GL_TEXTURE_2D, 0, GL_RED, width, height, 0, GL_RED, GL_UNSIGNED_BYTE, texture_data);

	glPixelStorei(GL_UNPACK_ALIGNMENT, 4);


	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

	
	glDrawArrays(GL_TRIANGLES, 0, 6);
}

void render_glyph() {
	glyph_outline glyph = get_glyph_outline(&ft, get_glyph_index(&ft, '0'));
	point2 temp_points[512] = { 0};

	f32 scale = (f32)bitmap_height/(f32)(glyph.yMax - glyph.yMin);

	i32 index = 0;
	i32 *contour_end_pts = (i32 *) malloc(sizeof(i32)*glyph.numberOfContours);
	generate_points(&glyph, temp_points, 512, &index, contour_end_pts);

	for(int i = 0; i < index; i++) {
		temp_points[i].x = ( temp_points[i].x  - glyph.xMin)*scale;
		temp_points[i].y = ( temp_points[i].y  - glyph.yMin)*scale;
	}


	/* draw_points(temp_points, index); */


	i32 edge_count = 0;
	line *edges = generate_edges(temp_points, &edge_count, contour_end_pts, glyph.numberOfContours);
	/* draw_lines(edges, edge_count); */

	rasterize_glyph(edges, edge_count, temp_texture, bitmap_height, bitmap_width);

	rect r = {
		250, 250, 64, 64
	};

	draw_texture(r, temp_texture, bitmap_width, bitmap_height);

	// render_glyph runs every frame, so free the per-frame allocations to avoid leaking
	free(edges);
	free(contour_end_pts);
}


int main(int argc, char **argv) {
	init_window(WIDTH, HEIGHT);
	setup_opengl();

	int file_size = 0;
	char* file = read_file("envy.ttf", &file_size);
	char* mem_ptr = file;

	read_font_directory(file, &mem_ptr, &ft);

	MSG _msg = {0}; 
	while(!quit) {
		glClear(GL_COLOR_BUFFER_BIT);
		memset(temp_texture, 0, bitmap_width*bitmap_height);

		while(PeekMessage(&_msg, global_window, 0, 0, PM_REMOVE)) { 
			TranslateMessage(&_msg); 
			DispatchMessage(&_msg); 
		} 

		render_glyph();

		SwapBuffers(global_dc);
	}
}
Alexey
16 posts
Reading TTF files and rasterizing them using a handmade approach, Part 2: Rasterization
The long-awaited sequel!

It's quite funny to read your posts actually, because me and you seem to study the same things at the same time. I've been doing glyph rasterization during the weekends the last few weeks too!

I have a few comments/questions, mostly regarding the "bogus" scanline rasterization.

1. You seem to tessellate bezier curves into line segments. Why not rasterize the curves directly? You can solve for t and get your intersections.

2. If I understood correctly, you throw away any lines strictly above or strictly below the scanline. However, some letters at some zoom levels have points directly at the scanline. If this happens, you will detect intersections with two edges: with the one ending and with the one beginning at the point. To battle this you need a consistent rule for handling exact point hits. What I ended up doing (not without help from molly rocket discord) was counting exact hits if the segment start is hit, but not counting exact hits if the segment end is hit. This can be implemented by changing the comparison to if(scanline <= smaller_y), but also make sure to check if the line is going up or down, because smaller_y might as well be the end of the segment. You are going to need this for the winding rule anyway.

3. The method of filling gaps between pairs of intersections works if you get an even amount of intersections. This, however, is not automatically true for all glyphs of all sizes. Take the letter 'X' for example: at some sizes your scanline can have exactly the same y as the upper 'corner' of the letter (the point where two "arms" meet). If you count that as a hit, you could get three hits total at that y. What I did to work this out is check whether the point is a local extremum, and do not count it as a hit if it is. A point is a local extremum, if the next and previous points are both lower or both higher than the point.

Thank you again for the write-up!

S
17 posts
This isn't twitter.
Reading TTF files and rasterizing them using a handmade approach, Part 2: Rasterization
Edited by S on Reason: Some extra notes
aolo2

It's quite funny to read your posts actually, because me and you seem to study the same things at the same time. I've been doing glyph rasterization during the weekends the last few weeks too!


I wrote the first version of this rasterizer some months ago, and that one has a proper scanline algorithm; it was only these past weeks that I was able to sit down and write the post, which took quite some time. Glyphs and fonts are pretty fun, glad someone else is working with them.



aolo2

1. You seem to tesselate bezier curves into line segments. Why not rasterize the curves directly? You can solve for t and get your intersections.


In terms of speed, I think it's a lot faster to get the intersection from a line segment than to solve for t on a quadratic (or cubic) Bézier, and it also keeps the code simpler and less dependent on the math.
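For anyone curious, here is a rough sketch of what solving for t directly could look like for a single quadratic segment. bezier_scanline_hits is just a hypothetical helper for illustration, it is not part of the code above, and it needs math.h for sqrtf:

// Rough sketch, not part of the code above: intersect one quadratic Bézier
// segment with a horizontal scanline by solving B_y(t) = scanline for t.
// Returns the number of hits (0..2) and writes their x positions to out_x.
i32 bezier_scanline_hits(point2 p0, point2 p1, point2 p2, f32 scanline, f32 *out_x) {
	// B_y(t) = (1-t)^2*y0 + 2*(1-t)*t*y1 + t^2*y2 expands to a*t^2 + b*t + c = 0
	f32 a = p0.y - 2.0f*p1.y + p2.y;
	f32 b = 2.0f*(p1.y - p0.y);
	f32 c = p0.y - scanline;

	f32 roots[2];
	i32 root_count = 0;

	if(a == 0.0f) {
		// the curve is linear in y
		if(b != 0.0f) roots[root_count++] = -c/b;
	} else {
		f32 disc = b*b - 4.0f*a*c;
		if(disc >= 0.0f) {
			f32 sq = sqrtf(disc);
			roots[root_count++] = (-b + sq)/(2.0f*a);
			roots[root_count++] = (-b - sq)/(2.0f*a);
		}
	}

	i32 hit_count = 0;
	for(i32 i = 0; i < root_count; i++) {
		f32 t = roots[i];
		if(t < 0.0f || t > 1.0f) continue; // only hits inside the segment count
		f32 t1 = 1.0f - t;
		out_x[hit_count++] = t1*t1*p0.x + 2.0f*t1*t*p1.x + t*t*p2.x;
	}
	return hit_count;
}

You would then feed the returned x values into the same sort-and-fill step as before, and you would still need a consistent rule for hits landing exactly on t = 0 or t = 1.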


aolo2

2. If I understood correctly, you throw away any lines strictly above or strictly below the scanline. However, some letters at some zoom levels have points directly at the scanline. If this happens, you will detect intersections with two edges: with the one ending and with the one beginning at the point. To battle this you need a consistent rule for handling exact point hits. What I ended up doing (not without help from molly rocket discord) was counting exact hits if the segment start is hit, but not counting exact hits if the segment end is hit. This can be implemented by changing the comparison to if(scanline <= smaller_y), but also make sure to check if the line is going up or down, because smaller_y might as well be the end of the segment. You are going to need this for the winding rule anyway.


This is definitely true, I should have added it here; it is handled in my other rewrites of this code. Good catch, I will update the section this week.
This is one of the first things you learn if you study scanline rasterization.
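As a rough illustration of making endpoint hits consistent, the test can be written as a half-open interval in y. This is only a sketch of the idea, not what rasterize_glyph above currently does, and it is a slight variation of the start/end formulation you describe:

// Sketch only: treat each edge as the half-open interval [min_y, max_y) in y,
// so a vertex shared by two edges is counted exactly once and horizontal
// edges never produce an intersection.
i32 edge_crosses_scanline(line *edge, f32 scanline) {
	f32 y_min = MIN(edge->p1.y, edge->p2.y);
	f32 y_max = MAX(edge->p1.y, edge->p2.y);
	return scanline >= y_min && scanline < y_max;
}

Keeping track of whether the edge goes up or down on top of this gives you the direction information needed for the winding rule.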


aolo2

3. The method of filling gaps between pairs of intersections works if you get an even amount of intersections. This, however, is not automatically true for all glyphs of all sizes. Take the letter 'X' for example: at some sizes your scanline can have exactly the same y as the upper 'corner' of the letter (the point where two "arms" meet). If you count that as a hit, you could get three hits total at that y. What I did to work this out is check whether the point is a local extremum, and do not count it as a hit if it is. A point is a local extremum, if the next and previous points are both lower or both higher than the point.


This is why I mentioned that this isn't a true scanline rasterizer. A correct one is described in Sean Barrett's article, which does the filling based on a proper winding rule and handles that problem as well as self-intersecting glyphs. This article is more of a jumping-off point to more advanced ways of doing this: once you have the basic idea of how rasterization and scanlines work, it's a lot easier to pick up other algorithms that build on it. A true scanline rasterizer with active edges and a proper winding rule is easy to implement once you understand this "scuffed" version.
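As a small sketch of the direction counting a winding rule needs (again not in the code above, and which direction counts as +1 is an assumption that depends on how the font's contours are oriented):

// Sketch only: +1 if the edge crosses the scanline going up, -1 going down,
// 0 if it doesn't cross or is horizontal.
i32 edge_winding(line *edge, f32 scanline) {
	if(edge->p1.y < edge->p2.y && scanline >= edge->p1.y && scanline < edge->p2.y)
		return +1; // edge goes up through the scanline
	if(edge->p1.y > edge->p2.y && scanline >= edge->p2.y && scanline < edge->p1.y)
		return -1; // edge goes down through the scanline
	return 0;
}

Summing these values across a scanline from left to right and filling wherever the running sum is non-zero is the non-zero winding fill.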

aolo2

Thank you again for the write-up!


Thanks, glad you liked it! I definitely want to write Parts 3 & 4, mostly about type hinting and Arabic letters, but I don't know when I will have the time.
clivi
11 posts
Reading TTF files and rasterizing them using a handmade approach, Part 2: Rasterization
Edited by clivi on

Hello, I was trying to implement a font rasterizer by following your guide and I found some bugs that you might want to fix:

1] If the width of the glyph is greater than its height, the TrueType-to-bitmap coordinate conversion will be invalid, and this causes a buffer overflow in the bitmap array when performing the rasterization:

f32 scale = (f32)bitmap_height/(f32)(glyph.yMax - glyph.yMin);
for(int i = 0; i < index; i++) {
    temp_points[i].x = ( temp_points[i].x  - glyph.xMin)*scale;
    temp_points[i].y = ( temp_points[i].y  - glyph.yMin)*scale;
}

2] I was running your code with the "liberation-mono" font and the letter "O", and I noticed that there is a problem generating the points (I checked, the glyph is not compound):

[attached image: the rasterized 'O' outline, with an unexpected straight edge at the top-left corner]

At the top-left corner, the letter should not have a straight line. Do you know what could cause this problem? Are there any other operations needed to generate the points correctly?

Alexey
16 posts
Reading TTF files and rasterizing them using a handmade approach, Part 2: Rasterization
Replying to clivi (#26854)

Can't comment on 1) right now, but 2) looks exactly like an off-by-one error. Had plenty of those when parsing fonts.