Hey guys, sorry for the inactivity. I have been making steady progress on Monter over the past couple of months, but couldn’t find any time to do a writeup about it. I try to make these writeups as high quality as possible, so I was reluctant to publish any article that just glosses over the details. And now I finally have the time to do another writeup on Monter. There’s a lot of stuff to cover, such as the new grass system, terrain system, and water shader. But for this post, let’s first wrap up the collision system that I promised months ago.

I’ve covered the theory of GJK in the last post, but there are a couple of implementation details that’s worth noting.

Optimizing GJK: support function

Recall that, in each iteration of GJK, we must need something called a support function. The support function’s job is to find the vertex that’s the furthest along the search direction. We could implement it with a simple linear search, but that means O(N) at each iteration of GJK. Although GJK is promised to take a low number of iteration before converging given our circumstances, we can do better than O(N).

A heuristic called Hill Climbing can be applied here. If we first store an adjacency list for each vertex, then we can start at a random vertex of the mesh, then slowly “climb” towards the extreme point by comparing with only its neighbors. This is significantly faster than a linear search, especially on big meshes.

However, Hill Climbing can get stuck at local maximas and be unable to reach the global maxima. In the context of implementing a support function, a local maxima can occur if there is a vertex with neighbors that are coplanar. However, this also means that there are unnecessary vertex in the geometry representation. Imagine if you take that problematic vertex out, would it change anything in the context of GJK? No it does not.

Therefore, it’s clear that a traditional 3D mesh representation is probably not the best candidates for GJK testing. Instead, you want a convex hull of the 3D mesh; only the points enough to enclose your geometry. If you can construct a minimal convex hull without any redundant vertex, then you don’t have to worry about the local maxima problem for our hill climbing support function, and searching will also be a lot faster.

Optimizing GJK: initial search direction

Recall that in GJK, a initial search direction must be seeded to start the entire algorithm. You could seed it with a random direction, but a good guess often means you instantly reject most collision pairings that are obviously not possible.

A good initial guess is simply a difference vector between the two objects’ centers. GJK will usually terminate after the first iteration with false if the objects are clearly separate.

Better yet, at each time GJK determines that the two objects are not colliding, you cache the last search direction. Then in the following frame, you use that as the initial guess. It would also cause GJK to terminate earlier if the two objects are indeed separate.

Optimizing the collision: error margin in bisection method

Recall in Monter’s collision system, bisection method is applied to find the legal interval of movement of the player. In order to achieve an accurate result, many iterations of the bisection method must be applied, somewhere around 35 tests. However, we could loosen up the error margin and only run a couple of iterations in the bisection method. This works surprisingly well if the test is conservatively picking the safest interval. The error margin is almost unnoticeable, but is significantly faster in comparison.

Ok, this wraps up the GJK section. Let’s move on to the terrain collision.

Terrain representation

The terrain system went through a lot of revisions, particularly because it has to interact with multiple components in the game engine. It has to interact with collision stuff, it also has to be easily modifiable by the world editor, and the grass system need to efficiently query the terrain heights to plant grass faster.

In the end, I picked a heightmap representation, storing the height of the terrain in a 2D array. A mesh can be easily constructed from this representation to render the terrain, and the heights at any point could also be efficiently queried with just a simple memory read.

Terrain collision

A desirable characstic of any terrain is continuity. When we play games, we all want a smooth terrain, so that when the characters walk on it, they won’t jitter like crazy or get stuck randomly. However, 2D heightmap is fundamentally discretized. Directly sampling it means jittered movements due to its discretized nature.

A continuous terrain translated into finitely many cells (heightmap)

However, we can try our best to reconstruct the smoothness using bilinear interpolation. By interpolating the four corners in some way, we can reconstruct a plausible smooth surface and use that to respond to terrain height queries. The following are two nice illustrations that I took from book of shader that demonstrates this technique:

Linear interpolation that reconstructs terrain’s continuity in value, but not its first derivatives (C0 continuity)

Cubic interpolation that not only reconstructs terrain’s continuity in value, but also its first derivatives (C1 continuity)

Although they are not perfect reconstructions, they are good enough to give the smoothness of the terrain.

With the above system, we can now query the height at any point within the heightmap. We can then use that to easily ensure the player is always above the ground. And that is the terrain collision system in a nutshell.

Reacting to collision: projected velocity

If you walk on a flat surface, then your can move just fine. But if you are climbing onto a steep ramp, then part of your movement contributes to lifting you up in height, and only the other part of it allows you to move towards the target direction.

This demonstrates that if your collision with other objects is not head-on, then only part of your movement will be absorbed by the collision. Essentially, when your movement is partially blocked by something, you can break it down into two parts. One part of the movement is completely negated due to collision, and the other part remain unaffected by the collision. In order to properly react to collisions, the system must be able to calculate what’s the part of the movement that remains unaffected.

Turned out, to compute this ratio, we need something called the contact normal. It’s the normal of the collision plane, which is a plane that completely separates the two colliding objects that prevents them from overlapping.

In the case with GJK, we already have a way of obtaining the contact normal. As for terrain, we could approximate it using central differences. Now, assume the height of the terrain is along Y axis, and the procedure is as follows. First, we take a close neighbor point that also lies on the terrain along the X axis, then we do the same for Z axis. We can take the differences between these two points and the center point to construct a tangent plane. The normal will just be their cross product.

Once we’ve obtained the contact normal, we can then project the movement onto the tangent plane. This way we eliminated the component of the movement that is directly opposing the contact force and leave the other component untouched.

By putting this in effect, we can simulate things like gliding and climbing a ramp, which is more realistic than just making the player stuck whenever he touches anything.
Hi all. I know I owe you a followup blog post to the collision system, but I decided it’s best to write about the recent procedural cloud render I did while it is still fresh in my head. Here are a couple of screenshots:

(you might have noticed I added grass. A blog post on that will be out soon)

Before we get started, I just want to clearly state that this is not based on any physical equations whatsoever. This result is solely obtained by experimenting with various mathematical equations with a weak physical basis; the “look” of the cloud is the only emphasis here, not correctness.


The goals I aim for when started writing the volumetric cloud rendering are the following:

. Simplicity: I want the entire cloud render to just be a single shader pass.
. Procedural: Generated geometry, for that it is easy to animate and has unlimited resolution.
. Volumetric: No skybox 2d texturing, I want a 3D “volumetric” feel out of the clouds.
. Controllable: I want to be able to control the cloud with parameters, such as coverage, wind direction, animation speed, and cloud shapes.
As you will see, I didn’t obtain all these goals, but I did achieve most of them.

Opacity-based cloud shading

Recall that in my post on the sky shading pass, I have already set up a shader that does raycasting. Since cloud rendering is a part of the grand sky shading, we can start by extending our sky shading to somehow render cloud as well.

Just as a refresher, this is what I was doing in the sky shading pass: casting rays out to the sky hemisphere, then use the rays to determine gradient color.

Note that these rays are in reverse direction of actual light rays that come into the viewer’s eyes. In other words, the paths of the light rays is more like this:

Clouds can be perceived as a type of participating media. As light passes through cloud, it gets both in-scattered and out-scattered. My approach is to trace these rays from skydome to the viewer (the opposite of what we were doing before) and accumulate the amount of cloud the light passes through. After the light exits all cloud volumes, I use this accumulated density value to somehow mix the ray’s original color and cloud’s color as the final output color.

So we have three problems: how do we build the cloud geometry, how do we trace rays through the cloud geometry, and how do we use the accumulated density value to calculate the final output.

First step: Procedural Cloud Modelling

If you have played with shadertoy a bunch, you must be familiar with the concept of fractal brownian motion (shortened as fbm). It’s one of the most powerful tools for procedural texturing and even procedural modelling. Here’s an excellent introduction to it. The basic idea is to scale the size of the noise space by two, then reduce the noise values to only half, add to itself, and repeat. This procedure creates a texture that resembles puffy cloud that has wispy edges.

We can use this technique to build 3D fbm, which we can then use as a building block of the density function of our cloudscape. Here’s the function signature of it:

float fbm(in vec3 position);

We can’t use fbm() directly as our cloud density function because if we do so, that would mean the entire 3D space is filled with cloud.

We can first limit the cloud volume to only exist between certain heights. We can do so by creating two analytical spheres whose centers are at earth’s center, and make sure that the cloud volume exists within their difference. We can do so by finding the intersection point between rays and the spheres analytically, then trace the light ray from the farther intersection point to the closer one.

and here’s what the cloud geometry looks like:

Despite being limited only to a certain slice of 3d space, they are still more or less uniformly distributed.

The next thing we can do is to set fbm values that are below a certain threshold to zero. This carves out a chunk of the cloud volume that has density lower than this threshold. By tweaking this threshold, we can control the size of clouds.

float cloud_density(in vec3 position)
    float res = fbm(position);
    if (res < ?)
        res = 0.0;
    return res;

Here’s how it looks now:

Cloudscape when the density threshold is set to 0.5

It looks pretty good, but it doesn’t animate. We can go one step further and make the fbm 4d, whose fourth dimension could be the animation parameter. By advancing the fourth input, the cloudscape can be animated nicely.

However, 4D fbm evaluation is expensive, and as you will see, the performance of our tracing method is reliant on how cheap our fbm evaluation is. So we can’t do that, sadly.

What I did instead is create an offset vector, whose direction is the wind direction and whose magnitude is the elapsed time in seconds. Then I add this offset vectors to the sample points before density sampling. This animates the cloudscape by making it move along the wind. To change the cloudscape’s structure, I added a constant positive y direction to the offest vector so that the cloud gets lifted up slowly. That way, new cloud structure emerges and old ones vanishes, as they are both passing through the two analytical spheres we built earlier.

Recall that we not only built the cloud’s geometry, but also its density in various areas. This means that if we take density into account in my tracing method, the cloud edges will have a lower density value than its centers, which makes its edges wispy and transparent. This is crucial for getting the stylistic look that I aim for.

Second Step: Accumulating Density

So now we have a ray, a begin position, an end position, and a density sampling function. If what’s available to us is just a density sampling function as the representation of the cloud geometry, then constant-stepped ray marching is pretty much the only way I can accumulate the density.

To compute the accumulated density, I first define a fixed sample count per ray, then divide the ray length by this sample count to get the fixed step size. Then I just keep stepping and sampling along the ray. At each sample, I multiply the step size with the density at that sample, then add to the accumulated value. Multiplying with the step size is necessary because it let the longer steps weigh more in the final addition. Each sample is multiplied with some magic number to “normalize”, sort of. This tweaky aspect originates from my emphasis on “look”, not correctness.

Now we convert the accumulated density to opacity. It is well-known that linear lerp isn’t even enough for fog effects, and exp() mapping is commonly used to fake the fog effect. I am doing the same for my cloud effect as well; I map density, which is [0, inf] (generously), to opacity, which is [0, 1]. Here’s the function:

 float opacity = 1.0 - exp(-acc_density);

same formula is also commonly used for HDR tonemapping, except an exposure term is multiplied within exp()

Now we set a base color for the clouds, then blend the original sky color with the cloud color based on opacity value:

vec3 cloud_col = vec3(5); 
vec3 final_col = mix(sky_col, cloud_col, magic_num * opacity);

Some magic number is necessary to control the look of the final render.

Final Step: Magic

Here’s the result of what we did. Surprisingly, it did not look good, even though what we did seems to be correct:

Here’s where the tweaky aspect comes in: by trying a whole bunch of stuff, I hit this line of code that magically gives depth and volume to the clouds:

//NOTE(chen): instead of just a base color, cloud color varies based on its accumulated density
vec3 cloud_col = vec3(1.5) + acc_density;
vec3 final_col = mix(sky_col, cloud_col, magic_num * opacity);

By changing the cloud color to be a variable depending on the accumulated density, clouds suddenly seem a lot puffier.

Review our Goals

Let's review our goals. We did achieve a completely procedural cloudscape, and we did have it be animated. It is also contained within a single shader, but one thing I didn't really achieve is control. I could control the amount of clouds on the sky, but I could never quite control the cloud shapes. That being said, I'm still quite happy with the results.


How good the cloud looks really depends on the sample count per ray. However, to obtain a reasonable image, the cloud shading pass needs to take around 10ms on my machine. Do not fret, though, as we still haven’t used the ultimate graphics optimization technique in our bag.

Recall from my very first rendering pipeline article, both bloom and SSAO shader were to slow and needed optimization. The trick was to render it as a texture at lower resolution first, then render the texture at higher resolution and blur it. In this case, we can just run the cloud shader on a ¼ resolution texture, and blit it to the main framebuffer with bilinear filtering turned on. With this optimization, cloud shading runs at a reasonable 1.5ms now. There is a bit of quality drop and temporal aliasing issues, but it isn’t quite noticeable.

Last Trick

Lastly, there’s more tricks that can be done to the cloud model. Recall that density comes from fbm(), but the 3d space that we are sampling fbm() from can be warped. In other words, we can do something along the line of :

float density = fbm(fbm(p + anim_t1) + anim_t2);

But I’d rather not have 2 fbm evaluations per sample, so I didn’t leave it in. But I did record a video of this technique into a video. It looks pretty cool.

For some reason, information on collision detection & response is quite sparse compared to all the other subjects. Worse yet, there are some bad information out there that are praised to be good resources. Pieces of information are also often discoherent, such as resources on collision detection says nothing about collision response and vice versa. It took me a while to find good information on this subject, and I have finally implemented a working collision system that works reasonably well for games.

A new collision object representation
This time, I chose convex polyhedrons to be the collision representation for the small objects in my game, such as trees, rocks, and so on. Concave objects, such as houses, are broken down into convex polyhedrons. I am aware of some of the convex decomposition techniques, but I didn’t want to do them, as I think that would take too much time. Instead, I manually decompose concave mesh into convex polyhedrons in Blender.

As for terrain, I use a completely different representation. This time, I render the terrain from top down in orthographic view, then store away the depthmap into some texture I can easily use later. Therefore, querying terrain height at any arbitrary point becomes very fast.

Unlike last time, we don’t have the fortune to have a uniform representation for all collidables in our game world, so we have to write two separate solutions. However, it is widely more efficient compared to last time. Instead of feeding huge numbers of polygons into the collision system, I can approximate them with cheap convex shapes now. As for terrain, it just becomes four texture reads.

First, let’s talk about how per-object collision works.


GJK is a powerful technique that queries the minimal distance between two convex objects, and it is also what I mainly use for my per-object collision system. I will only briefly summarize it here since there are already really good information on this topic, which I will link later.

First, imagine convex objects to be made of infinite small points. If a convex object A, and a convex object B are considered colliding, that must mean some of the points in this space belong to both A and B. If we subtract all points belong to B from all points belong to A, we must be subtracting a small subset of points from themselves. Therefore, some of these subtractions result in zero.

Imagine the subtractions we did form a new shape, C. For A and B to intersect, C must contain the origin. That is due to some of the subtractions resulting in zero. If A and B do not intersect, then C does not contain origin.

Now, to check if A and B intersect, we can check if C contains the origin. This is the essence of GJK; we have reduced the problem to checking whether or not the origin is contained by some convex shape.

How do we obtain C? Do we have to subtract all the points from B by all the points from A? No. If we can find support functions for A and B, then we can combine them to be the support function for C.

Given the support function of C, we can use it to map directions to extreme points in C. It can also be shown that for any convex shape, if it contains an origin, a tetrahedron made up of four points from the convex shape can capture the origin (meaning contain the origin). That means, we can use the support functions to find extreme points, and try to let these extreme points form a tetrahedron that contains the origin. That is what GJK does.

GJK does this by evolving a group of points into that tetrahedron that would capture the origin, we will call this group of points a simplex. For the 3D case, a simplex can be a point, a line segment, a triangle, or a tetrahedron.

In each iteration of GJK, it finds which voronoi region the origin is in relative to the simplex, then keep only the subset of that simplex which is contained by that voronoi region’s feature, and then change the search direction towards that voronoi region.

We can check the progress by finding the closest point on the simplex to the origin. Functions that do this, such as ClosestPointOnLineSegmentToPoint(), ClosestPointOnTriangleToPoint(), and ClosestPointonTetrahedronToPoint(), are well covered in Ericson’s book “Real Time Collision Detection”. The distance from this closest point to the origin tells us our progress. If this closest point stops getting closer, that means we will never be able to capture the origin. In that case, we return the distance as our minimal distance.

I know in Casey’s lecture on GJK, the termination condition is quite different. He simply checks if the extreme point in the search direction can get to the other side of the origin. That is enough for determining whether or not two objects collide, but it is not enough to find the minimal distance and the closest point on the simplex, which is vital to us, since we will need these information to resolve collision. Furthermore, some of the interesting optimizations I did will require GJK to reach the simplex that has the minimal distance in order to work.

Lastly, a word of warning. What I said above is purely in theory. There is more to termination condition in 3D GJK, such as clever ways of checking progression. I always find it reassuring to actually keep a minimal distance every iteration. Furthermore, you will also need to watch out for degenerate simplices, since most algorithms that determines a point’s voronoi regions will have unexpected behaviors if a simplex is degenerate, such as when a triangle becomes collinear and a tetrahedron becomes collinear in certain faces or coplanar. These are indications that GJK cannot make further progresses, because the new points is still within the current simplex.

In addition, a naive GJK implementation doesn’t work well with quadric shapes, such as spheres or ellipsoids, which are often shapes used to represent a player’s collision volume. In order for GJK to work for them, a tolerance value must be added to the progression check.

More details on GJK, read Erin’s talk.

Accounting for movement in GJK

As I described. GJK is an algorithm for computing the minimal distance between two convex shapes. It cannot be directly used as a collision detection algorithm in games, since it doesn’t account for movement.

One simple trick in Ericson’s book can change that, though. Keep in mind that GJK only requires the geometry’s support function to work, it does not care about the explicit definition of that geometry. We can turn one of the object to be swept across space by its motion by changing the support function for that object.

support function only returns points from B if search direction D is along motion vector

Consider the above swept sphere. The support function of the convex hull of the moving sphere will only return points from sphere B if the search direction is along the motion vector. In other words, we can simply do a dot product between the search direction and motion vector and use it to determine whether to use A’s support mapping or B’s support mapping. This effectively gives us a support function for a swept geometry.

Incorporating GJK into game’s collision detection

In my game, I am using a capsule to approximate player’s collision volume and convex polyhedrons for other geometries in the world. There are two pieces of information that we need to handle collision correctly each frame. One is time-of-impact (TOI) and contact point. We use TOI to stop the player right before collision and use contact point to compute the normal plane, which is then used to reflect player’s motion in order to achieve “gliding”.

My use of capsule is intentional, as it simplifies things a lot.

Obtaining time-of-impact

To obtain time of impact, one GJK isn’t enough. It only tells me whether my moving capsule will collide with any object, but doesn’t say when it will collide with them. As recommended by Ericson in his book, bisection method can be used here to obtain a relatively accurate TOI. The concept of bisection method is simply halving the interval to ensure that the margin within the interval is where the possible TOI is. It is analogous to binary search.

finding TOI with bisection method

However, as I have stated before, quadric shapes don’t play well with GJK. We can simplify the problem a lot here by reducing the capsule into its inner line segment. This is inspired by Randy Gaul’s answer to my post. His suggestion is reduce the capsule to a line segment in order to compute contact point, but I realized it can also work in collision detection phase.

I am not sure if this is the canonical definition of a capsule, but the “capsule” that I am using is defined by two parameters: the inner line segment and a radius. The inner line segment defines its height, and the radius expands that line segment out radially so it actually has a volume.

Given the above definition, it is clear that an object can only intersect with the capsule if the minimal distance between the inner line segment and the object is less or equal to the capsule’s radius. Therefore, instead of feeding a capsule into GJK, I can instead feed its inner line segment to GJK, and then compare the capsule’s radius with the returned minimal distance to detect collision.

Obtaining contact point

After determining TOI, we move the capsule forward by that amount, and we now determine the contact point between capsule and the closest object. As Randy Gaul suggested, the capsule is now reduced to line segment to find the contact point. That gives the advantage of only having to deal with non-quadric shapes and expect a non-tetrahedron simplex in the last iteration of GJK, since it’s impossible for the two to intersect when the radius of the capsule is non-zero, and our capsule’s radius is obviously non-zero.

But how do we compute closest point on either the capsule or the colliding object? Think back to the GJK algorithm. If the two objects are non-intersecting (which is obviously the case here), GJK is terminated by detecting a lack of progression (namely, comparing the minimal distance with the last iteration). Recall that we keep a closest point on the simplex to origin. This is not the closest point we need, though, as it is a point belong to the minkowski difference instead of the two original convex shapes. At the last terminating iteration, we convert the point into its barycentric form relative to the simplex. Because the current closest point on the simplex is a direct subtraction between A and B, we must be able to find two simplices on A and B that correspond to the end minkowski simplex as “source”, with the same dimension.

there must be a 1 to 2 mapping for simplex kept during GJK

In order to find the corresponding simplex in A or B, we have to keep track of the source points when we construct the simplex in GJK. By retrieving these points and applying the barycentric coordinates to them, we can obtain the closest point on either A or B.

After obtaining the contact point, the rest is old news; reflecting the motion back towards the collision normal so that the resultant motion vector is along the collision normal plane, and all that goodies.

That’s it!

That is all for player vs object collision detection & response routine. I wanted to include terrain collision in this post as well, but it’s getting quite long, so I will save it for next time. And as always, a video of the working collision system. The pink spheres are debug drawings of the closest point computed by this algorithm:

Hi all! Sorry for the inactivity. School got in the way and I wasn't able to work on Monter much in the past three months and therefore hasn't posted any new progress. But now with school out of the way, I can finally come back and work on Monter again. So, expect new updates on the reworked collision system very soon!
Thanks to the debugging system from last post, I was able to pinpoint the bugs fairly quickly. However, as I tweak the algorithm to improve its numerical robustness, it becomes obvious that the problem is not as simple as just making the algorithm robust. In order to elaborate my concern and problem, here’s a review of the main algorithm.

Recap of Faurby’s
As I have stated in my previous posts, the current collision system is based on Faurby’s paper, which is really just a simple swept sphere vs triangle mesh collision detection & response routine. Here’s a pseudo-code version:

vec3 move_sphere(vec3 sphere_p, float sphere_r, vec3 vel) {

   Triangle_Array mesh = get_scene_static_mesh();

   // NOTE(chen):contact_p is initial contact point of swept sphere against mesh
   //            t is the normalized time of impact used to modulate vel
   vec3 contact_p, float t = try_move_sphere(sphere_p, sphere_r, vel, mesh);

   vec3 legal_vel = vel * t; 
   vec3 remain_vel = vel - legal_vel; 
   vec3 new_sphere_center = sphere_p + legal_vel;

   if (magnitude(remain_vel) < 0.00001f)  {
      return new_sphere_center;

   //project remain_vel onto tangent plane of contact point
   vec3 collision_normal = normalize(new_sphere_center - contact_p);
   vec3 next_vel = remain_vel - dot(collision_normal, remain_vel) * collision_normal;

   move_sphere(new_sphere_center, sphere_r, next_vel);


The algorithm strives to ensure a lack of overlap between the sphere and triangles while trying its best to preserve the intended motion. The mathematics checks out, but the paper fails to address some crucial numerical robustness issues.

Numerical robustness readdressed

I use 32-bit floating point throughout my codebase, including the collision system. One important characteristic of floating point numbers is that it is fundamentally discretized due to its 32-bit limitation. As a result, there are real numbers that a floating point cannot represent exactly.

This characteristic potentially introduces small errors which can lead to terrible consequences, such as player getting stuck or falling through the ground. The nature of such errors is also unpredictable, making it all the harder to deal with. That is the numerical robustness problem. When an algorithm compensates for these small errors and prevents them from affecting the result, it is called being numerically robust. I have alluded to this in previous posts, but since I am really going to talk about it in this one, I thought it deserves a proper reintroduction.

Faurby’s problems

The algorithm is mostly consist of vector math, but since the vectors are composed of floating point numbers, it suffers from the same issues. By repeated running the code on the bad inputs captured by the debug system from the previous post, I have figured out the weaknesses of Faurby’s algorithm and made a few adjustments to strengthen its numerical robustness. As you will see, all the issues lie within the collision resolution part of the code (that makes sense since the collision detection method used is already a well-established one).

First problem: an inexact arrival point/direction

As illustrated above, one of the steps in the algorithm is to stop the sphere exactly at the initial contact point. However, this “exact” value often cannot be represented by floating point values. The position where the sphere is stopped at is either too shallow or too deep. And when it’s too deep, that means the sphere has clipped through the triangle and overlaps with the triangle mesh. This error results in player getting stuck or falling through the triangle.

In his paper, Faurby made an effort to resolve this issue. His proposed solution is to pull the sphere back along its velocity vector by a tiny bit. This way, sphere always remains hovering over the collision plane (ostensibly), which prevents any possible overlap between sphere and triangle mesh.

Unfortunately, the solution above solves the inexactness of the magnitude of sphere’s velocity but ignores that the velocity’s direction is also inexact. When sphere’s velocity is tangential to the collision plane, it forms a swept sphere parallel to the triangle’s surface:

However, due to floating point’s inexactness, the actual direction stored in computer often is not exactly parallel to the surface. Therefore, due to the error of margin, the swept sphere could potentially enclose a larger volume, possibly overlapping with the collision plane.

The red swept spheres are the same swept sphere, but offseted by the floating point errors. As you can see, the inexactness of velocity’s direction causes precision issues. It cannot be fixed by simply retracting the sphere along the velocity vector itself, because the velocity vector itself is misdirected.

Fix to the first problem

By simple observation, it is clear that when the sphere clips through the plane, only a small part of the sphere sinks through the triangle. Therefore, by pushing the sphere out along the collision normal by a tiny amount, it can be ensured that the sphere never intersects with the triangle regardless of its travel direction and travel distance. However, this solution has its own drawbacks, which will be discussed later.

Second problem: infinite recursion due to inexact velocity projection

As you can see from the pseudo-code, the algorithm recursively resolves sphere’s movement until the movement is exhausted. In each recursion, the remained velocity is projected onto the collision normal, then negated. The result of this projection is a reflecting vector, which is then used to project the remaining velocity into a new direction that is parallel to the collision plane. Again, for the same reason, the distance of the reflection might not be exact, and therefore it will fail to reflect the remaining velocity vector out of the collision plane.

The problem is not as severe as the first problem, as the collision detection will catch that and defer the collision resolution onto next recursion. However, in some cases, the work will be deferred far too many times, which results in recursion exceeding the program’s stack memory.

Fix to the second problem

Again, the solution is applying some form of bias to the calculation. In this case, the reflecting vector is increased in length by a tiny amount to ensure the reflected velocity will always not intersect with the collision plane.

Problems beyond numerical robustness

The above solutions improve the numerical robustness quite a bit. Before these fixes, the debug system will report around 500 errors for 10000 movement inputs, but now it consistently gives 0 errors for 20000 movement inputs.

But they introduce new problems. These biases that I apply make the algorithm fundamentally incorrect, and therefore errors might manifest itself as weird movements in the game. As I feared, these fixes result in jerky movements.

I think it’s time to say goodbye to this algorithm. Even if I can come up with someway to fix it, it will just be the same dirty work and loses its elegance. The reason I experimented with a collision system that works with triangle mesh is because I want the collision system to be based on a single method. Be it uneven terrain, convex or concave geometries, I want this one method that can handle all of these problems. The fact that I have to slap patches onto this broken algorithm just makes me sick.

Instead, here’s what I am going to try implement:

1. Do a top-down depth pass using the GPU, then download it into a heightmap texture (could be done just initially or per-frame, we will see).
2. Sample terrain height using entity’s coordinates and the heightmap, then ensure character will always be above that height (that solves the collision detection, but not resolution).
3. For other entities, manually composite collision primitives to fit their geometric shape and use them for collision purposes.

This method obviously has its own drawbacks, but they aren’t unsolvable. The heightmap approach makes it harder to implement multi-layered terrain, but I don’t think there will be any in Monter. Collision resolution will also likely to be a problem, but I can take the gradient of the heightmap and use that to estimate terrain slope.

I can’t really tell if that’s going to work unless I start experimenting, but at least it seems more promising than what I have currently. I will have to tear down the old work and rewrite it again, but hey, that’s just what happens when you are in the process of exploring.