Confusion around coordinate systems

I've been working with my own software renderer for a little while, but I've decided it's time to do a proper hardware renderer (but keeping it 2D). As I primarily use Macs, that means Metal.

The orthographic projection matrix is defined as: [image: Screen Shot 2023-09-18 at 8.41.23 PM.png]

but since Metal uses a left-handed coordinate system (+z into the screen, away from you), does that mean the two z terms should be negated? i.e. -2/(far - near) should just be 2/(far - near) and -(far + near)/(far - near) should be (far + near)/(far - near).

Secondly, since +z is into the screen, and I currently have near at 0 and far at 100, I was expecting that higher positive values of z would place objects further away, e.g. something at (20,10,3) would be closer than (20,10,6). However, it turns out I need to specify negative values for z in order for anything to show up on screen, because z values > 0 end up greater than 1 in NDC, so they are outside the clipping region. Is this expected?

Intuitively, I was thinking that since I set near and far to 0 and 100, I could specify z values in that range, but in fact I'm finding that only z values in the range -50 <= z <= 0 are valid.
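For what it's worth, that range can be checked by hand. The sketch below assumes the matrix in use is the standard OpenGL-style orthographic matrix with both z terms negated, as asked about above; the function names are made up for illustration. With near = 0 and far = 100 the z row becomes z/50 + 1, and Metal only keeps 0 <= z_ndc <= 1, which works out to -50 <= z <= 0.

```cpp
#include <cassert>

// Hypothetical helper: z row of an OpenGL-style orthographic matrix with
// both z terms negated. w is 1 for an orthographic projection, so the
// result is already the NDC z value.
float ndcZNegatedGL(float z, float nearZ, float farZ) {
    assert(farZ != nearZ);
    float scale  = 2.0f / (farZ - nearZ);            // originally -2/(far - near)
    float offset = (farZ + nearZ) / (farZ - nearZ);  // originally -(far + near)/(far - near)
    return scale * z + offset;
}

// Metal keeps only fragments whose NDC z lies in [0, 1].
bool insideMetalDepthRange(float ndcZ) {
    return ndcZ >= 0.0f && ndcZ <= 1.0f;
}
```

So z = 3 lands at roughly 1.06 and is clipped, while z = -25 lands at roughly 0.5 and is kept.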

Furthermore, I set my depth buffer up to clear to 0 and test using greater-than. This ensures smaller values of z are placed further back in the scene.

I've been messing around with this so long I've started to doubt what I'm doing is correct. :) It almost feels like I hacked something together that kind of works for me (not that that's always a bad thing..). But does this sound reasonable for a 2D renderer? I'm trying to avoid setting something up that will end up causing me pain or frustration down the line. Thanks for any feedback/tips!

Metal clips z in NDC space to [0,1], just like D3D, whereas OpenGL's default is [-1,+1].

From https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf: [image: image.png]

So if you want to use an OpenGL-style projection matrix, you need to fold the z*0.5 + 0.5 calculation for z into the matrix.
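A minimal sketch of that folding, assuming a row-major Mat4 applied to column vectors (clip = M * v); the type and function names here are made up for illustration. In clip space the remap is really z' = 0.5*z + 0.5*w, which for an orthographic projection (w = 1) is the same as z*0.5 + 0.5, so the new z row is just half the old z row plus half the w row.

```cpp
#include <cassert>

// Row-major 4x4 matrix, applied to column vectors: clip = M * v.
struct Mat4 { float m[4][4]; };

// Standard OpenGL-style orthographic matrix (NDC z in [-1, 1]).
Mat4 orthoGL(float l, float r, float b, float t, float n, float f) {
    assert(r != l && t != b && f != n);
    Mat4 o = {};  // all zeros
    o.m[0][0] = 2.0f / (r - l);   o.m[0][3] = -(r + l) / (r - l);
    o.m[1][1] = 2.0f / (t - b);   o.m[1][3] = -(t + b) / (t - b);
    o.m[2][2] = -2.0f / (f - n);  o.m[2][3] = -(f + n) / (f - n);
    o.m[3][3] = 1.0f;
    return o;
}

// Fold z' = 0.5*z + 0.5*w into the matrix:
// new z row = 0.5 * (old z row) + 0.5 * (w row).
Mat4 remapDepthToZeroOne(Mat4 p) {
    for (int c = 0; c < 4; ++c)
        p.m[2][c] = 0.5f * p.m[2][c] + 0.5f * p.m[3][c];
    return p;
}

// Clip-space z of the point (x, y, z, 1).
float clipZ(const Mat4& p, float x, float y, float z) {
    return p.m[2][0] * x + p.m[2][1] * y + p.m[2][2] * z + p.m[2][3];
}
```

With near = 0 and far = 100 the remapped z row reduces to -z/100, so this (still right-handed, OpenGL-style) matrix accepts eye-space z from 0 down to -100.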

Because Metal matches what D3D wants, you can look up how such a matrix is built in the DirectXMath library's XMMatrixOrthographicLH function: https://github.com/microsoft/DirectXMath/blob/22e6d747994600e00834faff5fc2a95ab60f1790/Inc/DirectXMathMatrix.inl#L2850-L2872
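As a rough illustration of what that kind of matrix does, here is the same mapping written out per component rather than as a matrix. This is a sketch, not the DirectXMath code: the Vec3 type and function name are made up, and it assumes a view volume centered on the origin with the given width and height.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// D3D-style left-handed orthographic mapping, per component (w stays 1):
//   x in [-width/2,  width/2]  -> [-1, 1]
//   y in [-height/2, height/2] -> [-1, 1]
//   z in [nearZ, farZ]         -> [0, 1]
Vec3 orthoLHZeroOne(Vec3 p, float width, float height, float nearZ, float farZ) {
    assert(width != 0.0f && height != 0.0f && farZ != nearZ);
    Vec3 ndc;
    ndc.x = 2.0f * p.x / width;
    ndc.y = 2.0f * p.y / height;
    ndc.z = (p.z - nearZ) / (farZ - nearZ);
    return ndc;
}
```

With near = 0 and far = 100, (20,10,3) ends up at depth 0.03 and (20,10,6) at 0.06, so positive z in the range 0 to 100 is valid and larger z is further away.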

This blog post explains how to derive such orthographic matrices: https://blog.demofox.org/2017/03/31/orthogonal-projection-matrix-plainly-explained/

For 2D orthographic rendering I would not worry about near and far planes at all; instead, just manually pass the z value you want to use for your triangles. w will be 1 anyway. Then LH vs RH does not matter at all.


Edited by Mārtiņš Možeiko on

"For 2D orthographic rendering I would not worry about near and far planes at all, instead just manually pass z value you want to use for your triangles. w will be 1 anyway. Then LH vs RH does not matter at all."

That makes sense! I could even do my own mapping if I find it's easier to think of z in a certain range in the game code, then map it to the renderer's (in this case Metal's) clipping box.
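Something like the following is what I have in mind. It's only a sketch: the function name and the idea of integer draw layers are made up for illustration, and it assumes the depth setup from my first post (clear to 0, greater-than test), so higher layers should get larger depth values and every layer needs a depth strictly above 0 to pass against the cleared buffer.

```cpp
#include <cassert>

// Hypothetical mapping from a game-side draw layer to a depth in (0, 1].
// Layer 0 is the furthest back; layerCount - 1 is drawn on top.
float layerToDepth(int layer, int layerCount) {
    assert(layerCount > 0 && layer >= 0 && layer < layerCount);
    return static_cast<float>(layer + 1) / static_cast<float>(layerCount);
}
```

The result would be written straight into the z of the vertex position, with no near/far planes involved.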

And thanks for the links! Those are really helpful.

It's definitely making more sense to me now!


Replying to mmozeiko (#29617)