Another big update to the Ray Casting dungeon crawler project - removing the RGB-float buffer and writing directly into u32 pixels more than quadrupled performance on the GPU, which now easily runs at around 2000-3000fps(!). Also added debug modes for Untextured, Depth, UVs, and Mip-Levels, and some of them even reach 1000fps on the CPU. So, unsurprisingly, the main bottleneck on the CPU is the software filtered texture sampling (no SIMD, and all on a single thread). The GPU has no issue at all running the same software texturing code (no hardware texture units used). It can even be above 200fps on my ultra-wide 5Kx2K screen.
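
For anyone curious what "software filtered sampling straight into u32 pixels" can look like, here is a minimal sketch in Python. It is not the project's actual code: the function names, the 0xAARRGGBB channel layout, and the wrap-around addressing are all assumptions made for illustration. It blends the four nearest texels bilinearly and returns a packed 32-bit pixel, so no float RGB buffer is needed in between.

```python
import math

def pack_rgb(r, g, b):
    """Pack 8-bit channels into one 32-bit pixel (0xAARRGGBB layout is an assumption)."""
    return 0xFF000000 | (r << 16) | (g << 8) | b

def unpack_rgb(pixel):
    """Split a packed pixel back into its 8-bit (r, g, b) channels."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF

def sample_bilinear(texels, width, height, u, v):
    """Bilinearly filter a packed texture at (u, v), wrapping at the edges.

    `texels` is a flat row-major list of packed pixels (hypothetical layout).
    """
    # Move to texel space; texel centres sit at integer + 0.5.
    x = u * width - 0.5
    y = v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    # Wrap the four neighbouring texel coordinates (repeat addressing).
    xa, xb = x0 % width, (x0 + 1) % width
    ya, yb = y0 % height, (y0 + 1) % height

    c00 = unpack_rgb(texels[ya * width + xa])
    c10 = unpack_rgb(texels[ya * width + xb])
    c01 = unpack_rgb(texels[yb * width + xa])
    c11 = unpack_rgb(texels[yb * width + xb])

    channels = []
    for i in range(3):
        top = c00[i] + (c10[i] - c00[i]) * fx      # blend along x, upper row
        bottom = c01[i] + (c11[i] - c01[i]) * fx   # blend along x, lower row
        channels.append(int(top + (bottom - top) * fy + 0.5))  # blend along y, round

    # Write the result as a single packed pixel instead of three floats.
    return pack_rgb(*channels)
```

Sampling exactly on a texel centre returns that texel unchanged, while sampling halfway between a black and a white texel gives mid-grey. In a real renderer the per-channel loop is the kind of work that SIMD or multiple threads would speed up on the CPU.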