Just finished my flexible UI renderer, built using nothing but compute shaders. Optimized it down to 0.7 ms for a full screen of buttons. Was hoping to do better, but it will do for now!
Essentially, each draw command has an array of words that is run through an SDF (signed distance field) VM on the GPU, once per pixel. Chunk and tile binning groups the draw commands into smaller arrays for faster per-pixel processing.
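For anyone wondering what "an array of words run through an SDF VM" could look like, here is a rough CPU-side sketch in Python. It is not the actual shader: the opcodes, the word layout, and the name `run_sdf_vm` are all made up for illustration, and the real VM runs per pixel in a compute shader.

```python
import math

# Hypothetical opcodes; the real instruction set is not described in the post.
OP_CIRCLE, OP_BOX, OP_UNION, OP_SUBTRACT = range(4)


def run_sdf_vm(words, px, py):
    """Evaluate one draw command's word stream at pixel (px, py).

    Returns a signed distance: negative inside the shape, positive outside.
    """
    stack = []
    i = 0
    while i < len(words):
        op = words[i]
        if op == OP_CIRCLE:
            # Operands: centre x, centre y, radius.
            cx, cy, r = words[i + 1:i + 4]
            stack.append(math.hypot(px - cx, py - cy) - r)
            i += 4
        elif op == OP_BOX:
            # Operands: centre x, centre y, half width, half height.
            cx, cy, hw, hh = words[i + 1:i + 5]
            dx = abs(px - cx) - hw
            dy = abs(py - cy) - hh
            outside = math.hypot(max(dx, 0.0), max(dy, 0.0))
            inside = min(max(dx, dy), 0.0)
            stack.append(outside + inside)
            i += 5
        elif op == OP_UNION:
            # Combine the two most recent shapes: closest surface wins.
            b = stack.pop()
            a = stack.pop()
            stack.append(min(a, b))
            i += 1
        elif op == OP_SUBTRACT:
            # Cut shape b out of shape a.
            b = stack.pop()
            a = stack.pop()
            stack.append(max(a, -b))
            i += 1
        else:
            raise ValueError(f"unknown opcode {op} at word {i}")
    return stack.pop()


# A 20x20 box with a radius-5 circle cut out of its centre.
button_with_hole = [
    OP_BOX, 0.0, 0.0, 10.0, 10.0,
    OP_CIRCLE, 0.0, 0.0, 5.0,
    OP_SUBTRACT,
]
```

A pixel would count as covered when the returned distance is at or below zero; a real renderer would probably also use the distance for anti-aliasing, but that part is a guess.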
image 1: 1280x960 filled with buttons
image 2: 13 different draw commands, each made up of different shape ops
image 3: shader perf results for full screen of buttons
image 4: snippet of the draw command API
image 5: chunk bin visualisation
image 6: tile bin visualisation
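The binning idea can be sketched the same way. The snippet below is a single-level version in Python, assuming each draw command comes with an integer pixel bounding box and assuming a 16-pixel tile; neither detail is from the actual renderer, and `bin_commands` is a hypothetical name. A chunk pass would presumably be the same thing with a larger cell size, run first so the tile pass only looks at commands from its own chunk.

```python
def bin_commands(bounds, width, height, tile=16):
    """Assign draw command indices to every screen tile they overlap.

    bounds: one (x0, y0, x1, y1) rectangle per draw command, in integer
            pixels, with x1 and y1 exclusive.
    Returns (bins, tiles_x), where bins[ty * tiles_x + tx] lists the
    commands touching tile (tx, ty).
    """
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    bins = [[] for _ in range(tiles_x * tiles_y)]

    for index, (x0, y0, x1, y1) in enumerate(bounds):
        # Clamp the covered tile range to the screen.
        tx0 = max(x0 // tile, 0)
        ty0 = max(y0 // tile, 0)
        tx1 = min((x1 - 1) // tile, tiles_x - 1)
        ty1 = min((y1 - 1) // tile, tiles_y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[ty * tiles_x + tx].append(index)

    return bins, tiles_x


# Two commands on a 64x32 screen: one fills the first tile,
# the other spans several tiles.
bins, tiles_x = bin_commands([(0, 0, 16, 16), (10, 10, 40, 20)], 64, 32)
```

Each pixel then only has to walk the short list for its own tile instead of every command on screen, which is where the per-pixel saving comes from.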