Not a lot of super cool pictures that actually mean anything for day 2 of &pps, but I've implemented 'my' algorithm (which I've taken to calling the 'kernel solver') on the CPU (for now). I was surprised to learn that, when compiled with Odin's optimizer, it was actually faster, as well as more efficient, than successive over-relaxation (SOR).
That bodes well!
A lot more characterization and visualisation still has to be done, though.