Ever needed to intern a lot of stuff across a bunch of cores? Well, I did, and I've now got it working correctly: this is a lock-free hashset. It's an application of Cliff Click's non-blocking hash map, except written in C instead of Java and simplified for hashsets rather than hashmaps (if I need those later I might port it). Here I've got a spall trace of attempting 4 million interns on around 1 million unique entries: it takes 360ms for 6 threads to do all that, at around 400ns per attempt when it's not resizing. I do start with a shitty size (32 entries) to stress test it, and you can see the resizes in yellowish. When I cut the number of threads in half it takes around 512ms to complete the job, and single-threaded it's 1.1 seconds. I'm hoping to finish packaging it up into a library tomorrow, alongside writing some comparisons against strided & fully locking interning.
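For anyone who hasn't seen the basic trick, here is a rough sketch of lock-free interning with C11 atomics. It is not the code from the post: it assumes a fixed-capacity table with linear probing and leaves out resizing entirely, which is the hard part of the non-blocking hash map design. The names `InternSet`, `INTERN_CAP`, `fnv1a` and `intern` are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed capacity; the real thing starts at 32 and resizes. */
#define INTERN_CAP 1024
static_assert((INTERN_CAP & (INTERN_CAP - 1)) == 0, "capacity must be a power of two");

typedef struct {
    /* NULL means empty. A slot is written once and never changes after that. */
    _Atomic(const char *) slots[INTERN_CAP];
} InternSet;

/* FNV-1a string hash, picked here only because it is short. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Returns the canonical pointer for `key`: either `key` itself if this call
 * claimed a slot, or the equal string that got there first.
 * Returns NULL if the table is full (this sketch never resizes). */
static const char *intern(InternSet *set, const char *key) {
    uint32_t h = fnv1a(key);
    for (size_t i = 0; i < INTERN_CAP; i++) {
        size_t idx = (h + i) & (INTERN_CAP - 1);
        const char *cur = atomic_load_explicit(&set->slots[idx], memory_order_acquire);
        if (cur == NULL) {
            /* Try to claim the empty slot with a single CAS. */
            const char *expected = NULL;
            if (atomic_compare_exchange_strong_explicit(
                    &set->slots[idx], &expected, key,
                    memory_order_acq_rel, memory_order_acquire)) {
                return key;
            }
            /* Lost the race; `expected` now holds whatever the winner stored. */
            cur = expected;
        }
        if (strcmp(cur, key) == 0) {
            return cur; /* already interned, possibly by another thread */
        }
        /* Different string in this slot: keep probing. */
    }
    return NULL;
}
```

A zero-initialized `InternSet` (for example one with static storage duration) is an empty set, and two threads interning equal strings at the same time both get back the same pointer. The caller has to keep interned strings alive for as long as the set is in use, since the sketch stores the pointer rather than copying the string.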