Oh, you're right. Value will be 0 if no other jobs are being added. I don't know what I was think…
Ok, well I grabbed the source code and looking through it properly I can see there's no bug. I do…
Why do you think the semaphore is at 0 at the 4th step? The semaphore decreases only when WaitForSingleObject…
Apologies if I'm mistaken, but I just watched episode 126 and spotted what might be a possible ra…
then there is _aligned_malloc which will take care of alignment to arbitrary boundaries for you:
…
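For anyone who wants to try this outside of the stream code, here is a minimal sketch of an aligned allocation wrapper. The names `alloc_aligned`, `free_aligned` and `is_aligned` are made up for illustration and are not from the Handmade Hero source. On MSVC it calls `_aligned_malloc` / `_aligned_free`; on other compilers it falls back to C11 `aligned_alloc`, which is my substitution and not something the post mentions. It assumes the alignment is a power of two.

```c
#include <stdint.h>
#include <stdlib.h>
#if defined(_MSC_VER)
#include <malloc.h>   /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical helper: allocate `size` bytes aligned to `alignment`,
   which is assumed to be a power of two. Returns NULL on failure. */
static void *alloc_aligned(size_t size, size_t alignment)
{
#if defined(_MSC_VER)
    return _aligned_malloc(size, alignment);
#else
    /* C11 aligned_alloc expects the size to be a multiple of the
       alignment, so round the request up first. */
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    return aligned_alloc(alignment, rounded);
#endif
}

/* Hypothetical helper: memory from _aligned_malloc must be released
   with _aligned_free, not free. */
static void free_aligned(void *p)
{
#if defined(_MSC_VER)
    _aligned_free(p);
#else
    free(p);
#endif
}

/* Returns nonzero if `p` sits on an `alignment`-byte boundary. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Something like `float *pixels = alloc_aligned(count * sizeof(float), 16);` would then be safe to use with the aligned SSE load/store intrinsics, as long as it is released with `free_aligned`.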
_mm_mul_epu32 only multiplies 2 pairs of 32-bit integers (the low 32 bits of each 64-bit lane) and stores the results as 2 64-bit integers. So you wo…
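To make that concrete, here is a rough SSE2-only sketch of one way to get all four 32-bit products out of `_mm_mul_epu32`: multiply the even lanes, shift the odd lanes down and multiply again, then shuffle the low halves back together. The function names are hypothetical, and it keeps only the low 32 bits of each product. If SSE4.1 is available, `_mm_mullo_epi32` does the same job in one instruction.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: multiply four unsigned 32-bit lanes and keep the
   low 32 bits of each product, using only SSE2. */
static __m128i mul_lo_u32x4(__m128i a, __m128i b)
{
    /* Lanes 0 and 2: a0*b0 and a2*b2 as two 64-bit results. */
    __m128i even = _mm_mul_epu32(a, b);
    /* Shift lanes 1 and 3 down into positions 0 and 2, then multiply:
       a1*b1 and a3*b3 as two 64-bit results. */
    __m128i odd = _mm_mul_epu32(_mm_srli_si128(a, 4),
                                _mm_srli_si128(b, 4));
    /* Move the low 32 bits of each 64-bit product into lanes 0 and 1. */
    __m128i even_lo = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
    __m128i odd_lo  = _mm_shuffle_epi32(odd,  _MM_SHUFFLE(0, 0, 2, 0));
    /* Interleave back into the original lane order: 0, 1, 2, 3. */
    return _mm_unpacklo_epi32(even_lo, odd_lo);
}

/* Convenience wrapper working on plain (possibly unaligned) arrays. */
static void mul_lo_u32x4_array(const uint32_t a[4], const uint32_t b[4],
                               uint32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, mul_lo_u32x4(va, vb));
}
```

Whether the extra shifts and shuffles are worth it compared to restructuring the math is something you would have to measure.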
elle —
After debugging my vectorized version for a long time, I thought of a few minor things that might…
Yes, the compiler automatically aligns variables allocated on the stack.
It gets tricky with st…
MSDN states that parameters for InterlockedCompareExchange and InterlockedIncrement functions mus…
Link to the interview.
Not much that hasn't already been discussed in these forums or on the str…
Yes, I got VLIW somehow confused with SIMD, not sure why.
SIMD executes one instruction on sever…
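As a tiny illustration of the SIMD side of that distinction, here is a sketch where a single `_mm_add_ps` adds four pairs of floats at once. The wrapper name `add4` is made up for this example and is not from the stream.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: one SIMD add (addps) produces four sums at once. */
static void add4(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);            /* load a[0..3] */
    __m128 vb = _mm_loadu_ps(b);            /* load b[0..3] */
    _mm_storeu_ps(out, _mm_add_ps(va, vb)); /* out[i] = a[i] + b[i] */
}
```

The same operation is applied to every lane, which is the "single instruction, multiple data" part.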
Just to be clear, it was SIMD stuff we did (Single Instruction Multiple Data), not VLIW (Very Lon…
Awesome catch! Yes, I think that is a bug. This probably serves me right for violating my own ru…
cmuratori Obviously, the degree of speedup you get depends entirely on a) how underutilized the w…
Unfortunately no. Clang does the same thing that GCC does for these vector types: https://gcc.gnu.org…
elle —
I noticed that LLVM is less strict about the vector types than MSVC by default. For example, it d…
Nick —
Ah I see. Thank you both for the explanation.
It's hard to ignore 8 processors. After all, they're just sitting there, unused. I'm pretty sure …
robert —
Hello All,
I was watching episode 124 and I think I have spotted a possible bug in the threading…
There's more stuff about MESI on ryg's blog: https://fgiesen.wordpress.com/2014/08/18/atomics-and-c…
Kladdehelvete, did you read all the text I posted or just the first sentence?
If what you are sa…
The problem here is strictly that there is no compression happening, and the dynamic range of the…
Ah! Good :) Although there's probably still a smarter way to do it if I thought about it for a …
The problem with any of this is that you really don't know until you actually test it with your w…
Well, that may be true, but we cannot get rid of it until we duplicate the function, because keep…
Hyperthreading definitely does allow you to potentially double the work done per core. The reaso…
Please take note that I am 100% confident that Casey will do extremely well with his game eventua…
mmozeiko 1. There is no such thing as 8.1 cores. So it is not correct to say that 1 thread is usi…
Marco —
Hey everyone,
I hope Casey won't ban me for posting CppCon references :) Here are some videos an…
elle —
I think we can save 1 multiply, 1 add, and 1 shift at the end of the function when we unnecessari…