Mārtiņš Možeiko
_mm_mul_epu32 only multiplies two 32-bit integers and stores the result as two 64-bit integers. So you wo…
»
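As a rough illustration of the point about _mm_mul_epu32, here is a scalar sketch of what the intrinsic computes. The function name `mul_epu32_model` is made up for this example and is not part of any library; it assumes the usual SSE2 behaviour, where only the low 32 bits of each 64-bit half (lanes 0 and 2 of the four 32-bit lanes) take part in the multiply.

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm_mul_epu32 (the name is ours, not an intrinsic).
   a and b each hold four unsigned 32-bit lanes. Only lanes 0 and 2 are
   multiplied; lanes 1 and 3 are ignored. Each product is kept as a full
   64-bit value, so there are two results, not four. */
static void mul_epu32_model(const uint32_t a[4], const uint32_t b[4], uint64_t out[2])
{
    out[0] = (uint64_t)a[0] * (uint64_t)b[0]; /* low half:  lane 0 * lane 0 */
    out[1] = (uint64_t)a[2] * (uint64_t)b[2]; /* high half: lane 2 * lane 2 */
}
```

So a loop that needs all four 32-bit products would have to get the odd lanes some other way, for example by shifting them into the even positions and multiplying a second time.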
elle
After debugging my vectorized version for a long time, I thought of a few minor things that might…
»
Mārtiņš Možeiko
Yes, for variables allocated on the stack, the compiler aligns them automatically. It gets tricky with st…
»
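A minimal sketch of the stack case only, assuming a C11 compiler. The type `vec4_model` and the function `stack_vec_is_aligned` are hypothetical names for this example; the struct stands in for a 16-byte SIMD type such as `__m128`.

```c
#include <stdint.h>
#include <stdalign.h>

/* Hypothetical stand-in for a 16-byte SIMD type: the alignas on the member
   raises the alignment requirement of the whole struct to 16 bytes. */
typedef struct { alignas(16) float e[4]; } vec4_model;

/* Returns 1 if a local (stack) variable of this type sits on a 16-byte boundary. */
static int stack_vec_is_aligned(void)
{
    vec4_model v = {{1.0f, 2.0f, 3.0f, 4.0f}};
    volatile float sink = v.e[0]; /* keep v from being optimized away */
    (void)sink;
    return ((uintptr_t)&v % 16u) == 0;
}
```

The compiler places `v` for us here; nothing in the function asks for a particular address. Memory obtained by other means is not covered by this sketch.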
Marius Adaškevičius
MSDN states that parameters for InterlockedCompareExchange and InterlockedIncrement functions mus…
»
Benjamin Kloster
Link to the interview. Not much that hasn't already been discussed in these forums or on the str…
»
noxy_key
Yes, I got VLIW somehow confused with SIMD, not sure why. SIMD executes one instruction on sever…
»
Casey Muratori
Just to be clear, it was SIMD stuff we did (Single Instruction Multiple Data), not VLIW (Very Lon…
»
Casey Muratori
Awesome catch! Yes, I think that is a bug. This probably serves me right for violating my own ru…
»
Andrew Bromage
cmuratori Obviously, the degree of speedup you get depends entirely on a) how underutilized the w…
»
Mārtiņš Možeiko
Unfortunately no. Clang does the same thing that GCC does for these vector types: https://gcc.gnu.org…
»
elle
I noticed that LLVM is less strict about the vector types than MSVC by default. For example, it d…
»
Nick
Ah I see. Thank you both for the explanation.
»
noxy_key
It's hard to ignore 8 processors. After all, they're just sitting there, unused. I'm pretty sure …
»
robert
Hello all, I was watching episode 124 and I think I have spotted a possible bug in the threading…
»
Mārtiņš Možeiko
There's more stuff about MESI on ryg blog: https://fgiesen.wordpress.com/2014/08/18/atomics-and-c…
»
Mārtiņš Možeiko
Kladdehelvete, did you read all the text I posted or just the first sentence? If what you are sa…
»
Casey Muratori
The problem here is strictly that there is no compression happening, and the dynamic range of the…
»
Casey Muratori
Ah! Good :) Although there's probably still a smarter way to do it if I thought about it for a …
»
Casey Muratori
The problem with any of this is that you really don't know until you actually test it with your w…
»
Casey Muratori
Well, that may be true, but we cannot get rid of it until we duplicate the function, because keep…
»
Casey Muratori
Hyperthreading definitely does allow you to potentially double the work done per core. The reaso…
»
Livet Ersomen Strøm
Please take note that I am 100% confident that Casey will do extremely well with his game eventua…
»
Livet Ersomen Strøm
mmozeiko 1. There is no such thing as 8.1 cores. So it is not correct to say that 1 thread is usi…
»
Marco
Hey everyone, I hope Casey won't ban me for posting CppCon references :) Here are some videos an…
»
elle
I think we can save 1 multiply, 1 add, and 1 shift at the end of the function when we unnecessari…
»
Abner Coimbre
Hear, hear! B)
»
Mārtiņš Možeiko
1. There is no such thing as 8.1 cores. So it is not correct to say that 1 thread is using 1/16 o…
»
noxy_key
I don't know how applicable this is to optimization via hyperthreading, but the two links below s…
»
Livet Ersomen Strøm
elle Has anyone else noticed this too, or is this a bug in my version? No, it seems to me it s…
»