Why is Clang so much slower compiling than MSVC?

Am I mad, or has anyone else noticed this?

I wrote a build.bat for Clang and MSVC, both with 3 different build settings: Slow, Fast and Release.

MSVC compiles both of them faster than Clang takes to compile just one.

What can be happening?

BTW, I'm using lld-link with Clang. Also, I'm using /MP (multiprocessor compilation) with MSVC, but it doesn't really matter since I don't have that many files to compile.

Edited by Leonardo on Reason: Initial post
Yes, that's how it is. Clang has higher startup cost than MSVC.
There are -ftime-report and -ftime-trace arguments you can try to see where it spends most of its time, and then possibly turn off those parts (if it's some extra optimization or something).
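If you go the -ftime-trace route, the output is a JSON file that can be loaded into a trace viewer, but a quick script is sometimes enough to see the big buckets. Below is a minimal sketch, assuming the file follows the Chrome trace event layout (a top-level "traceEvents" list whose complete events have "ph" set to "X" and a "dur" in microseconds). The function name summarize_trace is made up for this example, and since events can nest, the per-name totals may overlap each other.

```python
import json
from collections import defaultdict


def summarize_trace(trace, top=10):
    """Sum the durations (in microseconds) of complete events by name.

    `trace` is the already-parsed JSON object. The layout is an assumption:
    a "traceEvents" list in the Chrome trace event format.
    """
    totals = defaultdict(int)
    for event in trace.get("traceEvents", []):
        # "X" marks a complete event that carries its own duration.
        if event.get("ph") == "X" and "dur" in event:
            totals[event["name"]] += event["dur"]
    # Largest total first, keep only the `top` entries.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


def summarize_file(path, top=10):
    """Load a trace file from disk and summarize it."""
    with open(path, encoding="utf-8") as f:
        return summarize_trace(json.load(f), top)
```

To use it, call summarize_file on the .json file that Clang writes for a translation unit and print the (name, microseconds) pairs it returns.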
Clang is a cross compiler for many different systems and hardware, and it does not know the target platform when the compiler itself is built, so it makes sense that a bigger program fetches more unused code into the instruction cache. Maybe your CPU starts compiling for the wrong target platform a few times until branch prediction has learned which route you take most often.

MSVC:
* Use this hard-coded 64-bit compiler, not the 32-bit version.
* Done.

Cross compiler:
* What platform will you target this time?
* So which specific processor model will you be optimizing for today?
* Oh, you don't want this extension for niche hardware that was abandoned 50 years ago but some bank might still be stuck with? Let's just abort this incorrect branch prediction.
* Starting to compile using dynamic jumps between code scattered in the huge binary while flooding the instruction cache with things you never even call.
* Finally done.
That's not how Clang works. It creates objects specialized for the target architecture at the beginning, and there is no "unused code" to fetch or incorrectly predict (most of the time).
The slowness comes from the way the compiler is structured / architected, not from it being a cross compiler.
It has very separate optimization and code generation passes. It sometimes applies the same optimization steps multiple times, etc.

Edited by Mārtiņš Možeiko on
mmozeiko
That's not how Clang works. It creates objects specialized for the target architecture at the beginning, and there is no "unused code" to fetch or incorrectly predict (most of the time).
The slowness comes from the way the compiler is structured / architected, not from it being a cross compiler.
It has very separate optimization and code generation passes. It sometimes applies the same optimization steps multiple times, etc.


Even worse then. :D
Dawoodoz
Even worse then

You seem to be under the impression that doing multiple optimization passes is somehow stupid and wasteful, but in actual fact it's simply part of what's necessary to get good codegen. There's a reason LLVM generates vastly better optimized code than MSVC. What's unfortunate is that the way LLVM is architected makes it significantly slower when generating unoptimized/debug builds as well.
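To make the point about repeated passes concrete, here is a toy sketch (not LLVM's actual passes or pipeline; the instruction format and the names fold_constants, remove_dead and optimize are invented for this example). Dead-code removal finds nothing to do until constant folding has run, so with the passes in that order the driver has to run the same pass again to clean up.

```python
def fold_constants(program):
    """Turn an add into a const when both operands are known constants."""
    known = {}
    out = []
    for instr in program:
        if instr[0] == "add" and instr[2] in known and instr[3] in known:
            instr = ("const", instr[1], known[instr[2]] + known[instr[3]])
        if instr[0] == "const":
            known[instr[1]] = instr[2]
        out.append(instr)
    return out


def remove_dead(program):
    """Drop instructions whose result is never read (single sweep only)."""
    used = set()
    for instr in program:
        if instr[0] == "add":
            used.update(instr[2:4])
        elif instr[0] == "ret":
            used.add(instr[1])
    return [i for i in program if i[0] == "ret" or i[1] in used]


def optimize(program, passes):
    """Run the pass list until nothing changes; count the rounds that changed something."""
    rounds = 0
    while True:
        before = program
        for run_pass in passes:
            program = run_pass(program)
        if program == before:
            return program, rounds
        rounds += 1


# x = 2; y = 3; z = x + y; return z
program = [("const", "x", 2), ("const", "y", 3), ("add", "z", "x", "y"), ("ret", "z")]
optimized, rounds = optimize(program, [remove_dead, fold_constants])
```

With remove_dead ahead of fold_constants, the first round only folds the add; x and y become dead only after that, so a second round is needed to remove them. Swapping the order gets there in one round for this input, but no single fixed order is right for every program, which is one reason real pipelines repeat passes.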