Why is Clang so much slower compiling than MSVC?

Leonardo

#23841

January 20, 2021

Am I mad, or has anyone else noticed this?

I wrote a build.bat for Clang and MSVC, both with 3 different build settings: Slow, Fast and Release.

MSVC compiles both of them faster than Clang takes to compile just one.

What can be happening?

BTW, I'm using lld-link with Clang. Also, I'm using /MP (multiprocessor compilation) with MSVC, but it doesn't really matter since I don't have that many files to compile.

Edited by Leonardo on January 20, 2021, 9:36pm Reason: Initial post

Mārtiņš Možeiko

#23842

January 20, 2021

Yes, that's how it is. Clang has a higher startup cost than MSVC.
There are -ftime-report and -ftime-trace arguments you can try to see where it spends most of its time, and then possibly turn off those parts (if it's some extra optimization or something).
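
As a rough sketch of how to read the -ftime-trace output: assuming you compile one file with something like `clang -ftime-trace -c main.c`, Clang should write a Chrome-trace-style JSON file next to the object file (usually `main.json`). The script below sums the durations of the complete events in such a file by name. The function names `summarize_trace` and `print_report` are made up for this example, and the field names (`traceEvents`, `ph`, `dur` in microseconds) are assumed from the Chrome trace format, so check them against what your Clang version actually emits.

```python
import json
from collections import defaultdict


def summarize_trace(trace, top=10):
    """Sum the duration of complete ("ph": "X") events by name, largest first.

    `trace` is the parsed JSON object; durations are assumed to be microseconds.
    Nested events are counted under their own name, so the totals overlap and
    should not be added together.
    """
    totals = defaultdict(int)
    for event in trace.get("traceEvents", []):
        if event.get("ph") == "X" and "dur" in event:
            totals[event["name"]] += event["dur"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


def print_report(path, top=10):
    """Load a trace file (hypothetical path, e.g. "main.json") and print the top entries."""
    with open(path, encoding="utf-8") as handle:
        trace = json.load(handle)
    for name, micros in summarize_trace(trace, top):
        print(f"{micros / 1000:10.1f} ms  {name}")
```

Calling `print_report("main.json")` after a build gives a quick list of where the time went. The same file can also be opened in a Chrome-trace-compatible viewer if you want the full timeline.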

Dawoodoz

#23892

February 4, 2021

Clang is a cross compiler for many different systems and hardware, and it doesn't know the target platform when the compiler itself is compiled, so it makes sense that a bigger program fetches more unused code into the instruction cache. Maybe your CPU starts compiling for the wrong target platform a few times until branch prediction has learned which route you take most often.

MSVC:
* Use this hard-coded 64-bit compiler, not the 32-bit version.
* Done.

Cross compiler:
* What platform will you target this time?
* So which specific processor model will you be optimizing for today?
* Oh, you don't want this extension for niche hardware that was abandoned 50 years ago but some bank might still be stuck with? Let's just abort this incorrect branch prediction.
* Starting to compile using dynamic jumps between code scattered in the huge binary while flooding the instruction cache with things you never even call.
* Finally done.

Mārtiņš Možeiko

#23893

February 4, 2021

That's not how Clang works. It creates objects specialized for the target architecture at the beginning, and there is no "unused code" to fetch or incorrectly predict (most of the time).
The slowness comes from how the compiler is structured / architected, not from it being a cross compiler.
It has very separate optimization and code generation passes, and it sometimes applies the same optimization steps multiple times, etc.

Edited by Mārtiņš Možeiko on February 4, 2021, 7:58pm

Dawoodoz

#23897

February 4, 2021

mmozeiko
That's not how Clang works. It creates objects specialized for the target architecture at the beginning, and there is no "unused code" to fetch or incorrectly predict (most of the time).
The slowness comes from how the compiler is structured / architected, not from it being a cross compiler.
It has very separate optimization and code generation passes, and it sometimes applies the same optimization steps multiple times, etc.

Even worse then. :D

Miles

#23899

February 5, 2021

Dawoodoz
Even worse then

You seem to be under the impression that doing multiple optimization passes is somehow stupid and wasteful, but in actual fact it's simply part of what's necessary to get good codegen. There's a reason LLVM generates vastly better optimized code than MSVC. What's unfortunate is that the way LLVM is architected makes it significantly slower when generating unoptimized/debug builds as well.
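
To make the point about repeated passes concrete, here is a toy sketch. It is not LLVM's actual pass pipeline or IR; the tuple-based instruction format and the functions `propagate`, `fold`, `run_once` and `optimize` are invented for illustration. It only shows why running the same two passes a second time can find work that the first round could not.

```python
# Toy IR: each instruction is (dest, op, lhs, rhs).
# "const" stores lhs as the value; "add" and "mul" combine lhs and rhs.
# Operands are either ints or variable names (strings).

def propagate(instrs):
    """Constant propagation: replace uses of variables known to be constants."""
    known = {}
    out = []
    for dest, op, a, b in instrs:
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if op == "const":
            known[dest] = a
        out.append((dest, op, a, b))
    return out


def fold(instrs):
    """Constant folding: evaluate add/mul when both operands are ints."""
    out = []
    for dest, op, a, b in instrs:
        if op in ("add", "mul") and isinstance(a, int) and isinstance(b, int):
            value = a + b if op == "add" else a * b
            out.append((dest, "const", value, None))
        else:
            out.append((dest, op, a, b))
    return out


def run_once(instrs):
    """One round: propagate, then fold."""
    return fold(propagate(instrs))


def optimize(instrs):
    """Repeat rounds until nothing changes (a fixed point)."""
    while True:
        improved = run_once(instrs)
        if improved == instrs:
            return instrs
        instrs = improved


program = [
    ("x", "const", 2, None),
    ("y", "add", "x", 3),
    ("z", "mul", "y", 4),
]
```

With this `program`, one round turns `y` into a constant, but `z` is still a multiplication, because `y` only became a constant after propagation had already looked at it. The second round is what folds `z`. A real compiler has many more passes interacting like this, which is where the extra time goes.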