> I'd like to see references for those claims and experiments, the size of the codebase, etc. I find the figures hard to believe, since the bottleneck in large codebases is not compute (e.g. header preprocessing) but memory bandwidth.
Edit: I think I misunderstood what you meant by memory bandwidth at first?
Modules reduce the amount of work being done by the compiler in parsing and interpreting C++ code (think constexpr). Even if your compilation infrastructure is constrained by RAM access, modules replace a compute+RAM heavy part with a trivial amount of loading a module into compiler memory so it's a win.
I suspect this is because these are C or Fortran sub-projects. I'm looking for a clean way to tell CMake to apply externis to all the C++-only subprojects, if possible. I'll see what I can come up with.
I'd also like to know, if multiple GCC commands end up pointing to the same trace.json, especially in a parallel build, will externis automagically ensure that it doesn't step over itself?
I figured out a way to set it at a top level, so it only happens with C++ files:
target_compile_options(${NAME} PUBLIC
  $<$<COMPILE_LANGUAGE:CXX>:-fplugin=externis>
  "$<$<COMPILE_LANGUAGE:CXX>:-fplugin-arg-externis-trace-dir=(where I want to put traces)>"
)
But as I suspected, it is not a single trace file; it's thousands of trace files. Is there some way to collate all the data into one larger picture of how the build progressed?
I guess it's better, but with C++ being C++, you will then need to decide if you consider
struct A { A(const volatile A&); };
as a class with a const copy constructor. Maybe someone cares?
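For what it's worth, here is a small sketch of how the standard library's own trait answers that question. `B` is a hypothetical extra type I added for contrast; it is not from the discussion above. `std::is_copy_constructible` only asks whether the type can be constructed from a `const T&`, so the `const volatile` overload passes and a non-const `T&` overload does not.

```cpp
#include <type_traits>

struct A { A(const volatile A&); };  // the case from the comment above
struct B { B(B&); };                 // hypothetical: non-const copy constructor

// A const A& binds to a const volatile A& parameter, so the trait accepts A.
static_assert(std::is_copy_constructible_v<A>);

// B can only be copied from a non-const lvalue, so the trait rejects it...
static_assert(!std::is_copy_constructible_v<B>);
// ...even though construction from a B& is fine.
static_assert(std::is_constructible_v<B, B&>);
```

Whether a tool should follow the trait or the exact signature is still a judgment call; this only shows what the trait itself reports.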
Proper templated classes don't behave like this. If you manually define a copy constructor in a template class, it has to work. And if it only works conditionally (as in many container classes), you need to add constraints on your constructors (C++20 and later) or derive from appropriately specialized base classes (e.g. std::_Optional_base in libstdc++).
It sucks to tell users "you're holding it wrong", but I don't think there's a way to make it simpler without breaking everything written since C++11.
It's one of the problems with the 'current' model of eukaryogenesis.
Eugene Koonin has recently suggested an interesting two-stage model, also based on shared virus phylogenies: https://www.nature.com/articles/s41564-023-01378-y