One of the most common things you'll hear when learning CMake is that "CMake is not a build system". This is technically correct, depending on one's definition of a "build system". However, this statement alone is meaningless on a practical level as it doesn't communicate anything actionable regarding how to approach CMake. It just invites semantics games. The slightly clickbait headline aside, my goal in this article is to unpack what CMake really is in a way you can hopefully use to understand CMake better.
Still, I do understand why people like saying this: technically correct is the best kind of correct, after all.
To even have this discussion, we'll have to pin down a definition of what a build system is. Let's ask Jeff Atwood, co-founder of the venerable Stack Exchange:
The value of a build script is manifold. Once you have a build script together, you've created a form of living documentation: here's how you build this crazy thing. And naturally this artifact is checked into source control, right alongside the files necessary to build it (and even the database necessary to run it, too). From there, you can begin to think about having that script run on a neutral build server to avoid the "Works On My Machine" syndrome. [...]
This is from his blog post "The F5 Key Is Not a Build Process". This was written a while ago, in 2007, somewhat before CMake became wildly popular. It was also written in the context of C#, which is more tolerant of "just click 'build' in Visual Studio" workflows than C++, which isn't managed.
Still, it touches on a very important point, which is that a build system serves as a source of truth for how to build your software. If that's the essence of what a build system is, then CMake fits the bill.
Maybe you don't believe Jeff. After all, he says "build process" rather than "build system", so maybe he's talking about something else. Let's ask academia. The 2018 paper, "Build Systems à la Carte" by Andrey Mokhov, Neil Mitchell, and Simon Peyton Jones (of Haskell fame), gives a rigorous definition:
Keys, values, and the store. The goal of any build system is to bring up to date a store that implements a mapping from keys to values. In software build systems the store is the file system, the keys are file names, and the values are file contents. [...]
Task description. Any build system requires the user to specify how to compute the new value for one key, using the (up to date) values of its dependencies. We call this specification the task description. For example, [...] in Make the rules in the makefile are the task description.
Build system. A build system takes a task description, a target key, and a store, and returns a new store in which the target key and all its dependencies have an up to date value.
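To make that definition concrete, here is a minimal Python sketch of it. The names `Store`, `Tasks`, and `build` are mine rather than the paper's (which works in Haskell). This toy version naively recomputes every dependency instead of skipping ones that are already up to date, and it doesn't detect cycles.

```python
from typing import Callable, Dict, List, Tuple

# The store maps keys (file names) to values (file contents).
Store = Dict[str, str]
# A task names the keys it depends on and a function that computes
# the new value from an up-to-date store.
Task = Tuple[List[str], Callable[[Store], str]]
# The task description: one task per key that can be built.
Tasks = Dict[str, Task]


def build(tasks: Tasks, target: str, store: Store) -> Store:
    """Return a new store in which `target` and its dependencies are up to date."""
    new_store = dict(store)  # leave the caller's store untouched
    done = set()

    def bring_up_to_date(key: str) -> None:
        # Keys without a task are plain inputs (e.g. source files).
        if key in done or key not in tasks:
            return
        dependencies, compute = tasks[key]
        for dependency in dependencies:
            bring_up_to_date(dependency)
        new_store[key] = compute(new_store)
        done.add(key)

    bring_up_to_date(target)
    return new_store


# A toy "compile, then link" task description.
tasks: Tasks = {
    "main.o": (["main.c"], lambda s: "obj(" + s["main.c"] + ")"),
    "app": (["main.o"], lambda s: "link(" + s["main.o"] + ")"),
}
store: Store = {"main.c": "int main(){}"}
result = build(tasks, "app", store)
print(result["app"])
```

In this picture, `tasks` plays the role of the makefile rules, and `build` is the part that actually executes them.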
According to this definition, CMake is technically not a build system: it isn't responsible for running your tasks, so it can't bring the store "up to date". It does, however, have a full task description language, one that assumes dependencies on files and their timestamps.
On the other hand, this is a build system:
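[Code listing lost in extraction: a four-line script that, per the paragraph below, builds from the CMakeLists.txt and calls Ninja.]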
The keys, values, and store are the same as they are for every conventional build system: the file-system contents. The task description is now firmly the CMakeLists.txt and the build system is this script. The fact that it calls Ninja is an implementation detail. This is also technically correct.
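A script of the kind described here might look like the following Python sketch. It is a stand-in written under assumptions, not the listing from the original post: the function names `build_commands` and `configure_and_build` are hypothetical, the `build` directory is an arbitrary choice, and the `-S`/`-B` options need CMake 3.13 or newer.

```python
import subprocess
from typing import List


def build_commands(source_dir: str, build_dir: str,
                   generator: str = "Ninja") -> List[List[str]]:
    """Return the two CMake invocations that make up the 'build system'."""
    return [
        # Configure: turn CMakeLists.txt into input files for the chosen backend.
        ["cmake", "-G", generator, "-S", source_dir, "-B", build_dir],
        # Build: let CMake drive whichever backend was generated.
        ["cmake", "--build", build_dir],
    ]


def configure_and_build(source_dir: str = ".", build_dir: str = "build",
                        generator: str = "Ninja") -> None:
    """Run both steps in order, stopping at the first failure."""
    for command in build_commands(source_dir, build_dir, generator):
        subprocess.run(command, check=True)
```

Calling `configure_and_build()` from a project root would configure and then build it. Passing `generator="Unix Makefiles"` swaps the execution engine without touching the task description, which is the sense in which the backend is an implementation detail.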
Mokhov et al. is a fascinating paper, and you should absolutely read it (did you know that Microsoft Excel is a build system?), but the purpose of their research is to taxonomize the ways various build systems model tasks and dependencies, and then carry out execution plans over those dependencies. It's not about pragmatic questions concerning the software lifecycle, but about the design space of certain tools that serve a particular purpose therein.
The descriptivist definition of "build system" would be much closer to what Jeff has in mind. When most people think about build systems, they aren't narrowly constraining themselves to the actual tool that invokes the compiler. For their purposes, the meta/non-meta distinction doesn't affect how they interact with CMake.
So why do people bother to draw this distinction? What do people think is actually meaningful about CMake being a "meta build system" or a "build system generator" rather than a plain "build system"? There isn't similar controversy about the GNU Build System (i.e., Autotools), and it also has separate configuration and build invocation steps. Heck, it popularized that process. Ever see this?
$ ./configure && make && make install
Yes, the configure script isn't a build system on its own, but you always run make afterwards. Autotools and CMake both call themselves build systems. Are they wrong? Sure, but only technically.
In the most common case, both CMake and Autotools are the single source of truth for building their respective projects. In order to build such software, you have to go through CMake (resp. Autotools) first. You get your pick of execution engine, but it's semantically irrelevant (ideally). There is a bug either in your CMake code or in CMake itself if you get different results from one backend versus another.
In 2018, David Chisnall wrote an article in ACM Queue titled "C Is Not a Low-level Language". The tagline, "Your computer is not a fast PDP-11", distills the central point of the article: that thinking about C programming in tandem with your target architecture is incorrect, because C targets an abstract machine which has its own semantics that the compiler is responsible for mapping to the target assembly language. There are some fascinating pitfalls detailed in the article, like how undefined behavior and pointer provenance can disable "obvious" optimizations (like loop unswitching), and delete null checks.
By analogy, CMake is not Make. Nor is it Visual Studio, or Ninja, or any of its many target backends. If the CMake generator is the architecture, then CMake code is C, the abstract build model it creates is the abstract machine, and targets with generator expressions are its IR. It is accurate to say that CMake is a domain-specific language for metaprogramming an abstract build model, which is assembled into input files for a build execution engine (*ahem* build system) of your choice.
When you search for "CMake is not a build system", this reminder appears in a few different contexts. Sometimes it's cited as an advantage, for example, when JetBrains says:
Yet another benefit, is that CMake is not a build system in the general meaning and doesn't lock its users on one particular build system: users are free to use make/Ninja/etc to actually build the products; and that's a huge advantage since neither build tool is suitable in all situations.
Other times, it shows up to explain why something doesn't work how you'd expect in CMake. Several other blog posts make this claim to explain why globbing for sources is discouraged in CMake.
In these cases, I think it's much more useful to be precise and say "CMake is not Make" as shorthand for the full truth: CMake's abstract build model must trade off between being a leaky abstraction and constraining itself to the least common denominator among its targets. Just because you can glob for sources in GNU Make doesn't mean that it's appropriate to do in CMake (I could write a whole article on just this point; maybe I will). The reason isn't that "CMake is not a build system"; it's that globbing happens during metaprogramming and doesn't make it into the final program (with 3.12+ there's CONFIGURE_DEPENDS, but it's unreliable, and the devs still discourage it).
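The point about globbing can be shown with a toy model in Python. It does not reflect how CMake is implemented: `configure` and `build` are hypothetical stand-ins for the configure step and the build tool, and the file names are made up. The glob runs once, at configure time, and only its result reaches the build.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def configure(source_dir: Path) -> List[str]:
    """Stand-in for the configure step: the glob is evaluated here, once."""
    return sorted(path.name for path in source_dir.glob("*.cpp"))


def build(generated_sources: List[str]) -> List[str]:
    """Stand-in for the build tool: it sees only the generated file list."""
    return [Path(name).with_suffix(".o").name for name in generated_sources]


def demo() -> Tuple[List[str], List[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        source_dir = Path(tmp)
        (source_dir / "a.cpp").write_text("// a\n")
        generated = configure(source_dir)           # glob result is frozen here

        (source_dir / "b.cpp").write_text("// b\n")  # added after configuring
        stale = build(generated)                     # the build never sees b.cpp
        fresh = build(configure(source_dir))         # only re-configuring picks it up
    return stale, fresh


stale, fresh = demo()
print(stale, fresh)
```

Here `stale` contains only `a.o`, while `fresh` contains both objects: the new file exists on disk, but the "final program" was generated before it did.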
There are certainly deficiencies in CMake's abstract model and (especially) its metaprogramming / scripting language. I think it's more productive to talk about those things in clear language than it is to wave your hands and say "CMake is not a build system".
Unless otherwise stated, all code snippets are licensed under CC-BY-SA 4.0.