When I work with other languages on large software projects, the workflow is typically:
- Grab the source code, extract it, cd into that directory.
- Run some standard build tool. Usually this tool is well known and completely accepted by the programming language's community (Maven, Tox, Cargo, etc.), at least compared to C++, where every few years I hit a wall, have an existential crisis in which I contemplate how I'm building software, and spend ages learning another tool.
- See it create a pristine directory to host all build artifacts.
I've always been amazed at how different this last step is in C++. Most tools, instead of creating a single directory with the output of the build process, pollute whatever directory they're run from with zillions of object files and associated build artifacts. There's a historical reason for this: Make does things this way, Ninja was inspired by Make, and CMake wants to work with all these tools so it has to follow suit. But it's really gross, and coming from other programming language cultures it feels extremely unintuitive.
In the same vein, “standard” package installs are pretty gross: instead of polluting the current directory with artifacts, they pollute your entire machine by affecting any software you build afterwards.
Typically, a package install works by invoking an install target, such as running “sudo make install”. This copies libraries, header files, and other stuff to system directories such as /usr/lib, /usr/include, /usr/bin, etc.
So if you're working on a project and want to pull in a dependency, like SDL2, you'd download SDL2, run sudo make install, and then be able to use SDL2 from your project without including SDL2's source inside of your own project or “vendoring” the dependency (a euphemism for shoving all the build artifacts into source control).
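That classic flow usually looks something like this (a sketch assuming an autotools-style release tarball; the version number is a placeholder and the exact steps vary by project):
tar xzf SDL2-2.0.x.tar.gz   # 2.0.x is a placeholder version
cd SDL2-2.0.x
./configure
make
# Copies headers to /usr/include, libraries to /usr/lib, and so on:
sudo make install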
I've typically avoided installation processes like this because:
- They require sudo.
- Installing globally clearly doesn't scale if you plan to work on two projects which require different versions or build variants of the same dependency.
- Because you install the package globally, afterwards it's easy to forget you had to cross this hurdle. If you're not constantly documenting things (which is very possible when you're spinning up a new project) you may not even remember that you had to do anything a year later when you're solving the mystery of “why does this not build correctly on the new guy's machine” (don't say CI will fix this; it's just as easy to bake this kind of dependency into a CI box or base image).
- On Windows this procedure probably doesn't work at all, or uses some different standard the author of the library or build tool invented that installs things to unpredictable locations. Or worse, it uses the actual Windows standards.
- Uninstalling the package is not really possible, because the installation process isn't very well tracked (unlike with a Windows MSI or a Debian package), so you have to guess what needs to be removed.
Thankfully, there's a way to avoid globally installing packages: prefix paths.
These are root directories that override the default “system” paths. So instead of files being copied to /usr/lib, they go into ${prefix_path}/lib; /usr/include becomes ${prefix_path}/include; etc.
Since the idiom of C and C++ package installs is only an idiom and not enforced by a contract between build systems, the way you specify prefix paths differs between tools.
In CMake the standard is to set the variables CMAKE_INSTALL_PREFIX to tell it where to put packages, and CMAKE_PREFIX_PATH to tell it where to find them.
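For example, installing a CMake-based dependency into a prefix and then pointing your own project at it looks roughly like this (the prefix location here is made up; any directory works):
# Install some dependency into a prefix instead of /usr:
cd some-dependency
cmake -DCMAKE_INSTALL_PREFIX=$HOME/prefixes/demo -H./ -B_build
cmake --build _build --target install
# Later, tell your own project's build where to find what was installed:
cd ../your-project
cmake -DCMAKE_PREFIX_PATH=$HOME/prefixes/demo -H./ -B_build
cmake --build _build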
It's helpful for me to imagine each directory that can be used as a prefix path as its own semi-isolated environment for C and C++ dependencies. I call this a C-environment, or just cenv for short.
A cenv is isolated in that it can't be affected except by packages installed globally. Since cenvs don't affect each other, you can protect them from external influences by keeping your system clean and never installing packages globally.
Once you realize that a mechanism for cleanly installing C and C++ libraries exists, it's easy to imagine how to achieve a nice workflow similar to other languages:
- Create a new cenv.
- Download, build and install whatever packages you want your project to depend on, such as the Boost headers or SDL2, to the cenv.
- Build the project you're working on, using the cenv to pick up the packages installed earlier.
Unfortunately, installing packages is still a somewhat difficult process that entails checking out source code, generating build files in CMake, and installing it to your cenv.
Thankfully, we can use a tool called cget to download and install CMake-based projects.
In recent years a series of package managers have been introduced for C++. What makes cget different is how simple it is: most of these tools introduced their own ideas about what it means to install a package, while cget went along with the CMake standards, which are themselves based on common idioms already in use in Makefile-based projects.
The one area where cget breaks from the norm is that it doesn't install packages globally by default. Instead, by default cget creates a brand new cenv for you in the current directory, in a directory named ./cget. It also creates a CMake toolchain file in this directory, which sets CMAKE_INSTALL_PREFIX and CMAKE_PREFIX_PATH to use the cenv.
(Note: cget calls this directory a “new prefix path”, but I think the name “cenv” represents it better.)
Using cget looks like this:
cd your-project-directory # This contains a CMakeLists.txt which uses GLM
cget init # creates a new cenv at `./cget` if none exists.
# Downloads the 0.9.8.5 release of glm from GitHub, creates a build directory
# somewhere inside of `./cget`, builds glm and installs it to locations in
# the new cenv such as `./cget/include`.
cget install g-truc/glm@0.9.8.5
mkdir build && cd build
# -DCMAKE_TOOLCHAIN_FILE tells it to use cget's cenv
cmake -DCMAKE_TOOLCHAIN_FILE=../cget/cget/cget.cmake -H../ -B./
cmake --build ./
The code above creates a cenv, installs the GLM library to it, then builds the CMake project in the directory by passing CMake the toolchain file cget created for the cenv.
(If you're curious about how CMake itself consumes GLM: somewhere in the CMakeLists.txt file there will be a “find_package(glm)” call, which will look in the cenv for GLM's package info. This blog post is already pretty long, so I'll explain how this works in another one, but essentially, if CMake knows where to look, it can find libraries and header files that are installed in the typical way.)
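If you want a taste now, the consuming side might look roughly like this sketch (hedged: the exact variables or targets glm's package config exports depend on the GLM version; GLM_INCLUDE_DIRS is assumed here):
cmake_minimum_required(VERSION 3.0)
project(demo)
# Because cget's toolchain file sets CMAKE_PREFIX_PATH to the cenv,
# find_package can locate the package config GLM installed there.
find_package(glm REQUIRED)
add_executable(demo main.cpp)
# Assumes glm's config sets GLM_INCLUDE_DIRS; newer GLM versions export
# an imported target instead.
target_include_directories(demo PRIVATE ${GLM_INCLUDE_DIRS})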
The string you pass to cget install is called a “package source”. This can be the name of a project on GitHub (for example, above we fetched the 0.9.8.5 release of GLM), a file path on your local machine, a URL to a tar.gz file, or other more exotic types beyond the scope of this blog post.
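For instance (the local path and URL below are made up):
cget install g-truc/glm@0.9.8.5                  # a GitHub project at a tag
cget install ../my-local-lib                     # a CMake project on disk
cget install http://example.com/lib-1.0.tar.gz   # an archive URL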
cget can also accept as a package source a text file containing a list of package sources. By convention this file is called requirements.txt.
This means it's now possible to make a typical C++ CMake project and distribute it with a file called requirements.txt in the root. Users can then install all the packages they need with cget before building or installing your package's source code.
Since cget is based on existing CMake and Make idioms and standards, this also means that if your users are weirdos who refuse to use cget, they can still get information on what packages our project needs.
If we created a requirements.txt file for our project above, we could fill it with:
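g-truc/glm@0.9.8.5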
With a requirements.txt file in the root of our project our new process becomes:
cd your-project-directory
cget init
cget install -f requirements.txt
mkdir build && cd build
# -DCMAKE_TOOLCHAIN_FILE tells it to use cget's cenv
cmake -DCMAKE_TOOLCHAIN_FILE=../cget/cget/cget.cmake -H../ -B./
cmake --build ./
If someone else wants to install our project using cget, cget will find the requirements.txt file and install the dependencies we require first.
If you want to build your project with different compilers or otherwise use multiple configurations, you'll need more than one cenv. You can make cget create cenvs in other locations by passing --prefix to cget init. The environment variable CGET_PREFIX can also tell cget to use a cenv other than the ./cget directory in the current directory.
Additionally, we can pass arbitrary toolchains, as well as certain CMake settings, to cget init to make the cenv toolchain file it creates include them.
This means building for multiple configurations looks like this:
# Build with GCC 6 in debug mode
cget init --prefix gcc-debug -DCMAKE_C_COMPILER:none=gcc-6 -DCMAKE_CXX_COMPILER:none=g++-6 -DCMAKE_BUILD_TYPE:none=Debug
export CGET_PREFIX=$(pwd)/gcc-debug
mkdir build-gcc && cd build-gcc
cmake -DCMAKE_TOOLCHAIN_FILE=../gcc-debug/cget/cget.cmake -H../ -B./
cmake --build ./
# Now build with Clang in release mode
cd ..
cget init --prefix clang-release -DCMAKE_C_COMPILER:none=clang-3.8 -DCMAKE_CXX_COMPILER:none=clang++-3.8 -DCMAKE_BUILD_TYPE:none=Release
export CGET_PREFIX=$(pwd)/clang-release
mkdir build-clang && cd build-clang
cmake -DCMAKE_TOOLCHAIN_FILE=../clang-release/cget/cget.cmake -H../ -B./
cmake --build ./
Creating a cenv in the root of each project you're working on is probably fine for some people, but I quickly grew sheepish about creating brand new cenvs for all my little projects; I seem to always be on the verge of filling the solid state drives where I do all my work. Additionally, certain packages, such as Boost, can take up a ton of space.
This is very similar to a problem faced by Python developers who use virtualenvs, which are like cenvs but for Python projects. For the purposes of testing and CI, a virtualenv for every project makes sense, but depending on how prolific you are this can get expensive. Tools like pyenv and virtualenvwrapper help by maintaining a list of virtualenvs that are available globally from any shell session and can easily be switched between.
I liked this workflow, so I did the same thing for cenvs by building a tool called, confusingly enough, Cenv (installing it is meant to be simple even for those unfamiliar with Python, and it also installs cget).
Cenv manages a group of cenvs stored at ~/.cenv (C:\Users\your-name\.cenv on Windows). You create and list them like this:
$ cenv init gcc-debug -DCMAKE_C_COMPILER:none=gcc-6 -DCMAKE_CXX_COMPILER:none=g++-6 -DCMAKE_BUILD_TYPE:none=Debug
$ cenv init clang-release -DCMAKE_C_COMPILER:none=clang-3.8 -DCMAKE_CXX_COMPILER:none=clang++-3.8 -DCMAKE_BUILD_TYPE:none=Release
$ cenv list
gcc-debug
clang-release
You can activate one of these cenvs by calling cenv set:
$ cenv set gcc-debug
* * using gcc-debug
$ cenv list
* gcc-debug
clang-release
“Activating” a cenv does three things:
- It sets the CGET_PREFIX environment variable, so cget uses that cenv.
- It adds the lib directory of the cenv to the PATH and LD_LIBRARY_PATH environment variables. This is necessary to run executables that have been linked against shared libraries or DLLs installed to the cenv (the alternative would be installing those globally, or copying all of the needed shared libraries and DLLs to the same directory as the executable you're building, which is wasteful).
- It wraps the cmake command so that it always passes in -DCMAKE_TOOLCHAIN_FILE=${CGET_PREFIX}/cget/cget.cmake, meaning you get to stop thinking about that.
Running cenv set with a different cenv, or cenv deactivate, undoes these changes (it also smartly removes the entries that were added to PATH and LD_LIBRARY_PATH).
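You can see the effect from a POSIX shell (the exact values depend on where your cenvs live under ~/.cenv):
cenv set gcc-debug
echo "$CGET_PREFIX"                      # now points at the gcc-debug cenv
echo "$PATH" | tr ':' '\n' | head -n 3   # the cenv's lib directory should show up near the front
cenv deactivate
echo "$CGET_PREFIX"                      # no longer points at the cenv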
With Cenv installed, building a project for two different configurations looks like this:
# Build with GCC 6 in debug mode
cenv set gcc-debug
mkdir build-gcc && cd build-gcc
cmake -H../ -B./
cmake --build ./
cd ..
cenv set clang-release
mkdir build-clang && cd build-clang
cmake -H../ -B./
cmake --build ./
It took me a while to appreciate cget and the standard CMake practices it advocates, mostly because CMake itself, while a useful, high-quality tool, is loaded with so many options and settings that using it the right way isn't immediately clear, and often involves overly verbose arguments that made it feel like I was on the wrong path even when I wasn't.
However, at its core, package installation with CMake is simple, and I'd argue its inner workings are easier to understand than those of the competing packaging tools for C++. Though cget is extremely useful, its source code is tiny, as it focuses on solving a few small problems very well. It makes existing CMake practices easier to use instead of inventing its own standards and procedures for installing C++ packages and furthering the Babel of sorts the community is headed towards. As someone who has looked at most of the other C++ package managers, I think cget's approach is ultimately the simplest and most maintainable.
I believe cget and Cenv collectively rub most of the rough edges off of CMake, leaving a workflow that is scalable and pleasant. You can install both today by following the instructions on Cenv's README.
In a future blog post, I hope to discuss the basics of writing CMake files which correctly install and consume packages from cenvs.