C++: Combinatorial instantiation of templates with std::variant
C++ templates can increase compile times dramatically. When developing a library that makes heavy use of them, a common approach is to use explicit instantiation to reduce compile times for users of the library, instantiating in the library build, instead of deferring to the user build. There are other reasons too. Consider, for example, writing a library of templated CUDA kernels, where the CUDA toolchain is required for compilation: explicit instantiation alleviates the user from setting up the CUDA toolchain, instead building their own code with their preferred toolchain and linking in the library (e.g. this is what I do with NumBirch).
But there are often many combinations of types for which a function template must be instantiated. Consider the following simple example:
template<class T, class U>
void test(T x, U y) {
//
}
where valid types for T
and U
are double
, float
, and int
(we should probably use SFINAE or concepts to restrict T
and U
to these, but omit the detail here). To explicitly instantiate all such combinations we would use the following:
template void test(double, double);
template void test(double, float);
template void test(double, int);
template void test(float, double);
template void test(float, float);
template void test(float, int);
template void test(int, double);
template void test(int, float);
template void test(int, int);
Compiling:
g++ -std=c++20 -c instantiate.cpp
and inspecting:
nm -C instantiate.o | grep test
gives:
0000000000000000 W void test<double, double>(double, double)
0000000000000000 W void test<double, float>(double, float)
0000000000000000 W void test<double, int>(double, int)
0000000000000000 W void test<float, double>(float, double)
0000000000000000 W void test<float, float>(float, float)
0000000000000000 W void test<float, int>(float, int)
0000000000000000 W void test<int, double>(int, double)
0000000000000000 W void test<int, float>(int, float)
0000000000000000 W void test<int, int>(int, int)
We can see that all desired instantiations have been made.
The code is quite verbose, and this is a small example: two arguments with each of three possible types, for instantiations. We can imagine that this can quickly get out of hand. It is realistic to think of ternary functions with three arguments and, say, eight different types, for instantiations. We will not want to enumerate all of those.
One option is to use the preprocessor. Something like this:
#define BINARY(f) \
BINARY_FIRST(f, double) \
BINARY_FIRST(f, float) \
BINARY_FIRST(f, int)
#define BINARY_FIRST(f, T) \
BINARY_SECOND(f, T, double) \
BINARY_SECOND(f, T, float) \
BINARY_SECOND(f, T, int)
#define BINARY_SECOND(f, T, U) \
template void f(T, U);
BINARY(test)
This works, giving the same results on inspection as above, but becomes messy for large numbers of arguments or types. We might also need to exclude particular combinations of types that are incompatible, which will add to the mess.
Since C++17, another approach is to use std::variant
to encode the valid types, then std::visit
to enumerate all the instantiations (now implicit rather than explicit, but the end result is the same):
static void instantiate() {
using valid = std::variant<double,float,int>;
valid x, y;
std::visit([](auto x, auto y) {
test(x, y);
}, x, y);
}
Again, same result on inspection with nm
, but we might consider this a little nicer. Or we might wrap it up in a macro to make it a little nicer. And we can add instantiations of more than one template function to the one instantiate()
static function.
Another advantage of this approach is that we can add logic to include or exclude particular combinations of types. For the sake of example, requiring that the two types are different, using an if constexpr
statement:
static void instantiate() {
using valid = std::variant<double,float,int>;
valid x, y;
std::visit([]<typename T, typename U>(T x, U y) {
if constexpr (!std::is_same_v<T,U>) {
test(x, y);
}
}, x, y);
}
This time we can inspect the result:
g++ -std=c++20 -c instantiate.cpp
nm -C instantiate.o | grep test
and see that only combinations where T
and U
are different are instantiated:
0000000000000000 W void test<double, float>(double const&, float const&)
0000000000000000 W void test<double, int>(double const&, int const&)
0000000000000000 W void test<float, double>(float const&, double const&)
0000000000000000 W void test<float, int>(float const&, int const&)
0000000000000000 W void test<int, double>(int const&, double const&)
0000000000000000 W void test<int, float>(int const&, float const&)
A downside is that the object file also contains a bunch of other instantiations related to std::variant
and std::visit
, although this seems minor.
Here’s the full code for further exploration:
#include <variant> template<class T, class U> void test(T x, U y) { // } /* template void test(double, double); template void test(double, float); template void test(double, int); template void test(float, double); template void test(float, float); template void test(float, int); template void test(int, double); template void test(int, float); template void test(int, int); */ /* #define BINARY(f) \ BINARY_FIRST(f, double) \ BINARY_FIRST(f, float) \ BINARY_FIRST(f, int) #define BINARY_FIRST(f, T) \ BINARY_SECOND(f, T, double) \ BINARY_SECOND(f, T, float) \ BINARY_SECOND(f, T, int) #define BINARY_SECOND(f, T, U) \ template void f(T, U); BINARY(test) */ static void instantiate() { using valid = std::variant<double,float,int>; valid x, y; std::visit([]<typename T, typename U>(T x, U y) { if constexpr (!std::is_same_v<T,U>) { test(x, y); } }, x, y); }
Update 16 August 2023
This technique may not work with certain compiler optimizations. See the next post for how to fix that.