Revisited: Combinatorial instantiation of C++ templates with std::variant

Compiler optimizations can break it, function attributes can fix it.

Lawrence Murray on 16 August 2023

The previous post suggested a way of using std::variant and std::visit to enumerate all desired instantiations of a C++ template, rather than using laborious explicit instantiations or macros. Unfortunately, it becomes more complicated once compiler optimizations are involved, as they may conspire to remove or render invisible the implicit instantiations of templates in an object file, making them unavailable for linking later.

To get around the problem, use function attributes to guide the compiler.

The original example code is below. The idea is that the dummy instantiate() function uses std::variant and std::visit to enumerate all the instantiations that we desire. The advantages are reduced code size, a reduced chance of errors (did we miss an instantiation?), and the ability to use compile-time conditionals (think if constexpr) to eliminate combinations that we do not want.

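#include <type_traits>  // std::is_same_v
#include <variant>

template<class T, class U>
void test(T x, U y) {
  //
}

// Dummy function: exists only so that the compiler instantiates test() for
// every pair of distinct types among the variant's alternatives.
static void instantiate() {
  std::variant<double,float,int> x, y;
  std::visit([]<typename T, typename U>(T x, U y) {
    if constexpr (!std::is_same_v<T,U>) {
      test(x, y);
    }
  }, x, y);
}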

We can compile the example code and inspect the resulting object file to verify that the desired instantiations are present:

g++ -std=c++20 -c instantiate.cpp
nm -C instantiate.o | grep test

giving:

0000000000000000 W void test<double, float>(double, float)
0000000000000000 W void test<double, int>(double, int)
0000000000000000 W void test<float, double>(float, double)
0000000000000000 W void test<float, int>(float, int)
0000000000000000 W void test<int, double>(int, double)
0000000000000000 W void test<int, float>(int, float)
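It may help to see why instantiate() never needs to be called. The sketch below is not from the original example: probe(), type_name() and dispatch() are hypothetical names, and probe() stands in for test() so that the result can be observed. At run time std::visit reaches only the one combination matching the active alternatives, yet the compiler must instantiate the visitor for all nine pairs in order to build the dispatch, and that instantiation is what produces the six symbols above.

```cpp
#include <string>
#include <type_traits>
#include <variant>

using Var = std::variant<double,float,int>;

// Hypothetical helper: spells out the name of one of the three alternatives.
template<class V>
constexpr const char* type_name() {
  if constexpr (std::is_same_v<V,double>) {
    return "double";
  } else if constexpr (std::is_same_v<V,float>) {
    return "float";
  } else {
    return "int";
  }
}

// Hypothetical stand-in for test(): reports which instantiation was reached.
template<class T, class U>
std::string probe(T, U) {
  return std::string(type_name<T>()) + "," + type_name<U>();
}

// Same visitor shape as instantiate(), but taking the variants as arguments.
std::string dispatch(const Var& x, const Var& y) {
  return std::visit([]<typename T, typename U>(T a, U b) -> std::string {
    if constexpr (!std::is_same_v<T,U>) {
      return probe(a, b);
    } else {
      return "skipped";  // same-type pairs are eliminated at compile time
    }
  }, x, y);
}
```

Calling dispatch(Var{1.0}, Var{2}) returns "double,int", while dispatch(Var{}, Var{}) returns "skipped", because default-constructed variants both hold a double. The same is true of the x and y inside instantiate(): even if it were called, it would do nothing.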

However, when compiler optimizations are enabled, the entire instantiate() function may be removed as unused, along with the implicit instantiations; or, the implicit instantiations may be inlined or hidden. If this occurs the resulting object file will not contain the symbols for later linking. We can see this if we compile with optimizations enabled:

g++ -std=c++20 -O3 -c instantiate.cpp
nm -C instantiate.o | grep test

giving: nothing!

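The fix is to use function attributes. Some of these are compiler specific. With Clang, it is sufficient to add the attributes used, retain and noinline to both instantiate() and test(). With GCC the same attributes are supported, but it is better to replace noinline with noipa to cover more scenarios. It may be that not all attributes are necessary in all situations, but this seems like a reliable set. They achieve the following:

used: code is emitted for the function even if it appears to be unreferenced.

retain: the function is protected from linker garbage collection.

noinline: the function is not considered for inlining.

noipa (GCC): interprocedural optimizations between the function and its callers are disabled; this implies noinline.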

Be careful if you have multiple declarations or definitions of your function templates: the attributes must be applied to all of them for consistent results.

If we wrap these attributes up in some macros, we have updated example code:

#include <type_traits>  // std::is_same_v
#include <variant>

// Prefer noipa where the compiler supports it (GCC), else fall back to noinline.
#if __has_attribute(noipa)
#define KEEP __attribute__((used,retain,noipa))
#else
#define KEEP __attribute__((used,retain,noinline))
#endif

template<class T, class U>
KEEP void test(T x, U y) {
  //
}

KEEP static void instantiate() {
  std::variant<double,float,int> x, y;
  std::visit([]<typename T, typename U>(T x, U y) {
    if constexpr (!std::is_same_v<T,U>) {
      test(x, y);
    }
  }, x, y);
}

Here we’ve used the older and non-standard __attribute__((...)) syntax for wider compiler support. There is a newer [[...]] syntax, see e.g. cppreference.com.
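For reference, a sketch of the same example using the [[...]] syntax might look as follows. It assumes a compiler that understands the gnu:: attribute namespace, as recent GCC and Clang do, and that the scoped name is accepted by __has_cpp_attribute. KEEP_STD is a hypothetical macro name, and this variant has not been checked against the nm output shown in this post.

```cpp
#include <type_traits>
#include <variant>

// KEEP_STD is a hypothetical name for the [[...]] counterpart of KEEP.
#if __has_cpp_attribute(gnu::noipa)
#define KEEP_STD [[gnu::used, gnu::retain, gnu::noipa]]
#else
#define KEEP_STD [[gnu::used, gnu::retain, gnu::noinline]]
#endif

template<class T, class U>
KEEP_STD void test(T x, U y) {
  //
}

KEEP_STD static void instantiate() {
  std::variant<double,float,int> x, y;
  std::visit([]<typename T, typename U>(T x, U y) {
    if constexpr (!std::is_same_v<T,U>) {
      test(x, y);
    }
  }, x, y);
}
```

Note that attributes in this syntax must come at the very start of the declaration (after the template header, before static), which is where the macro is already placed.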

We can see that this works, even with optimization enabled:

g++ -std=c++20 -O3 -c instantiate.cpp
nm -C instantiate.o | grep test

giving what we expect:

00000000 W void test<double, float>(double, float)
00000000 W void test<double, int>(double, int)
00000000 W void test<float, double>(float, double)
00000000 W void test<float, int>(float, int)
00000000 W void test<int, double>(int, double)
00000000 W void test<int, float>(int, float)