Enum reflection in C++ with template metaprogramming

source link: https://taylorconor.com/blog/enum-reflection/

Conor Taylor — 30th December 2019

Enums in C++ lack a number of features that are native to other languages. Consider one small feature as a case study: retrieving the total number of elements in an enum. In Java, calling values() on the enum type and reading its length does the job, for example Fruit.values().length. C# has Enum.GetNames(typeof(Fruit)).Length. Both of these incur a runtime cost, but at least they exist. The most common workaround suggested for C++ is to manually append a COUNT member to the enum:

enum Fruit {
  APPLE,
  BANANA,
  ORANGE,
  COUNT, // COUNT == 3.
};

This won't work for enums with explicit initialisers, and in general leaves a lot to be desired, especially since this information is known at compile time.
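To make that limitation concrete, here is a minimal sketch using a hypothetical SparseFruit enum (not part of the original example) with explicit initialisers. The trailing COUNT member simply takes the previous value plus one, so it no longer matches the number of elements:

```cpp
// Hypothetical enum for illustration: explicit initialisers break the
// trailing-COUNT trick.
enum SparseFruit {
  APPLE = 1,
  BANANA = 10,
  COUNT,  // One more than BANANA, so COUNT == 11 rather than 2.
};

static_assert(COUNT == 11,
              "COUNT follows the last initialiser, not the element count");
```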

Wouldn't it be nice to build a compile-time computable enum element count function, as a small taster of reflective template metaprogramming in C++? Spoiler alert: it can be done, although the journey is more interesting than the destination. Let's see how to implement it!

C++ template metaprogramming overview

Template metaprogramming in C++ is the use of the template system to perform Turing-complete computation at compile time. It can be quite cumbersome to write code in this way: all values are immutable and must be known at compile time, and recursion is typically used in place of iteration.

The hello world of template metaprogramming is calculating factorials at compile time. Consider:

#include <iostream>

// Recursive template metaprogramming Factorial calculation.
template <int N>
struct Factorial {
  static const int value = N * Factorial<N - 1>::value;
};

// Specialization of Factorial for base case = 1.
template <>
struct Factorial<1> {
  static const int value = 1;
};

int main() {
  std::cout << Factorial<5>::value << std::endl;
}

No runtime code is generated to calculate the factorial here. The program is completely analogous to std::cout << 120 << std::endl;.
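One way to check that claim without reading assembly is a static_assert, which only accepts constant expressions. The sketch below repeats the Factorial template so that it stands alone, and adds a constexpr function variant for comparison. FactorialFn is a hypothetical name, not part of the original post, and it relies on C++14 relaxed constexpr rules rather than template recursion:

```cpp
// Recursive template from the article, repeated so this sketch stands alone.
template <int N>
struct Factorial {
  static const int value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<1> {
  static const int value = 1;
};

// Hypothetical C++14 alternative: an ordinary loop in a constexpr function.
constexpr int FactorialFn(int n) {
  int result = 1;
  for (int i = 2; i <= n; ++i) {
    result *= i;
  }
  return result;
}

// static_assert only accepts constant expressions, so these lines compile
// only if both values are computed by the compiler.
static_assert(Factorial<5>::value == 120, "template version");
static_assert(FactorialFn(5) == 120, "constexpr function version");
```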

More information is available on the Wikibooks entry on C++ template metaprogramming.

Implementing reflection with template metaprogramming

C++ doesn't have any native reflection like Java does, but its template system is powerful enough that reflection can be added in specific cases, such as this particular enum use-case. The first step is to figure out whether a given integer is a valid enum value. C++ does not enforce this: the compiler accepts a cast from any value of the underlying type to the enum, whether or not that value names an enumerator:

enum Fruit {
  BANANA = 5,
};

int main() {
  Fruit valid = Fruit::BANANA;
  Fruit invalid = (Fruit)10; // Compiles OK.
}
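One caveat is worth a short sketch. For an enum without a fixed underlying type, converting a value outside the enum's range of values is undefined behaviour, and some newer compilers may reject such a cast when it appears in a constant expression. Giving the enum a fixed underlying type makes every int a well-defined Fruit value. The ": int" below is an assumption added for this sketch, since the original Fruit has none:

```cpp
// Assumption for this sketch: a fixed underlying type (": int"), so that any
// int converts to Fruit with well-defined behaviour.
enum Fruit : int {
  BANANA = 5,
};

// 10 names no enumerator, yet the conversion is accepted and the value is
// preserved when converted back to int.
constexpr Fruit kValid = static_cast<Fruit>(5);
constexpr Fruit kInvalid = static_cast<Fruit>(10);

static_assert(kValid == BANANA, "5 is the BANANA enumerator");
static_assert(static_cast<int>(kInvalid) == 10, "10 round-trips unchanged");
```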

One hacky way to reflectively access some enum properties is to use the __PRETTY_FUNCTION__ identifier, which is a GCC and Clang compiler extension (MSVC offers the similar __FUNCSIG__ macro instead). This identifier holds the human-readable, pretty-printed signature of the function it is used in, including template parameter details when used within a function template. For enum-type template parameters, the string includes the enum type name and, when the template argument is a valid enum value, the name of that value. This makes it possible to differentiate between valid and invalid values at compile time! Consider the following:

#include <iostream>

enum Fruit {
  BANANA = 5,
};

template <typename E, E V> constexpr void func() {
  std::cout << __PRETTY_FUNCTION__ << std::endl;
}

int main() {
  func<Fruit, (Fruit)5>();
  func<Fruit, (Fruit)10>();
}

Here, typename E is explicitly specified as the enum type Fruit, and E V is a non-type template parameter of type Fruit. This compiles without warning (Apple clang version 11.0.0) and outputs the following when run:

void func() [E = Fruit, V = Fruit::BANANA]
void func() [E = Fruit, V = 10]

Although func() is marked constexpr, it writes to std::cout and so cannot itself run at compile time. The string behind __PRETTY_FUNCTION__ is a compile-time constant, however. So the enum name Fruit, the value name BANANA, and the validity of the enum values 5 and 10 (based on whether __PRETTY_FUNCTION__ shows a name or an integer for the enum value template parameter) are all available at compile time here! Reflection!
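To see that the string really is usable at compile time, here is a minimal sketch that searches it inside a static_assert. PrettyName and Contains are hypothetical helpers, not part of the original post, and Fruit is given a fixed underlying type (an assumption) so that casting 10 to it is well defined. The sketch assumes a reasonably recent GCC or Clang:

```cpp
// Assumption: a fixed underlying type, so that Fruit(10) is a well-defined
// value.
enum Fruit : int {
  BANANA = 5,
};

// Hypothetical helper: returns the compiler-generated signature string.
// __PRETTY_FUNCTION__ is a GCC/Clang extension, not standard C++.
template <typename E, E V>
constexpr const char* PrettyName() {
  return __PRETTY_FUNCTION__;
}

// Hypothetical helper: compile-time substring search over C strings.
constexpr bool Contains(const char* text, const char* word) {
  for (int start = 0; text[start] != '\0'; ++start) {
    int k = 0;
    while (word[k] != '\0' && text[start + k] == word[k]) {
      ++k;
    }
    if (word[k] == '\0') {
      return true;  // Every character of `word` matched.
    }
  }
  return false;
}

// The enumerator name appears only when the value is a valid enumerator.
static_assert(Contains(PrettyName<Fruit, static_cast<Fruit>(5)>(), "BANANA"),
              "valid value is printed by name");
static_assert(!Contains(PrettyName<Fruit, static_cast<Fruit>(10)>(), "BANANA"),
              "invalid value is not printed by name");
```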

This isn't very useful right now, but we can use it to calculate the total number of values in a given enum (with an underlying integer type).

Reflectively finding the total number of values in an enum

The big trick here is to enumerate all candidate values for a given enum, and then count how many of them are valid. First step: a function to determine whether a given enum value is valid. This function can then be called once for each candidate value of the enum's underlying type:

template <typename E, E V> constexpr bool IsValid() {
  // When compiled with clang, `name` will contain a prettified function name,
  // including the enum value name for `V` if valid. For example:
  // "bool IsValid() [E = Fruit, V = Fruit::BANANA]" for valid enum values, or:
  // "bool IsValid() [E = Fruit, V = 10]" for invalid enum values.
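  auto name = __PRETTY_FUNCTION__;
  // Find the end of the string by hand, since strlen() is not constexpr in
  // standard C++.
  int i = 0;
  while (name[i] != '\0') {
    ++i;
  }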
  // Find the final space character in the pretty name.
  for (; i >= 0; --i) {
    if (name[i] == ' ') {
      break;
    }
  }
  // The character after the final space will indicate if
  // it's a valid value or not.
  char c = name[i + 1];
  if (c >= '0' && c <= '9') {
    return false;
  }
  return true;
}

int main() {
  bool x = IsValid<Fruit, (Fruit)5>();  // true
  bool y = IsValid<Fruit, (Fruit)10>(); // false
}

Yes, the compiler will even iterate through strings for us at compile time and compare character values!
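The exact text of __PRETTY_FUNCTION__ is not standardised. GCC and more recent Clang releases may print an invalid value as a cast expression such as (Fruit)10 rather than a bare 10, and negative values start with a minus sign. A more tolerant check could look like the sketch below. IsValidPortable is a hypothetical name, the fixed underlying type on Fruit is an assumption, and the sketch further assumes the enum type name contains no spaces:

```cpp
// Assumption: a fixed underlying type, so that casting 10 to Fruit is well
// defined.
enum Fruit : int {
  BANANA = 5,
};

// Hypothetical variant of IsValid() that tolerates several spellings of an
// invalid value: "10", "-1", or a cast expression such as "(Fruit)10".
template <typename E, E V>
constexpr bool IsValidPortable() {
  const char* name = __PRETTY_FUNCTION__;
  // Walk to the terminating null character, remembering the last space seen.
  int last_space = -1;
  for (int i = 0; name[i] != '\0'; ++i) {
    if (name[i] == ' ') {
      last_space = i;
    }
  }
  // The character after the final space starts the printed value of V.
  char c = name[last_space + 1];
  bool looks_numeric = (c >= '0' && c <= '9') || c == '-' || c == '(';
  return !looks_numeric;
}

static_assert(IsValidPortable<Fruit, static_cast<Fruit>(5)>(),
              "5 is BANANA");
static_assert(!IsValidPortable<Fruit, static_cast<Fruit>(10)>(),
              "10 names no enumerator");
```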

Next, we need a way to call IsValid() once for each value in a list. Recursion using parameter packs to the rescue:

template <typename E> constexpr int CountValid() {
  return 0;
}

template <typename E, E A, E... B> constexpr int CountValid() {
  bool is_valid = IsValid<E, A>();
  return CountValid<E, B...>() + (int)is_valid;
}

Here, IsValid() is called once for each enum value passed as a template argument after E. Writing the template parameters as <typename E, E A, E... B> peels the first value (A) off the argument list on each call to CountValid() and leaves the rest in the parameter pack B, so IsValid() ends up being called once per value. Since a parameter pack accepts zero or more template arguments, the recursion eventually reaches a call with no values left after E. The version of CountValid() that takes only typename E handles that case and returns 0 to terminate the recursion.
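The same head-and-tail recursion can be tried in isolation, without any enum machinery. The sketch below uses a hypothetical CountPositive function (not part of the original post) that counts the positive values in a pack of integers, with the same two-overload structure as CountValid():

```cpp
// Base case: no values left after T, so there is nothing to count.
template <typename T>
constexpr int CountPositive() {
  return 0;
}

// Recursive case: inspect the first value A, then recurse on the rest B...
template <typename T, T A, T... B>
constexpr int CountPositive() {
  return CountPositive<T, B...>() + (A > 0 ? 1 : 0);
}

static_assert(CountPositive<int>() == 0, "empty pack hits the base case");
static_assert(CountPositive<int, 3, -1, 0, 7>() == 2, "3 and 7 are positive");
```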

Now all that's left is to generate arguments to fill the parameter pack. If we create a parameter pack containing every candidate value of the underlying enum type, we can call IsValid() for each of those values and count the valid ones to determine the number of values in the enum. To generate the arguments, we can use std::integer_sequence. It gives us a compile-time list of integers from which a parameter pack of integers can be deduced. That pack can then be cast into a pack of the specified enum type, which is exactly what we need to pass to CountValid(). The std::integer_sequence argument of InternalElementCount is not used as a traditional function parameter. Rather, its type (which encodes a sequence of ints) is used to deduce the parameter pack int... I:

#include <utility>

template <typename E, int... I> constexpr int InternalElementCount(std::integer_sequence<int, I...> unused) {
  return CountValid<E, (E)I...>();
}

template <typename E> struct ElementCount {
  static const int value = InternalElementCount<E>(std::make_integer_sequence<int, 100>());
};

Now, when we access ElementCount<Fruit>::value, an integer sequence consisting of the values 0 to 99 is created at compile time for InternalElementCount. Although InternalElementCount does not use this parameter, passing it forces the compiler to deduce I from the type of the std::integer_sequence. These values are then cast to type E (which in this case is the enum Fruit) and provided as template arguments to the CountValid() function we wrote earlier. Enumerators with values outside this range are not counted, so the bound has to be chosen to cover the enum.
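The deduction step can also be looked at on its own. In the sketch below, PackSize and PackLast are hypothetical helpers (not part of the original post) that take an unnamed std::integer_sequence parameter purely so that the compiler deduces the pack I from its type. They confirm that std::make_integer_sequence<int, 100> yields 100 values, the last of which is 99:

```cpp
#include <utility>

// The parameter is unnamed: only its type matters, because the compiler
// deduces the pack I... from std::integer_sequence<int, I...>.
template <int... I>
constexpr int PackSize(std::integer_sequence<int, I...>) {
  return static_cast<int>(sizeof...(I));
}

// Assumes a non-empty pack.
template <int... I>
constexpr int PackLast(std::integer_sequence<int, I...>) {
  // Expand the pack into an array and read its final element.
  const int values[] = {I...};
  return values[sizeof...(I) - 1];
}

static_assert(PackSize(std::make_integer_sequence<int, 100>()) == 100,
              "the sequence holds 100 values");
static_assert(PackLast(std::make_integer_sequence<int, 100>()) == 99,
              "the values run from 0 to 99");
```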

To prove this is a compile-time operation, we can write a short program:

int main() {
  __asm__("# test begin");
  int count = ElementCount<Fruit>::value;
  __asm__("# test end");
}

Which produces the following assembly:

## test begin
movl    $3, -4(%rbp)
## test end

The enum size value 3 (for a Fruit enum with three elements, such as APPLE, BANANA and ORANGE) is computed at compile time and inserted right into the program binary!
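A static_assert is another way to make the same point, since it fails to compile unless its condition is a constant expression. The sketch below assembles the pieces into one file under a few assumptions that are not in the original post: Fruit has exactly three enumerators and a fixed underlying type, and the validity check also treats '-' and '(' as signs of an invalid value, to cope with compilers that print invalid values differently:

```cpp
#include <utility>

// Assumption: a three-element Fruit with a fixed underlying type.
enum Fruit : int {
  APPLE,
  BANANA,
  ORANGE,
};

// Returns true when the compiler prints V by name rather than as a number
// or a cast expression. Relies on the GCC/Clang __PRETTY_FUNCTION__ extension.
template <typename E, E V>
constexpr bool IsValid() {
  const char* name = __PRETTY_FUNCTION__;
  int last_space = -1;
  for (int i = 0; name[i] != '\0'; ++i) {
    if (name[i] == ' ') {
      last_space = i;
    }
  }
  char c = name[last_space + 1];
  return !((c >= '0' && c <= '9') || c == '-' || c == '(');
}

// Base case: no candidate values left.
template <typename E>
constexpr int CountValid() {
  return 0;
}

// Recursive case: test the first candidate, then recurse on the rest.
template <typename E, E A, E... B>
constexpr int CountValid() {
  return CountValid<E, B...>() + (IsValid<E, A>() ? 1 : 0);
}

// The unnamed parameter exists only so that I... can be deduced.
template <typename E, int... I>
constexpr int InternalElementCount(std::integer_sequence<int, I...>) {
  return CountValid<E, static_cast<E>(I)...>();
}

// Counts the enumerators of E whose values lie in the range 0 to 99.
template <typename E>
struct ElementCount {
  static constexpr int value =
      InternalElementCount<E>(std::make_integer_sequence<int, 100>());
};

static_assert(ElementCount<Fruit>::value == 3,
              "Fruit has three enumerators between 0 and 99");
```

Because the candidate range is fixed at 0 to 99 here, an enum with negative or larger values would need a different range.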

This post was heavily influenced by Daniil Goncharov's amazing magic_enum project. Check it out!

