No more plain old data

source link: https://mariusbancila.ro/blog/2020/08/10/no-more-plain-old-data/

Posted on August 10, 2020 by Marius Bancila

When working in C++, you often hear about POD types (which stands for Plain Old Data). PODs are useful for communicating with code written in other programming languages (such as C or .NET languages). They can also be copied using memcpy (which is important because this is a fast, low-level function that provides performance benefits), and have other characteristics that are key for some scenarios. However, the new C++20 standard has deprecated the concept of POD types in favor of two more refined categories: trivial types and standard-layout types. In this post, I will discuss what these categories are and when to use each of them instead of POD.

Let’s start with a simple example:

struct A
{
   int    a;
   double b;
};

struct B
{
private:
   int a;
public:
   double b;
};

struct C
{
   int    a;
   double b;
   C(int const x, double const y) :a{ x }, b{ y } {}
};

The question is, which of these is a POD type? To answer the question, we can use the type traits available in the standard library since C++11:

| Type trait (since C++11) | Variable template (since C++17) | Description |
| --- | --- | --- |
| std::is_pod<T> | std::is_pod_v<T> | If T is a POD type, then the constant member value is true; otherwise it is false |
| std::is_trivial<T> | std::is_trivial_v<T> | If T is a trivial type, then the constant member value is true; otherwise it is false |
| std::is_standard_layout<T> | std::is_standard_layout_v<T> | If T is a standard-layout type, then the constant member value is true; otherwise it is false |

Using these type traits gives the following answer:

| Type | Trivial | Standard layout | POD |
| --- | --- | --- | --- |
| A | ✔ | ✔ | ✔ |
| B | ✔ | ❌ | ❌ |
| C | ❌ | ✔ | ❌ |
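The same answers can be checked at compile time. The following is a minimal sketch, assuming a C++17 compiler; it repeats the definitions of A, B, and C so that it stands on its own, and it leaves out std::is_pod_v because that trait is deprecated in C++20.

```cpp
#include <type_traits>

struct A
{
   int    a;
   double b;
};

struct B
{
private:
   int a;
public:
   double b;
};

struct C
{
   int    a;
   double b;
   C(int const x, double const y) : a{ x }, b{ y } {}
};

// A: trivial and standard-layout (and therefore POD)
static_assert(std::is_trivial_v<A>);
static_assert(std::is_standard_layout_v<A>);

// B: trivial, but the mixed access control breaks standard layout
static_assert(std::is_trivial_v<B>);
static_assert(!std::is_standard_layout_v<B>);

// C: standard-layout, but it has no default constructor, so it is not trivial
static_assert(!std::is_trivial_v<C>);
static_assert(std::is_standard_layout_v<C>);
```

If any of these assertions were wrong, the translation unit would fail to compile, which makes this kind of check cheap to keep next to the type definitions.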

We can see from this table that B is trivial, C is standard-layout, and A is trivial, standard-layout, and POD. And this leads us to the definition of a POD type:

A POD type is a type that is both trivial and standard-layout. This definition must hold recursively for all its non-static data members.

Or, in standardese:

A POD class is a class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD class (or array thereof). A POD type is a scalar type, a POD class, an array of such a type, or a cv-qualified version of one of these types.

This definition refers to scalar types, so for completeness, a scalar type is any of the following:

  • an arithmetic type
  • an enumeration type
  • a pointer type
  • a pointer-to-member type
  • the std::nullptr_t type
  • cv-qualified versions of the above types
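Each item in this list can be confirmed with the std::is_scalar type trait. This is a small sketch, assuming C++17; the names Color and S are hypothetical and only serve as an enumeration and a class to point into.

```cpp
#include <cstddef>
#include <type_traits>

enum class Color { red, green };   // hypothetical enumeration
struct S { int value; };           // hypothetical class

static_assert(std::is_scalar_v<int>);                 // arithmetic type
static_assert(std::is_scalar_v<double>);              // arithmetic type
static_assert(std::is_scalar_v<Color>);               // enumeration type
static_assert(std::is_scalar_v<S*>);                  // pointer type
static_assert(std::is_scalar_v<int S::*>);            // pointer-to-member type
static_assert(std::is_scalar_v<std::nullptr_t>);      // std::nullptr_t
static_assert(std::is_scalar_v<int const volatile>);  // cv-qualified version

// Classes and references are not scalar types.
static_assert(!std::is_scalar_v<S>);
static_assert(!std::is_scalar_v<int&>);
```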

A POD type cannot have non-static data members that are not themselves POD types. However, there are no requirements on static data members or on member functions. Therefore, the type A1 shown here is still a POD type, although it has member functions and non-POD static members.

struct A1
{
   int    a;
   double b;
   static std::string s;
   int get() const { return a; }
};
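To see that the static member and the member function do not affect the classification, the check can be written as follows. This is a sketch, assuming C++17; the initial value given to A1::s is arbitrary and only there to provide the required definition of the static member.

```cpp
#include <string>
#include <type_traits>

struct A1
{
   int    a;
   double b;
   static std::string s;            // non-POD type, but static
   int get() const { return a; }    // non-virtual member function
};

// Static data members live outside the object, so they need a separate definition.
std::string A1::s = "not part of the object";

// Trivial plus standard-layout is what the POD definition asks for.
static_assert(std::is_trivial_v<A1>);
static_assert(std::is_standard_layout_v<A1>);
```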

If a POD type is a trivial type with standard layout, the question is what is trivial and what is standard-layout? Let’s answer one at a time.

Trivial types

A trivial type is a type that is trivially copyable and has one or more default constructors, all of which are either trivial or deleted, and at least one of which is not deleted.

Keep in mind that a class can have multiple default constructors (for instance a constructor with no parameters, and a constructor that supplies default arguments for all its parameters) as long as it is possible to create, without any ambiguity, an instance of the type without explicitly invoking the constructor (can be default-initialized).
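The difference between a trivial and a non-trivial default constructor is easy to miss in practice. The sketch below, assuming C++17, uses three hypothetical types (Defaulted, UserProvided, WithInitializer) to show which declarations keep a type trivial.

```cpp
#include <type_traits>

// Defaulted on its first declaration: the default constructor stays trivial,
// even though the class also has another constructor.
struct Defaulted
{
   int value;
   Defaulted() = default;
   Defaulted(int v) : value(v) {}
};

// User-provided default constructor: the type is no longer trivial,
// although it is still trivially copyable.
struct UserProvided
{
   int value;
   UserProvided() : value(0) {}
};

// A default member initializer also makes the implicit default
// constructor non-trivial.
struct WithInitializer
{
   int value = 0;
};

static_assert(std::is_trivial_v<Defaulted>);

static_assert(!std::is_trivial_v<UserProvided>);
static_assert(std::is_trivially_copyable_v<UserProvided>);

static_assert(!std::is_trivial_v<WithInitializer>);
static_assert(std::is_trivially_copyable_v<WithInitializer>);
```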

A trivially-copyable type is a type that has:

  • only copy-constructors and copy-assignment operators that are either trivial or deleted
  • only move-constructors and move-assignment operators that are either trivial or deleted
  • at least one of these four special member functions is not deleted
  • a trivial non-deleted destructor
  • no virtual functions or virtual base classes
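These rules can be illustrated with std::is_trivially_copyable. The following is a sketch, assuming C++17; all type names in it are hypothetical and each one exercises a single rule from the list.

```cpp
#include <type_traits>

struct Plain { int x; double y; };

// Copy operations deleted, move operations trivial: still trivially copyable,
// because at least one of the four operations is not deleted.
struct MoveOnly
{
   int x;
   MoveOnly() = default;
   MoveOnly(MoveOnly const&) = delete;
   MoveOnly& operator=(MoveOnly const&) = delete;
   MoveOnly(MoveOnly&&) = default;
   MoveOnly& operator=(MoveOnly&&) = default;
};

// User-provided copy constructor: not trivial, so not trivially copyable.
struct CustomCopy
{
   int x;
   CustomCopy() = default;
   CustomCopy(CustomCopy const& other) : x(other.x) {}
};

// User-provided destructor: not trivially copyable.
struct CustomDtor
{
   int x;
   ~CustomDtor() {}
};

// Virtual function: not trivially copyable.
struct WithVirtual
{
   int x;
   virtual void foo() {}
};

static_assert(std::is_trivially_copyable_v<Plain>);
static_assert(std::is_trivially_copyable_v<MoveOnly>);
static_assert(!std::is_trivially_copyable_v<CustomCopy>);
static_assert(!std::is_trivially_copyable_v<CustomDtor>);
static_assert(!std::is_trivially_copyable_v<WithVirtual>);
```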

In this definition, trivial means that the special member function is not user-provided and belongs to a class that:

  • has no virtual functions or virtual base classes
  • has no base classes with a non-trivial constructor/operator/destructor
  • has no data members of a type that has non-trivial constructor/operator/destructor

The specifications for trivial types are available here.

Trivial types have some properties:

  • They occupy a contiguous memory area.
  • There can be padding bytes between members due to alignment requirements.
  • Objects of trivial types can be copied with memcpy.
  • They can be copied to an array of char or unsigned char and then back.
  • They can have members with different access specifiers. However, in this situation the compiler can decide how to order the members.
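The memcpy and byte-array properties can be sketched as a round trip through a raw buffer. This is an illustrative example, assuming C++17; Sample, round_trip, and check_round_trip are hypothetical names. The static_assert checks trivial copyability, which is the part of triviality that memcpy relies on.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct Sample   // hypothetical trivial type
{
   int    id;
   double value;
};

// Copies an object into an array of unsigned char and back again.
template <typename T>
T round_trip(T const& source)
{
   static_assert(std::is_trivially_copyable_v<T>,
                 "memcpy is only valid for trivially copyable types");

   unsigned char buffer[sizeof(T)];
   std::memcpy(buffer, &source, sizeof(T));

   T result;   // default-initialized, then overwritten byte by byte
   std::memcpy(&result, buffer, sizeof(T));
   return result;
}

// Usage sketch: the copy holds the same member values as the original.
inline void check_round_trip()
{
   Sample const original{ 42, 3.5 };
   Sample const copy = round_trip(original);
   assert(copy.id == original.id);
   assert(copy.value == original.value);
}
```

Calling round_trip with a type such as std::string would be rejected at compile time by the static_assert instead of silently corrupting the object.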

However, trivial types cannot be safely used to interop with code written in other programming languages. This is due to the fact that the order of the members is compiler-specific.

The following snippet shows more examples of trivial types (B1, B2, B3, and B4 are all trivial types):

struct B1
{
};

struct B2
{
private:
   int a;
public:
   double b;
   void foo() {}
};

struct B3
{
private:
   int a;
public:
   double b;
   B3(int const x, double const y) :
      a(x), b(y) {}
   B3() = default;
};

struct B4Base
{
   int    a;
   double b;
};

struct B4 : public B4Base
{
private:
   int a;
};

Standard-layout types

In simple words, a standard-layout type is a type that has members with the same access control and does not have virtual functions or virtual base classes, or other features not present in the C language.

Formally defined, a standard-layout type is a type that:

  • has the same access control for all the non-static data members
  • has no non-static data members of reference types
  • has no virtual functions or virtual base classes
  • all the non-static data members and base classes are standard-layout types
  • has no two base class sub-objects of the same type (no diamond problem due to multiple inheritance)
  • has all non-static data members and bit-fields declared in the same class
  • has no base classes of the same type as the first non-static data member

The specifications for standard layout types are available here.

Standard-layout types have some properties, including the following:

  • The memory layout of a standard-layout type is well defined so that it can be used to interop with other programming languages, such as C.
  • Objects of standard-layout types can be memcpy-ed.
  • They facilitate the empty base class optimization. This optimization ensures that base classes with no data members take up no space in the derived object. Such a base subobject also has the same address as the first data member of the derived class (hence the last limitation in the preceding list).
  • The offsetof macro can be used to determine the offset of a data member, in bytes, from the beginning of the object.
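The offsetof and address properties can be sketched as follows, assuming C++17. Empty, Packet, and check_first_member_address are hypothetical names. Only the offset of the first member is asserted exactly; the offset of the second member depends on the platform's alignment rules, so the sketch only checks that it comes after the first.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Empty {};

// Standard-layout: the base is empty and all data members are in one class.
struct Packet : Empty
{
   int    id;
   double value;
};

static_assert(std::is_standard_layout_v<Packet>);

// The first member sits at the very start of the object.
static_assert(offsetof(Packet, id) == 0, "first member starts the object");

// Later members have higher offsets; the exact value depends on padding.
static_assert(offsetof(Packet, value) >= sizeof(int));

// The object and its first member share the same address.
inline void check_first_member_address()
{
   Packet p;
   p.id = 1;
   p.value = 2.0;
   assert(static_cast<void const*>(&p) == static_cast<void const*>(&p.id));
}
```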

Let’s look at some more examples of standard-layout types. In the snippet below, the classes C1, C2, and C3 are all standard-layout.

struct C1 {};

struct C2
{
   double b;
};

struct C3Base
{
   void foo() {}
};
struct C3 : public C3Base
{
   int    a;
   double b;
};

On the other hand, none of the following classes, C4 to C8, are standard-layout.

struct C4Base
{
   int    a;
};

struct C4 : public C4Base
{
   double b;
};

struct C5
{
   int    a;
private:
   virtual void foo() {};
};

struct C6Base {};
struct X : public C6Base {};
struct Y : public C6Base {};
struct C6 : public X, Y {};

struct C7
{
   int    a;
private:
   double b;
};

struct C8 : public C6Base
{
   C6Base b;
   int    a;
};

The reason for this is that:

  • C4 does not have all the non-static data members defined in the same class
  • C5 has virtual functions
  • C6 has two base class sub-objects of the same type (diamond problem)
  • C7 has members with different access control
  • C8 has the first non-static data member of the same type as the base class
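Three of these cases can be confirmed with std::is_standard_layout. The sketch below, assuming C++17, repeats the definitions of C4, C5, and C7 and adds a hypothetical C7Fixed for contrast; C6 and C8 are left out only to keep the example short.

```cpp
#include <type_traits>

struct C4Base
{
   int    a;
};

struct C4 : public C4Base   // data members split between base and derived
{
   double b;
};

struct C5                   // has a virtual function
{
   int    a;
private:
   virtual void foo() {};
};

struct C7                   // data members with different access control
{
   int    a;
private:
   double b;
};

struct C7Fixed              // hypothetical: same members, same access control
{
private:
   int    a;
   double b;
};

static_assert(std::is_standard_layout_v<C4Base>);
static_assert(!std::is_standard_layout_v<C4>);
static_assert(!std::is_standard_layout_v<C5>);
static_assert(!std::is_standard_layout_v<C7>);
static_assert(std::is_standard_layout_v<C7Fixed>);
```

The contrast between C7 and C7Fixed shows that the rule is about all data members sharing one access level, not about the members being public.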

Like C8, the C9 class shown below has a non-static data member of the same type as its base class. However, C9 is standard-layout, because this member is not the first one.

struct C9 : public C6Base
{
   int    a;
   C6Base b;
};

This C9 class is reported as not standard-layout by the VC++ compiler, although Clang and GCC correctly identify it as standard-layout.

A demo is available here.

Deprecated POD

The C++20 standard has deprecated the concept of POD and the type trait std::is_pod<T> (and the variable template std::is_pod_v<T>) because, on one hand, POD is equivalent to trivial and standard-layout, and, on the other hand, in most scenarios, using just trivial or just standard-layout types is enough or desired. The following comment is taken from the ISO committee notes:

The term POD no longer serves a purpose in the standard, it is merely defined, and restrictions apply for when a few other types preserve this vestigial property. The is_pod trait should be deprecated, moving the definition of a POD type alongside the trait in Annex D, and any remaining wording referring to POD should be struck, or revised to clearly state intent (usually triviality) without mentioning PODs.

You can check the following papers:

The key question that arises from this is what should be used instead of POD? The answer is the following:

  • for scenarios where initialization is concerned, use triviality
  • for scenarios where layout and interoperability with other programming languages are concerned, use the standard-layout requirement
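In code that previously guarded itself with std::is_pod, this usually means replacing one check with the narrower one that matches the actual concern. The following is a sketch under that assumption, written for C++17; CPoint and fill_zero are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Layout concern: a hypothetical struct shared with C code.
// Only the standard-layout requirement matters here.
struct CPoint
{
   int x;
   int y;
};

static_assert(std::is_standard_layout_v<CPoint>,
              "CPoint must keep a C-compatible layout");

// Initialization concern: a helper that overwrites an object byte by byte.
// Only triviality matters here.
template <typename T>
void fill_zero(T& object)
{
   static_assert(std::is_trivial_v<T>, "fill_zero requires a trivial type");
   std::memset(&object, 0, sizeof(T));
}
```

A short usage example:

```cpp
CPoint p{ 3, 4 };
fill_zero(p);
assert(p.x == 0 && p.y == 0);
```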
