std::any: How, when, and why

 5 years ago
source link: https://www.tuicool.com/articles/hit/YFZZv2b

This post is part of a regular series of posts where the C++ product team here at Microsoft and other guests answer questions we have received from customers. The questions can be about anything C++ related: MSVC toolset, the standard language and library, the C++ standards committee, isocpp.org, CppCon, etc. Today’s post is by Casey Carter.

C++17 adds several new “vocabulary types” – types intended to be used in the interfaces between components from different sources – to the standard library. MSVC has been shipping implementations of std::optional, std::any, and std::variant since the Visual Studio 2017 release, but we haven’t provided any guidelines on how and when these vocabulary types should be used. This article on std::any is the second of a series that examines each of the vocabulary types in turn.

Storing arbitrary user data

Say you’re creating a calendar component that you intend to distribute in a library for use by other programmers. You want your calendar to be usable for solving a wide array of problems, so you decide you need a mechanism to associate arbitrary client data with days/weeks/months/years. How do you best implement this extensibility design requirement?

A C programmer might add a void* to each appropriate data structure:

struct day { 
  // ...things... 
  void* user_data; 
}; 

struct month { 
  std::vector<day> days; 
  void* user_data; 
};

and suggest that clients hang whatever data they like from it. This solution has a few immediately apparent shortcomings:

  • You can always cast a void* to a Foo*, whether or not the object it points at is actually a Foo. The lack of type information for the associated data means that the library can’t provide even a basic level of type safety by guaranteeing that later accesses to stored data use the same type as was stored originally:
    some_day.user_data = new std::string{"Hello, World!"}; 
    // …much later 
    Foo* some_foo = static_cast<Foo*>(some_day.user_data); 
    some_foo->frobnicate(); // BOOM!
  • void* doesn’t manage lifetime like a smart pointer would, so clients must manage the lifetime of the associated data manually. Mistakes result in memory leaks:
    // Cast back to the stored type first: delete on a void* is undefined.
    delete static_cast<std::string*>(some_day.user_data); 
    some_day.user_data = nullptr; 
    some_month.days.clear(); // Oops: hopefully none of these days had 
                             // non-null user_data
  • The library cannot copy the object that a void* points at since it doesn’t know that object’s type. For example, if your library provides facilities to copy annotations from one week to another, clients must copy the associated data manually. As was the case with manual lifetime management, mistakes are likely to result in dangling pointers, double frees, or leaks:
    some_month.days[0] = some_month.days[1]; 
    if (some_month.days[1].user_data) { 
      // I'm storing strings in user_data, and don't want them shared 
      // between days. Copy manually: 
      std::string const& src = 
          *static_cast<std::string const*>(some_month.days[1].user_data); 
      some_month.days[0].user_data = new std::string(src); 
    }

The C++ Standard Library provides us with at least one tool that can help: shared_ptr<void>. Replacing the void* with shared_ptr<void> solves the problem of lifetime management:

struct day {
  // ...things...
  std::shared_ptr<void> user_data;
};

struct month {
  std::vector<day> days;
  std::shared_ptr<void> user_data;
};

since shared_ptr squirrels away enough type info to know how to properly destroy the object it points at. A client could create a shared_ptr<Foo>, and the deleter would continue to work just fine after converting to shared_ptr<void> for storage in the calendar:

some_day.user_data = std::make_shared<std::string>("Hello, world!");
// ...much later...
some_day = some_other_day; // the object at which some_day.user_data _was_
                           // pointing is freed automatically

This solution may help solve the copyability problem as well, if the client is happy to have multiple days/weeks/etc. hold copies of the same shared_ptr<void> – denoting a single object – rather than independent values. shared_ptr doesn’t help with the primary problem of type-safety, however. Just as with void*, shared_ptr<void> provides no help tracking the proper type for associated data. Using a shared_ptr instead of a void* also makes it impossible for clients to “hack the system” to avoid memory allocation by reinterpreting integral values as void* and storing them directly; using shared_ptr forces us to allocate memory even for tiny objects like int.
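The lifetime and sharing behavior described above can be sketched as follows. This is only an illustration: the `tracked` payload type and the `shared_ptr_void_demo` function are hypothetical names invented for this example, and `day` is reduced to the single member that matters here.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type that counts how often its destructor runs.
struct tracked {
  static inline int destroyed = 0;
  ~tracked() { ++destroyed; }
};

// Reduced version of the article's day struct.
struct day {
  std::shared_ptr<void> user_data;
};

// Returns true if copying a day shares one payload and that payload is
// destroyed exactly once, after the last owner lets go of it.
inline bool shared_ptr_void_demo() {
  tracked::destroyed = 0;

  day a;
  a.user_data = std::make_shared<tracked>();  // shared_ptr<tracked> converts to shared_ptr<void>
  day b = a;                                  // copies the pointer, not the payload

  const bool shared = a.user_data.get() == b.user_data.get() &&
                      a.user_data.use_count() == 2;

  a.user_data.reset();                        // b still owns the payload
  const bool still_alive = tracked::destroyed == 0;

  b.user_data.reset();                        // last owner gone: ~tracked() runs
  return shared && still_alive && tracked::destroyed == 1;
}
```

Calling `shared_ptr_void_demo()` exercises both points from the text: the deleter survives the conversion to `shared_ptr<void>`, and copies of a `day` refer to one shared object rather than to independent values.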

Not just any solution will do

std::any is the smarter void*/shared_ptr<void>. You can initialize an any with a value of any copyable type:

std::any a0; 
std::any a1 = 42; 
std::any a2 = month{"October"}; // assumes month can be constructed from a name

Like shared_ptr, any remembers how to destroy the contained value for you when the any object is destroyed. Unlike shared_ptr, any also remembers how to copy the contained value and does so when the any object is copied:

std::any a3 = a0; // Copies the empty any from the previous snippet
std::any a4 = a1; // Copies the "int"-containing any
a4 = a0;          // copy assignment works, and properly destroys the old value
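Applied to the calendar scenario, value-copying means that two days never share user data by accident. The sketch below assumes a hypothetical `annotated_day` struct that holds a `std::any` where the earlier examples held a `void*`; the names are illustrative and not part of any library.

```cpp
#include <any>
#include <cassert>
#include <string>

// Hypothetical calendar entry that stores client data in a std::any.
struct annotated_day {
  std::any user_data;
};

// Returns true if a copied day owns an independent copy of its user data.
inline bool any_copy_demo() {
  annotated_day monday;
  monday.user_data = std::string{"dentist"};

  annotated_day tuesday = monday;  // copies the contained std::string as well

  // Modify only tuesday's copy through a reference obtained with any_cast.
  std::any_cast<std::string&>(tuesday.user_data) += " (rescheduled)";

  return std::any_cast<const std::string&>(monday.user_data) == "dentist" &&
         std::any_cast<const std::string&>(tuesday.user_data) ==
             "dentist (rescheduled)";
}
```

Contrast this with the `shared_ptr<void>` approach, where the same copy would leave both days pointing at a single string.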

Unlike shared_ptr, any knows what type it contains:

assert(!a0.has_value());            // a0 is still empty
assert(a1.type() == typeid(int));
assert(a2.type() == typeid(month));
assert(a4.type() == typeid(void));  // type() returns typeid(void) when empty

and uses that knowledge to ensure that when you access the contained value – for example, by obtaining a reference with any_cast – you access it with the correct type:

assert(std::any_cast<int&>(a1) == 42);             // succeeds
std::string str = std::any_cast<std::string&>(a1); // throws bad_any_cast since
                                                   // a1 holds int, not string
assert(std::any_cast<month&>(a2).days.size() == 0);
std::any_cast<month&>(a2).days.push_back(some_day);
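When a mismatch is a possibility rather than a bug, the exception can be caught and turned into an ordinary return value. The helper below is a hypothetical sketch (the name `string_or_nothing` is invented for this example) that uses the value form of `any_cast`, which returns a copy of the contained object.

```cpp
#include <any>
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: returns a copy of the contained string, or
// std::nullopt when the any is empty or holds some other type.
inline std::optional<std::string> string_or_nothing(const std::any& a) {
  try {
    return std::any_cast<std::string>(a);  // value form: returns a copy
  } catch (const std::bad_any_cast&) {
    return std::nullopt;                   // wrong type, or no value at all
  }
}
```

One detail worth keeping in mind: the type must match exactly. An `any` initialized from a string literal holds a `const char*`, not a `std::string`, so `string_or_nothing(std::any{"text"})` yields `std::nullopt`.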

If you want to avoid exceptions in a particular code sequence and you are uncertain what type an any contains, you can perform a combined type query and access with the pointer overload of any_cast:

if (auto ptr = std::any_cast<int>(&a1)) {
  assert(*ptr == 42); // runs since a1 contains an int, and succeeds
}
if (auto ptr = std::any_cast<std::string>(&a1)) {
  assert(false);      // never runs: any_cast returns nullptr since
                      // a1 doesn't contain a string
}
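The pointer overload chains naturally when client data may be one of several types. The following is a minimal sketch with a hypothetical `describe` function; the set of types it probes is an arbitrary choice for illustration.

```cpp
#include <any>
#include <cassert>
#include <string>

// Hypothetical helper: renders the contents of an any without ever
// throwing bad_any_cast, by probing candidate types one at a time.
inline std::string describe(const std::any& a) {
  if (!a.has_value()) {
    return "empty";
  }
  if (const int* i = std::any_cast<int>(&a)) {
    return "int: " + std::to_string(*i);
  }
  if (const std::string* s = std::any_cast<std::string>(&a)) {
    return "string: " + *s;
  }
  return "unknown type";  // holds a value, but not one of the probed types
}
```

A chain like this that grows long is usually a hint that the set of types is actually known in advance, in which case one of the other vocabulary types may be a better fit.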

The C++ Standard encourages implementations to store small objects with non-throwing move constructors directly in the storage of the any object, avoiding the costs of dynamic allocation. This feature is best-effort: there is no threshold below which any is portably guaranteed not to allocate. In practice, the Visual C++ implementation uses a larger any that avoids allocation for object types with non-throwing moves up to a handful of pointers in size, whereas libc++ and libstdc++ allocate for objects that are two or more pointers in size (see https://godbolt.org/z/RQd_w5).
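One way to observe this on a given implementation is to check whether the contained object's address falls inside the any object itself. The probe below is a rough sketch, not a portable guarantee: `stored_inline` and `big` are hypothetical names, and the result for small types such as `int` depends entirely on the standard library in use.

```cpp
#include <any>
#include <cassert>
#include <cstdint>

// Hypothetical probe: reports whether the value held by `a` lives inside
// the bytes of the any object rather than in a separate allocation.
template <class T>
bool stored_inline(const std::any& a) {
  const T* value = std::any_cast<T>(&a);
  if (!value) {
    return false;  // empty, or holds a different type
  }
  const auto begin = reinterpret_cast<std::uintptr_t>(&a);
  const auto addr = reinterpret_cast<std::uintptr_t>(value);
  return addr >= begin && addr < begin + sizeof(std::any);
}

// Hypothetical payload that is larger than the any object itself,
// so no implementation can store it inline.
struct big {
  unsigned char bytes[sizeof(std::any) + 1];
};
```

With these definitions, `stored_inline<big>(std::any{big{}})` is always false, while `stored_inline<int>(std::any{42})` reports whatever the implementation chooses to do for small objects.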

How to select a vocabulary type (aka “What if you know the type(s) to be stored?”)

If you have knowledge about the type(s) being stored – beyond the fact that the types being stored must be copyable – then std::any is probably not the proper tool: its flexibility has performance costs. If there is exactly one such type T, you should reach for std::optional. If the types to store will always be function objects with a particular signature – callbacks, for example – you want std::function. If you only need to store types from some set fixed at compile time, std::variant is a good choice; but let’s not get ahead of ourselves – that will be the next article.
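The selection rules above can be put side by side in one struct. This is only a sketch: the `entry` type and all of its member names are hypothetical, chosen to show one member per vocabulary type.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <variant>

// Hypothetical calendar entry; every member name is illustrative.
struct entry {
  // Exactly one possible type, which may be absent: std::optional.
  std::optional<std::string> note;

  // Any callable with a particular signature: std::function.
  std::function<void(int)> on_reminder;

  // One of a set of types fixed at compile time: std::variant.
  std::variant<int, std::string> room;

  // Nothing known beyond copyability: std::any.
  std::any user_data;
};
```

Only `user_data` pays for full type erasure of an unknown type; the other three members let the compiler check the stored type, or at least the call signature, at compile time.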

Conclusions

When you need to store an object of an arbitrary type, pull std::any out of your toolbox. Be aware that there are probably more appropriate tools available when you do know something about the type to be stored.

If you have any questions (Get it? “any” questions?), please feel free to post in the comments below. You can also send any comments and suggestions directly to the author via e-mail at  [email protected] , or Twitter  @CoderCasey . Thank you!
