
Move, simply

 4 years ago
source link: https://herbsutter.com/2020/02/17/move-simply/
C++ “move” semantics are simple, but they are still widely misunderstood. This post is an attempt to shed light on that situation.

Thank you to the following for their feedback on drafts of this material: Howard Hinnant (lead designer and author of move semantics), Jens Maurer, Arthur O’Dwyer, Geoffrey Romer, Bjarne Stroustrup, Andrew Sutton, Ville Voutilainen, Jonathan Wakely.

Move

What “move” is, and how to use it

In C++, copying or moving from an object a to an object b sets b to a’s original value. The only difference is that copying from a won’t change a, but moving from a might.

To pass a named object a as an argument to a && “move” parameter (rvalue reference parameter), write std::move(a). That’s pretty much the only time you should write std::move, because C++ already uses move automatically when copying from an object it knows will never be used again, such as a temporary object or a local variable being returned or thrown from a function.

That’s it.
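As a minimal sketch of the advice above (the Inbox class and the function names are hypothetical, invented only for illustration), std::move appears only where a named object is passed to a && parameter. The temporary and the returned local are moved without any help:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink type: add() takes a && "move" parameter.
class Inbox {
    std::vector<std::string> messages_;
public:
    // 'msg' is itself a named object here, so passing it on to
    // push_back(std::string&&) needs std::move as well.
    void add(std::string&& msg) { messages_.push_back(std::move(msg)); }
    std::size_t size() const { return messages_.size(); }
    const std::string& last() const { return messages_.back(); }
};

// Returning a local variable: no std::move needed.
std::string make_greeting(const std::string& name) {
    std::string result = "Hello, " + name;
    return result;
}

Inbox fill_inbox() {
    Inbox inbox;
    std::string note = "a named object";
    inbox.add(std::move(note));         // named object -> && parameter: write std::move
    inbox.add(make_greeting("world"));  // temporary: moved automatically
    return inbox;                       // local being returned: moved automatically
}
```

Calling fill_inbox() yields an Inbox holding two messages, the last of which is "Hello, world".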

Advanced notes for type authors

Copying is a const operation on a, so copy construction/assignment functions should always take their parameter by const&. Move is a noexcept non-const operation on a, so move construction/assignment functions should always be noexcept and take their parameter by (non-const) &&.
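Here is a sketch of those signatures on a hypothetical Samples type (the name and its member are assumptions for illustration). The static_asserts simply check at compile time that the move functions really are noexcept:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical type used only to show the conventional signatures.
class Samples {
    std::vector<double> data_;
public:
    explicit Samples(std::size_t n) : data_(n, 0.0) {}

    // Copy: a const operation on the source, so the parameter is const&.
    Samples(const Samples& other) : data_(other.data_) {}
    Samples& operator=(const Samples& other) { data_ = other.data_; return *this; }

    // Move: a noexcept, non-const operation on the source, so the parameter is &&.
    Samples(Samples&& other) noexcept : data_(std::move(other.data_)) {}
    Samples& operator=(Samples&& other) noexcept {
        data_ = std::move(other.data_);
        return *this;
    }

    std::size_t size() const { return data_.size(); }
};

static_assert(std::is_copy_constructible<Samples>::value, "copyable via const&");
static_assert(std::is_nothrow_move_constructible<Samples>::value, "move ctor is noexcept");
static_assert(std::is_nothrow_move_assignable<Samples>::value, "move assignment is noexcept");
```

In real code a type like this would normally just rely on the implicitly generated functions; they are spelled out here only to make the signatures visible.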

For copyable types, move is always an optimization of copy, so only explicitly write move functions for the type if copying is expensive enough to be worth optimizing. Otherwise, you’ll either get the implicitly generated move functions, or else requests to move will automatically just do a copy instead, since copy is always a valid implementation of move (it just doesn’t exercise the non-const option).
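To illustrate the fallback, here is a sketch with a hypothetical Label type that declares only copy functions. Declaring them suppresses the implicit move functions, so a request to move quietly selects the copy constructor:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical copy-only type: user-declared copy functions mean no
// implicit move functions are generated.
struct Label {
    std::string text;
    explicit Label(const std::string& t) : text(t) {}
    Label(const Label&) = default;
    Label& operator=(const Label&) = default;
};

// Returns true if "moving" from a Label left the source unchanged.
bool move_request_copies() {
    Label a("original");
    Label b(std::move(a));  // binds to Label(const Label&), so this is a copy
    return a.text == "original" && b.text == "original";
}
```

The call site does not need to know or care which of the two happened.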

For types that are move-only (not copyable), move is C++’s closest current approximation to expressing an object that can be cheaply moved around to different memory addresses, by making at least its value cheap to move around. (Proposals to go further in this direction include ones with names like “relocatable” and “destructive move,” but those aren’t standard yet, so it’s premature to talk about them.) These types are used to express objects that have unique values or uniquely own a resource.
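A small sketch of a move-only type follows. Connection is a hypothetical name, and the int it owns merely stands in for a real resource. Holding a unique_ptr member is enough to make the class movable but not copyable:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical type that uniquely owns a resource.
class Connection {
    std::unique_ptr<int> handle_;  // stand-in for the owned resource
public:
    explicit Connection(int id) : handle_(std::make_unique<int>(id)) {}

    // Both functions are safe to call in every state, including after a move.
    bool is_open() const { return handle_ != nullptr; }
    int id() const { return handle_ ? *handle_ : -1; }
};

// The unique_ptr member deletes copy and provides a noexcept move.
static_assert(!std::is_copy_constructible<Connection>::value, "not copyable");
static_assert(std::is_nothrow_move_constructible<Connection>::value, "cheap noexcept move");
```

After Connection b(std::move(a)), ownership has transferred to b, and a reports is_open() == false rather than misbehaving.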

Appendix: Q&A

Wait, that seems oversimplified… for example, doesn’t C++ let me write copy and move functions in ways not mentioned above, like write a copy constructor that takes by non-const reference or a move constructor that can throw?

Yes, but don’t. Such things are legal but not good; ask auto_ptr (now removed), or vector implementations that used dynamic sentinel nodes (now being removed).

How can moving from an object not change its state?

For example, moving an int doesn’t change the source’s value because an int is cheap to copy, so move just does the same thing as copy. Copy is always a valid implementation of move if the type didn’t provide anything more efficient.
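The int case in a few lines (the function name is only for illustration):

```cpp
#include <cassert>
#include <utility>

// Returns the value of the source after it has been "moved from".
int moved_int_source() {
    int a = 42;
    int b = std::move(a);  // for int, move does the same thing as copy
    (void)b;               // b is only here to receive the value
    return a;              // the source still holds 42
}
```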

Can a given type document that moving from an object always changes its state, or changes it to a known state?

Yes, move is just another non-const function. Any non-const function can document when and how it changes the object’s state, including to specify a known new state as a postcondition if it wants. For example, unique_ptr’s .release() function is guaranteed to set the object to null, just as its move functions are guaranteed to set the source object to null.
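Both unique_ptr postconditions can be observed directly. The helper function names below are hypothetical:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Move construction: the source is guaranteed to be null afterwards.
bool source_is_null_after_move() {
    std::unique_ptr<int> a = std::make_unique<int>(1);
    std::unique_ptr<int> b = std::move(a);
    return a == nullptr && b != nullptr && *b == 1;
}

// release(): the same documented postcondition on the object itself.
bool source_is_null_after_release() {
    std::unique_ptr<int> a = std::make_unique<int>(2);
    int* raw = a.release();
    std::unique_ptr<int> owner(raw);  // take ownership again so nothing leaks
    return a == nullptr && *owner == 2;
}
```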

I wrote std::move(a) but a’s value didn’t change. Why?

Because moving from an object a can modify its value, but doesn’t have to. This is the same as any other non-const operation on a.

There are other secondary reasons, but they’re all just special cases of the fundamental reason above, which applies regardless of whether std::move is just a “move it if you can/want” cast, or whether a move or a copy function is actually called.
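One such secondary case, sketched with a hypothetical function name: when the source is const, std::move(a) produces a const rvalue that cannot bind to std::string’s move constructor, so the copy constructor is called and a is untouched.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Returns the source's value after a "move" from a const object.
std::string move_from_const() {
    const std::string a = "unchanged";
    std::string b = std::move(a);  // const std::string&& selects the copy constructor
    (void)b;                       // b is only here to receive the value
    return a;                      // a could not have been modified
}
```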

But what about the “moved-from” state, isn’t it special somehow?

No. The state of a after it has been moved from is the same as the state of a after any other non-const operation. Move is just another non-const function that might (or might not) change the value of the source object.

I heard that a moved-from object is in a state where “you can call its functions that have no preconditions,” or is in a “valid but unspecified state,” is that right?

Yes, both are saying the same thing as above: the object continues to be a valid object of its type, and its value might or might not have been modified. The standard library specifies this guarantee for all standard types, and all well-behaved types should do the same.

Note that this is the same state as in the following example that’s familiar to C++ programmers of all experience levels:

void f( /* and optionally const */ Thing& thing ) {  // no preconditions
    // here 'thing' is a valid object of its type
    // (aka "in a valid but unspecified state")

    // ... naturally you’ll want to know its value, so now just ask it,
    //     easy peasy, just use the object ...
}

This is not a mysterious state. It’s the ordinary state any object is in when you first encounter it.
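As a sketch of “just use the object” (the function name is hypothetical), a moved-from std::string can be queried, cleared, and given a new value like any other string. The only thing the code avoids is assuming what the value is:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

std::string reuse_after_move() {
    std::vector<std::string> sink;
    std::string s = "some text";
    sink.push_back(std::move(s));

    // 's' is still a valid std::string; we simply don't know its value.
    // Functions without preconditions are fine: size(), empty(), clear(), +=, ...
    std::size_t n = s.size();
    assert(s.empty() == (n == 0));  // its invariants still hold

    s.clear();      // now the value is known
    s += "fresh";
    return s;
}
```

Functions that do have preconditions, such as front() requiring a non-empty string, need the same check here that they would need on any string of unknown value.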

Does “but unspecified” mean the object’s invariants might not hold?

No. In C++, an object is valid (meets its invariants) for its entire lifetime, which is from the end of its construction to the start of its destruction. Moving from an object does not end its lifetime, only destruction does, so moving from an object does not make it invalid or not obey its invariants.

If any non- const function on an object (including moving from it) makes the object invalid, the function has a bug.

Does “but unspecified” mean the only safe operation on a moved-from object is to call its destructor?

No.

Does “but unspecified” mean the only safe operation on a moved-from object is to call its destructor or to assign it a new value?

No.

Does “but unspecified” sound scary or confusing to average programmers?

It shouldn’t. It’s just a reminder that the value might have changed, that’s all. It isn’t intended to make “moved-from” seem mysterious (it’s not).

What about objects that aren’t safe to be used normally after being moved from?

They are buggy. Here’s a recent example:

// Buggy class: Move leaves behind a null smart pointer

class IndirectInt {
    shared_ptr<int> sp = make_shared<int>(42);
public:
    // ... more functions, but using defaulted move functions
    bool operator<(const IndirectInt& rhs) const { return *sp < *rhs.sp; }
                                                // oops: unconditional deref
    // ...
};

IndirectInt i[2];
i[0] = move(i[1]); // move leaves i[1].sp == nullptr
sort(begin(i), end(i)); // undefined behavior

This is simply a buggy movable type: The default compiler-generated move can leave behind a null sp member, but operator< unconditionally dereferences sp without checking for null. There are two possibilities:

  • If operator< is right and sp is supposed to never be null, then the class has a bug in its move functions and needs to fix that by suppressing or overriding the defaulted move functions.
  • Otherwise, if the move operation is right and sp is supposed to be nullable, then operator< has a bug and needs to fix it by checking for null before dereferencing.

Either way, the class has a bug: the move functions and operator< can’t both be right, so one has to be fixed. It’s that simple.
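If the second possibility is the intended design, a sketch of the fix might look like the following. NullableIndirectInt, its int constructor, and has_value() are hypothetical additions, and ordering null before every non-null value is just one reasonable choice that keeps operator< a strict weak ordering:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <memory>
#include <utility>

// Hypothetical variant in which a null sp is a legitimate, documented state.
class NullableIndirectInt {
    std::shared_ptr<int> sp = std::make_shared<int>(42);
public:
    NullableIndirectInt() = default;
    explicit NullableIndirectInt(int v) : sp(std::make_shared<int>(v)) {}

    bool has_value() const { return sp != nullptr; }

    // Null objects order before every non-null object; two nulls are equivalent.
    bool operator<(const NullableIndirectInt& rhs) const {
        if (!rhs.sp) return false;  // nothing is less than null
        if (!sp) return true;       // null is less than any non-null
        return *sp < *rhs.sp;
    }
};

// Sorts an array that contains a moved-from element; no undefined behavior.
bool sort_with_moved_from_element() {
    NullableIndirectInt i[3] = { NullableIndirectInt(3),
                                 NullableIndirectInt(1),
                                 NullableIndirectInt(2) };
    NullableIndirectInt taken = std::move(i[1]);  // leaves i[1].sp == nullptr
    std::sort(std::begin(i), std::end(i));
    return taken.has_value() && !i[0].has_value() &&
           i[1].has_value() && i[2].has_value() && i[1] < i[2];
}
```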

Assuming the invariant is intended to be that sp is not null, the ideal way to fix the bug is to directly express the design intent so that the class is correct by construction. Since the problem is that we are not expressing the “not null” invariant, we should express that by construction. One way is to make the pointer member a gsl::not_null<> (see for example the Microsoft GSL implementation), which is copyable but not movable or default-constructible. Then the class is both correct by construction and simple to write:

// Corrected class: Declare intent, naturally get only copy and not move

class IndirectInt {
    not_null<shared_ptr<int>> sp = make_shared<int>(42);
public:
    // ... more functions, but NOT using defaulted move functions
    //     which are automatically suppressed
    bool operator<(const IndirectInt& rhs) const { return *sp < *rhs.sp; }  // ok
    // ...
};

IndirectInt i[2];
i[0] = move(i[1]); // performs a copy
sort(begin(i), end(i)); // ok, no undefined behavior
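Without GSL, roughly the same effect can be sketched by suppressing the defaulted move functions by hand. CopyOnlyIndirectInt, its int constructor, and value() are hypothetical names for illustration. Declaring the copy functions means no implicit move functions are generated, so sp is never left null:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <memory>
#include <utility>

// Hypothetical variant: move is suppressed, so requests to move perform a copy.
class CopyOnlyIndirectInt {
    std::shared_ptr<int> sp = std::make_shared<int>(42);  // invariant: never null
public:
    CopyOnlyIndirectInt() = default;
    explicit CopyOnlyIndirectInt(int v) : sp(std::make_shared<int>(v)) {}

    // User-declared copy functions suppress the implicit move functions.
    CopyOnlyIndirectInt(const CopyOnlyIndirectInt&) = default;
    CopyOnlyIndirectInt& operator=(const CopyOnlyIndirectInt&) = default;

    int value() const { return *sp; }
    bool operator<(const CopyOnlyIndirectInt& rhs) const { return *sp < *rhs.sp; }  // ok
};

bool sort_after_move_request() {
    CopyOnlyIndirectInt i[3] = { CopyOnlyIndirectInt(3),
                                 CopyOnlyIndirectInt(1),
                                 CopyOnlyIndirectInt(2) };
    i[0] = std::move(i[1]);                 // performs a copy; i[1].sp stays non-null
    std::sort(std::begin(i), std::end(i));  // ok, no undefined behavior
    return i[0].value() == 1 && i[1].value() == 1 && i[2].value() == 2;
}
```

Unlike not_null, this version relies on every constructor remembering to establish the invariant, so it states the intent less directly.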

There’s one more question before we leave this example…

But what about a third option, that the class intends (and documents) that you just shouldn’t call operator< on a moved-from object… that’s a hard-to-use class, but that doesn’t necessarily make it a buggy class, does it?

Yes, in my view it does make it a buggy class that shouldn’t pass code review. The fundamental point is that “moved-from” really is just another ordinary state that can be encountered anytime, and so the suggested strategy would mean every user would have to test every object they ever encounter before they compare it… which is madness.

But let’s try it out: In this most generous view of IndirectInt, let’s say that the class tries to boldly document to its heroic users that they must never try to compare moved-from objects. That’s not enough, because users won’t always know if a given object they encounter is moved-from. For example:

void f(const IndirectInt& a, const IndirectInt& b) {
    if (a < b)  // this would be a bug without first testing (somehow) that a and b both aren't moved-from
       // ...
}

Worse, it can be viral: For example, if we compose this type in a class X { Y y; IndirectInt value; Z z; /* ... */ }; and then make a vector<X> and use standard algorithms on it, some X objects’ value members can contain null pointers if an exception is thrown, so there would have to be a way to test whether each object of such a composed type can be compared.

So the only documentable advice would be to require users of IndirectInt, and by default of every other type that composes an IndirectInt, to always test an object for a null data member in some way before trying to compare it. I view that as an unreasonable burden on users of this type, nearly impossible to use correctly in practice, and something that shouldn’t pass code review.

Note that even floating point types, which are notoriously hard to use because of their NaN and signed-zero mysteries, are generally not this hard to use: With IEEE 754 non-signaling relational comparison, they support comparing any floating point values without having to first test at every call site whether comparison can be called. (With IEEE 754 signaling relational comparison, they’re as hard to use as IndirectInt. See your C++ implementation’s documentation for which kind of floating point comparison it supports.)
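A sketch of the non-signaling case follows. It assumes an implementation where double follows IEEE 754, which the static_assert checks, and it assumes the code is not compiled with options that relax NaN handling, such as fast-math modes. The function name is hypothetical:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559, "sketch assumes IEEE 754 doubles");

// Every comparison below can be called without first testing for NaN.
bool nan_comparisons_are_callable() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double one = 1.0;
    const bool lt = nan < one;   // false
    const bool gt = nan > one;   // false
    const bool eq = nan == nan;  // false
    return !lt && !gt && !eq && std::isnan(nan);
}
```

The answers may be surprising, but asking the question is always allowed, which is exactly what the unconditional dereference in IndirectInt fails to offer.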

Does the “moved-from” state correspond to the “partially formed but not well formed” described in Elements of Programming (aka EoP)?

Not quite.

In EoP, the description of an object’s state as “partially formed but not well formed” is similar to the C++ Standard’s description of “valid but unspecified.” The difference is that EoP requires such objects to be assignable and destroyable (i.e., partially formed) while the C++ standard makes a broader statement that “operations on the object behave as specified for its type” and that a moved-from object “must still meet the requirements of the library component that is using it.” (See Cpp17MoveConstructible and Cpp17MoveAssignable.)

