Post-Conditions on Self-Move

UPDATE April 8, 2016 This post has been edited since publication to reflect my evolving understanding. As a result of the issues raised in this post, it’s possible that the committee decides to strengthen the post-conditions on move, so the recommendations made here may evolve further. Stay tuned.

TL;DR: In addition to the usual rule about move operations leaving the source object in a valid but unspecified state, we can add an additional rule:

Self-move assignment should “work” and at the very least leave the object in a valid but unspecified state.

Discussion

What do you think the following code should do?

X x = {/*something*/};
x = std::move(x);

Yes, it’s dumb, but with our alias-happy language, it can happen. So what does the standard say about this? For that we turn to [res.on.arguments]/p1.3 taken from the library introduction (emphasis mine):

If a function argument binds to an rvalue reference parameter, the implementation may assume that this parameter is a unique reference to this argument. […] If a program casts an lvalue to an xvalue while passing that lvalue to a library function (e.g. by calling the function with the argument std::move(x)), the program is effectively asking that function to treat that lvalue as a temporary. The implementation is free to optimize away aliasing checks which might be needed if the argument were an lvalue.

(I fixed the grammar mistake because I am a Huge Dork.) The above seems to say that std::swap(x, x) is playing with fire, because std::swap is implemented as follows:

template <class T>
void swap(T& a, T& b) {
    auto x(std::move(a));
    a = std::move(b); // Here be dragons
    b = std::move(x);
}

If a and b refer to the same object, the second line of std::swap does a self-move assign. Blamo! Undefined behavior, right?
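To make that self-move assignment visible, here is a minimal sketch. Probe and naive_swap are hypothetical names invented for this illustration; naive_swap simply repeats the three-move swap shown above rather than relying on any particular standard library's std::swap.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts how often move assignment sees this == &that.
struct Probe {
    int value = 0;
    int self_move_assignments = 0;

    Probe() = default;
    explicit Probe(int v) : value(v) {}
    Probe(Probe&& that) noexcept : value(that.value) {}

    Probe& operator=(Probe&& that) noexcept {
        if (this == &that) {
            ++self_move_assignments;  // record the self-move, leave the value alone
        } else {
            value = that.value;
        }
        return *this;
    }
};

// Same three steps as the swap shown in the text (an illustration, not a
// copy of any library's source).
template <class T>
void naive_swap(T& a, T& b) {
    auto x(std::move(a));
    a = std::move(b);  // self-move assignment when a and b alias
    b = std::move(x);
}

// Returns how many self-move assignments a self-swap triggers on Probe.
inline int self_moves_in_self_swap() {
    Probe p(42);
    naive_swap(p, p);
    return p.self_move_assignments;
}
```

Calling self_moves_in_self_swap() reports one self-move assignment, coming from the middle line of the swap; swapping two distinct objects reports none.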

Undefined behavior is what I assumed when I first wrote this post, until Howard Hinnant drew my attention to the requirements table for the MoveAssignable concept, which says that for the expression t = rv (emphasis mine):

If t and rv do not refer to the same object, t is equivalent to the value of rv before the assignment […] rv’s state is unspecified. [ Note: rv must still meet the requirements of the library component that is using it, whether or not t and rv refer to the same object. […] –end note]

Ah, ha! So here we have it. After a self-move, the object is required to be in a valid-but-unspecified state.

My attention was drawn to this issue during a code review of a change I wanted to make to Folly's Function class template. I wanted to change this:

Function& operator=(Function&& that) noexcept {
    if (this != &that) {
        // do the move
    }
    return *this;
}

to this:

Function& operator=(Function&& that) noexcept {
    assert(this != &that);
    // do the move
    return *this;
}

The reason: let’s make moves as fast as possible and take advantage of the fact that Self-Moves Shouldn’t Happen. We assert, fix up the places that get it wrong, and make our programs an iota faster. Right?

Not so fast, said one clued-in reviewer. Self-swaps can happen quite easily in generic algorithms, and they shouldn’t trash the state of the object or the state of the program. This rang true, and so began my investigation.
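As one illustration of the reviewer's point, consider a textbook selection sort that swaps unconditionally. The function below is a hypothetical sketch written for this post, not code from Folly or from the review; it counts how many of its swaps pass the same element as both arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Textbook selection sort with an unconditional swap. Returns how many of
// its swaps had both arguments referring to the same element.
inline std::size_t selection_sort_counting_self_swaps(std::vector<int>& v) {
    std::size_t self_swaps = 0;
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        // Find the smallest element in v[i..end).
        std::size_t min_idx = i;
        for (std::size_t j = i + 1; j < v.size(); ++j) {
            if (v[j] < v[min_idx]) min_idx = j;
        }
        // When v[i] is already the minimum, this is swap(x, x).
        if (&v[i] == &v[min_idx]) ++self_swaps;
        std::swap(v[i], v[min_idx]);
    }
    return self_swaps;
}
```

On input that is already sorted, every swap the algorithm performs is a self-swap. With int elements that is harmless; with a type whose move assignment asserts against aliasing, it would not be.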

A few Google searches turned up this StackOverflow gem from Howard Hinnant. C++ wonks know Howard Hinnant. He’s the author of libc++, and an old-time C++ library developer. (Remember Metrowerks CodeWarrior? No? Get off my lawn.) He also happens to be the person who wrote the proposal to add rvalue references to the language, so you know, Howard’s given this some thought. First Howard says this:

Some will argue that swap(x, x) is a good idea, or just a necessary evil. And this, if the swap goes to the default swap, can cause a self-move-assignment.

I disagree that swap(x, x) is ever a good idea. If found in my own code, I will consider it a performance bug and fix it.

But then in an Update, he backtracks:

I’ve given this issue some more thought, and changed my position somewhat. I now believe that assignment should be tolerant of self assignment, but that the post conditions on copy assignment and move assignment are different:

For copy assignment:

x = y;

one should have a post-condition that the value of y should not be altered. When &x == &y then this postcondition translates into: self copy assignment should have no impact on the value of x.

For move assignment:

x = std::move(y);

one should have a post-condition that y has a valid but unspecified state. When &x == &y then this postcondition translates into: x has a valid but unspecified state. I.e. self move assignment does not have to be a no-op. But it should not crash. This post-condition is consistent with allowing swap(x, x) to just work […]
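The copy half of that distinction can be sketched with a small owning type. IntArray is a hypothetical class invented for this illustration; its copy assignment copies first and then swaps, so it needs no aliasing check and still leaves the value untouched when both sides are the same object.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning type, used only to illustrate the copy-assignment
// post-condition described above.
class IntArray {
public:
    explicit IntArray(std::size_t n, int fill = 0)
        : size_(n), data_(n ? new int[n] : nullptr) {
        std::fill(data_, data_ + size_, fill);
    }
    IntArray(const IntArray& other)
        : size_(other.size_),
          data_(other.size_ ? new int[other.size_] : nullptr) {
        std::copy(other.data_, other.data_ + size_, data_);
    }
    ~IntArray() { delete[] data_; }

    // Copy first, then swap guts with the copy. No "this != &other" test is
    // needed: for x = x the copy holds x's value, so x ends up unchanged.
    IntArray& operator=(const IntArray& other) {
        IntArray tmp(other);
        std::swap(size_, tmp.size_);
        std::swap(data_, tmp.data_);
        return *this;  // tmp's destructor releases the old storage
    }

    std::size_t size() const { return size_; }
    int operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t size_;
    int* data_;
};
```

Self copy assignment here costs an allocation but is a no-op in terms of value, which is the post-condition for copies. Move assignment is held to the weaker standard: the object only has to remain valid.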

When Howard Hinnant changes his mind about something having to do with library design, I sit up and take note, because it means that something very deep and subtle is going on. In this case, it means I’ve been writing bad move assignment operators for years.

By Howard’s yardstick — and by the requirements for the MoveAssignable concept in the standard, thanks Howard! — this move assignment operator is wrong:

Function& operator=(Function&& that) noexcept {
    assert(this != &that); // No! Bad C++ programmer!
    // do the move
    return *this;
}

Move assignment operators should accept self-moves and do no evil; indeed, for std::swap(f, f) to work, they must.

That’s not the same as saying it needs to preserve the object’s value, though, and not preserving the object’s value can be a performance win. It can save a branch, for instance. Here is how I reformulated folly::Function’s move assignment operator[*]:

Function& operator=(Function&& that) noexcept {
    clear_(); // Free all of the resources owned by *this
    moveFrom_(that); // Move that's guts into *this.
    return *this;
}

[*] Well, not exactly, but that’s the gist.

Of note is that clear_() leaves *this in a state such that it is still OK to moveFrom_(*this), which is what happens when that and *this are the same object. In the case of Function, it just so happens that the effect of this code is to put the Function object back into the default-constructed state, obliterating the previous value. The particular final state of the object isn’t important though, so long as it is still valid.
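Here is a minimal, self-contained sketch of the same pattern on a toy type. Buffer is hypothetical; its clear_() and moveFrom_() borrow the names from the snippet above but are my own guesses at a reasonable shape, not Folly's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning type illustrating "clear, then move from".
class Buffer {
public:
    Buffer() = default;
    explicit Buffer(std::size_t n)
        : data_(n ? new int[n]() : nullptr), size_(n) {}
    Buffer(Buffer&& that) noexcept { moveFrom_(that); }
    ~Buffer() { clear_(); }

    // No aliasing check and no branch. On a self-move, clear_() empties
    // *this and moveFrom_() then moves an empty object into itself.
    Buffer& operator=(Buffer&& that) noexcept {
        clear_();
        moveFrom_(that);
        return *this;
    }

    std::size_t size() const { return size_; }
    bool empty() const { return data_ == nullptr; }

private:
    // Release owned storage and return to the default-constructed state.
    void clear_() noexcept {
        delete[] data_;
        data_ = nullptr;
        size_ = 0;
    }
    // Steal that's storage; assumes *this currently owns nothing.
    void moveFrom_(Buffer& that) noexcept {
        data_ = that.data_;
        size_ = that.size_;
        that.data_ = nullptr;
        that.size_ = 0;
    }

    int* data_ = nullptr;
    std::size_t size_ = 0;
};
```

After a self-move, a Buffer of this shape ends up empty rather than keeping its old contents, but it can still be assigned to and destroyed safely, which is all that valid but unspecified asks for.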

Summary

So, as always we have the rule about moves:

Move operations should leave the source object in a valid but unspecified state.

And to that we can add an additional rule:

Self-moves should do no evil and leave the object in a valid but unspecified state.

If you want to go further and leave the object unmodified, that’s not wrong per se, but it’s not required by the standard as it is today. Changing the value is perfectly OK (Howard and the standard say so!), and doing that might save you some cycles.
