Implementing C++20 atomic waiting in libstdc++

Implementing C++20 atomic waiting in libstdc++ Skip to main content

The C++ standard library gained some new concurrency features with C++20:

Wait and notify operations on std::atomic<T>
Semaphores
Latches
Barriers

In this article, I will cover the current implementation approach for atomic wait/notify, as these are basis operations required to implement the remaining coordination primitives introduced with C++20. Subsequent articles in this series will cover the details of the other types.

Note: The implementation presented here is considered experimental and the details will almost certainly change in a future version of GCC as we commit to an ABI for these features.

Let's start by taking a look at what the C++ standard says about atomic waiting:

1 Atomic waiting operations and atomic notifying operations provide a mechanism to wait for the value of an atomic object to change more efficiently than can be achieved with polling. An atomic waiting operation may block until it is unblocked by an atomic notifying operation, according to each function's effects.

[Note 1 : Programs are not guaranteed to observe transient atomic values, an issue known as the A-B-A problem, resulting in continued blocking if a condition is only temporarily met. — end note]

2 [Note 2 : The following functions are atomic waiting operations:

(2.1) — atomic<T>::wait,

(2.2) — atomic_flag::wait,

(2.3) — atomic_wait and atomic_wait_explicit

(2.4) — atomic_flag_wait and atomic_flag_wait_explicit, and

(2.5) — atomic_ref::wait. — end note]

3 [Note 3 : The following functions are atomic notifying operations:

(3.1) — atomic<T>::notify_one and atomic<T>::notify_all,

(3.2) — atomic_flag::notify_one and atomic_flag::notify_all,

(3.3) — atomic_notify_one and atomic_notify_all,

(3.4) — atomic_flag_notify_one and atomic_flag_notify_all, and

(3.5) — atomic_ref<T>::notify_one and atomic_ref<T>::notify_all. — end note]

4 A call to an atomic waiting operation on an atomic object M is eligible to be unblocked by a call to an atomic notifying operation on M if there exist side effects X and Y on M such that:

(4.1) — the atomic waiting operation has blocked after observing the result of X,

(4.2) — X precedes Y in the modification order of M, and

(4.3) — Y happens before the call to the atomic notifying operation.

How can we implement atomic waiting?

The only universal strategy for implementing atomic waiting is to spin in an atomic load-compare loop. This isn't particularly efficient if the waiter is likely to block for some extended period, but it is advantageous in terms of application responsiveness in many cases to do a bit of spinning before calling into a more expensive operating system level primitive.

libstdc++ implements its spin logic as follows:

If supported, this is how we let the CPU or kernel know we are able to yield or relax our spinning:

inline void
__thread_yield() noexcept
{
#if defined _GLIBCXX_HAS_GTHREADS && defined _GLIBCXX_USE_SCHED_YIELD
  __gthread_yield();
#endif
}

inline void
__thread_relax() noexcept
{
#if defined __i386__ || defined __x86_64__
  __builtin_ia32_pause();
#else
  __thread_yield();
#endif
}

How can we implement atomic waiting?

A more efficient wait

How to handle those types that do not fit in a __platform_wait_t

Making notify cheaper

What about platforms that don't provide some low-level efficient wait?

Putting together the pieces for a wait primitive

Putting together the pieces for a notify primitive

Handling the various type and platform choices

Next time

Recommend

About Joyk