
Ranges, Coroutines, and React: Early Musings on the Future of Async in C++

source link: https://ericniebler.com/2017/08/17/ranges-coroutines-and-react-early-musings-on-the-future-of-async-in-c/

Disclaimer: these are my early thoughts. None of this is battle ready. You’ve been warned.

Hello, Coroutines!

At the recent C++ Committee meeting in Toronto, the Coroutines TS was forwarded to ISO for publication. That roughly means that the coroutine “feature branch” is finished, and is ready to be merged into trunk (standard C++) after a suitable vetting period (no less than a year). That puts it on target for C++20. What does that mean for idiomatic modern C++?

Lots, actually. With the resumable functions (aka, stackless coroutines) from the Coroutines TS, we can do away with callbacks, event loops, and future chaining (future.then()) in our asynchronous APIs. Instead, our APIs can return “awaitable” types. Programmers can then just use these APIs in a synchronous-looking style, spamming co_await in front of any async API call and returning an awaitable type.

This is a bit abstract, so this blog post makes it more concrete. It describes how the author wrapped the interface of libuv — a C library that provides the asynchronous I/O in Node.js — in awaitables. In libuv, all async APIs take a callback and loop on an internal event loop, invoking the callback when the operation completes. Wrapping the interfaces in awaitables makes for a much better experience, without the callbacks and the inversion of control they bring.

Below, for instance, is a function that (asynchronously) opens a file, reads from it, writes it to stdout, and closes it:

auto start_dump_file( const std::string& str )
  -> future_t<void>
{
    // We can use the same request object for
    // all file operations as they don't overlap.
    static_buf_t<1024> buffer;
    fs_t openreq;
    uv_file file = co_await fs_open(uv_default_loop(),
                                    &openreq,
                                    str.c_str(),
                                    O_RDONLY,
                                    0);
    if (file > 0)
    {
        while (1)
        {
            fs_t readreq;
            int result = co_await fs_read(uv_default_loop(),
                                          &readreq,
                                          file,
                                          &buffer,
                                          1,
                                          -1);
            if (result <= 0)
                break;
            buffer.len = result;
            fs_t req;
            (void) co_await fs_write(uv_default_loop(),
                                     &req,
                                     1 /*stdout*/,
                                     &buffer,
                                     1,
                                     -1);
        }
        fs_t closereq;
        (void) co_await fs_close(uv_default_loop(),
                                 &closereq,
                                 file);
    }
}

You can see that this looks almost exactly like ordinary synchronous code, with two exceptions:

  1. Calls to asynchronous operations are preceded with co_await, and
  2. The function returns an awaitable type (future_t<void>).

Very nice. But this code snippet does too much in my opinion. Wouldn’t it be nice to have a reusable component for asynchronously reading a file, separate from the bit about writing it to stdout? What would that even look like?

Hello, Ranges!

Also at the recent C++ Committee meeting in Toronto, the Ranges TS was forwarded to ISO for publication. This is the first baby step toward a complete reimagining and reimplementation of the C++ standard library in which interfaces are specified in terms of ranges in addition to iterators.

Once we have “range” as an abstraction, we can build range adaptors and build pipelines that transform ranges of values in interesting ways. More than just a curiosity, this is a very functional style that lets you program without a lot of state manipulation. The fewer states your program can be in, the easier it is for you to reason about your code, and the fewer bugs you’ll have. (For more on that, you can see my 2015 CppCon talk about ranges; or just look at the source for a simple app that prints a formatted calendar to stdout, and note the lack of loops, conditionals, and overt state manipulation.)

For instance, if we have a range of characters, we might want to lazily convert each character to lowercase. Using the range-v3 library, you can do the following:

std::string hello("Hello, World!");
using namespace ranges;
auto lower = hello
           | view::transform([](char c){
                 return (char)std::tolower(c);});

Now lower presents a view of hello where each character is run through the tolower transform on the fly.

Although the range adaptors haven’t been standardized yet, the Committee has already put its stamp of approval on the overall direction, including adaptors and pipelines. (See N4128 for the ranges position paper.) Someday, these components will all be standard, and the C++ community can encourage their use in idiomatic modern C++.

Ranges + Coroutines == ?

With coroutines, ranges become even more powerful. For one thing, the co_yield keyword makes it trivial to define your own (synchronous) ranges. Already with range-v3 you can use the following code to define a range of all the integers and apply a filter to them:

#include <iostream>
#include <range/v3/all.hpp>
#include <range/v3/experimental/utility/generator.hpp>
using namespace ranges;

// Define a range of all the unsigned shorts:
experimental::generator<unsigned short> ushorts()
{
    unsigned short u = 0;
    do { co_yield u; } while (++u);
}

int main()
{
    // Filter all the even unsigned shorts:
    auto evens = ushorts()
               | view::filter([](auto i) {
                     return (i % 2) == 0; });

    // Write the evens to cout:
    copy( evens, ostream_iterator<>(std::cout, "\n") );
}

Put the above code in a .cpp file, compile with a recent clang and -fcoroutines-ts -std=gnu++1z, and away you go. Congrats, you’re using coroutines and ranges together. This is a trivial example, but you get the idea.

Asynchronous Ranges

That’s great and all, but it’s not asynchronous, so who cares? If it were asynchronous, what would that look like? Moving to the first element of the range would be an awaitable operation, and then moving to every subsequent element would also be awaitable.

In the ranges world, moving to the first element of a range R is spelled “auto it = begin(R)”, and moving to subsequent elements is spelled “++it”. So for an asynchronous range, those two operations should be awaitable. In other words, given an asynchronous range R, we should be able to do:

// Consume a range asynchronously
for( auto it = co_await begin(R);
it != end(R);
co_await ++it )
{
auto && e = *it;
do_something( e );
}

In fact, the Coroutines TS anticipates this and has an asynchronous range-based for loop for just this abstraction. The above code can be rewritten:

// Same as above:
for co_await ( auto&& e : R )
{
    do_something( e );
}

Now we have two different but closely related abstractions: Range and AsynchronousRange. In the first, begin returns something that models an Iterator. In the second, begin returns an Awaitable of an AsynchronousIterator. What does that buy us?

Asynchronous Range Adaptors

Once we have an abstraction, we can program against that abstraction. Today we have a view::transform that knows how to operate on synchronous ranges. It can be extended to also work with asynchronous ranges. So can all the other range adaptors: filter, join, chunk, group_by, interleave, transpose, etc, etc. So it will be possible to build a pipeline of operations, and apply the pipeline to a synchronous range to get a (lazy) synchronous transformation, and apply the same exact pipeline to an asynchronous range to get a non-blocking asynchronous transformation. The benefits are:

  • The same functional style can be used for synchronous and asynchronous code, reusing the same components and the same idioms.
  • Asynchronous code, when expressed with ranges and transformations, can be made largely stateless, as can be done today with synchronous range-based code. This leads to programs with fewer states and hence fewer state-related bugs.
  • Range-based code composes very well and encourages a decomposition of problems into orthogonal pieces which are easily testable in isolation. (E.g., a view::filter component can be used with any input range, synchronous or asynchronous, and can be easily tested in isolation of any particular range.)

Another way to look at this is that synchronous ranges are an example of a pull-based interface: the user extracts elements from the range and processes them one at a time. Asynchronous ranges, on the other hand, represent more of a push-based model: things happen when data shows up, whenever that may be. This is akin to the reactive style of programming.

By using ranges and coroutines together, we unify push and pull based idioms into a consistent, functional style of programming. And that’s going to be important, I think.

Back to LibUV

Earlier, we wondered about a reusable libuv component that used its asynchronous operations to read a file. Now we know what such a component could look like: an asynchronous range. Let’s start with an asynchronous range of characters. (Here I’m glossing over the fact that libuv deals with UTF-8, not ASCII. I’m also ignoring errors, which is another can of worms.)

auto async_file( const std::string& str )
  -> async_generator<char>
{
    // We can use the same request object for
    // all file operations as they don't overlap.
    static_buf_t<1024> buffer;
    fs_t openreq;
    uv_file file = co_await fs_open(uv_default_loop(),
                                    &openreq,
                                    str.c_str(),
                                    O_RDONLY,
                                    0);
    if (file > 0)
    {
        while (1)
        {
            fs_t readreq;
            int result = co_await fs_read(uv_default_loop(),
                                          &readreq,
                                          file,
                                          &buffer,
                                          1,
                                          -1);
            if (result <= 0)
                break;
            // Yield the characters one at a time.
            for ( int i = 0; i < result; ++i )
            {
                co_yield buffer.buffer[i];
            }
        }
        fs_t closereq;
        (void) co_await fs_close(uv_default_loop(),
                                 &closereq,
                                 file);
    }
}

The async_file function above asynchronously reads a block of text from the file and then co_yields the individual characters one at a time. The result is an asynchronous range of characters: async_generator<char>. (For an implementation of async_generator, look in Lewis Baker’s cppcoro library.)

Now that we have an asynchronous range of characters representing the file, we can apply transformations to it. For instance, we could convert all the characters to lowercase:

// Create an asynchronous range of characters read
// from a file and lower-cased:
auto async_lower = async_file("some_input.txt")
                 | view::transform([](char c){
                       return (char)std::tolower(c);});

That’s the same transformation we applied above to a std::string synchronously, but here it’s used asynchronously. Such an asynchronous range can then be passed through further transforms, asynchronously written out, or passed to an asynchronous std:: algorithm (because we’ll need those, too!).

One More Thing

I hear you saying, “Processing a file one character at a time like this would be too slow! I want to operate on chunks.” The above async_file function is still doing too much. It should be an asynchronous range of chunks. Let’s try again:

auto async_file_chunk( const std::string& str )
  -> async_generator<static_buf_t<1024>&>
{
    // We can use the same request object for
    // all file operations as they don't overlap.
    static_buf_t<1024> buffer;
    fs_t openreq;
    uv_file file = co_await fs_open(uv_default_loop(),
                                    &openreq,
                                    str.c_str(),
                                    O_RDONLY,
                                    0);
    if (file > 0)
    {
        while (1)
        {
            fs_t readreq;
            int result = co_await fs_read(uv_default_loop(),
                                          &readreq,
                                          file,
                                          &buffer,
                                          1,
                                          -1);
            if (result <= 0)
                break;
            // Just yield the buffer.
            buffer.len = result;
            co_yield buffer;
        }
        fs_t closereq;
        (void) co_await fs_close(uv_default_loop(),
                                 &closereq,
                                 file);
    }
}

Now if I want to, I can asynchronously read a block and asynchronously write the block, as the original code was doing, but while keeping those components separate, as they should be.

For some uses, a flattened view would be more convenient. No problem. That’s what the adaptors are for. If static_buf_t is a (synchronous) range of characters, we already have the tools we need:

// Create an asynchronous range of characters read from a
// chunked file and lower-cased:
auto async_lower = async_file_chunk("some_input.txt")
                 | view::join
                 | view::transform([](char c){
                       return (char)std::tolower(c);});

Notice the addition of view::join. Its job is to take a range of ranges and flatten it. Let’s see what joining an asynchronous range might look like:

template <class AsyncRange>
auto async_join( AsyncRange&& rng )
  -> async_generator<range_value_t<
         async_range_value_t<AsyncRange>>>
{
    for co_await ( auto&& chunk : rng )
    {
        for ( auto&& e : chunk )
            co_yield e;
    }
}

We (asynchronously) loop over the outer range, then (synchronously) loop over the inner ranges, and co_yield each value. Pretty easy. From there, it’s just a matter of rigging up operator| to async_join to make joining work in pipelines. (A fully generic view::join will be more complicated than that since both the inner and outer ranges can be either synchronous or asynchronous, but this suffices for now.)

Summary

With ranges and coroutines together, we can unify the push and pull programming idioms, bringing ordinary C++ and reactive C++ closer together. The C++ Standard Library is already evolving in this direction, and I’m working to make that happen both on the Committee and internally at Facebook.

There are LOTS of open questions. How well does this perform at runtime? Does this scale? Is it flexible enough to handle lots of interesting use cases? How do we handle errors in the middle of an asynchronous pipeline? What about splits and joins in the async call graph? Can this handle streaming interfaces? And so on. I’ll be looking into all this, but at least for now I have a promising direction, and that’s fun.

