
Adding middleware support to Rust reqwest

 2 years ago
source link: https://truelayer.com/blog/adding-middleware-support-to-rust-reqwest

Continuing with our open source series, we present a middleware adapter for the ubiquitous reqwest Rust crate.

This is the second post in our open source series, where we talk about engineering challenges at TrueLayer and open source our solutions to them. The first one was gRPC load balancing in Rust, in case you missed it.

This post is about reqwest-middleware, a crate built on top of the reqwest HTTP client to provide middleware functionality.

The problem

Running applications at scale requires built-in resilience when communicating over the network with internal and external services, because services fail. Retries are a common strategy to improve reliability. The mechanics are fairly simple: wrap each request in a loop and retry until you get a successful response or you run out of attempts.

We have tens of clients across our codebases: we don’t want to re-implement retries in an ad-hoc fashion for each of them. At the same time, we’d prefer to keep our domain code free of this network-level concern; it’d be ideal to have retries transparently implemented in the HTTP client itself.

We could write a RetryHttpClient that wraps a standard client to augment it with retry functionality, but retries are not the full story. There is other functionality we want our HTTP clients to handle: propagation of distributed tracing headers, caching, logging. But we don’t want to write TracingRetryableClient, TracingRetryableCachingHttpClient, RetryableTracingCachingHttpClient (order matters!) and all the other possible combinations.

We want an abstraction that is composable. All these pieces of functionality follow the same pattern:
  • We want to run some arbitrary logic before and after executing a request.
  • The logic is completely independent of the problem domain; it is just concerned with the underlying transport and the requirements of the organisation as a whole (e.g. logging standards).
The good news is that this is a common problem in software systems, so there’s also a common solution for it: middlewares. (Aside: we’re referring to a very specific kind of middleware in this article; the term itself is more general. See the Wikipedia page on middleware for usage in other contexts.)
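To make the retry mechanics described above concrete, here is a minimal, synchronous sketch using only the standard library. The helper name `retry_with_backoff`, its parameters and the doubling delay are assumptions made for illustration; this is not how reqwest-retry is implemented.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: calls `op` until it succeeds or `max_attempts`
/// attempts have been made, sleeping between attempts.
/// `op` receives the 1-based attempt number and is always called at least once.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                // The delay doubles on every retry: base, 2 * base, 4 * base, ...
                sleep(base_delay * 2u32.pow(attempt - 1));
                attempt += 1;
            }
        }
    }
}

fn main() {
    // A fake "request" that fails twice and then succeeds.
    let result: Result<&str, &str> =
        retry_with_backoff(3, Duration::from_millis(1), |attempt| {
            if attempt < 3 {
                Err("connection reset")
            } else {
                Ok("200 OK")
            }
        });
    println!("{:?}", result);
}
```

Writing this loop by hand around every call site is exactly the duplication a middleware is meant to remove.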

Rust HTTP client middleware

At TrueLayer we use reqwest as our HTTP client for all our Rust services. We chose it because it provides an async-first API, it is compatible with tokio and it has been used extensively in production. Sadly, reqwest doesn’t support middleware out of the box. What were our options?
  • Use an off-the-shelf crate, either replacing reqwest or built on top of it. At the time of writing, no other well-established Rust HTTP client supporting middleware ticks the same boxes that reqwest does for us. surf has good mindshare and built-in middlewares, but it requires pulling in async-std.
  • Try to get middleware support implemented upstream. reqwest maintainers have been discussing this since 2017 (see this issue) and there still doesn’t seem to be consensus, not even on whether such functionality belongs in the crate. So we’re unlikely to get something through anytime soon.
  • The final alternative was to wrap reqwest and implement middlewares on top of that, so that’s the approach we went for. reqwest-middleware was born.
With reqwest-middleware we’re able to attach middleware to a Client and then proceed to make requests as if we were using reqwest directly:
use reqwest_middleware::{ClientBuilder, ClientWithMiddleware};
use reqwest_retry::{RetryTransientMiddleware, policies::ExponentialBackoff};
use reqwest_tracing::TracingMiddleware;

#[tokio::main]
async fn main() {
    let retry_policy = ExponentialBackoff::builder().build_with_max_retries(3);
    let client = ClientBuilder::new(reqwest::Client::new())
        .with(TracingMiddleware)
        .with(RetryTransientMiddleware::new_with_policy(retry_policy))
        .build();
    run(client).await;
}

async fn run(client: ClientWithMiddleware) {
    // free retries!
    client
        .get("https://some-external-service.com")
        .header("foo", "bar")
        .send()
        .await
        .unwrap();
}

Prior art

Before we talk about our approach let’s take a look at some middleware APIs in the wild:

Surf

Surf is a Rust HTTP client. Here’s an example middleware from their own docs:
use std::time;
use surf::middleware::{Middleware, Next};
use surf::{Client, Request, Response, Result};

/// Log each request's duration
#[derive(Debug)]
pub struct Logger;

#[surf::utils::async_trait]
impl Middleware for Logger {
    async fn handle(
        &self,
        req: Request,
        client: Client,
        next: Next<'_>,
    ) -> Result<Response> {
        println!("sending request to {}", req.url());
        let now = time::Instant::now();
        // Forward the request to the rest of the chain.
        let res = next.run(req, client).await?;
        println!("request completed ({:?})", now.elapsed());
        Ok(res)
    }
}
As you can see, it takes a request object and a next value, which can be used to forward that request into the remaining pipeline, and returns a Response. This allows us to manipulate requests by mutating them before forwarding them down the chain. We could also change the res value we get back from next.run before returning it.

We can even use control flow around next, which allows for retries and short-circuiting:
#[derive(Debug)]
pub struct ConditionalCall;

#[surf::utils::async_trait]
impl Middleware for ConditionalCall {
    async fn handle(
        &self,
        req: Request,
        client: Client,
        next: Next<'_>,
    ) -> Result<Response> {
        // Silly example: return a dummy response 50% of the time
        if rand::random::<bool>() {
            let res = next.run(req, client).await?;
            Ok(res)
        } else {
            // Short-circuit: the rest of the chain is never called.
            let response = http_types::Response::new(StatusCode::Ok);
            Ok(response)
        }
    }
}

Express

Express is a well-established Node.js web framework. Its middlewares are written as plain functions. Here’s an example from their docs:
app.use(function (req, res, next) {
  console.log('Time:', Date.now())
  next()
})
This is very similar to surf’s approach, except here we take a response object and can mutate it directly: the middleware function doesn’t return anything.

Tower

tower describes itself as a library of generic Rust components for networking applications. It’s used as a building block for many notable crates such as hyper and tonic. tower’s middlewares are a bit more involved, most likely because they did not want to force dynamic dispatch (e.g. async_trait) on their users. As with the other libraries, here is the example given in the tower docs:
use std::fmt;
use std::task::{Context, Poll};
use tower::{Layer, Service};

pub struct LogLayer {
    target: &'static str,
}

impl<S> Layer<S> for LogLayer {
    type Service = LogService<S>;

    fn layer(&self, service: S) -> Self::Service {
        LogService {
            target: self.target,
            service
        }
    }
}

// This service implements the Log behavior
pub struct LogService<S> {
    target: &'static str,
    service: S,
}

impl<S, Request> Service<Request> for LogService<S>
where
    S: Service<Request>,
    Request: fmt::Debug,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = S::Future;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.service.poll_ready(cx)
    }

    fn call(&mut self, request: Request) -> Self::Future {
        // Insert log statement here or other functionality
        println!("request = {:?}, target = {:?}", request, self.target);
        self.service.call(request)
    }
}
Ignoring the poll_ready method, which is used for backpressure, tower’s Services are defined as functions from a request to a response: call returns a Future whose Output is a Result holding either the associated Service::Response or Service::Error.

The async middleware trait in surf is simpler because it relies on a procedural macro (async_trait) to use the async fn syntax in traits; behind the scenes it translates into boxing futures. This is necessary because, at the time of writing, async is not supported in trait methods; see this post by Nicholas D. Matsakis for an in-depth look into why.

Middlewares in tower are defined through the Layer trait, which simply maps one service into another. Implementation usually involves having a generic struct wrapping some Service and delegating call to it. The wrapped service plays the same role as the next parameter in surf and express: it gives you a way to call into the rest of the middleware chain. This approach still lets us manipulate requests and responses in the same way we could with the next-based APIs.

Finagle

Finagle is a JVM RPC system written in Scala. Let’s take an example middleware from the Finagle docs as well:
class TimeoutFilter[Req, Rep](timeout: Duration, timer: Timer)
  extends SimpleFilter[Req, Rep] {
  def apply(request: Req, service: Service[Req, Rep]): Future[Rep] = {
    val res = service(request)
    res.within(timer, timeout)
  }
}
Here we also have Services, which are very similar to tower’s: a function from a request into a response.

Middlewares in Finagle are called Filters. The Filter type is more complex than tower’s Layer, as it doesn’t require the Req and Rep types in apply to be the same as the request and response types of the service parameter.

SimpleFilter, as the name implies, is a simplified version with fixed request/response types. A SimpleFilter takes a request and the wrapped Service as parameters and returns a response, so it functions like the tower API but collapses Layer::layer and Service::call into the single SimpleFilter::apply method.

Middleware types

In general you’ll find that middleware APIs fall into one of two categories: either the middleware is a function taking a request and a next parameter, as in surf and express, or it is a mapping from one Service into another, as in tower and Finagle.

Overall, both approaches give just as much flexibility. Both require at least one extra dynamic dispatch per middleware, as Rust does not support impl Trait in return types of trait methods (yet). We went with the Next approach because it makes middleware easier to implement, as shown by the difference between surf and tower.
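To illustrate how a next-based chain fits together, here is a deliberately simplified, synchronous sketch using only the standard library. The `Request`, `Response`, `Next`, `BlockList`, `AddHeader` and `execute` items are stand-ins invented for this example and are not the reqwest-middleware types; the real crate is async and has a different signature.

```rust
// Simplified stand-ins for an HTTP request and response.
#[derive(Debug)]
struct Request {
    url: String,
    headers: Vec<(String, String)>,
}

#[derive(Debug, PartialEq)]
struct Response {
    status: u16,
}

trait Middleware {
    fn handle(&self, req: Request, next: Next<'_>) -> Response;
}

/// The rest of the chain: the middlewares still to run, plus the
/// innermost function that actually "sends" the request.
struct Next<'a> {
    middlewares: &'a [Box<dyn Middleware>],
    send: &'a dyn Fn(Request) -> Response,
}

impl<'a> Next<'a> {
    fn run(self, req: Request) -> Response {
        match self.middlewares.split_first() {
            // Hand the request to the next middleware, giving it the remainder.
            Some((current, rest)) => {
                current.handle(req, Next { middlewares: rest, send: self.send })
            }
            // No middleware left: perform the request.
            None => (self.send)(req),
        }
    }
}

/// Short-circuits: never calls `next` for blocked URLs.
struct BlockList;
impl Middleware for BlockList {
    fn handle(&self, req: Request, next: Next<'_>) -> Response {
        if req.url.contains("blocked") {
            return Response { status: 403 };
        }
        next.run(req)
    }
}

/// Mutates the request before forwarding it down the chain.
struct AddHeader;
impl Middleware for AddHeader {
    fn handle(&self, mut req: Request, next: Next<'_>) -> Response {
        req.headers.push(("x-request-id".to_string(), "demo".to_string()));
        next.run(req)
    }
}

fn execute(
    middlewares: &[Box<dyn Middleware>],
    send: &dyn Fn(Request) -> Response,
    req: Request,
) -> Response {
    Next { middlewares, send }.run(req)
}

fn main() {
    let chain: Vec<Box<dyn Middleware>> = vec![Box::new(BlockList), Box::new(AddHeader)];
    // Pretend transport: succeeds only if the header was attached.
    let send = |req: Request| {
        let tagged = req.headers.iter().any(|(k, _)| k.as_str() == "x-request-id");
        Response { status: if tagged { 200 } else { 400 } }
    };
    let ok = execute(&chain, &send, Request { url: "https://example.com".to_string(), headers: vec![] });
    let blocked = execute(&chain, &send, Request { url: "https://example.com/blocked".to_string(), headers: vec![] });
    println!("{:?} {:?}", ok, blocked);
}
```

Each middleware decides whether and when to call `next.run`, which is what makes request mutation and short-circuiting possible with a single method to implement.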

reqwest-middleware

We ended up with a pretty standard API for middlewares (see the docs for a more detailed view of the API):
#[async_trait]
pub trait Middleware {
    async fn handle(
        &self,
        req: Request,
        extensions: &mut Extensions,
        next: Next<'_>,
    ) -> Result<Response>;
}
Extensions is used to get arbitrary information across middlewares in a type-safe manner, both from an outer middleware into deeper ones and from an inner middleware out to previous ones.

For demonstration purposes, here’s a simple logging middleware implementation:
use reqwest::{Request, Response};
use reqwest_middleware::{Middleware, Next};
use truelayer_extensions::Extensions;

struct LoggingMiddleware;

#[async_trait::async_trait]
impl Middleware for LoggingMiddleware {
    async fn handle(
        &self,
        req: Request,
        extensions: &mut Extensions,
        next: Next<'_>,
    ) -> reqwest_middleware::Result<Response> {
        tracing::info!("Sending request {} {}", req.method(), req.url());
        // Forward the request to the rest of the chain.
        let resp = next.run(req, extensions).await?;
        tracing::info!("Got response {}", resp.status());
        Ok(resp)
    }
}

use reqwest_middleware::ClientBuilder;

#[tokio::main]
async fn main() {
    tracing_subscriber::fmt::init();

    let client = ClientBuilder::new(reqwest::Client::new())
        .with(LoggingMiddleware)
        .build();
    client
        .get("https://truelayer.com/")
        .send()
        .await
        .unwrap();
}

$ RUST_LOG=info cargo run
Jul 20 19:59:35.585  INFO post_reqwest_middleware: Sending request GET https://truelayer.com/
Jul 20 19:59:35.705  INFO post_reqwest_middleware: Got response 200 OK
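
The type-safe passing of data through Extensions mentioned earlier in this section can be pictured as a map keyed by Rust type. The sketch below is a standard-library-only illustration of that idea; the `Extensions` struct defined here, along with `RequestId`, `Attempts` and `inner`, is hypothetical and is not the type the crate actually uses (a real implementation would also need `Send + Sync` bounds to work across async tasks).

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Minimal type map: stores at most one value per Rust type.
#[derive(Default)]
struct Extensions {
    map: HashMap<TypeId, Box<dyn Any>>,
}

impl Extensions {
    /// Stores `value`, returning the previous value of the same type, if any.
    fn insert<T: Any>(&mut self, value: T) -> Option<T> {
        self.map
            .insert(TypeId::of::<T>(), Box::new(value))
            .and_then(|old| old.downcast::<T>().ok())
            .map(|boxed| *boxed)
    }

    /// Looks a value up by its type; no string keys, no manual casts.
    fn get<T: Any>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|value| value.downcast_ref::<T>())
    }
}

// Hypothetical pieces of data shared between middlewares.
#[derive(Debug, PartialEq)]
struct RequestId(u64);

#[derive(Debug, PartialEq)]
struct Attempts(u32);

/// Stand-in for an inner middleware: reads what the outer one stored
/// and leaves its own value behind for the outer one to read later.
fn inner(extensions: &mut Extensions) -> Option<u64> {
    let id = extensions.get::<RequestId>().map(|r| r.0);
    extensions.insert(Attempts(2));
    id
}

fn main() {
    let mut extensions = Extensions::default();
    // Outer middleware passes data inwards...
    extensions.insert(RequestId(42));
    let seen = inner(&mut extensions);
    // ...and reads data that the inner middleware passed outwards.
    println!("inner saw {:?}, outer saw {:?}", seen, extensions.get::<Attempts>());
}
```

Because the key is the type itself, two middlewares can only exchange a value if they agree on its type, which is where the type safety comes from.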

Conclusion

We wrapped reqwest with a middleware-enabled client that exposes the same simple API. This enabled us to build reusable components for our resiliency and observability requirements. On top of that, we also published reqwest-retry and reqwest-tracing, which should cover a lot of the use cases for this crate.

Developers can now harden their integrations with remote HTTP services simply by importing a couple of crates and adding .with(...) calls to the client setup code, with no disruption to any other application code.

We’re hiring! If you’re passionate about Rust and building great products, take a look at our job opportunities.
