
On the AWS Application Load Balancer HTTP/2 Support (or lack thereof)


At my employer, ShopGun, we rely on Amazon Web Services for most — if not all — of our services’ infrastructure. One of these services exposes an HTTP POST endpoint that allows our clients to persist assets, such as image files and PDFs.

Somewhat recently, our operations team revisited the infrastructure setup for this service. In the process, the AWS Elastic Load Balancer used in this setup for TLS termination and load balancing of HTTP/1.1 was replaced by the new AWS Application Load Balancer. This enabled us to support HTTP/2 for yet another of our services — or so we thought.

At first, the upgrade seemed to have gone well. Our integration tests passed, and manually uploading some assets to the endpoint confirmed HTTP/2 support. However, it quickly became apparent that requests were taking far longer to complete than before the load balancer upgrade, and when invoking tcpdump(1), we saw that throughput was curbed significantly.

Adding to the mystery, this throughput penalty manifests only for HTTP/2 connections. When performing requests that force HTTP/1.1, the throughput is as expected, that is, the same as before the upgrade.

This post elaborates on our troubleshooting, and on the correspondence — or lack thereof — with Amazon Web Services Premium Support. Before diving further into our troubleshooting, I’ll elaborate a bit more on the particular service interfaced by the ALB.

The Asset Upload Service

As mentioned above, our service allows clients to persist assets. We accomplish this by exposing a single HTTP POST endpoint. The endpoint accepts requests with content type multipart/form-data. A client request is delegated to an EC2 instance in the asset upload service’s auto scaling group.
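For illustration, a minimal client request against such an endpoint could look roughly like the following with curl(1); the hostname, path, and form field name are made up for the example:

# Upload a single asset as a multipart/form-data POST
$ curl -v -F "asset=@catalog.pdf;type=application/pdf" https://assets.example.com/v1/upload

The -F option makes curl(1) construct a multipart/form-data request body, with one part per specified field, which matches the content type the endpoint expects.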

The assets are extracted from the multipart request body and subjected to a set of validations, e.g. size, content type, etc. If all validations succeed, the assets are then stored temporarily in the handling instance’s local storage, until they have been persisted in S3. Upon a successful request, the client receives a response containing references to the uploaded assets.

At ShopGun, we employ Erlang heavily, and the asset upload service is no exception. It is implemented using the terrific HTTP server library, cowboy.

Initial Reporting and Findings

The asset upload service ran without incident before the infrastructure revisit, but soon after the service was deployed in its new infrastructure setup, reports of failed requests and unreasonable response times started trickling in. A quick glance at the service’s metrics on our Grafana dashboard confirmed these reports.

After performing some explorative testing, we found that the issue was seemingly not linked to the browser or HTTP client of choice, the operating system, or the origin of the requests. It was evident that a more systematic approach was required, and I started playing around with curl(1), tcpdump(1), and tcptrace(1).

The following list summarizes the results uncovered by this initial approach:

  1. The throughput penalty is imposed on all HTTP/2 requests, regardless of the choice of client, and regardless of whether the connection is established using prior knowledge or negotiated via ALPN.
  2. By having clients force HTTP/1.1, we see that the ALB performs as the ELB did before the upgrade (example invocations follow the list).
  3. The endpoint metrics show that the validation and S3 persistence performed by the asset upload service execute as expected. The only part of the request handling affected is the actual data transfer.
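For reference, the protocol can be pinned from the client side with curl(1); the URL is again a placeholder:

# Force HTTP/1.1, bypassing ALPN negotiation of HTTP/2
$ curl --http1.1 -F "asset=@catalog.pdf" https://assets.example.com/v1/upload

# Let curl negotiate HTTP/2 via ALPN where the server supports it
$ curl --http2 -F "asset=@catalog.pdf" https://assets.example.com/v1/upload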

I decided to create some graphs of the tcpdump(1) packet traces using tcptrace(1) — specifically graphs of TCP segment sizes and throughput. Each of the graphs depicts the same request to the asset upload service. This request submits a single multipart file part containing a 65 MB PDF file, and is made using either HTTP/1.1 or HTTP/2. The graphs below show the TCP segment size over time and the throughput over time when the request is made using HTTP/1.1:

[Figure: TCP segment sizes over time for an HTTP/1.1 request]
[Figure: Throughput over time for an HTTP/1.1 request]

The alternating segment sizes in the former graph are indicative of TCP congestion control, and although the throughput drops significantly around halfway through the client’s transmission of the request body — as witnessed by the throughput graph — we see that the request, including persisting to S3, completes in a bit more than 6 seconds.

When comparing the graphs of the HTTP/1.1 request above to the corresponding graphs of an HTTP/2 request, we see quite a different picture:

[Figure: TCP segment sizes over time for an HTTP/2 request]
[Figure: Throughput over time for an HTTP/2 request]

The first thing we notice is that the HTTP/2 request takes almost 30 seconds to complete. Additionally, the segment sizes alternate much more for the HTTP/2 request. When we zoom in, we see a saw-tooth pattern:

[Figure: Zooming in on the TCP segment size graph above]

This pattern shows short bursts of large segments transmitted by the client, followed by pauses in client segment transmissions. Armed with these graphs, we started to wonder if something was wrong in the HTTP/2 implementation of either our clients or the ALB.
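For completeness, the packet captures and graphs in this section can be reproduced with a workflow roughly like the following; the interface, hostname, and file names are illustrative:

# Capture full packets for the upload on the client side
$ tcpdump -i en0 -s 0 -w upload.pcap host assets.example.com and port 443

# Generate segment size (-F) and throughput (-T) graphs from the capture
$ tcptrace -F -T upload.pcap

# View the generated .xpl graph files with xplot
$ xplot.org *.xpl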

HTTP/2 and nghttp(1)

Up until then, I had only a vague understanding of how HTTP/2 works; I had some knowledge of connection handling and streams. As mentioned before, the service is built using Cowboy, which supports HTTP/2, so it was conceivable that this had some influence.

When we performed the same tests as above — or almost the same, as we bypassed the ALB using an ssh tunnel — no penalty was imposed. Additionally, the AWS documentation clearly states that all connections from the ALB to backend targets are HTTP/1.1:

Application Load Balancers provide native support for HTTP/2 with HTTPS listeners. You can send up to 128 requests in parallel using one HTTP/2 connection. The load balancer converts these to individual HTTP/1.1 requests and distributes them across the healthy targets in the target group using the round robin routing algorithm.

This claim was later confirmed by additional testing.

All in all, it started to look like the ALB had a faulty or misconfigured HTTP/2 implementation, so I started reading up on the HTTP/2 protocol. The nghttp2 library offers a curl(1)-like CLI client called nghttp(1). When invoked with the option -v, nghttp(1) outputs a connection trace that includes all frames sent and received.
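A verbose upload trace against the ALB can be produced with something along the following lines; the URL is a placeholder, and the multipart body is assumed to have been prepared in a file beforehand:

# POST a prepared multipart body and print every HTTP/2 frame sent and received
$ nghttp -v \
    -H 'content-type: multipart/form-data; boundary=----boundary' \
    -d body.multipart \
    https://assets.example.com/v1/upload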

When examining request traces made with nghttp(1) to the ALB, we saw the following pattern:

(...)
[  0.367] recv window_update frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=16375)
[  0.367] send DATA frame <length=16375, flags=0x00, stream_id=13>
[  0.372] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=16375)
[  0.373] send DATA frame <length=16375, flags=0x00, stream_id=13>
[  0.400] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=32786)
[  0.400] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  0.400] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  0.400] send DATA frame <length=18, flags=0x00, stream_id=13>
[  0.404] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=16375)
[  0.404] send DATA frame <length=16375, flags=0x00, stream_id=13>
[  0.432] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=16375)
[  0.432] send DATA frame <length=16375, flags=0x00, stream_id=13>
[  0.434] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=16375)
[  0.434] send DATA frame <length=16375, flags=0x00, stream_id=13>
[  0.435] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=16411)
[  0.435] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  0.435] send DATA frame <length=27, flags=0x00, stream_id=13>
[  0.438] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=16375)
[  0.438] send DATA frame <length=16375, flags=0x00, stream_id=13>
(...)

The client is allowed to send between one and three DATA frames at a time before having to wait for the next WINDOW_UPDATE frame. This behaviour, along with the timing information in square brackets, matches the saw-tooth pattern described earlier. At long last, we seemed to have found a culprit.
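As a rough back-of-envelope check: in the excerpt above, the ALB grants a total of roughly 147 KB of flow-control window (eight WINDOW_UPDATE frames) over about 70 ms, which caps the client at roughly 2 MB/s. At that rate, transmitting a 65 MB request body takes a little over 30 seconds, which matches the graphs above.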

For the uninitiated, WINDOW_UPDATE frames are an application-level flow-control mechanism that allows data-heavy transfers and low-latency streams to share the same TCP connection. To contrast the trace above, we made the same HTTP/2 request again, this time bypassing the ALB. The request trace we see in this case looks much more reasonable:

(...)
[  0.831] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  0.831] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  0.831] send DATA frame <length=4608, flags=0x00, stream_id=13>
[  1.184] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=0>
          (window_size_increment=8000000)
[  1.184] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=8000000)
[  1.184] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.184] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.185] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.185] send DATA frame <length=16384, flags=0x00, stream_id=13>
(...)
[  1.663] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.663] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.663] send DATA frame <length=4608, flags=0x00, stream_id=13>
[  1.913] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=0>
          (window_size_increment=8000000)
[  1.913] recv WINDOW_UPDATE frame <length=4, flags=0x00, stream_id=13>
          (window_size_increment=8000000)
[  1.913] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.913] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.913] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.913] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.913] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.913] send DATA frame <length=16384, flags=0x00, stream_id=13>
[  1.913] send DATA frame <length=16384, flags=0x00, stream_id=13>
(...)

In the above, we see multiple WINDOW_UPDATE frames, each of them containing a window size increment of 8 MB, which is the default advertised window for Cowboy. We also see that nghttp(1) subsequently fills that window before waiting for the next WINDOW_UPDATE frame.

The parenthesized ellipsis in the middle of the snippet is a placeholder for roughly 480 DATA frames, which have been left out for brevity.
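The numbers also line up with the advertised window: roughly 480 DATA frames of 16,384 bytes each amount to just under 8 MB, i.e. one full window. With a fresh 8 MB grant arriving roughly every 0.7 seconds in this trace, the effective ceiling is on the order of 10 MB/s, an order of magnitude above what the ALB allowed.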

Reporting Our Findings to AWS

Considering that the ALB product was announced on August 11, 2016, we figured that we couldn’t have been the first to experience this issue. However, when trawling search results, we kept coming up empty-handed. Upon reaching out to our AWS Account Manager, we were informed that for Amazon to respond to our inquiry, we would have to upgrade to AWS Premium Support. We chose to do so with the expectation that our support case would be handled in a swift and — at the very least — courteous fashion.

On January 17, 2018, we submitted a report detailing our findings to AWS Premium Support. After a bit of correspondence back and forth — mostly to provide AWS with additional packet captures and traces — we were told on January 24, 2018 that AWS was investigating the issue, that they would update us with their findings, and that they appreciated our patience. And that was the last we heard.

Aftermath

Since reporting our issue in late January, Amazon has made new releases of their SDKs for Go and JavaScript, among others. Among the additions in these releases is the ability to toggle a boolean attribute for Application Load Balancers named routing.http2.enabled, as evident from the corresponding diffs in the SDKs’ GitHub repositories.

The documentation for the AWS Command Line Interface also mentions this attribute, but I have not been able to determine when the attribute was added.

We were not notified of this, which we find rather strange. Disabling the HTTP/2 support of an ALB is a workaround that allows us to make use of the ALB after all. As a consequence of not being kept up to date, we decided to cancel our Premium Support subscription.
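For reference, with the AWS Command Line Interface the workaround amounts to toggling that attribute on the load balancer; the ARN below is a placeholder:

$ aws elbv2 modify-load-balancer-attributes \
    --load-balancer-arn arn:aws:elasticloadbalancing:eu-west-1:123456789012:loadbalancer/app/assets/0123456789abcdef \
    --attributes Key=routing.http2.enabled,Value=false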

Since originally reporting the issue, we have experimented further with the ALB, and we have made these additional findings:

  • The ALB always relays requests to backend targets using HTTP/1.1, regardless of whether the backend target is set up with HTTPS. Although the documentation clearly stated that this was the case, we decided to examine this after reading a rather contradictory forum post response.
  • The WINDOW_UPDATE issue happens regardless of whether the HTTP endpoint is POST or PUT, and seemingly regardless of the request content type and other HTTP headers; seemingly, because it is quite the task to test all combinations and variations.
  • Proxying the ALB’s requests to Cowboy through an nginx daemon set up locally on each of the service’s EC2 instances also had no effect, nor did changing the Server HTTP response header from Cowboy.

We have since switched back from the ELB to the ALB, albeit with HTTP/2 support disabled.

