
Abort closed clients to save capacity by sirupsen · Pull Request #1227 · puma/pu...

source link: https://github.com/puma/puma/pull/1227

Contributor

sirupsen commented on Feb 25, 2017


Have you ever wondered what happens when a client refreshes their browser really fast? According to Wireshark, Chrome (I haven't tested other browsers) will properly send an RST packet. At Shopify, we see this pattern a lot during flash sales where people are refreshing their browsers aggressively at the top of the hour pending a release.

This will propagate through your reverse proxies and finally update the state in the TCP backlog to a closing state. Your Ruby webserver, however, is less fortunate: it'll pick up the client, go through the Rack middleware stack, render the response, and write it to the client. Only then will it find out that it's been doing all of this on a one-directional socket, and raise an error like Errno::EPIPE. Below is a picture of this scenario (it uses Unicorn as the webserver, as this is where I was fixing this the first time around; replace it with Puma and it's the same):

[Image: a3160770-eeed-11e6-9e5e-2bdcdceab9f6.png]

When Req 2, which has been closed by the client, is accept(2)ed by Puma, it'll go through Puma's client handling and finally run @app.call. This could mean we're easily spending tens to hundreds of milliseconds rendering a response to a client whose connection is closed. This steals capacity from actual, legitimate users.

In 2012 Tom Burns submitted a patch to Unicorn addressing this problem. All credit to Tom Burns and Eric Wong for figuring this out 5 years ago. Interest was renewed recently internally because this patch doesn't work well for clients that are not on the loopback device. The link explains this in detail, so I won't duplicate it here, but feel free to ask questions! Eric and I collaborated on a new patch upstream that's pretty much identical to this one. It simply inspects the TCP state of the client socket it's accepting and aborts if it's in a closed or closing state. While we don't use Puma for Shopify Core, we happily use it in many other apps. I'd love to hear what you think about putting this patch in Puma as well. It's saved us from many DDoSes in the early days and saved us tremendous capacity during sneaker drops (and other sales).
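To make the idea concrete, here is a minimal sketch of that kind of check. It is not the code from this PR; the method names `closed_or_closing_state?` and `client_gone?` are made up for illustration. It assumes Linux, where `TCP_INFO` is available and the first byte of the returned struct is the connection state, and it simply reports "not gone" on platforms where that doesn't hold.

```ruby
require "socket"

# Linux TCP state numbers (an assumption of this sketch; other kernels differ):
# TIME_WAIT: 6, CLOSE: 7, CLOSE_WAIT: 8, LAST_ACK: 9, CLOSING: 11
CLOSED_OR_CLOSING_STATES = [6, 7, 8, 9, 11].freeze

# Pure check on a numeric TCP state.
def closed_or_closing_state?(state)
  CLOSED_OR_CLOSING_STATES.include?(state)
end

# Hypothetical helper: true if the kernel already considers the client
# connection closed or closing. Falls back to false when the state
# cannot be inspected (non-TCP socket, no TCP_INFO, syscall error).
def client_gone?(socket)
  return false unless socket.is_a?(TCPSocket)
  return false unless Socket.const_defined?(:TCP_INFO)

  info = socket.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_INFO)
  # On Linux, the first byte of struct tcp_info is tcpi_state.
  state = info.unpack("C").first
  closed_or_closing_state?(state)
rescue IOError, SystemCallError
  false
end
```

In an accept loop, the server would call something like `client_gone?(client)` right after `accept` and close the socket instead of handing it to the app when it returns true.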

The production symptom of this for us is that Nginx will emit a lot of 499s (client closed the connection), but Rails will emit 200s for these same requests. With a patch that aborts closed connections, we can subtract the throughput of 499s from legitimate status codes.

I tested this patch with a simple script that'll start a request and close it immediately. If you run this script against a Puma server without this patch, you'll see 10 requests going through. If you run it with this patch, no requests will go through.
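A script along those lines might look like the sketch below. This is not the original test script; the method name `fire_and_abort` and the host and port in the usage note are assumptions for illustration. It opens a connection, writes a minimal GET request, and closes the socket without waiting for a response.

```ruby
require "socket"

# Hypothetical helper: start `count` requests against host:port and
# close each connection immediately, without reading the response.
def fire_and_abort(host, port, count: 10)
  count.times do
    socket = TCPSocket.new(host, port)
    socket.write("GET / HTTP/1.1\r\nHost: #{host}\r\n\r\n")
    socket.close # the client is gone before the server has rendered anything
  end
  count
end
```

Running something like `fire_and_abort("127.0.0.1", 9292)` against a local server (the port is an assumption) and watching the server's request log should show whether the aborted requests still reach the app.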

