

The surprising complexity of interpreting X-Forwarded-For safely
source link: https://www.brainonfire.net/blog/2022/03/04/understanding-using-xff/
I've seen a lot of uncertainty and misunderstandings about how to handle IP addresses correctly when developing and operating a web service:
- What's my user's IP address?
- How do I use the X-Forwarded-For header?
- What's the difference between that and X-Real-IP or other HTTP headers?
This post explains the need for X-Forwarded-For (hereafter, "XFF"), provides a mental model for working with it, and then gives guidance on how to handle different situations.
I'll first cover why it exists, how to think about it, how to use it, and finally some alternative approaches that may be more appropriate.
(See the end for a summary.)
Why XFF?
I'll explain the purpose of this header by starting simple and then adding in the layers of complexity that lead to it being necessary.
If you're already intimately familiar with this header, feel free to skip ahead to the section "From XFF to IP chain".
Starting simple
In the simplest possible case, you just have a client and a server. This rarely happens on today's internet, but let's start there! The client is probably some person using a browser on their laptop, and the server is directly exposed to the internet. The server sees a request like this:
GET / HTTP/1.1
Host: www.example.com
Accept: text/html
...and it knows that the request came in on a TCP connection from IP address 1.2.3.4.
You can see that the HTTP request does not itself contain any information about the client IP address. Instead, the server has to look at the TCP connection the request came over.
We'll represent this as the following diagram, where "no headers" is short for "no IP address related headers":
client @ 1.2.3.4:
Sends: [no headers] to server
server:
Reads: [no headers] from 1.2.3.4
This is the world HTTP started in. A client, a server, nothing in between, nothing complicated.
But then, proxies
It turns out that this situation isn't good enough for a lot of people's needs. Maybe you want to have a load balancer in front of a collection of identical servers. The client sends requests to the load-balancer, and the load-balancer forwards them to whichever server seems least busy. This is a type of proxy. Here's what the chain of requests looks like:
client @ 1.2.3.4:
Sends: [no headers] to load-balancer
load-balancer @ 10.0.3.0:
Reads: [no headers] from 1.2.3.4
Sends: [no headers] to server
server:
Reads: [no headers] from 10.0.3.0
(A public load balancer might have a public IP, e.g. 3.3.3.3, and a private-range network address of 10.0.3.0. I'll be using the @ to indicate the IP address that the next node sees; here, the load balancer and server are on the same network.)
All the server can see is "10.0.3.0 is talking to me", and it never learns that the real client is at 1.2.3.4.
Oops! The proxy is just shuttling data back and forth, and hasn't altered the HTTP request, but by sitting in between the client and server it has ruined the server's ability to see who the real client is. Since all of the clients connect through the proxy, the server thinks they all have the same IP address. That might be a serious problem!
Proxies can send X-Real-IP
One option that people came up with is to have the proxy modify the HTTP request to inject a header bearing the "real" IP address: X-Real-IP. This is not a standardized header, but is quite commonly supported. Here's what it looks like:
client @ 1.2.3.4:
Sends: [no headers] to load-balancer
load-balancer @ 10.0.3.0:
Reads: [no headers] from 1.2.3.4
Sends: [X-Real-IP: 1.2.3.4] to server
server:
Reads: [X-Real-IP: 1.2.3.4] from 10.0.3.0
Now the server has two IP addresses to work with: 10.0.3.0 is who directly sent the request to the server, but that sender is also claiming that it only got the request from 1.2.3.4.
(Note that the load balancer never says "Hey, I'm 10.0.3.0": first because it's unnecessary, and second because a node can have multiple IPs and it's not clear which one it would even announce.)
And this works out pretty well for setups like this!
But then, two proxies
Where this falls apart is when you have two proxies. This is pretty common in cloud environments. In AWS, you might have a group of application servers that all accept HTTP requests directly, and then a load balancer in front. But that load balancer is only in one geographic region, so you'll also see a CDN put in front of that, mostly acting as a geographically distributed caching proxy.
(The load balancer and the CDN are both groups of servers, not individual nodes in a network, but any one request only passes through one server in each—so it's convenient and not really incorrect to describe a request as passing through "the" load balancer node and "the" CDN node. Just keep in mind that it's not the same one each time.)
Here's what this might look like:
client @ 1.2.3.4:
Sends: [no headers] to CDN
CDN @ 5.5.5.5:
Reads: [no headers] from 1.2.3.4
Sends: [X-Real-IP: 1.2.3.4] to load-balancer
load-balancer @ 10.0.3.0:
Reads: [X-Real-IP: 1.2.3.4] from 5.5.5.5
Sends: [X-Real-IP: 5.5.5.5] to server <-- information lost!
server:
Reads: [X-Real-IP: 5.5.5.5] from 10.0.3.0
See the problem? With X-Real-IP the server is only given information about the node directly on the other side of the load balancer, and doesn't learn about the real client. The information was present, but was then lost when the load balancer replaced the incoming X-Real-IP header with a new one.
X-Forwarded-For makes a chain
The most commonly adopted solution to this problem is to pass a chain of IPs in an HTTP header. Rather than replacing the existing header and losing the information, each proxy appends to a new header called X-Forwarded-For. (Or sets the header if it's not already present.) Here's how that looks in our CDN/load balancer scenario; note how the load balancer appends 5.5.5.5 to the value it received:
client @ 1.2.3.4:
Sends: [no headers] to CDN
CDN @ 5.5.5.5:
Reads: [no headers] from 1.2.3.4
Sends: [X-Forwarded-For: 1.2.3.4] to load-balancer
load-balancer @ 10.0.3.0:
Reads: [X-Forwarded-For: 1.2.3.4] from 5.5.5.5
Sends: [X-Forwarded-For: 1.2.3.4, 5.5.5.5] to server
server:
Reads: [X-Forwarded-For: 1.2.3.4, 5.5.5.5] from 10.0.3.0
Now the server has all the information it needs. It knows the request came from remote address 10.0.3.0 and that, if everyone followed the rules, the request was previously handled by 5.5.5.5 and before that by 1.2.3.4. (We'll see later that that's a big "if".)
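The difference between the two headers can be sketched in a few lines of Python. This is only an illustration of the hop-by-hop behavior described above, not any particular proxy's implementation; the function name forward_headers is hypothetical, and for comparison the sketch sends both headers on every hop, which the diagrams above do not.

```python
def forward_headers(incoming: dict[str, str], peer_ip: str) -> dict[str, str]:
    """Headers a well-behaved proxy sends onward for a request received from peer_ip."""
    outgoing = dict(incoming)
    # X-Real-IP is replaced on every hop, so any earlier value is lost.
    outgoing["X-Real-IP"] = peer_ip
    # X-Forwarded-For is appended to (or set if absent), so earlier values survive.
    prior = incoming.get("X-Forwarded-For")
    outgoing["X-Forwarded-For"] = f"{prior}, {peer_ip}" if prior else peer_ip
    return outgoing


# client 1.2.3.4 -> CDN 5.5.5.5 -> load balancer 10.0.3.0 -> server
sent_by_cdn = forward_headers({}, "1.2.3.4")
sent_by_load_balancer = forward_headers(sent_by_cdn, "5.5.5.5")

print(sent_by_load_balancer["X-Real-IP"])        # only the previous hop is left
print(sent_by_load_balancer["X-Forwarded-For"])  # the whole chain so far
```

After two hops, X-Real-IP names only the CDN while X-Forwarded-For still carries the original client address.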
But how should this information be used?
From XFF to IP chain
Note that the X-Forwarded-For header is by itself insufficient to reconstruct the path taken by the request. It must be combined with the remote address in order to get the fullest possible picture, and to handle the broadest range of situations. The XFF itself is incomplete.
To combine them, simply append the remote address onto the end of the XFF. For lack of a better term I'll be calling this concatenation the IP chain for the rest of the article.
The IP chain is very easy to construct. If you receive X-Forwarded-For: 1.2.3.4, 5.5.5.5 from 10.0.3.0 then the IP chain is 1.2.3.4, 5.5.5.5, 10.0.3.0.
A few things to note about it:
- If there's no X-Forwarded-For header, the IP chain is just the remote address, e.g. 1.2.3.4.
- If our server were itself to act as a proxy, the IP chain is precisely what it would send as an X-Forwarded-For header to the next node.
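As a minimal sketch of that concatenation, assuming the web framework hands you every X-Forwarded-For header line as a list of strings plus the connection's remote address (the function name build_ip_chain is hypothetical):

```python
def build_ip_chain(xff_values: list[str], remote_addr: str) -> list[str]:
    """Concatenate all X-Forwarded-For entries with the remote address, in order."""
    chain: list[str] = []
    # A request may carry several X-Forwarded-For lines; process them in the
    # order received so that no entries are silently dropped.
    for value in xff_values:
        for part in value.split(","):
            part = part.strip()
            if part:
                chain.append(part)
    # The remote address is the only entry the server observed for itself.
    chain.append(remote_addr)
    return chain


print(build_ip_chain(["1.2.3.4, 5.5.5.5"], "10.0.3.0"))
print(build_ip_chain([], "1.2.3.4"))
```

The entries are kept as raw strings here; nothing in the chain has been validated or trusted yet.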
A reverse linked list of trust
At this point it would be tempting to say:
Ah, so all the server needs to do now is just pluck out the first entry from the IP chain, and that's the client IP!
Unfortunately... the internet is a terrible place where people lie. If you're using the client IP to block offending users, the user can simply have their browser send a false initial header of X-Forwarded-For: 7.8.9.0. Now the request arrives at your server with an IP chain of 7.8.9.0, 1.2.3.4, 5.5.5.5, 10.0.3.0 and your server plucks out the 7.8.9.0, totally unaware that this is untrue. If this IP is blocked, the user can simply change the false value and try again.
This is why, in the general case, you can't just look at any one IP in the IP chain. Whatever your use-case is, your code needs to understand how to interpret the chain. This is all you can know for sure:
- The server knows it got the request from 10.0.3.0
- Each node in the chain claims it got the request from the previous node (and claims that it was sent all the IPs to the left of that)
The external chain
This means you need to start at the rightmost IP—the one you trust—and walk leftwards, and at some point make a decision of when to stop. Generally, this will happen at some trust boundary—the point at which you stop recognizing IP addresses of hosts that you trust to tell you good information. Beyond that point could be anything.
Everything to the right of this trust boundary is your infrastructure. And everything to the left is outside of your infrastructure, so I'll refer to it as the external chain.
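A minimal sketch of that right-to-left walk, using Python's standard ipaddress module. The CIDR ranges in TRUSTED_PROXIES are placeholders chosen to match the running example, not real CDN ranges, and the function names are hypothetical.

```python
import ipaddress

# Assumption: ranges covering your own load balancer and your CDN.
TRUSTED_PROXIES = [ipaddress.ip_network(cidr) for cidr in ("10.0.0.0/8", "5.5.5.0/24")]


def is_trusted(ip_text: str) -> bool:
    """True only if the entry parses as an IP inside a trusted proxy range."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # unparseable entries are never trusted
    return any(ip in network for network in TRUSTED_PROXIES)


def external_chain(ip_chain: list[str]) -> list[str]:
    """Walk leftwards from the remote address, discarding trusted entries."""
    boundary = len(ip_chain)
    while boundary > 0 and is_trusted(ip_chain[boundary - 1]):
        boundary -= 1
    # Everything left of the boundary is outside your infrastructure.
    return ip_chain[:boundary]


print(external_chain(["7.8.9.0", "1.2.3.4", "5.5.5.5", "10.0.3.0"]))
```

The walk stops at the first untrusted entry, so a spoofed entry further left that merely looks like one of your proxies is not discarded. An empty result means every hop was trusted, for example a request that originated inside your own network.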
What's in the external chain?
- In most circumstances, the external chain will contain a single IP, which will be the "real" client IP.
- It could contain a single IP which is the exit node of a VPN the client is using. (If you're using TLS, the VPN couldn't attach an X-Forwarded-For header even if it wanted to, due to the end-to-end encryption.)
- In some unusual cases, the external chain might have two elements: the real client IP, and then a corporate HTTP proxy that inspects all traffic flowing through it.
- And in all of these cases, the client can give the illusion of more IP addresses to the left side of the external chain, just by passing a spoofed XFF header.
In the general case, if you see an external chain of 7.8.9.0, 1.2.3.4, you won't be able to tell the difference between 1.2.3.4 being the client IP (with 7.8.9.0 being a spoofed XFF) and 7.8.9.0 being the client IP (and 1.2.3.4 being an HTTP proxy).
So, what do you do if there's an external chain with multiple IPs in it? It depends on your use-case. You'll want to use either the rightmost IP, leftmost IP, or entire chain, depending. And that's what the next section is about.
Using the external chain
Here are some examples of how to use the external chain, now that you have it:
IP allowlisting
Just to get some of the more exotic scenarios out of the way, I'll start with the most restrictive, paranoia-requiring case: IP allowlisting (whitelisting). This might come up if you are running a service and a customer only wants computers on their corporate network to be able to access resources on their account. They give you a list of CIDR ranges.
The algorithm here is pretty straightforward: Just use the rightmost IP in the external chain. For example, if the external chain were 7.8.9.0, 1.2.3.4, you want to compare 1.2.3.4 to the customer's IP ranges to make your decision.
(Maybe 7.8.9.0 is spoofed. Or perhaps it's not, and the user has to use some kind of mandatory SSL-stripping corporate proxy and 7.8.9.0 is the actual client, but you don't care: at 1.2.3.4 you've reached the boundary of the client's network, and it's their problem after this point.)
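A minimal sketch of that comparison, assuming the external chain has already been computed and the customer's ranges arrive as CIDR strings. The function name is hypothetical, and the choice to deny on an empty chain or an unparseable entry is an assumption that suits the strict nature of allowlisting.

```python
import ipaddress


def is_allowlisted(external_chain: list[str], customer_cidrs: list[str]) -> bool:
    """Check only the rightmost external IP against the customer's CIDR ranges."""
    if not external_chain:
        return False  # nothing outside our infrastructure to vouch for
    try:
        ip = ipaddress.ip_address(external_chain[-1])
    except ValueError:
        return False  # fail closed on anything that is not an IP address
    return any(ip in ipaddress.ip_network(cidr) for cidr in customer_cidrs)


print(is_allowlisted(["7.8.9.0", "1.2.3.4"], ["1.2.3.0/24"]))
print(is_allowlisted(["1.2.3.4", "7.8.9.0"], ["1.2.3.0/24"]))
```

In the second call the allowed address appears only in a position the client could have spoofed, so the request is denied.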
But... how paranoid are you? How strict does your security need to be? Probably the CDN you use can be used by other people too. Someone else can create a malicious CDN configuration with the same service that uses your load balancer as its origin but doesn't handle the X-Forwarded-For honestly; maybe the attacker has theirs set to drop the incoming XFF and send a falsified one:
...
attacker-CDN:
Reads: [no headers] from 6.6.6.6
Sends: [X-Forwarded-For: 1.2.3.4] to load-balancer
...
Now, with minimal cost, the attacker has bypassed your allowlisting protection. Defining a "trustworthy proxy" is harder when it's not a node you own and control, just rent from a pool.
In such a situation, you may need to check how your CDN handles XFF. Is it possible to override with configuration? Do you need to add a secret header that proves to you that the request came via your proxy and not someone else's? (What happens if they put a CDN proxy in front of yours and call through that? Will you make sure to only trust one layer of that CDN's IP ranges?)
So IP restriction is a case where you might decide not to use the IP chain at all. See later section "How to not use XFF".
With that out of the way, on to some more low-stakes examples.
Georestriction
With georestriction, the idea is that a service or resource is only permitted to be accessed by people from certain parts of the world.
This might be as specific as a city or as broad as a continent, but is often specified at the country level. Usually this is a contractual obligation; perhaps a publisher only has publishing rights on a certain continent, or a sports broadcaster is only allowing streaming access outside of the hosting country (where they want everyone to watch on cable or broadcast TV, perhaps.) It's understood that a certain amount of leakiness is OK, here; if a few people are watching a sports game from the wrong country, that's not a big deal.
Alternatively, there may be government restrictions such as embargo of advanced technical specifications to a country that OFAC is imposing relevant sanctions against. (This isn't my area, so I'm not sure how the stringency compares to contractual obligations.)
Luckily, as long as your IP-to-country-code lookup service is doing a good job, this is actually a pretty simple type of enforcement to do. It doesn't matter if someone's using a proxy (other than yours) in a blocked country while they are themselves in an allowed country, or vice versa; in either case the content is present in unencrypted form in a blocked country. So, the algorithm here is "deny if any":
- Take all of the IP addresses to the left of the trusted ones
- Look up the country code for each IP address
- If any IP address is in a blocked country, Deny
- Otherwise, Allow
There's some other nuance to consider:
- What will you do when an IP does not resolve to a country code? Do you default to deny or allow? (Depends on your business case.)
- Do you really want to look up all the IP addresses even if there's a very long chain? If someone sends in a request with an external chain containing 100 entries, probably most of that is fake; perhaps you want to only look up the first 3 or 4 untrusted IPs.
- It is sometimes possible to detect proxies that spoof the user's geographic location. See the appendix for more information.
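The "deny if any" steps above can be sketched as follows. The lookup_country callable stands in for whatever geo-IP service you use and is hypothetical, as are the parameter names. Capping lookups at the entries nearest the trust boundary is one reading of "the first 3 or 4 untrusted IPs", and the default of denying unknown countries is an assumption you should set per your business case.

```python
from typing import Callable, Optional


def georestriction_allows(
    external_chain: list[str],
    lookup_country: Callable[[str], Optional[str]],
    blocked_countries: set[str],
    max_lookups: int = 4,
    allow_unknown: bool = False,
) -> bool:
    """Deny if any examined external IP resolves to a blocked country."""
    if max_lookups < 1:
        raise ValueError("max_lookups must be at least 1")
    # Only examine the entries closest to the trust boundary; a very long
    # external chain is most likely padded with fake entries.
    for ip in external_chain[-max_lookups:]:
        country = lookup_country(ip)
        if country is None:
            if not allow_unknown:
                return False
        elif country in blocked_countries:
            return False
    return True


# Stand-in lookup table for illustration only.
FAKE_GEO = {"1.2.3.4": "AA", "7.8.9.0": "XX"}
print(georestriction_allows(["7.8.9.0", "1.2.3.4"], FAKE_GEO.get, {"XX"}))
```

Here the request is denied because one entry in the external chain maps to the blocked code, even though the rightmost entry does not.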
Localization
This is another geolocation use-case, but rather than blocking people based on their country, you're trying to guess their location so you can show them different content.
There are often better alternatives; if you want their preferred language or time zone, the browser can usually tell you that via a header or in a JavaScript API! And for many use-cases, you can simply ask them.
For localization, you're usually not in an adversarial scenario. Here, you want to use the leftmost IP in the external chain. It can be spoofed, but for localization you're unlikely to care about that.
Rate-limiting
Rate-limiting is used when there is an expensive, sensitive, or critical API endpoint and you want to restrict how frequently it can be called per-caller. You'll often see these on APIs in general to protect against poorly written clients, or on authentication endpoints to slow down attacks to a manageable rate. The requirement here is to produce a rate-limiting key that can be used to put requests into buckets, with each bucket allowed a certain rate of requests. For authenticated users this can be as simple as the account ID; for anonymous users you'll likely need to fall back to their IP address.
Generally this is going to be quite simple: Take the rightmost IP in the external chain. That's your rate-limiting key. (Why rightmost? Because rate-limiting is an adversarial use-case, and the chance of spoofed IPs is very high.)
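A minimal sketch of producing such a key. The function name and the "account:" and "ip:" prefixes are hypothetical, and falling back to the remote address when the external chain is empty is an assumption, not something the approach above prescribes.

```python
import ipaddress
from typing import Optional


def rate_limit_key(account_id: Optional[int], external_chain: list[str], remote_addr: str) -> str:
    """Bucket authenticated requests by account, anonymous ones by rightmost external IP."""
    if account_id is not None:
        return f"account:{account_id}"
    ip_text = external_chain[-1] if external_chain else remote_addr
    try:
        # Canonicalize so that different spellings of one address share a bucket.
        ip_text = str(ipaddress.ip_address(ip_text))
    except ValueError:
        pass  # keep the raw string; it still works as a bucket key
    return f"ip:{ip_text}"


print(rate_limit_key(None, ["7.8.9.0", "1.2.3.4"], "10.0.3.0"))
print(rate_limit_key(42, ["7.8.9.0", "1.2.3.4"], "10.0.3.0"))
```

The spoofable entry 7.8.9.0 never influences the key, so a client cannot escape its bucket by changing the header it sends.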
As you add and remove and change CDNs, you'll need to be careful to update your CIDR ranges of trusted proxies accordingly. For example, if you put a CDN in front and forget to update your configuration, everyone in a region might be assigned to the same rate-limiting bucket. You'll be getting requests with "external chains" like 1.2.3.4, 5.5.5.5 and 5.6.7.8, 5.5.5.5 and everyone will be seen as having IP address 5.5.5.5. People will get blocked, especially during high-traffic periods. Naturally, you'll want anomaly detection monitoring on your rate-limiting so that you're able to detect, revert, and fix such a misconfiguration quickly.
Audit logs
There are assorted other business needs where IP addresses are used to identify people. This is not reliable, since people share IP addresses and IP addresses move around. But because it at least mostly works, there can also be ethical (and regulatory) risks in storing this information, and it is best avoided if possible. Nevertheless, there are legitimate business cases for collecting and storing this information at times, usually relating to security:
- Audit logs (including simple access logs) that allow reconstructing someone's behavior over time, which can be useful in investigating a security incident
- Showing a user where they have active login sessions, e.g. if they have multiple sessions open that are apparently in different countries
For this, you want to use the whole external chain. Someone is going to be analyzing this data manually. Just store or display the whole list so they have all the information they can use, and they'll sort it out themselves, filtering as needed.
How to not use XFF
You may have noticed one recurring theme: It's very important to set the trust boundary correctly, but also difficult to do so in a robust way. Keeping an up-to-date list of CIDR ranges for each proxy requires another process that needs upkeep and monitoring. If the ranges go out of sync, your IP allowlisting might start denying everyone, or your georestriction might think every request involves one country. Not all CDNs or other proxies are even conducive to this kind of configuration.
There's an alternative: Hardcode the trust boundary. This can take several forms.
Fixed index
If you always know the number of proxies between you and the last external IP, you can just configure your application to always strip off the last N entries from the IP chain. The simplicity of this option is very alluring, but fragile in the face of network changes.
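A minimal sketch of the fixed-index approach, where trusted_hops is the hard-coded count of your own proxies (2 in the CDN plus load balancer example). The function name is hypothetical, and returning an empty list for a chain that is shorter than expected is an assumption; you might prefer to reject the request or raise an alert instead.

```python
def external_chain_fixed(ip_chain: list[str], trusted_hops: int) -> list[str]:
    """Strip a fixed number of trailing entries from the IP chain."""
    if trusted_hops < 0:
        raise ValueError("trusted_hops must not be negative")
    if trusted_hops >= len(ip_chain):
        # Shorter than expected: the network changed or a proxy was bypassed.
        return []
    return ip_chain[: len(ip_chain) - trusted_hops]


print(external_chain_fixed(["7.8.9.0", "1.2.3.4", "5.5.5.5", "10.0.3.0"], 2))
```

Note that nothing here checks who the stripped entries actually are, which is exactly why a network change silently breaks it.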
Set header at boundary
A more robust option is to configure your outermost proxy to set a header with the IP that is directly before it in requests. This is like X-Real-IP (it could even be X-Real-IP, for some deployments) but with the critical requirement that none of the intervening proxies will alter it.
For example, Cloudflare provides CF-Connecting-IP, a non-standard header that is specific to their service and that helps implement this strategy. Here's a worked example for a request in which the client is trying several methods of spoofing their IP, passing in faked headers:
spoofing-client @ 1.2.3.4:
Sends: [CF-Connecting-IP: 7.8.9.0; X-Forwarded-For: 7.8.9.0] to Cloudflare <-- spoofed headers
Cloudflare @ 5.5.5.5:
Reads: [CF-Connecting-IP: 7.8.9.0; X-Forwarded-For: 7.8.9.0] from 1.2.3.4
Sends: [CF-Connecting-IP: 1.2.3.4; X-Forwarded-For: 7.8.9.0, 1.2.3.4] to load-balancer <-- overwrites CF-Connecting-IP
load-balancer @ 10.0.3.0:
Reads: [CF-Connecting-IP: 1.2.3.4; X-Forwarded-For: 7.8.9.0, 1.2.3.4] from 5.5.5.5
Sends: [CF-Connecting-IP: 1.2.3.4; X-Forwarded-For: 7.8.9.0, 1.2.3.4, 5.5.5.5] to server
server:
Reads: [CF-Connecting-IP: 1.2.3.4; X-Forwarded-For: 7.8.9.0, 1.2.3.4, 5.5.5.5] from 10.0.3.0
Here, Cloudflare maintains the X-Forwarded-For header as usual, but it also sets the new CF-Connecting-IP header. Critically, it throws away any existing value for that header, since it comes from outside the trust boundary. Equally critically, no later proxy alters that header.
Again, this is what the server sees:
- IP chain: 7.8.9.0, 1.2.3.4, 5.5.5.5, 10.0.3.0
- Additional CF-Connecting-IP value of 1.2.3.4
Here, Cloudflare has done the work of identifying the trust boundary. If you need to construct the external chain, the trust boundary sits immediately to the right of the last instance of the CF-Connecting-IP value in the IP chain. Walk leftwards as before until 1.2.3.4 is reached, discarding 10.0.3.0 and then 5.5.5.5 in turn:
7.8.9.0, 1.2.3.4, 5.5.5.5, 10.0.3.0 (full IP chain)
7.8.9.0, 1.2.3.4, 5.5.5.5 (10.0.3.0 discarded)
7.8.9.0, 1.2.3.4 (5.5.5.5 discarded)
Therefore, the external chain is 7.8.9.0, 1.2.3.4, and it may be used in all the ways described above.
(Of course, you can also take shortcuts here; any code requiring the leftmost IP can take it from the original IP chain. And any code requiring the first untrusted IP can use the CF-Connecting-IP without looking at the chain at all. Other use-cases need to construct the chain.)
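A minimal sketch of locating the boundary from such a header. The function name is hypothetical. Addresses are parsed with Python's ipaddress module before comparison so that different textual spellings of one address still match; returning None when the header value is absent from the chain is an assumption about how you want to signal that case.

```python
import ipaddress
from typing import Optional


def external_chain_from_boundary(ip_chain: list[str], boundary_ip: str) -> Optional[list[str]]:
    """Cut the IP chain just after the last entry equal to the boundary header's IP."""
    try:
        target = ipaddress.ip_address(boundary_ip)
    except ValueError:
        return None
    # Walk leftwards from the remote address to find the LAST matching entry.
    for index in range(len(ip_chain) - 1, -1, -1):
        try:
            candidate = ipaddress.ip_address(ip_chain[index])
        except ValueError:
            continue  # skip junk entries; they cannot be the boundary
        if candidate == target:
            return ip_chain[: index + 1]
    return None  # header and chain disagree; treat as a misconfiguration


chain = ["7.8.9.0", "1.2.3.4", "5.5.5.5", "10.0.3.0"]
print(external_chain_from_boundary(chain, "1.2.3.4"))
```

Searching from the right matters: a client that spoofs its own address into the X-Forwarded-For header would otherwise move the boundary leftwards.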
Caveats with this approach:
- I'm not sure which CDNs offer a feature like this.
- Ensure that your CDN properly drops and replaces any existing incoming header of that name, including any case variations.
- If anyone can bypass your CDN and make requests directly to your application or load-balancer, they can claim to be the trust boundary by sending in their own spoofed header. This failure mode is true for many of the approaches listed in this article.
- Finding the matching IP in the list may require canonicalizing the IP addresses for comparison. Straight string comparisons may result in bugs or vulnerabilities.
Fragility is inherent
All of the techniques listed here have in common that they are fragile in the face of uncoordinated network changes, or even coordinated ones. If a proxy in front of your service is added or removed, code dealing with IP addresses is likely to break. There are several things you can do to help mitigate this:
- Monitoring: If you have anomaly detection set up to alert you on sudden changes in rate-limiting, georestriction, and allowlisting denials, you're in a much better position to detect an uncoordinated change (at the possible expense of false positives). In particular, I would suggest monitoring the average length of the calculated external chain for some high-traffic but low-sensitivity API call, which should quickly alert you to changes without incurring false positives during attacks.
- Rotation support: When changing your network configuration, you may need a way for the same server to support multiple configurations, e.g. custom headers from two different CDNs as you switch from one to the other. Being able to configure a list of custom headers (as a fallback cascade) gives you the freedom to make this change. The ability to fall back to the plain X-Forwarded-For and remote address may also be useful for some installations.
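A fallback cascade of custom headers might be sketched like this. The function name is hypothetical, and so is the header name X-New-CDN-Client-IP, which stands in for whatever the second CDN sets. The sketch assumes headers arrive as a simple name-to-value mapping; a None result means the caller should fall back to X-Forwarded-For and the remote address.

```python
from typing import Mapping, Optional, Sequence


def boundary_ip_from_headers(
    headers: Mapping[str, str], candidate_names: Sequence[str]
) -> Optional[tuple[str, str]]:
    """Return (header name, value) for the first configured header that is present."""
    # Header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in candidate_names:
        value = lowered.get(name.lower(), "").strip()
        if value:
            return name, value
    return None


CASCADE = ["X-New-CDN-Client-IP", "CF-Connecting-IP"]  # first entry is hypothetical
print(boundary_ip_from_headers({"cf-connecting-ip": "1.2.3.4"}, CASCADE))
```

Every header in the cascade is only as trustworthy as the proxy that overwrites it, so a header your current outermost proxy does not overwrite should be removed from the list once the switch is complete.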
Summary
- If you concatenate the X-Forwarded-For header and the remote address of the HTTP connection (called remote_addr in some systems) you get a list of IP addresses. Both parts must be included.
- This list is a chain that must be walked from right to left (i.e. backwards).
- While walking leftwards, discard each IP address you encounter that you recognize and trust.
- The remaining list can then be used, but usage depends on context.
- There are alternative ways to produce the IP list, but all methods require you to know your network configuration.
Appendix: Proxy detection
I'd like to include some additional information on the topic of proxy detection. People will often evade georestriction by using VPNs with exit nodes in different countries, and service providers sometimes want to detect and block these, or at least treat them as "unknown country". I've worked with a vendor offering a detection service for proxies and they did a kind of miserable job of it, blocking a number of IPs that were not HTTP proxies at all. I learned a good deal from this experience and can offer some tips.
A mistake I've seen multiple vendors make is blocking all Tor entrance and middle relays rather than just Tor exit nodes. This ends up blocking a lot of people and data centers that are contributing to Tor but not in a way that sends traffic to your site. ("Guard" and "middle" relays only relay traffic to other Tor nodes, not to you. If you get traffic from one, it's not Tor traffic.) So if you need to block Tor, skip the vendor and just periodically download Tor's own exit list. It's far more reliable. (This is also a great test of your vendor: If they block a larger list than this when claiming to block Tor, they're probably screwing up in less obvious ways too.) However, also keep in mind that people in some countries need to use Tor to avoid censorship—that's largely what it's designed for, after all—so keep a light touch here if you can, for humanitarian reasons.
Beyond Tor, there are a great many private VPN services, and there's a cat and mouse game between VPNs and VPN-detectors. It's not possible to do this perfectly, so on this front you shouldn't have very high expectations. And as IP addresses get reassigned, you're not only going to miss a bunch of geo-spoofing proxies, you're also going to block some legitimate traffic. You'll also block people who are using VPNs but just for privacy, without the intent of spoofing their location. I don't have much experience with VPNs, but my suspicion is that VPN services will tend to choose exit nodes near the customer for improved latency, meaning that a great many of these people will have their IP hidden, but will be in the same country as their exit node. They may be very confused when they get blocked by georestriction despite being in the "right" country. (You may wish to return distinct errors for "wrong country" and "proxy blocked", at the very least.)
Be prepared to deal with the customer service load. But not only that: Depending on your business, you may need to make a "back door" that will allow people access in spite of what your geo-IP or proxy-detection service claims.
Updates
- 2022-03-31: Go check out Adam Pritchard's "The perils of the “real” client IP", which by chance came out at the same time. It covers some of the same advice, but has a strong focus on rate-limiting and covers topics that I didn't get into or didn't think of—special challenges of IPv6 in rate-limiting, mishandling of multi-valued headers (both in proxies and in standard libraries), and actual examples of widely used software and services that do the wrong thing and make it harder to secure your service. (Akamai gets a special mention as being particularly badly behaved.) I knew the situation was bad, but this was eye-opening as to the wide variety of kinds of bad that are present in the IP determination and ratelimiting space.
©2022 2U (my employer) but published here under Apache License 2.0. However, claims and opinions are my own.