Email explained from first principles at SSHFP resource record
source link: https://explained-from-first-principles.com/email/#sshfp-resource-record
This article and its code were first published on 7 May 2021.
If you are visiting this website for the first time, then please first read the front page, where I explain the intention of this blog and how to best make use of it. As far as your privacy is concerned, all data entered on this page is stored locally in your browser unless noted otherwise. While I researched the content on this page thoroughly, you take or omit actions based on it at your own risk. In no event shall I as the author be liable for any damages arising from information or advice on this website or on referenced websites.
Preface
Being one of the oldest services on the Internet, email has been with us for decades and will remain with us for at least another decade. Even though email plays an important role in everyday life, most people know very little about how it works. Before we roll up our sleeves and change this, here are a few things that you should know:
- This article covers all aspects of modern email. As a result, it became really long. While later chapters do build on earlier ones, you can start reading wherever you want and fill your knowledge gaps as you go.
- This article is structured as follows: After clarifying some user-facing concepts, we’ll look at the technical architecture of email and the roles of the various entities. We’ll then study the protocols used by these entities to communicate with one another and the format of the transmitted messages. Once we understand how email works, we’ll discuss its privacy and security issues and examine how some of the security issues are fixed by more recent standards.
- Among many other things, you will learn in this article why mail clients use outgoing mail servers, why SMTP is used for the submission and the relay of messages, how mail loops are prevented, and how you should configure your custom domains.
- Even if you’re not interested in email, this article can teach you a lot about Internet protocols and IT security. For example, it covers Implicit and Explicit TLS; password-based authentication mechanisms with hash functions, replay attacks, encryption mechanisms, and channel bindings; internationalized domain names with Punycode encoding, Unicode normalization, case folding, and homograph attacks; transport security with DANE and HSTS; and end-to-end security with S/MIME and PGP.
- If you haven’t done so already, read the article about the Internet first. This article assumes that you’re familiar with the following acronyms and the concepts behind them: RFC, IP, TCP, TLS, DNS, and DNSSEC.
- This article contains 29 tools. To make it easier to play around with them, I’ve published them on a separate page as well.
- This article focuses on how modern email works, not on how you set up your own email infrastructure. If you want to do that, Mail-in-a-Box seems like a good place to start.
- During my research for this article, I made responsible disclosures to Gandi, Microsoft, and Mozilla Thunderbird. I also submitted quite a few RFC errata.
Concepts
Before diving into the technical aspects of email, let’s first look at email from the perspective of its users.
Message
The purpose of email is to send messages over the Internet. A message is a recorded piece of information which is delivered asynchronously from a sender to one or several recipients. Asynchronous communication means that a message can be consumed at an arbitrary point after it has been produced, rather than having to interact with the sender concurrently. A message can be transmitted with a physical object, such as a letter, or with a physical signal, such as an acoustic or electromagnetic wave. While humans have delivered messages in the form of objects for millennia with couriers and pigeons, it’s only since the invention of the optical telegraph in the late 18th century and the invention of the electrical telegraph in the middle of the 19th century that we can signal arbitrary messages over long distances. The fundamental principle of communication stayed the same over all those years: You can either start a new conversation or continue an existing one by replying to a previous message.
Mailbox
A mailbox is a box for incoming mail (also called an inbox), into which everyone can deposit messages but ideally only the intended recipient can retrieve them. In some countries, the privacy of such messages is legally protected by the secrecy of correspondence.
Provider
There are three things that set email apart from the traditional postal system, which is sometimes also referred to as snail mail:
- Email conveys digital data, whereas a letter is an analog item. The former is much more useful for further processing.
- Email enables instant global delivery at a marginal cost of zero. The only fee you pay is for your access to the Internet.
- Mailboxes for email are provided and operated by companies, which are called email service providers. While you could operate your own server since email is an open and decentralized system, this is rarely done in practice for reasons we discuss later on.
Address
Email addresses are used to identify the sender and the recipient(s) of a message. They consist of a username followed by the @ symbol and a domain name. The domain name allows the sender to first determine and then connect to the mail server of each recipient. The username allows the mail server to determine the mailbox to which a message should be delivered. The hierarchical Domain Name System ensures that the domain name is unique, whereas the email service provider has to ensure that the name of each user is unique within its domain. There doesn’t have to be a one-to-one correspondence between addresses and mailboxes: A mailbox can be identified by several addresses, and an email sent to a single address can be delivered to multiple mailboxes.
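As a minimal sketch of this structure, the following Python function splits an address into its username and its domain name. The function name `split_address` and the example addresses are made up for illustration. Splitting at the last @ symbol is a simplification that ignores most of the full address syntax.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an email address into (username, domain) at the last @ symbol."""
    username, at, domain = address.rpartition("@")
    if not at or not username or not domain:
        raise ValueError(f"not a valid email address: {address!r}")
    # Domain names are case-insensitive. The username is left untouched
    # because only the receiving mail server decides how to interpret it.
    return username, domain.lower()


username, domain = split_address("alice@Example.org")
```

The sender only needs `domain` to find the mail server, whereas `username` is passed on unchanged so that the mail server can determine the mailbox.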
Display name
How Apple Mail shows the display name in the To and From fields, if you have Smart Addresses disabled, which you totally should.
The @ symbol
Normalization
Subaddressing
Go to the Accounts and Import tab of your settings and click on “Add another email address” under “Send mail as”. Afterwards, enter the preferred display name and subaddress in the new window. You can leave the box “Treat as an alias” checked. (In either case, Gmail asks the recipient to reply to your subaddress, while the main address is used in the Return-Path header field.)
Click on the button “Next Step” and you’re done. You can now select a different From address the next time you compose a message.
Alias address
Mailing list
Address syntax
[Figure: “What a standard allows” compared with “What is actually being used”.]
Often only a subset of a standard finds adoption, while some things become convention without a formal standard.
Common addresses
Most of these addresses are encouraged by RFC 2142: “Mailbox names for common services, roles, and functions”. Role-based addresses are usually configured as aliases so that incoming emails can be forwarded to several people.
Recipients
You can address the recipients of a message in three different ways:
- The To field contains the address(es) of the primary recipient(s). As a sender, you expect the primary recipient(s) to read and often to react to your message. The expected reaction can be a reply or that they perform the requested task.
- The Cc field contains the address(es) of the secondary recipient(s). As a sender, you want to keep the secondary recipient(s) informed without expecting them to read or react to your message. (Cc stands for carbon copy.)
- The Bcc field contains the address(es) of the hidden recipient(s). Their address(es) are not to be revealed to other recipients of the message. The field is usually fully preserved in your folder of sent messages but fully removed in the version of the email that is delivered to others. Alternatively, a different message could be delivered to each hidden recipient where their address alone is listed in the Bcc field. The standard also allows hidden recipients to see each other; they just have to be removed for the primary and secondary recipients. The vague semantics of this feature leads to several problems. (Bcc stands for blind carbon copy.)
Important: Just because someone is listed as another recipient doesn’t mean that they received the same message as you. The reason for this could be innocuous or malicious. On the one hand, it may be that the email could simply not be delivered to them. On the other hand, the sender might have delivered the message only to you in order to mislead you. Your email service provider has no way of verifying that the same message has also been delivered to the other recipients. This allows a fraudster to fake a relationship that they do not have or to lead you to believe that they have done the introduction you asked them for, even when this is not the case. If you reply to all, your reply would also be sent to the faked recipients, of course.
Group construct
Sender
There are two relevant fields to indicate the originator of a message:
- The From field contains the address of the person who is responsible for the content of the message.
- The Reply-To field indicates the address(es) to which replies should be sent. If absent, replies are sent to the From address.
Important: The core email protocols do not authenticate the sender of an email. It’s called spoofing when the sender uses a From address which doesn’t belong to them. Forged sender addresses are a huge problem for the security of email. There are additional standards to authenticate emails. For them to have the desired effect, though, both the sender and the recipients have to use them.
Sender field
No reply
Subject
The Subject field identifies the topic of a message. Its content is restricted to a single line but the line can be of arbitrary length. (We’ll talk about encoding later.) RFC 5322 also defines other informational fields, namely Comments and Keywords, but I’ve never seen them being used. All informational fields are optional, which means an email doesn’t need a subject line. The mail clients I’ve checked, though, include the Subject field even when it’s empty. While the message is transmitted with an empty Subject field, mail clients usually display “(No subject)” instead of nothing.
Prefixes
Last but not least, an email has a body (which is strictly speaking optional). The body contains the actual content of a message. It can be formatted in different ways and can consist of different parts. Splitting the body into several parts is useful, for example, to send a plaintext version alongside an HTML-encoded message or to attach files to an email. We’ll discuss later how all of this works.
Size limit
Architecture
There are four separate aspects to understand email from a technical perspective:
- Format: What is the syntax of email messages?
- Protocols: How are these messages transmitted?
- Entities: Who transmits these messages to whom?
- Architecture: How are these entities arranged?
Let’s go through them one by one in the opposite order.
Simplified architecture
One reason why email is so hard to grasp is that the official terminology is unnecessarily complicated in most circumstances. Throughout this article, we’ll work with a much simpler version. Email follows the client-server model: A client opens a connection to a server in order to request some service. In all the graphics where arrows represent an exchange of data, the arrows point from the client to the server; i.e. in the direction of the request, not the response. The following entities and protocols are involved in the transmission of a message from a sender to a recipient:
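[Figure: Mail client of sender, outgoing mail server of sender, incoming mail server of recipient, and mail client of recipient, connected by SMTP for message submission, SMTP for message relay, and IMAP or POP3 for message retrieval. The sender’s mail client also uses IMAP for message storage on the incoming mail server of the sender. The outgoing mail server of the recipient is shown as well.]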
The simplified email architecture. We’ll discuss each entity in the next section and the protocols thereafter.
Standardization
[Figure: a server, a client, and a user on the side of the sender and on the side of the recipient.]
How emails are submitted and accessed (in blue) is independent from how emails are exchanged between servers (in green).
Webmail
[Figure: Web (web server, web browser, user) combined with Mail (mail server, mail client, user) results in Webmail (mail server, mail client, web server, web browser, user).]
In the case of webmail, the mail client is accessed via a web server using a web browser.
[Figure: Web (web server and web browser exchanging data and code) versus Mail (mail server and mail client exchanging only data).]
On the left, the code to interact with the data comes from the server. On the right, the logic is inside the client and only data is exchanged.
Official architecture
For the sake of completeness and to enable you to understand the linked articles, this subsection covers the official terminology as used, for example, in RFC 5598. In the official documents, there are five instead of three entities, with each of them having a more complicated name and, of course, an associated three-letter acronym (TLA):
| TLA | Name | Description |
| --- | --- | --- |
| MUA | Mail user agent | Client to compose, send, receive, and read emails, such as Microsoft Outlook, Apple Mail, and Mozilla Thunderbird. |
| MSA, *MSS* | Mail submission agent, *Mail submission server* | Server to receive outgoing emails from authenticated users and to queue them for delivery by the mail transfer agent (MTA). |
| MTA | Mail transfer agent | Server to deliver the queued emails and to receive them on the other end. It then forwards the received emails to the mail delivery agent (MDA). |
| MDA | Mail delivery agent | Server to receive emails from the local mail transfer agent (MTA) and to store them in the message store (MS) of the recipient. |
| MS, *MAS* | Message store, *Mail access server* | Server to store the emails received from the mail delivery agent (MDA) and to deliver them to the mail user agent (MUA) of the recipient. |
The terminology used by the Internet Engineering Task Force (IETF) in its official documents, such as this one. The terms in italics are used in some newer documents, such as this one. I added them because I like them better.
These terms are not as precise as they seem to be and the boundaries are often fluid in practice. Having more entities also changes the architecture. What follows is a nicer version of this ASCII graphic, which is a masterpiece to be appreciated in its own right.
[Figure: on the side of the sender and on the side of the recipient, the entities Mail user agent (MUA), Mail submission agent (MSA), Mail transfer agent (MTA), Mail delivery agent (MDA), and Message store (MS).]
The official Internet Mail Architecture with SMTP connections in green and IMAP connections in blue.
None of the servers have to be a single machine. In addition, the incoming MTA and the outgoing MTA don’t have to be the same.
Entities
There are three entities in the simplified architecture: the mail client, the outgoing mail server, and the incoming mail server.
Mail client
The mail client is a computer program to compose, send, retrieve, and read emails. It provides the interface through which users handle email. The mail client runs either locally on the user’s device or remotely on a web server. Examples of the former kind are Microsoft Outlook, Apple Mail, and Mozilla Thunderbird. Examples of the latter are Google Gmail and Yahoo! Mail when accessed through a web browser. (Both companies also provide mobile apps for Android and iOS, which fall into the former category.)
The mail client connects to the outgoing mail server to submit messages for delivery to other users and to the incoming mail server to fetch new messages from the user’s mailbox. Both servers authenticate the user, typically with a username and a password. The mail client connects to the incoming mail server through a different interface than outgoing mail servers do, which can be seen on the recipient’s side of the simplified mail architecture:
[Figure: the simplified email architecture with the mail client of the sender, the outgoing mail server of the sender, the incoming mail server of the recipient, and the mail client of the recipient, connected by SMTP for message submission, SMTP for message relay, and IMAP or POP3 for message retrieval.]
The recipient’s mail client connects to the incoming mail server using a different port and protocol than outgoing mail servers. It’s usually also a different machine with a different domain name and IP address than the one outgoing mail servers connect to.
This distinction is apparent in the official mail architecture, where the message store (MS) and the mail transfer agent (MTA) reside in different boxes. By giving the impression that the incoming mail server is a single machine, the simplified model doesn’t explain why the incoming mail server needs to be configured in the mail client of its user but not in the outgoing mail servers of other users. Since the simplified architecture is less confusing in every other regard, it’s still the preferred model for the scope of this article.
Configuration
The simplified email architecture corresponds to what mail clients like Apple Mail display to you.
The domain of the address (ef1p.com) is different from the domain of the servers (mail.gandi.net). The host names of the incoming mail server and the outgoing mail server are usually not the same.
Custom domains
Autoconfiguration
Insert the appropriate Domain and use 0 0 0 . for all the services which are not supported by your email service provider.
Configuration database
Outgoing mail server
The outgoing mail server accepts messages from mail clients and queues them for delivery. It then determines the incoming mail server of each recipient and delivers the message to them. The outgoing mail server acts as a server in the interaction with mail clients but assumes the role of a client when relaying the message to incoming mail servers. (Connections are always initiated by clients.) If the outgoing mail server cannot deliver a message, it sends a bounce message to the user who submitted the message. While the outgoing mail server should not change the content of a message, it adds information about the submitter at the top. Before accepting a message, the outgoing mail server authenticates the user, typically based on a username and a password.
Why do we need outgoing mail servers when mail clients could simply deliver the messages directly?
[Figure: the simplified email architecture with an additional connection from the mail client of the sender to the incoming mail server of the recipient, labeled “SMTP for direct message delivery?”.]
A hypothetical email architecture without outgoing mail servers.
[Figure: the simplified email architecture with user authentication between mail clients and mail servers and domain authentication between the outgoing mail server of the sender and the incoming mail server of the recipient.]
The incoming mail server verifies that the outgoing mail server is authorized to send messages from the claimed domain, while the outgoing mail server of the sender ensures that each user uses their own address in the From field.
How to avoid submitting the same message to both the outgoing mail server and the incoming mail server?
[Figure: the simplified email architecture.] The mail client submits the same message to both the outgoing mail server and the incoming mail server.
[Figure: mail client, incoming mail server, and outgoing mail server with the steps 1. Submit and 2. Store.] Gmail automatically stores sent messages.
[Figure: mail client, incoming mail server, and outgoing mail server with the steps 1. Store and 2. Submit.] The Courier IMAP server can deliver emails.
[Figure: mail client, incoming mail server, and outgoing mail server with the steps 1. Store, 2. Reference, and 3. Fetch.] The lemonade profile enables the outgoing mail server to fetch content for delivery from the incoming mail server.
Incoming mail server
The incoming mail server waits for connections from outgoing mail servers of other users. When an outgoing mail server connects to transmit a message, the incoming mail server records the message together with other information from the session, such as the sender’s IP address. The incoming mail server can reject the incoming message for a number of reasons: The recipient might not exist, their mailbox might be full, the message might be too long, or the sender might not be trusted. If the message is rejected, the outgoing mail server can either try to retransmit it at some later point or inform the user about the failed delivery. If, on the other hand, the incoming mail server accepts the message, it also assumes responsibility for delivering the message. If it fails to do so, for example when the message needs to be forwarded, then the incoming mail server should notify the author of the message.
Once the session with the outgoing mail server is over, the incoming mail server adds the additional information collected during the session to the top of the accepted message. It then evaluates whether the message is likely spam. Depending on the score of this evaluation, the message is either delivered to the recipient’s inbox, quarantined to the recipient’s spam folder, or discarded without notifying the author. While the last option violates the principle that mail is either delivered or returned, the alternative is often worse. This is why the standard explicitly allows incoming mail servers to drop received messages silently. If the receiving address is an alias, the incoming mail server forwards the message to the configured email address instead of delivering it to an inbox. In case the address denotes a mailing list, the incoming mail server sends the message to all subscribers of the list. The incoming mail server also applies filters and generates automatic responses, such as delivery failures and out-of-office replies.
The incoming mail server waits for connections from mail clients on a different interface. In order to access the mailbox of its user, the mail client has to present appropriate credentials. The user’s email address and password are often used to authenticate the client, which is granted unlimited access to the mailbox on success. If the incoming mail server supports OAuth, the mail client can present an access token to gain potentially limited access to the user’s mailbox. The scopes offered by Gmail are an example of what limited access can look like. While restricted authorization is common for other services, it’s not yet the norm for email. Once the client is authenticated, it can retrieve, deposit, and delete messages. It can also mark them as read or flag them for later attention.
Address resolution
How do outgoing mail servers find the incoming mail server of a recipient?
As we learned above, an email address consists of a username and a domain name, separated by the @ symbol. A sender finds the incoming mail server of a recipient by querying the Domain Name System (DNS) for mail exchange (MX) records of the used domain name. If no such records exist, the sender queries for address records (A or AAAA) of the domain name instead. If the DNS response is not authenticated with DNSSEC, mail might be sent to the server of an attacker. TLS can prevent this only if the sender requires that the recipient’s domain is included in the server certificate, which is usually not the case. A standard for securing MX records with TLS exists, though.
A domain can list several servers that handle incoming mail. MX records assign a priority to each incoming mail server. The lower the number, the higher its priority. This is useful for providing redundancy in case the most preferred server is not responding. Several servers with the same priority can be used for load balancing.
You can use the following tool to look up the incoming mail servers of a domain you are interested in. It uses an API by Google to query the Domain Name System and an API by ipinfo.io to determine the geographic location of each server. The latter is just to remind you that the Internet is a physical infrastructure. Outgoing mail servers need to know only the IP address of the incoming mail server, of course. (A remark on the subdomains you might encounter: spool is a synonym for buffer/queue, fb probably stands for fallback and alt for alternative.)
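The selection logic described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function `order_mail_servers` and the host names are hypothetical, the MX records are passed in as (priority, host) pairs instead of being queried from the DNS, and shuffling is just one possible way to balance load among servers with the same priority.

```python
import random


def order_mail_servers(domain, mx_records, rng=None):
    """Return the hosts to try for `domain`, most preferred first.

    `mx_records` is a list of (priority, host) pairs taken from MX records.
    """
    if not mx_records:
        # No MX records: fall back to the address records of the domain itself.
        return [domain]
    rng = rng or random.Random()
    by_priority = {}
    for priority, host in mx_records:
        by_priority.setdefault(priority, []).append(host)
    ordered = []
    for priority in sorted(by_priority):  # the lower the number, the higher the priority
        hosts = by_priority[priority]
        rng.shuffle(hosts)  # spread the load across equally preferred servers
        ordered.extend(hosts)
    return ordered


servers = order_mail_servers(
    "example.com",
    [(20, "fb.mail.example.com"), (10, "mx1.example.com"), (10, "mx2.example.com")],
)
```

A sender would then try the returned hosts in order until one of them accepts the connection.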
Null MX record
Dotless domains
Name collisions
Protocols
The above entities communicate with two kinds of protocols: They use delivery protocols to deliver messages and access protocols to access the user’s mailbox. As discussed earlier, only SMTP for message relay is mandatory. All other protocols can be replaced in a proprietary setup. For example, there are efforts to combine message submission and mailbox access in a standardized way.
[Figure: the simplified email architecture with the mail client of the sender, the outgoing mail server of the sender, the incoming mail server of the recipient, and the mail client of the recipient, connected by SMTP for message submission, SMTP for message relay, and IMAP or POP3 for message retrieval.]
The simplified email architecture with delivery protocols in green and access protocols in blue.
Use of TLS
Historically, SMTP, POP3, and IMAP ran directly on top of the transport layer using the Transmission Control Protocol (TCP), which means that the communication was neither encrypted nor authenticated. Anyone with access to one of the networks through which the communication was routed could therefore read and potentially alter your messages. Even your user password might have been transmitted in the clear. In theory, the solution is straightforward: Use Transport Layer Security (TLS) to encrypt and authenticate the communication between each pair of entities. In practice, however, you want to be backward compatible: A server that expects requests to be in a specific format cannot suddenly handle a request for a TLS handshake. There are two ways around this problem:
- Implicit TLS: Introduce a new port number for each service on which the communication starts directly with a TLS handshake. The protocol variant which uses TLS implicitly is denoted by appending an S to its name. For example, IMAP becomes IMAPS.
- Explicit TLS or STARTTLS, sometimes mistakenly called opportunistic TLS: Allow the client to upgrade an insecure connection to a secure connection with a command once the server has indicated that it supports TLS. The communication is secured only if the client requests this explicitly. The server cannot require the upgrade to TLS as this would break backward compatibility.
With one notable exception, most longstanding email protocols were adapted to support both Implicit TLS and Explicit TLS.
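To make the difference concrete, here is a sketch of both approaches for message submission with Python’s `smtplib` module. The function names are made up, and the default ports 465 and 587 are the usual submission ports for Implicit TLS and Explicit TLS. The functions require a reachable mail server, so they are only defined here and not called.

```python
import smtplib
import ssl


def connect_implicit_tls(host: str, port: int = 465) -> smtplib.SMTP_SSL:
    """Implicit TLS: the TLS handshake happens before any SMTP command."""
    context = ssl.create_default_context()
    return smtplib.SMTP_SSL(host, port, context=context)


def connect_explicit_tls(host: str, port: int = 587) -> smtplib.SMTP:
    """Explicit TLS: start in cleartext, then upgrade with STARTTLS."""
    context = ssl.create_default_context()
    client = smtplib.SMTP(host, port)
    client.ehlo()
    if not client.has_extn("starttls"):
        # Refuse to continue in cleartext if the upgrade is not offered.
        client.close()
        raise RuntimeError("server does not offer STARTTLS")
    client.starttls(context=context)
    client.ehlo()  # the server may announce different capabilities after the upgrade
    return client
```

Note that the check for the STARTTLS extension happens over the still unprotected connection, which is why an attacker in the network could remove the announcement. The second function fails in this case instead of silently falling back to cleartext.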
Implicit TLS versus Explicit TLS
TLS settings in mail clients
Mail clients often use other names for Implicit TLS and Explicit TLS.
The server settings in Apple Mail when “Automatically manage connection settings” is disabled. Also somewhat disappointingly, Apple Mail uses Explicit TLS rather than Implicit TLS by default.
Encryption on the Web
Deployment statistics
Port numbers
Every protocol specifies a default port on which servers listen for incoming requests. Instead of scattering the port numbers used by various email protocols throughout the following subsections, here is a table with all the relevant information for future reference:
| Protocol | Port for Implicit TLS | Port for Explicit TLS |
| --- | --- | --- |
| SMTP for Submission | 465 | (587) |
| SMTP for Relay | – | 25 |
| POP3 | 995 | (110) |
| IMAP | 993 | (143) |
| JMAP via HTTPS | 443 | – |
| ManageSieve | – | 4190 |
The port numbers used by the various email protocols.
Since RFC 8314, Implicit TLS is the preferred option and cleartext is considered obsolete on the port for Explicit TLS.
Why has SMTP for Relay no port for Implicit TLS?
Delivery protocols
Submission versus relay
Header fields and body
We’ll have a closer look at the format of messages in the next section but, because we already want to transmit messages in this section, we have to cover the basics now. A message consists of several header fields and an optional body, which follows after an empty line. Each header field has to be on a separate line but can, if necessary, span several lines. Identical to HTTP, header fields are formatted as Name: Value. What follows is a simple example message. You can find more examples in RFC 5322.
From: Alice <[email protected]>
To: Bob <[email protected]>
Cc: Carol <[email protected]>
Bcc: IETF <[email protected]>
Subject: A simple example message
Date: Thu, 01 Oct 2020 14:56:37 +0200
Message-ID: <[email protected]>

Hello Bob,
I think we should switch our roles.
How about you contact me from now on?
Best regards,
Alice
A simple example message with a sender, three recipients, a subject, a date, a message ID, and a body.
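The basic structure of a message can be explored with the `email` package from Python’s standard library. The following sketch parses a small message whose Subject field spans two lines. The addresses are placeholders chosen for this illustration.

```python
from email import message_from_string
from email.policy import default

# Header fields, an empty line, and then the body.
# A line starting with whitespace continues the previous header field.
raw = (
    "From: Alice <alice@example.org>\n"
    "To: Bob <bob@example.com>\n"
    "Subject: A long subject that is\n"
    " folded onto a second line\n"
    "\n"
    "Hello Bob,\n"
)

message = message_from_string(raw, policy=default)
sender = message["From"]      # header fields are accessed by name
subject = message["Subject"]  # the folded field is returned as a single value
body = message.get_content()  # everything after the empty line
```

The parser treats the first empty line as the boundary between the header section and the body, which is why the body itself may contain further empty lines.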
Message versus envelope
While outgoing mail servers may add missing header fields and sign each message, incoming mail servers should only add trace information to the top of a message and leave the message as is otherwise. The information relevant for handling the message, such as the addresses to deliver the message to and the address to report failures to, belongs to the so-called envelope. The envelope is specific to the Simple Mail Transfer Protocol (SMTP) and it can change completely during the delivery of a message. The message, on the other hand, mostly stays the same during delivery and its format is also used by two access protocols. The important thing to remember is that emails are delivered based on the addresses in the envelope and not the addresses in the header section of the message. Somewhat unfortunately, the fields in the envelope are called similarly to some header fields in the message: MAIL FROM for the address to report failures to and RCPT TO for each address to deliver the message to.
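How a mail client might derive the envelope from a message can be sketched as follows. The function `prepare_submission` and the addresses are hypothetical, and the sketch assumes that failures should be reported to the address in the From field, which is a common choice but not a requirement.

```python
from email.message import EmailMessage
from email.utils import getaddresses, parseaddr


def prepare_submission(message: EmailMessage) -> tuple[str, list[str]]:
    """Derive the SMTP envelope from the header fields and strip Bcc."""
    # Assumption: failures are reported to the author's own address.
    mail_from = parseaddr(str(message["From"]))[1]
    rcpt_to = []
    for field in ("To", "Cc", "Bcc"):
        values = [str(value) for value in message.get_all(field, [])]
        rcpt_to += [address for _, address in getaddresses(values) if address]
    # Hidden recipients are kept in the envelope but removed from the message.
    del message["Bcc"]
    return mail_from, rcpt_to


message = EmailMessage()
message["From"] = "Alice <alice@example.org>"
message["To"] = "Bob <bob@example.com>"
message["Cc"] = "Carol <carol@example.com>"
message["Bcc"] = "IETF <ietf@ietf.org>"
message["Subject"] = "A simple example message"
message.set_content("Hello Bob,\n")

mail_from, rcpt_to = prepare_submission(message)
```

After the call, `mail_from` and `rcpt_to` hold the envelope, while the message no longer contains the Bcc field. The `send_message` method of `smtplib.SMTP` behaves similarly when no explicit addresses are passed to it.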
Diverging envelope example
[Figure: Mail client of Alice to the outgoing mail server of example.org]
Envelope: MAIL FROM:<[email protected]>, RCPT TO:<[email protected]>, RCPT TO:<[email protected]>, RCPT TO:<[email protected]>
Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Cc: Carol <[email protected]>
Submission: The mail client of Alice removes the Bcc header field from the message and submits the message with all recipient addresses in the envelope, including the ones of Bcc recipients, to the outgoing mail server. Automatic responses shall be sent to the mailbox of Alice.
Diagram: Outgoing mail server of example.org → Incoming mail server of example.com; Envelope: MAIL FROM:<[email protected]>, RCPT TO:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Cc: Carol <[email protected]>.
First relay: The outgoing mail server is now responsible
for delivering the message to the recipients that the mail client specified.
It sees that two recipient addresses are handled by the same domain
and delivers the message in a single envelope to the incoming mail server of this domain.
The outgoing mail server could also connect to the incoming mail server twice,
delivering the message once for Bob and once for Carol.
I don’t know which approach is more common in practice.
RFC 5321 just says that,
when the same message is delivered to multiple recipients in the same session,
it should be delivered with a command sequence of
MAIL FROM, RCPT TO, RCPT TO, DATA
rather than
MAIL FROM, RCPT TO, DATA, MAIL FROM, RCPT TO, DATA.
We’ll discuss how the envelope corresponds to protocol messages soon.
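To make the single-envelope approach more tangible, here is a minimal Python sketch. The function names and the addresses (alice@example.org, bob@example.com, and so on) are made up for illustration; a real outgoing mail server would additionally look up the incoming mail server of each domain.

```python
from collections import defaultdict

def group_by_domain(recipients):
    """Group envelope recipients by the domain after the last '@'."""
    groups = defaultdict(list)
    for address in recipients:
        domain = address.rsplit("@", 1)[1].lower()
        groups[domain].append(address)
    return dict(groups)

def commands_for(sender, recipients):
    """One transaction: a single MAIL FROM, one RCPT TO per recipient, then DATA."""
    return (
        [f"MAIL FROM:<{sender}>"]
        + [f"RCPT TO:<{recipient}>" for recipient in recipients]
        + ["DATA"]
    )

# Hypothetical addresses; the two example.com recipients share one envelope.
envelopes = group_by_domain(["bob@example.com", "carol@example.com", "list@ietf.org"])
for domain, recipients in envelopes.items():
    print(domain, commands_for("alice@example.org", recipients))
```

Each group can then be delivered in its own session to the incoming mail server of the respective domain.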
Diagram: Incoming mail server of example.com → Incoming mail server of example.net; Envelope: MAIL FROM:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Cc: Carol <[email protected]>.
Alias: The incoming mail server of example.com knows that [email protected] is an alias for [email protected].
It thus forwards the original message without any modifications to the incoming mail server of example.net.
A potential delivery failure is still reported to Alice.
Diagram: Outgoing mail server of example.org → Incoming mail server of ietf.org; Envelope: MAIL FROM:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Cc: Carol <[email protected]>.
Second relay: The outgoing mail server of Alice also has to deliver the message to the recipient [email protected],
so it does that.
Diagram: Incoming mail server of ietf.org → Incoming mail server of example.net; Envelope: MAIL FROM:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Cc: Carol <[email protected]>.
Mailing list: It turns out that [email protected] is a mailing list.
It is now the task of the mail server of ietf.org
to deliver the message to all subscribers of this list,
with one of them being [email protected].
RFC 5321 requires that the bounce address
as specified in the MAIL FROM field of the envelope
is changed to the address of the entity which administers the mailing list.
The entity can be a person but is typically a piece of software,
which keeps track of delivery failures in order to revise the list.
The RFC also demands that the From field in the message remains the same.
Mailing list tools often modify the message in some ways,
for example by adding a field to the header and a footer to the body
in order to let recipients of the message unsubscribe from the mailing list.
Alias addresses and mailing lists cause difficulties for domain authentication.
Who removes the Bcc header field?
Another example message with several Bcc recipients.
Diagram: Mail client [email protected] FROM:<[email protected]>, RCPT TO:<[email protected]>, RCPT TO:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>; Outgoing mail server of example.org.
The Bcc field is removed from the message for all recipients.
Diagram: Mail client [email protected] FROM:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>; Outgoing mail server of example.org.
The non-Bcc recipients get the redacted message.
Diagram: Mail client [email protected] FROM:<[email protected]>, RCPT TO:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Bcc: Carol <[email protected]>, David <[email protected]>; Outgoing mail server of example.org.
The Bcc recipients get the original message.
Diagram: Mail client [email protected] FROM:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>; Outgoing mail server of example.org.
All non-Bcc recipients get the same redacted message.
Diagram: Mail client [email protected] FROM:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Bcc: Carol <[email protected]>; Outgoing mail server of example.org.
Carol in Bcc gets her own version of the message.
Diagram: Mail client [email protected] FROM:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Bcc: David <[email protected]>; Outgoing mail server of example.org.
And so does David.
Diagram: Mail client [email protected] FROM:<[email protected]>, RCPT TO:<[email protected]>, RCPT TO:<[email protected]>, RCPT TO:<[email protected]>; Message: From: Alice <[email protected]>, To: Bob <[email protected]>, Bcc: (empty); Outgoing mail server of example.org.
An empty Bcc field indicates that there were hidden recipients without disclosing them.
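The first variant, where the Bcc field is removed for all recipients, can be sketched with Python's standard email package. This is only an illustration of the idea, not how any particular mail client does it; the function name and the addresses are invented.

```python
from email.message import EmailMessage
from email.utils import getaddresses

def prepare_submission(message):
    """Collect all envelope recipients, then strip the Bcc field from the message."""
    fields = (
        message.get_all("To", [])
        + message.get_all("Cc", [])
        + message.get_all("Bcc", [])
    )
    recipients = [address for _name, address in getaddresses(fields)]
    del message["Bcc"]  # Removes every Bcc field; does nothing if there is none.
    return recipients, message

# Hypothetical message with two hidden recipients.
msg = EmailMessage()
msg["From"] = "Alice <alice@example.org>"
msg["To"] = "Bob <bob@example.com>"
msg["Bcc"] = "Carol <carol@example.com>, David <david@example.net>"
msg.set_content("Hello")

recipients, redacted = prepare_submission(msg)
print(recipients)       # The addresses for the RCPT TO commands.
print(redacted["Bcc"])  # None, as the field is gone.
```

The other variants differ only in which version of the message is paired with which envelope recipients.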
How does Gmail recover the Bcc field of sent messages?
Simple Mail Transfer Protocol (SMTP)
The Simple Mail Transfer Protocol (SMTP) was first specified in RFC 821 in 1982. As its name suggests, it is a fairly simple protocol:
Sequence diagram between the client (C) and the server (S):
(Open connection)
S: 220 server.example.com
C: HELO client.example.org
S: 250 server.example.com
C: MAIL FROM:<[email protected]>
S: 250 Ok
C: RCPT TO:<[email protected]>
S: 250 Ok
C: DATA
S: 354 Go ahead
C: (Actual message)
S: 250 Ok
C: QUIT
S: 221 Bye
(Close connection)
After opening a TCP connection on port 25,
the client sends commands and the server responds with status codes.
Once they have greeted each other, the client transmits the envelope,
followed by the DATA command and the message.
Command syntax
Field terminology
Extended Simple Mail Transfer Protocol (ESMTP)
A framework for extending SMTP was introduced in RFC 1425 in 1993.
The extensible protocol, which is backward compatible with SMTP,
is called the Extended Simple Mail Transfer Protocol (ESMTP).
ESMTP was revised in RFC 1651 (1994),
RFC 1869 (1995),
RFC 2821 (2001),
and most recently in RFC 5321 (2008).
The basic idea behind ESMTP is that the client greets the server
with the “extended hello” command EHLO
instead of the old “hello” command HELO.
This indicates to the server that the client understands ESMTP.
The server responds with all the SMTP extensions it supports.
For the rest of the session,
the client can then make use of the server’s advertised capabilities.
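As a rough sketch of what a client does with the server's answer to EHLO, the following Python function turns the multiline reply into a dictionary of extensions. The reply lines are a made-up example and the function name is my own; all lines but the last have a hyphen after the status code.

```python
def parse_ehlo_response(lines):
    """Map each advertised extension keyword to its list of parameters."""
    extensions = {}
    # The first line carries the server's greeting rather than an extension.
    for line in lines[1:]:
        # Skip the three-digit code and the separator ('-' or ' ').
        keyword, *parameters = line[4:].split()
        extensions[keyword.upper()] = parameters
    return extensions

# Hypothetical server response to the EHLO command.
example_response = [
    "250-server.example.com at your service",
    "250-SIZE 35882577",
    "250-8BITMIME",
    "250-AUTH LOGIN PLAIN",
    "250 SMTPUTF8",
]

capabilities = parse_ehlo_response(example_response)
print("AUTH" in capabilities, capabilities.get("AUTH"))
```

A client would then, for example, use the AUTH command only if the AUTH keyword is among the advertised capabilities.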
ESMTP tool
Let’s put theory into practice.
The following tool generates the command sequence to submit or relay an email
with parameters of your choice.
One way of using the tool is simply to observe how parameter changes affect the protocol flow.
The reason why I built this tool, however,
is that you can copy the commands to your command-line interface
and send messages without the assistance of a mail client.
Since you shouldn’t enter your email password on a random website like this one,
I recommend that you use the mode for submission only with demo accounts which you’ve created for this purpose.
The password is stored in the local storage
of your browser without any protections until you erase the history.
Having said that, the tool is open source
like the rest of this website and if you don’t trust me that this website is served from those files,
you can also build and run this website locally.
The tool uses Thunderbird’s database
and Google’s DNS API
to resolve the server you want to connect to and the API by ipinfo.io
to determine your IP address when you click on Determine
next to the Client
field.
The text in gray mimics what the responses from the server likely look like.
What you actually receive from the server will be different.
As long as the returned status code
starts with a 2 or a 3, you should be fine.
If the returned status code starts with a 4 or a 5, something went wrong.
I list some ideas for things you can try out after the tool.
The boxes after that provide you with more information on various aspects,
which are useful for troubleshooting problems you might run into.
If you need more help, send me an email
(probably with your mail client rather than with this tool). 🙂
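The rule of thumb about status codes from above can be written down in a few lines of Python. The category labels are my own wording; only the meaning of the first digit comes from the SMTP standard.

```python
def classify_reply(line):
    """Return the status code of an SMTP reply line and what its first digit means."""
    code = int(line[:3])
    categories = {
        2: "success",
        3: "success so far, the server awaits more input",
        4: "temporary failure, trying again later may work",
        5: "permanent failure",
    }
    return code, categories.get(code // 100, "unknown")

for reply in ["250 Ok", "354 Go ahead", "451 Try again later", "550 No such user"]:
    print(classify_reply(reply))
```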
From: Alice <[email protected]>
To: Bob <[email protected]>
Cc: Carol <[email protected]>
Subject: Yet another message
Date: Sat, 08 May 2021 07:31:38 +0000
Message-ID: <[email protected]>
Hello,
It's me. Again.
Alice
.
Tool instructions
- Create a new account at an email service provider of your choice. If you opt for Gmail, you should read this box first.
- Enter the address of your account in the From field and your password in the Password field. Set the Mode to Submission.
- After composing the message (To, Subject, and Body), try to submit it to the outgoing mail server with the listed commands.
- The first line opens a TLS channel to the specified Server. All other commands are sent to the server inside this channel.
- You can copy each line in bold to your clipboard by clicking on it, which includes the newline character to submit the command.
- If the mail was submitted successfully, you can add more To or Cc recipients. By copying only some of the generated RCPT TO commands but the full message, you suppress the delivery of the message to the skipped recipients. For those that receive the message, it looks as if the message was delivered to all the recipients in the message. I already mentioned this problem above.
- Besides faking recipients, you can also try to fake the sender. Switch the mode from Submission to Relay and change the From field to an address that you don’t own. Now try to send the message directly to the incoming mail server of one of the recipients. If the incoming mail server and the domain which you try to send the email from are properly configured, your message should make it at best into the spam folder of the recipient. Chances are that your message will be rejected during the SMTP session or silently dropped thereafter. The incoming mail server might also graylist or blacklist your IP address. Since you usually don’t relay email from your computer, this is nothing to worry about. Forging the sender address is known as spoofing. Be careful which domains you try to impersonate. If the domain owner configured a DMARC record, they might be informed about your spoofing attempt and even receive the content of your message.
Important: Be a nice person and don’t scam others! If you spoof the sender of an email in bad faith, you likely commit a crime in most countries. I showed you this attack for educational purposes only because I believe that seeing is believing. We can improve the state of email security only if consumers start demanding better security. In this spirit, I encourage you to relay spoofed emails only to your own mailbox. If such a spoofed email lands in your inbox, ask your email service provider to be more rigorous in filtering scam emails or use the service of a different provider. You’re hopefully also more motivated now to read the rest of this article. In short, have fun with the above tool but always remember that with great power comes great responsibility!
Tool explanations
Command-line interface
Clipboard verification
How to watch your clipboard in your command-line interface. Press “control c” to exit the program.
OpenSSL versus LibreSSL
Common SMTP extensions
A transcript of a session with the outgoing mail server of Gmail when using Implicit TLS. [Brackets indicate redacted information.]
Backward compatibility STARTTLS extension
The STARTTLS extension is listed when we connect without TLS to Gmail’s outgoing mail server.
When using -starttls smtp, openssl starts with a TCP connection
and upgrades it to a TLS connection by issuing the STARTTLS command.
User authentication
Gmail authentication failure
Reverse DNS entry
Newline characters
Message termination
Origination date
Spoofed sender during submission
Limitations of the above tool
Other SMTP commands
These commands can be used at any time during a session.
VRFY and EXPN are usually disabled for security reasons.
What a response to the VRFY command usually looks like.
(The reply code of a disabled VRFY command should be 252, though.)
How Gmail responds to the HELP command. 😄
Automatic responses
In certain configurations, mail servers send a message in response to an incoming message, which leads to the following problems.
Mail loops
Bounce messages
Diagram: Mail client, Incoming mail server, and Outgoing mail server, with the steps 1. Submit, 2. Report, and 3. Retrieve.
How the user learns about a delivery failure.
If the delivery of a message fails on the recipient’s side,
the bounce message (in red) is generated by a different system.
Backscatter
Password-based authentication mechanisms
The following boxes focus on password-based authentication mechanisms, which allow users to authenticate themselves to servers with only their username and password. Due to the nature of the topic, some of the later information boxes are fairly advanced. If you’re not interested in cryptography, you may want to skip them.
Dangerous reliance on TLS
Diagram: Client, Proxy, Server. While the communication between the client and the proxy is protected (indicated by the blue lines), the communication between the proxy and the server is exposed in the company’s private network.
Diagram: Client, Attacker, Server. The client is misconfigured and connects directly to the attacker, who forwards all communication without raising any suspicion.
Diagram: Client, Server, Server. The malicious server (in red) has the same name as the legitimate server (in green). An attacker can impersonate the server by getting a certificate issued for their public key or by gaining access to the private key of the server and using the original server certificate.
Cryptographic hash functions
Diagram: Input of any size → Output of fixed size (efficient in this direction, infeasible in the other).
Cryptographic hash functions are efficient to compute and infeasible to invert. For the same hash function, the same input always maps to the same output. The output is also called image and the corresponding input its preimage.
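If you want to see these properties for yourself, you can play with SHA-256 from Python's standard library. The helper name below is arbitrary and the inputs are just examples.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hash an input of any size to a fixed-size output (64 hexadecimal characters)."""
    return hashlib.sha256(data).hexdigest()

short_input = b"Hello"
long_input = b"Hello" * 100_000

print(sha256_hex(short_input))
print(sha256_hex(long_input))  # Same length as above despite the much larger input.
print(sha256_hex(short_input) == sha256_hex(short_input))  # Same input, same output.
```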
Diagram: Find input, given output. In these graphics, the given values are displayed in blue and the values to find in green.
Diagram: Given input 1, find input 2 ≠ input 1 with the same output. Knowing one input may not be useful to find another input which hashes to the same output.
Diagram: Find input 1 and input 2 ≠ input 1 with the same output. Due to the birthday paradox and attack, this is a stronger requirement than the previous one.
Secure Hash Algorithms (SHA)
Salts against pooled brute-force attacks
Nonces against replay attacks
Applications of cryptographic hash functions
Diagram: Content producer, Storage provider, and Consumer, exchanging File and Hash; the consumer checks hash(File) = Hash?
As long as you get the hash from a trusted source,
the delivery of the file can be outsourced to an untrusted third party.
Diagram: Client of user (Password), Server of provider (hash(Password + Salt) = Hash?), Database of provider (Salt, Hash).
A server can reduce the damage of a leaked database by storing individually salted hashes instead of passwords.
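The check hash(Password + Salt) = Hash from the diagram could look as follows in Python. This is a bare-bones sketch with invented function names; a real server should use a deliberately slow function instead of a single SHA-256 invocation.

```python
import hashlib
import hmac
import secrets

def create_record(password: str):
    """Return the (salt, hash) pair which the server stores instead of the password."""
    salt = secrets.token_bytes(16)  # A fresh random salt for every user.
    return salt, hashlib.sha256(password.encode() + salt).digest()

def check_password(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Recompute hash(Password + Salt) and compare it with the stored hash."""
    candidate = hashlib.sha256(password.encode() + salt).digest()
    return hmac.compare_digest(candidate, stored_hash)

salt, stored_hash = create_record("correct horse")
print(check_password("correct horse", salt, stored_hash))  # True
print(check_password("wrong guess", salt, stored_hash))    # False
```

Since every user gets a different salt, an attacker who obtains the database cannot test a guessed password against all users at once.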
Diagram: Password → pbkdf (repeated hashing) → Cryptographic key.
By hashing an input repeatedly, you can turn an efficient hash function into an inefficient one.
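Python's standard library ships with PBKDF2, which is one such password-based key derivation function. The iteration count and the inputs in this sketch are arbitrary example values, not recommendations.

```python
import hashlib

def derive_key(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

key = derive_key(b"correct horse", b"example salt")
print(key.hex())
```

The higher the iteration count, the longer each guess takes for an attacker, but also for the legitimate user.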
Diagram: Seed → Value1: hash(1 + Seed), Value2: hash(2 + Seed), ValueX: hash(X + Seed).
Values can easily be derived from a seed (in green)
but they cannot be related to one another (in red).
Diagram: Alice → Bob: hash(CoinFlipAlice + Nonce); Bob → Alice: CoinFlipBob; Alice → Bob: CoinFlipAlice, Nonce.
If CoinFlipAlice = CoinFlipBob, Alice wins.
If CoinFlipAlice ≠ CoinFlipBob, Bob wins.
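Here is how the three messages of this coin flip could be simulated in Python. The function names are invented and the encoding of the coin flip as a single byte is my own assumption.

```python
import hashlib
import secrets

def commit(coin_flip: int):
    """Alice commits to her coin flip (0 or 1) without revealing it."""
    nonce = secrets.token_bytes(16)
    commitment = hashlib.sha256(bytes([coin_flip]) + nonce).digest()
    return commitment, nonce

def opens_to(commitment: bytes, coin_flip: int, nonce: bytes) -> bool:
    """Bob verifies that the revealed values match the earlier commitment."""
    return hashlib.sha256(bytes([coin_flip]) + nonce).digest() == commitment

def winner(coin_flip_alice: int, coin_flip_bob: int) -> str:
    return "Alice" if coin_flip_alice == coin_flip_bob else "Bob"

coin_flip_alice = secrets.randbelow(2)
commitment, nonce = commit(coin_flip_alice)          # 1. Alice sends the commitment.
coin_flip_bob = secrets.randbelow(2)                 # 2. Bob sends his coin flip.
assert opens_to(commitment, coin_flip_alice, nonce)  # 3. Alice reveals, Bob verifies.
print(winner(coin_flip_alice, coin_flip_bob))
```

The nonce prevents Bob from simply hashing both possible coin flips and comparing the results with the commitment.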
Diagram: Client → Server: ClientMessage, hmac(Key, ClientMessage); Server → Client: ServerMessage, hmac(Key, ServerMessage).
A message authentication code is appended to each message.
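With Python's hmac module, computing and verifying such a message authentication code takes only a few lines. The key and the messages below are placeholders, and the function names are my own.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute hmac(Key, Message) with SHA-256 as the underlying hash function."""
    return hmac.new(key, message, hashlib.sha256).digest()

def check_tag(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), received_tag)

key = b"shared secret key"
client_message = b"ClientMessage"
tag = make_tag(key, client_message)

print(check_tag(key, client_message, tag))      # True
print(check_tag(key, b"TamperedMessage", tag))  # False
```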
Diagram: Leaf1, Leaf2, Leaf3, Leaf4; Node1: hash(Leaf1), Node2: hash(Leaf2), Node3: hash(Leaf3), Node4: hash(Leaf4); Node5: hash(Node1 + Node2), Node6: hash(Node3 + Node4); Root: hash(Node5 + Node6).
In order to verify that the green leaf is included in the root,
a verifier needs to know only the hashes and positions of the blue nodes.
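The tree from the diagram and the inclusion check can be sketched in Python as follows. For simplicity, I assume that the number of leaves is a power of two; the function names and the representation of the proof are my own choices.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Hash the leaves, then hash pairs of nodes until only the root remains."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_inclusion(leaf: bytes, proof, root: bytes) -> bool:
    """The proof lists (sibling hash, sibling is on the left) from bottom to top."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

leaves = [b"Leaf1", b"Leaf2", b"Leaf3", b"Leaf4"]
root = merkle_root(leaves)

# To verify Leaf3, the verifier needs only Node4 (on the right) and Node5 (on the left).
node4 = h(b"Leaf4")
node5 = h(h(b"Leaf1") + h(b"Leaf2"))
print(verify_inclusion(b"Leaf3", [(node4, False), (node5, True)], root))
```

The size of such a proof grows only logarithmically with the number of leaves.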
Diagram: Message → hash(Message + 1), hash(Message + 2), hash(Message + 3), hash(Message + X).
Finding a nonce which makes the hash of a message fall into a certain range requires many attempts.
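A brute-force search for such a nonce can be sketched in Python like this. How the nonce is appended to the message and how the range is specified are assumptions of this example.

```python
import hashlib

def find_nonce(message: bytes, difficulty_bits: int) -> int:
    """Try nonces 0, 1, 2, etc. until the hash starts with enough zero bits."""
    target = 1 << (256 - difficulty_bits)  # The hash has to be below this value.
    nonce = 0
    while True:
        digest = hashlib.sha256(message + str(nonce).encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

nonce = find_nonce(b"Message", 8)  # Requires around 2^8 = 256 attempts on average.
print(nonce, hashlib.sha256(b"Message" + str(nonce).encode()).hexdigest())
```

While finding the nonce takes many attempts, verifying it requires just a single hash computation.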
Exclusive-or operation for perfect encryption
The truth table of the exclusive-or operation. The output is 1 if and only if the two inputs are unequal.
Diagram: Alice: Plaintext, Key → Encryption → Ciphertext → Decryption (with Key) → Plaintext: Bob; Eve observes the ciphertext.
Eve has access to the ciphertext and knows the algorithms in blue, while the information in green is known only to Alice and Bob.
Desirable properties of authentication mechanisms
A comparison between password-based authentication mechanisms.
The three symbols in the comparison mean, respectively,
that the authentication mechanism is resistant to the attack,
that the resistance depends on choices made by programmers,
and that the authentication mechanism is vulnerable to the attack.
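Challenge-Response Authentication Mechanism (CRAM)
Diagram: Client → Server: (Connect); Server → Client: Challenge; Client → Server: Response.
How challenge–response authentication works.
Diagram: Client, Attacker, Server: the attacker forwards (Connect), Challenge, and Response between the client and the server.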
A man-in-the-middle attack on a challenge-response authentication mechanism.
Salted Challenge-Response Authentication Mechanism (SCRAM)
Diagram: Client, Server.
Mutual authentication guarantees only that the inner channel (in green) reaches the counterparty.
Channel binding can be used to ensure that the outer channel (in blue) isn’t interrupted by an attacker.
Diagram: Client, Server; messages in order: Username, ClientNonce; ServerNonce, Salt, IterationCount; KeyXorHashedKeyMac; KeyMac.
The sequence diagram of Simplified-SCRAM.
TLS channel bindings (SCRAM-PLUS)
Authentication on the Web
Password-authenticated key exchange (PAKE)
Access protocols
Besides proprietary protocols, most incoming mail servers allow mail clients to access the user’s mailbox with POP3 or IMAP. If your mail client and your mail server support both protocols, you should choose the latter as it’s much more powerful. The main reason for including POP3 in this article is that it’s much easier to use from the command line.
Communication logging in Thunderbird
Post Office Protocol Version 3 (POP3)
The Post Office Protocol Version 3 (POP3) is specified in RFC 1939. Similar to ESMTP, POP3 is a text-based application-layer protocol, which can be used with Implicit TLS or with Explicit TLS. POP3 with Implicit TLS is also known as POP3S. Just like in SMTP, most POP3 commands consist of four letters (a few, such as TOP, have only three) and an extension mechanism was introduced after the initial release of the standard. After authenticating the user, POP3 allows the client to list, retrieve, and delete messages. POP3 is designed to move messages from a remote queue into a local queue. It doesn’t support read statuses, mailbox folders, message uploads, or partial fetches.
The following POP3 tool works in the same way as the ESMTP tool above.
Most of the remarks I made earlier therefore still apply.
In particular, I advise you to use it only with accounts created for this purpose.
The tool uses Thunderbird’s configuration database
and Google’s DNS API
to resolve the server you want to connect to.
Copy the commands in bold to your command-line interface by clicking on them.
The text in gray mimics what the responses from the server look like.
The actual responses will be different.
Each response starts with either +OK or -ERR.
The former indicates that your command was successful,
the latter indicates that an error occurred.
If necessary, you can always kill the current process
and thereby the connection by pressing ^C (control + c).
If you use Gmail, you have to enable POP3 access in your account settings
and allow access from insecure apps.
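A client can tell the two kinds of POP3 responses apart with a few lines of Python. The function name and the example lines are made up for illustration.

```python
def parse_pop3_status(line: str):
    """Return whether the command succeeded and the text after the status indicator."""
    if line.startswith("+OK"):
        return True, line[3:].strip()
    if line.startswith("-ERR"):
        return False, line[4:].strip()
    raise ValueError(f"Not a POP3 status line: {line!r}")

# Hypothetical server responses.
print(parse_pop3_status("+OK 2 messages"))
print(parse_pop3_status("-ERR no such message"))
```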
POP3 commands
The mandatory commands of POP3.
(The USER and PASS commands are strictly speaking optional.)
POP3 extensions
Some optional commands of POP3. These commands extend the basic functionality of POP3. Unlike the message numbering, the IDs are guaranteed to stay the same across sessions.
The server can indicate additional behavior in its response to the CAPA command.
LOGIN-DELAY and EXPIRE allow the server to conserve its resources.
Just like STARTTLS, STLS is advertised only in TCP connections.
Gmail doesn’t support POP3 with Explicit TLS but Gandi does.
APOP authentication
Internet Message Access Protocol (IMAP)
The Internet Message Access Protocol (IMAP) is specified in RFC 3501. IMAP works similarly to ESMTP and POP3; it just has many more commands and options. An IMAP mailbox acts as a remote drive for messages instead of files, where the drive is being shared among several clients. IMAP allows users to create, delete, and rename folders, to upload and move messages between them, to mark messages as read or as flagged, to search the mailbox remotely, and to download messages without their attachments.
The following IMAP tool works just like the ESMTP and POP3 tools above. As you might mess up your mailbox or delete messages you still wanted by accident, you should run the following commands on test accounts only. If you want to use your real account, you do so at your own risk. Certain commands have side effects, such as marking messages as read. Make sure you fully understand a command before using it. This tool also uses Thunderbird’s configuration database and Google’s DNS API to resolve the server you want to connect to. Neither IMAP nor the tool is self-explanatory. You can find more information in the tooltips and the boxes below.
* 2 RECENT
* OK [UNSEEN 3]
* OK [UIDNEXT 5]
* OK [UIDVALIDITY 1]
* OK [PERMANENTFLAGS ()]
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
E OK [READ-ONLY] EXAMINE completed
{Data}
)
F OK FETCH completed
O OK LOGOUT completed
After the initial greeting by the server,
the client sends commands,
to which the server responds.
Since multiple commands can be in progress at the same time,
the client tags each command with a unique identifier,
such as A, B, C, or a dot (.).
The server prefixes each line of its response with *
and completes its response with a line
which starts with the tag chosen by the client.
The tag is followed by a status response:
OK for success, NO for failure, or BAD for protocol errors.
Don’t worry about reusing tags in a single session:
You can run a command repeatedly with the same tag.
If you want to fetch another message, for example,
just enter another message number and copy the generated command again.
If you use Gmail, you have to enable IMAP access in your account settings
and allow access from insecure apps.
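To illustrate the role of the tag, the following Python sketch separates the lines starting with * from the completing line of a command. The function name is invented and the parsing is deliberately naive; it ignores multiline data, for example.

```python
def parse_imap_response(lines, tag):
    """Split a response into its untagged lines and the status of the tagged line."""
    untagged, status = [], None
    for line in lines:
        if line.startswith("* "):
            untagged.append(line[2:])
        elif line.startswith(tag + " "):
            # The status (OK, NO, or BAD) follows directly after the tag.
            status = line[len(tag) + 1:].split(" ", 1)[0]
    return untagged, status

response = [
    "* 2 RECENT",
    "* OK [UNSEEN 3]",
    "E OK [READ-ONLY] EXAMINE completed",
]
print(parse_imap_response(response, "E"))
```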
Protocol states
Diagram: the states Not authenticated, Authenticated, Selected, and Logout, with the transitions LOGIN / AUTHENTICATE, UNAUTHENTICATE, SELECT, EXAMINE, CLOSE, UNSELECT, and LOGOUT.
The protocol states and how to transition between them.
LOGOUT can be called in any state except the logout state.
UNAUTHENTICATE can also be called in the selected state.
Data formats
Message numbers
Message sets
Message flags
How a custom flag can be created and set if the IMAP server supports it.
Internal date
How to fetch when a message was received.
IMAP commands
IMAP extensions
JSON Meta Application Protocol (JMAP)
Over the last forty years, email in general and IMAP in particular became a patchwork of extensions. Given the complexity and the varying support of these extensions, writing a mail client is much more difficult than it should be. While there are efforts to unify the patchwork somewhat, there has also been a fresh start over the last couple of years. An IETF working group designed a modern protocol for client to server interaction: The JSON Meta Application Protocol (JMAP). JSON itself stands for JavaScript Object Notation, which is a popular format for storing and exchanging human-readable data. JMAP is specified in RFC 8620 and it can be used for more than just email. The data model for synchronizing email is specified in RFC 8621. If you don’t like the RFC formatting, you can also read the two standards here and here.
JMAP is designed to be interoperable with IMAP mailboxes and thus shares the concepts of folders and flags with IMAP. The protocol itself, however, is completely new and addresses the following shortcomings of IMAP (and message submission):
- Permanent identifiers: JMAP servers assign permanent identifiers to all objects. In the case of messages, these identifiers can no longer be invalidated and they no longer change when a message is moved from one folder to another. In the case of folders, JMAP clients can detect when a folder has been renamed and no longer need to fetch all the messages in it again.
- Efficient synchronization: JMAP provides a simple method for getting the identifiers of created, updated, and destroyed messages and folders. As we have seen above, synchronizing a mailbox with IMAP is easy only if you stay connected to the server, which isn’t an option for mobile clients.
- Push mechanism: In order to be informed immediately about changes to a folder, such as newly arrived messages, IMAP clients use the IDLE command. If they want to be informed about changes to several folders, they have to open a separate connection for each folder. JMAP, on the other hand, allows clients to subscribe to all changes on the server at once. Clients which can keep a connection to the server open can subscribe via the EventSource interface. Other clients, such as those on mobile phones, can register a callback URL, which allows them to use their platform-specific push technology.
- Batching of chained commands: When the IMAP server doesn’t support certain extensions such as SEARCHRES, IMAP clients often need to wait for the response to one command before they can construct the followup command. JMAP allows clients to batch several commands and to reference the results from earlier commands in the same request. Doing so avoids round trips and makes updates more atomic (i.e. it becomes less likely that only some of the issued commands are being executed).
- Widespread data format: JMAP data doesn’t have to be encoded as JSON and future standards can specify other data formats. The same is true for the transport protocol: While JMAP currently uses HTTPS as its transport protocol, other protocols can be added in the future. The choice of JSON and HTTPS is mostly due to their widespread adoption: There are suitable libraries for all relevant programming languages and software engineers know how to use those. It’s worth mentioning that JMAP doesn’t wrap binary data in JSON. Binary data is exchanged in separate connections.
- Complexity on server: JMAP moves the complexity of handling email’s message format from the client to the server. While clients can still fetch the raw message if needed, for example when implementing end-to-end security, the server has to deal with multipart messages, content encodings, line-length limits, etc. Clients can download and upload messages as a simple JSON object. Please note that this affects neither how messages are stored on servers nor how they are relayed to others. It just relieves programmers who want to integrate email from having to take care of encoding and decoding messages correctly.
- Message submission: The previous point only makes sense if clients can also submit messages for delivery in the same format. If the JMAP server supports submission, a client can instruct it to send a stored message to its recipients. The client can generate the envelope itself or let the server do it. By first storing the message as a draft and then moving it to the sent folder after sending it (see this example), JMAP also solves the double-submission problem.
- Flood control: Since it’s not always possible to anticipate how much data the server will send back, JMAP lets clients restrict the size of responses. This feature is especially valuable on devices with limited bandwidth or expensive roaming.
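To give you an idea of what batching with result references looks like, here is a sketch of a JMAP request built in Python. The structure follows RFC 8620 and RFC 8621 as far as I understand them; the account identifier, the search term, and the function name are placeholders.

```python
import json

def build_request(account_id: str) -> dict:
    """Search for messages and fetch some of their properties in a single request."""
    return {
        "using": ["urn:ietf:params:jmap:core", "urn:ietf:params:jmap:mail"],
        "methodCalls": [
            # First call: find at most ten messages which contain the word "Test".
            ["Email/query", {
                "accountId": account_id,
                "filter": {"text": "Test"},
                "limit": 10,
            }, "c1"],
            # Second call: "#ids" references the result of the first call.
            ["Email/get", {
                "accountId": account_id,
                "#ids": {"resultOf": "c1", "name": "Email/query", "path": "/ids"},
                "properties": ["subject", "receivedAt"],
            }, "c2"],
        ],
    }

print(json.dumps(build_request("account1"), indent=2))
```

The client sends this JSON object in one HTTPS request and gets the responses to both method calls back at once.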
Support for JMAP is still quite rare, which is not surprising given that the standard was published only in 2019. We have yet to see whether it will become a relevant protocol for accessing one’s mailbox. I certainly hope so but email is really resistant to innovation.
Email filtering
It can be useful to filter incoming messages according to custom rules. For example, you may want to move certain messages to a certain folder, mark certain messages as read, or delete certain messages automatically. Most mail clients allow their users to configure such rules, which are executed when the mail client receives a new message. There are several advantages of filtering incoming mail on the server rather than on the client, though:
- Synchronization: If the filtering rules are stored on the incoming mail server, they can be inspected and edited through any of the user’s mail clients. Otherwise, users have to remember on which client they’ve created the rule that they want to modify now.
- No race conditions: If the filtering rules are stored on a mail client, then the rules are not applied when this mail client is offline. In this situation, other mail clients see unfiltered messages. If these mail clients apply rules of their own, you might run into race conditions, where the order in which clients see incoming messages determines the outcome of the filtering.
- Rules for absence: Some rules, such as sending out-of-office replies, shall run precisely when all mail clients are offline. This is not possible when the rules must be executed by mail clients.
- Rejection during delivery: Unlike clients, incoming mail servers can reject a message during its delivery. By sending the 550 response code during the SMTP session, the incoming mail server can inform the sender about the rejection without causing backscatter with bounce messages.
To achieve server-side filtering, we need a standardized mail filtering language and a standardized filter management protocol.
Diagram: Mail client of sender → Outgoing mail server of sender → Incoming mail server of recipient → Offline mail client of recipient and Online mail client of recipient.
How a message is delivered from the mail client of the sender to the mail clients of the recipient. Messages can be filtered by the incoming mail server (in green) or by an online mail client (in blue).
Mail filtering language (Sieve)
Sieve is a language for filtering messages on the incoming mail server.
It is specified in RFC 5228 and it is fairly simple:
Using the control commands if, elsif, and else,
you can specify under which conditions
a specific action shall be applied.
You can find plenty of examples throughout the RFC as well as here and here.
There are just a couple of things you should know to understand them:
- Arguments: Most commands in the Sieve language take arguments. Mandatory arguments are determined by their position, optional arguments are identified by a colon followed by their name. Some optional arguments can take arguments themselves: :name value. This is similar to arguments in the command-line interface but with : instead of - before the name. When optional arguments are not provided, their default values are used instead.
- Extensions: The Sieve language is extensible. A script has to list the extensions which it uses at the top of its code with require.
- Implicit keep: Each message is stored in the inbox unless it is moved to a folder, forwarded to an address, or discarded explicitly.
- String lists: Wherever a list of strings is expected, such as ["To", "Cc"], a string without brackets, such as "To", can be used.
- Prefix notation: Commands and arguments are nested not with parentheses but by earlier tokens consuming later ones. For example, the negation of the condition exists "Date" is not exists "Date". This is similar to the prefix notation.
- Comments: If you use # outside of double quotes, the incoming mail server ignores all characters including this one until the end of the line. Comments which span less or more than a line have to be enclosed in /* and */.
- No loops: The Sieve language doesn’t support loops. Each block is executed once or not at all.
You can generate simple filtering rules with the following tool.
Make sure that the Argument makes sense for the chosen Action.
Move requires the name of a folder, Forward an email address,
Flag the name of a flag, and Reply the text of the reply.
require "imap4flags";
if header :contains "Subject" "Test" {
    addflag "\\Seen";
}
Users don’t have to learn the Sieve language. Mail clients can offer a graphical user interface (GUI) similar to the tool above, where users don’t have to see the generated code. You find a list of all the extensions to the Sieve mail filtering language on Wikipedia.
Out-of-office replies
A simple Sieve script for an automatic vacation response, which I’ve adapted from Gandi.
Support by email service providers and mail clients
Gmail users can go to the Filters and Blocked Addresses tab of their settings and click on “Create a new filter”. While Gmail has an API for managing filters, other mail clients won’t support such a proprietary protocol.
Filter management protocol (ManageSieve)
ManageSieve is a protocol for managing Sieve scripts remotely. It is specified in RFC 5804 and works similarly to the protocols we have seen so far. After an initial greeting from the server, the client sends commands to which the server responds. Just like IMAP, responses are completed with a line which starts with OK or NO; but unlike IMAP, the commands are not preceded with a tag. Just like IMAP, multiline strings are prefixed with their length; but unlike IMAP, the client can include a plus to continue with the string without having to wait for a continuation response from the server. Just like SMTP for Relay, there’s no variant of ManageSieve which can be used with Implicit TLS.
The server sends its capabilities automatically in its greeting and after successful STARTTLS and AUTHENTICATE commands. As part of the capabilities, the server indicates which extensions to the Sieve language and which SASL mechanisms it supports. According to RFC 5804, ManageSieve servers have to support PLAIN over TLS and SCRAM-SHA-1.
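To make the length-prefixed strings more concrete, here is a minimal Python sketch of how a client could frame a PUTSCRIPT command with such a literal. The helper names literal and putscript are made up for this illustration, and the script name is assumed to contain no characters that need escaping.

```python
def literal(text: str) -> bytes:
    """Encode text as a length-prefixed string: {<number of bytes>+} CRLF <bytes>."""
    data = text.encode("utf-8")
    # The plus after the length signals that the client continues
    # without waiting for a continuation response from the server.
    return b"{%d+}\r\n" % len(data) + data


def putscript(name: str, script: str) -> bytes:
    """Frame a PUTSCRIPT command (hypothetical helper; `name` isn't escaped here)."""
    return b'PUTSCRIPT "' + name.encode("utf-8") + b'" ' + literal(script) + b"\r\n"


command = putscript("test", 'require "body";\r\nif body :contains "Test" { discard; }\r\n')
print(command.decode("utf-8"))
```

The length counts bytes rather than characters, which matters as soon as the script contains non-ASCII text.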
The following tool shows you how to use the ManageSieve commands
from your command-line interface.
Unlike the previous tools,
you have to configure the address and the port number of the server manually
as this information is not included in Thunderbird’s configuration files.
The standard describes how to locate the ManageSieve server
with SRV
records
and the autoconfiguration tool above does query the _sieve._tcp
subdomain.
However, since virtually no one configures such SRV
records (at least not for the ManageSieve protocol),
I didn’t bother to implement this discovery mechanism here.
ManageSieve servers listen on port 4190 by default.
The Thunderbird plugin, which I mentioned earlier,
simply probes this port
on the IMAP server
in order to configure itself.
Important: Since LibreSSL doesn’t support the ManageSieve STARTTLS
command,
you have to use OpenSSL
(see the boxes below).
"NOTIFY" "{Methods}"
"SASL" "PLAIN"
"SIEVE" "{Extensions}"
"VERSION" "1.0"
OK "STARTTLS completed."
require "body"; if body :contains "Test" { discard; }
Explanation: While you can have multiple scripts on the server,
at most one of them can be active.
You cannot delete the active script.
You can deactivate the active script by activating
another script or by using an empty script name to set no script active.
You can also generate the argument to PLAIN
yourself
with echo -ne '\0000username\0000password' | openssl base64
.
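If you prefer Python over the shell, the same argument can be produced as follows. This is only a sketch: the function name sasl_plain is my own, and the authorization identity before the first NUL byte is left empty, just like in the echo command.

```python
import base64


def sasl_plain(username: str, password: str) -> str:
    """Return the Base64-encoded argument for the PLAIN mechanism."""
    # Empty authorization identity, NUL, username, NUL, password.
    message = b"\0" + username.encode("utf-8") + b"\0" + password.encode("utf-8")
    return base64.b64encode(message).decode("ascii")


print(sasl_plain("username", "password"))
```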
LibreSSL doesn’t support ManageSieve
How to install OpenSSL on macOS
Format
The format of an email message is specified in RFC 5322. The goal of this chapter is to make you comfortable reading raw messages.
How to display the raw message
Mail clients don’t display all header fields by default. Here is how you can display the raw message as it arrived in your mailbox:
- Gmail: Open a message, click on ⋮ in the upper right corner, then on “Show original”.
- Yahoo: Open a message, click on ⋯ in the bottom middle, then on “View raw message”.
- Outlook:
- Web: Click on ⋯ in the upper right corner, then on “View” and “View message source”.
- Desktop: Double-click a message, click on the “File” menu and then select “Properties”.
- Thunderbird:
- Raw message: Select a message, click on the “More” button and then “View Source” (or use ⌘U).
- All header fields: Click on the “View” menu, then on “Headers” and “All” (or on “Normal” to go back).
- Apple Mail:
- Raw message: Click on the “View” menu, then on “Message” and “Raw Source” (or use the shortcut ⌥⌘U).
- All header fields: Click on the “View” menu, then on “Message” and “All Headers” (or use the shortcut ⇧⌘H).
- Change preferences: In the “Viewing” tab of the preferences, you can configure which header fields are displayed.
File format
Since messages, including attachments, are just text,
they can be stored as simple text files.
A common filename extension
for emails is .eml
.
Such files can be viewed with any text editor.
Desktop clients usually have an option to save a message as a file,
and among Web clients, at least Gmail allows you to download a message in the “⋮” menu,
which is located in the upper right corner.
Storage format
Apple Mail storage format
Line-length limit
According to RFC 5322,
each line of a message may consist of at most 1’000 ASCII characters,
including CR + LF.
Implementations are free to accept longer lines,
but since some implementations cannot handle longer lines,
you shouldn’t send them.
The RFC even recommends limiting lines to 80 characters
to accommodate clients that truncate longer lines in violation of the standard.
In order to leave the line wrapping to the mail client of the recipient,
the mail client of the sender has to encode the body
if the body contains lines which are too long.
If a header field is too long,
it must be broken into several lines with
folding whitespace:
{CR}{LF}
followed by at least one space or tab.
If a line in the header section of a message starts with whitespace,
its content belongs to the header field on the previous line.
The procedure of breaking lines as done by the sender is called folding;
the procedure of joining lines as done by the recipient is called unfolding.
When unfolding, runs of whitespace characters are replaced with a single
space character.
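Folding and unfolding can be sketched in a few lines of Python. The functions fold and unfold are illustrative helpers of my own, not a complete implementation: fold only breaks at single spaces, and unfold follows the simplified rule described above.

```python
import re


def unfold(field: str) -> str:
    """Join the lines of a folded header field."""
    # A line break followed by spaces or tabs becomes a single space.
    return re.sub(r"\r\n[ \t]+", " ", field)


def fold(field: str, limit: int = 78) -> str:
    """Break a header field at spaces so that no line exceeds `limit` characters (excluding CR + LF)."""
    lines, current = [], ""
    for word in field.split(" "):
        candidate = word if not current else current + " " + word
        if current and len(candidate) > limit:
            lines.append(current)
            current = " " + word  # Continuation lines start with whitespace.
        else:
            current = candidate
    lines.append(current)
    return "\r\n".join(lines)


subject = "Subject: " + " ".join(["word"] * 40)
folded = fold(subject)
print(folded)
```

A word that is longer than the limit is left on a line of its own, which is why long header field values are usually encoded in a way that allows breaking them.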
Message identification
There are three header fields to identify the current message and the previous messages in the same thread:
- Message-ID: The Message-ID identifies the current message. Its format is <{Value}@{Domain}>. Although outgoing mail servers may add this field if it’s missing, the Message-ID should be chosen by the mail client. Otherwise, the copy stored in the sent folder on the incoming mail server lacks this field, which defeats its purpose. Whoever chooses the Message-ID should make sure that it’s unique. Mail clients often choose the Value as a universally unique identifier (UUID) and the Domain as the domain part of the user’s email address. The sender has to decide whether two messages are the same and thus share the same Message-ID. If the client generates different versions of the same message due to Bcc recipients, it should use the same Message-ID for all of them.
- In-Reply-To: If a user replies to a message, the Message-ID of the replied-to message is put into the In-Reply-To header field.
- References: While In-Reply-To refers only to the direct parent message, the References field lists the Message-IDs of all ancestor messages, including the direct parent message. This is useful to reconstruct a conversation even if not all intermediary messages were sent to you. Clients compose this field by adding the Message-ID of the replied-to message to the References of the replied-to message. When determining which messages belong to the same thread, clients use additional heuristics, such as comparing the Subject line after stripping common prefixes, to avoid grouping messages where a person replies to a message just to send an unrelated message to the sender of the message.
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
An example of what the three message identification header fields look like.
The References
field contains the message ID of the In-Reply-To
field.
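The following Python sketch shows how a mail client could derive these three header fields when composing a reply. The function names and the example.com and example.org domains are assumptions for illustration. Python’s standard library also offers email.utils.make_msgid for generating message IDs.

```python
import uuid


def new_message_id(domain: str) -> str:
    """Create a unique Message-ID of the form <{Value}@{Domain}>."""
    return f"<{uuid.uuid4()}@{domain}>"


def references_for_reply(parent_message_id: str, parent_references: str = "") -> str:
    """Append the Message-ID of the replied-to message to its References."""
    return (parent_references + " " + parent_message_id).strip()


original = new_message_id("example.com")
reply = {
    "Message-ID": new_message_id("example.org"),
    "In-Reply-To": original,
    "References": references_for_reply(original),
}
for name, value in reply.items():
    print(f"{name}: {value}")
```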
Mandatory header fields
Quoting the previous message
How text is usually quoted at different nesting levels. Quoting the text allows you to reply below each paragraph of the original message.
Universally Unique Identifier (UUID)
Trace information
According to RFC 5321,
whenever a mail server receives a message,
it must add a Received
header field at the beginning of the message
without changing or deleting already existing Received
header fields.
Received
header fields have the following format:
Received: from {EhloArgument} ({DnsReverseLookup} {IpAddressOfClient})
by {DomainNameOfServer}
with {Protocol}
id {SessionId}
for {AddressOfRecipient};
{DayOfWeek}, {Day} {Month} {Year} {Hour}:{Minute}:{Second} {TimeZone}
The format of Received
header fields.
The curly brackets stand for values which need to be inserted.
The with
, id
, and for
clauses are optional.
The newlines can be in other places, and additional information is often added
as comments in parentheses in various places.
According to RFC 5321,
the Protocol
is either SMTP
or ESMTP
.
RFC 3848 specified additional values:
ESMTPA
when ESMTP is used
with successful user authentication,
ESMTPS
when ESMTP is used with Implicit or Explicit TLS,
and ESMTPSA
when the session has been secured and the user has been authenticated.
RFC 8314 specifies an additional tls
clause,
which can be used after the for
clause to record the
TLS ciphersuite
which has been used.
Gmail adds such information as a comment instead:
(version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256)
.
Checking the Received
header fields of a received message gives you an idea
whether the message was secured during transport.
Note, however, that Received
header fields are not authenticated:
The mail servers through which a message passes can change the Received
header fields
that were added by mail servers through which the message already passed.
In addition, not all mail servers might support the newer protocol values,
and relays over a private network are often not protected with TLS.
A message typically has at least four Received
header fields,
which only makes sense if you look at the official architecture
instead of the simplified architecture.
A Received
header field is added by the mail submission agent (MSA), the outgoing mail transfer agent (MTA),
the incoming mail transfer agent (MTA), and the mail delivery agent (MDA).
Here is a Received
header field, which was added by my outgoing mail server:
Received: from [192.168.1.2] (unknown [203.0.113.167])
(Authenticated sender: [email protected])
by relay12.mail.gandi.net (Postfix) with ESMTPSA id 7974D200009
for <[email protected]>; Thu, 3 Dec 2020 14:14:48 +0000 (UTC)
What an actual Received
header field looks like.
I’ve only replaced my IP address with an address reserved for documentation.
Due to Network Address Translation (NAT),
the private IP address
that my computer used in the EHLO
command
and the public IP address that the outgoing mail server saw were different.
Since my public IP address doesn’t have a reverse DNS entry,
the server recorded the name of the client as unknown
.
(Authenticated sender: [email protected])
is a comment,
which the server added to indicate which user submitted the message.
(Postfix)
is also a comment, indicating the name of the server implementation.
Everything else matches the format which I’ve described above.
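To inspect such header fields programmatically, you could split them into their clauses as sketched below in Python. This is a simplified parser under several assumptions: the field has already been unfolded, comments aren’t nested, and the email addresses in the example are placeholders that I made up. The function name parse_received is hypothetical as well.

```python
import re
from email.utils import parsedate_to_datetime

CLAUSES = {"from", "by", "with", "id", "for"}


def parse_received(value: str) -> dict:
    """Split the value of a Received header field into its clauses and its timestamp."""
    # Drop comments in parentheses (nested comments aren't handled in this sketch).
    value = re.sub(r"\([^()]*\)", " ", value)
    clauses, _, date = value.rpartition(";")
    tokens = clauses.split()
    result = {}
    for index, token in enumerate(tokens[:-1]):
        if token.lower() in CLAUSES and token.lower() not in result:
            result[token.lower()] = tokens[index + 1]
    result["date"] = parsedate_to_datetime(date.strip())
    # Interpret the protocol values from RFC 3848.
    protocol = result.get("with", "").upper()
    result["tls"] = protocol in ("ESMTPS", "ESMTPSA")
    result["authenticated"] = protocol in ("ESMTPA", "ESMTPSA")
    return result


example = (
    "from [192.168.1.2] (unknown [203.0.113.167]) "
    "(Authenticated sender: alice@example.org) "
    "by relay12.mail.gandi.net (Postfix) with ESMTPSA id 7974D200009 "
    "for <bob@example.com>; Thu, 3 Dec 2020 14:14:48 +0000 (UTC)"
)
info = parse_received(example)
print(info)
```

Keep in mind that the result is only as trustworthy as the header field itself, which is not authenticated.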
An incoming mail server which delivers a message
must add the MAIL FROM
address of the envelope
in a Return-Path
header field to the message.
While a message can have several Received
header fields,
it may have at most one Return-Path
header field.
If a message is resubmitted, for example by a filtering rule,
the Return-Path
header field should be removed,
and its value should be used as the MAIL FROM
address.
As we discussed earlier,
the Return-Path
header field can be different from the From
header field.
Return-Path: <[email protected]>
What a Return-Path
header field looks like.
Recover why you received a message
Local Mail Transfer Protocol (LMTP)
Content encoding
RFC 5322 specifies a format for text messages, whose lines may consist of at most 1’000 ASCII characters. Whenever the content of a message doesn’t fulfill this requirement, it must be encoded according to the Multipurpose Internet Mail Extensions (MIME) as specified in RFC 2045. When mail clients encode messages according to MIME, they indicate this with the following header field:
MIME-Version: 1.0
The header field used to indicate that a message is formatted using MIME.
In theory, the version number allows the Internet community to make changes to the standard. In practice, however, the standard didn’t specify how mail clients are supposed to handle messages with an unknown MIME version. As a consequence, you cannot change the version number without breaking email communications, which makes this header field completely useless. The version 1.0 survived the last 30 years and will likely survive the next 30 years. MIME also introduced additional message header fields, which we’ll cover in this and the following subsections.
Unless all involved SMTP servers support the BINARYMIME
extension
as specified in RFC 3030, which is rarely the case,
content containing non-ASCII characters or lines longer than 1’000 characters
must be encoded with one of the following two methods:
- Quoted-Printable: Any byte which doesn’t represent a printable ASCII character is encoded with the equality sign followed by the value of the byte encoded as two hexadecimal digits. Since = is used as the escape character, it has to be encoded with its hexadecimal ASCII value as =3D. Lines may be at most 78 characters long, including {CR}{LF}. Longer lines have to be broken by inserting ={CR}{LF}. All sequences of these three characters are removed when decoding the Quoted-Printable encoding. Since some mail servers add or remove trailing whitespace, tabs and spaces which are followed by {CR}{LF} also need to be encoded with hexadecimal digits. Any sequence of bytes can be encoded with this method. The Quoted-Printable encoding only makes sense, though, if most of the bytes are printable ASCII characters. This is the case for those European languages which share most of their characters with the English alphabet. Texts in such languages remain largely readable when using the Quoted-Printable encoding. The probability that a random byte falls into the range of printable ASCII characters is just a bit bigger than one third, though. Thus, the size of binary data, such as images, more than doubles with this encoding. The following tool allows you to encode and decode Quoted-Printable.
- Base64: Binary data and non-Western-European languages are best encoded with Base64. While hexadecimal digits encode 4 bits each, Base64 digits encode 6 bits each. 6 bits can represent 2^6 = 64 different values. Base64 uses the characters A–Z, a–z, 0–9, +, and / to encode these 64 values. What makes the Base64 encoding special is that bytes and digits don’t align: Three bytes are encoded with four Base64 digits. If you shift the input by one or two bytes, the Base64 encoding looks completely different. If the size of the input is not a multiple of three, one or two equality signs are appended to the output in order to make the output a multiple of four. This procedure is known as padding. In order to respect the line-length limit, a line break is inserted after at most 76 Base64 characters. Base64 encoding increases the size of the content by 33% and the line breaks add another 2.6% on top of that. You can encode and decode Base64 with the following tool.
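Both encodings are available in Python’s standard library, which makes it easy to try them out. The following sketch uses a sample text of my own and also recomputes the size overhead of Base64 mentioned above. Note that base64.encodebytes breaks lines with a single newline character rather than with CR + LF.

```python
import base64
import quopri

text = "Zürich = schön".encode("utf-8")

# Quoted-Printable: non-ASCII bytes and the equality sign are escaped.
qp = quopri.encodestring(text)
print(qp.decode("ascii"))

# Base64: three bytes are encoded with four digits, padded with equality signs.
b64 = base64.encodebytes(text)
print(b64.decode("ascii"))

# The overhead of Base64 in relation to the original content:
expansion = 4 / 3 - 1   # Four digits for every three bytes.
line_breaks = 2 / 76    # CR + LF after every 76 digits.
print(f"Expansion: {expansion:.1%}, line breaks: {line_breaks:.1%}")
```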
The mail client of the sender informs the mail client of the recipient with the following header field that the content is encoded:
Content-Transfer-Encoding: {Value}
This header field indicates with which method the content of the message has to be decoded.
The Value
can be quoted-printable
, base64
, or 7bit
if no content encoding has been used.
(If the 8BITMIME
or BINARYMIME
extensions are supported, the value can also be 8bit
or binary
.)
If the message already consists of only printable ASCII characters, the line-length limit can also be achieved with soft line breaks.
Character encoding (charset)
[Figure: ASCII uses 7 bits, ISO-8859-1 uses 8 bits, and UTF-8 uses 8 to 32 bits per character.] ASCII encodes only a subset of the characters defined in ISO-8859-1, and ISO-8859-1 encodes only a fraction of the characters available in UTF-8.
Percent encoding (URL encoding)
Decoding on the command line
How to encode (-e) and decode (-d) Quoted-Printable with qprint. Use brew install qprint to install this command on macOS.
How to encode (-e) and decode (-d) Quoted-Printable with the quoted-printable package if you already have Node.js.
How to encode (-e) and decode (-d) Base64 with OpenSSL. Use the option -A to have no newline characters inserted or expected.
How to encode and decode Quoted-Printable, Base64, and Percent with Perl, which is likely preinstalled on your computer. You can use explainshell.com to learn more about the used options. The code uses the MIME::QuotedPrint, MIME::Base64, and URI::Escape modules.
How to encode and decode Quoted-Printable, Base64, and Percent with Python, which is likely preinstalled on your computer. The commands use the quopri, base64, and urllib.parse modules and the first four commands operate on the raw bytes. If you want to use a character encoding other than UTF-8, you can just save the file which you use as the input accordingly.
Header encoding
RFC 2047 specifies how one can use non-ASCII characters in certain header field values, such as the subject and the display names. Instead of introducing new header fields to specify the encoding of existing header fields, the encoded values themselves indicate which character encoding and which content encoding have been used. This results in the so-called Encoded-Word encoding. Its format is as follows: =?{CharacterEncoding}?{ContentEncoding}?{EncodedText}?=, where CharacterEncoding is usually either ISO-8859-1 or UTF-8, ContentEncoding is either Q for Quoted-Printable or B for Base64, and EncodedText is the field value encoded according to the previous parameters.
The Quoted-Printable encoding is slightly modified when used to encode header field values: Question marks, tabs, and underscores are escaped with their hexadecimal representation and spaces are encoded with underscores. In order to adhere to the line-length limit, whitespace between adjacent Encoded Words is removed completely, which allows the encoder to break long words with a newline (and also to mix different character encodings).
The following tool does all of that for you.
It uses Quoted-Printable or Base64 depending on which encoding is shorter,
and it supports only ISO-8859-1
and UTF-8
.
In case you haven’t noticed yet: The ESMTP tool above
automatically encodes the Subject
and the Body
if necessary.
If you want to use non-ASCII characters in display names,
you have to paste the Encoded Word into the address field yourself.
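If you want to produce or read Encoded Words outside of the tool, Python’s email.header module can do this for you. The following is a small sketch with a sample subject of my own. Which of the two content encodings the library picks for a given input is up to the library.

```python
from email.header import Header, decode_header, make_header

# Encode a header field value which contains non-ASCII characters.
encoded = Header("Grüße aus Zürich", charset="utf-8").encode()
print(encoded)

# Decode an Encoded Word which uses ISO-8859-1 and the modified Quoted-Printable encoding.
decoded = str(make_header(decode_header("=?ISO-8859-1?Q?Gr=FC=DFe_aus_Z=FCrich?=")))
print(decoded)
```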
The following boxes explain how non-ASCII characters are supported in
domain names,
which is really interesting but also fairly advanced.
Punycode encoding
Unicode normalization
The four normalization forms of Unicode. Compatibility equivalence includes canonical equivalence.
Unicode case folding
Internationalized domain names (IDNs)
[Figure: Arbitrary user input → Remove certain characters → Case fold all characters → NFKC-normalize the labels → Reject certain characters → Encode with Punycode → Domain name in ASCII]
How IDNA2003 normalizes user input. The normalization fails only if the output contains prohibited characters or violates the rules for bidirectional text.
[Figure: Arbitrary user input → Reject symbols and punctuation marks → Lowercase or reject uppercase characters → NFKC-normalize or reject non-normalized labels → Accept only valid characters → Encode with Punycode → Domain name in ASCII]
How IDNA2008 normalizes user input. The steps in gray are required but not standardized.
IDNA2008 validation
Homograph attack
Email address internationalization (EAI)
Content type
Now that we can encode arbitrary content,
we need a way to inform the client how to interpret the decoded content.
This is done with the Content-Type
header field,
which has the following format:
Content-Type: {Type}/[{Tree}.]{Subtype}[+{Suffix}][; {Parameter}]*
The curly brackets need to be replaced as described below, the content in the square brackets is optional, and the asterisk indicates that there may be several parameters.
The content type is also called media type. IANA maintains a long list of registered media types. A content type consists of:
- Type: The primary content type describes the general type of data. If the client doesn’t recognize the subtype, it can use this information to decide what to do with the content. If the type is text, for example, it can display the raw data to the user, which wouldn’t make sense for binary files. The other top-level media types are image, audio, video, font, model for three-dimensional models, application for application-specific formats, message for email messages, multipart for multipart messages, and example for use in documentation.
- Tree: RFC 6838 defines four registration trees in order to keep different kinds of subtypes apart. There is the standards tree, which doesn’t use a tree prefix and is reserved for formats specified by a standards organization such as IETF; the vendor tree with the prefix vnd. for proprietary formats; the personal or vanity tree with the prefix prs. for experimental and non-commercial formats; and the unregistered tree with the prefix x. for unregistered and thus only locally used formats.
- Subtype: The subtype is the name of the content type. See the list of examples below.
- Suffix: A structured syntax suffix can be used to specify the syntax of the media type while leaving the semantics of the data to the subtype. IANA maintains a list of registered suffixes. Examples are +xml, +json, and +zip.
- Parameter: Parameters can be used to modify the media type. Each subtype specifies which parameters are required and which ones are optional. Optional parameters assume their default value if they are not provided. If several parameters are provided, their ordering is irrelevant, but each parameter may appear only once. The syntax of parameters is name=value. The best-known parameter is charset to specify the character encoding of text content. IANA maintains the standardized character sets and the values of some parameters.
The type, the subtype, and the parameter names are case-insensitive.
RFC 6838 doesn’t specify whether the tree and the suffix are also case-insensitive
but I assume that this is the case.
Whether a parameter value is case sensitive depends on the parameter.
The default content type for emails
is text/plain; charset=us-ascii
.
As specified in RFC 1945,
HTTP uses the same header field with the same media types.
Example content types: text/csv
, text/html
, image/png
, image/svg+xml
,
image/vnd.adobe.photoshop
, audio/mpeg
, video/mp4
, font/otf
,
application/javascript
, application/pdf
, application/vnd.apple.pages
, and application/vnd.ms-excel
.
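To see how these components fit together, the following Python sketch takes a Content-Type value apart. The type, the subtype, and the parameters are extracted by the standard library; splitting off the tree prefix and the suffix is my own simplification, and the function name describe is hypothetical.

```python
from email.message import Message


def describe(content_type: str) -> dict:
    """Split a Content-Type value into type, tree, subtype, suffix, and charset."""
    message = Message()
    message["Content-Type"] = content_type
    # The library returns the type and the subtype in lowercase.
    name, _, suffix = message.get_content_subtype().partition("+")
    tree, _, rest = name.partition(".")
    if tree not in ("vnd", "prs", "x"):
        tree, rest = "", name  # Standards tree: no prefix.
    return {
        "type": message.get_content_maintype(),
        "tree": tree,
        "subtype": rest,
        "suffix": suffix,
        "charset": message.get_param("charset"),
    }


for value in ["Text/Plain; charset=UTF-8", "image/svg+xml", "image/vnd.adobe.photoshop"]:
    print(describe(value))
```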
Enriched Text
An example Enriched Text message. Click here to use this example in the ESMTP tool above.
HTML emails
An example HTML message. Click here to use this example in the ESMTP tool above.
Email styling
External CSS: Load an external style sheet with a
<link>
element.
Internal CSS: Embed the style with a
<style>
element
inside the <head>
element.
Inline CSS: Repeat the style with the
style
attribute for every element.
Styling the color of links. Click here to use this example in the ESMTP tool above.
The styles that Gmail applies to the link in the above message.
How to solve Gmail’s link rendering problem. Click here to use this example in the ESMTP tool above. While the ID and the descendant selectors are supported by almost everyone, the child selector is not.
Email markup
Dynamic content
Soft line breaks
How to respect the line-length limit and message quoting
without having to resort to Quoted-Printable encoding.
In order to make whitespace visible,
I’ve replaced each space with ␣
and marked each newline with ↵
.
Note that the leading space on the last line is required.
Otherwise, this line would be a quoted 3
.
Message compression
[Figure: The Web client sends GET /file HTTP/1.1 with Accept-Encoding: br, gzip. The Web server responds with HTTP/1.1 200 OK and Content-Encoding: gzip.]
How HTTP compression works. The most used compressions are gzip and br. You can inspect these header fields with the developer tools of your browser.
How S/MIME can be used to compress messages using the ZLIB compression format as specified in RFC 1950.
The pipeline of commands to Base64-decode and ZLIB-decompress the string from the above example message.
Internationalized parameter values
Multipart messages
Now that we can send arbitrary files via email, we can design file formats to include several files in a single message body. RFC 2046 defines various content types to split a message into multiple parts. What all the multipart formats have in common is that they are text-based. This means that the various parts have to be separated with a character sequence which may not appear in any of the parts themselves. The character sequence is chosen by the sending mail client for each message and provided to the recipient in a content-type parameter called boundary. Let’s look at the two most common multipart types and leave the rest for the boxes below.
- multipart/mixed bundles independent parts into a single message. This content type is used to attach files to a message. If a client doesn’t recognize a multipart subtype, it should treat the content as multipart/mixed and show the recognized parts.

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="UniqueBoundary"

--UniqueBoundary
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

This message has an attachment.

--UniqueBoundary
Content-Type: image/png
Content-Transfer-Encoding: base64

iVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAYAAACNiR0NAAAACXBIWXMAAAsSAAAL
EgHS3X78AAAB4klEQVQ4y5VVwU7CQBDdKl5IVz8Fg95ZQH+BhBP/UEAxKv/AN3jG
/1ATPeqlJr169WRpi/NmZ8uCEGqTl53OzrzO7MxOVdiKFB6s2gwDkeuEAeGRkBAW
gkR02KvDFj4+hxMCpUq5T4h1d7LU3Zulbo+X5GQBGToC2XzC1vML1lmtPGMiciQ5
I/xgBRmtBSFHpPS+0O0rRzzz/JWf5kxf3CKSQneusea8whGyjbIQ4kI+lMHHkTou
TpMjA5kZpoQpoUnoEeJ1UiKzdiDKmdRG2ldeAWI+HxvZvfIeem9wihKhO09EKWuG
D4KDC4WKITpJAUbnQnREugOR3+RjWVkgkMkR8AdtlAMQzj1C4ExIHNkh4XkbIaff
YlJHOAdhos3IpbCN8IDw4hHmshZeoXLpjERJG7jNfYSpd9ZloUIj52mixT8IqdId
rvYX4bXsxVVLlYRVUn46vryD0wPhRPSnrqVC+Hkp7ytKwFU2w29CKK1Wk70e0seN
8otSpW0+CO9MZqIa9kSP5s85QssxqNrYLvrGxt7UXs/xqrGrXb2xmzqx6Jpik6IX
Jbq+8i90heGQs+zvwXZzOFQdX+7ess5EmZ2Nk7/ja/+AHXkDdiQDduLObPeA3fEL
mMvYTwWJ6Hb+An4Bgrjq/fe5+zgAAAAASUVORK5CYII=

--UniqueBoundary--

An example multipart/mixed message. Click here to use this example in the ESMTP tool above.
- multipart/alternative bundles alternative versions of the same content into a single message. This content type is used to provide a fallback version of the content for mail clients that don’t support the preferred content type. The versions are to be listed in increasing order of preference, which means that the preferred format comes last. This has the advantage that users of mail clients which don’t support multipart messages see the simplest version of the message first. Mail clients usually display the last part which has a content type that they support unless the user configured a different preference. multipart/alternative is most commonly used to provide a plaintext version of HTML messages for users of text-based mail clients, such as Elm, Pine, and Mutt, which cannot render HTML. To give you another example, I could have included a plaintext version of the Enriched-Text message so that Gmail could display that instead of offering me to download the unrecognized content.

MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="UniqueBoundary"

--UniqueBoundary
Content-Type: text/plain

Roses are red.

--UniqueBoundary
Content-Type: text/enriched

<bold>Roses</bold> <italic>are</italic> <color><param>red</param>red</color>.

--UniqueBoundary
Content-Type: text/html

<html>
  <body>
    <b>Roses</b> <i>are</i> <span style="color:red;">red</span>.
  </body>
</html>

--UniqueBoundary--

An example multipart/alternative message. Click here to use this example in the ESMTP tool above.
Since multipart/mixed
and multipart/alternative
are content types like any other, they can be nested,
which results in a tree of message parts.
The content encoding of multipart
parts
has to be 7bit
, 8bit
, or binary
,
and the boundary between the inner parts
has to be different from the boundary between the outer parts.
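You rarely have to assemble such a tree by hand. As a sketch of how a library can do it for you, the following Python code builds a multipart/mixed message whose first part is a multipart/alternative. The subject, the texts, the placeholder bytes, and the file name are made up for this example. The library chooses the boundaries itself when it serializes the message.

```python
from email import message_from_string, policy
from email.message import EmailMessage

message = EmailMessage()
message["Subject"] = "Roses"
# The plaintext version comes first because it's the least preferred one.
message.set_content("Roses are red.\n")
# Adding an alternative turns the message into multipart/alternative.
message.add_alternative("<html><body><b>Roses</b> are red.</body></html>\n", subtype="html")
# Adding an attachment wraps everything in multipart/mixed.
message.add_attachment(b"placeholder bytes", maintype="image", subtype="png", filename="rose.png")

raw = message.as_string()
print(raw)

# Parse the serialized message again in order to inspect its boundaries.
parsed = message_from_string(raw, policy=policy.default)
outer_boundary = parsed.get_boundary()
inner_boundary = next(parsed.iter_parts()).get_boundary()
```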
Boundary delimiter
The various parts of a multipart
message.
Click here
to use this example in the ESMTP tool above.
Content disposition
How to attach a file with a name. Click here to use this example in the ESMTP tool above.
Aggregate documents
<img src="cid:[email protected]">
references the part with Content-ID: <[email protected]>
.
Click here
to use this example in the ESMTP tool above.
Other multipart subtypes
After many encoding-related sections, I want to mention two more format-related aspects before moving on to issues with email.
One-click unsubscribe
If you are subscribed to a mailing list,
you may want to unsubscribe from the list after having received a message you no longer want to receive.
Most mailing lists include a link at the bottom of each sent message,
which you can click to unsubscribe from the mailing list.
Since this is a link like any other in the message, a browser window is opened
and you might have to click on additional buttons there to finally unsubscribe from the list.
This can be a bit of a hassle, especially on mobile phones.
Fortunately, RFC 2369 specifies an easier way to achieve the same.
Mailing lists should include a List-Unsubscribe
header field
so that mail clients can provide a uniform unsubscribe experience across mailing lists:
You simply click on “Unsubscribe” and your mail client takes care of the rest.
List-Unsubscribe: <https://example.com/unsubscribe?token=XYZ>,
<mailto:[email protected]?subject=Unsubscribe>
The List-Unsubscribe
header field provides a standardized way to unsubscribe from a mailing list.
If there are several options in angle brackets, the mail client should use the first one that it supports.
To be precise, RFC 2369 didn’t require
that there is no additional user interaction.
In fact, user confirmation was often necessary in order to prevent accidental unsubscriptions
triggered by anti-spam programs which simply fetch all the links in a message.
For this reason, RFC 8058
defines an additional header field with more precise semantics:
When the user clicks on “Unsubscribe”,
the mail client sends a POST
request
to the HTTPS resource specified in the List-Unsubscribe
header field
with the value of the new List-Unsubscribe-Post
header field in the body of the request.
The List-Unsubscribe-Post
header field has to contain List-Unsubscribe=One-Click
,
and both header fields must be covered
by a valid DKIM signature.
List-Unsubscribe-Post: List-Unsubscribe=One-Click
The List-Unsubscribe-Post
header field informs the receiving mail client
that it can unsubscribe the user with a simple POST
request.
The body of the POST
request is encoded with the content type
multipart/form-data
as specified in RFC 7578
or application/x-www-form-urlencoded
as specified by the
Web Hypertext Application Technology Working Group (WHATWG)
in their URL spec.
The request has to be sent without context information such as cookies.
The user has to be authenticated with a token in the URL.
POST /unsubscribe?token=XYZ HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 26
List-Unsubscribe=One-Click
The POST
request to unsubscribe a user from a mailing list
as generated by Gmail.
You find an example POST
request using multipart/form-data
in RFC 8058.
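From the perspective of a mail client, the procedure can be sketched as follows in Python. The function one_click_request is a hypothetical helper which only builds the request without sending it, the mailto address is a placeholder, and the required check of the DKIM signature is omitted here.

```python
import re
import urllib.request


def one_click_request(list_unsubscribe: str, list_unsubscribe_post: str):
    """Build the POST request for a one-click unsubscription or return None."""
    body = list_unsubscribe_post.strip()
    if body != "List-Unsubscribe=One-Click":
        return None
    # Use the first HTTPS resource among the options in angle brackets.
    for uri in re.findall(r"<([^<>]*)>", list_unsubscribe):
        uri = "".join(uri.split())  # Remove whitespace introduced by folding.
        if uri.lower().startswith("https://"):
            return urllib.request.Request(
                uri,
                data=body.encode("ascii"),
                headers={"Content-Type": "application/x-www-form-urlencoded"},
                method="POST",
            )
    return None


request = one_click_request(
    "<https://example.com/unsubscribe?token=XYZ>, <mailto:unsubscribe@example.com?subject=Unsubscribe>",
    "List-Unsubscribe=One-Click",
)
print(request.get_method(), request.full_url, len(request.data))
```

Since the request object is created from scratch, it carries no cookies or other context information.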
These two header fields are not only convenient for users, they also make unsubscribing more secure since mail clients don’t include them when forwarding a message. If you want to prevent others from unsubscribing you from a mailing list, you have to remove the unsubscribe link at the bottom of a message manually before forwarding the message.
Custom header fields
IANA maintains a long list of
registered message header fields.
The ones specified in an RFC and thus endorsed by IETF are called permanent header fields.
The ones registered for private use
without official recognition are called provisional header fields.
RFC 3864 outlines the registration procedure for header fields.
It’s common to start the name of custom header fields with X-
,
but unlike in the case of content types,
there is no requirement for this.
RFC 822 just promised that
official header fields will never start with X-
.
This provision regarding extension header fields was dropped
in later revisions, though.
During my research for this article,
I’ve inspected a ton of messages in their raw format.
The funniest header field I came across is the following one from Booking.com:
X-Recruiting: Like mail headers? Come write ours: https://careers.booking.com
This is one way to reach nerds like me. I’m not that excited about email headers, though. 😅
Issues
Email is both a blessing and a curse. On the one hand, email is by far the most important decentralized messaging service that we have, which should be reason enough to cherish it. The only other decentralized messaging service which comes close to email in terms of ubiquity is the Short Message Service (SMS). On the other hand, email has become so dysfunctional that many of us would like to leave it behind. In this section, we’ll look at the issues that plague modern email. In the last chapter, we’ll discuss how some of the security-related issues are being addressed.
Unsolicited messages which are sent in large quantities are called spam or junk mail. Spam is a brand of canned pork, which was introduced in 1937. Spam is likely an abbreviation for spiced ham. It became ubiquitous during and after World War II when food was rationed. The British comedy group Monty Python made fun of this fact in a famous sketch in 1970. The term got adopted to refer to undesirable things which come in excessive quantities – including junk mail.
Any messaging service which is popular, open, and free will have spam sooner or later. Thus, spam isn’t a result of the shortcomings of email but rather a consequence of its desirable properties. Since unsolicited messages are annoying, people try to eliminate junk mail from their inboxes with heuristics, blacklists, and challenges. While such techniques make spam bearable, they don’t solve the underlying problem of unsolicited mail: Anyone in the world can add tasks to the to-do list which is your inbox. In my opinion, mail clients should keep messages from unapproved senders out of your inbox so that the messages you actually want to receive don’t drown in the noise. This is similar to how I almost never accept calls from numbers that I haven’t stored in my phone. Even though this feature has to be tremendously useful for anyone who doesn’t want to be bothered by random salespeople and their never-ending follow-ups, HEY is the only mail client I know of which lets you screen your email senders. And just like I block call centers, I also block email senders, of course. However, the default shouldn’t be “allowed unless blocked” but rather “blocked unless allowed”. Additionally, messages are typically blocked on the client side because most mail clients still don’t support server-side filtering.
Heuristics
How binary classifiers are evaluated. When labelling spam, you want to have as few false positives as possible, even if this increases the rate of false negatives as well.
Blacklists
Graylisting
Patience
Challenges
Reputation
Address munging
Legal requirements
Privacy
If you send an email to someone, you want to share certain information with that person. Mail clients and mail servers, however, share a lot more information than what the users intended to share. In this subsection, I list all the subtle information disclosures that users likely aren’t aware of. If you know of other privacy leaks, please let me know.
Sender towards recipients
The recipient of a message often learns the following information about the sender:
- IP address: When you submit an email to an outgoing mail server,
most mail servers add your IP address to the message
as part of the trace information.
As a recipient, you find the IP address from which a message was sent in an
x-originating-ip
header field or in the square brackets in the first parentheses of the last Received
header field. (Each mail server through which an email passes adds an additional Received
header field at the top, which means that the first Received
header field, which was added by the outgoing mail server, is at the bottom.) There are three important implications of this. Firstly, your outgoing mail server leaks your rough physical location to all recipients. In other words, never email your boss that you’re sick at home from your holiday apartment. Similarly, recipients can tell whether you’re still at work or went home already. Secondly, recipients can launch a denial-of-service attack. Due to network address translation (NAT), the target would typically be your router rather than your machine, but your Internet connection goes down either way. Thirdly, if you visited the website of an email recipient anonymously or pseudonymously, the recipient now knows who this user on their website is. To find out whether your outgoing mail server includes your IP address in the messages that you send, send a message to yourself and search for your IP address. You can use the following tool with an empty input field to determine your IP address. You can also use the tool to locate the IP address of someone who sent you an email. The tool uses the geolocation API of ipinfo.io.
If you don’t want your email service provider to leak your own IP address, you can use a Virtual Private Network (VPN) or an overlay network for anonymous communication, such as Tor. Alternatively, you can use an email service provider which values your privacy, such as ProtonMail or Tutanota. Sending messages from the web interface of an email service provider usually also helps. For example, if you compose an email on gmail.com, your IP address is not included in the outgoing message. If you submit a message from your desktop client to Gmail using SMTP, on the other hand, your IP address is added by
smtp.gmail.com
in a Received
header field. While RFC 5321 does say that the IP address of the source should be included in the Received
header field, email service providers should ignore the standard in this regard, in my opinion. I understand that email service providers may want to record the IP address of the sender to prevent abuse of their service; I just see no reason to share this information with the recipients of a message. In fact, it might even be illegal to do so. Many privacy acts, such as the European General Data Protection Regulation (GDPR), forbid service providers from sharing personal data without the user’s explicit consent. Since the third party with whom the personal data is being shared can be different for every email, the user’s consent would be required every time they send an email. If you’re a lawyer and you think that this reasoning has some merit, let me know so that we can file a class-action lawsuit to bring this industry practice to an end.
- Device name: Mail servers also include the client’s argument to the
EHLO
command in the Received
header field. RFC 5321 requires that the client uses its fully qualified domain name if it has one or its IP address otherwise. In spite of this, Thunderbird and maybe other clients use the name of your device in the local network as the argument. On macOS, you find the name of your device in the “Sharing” tab of your “System Preferences”. By default, it starts with the first name of your user account. In my case, my computer is reachable under Kaspars-MacBook-Pro.local
in the local network. As a whistleblower, I might create a new email address and even use an anonymization service, such as Tor, just to have my mail client and mail server leak my real name. RFC 5321 even warns about exactly this problem. I reported this privacy bug to Mozilla Thunderbird on 2 December 2020. Until a fix is available, you can set the mail.smtpserver.default.hello_argument
option in the config editor to [192.168.1.1]
. Such a value is typical for the vast majority of people due to network address translation (NAT).
- Timezone: The sent date is usually encoded in the timezone of the sender.
By looking at the offset from Greenwich Mean Time (GMT),
the recipient learns roughly at which longitude a message was sent.
In my opinion, mail clients should always encode the
Date
field in Greenwich Mean Time.
- Mail client: Many mail clients put their name with their current version
into a
User-Agent
or X-Mailer
header field. Some mail clients even include the name and the version of the operating system on which they run. While such data is usually harmless, it can provide valuable information to someone who wants to attack you. Given the intricacies of email, mail clients can also be identified by how they delimit parts, how they label files, how they style messages, how they quote messages, and so on. This is known as fingerprinting, and it allows a recipient to determine whether separate messages were sent from the same client.
- Display names: Your mail client not only adds your name
as a display name in the
From
address, it also adds a display name for each recipient it knows. This can leak how you’ve stored a recipient in your address book (i.e. be careful under what name you store the colleague you’re having an affair with) and with which of the recipients you’ve been in contact before (because mail clients usually add display names from earlier conversations automatically). As a recipient, you have to inspect the raw message to see what the sender provided because your mail client typically overwrites the display names with the information from its own address book. In my opinion, mail clients should remove the display names of recipients before sending a message.
- Hidden recipients: The
Received
header field has an optional for
clause, which contains the address of the specified recipient. As recommended by RFC 5321, the for
clause is skipped when there are several recipients in the envelope of the message. As a consequence, a single non-hidden recipient learns that the message was also sent to hidden recipients if the for
clause is missing in the bottommost Received
header field. This means that the empty Bcc
field approach is used more often than intended.
- Attachments: The content disposition of attachments can include information such as when the file was created and when it was last modified. While it can be useful to preserve such information when mailing a file, sharing such information with the recipient can also be unexpected and undesirable. I don’t know how mail clients can determine the preferred option without cluttering the user experience. By default, they should err on the side of caution, which many do.
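If you want to check a raw message for several of the disclosures listed above, the following sketch in Python may serve as a starting point. The function summarize_leaks and the example message are hypothetical. The regular expressions assume the common "from NAME (HOST [IP]) by ..." shape of the Received header field, which not all mail servers follow, so treat the result as a hint rather than a reliable analysis.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime


def summarize_leaks(raw_message: str) -> dict:
    """Collect sender information that a recipient can read from the header."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    # The bottommost Received header field was added by the outgoing mail server.
    first_hop = " ".join(received[-1].split()) if received else ""
    ehlo = re.search(r"from\s+(\S+)", first_hop)
    ip = re.search(r"\[(?:IPv6:)?([0-9A-Fa-f.:]+)\]", first_hop)
    date = msg.get("Date")
    # utcoffset() is None if the Date field carries no usable timezone.
    offset = parsedate_to_datetime(date).utcoffset() if date else None
    return {
        "originating_ip": msg.get("X-Originating-IP"),
        "first_hop_ip": ip.group(1) if ip else None,
        "ehlo_argument": ehlo.group(1) if ehlo else None,
        "utc_offset_hours": offset.total_seconds() / 3600 if offset is not None else None,
        "mail_client": msg.get("User-Agent") or msg.get("X-Mailer"),
        "has_for_clause": bool(re.search(r"\bfor\s+<?\S+@", first_hop)),
    }


# Hypothetical message for illustration only (addresses from documentation ranges).
EXAMPLE = (
    "Received: from mx.example.org (mx.example.org [203.0.113.7])"
    " by mail.example.net; Mon, 7 Dec 2020 10:00:02 +0000\n"
    "Received: from Alices-MacBook-Pro.local (unknown [198.51.100.23])"
    " by smtp.example.org with ESMTPSA; Mon, 7 Dec 2020 11:00:00 +0100\n"
    "Date: Mon, 7 Dec 2020 11:00:00 +0100\n"
    "User-Agent: ExampleMail/1.0 (Macintosh)\n"
    "From: Alice <alice@example.org>\n"
    "To: Bob <bob@example.net>\n"
    "Subject: Hi\n"
    "\n"
    "Hello\n"
)

leaks = summarize_leaks(EXAMPLE)
for name, value in leaks.items():
    print(f"{name}: {value}")
```

The sketch doesn't cover display names and attachment metadata, and it makes no attempt at fingerprinting.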
Assuming that all recipients can be trusted is foolish. If someone pretends to be interested in your work, you’ll likely reply to them.
Recipient towards sender
There are two ways in which the sender can track the recipient: By including remote content in the message and by redirecting external links. If the sender can trick you into replying to them or your mail client sends a read receipt, then all the above privacy issues also apply, of course.
Remote content
HTML emails can include remote content,
which is fetched by the mail client when it renders the message.
Images are by far the most common type of remote content.
They are usually included with the <img>
element
or with the background-image
property.
Some mail clients support external style sheets
through the <link>
element,
but internal CSS can also have
@import
statements
to load Web fonts
and other styles with the url()
function.
There are other elements, such as <audio>
,
<video>
, and
<iframe>
,
which can also be used to include remote content, but not all mail clients support them.
Figure (1. Fetch email, 2. Fetch content): After fetching the email from the trusted mail server, the mail client fetches the remote content from the untrusted web server.
Remote content violates three fundamental principles of email:
- Offline reading: Since mail clients usually fetch the remote content only when you open the message, substantial parts of the message can be missing when your computer is not connected to the Internet. Since most mail clients don’t cache the remote resources, being online when you open the message for the first time isn’t enough.
- Immutable content: Most users probably think that
once they have received an email, the sender can no longer modify it because
their inbox contains an independent copy of the message, to which the sender has no access.
Unfortunately, this assumption doesn’t hold for HTML emails with remote content.
Since remote content isn’t cached, different content can be provided every time you open a message.
Some clever engineers used this circumstance to include a dynamic Twitter feed in an email.
If you’re not aware of this “feature”, though, you might fall for a scammer
who seemingly predicted the development of some market accurately.
And even if the remote content isn’t modified,
you can no longer view the original message once the sender stops hosting it.
The situation gets even worse if the domain on which the remote content is hosted is transferred to a new owner
or if the web server which hosts the remote content is compromised by an attacker.
Furthermore, remote content isn’t covered by message signatures.
In theory, some of these security issues could be addressed with a technique known as
subresource integrity (SRI).
If the sender included the hash of the resource
in the original message, then the resource could no longer be modified afterwards.
Unfortunately, subresource integrity is
specified only for
<script>
and <link>
elements. While a future revision of the specification might add support for integrity checks to other elements, there are no plans for this yet.
- Reading privacy: Whenever your mail client fetches a remote resource,
the web server operator learns when and from where the resource has been accessed.
Since most mail clients include a
User-Agent
header field in their HTTP request, the web server operator also learns which mail client you use. For the reasons mentioned in the previous point, senders should reference only remote content which they control. Email newsletters often include remote content with a personalized URL just to track who opened the message when and from where. Based on this data, the sender can determine what percentage of recipients opened the email, which is known as the open rate. It’s important to note that your privacy when reading emails is not worse than when browsing the Web. The crucial difference is that on the Web, you go to a website, whereas in the case of email, the website comes to you. Since you don’t want to provide your IP address to anyone, you should disable remote content in your mail client.
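To illustrate the idea behind subresource integrity mentioned above, here is a short Python sketch which computes an integrity value in the format used on the Web: the name of the hash algorithm, a hyphen, and the Base64-encoded digest. The function names are mine, and I'm not aware of any mail client which performs such a check for images. This only shows what a check could look like.

```python
import base64
import hashlib


def sri_value(resource: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value such as 'sha384-...' for the given resource."""
    digest = hashlib.new(algorithm, resource).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(resource: bytes, integrity: str) -> bool:
    """Check whether a fetched resource still matches the value from the message."""
    algorithm = integrity.partition("-")[0]
    return sri_value(resource, algorithm) == integrity


original = b"content of the remote image at the time of sending"
integrity = sri_value(original)

print(integrity)
print(matches(original, integrity))
print(matches(b"content which was swapped afterwards", integrity))
```

If the sender included the integrity value in the signed message, a client could refuse to display a resource which no longer matches it.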
In my opinion, remote content should never have been supported by mail clients. If people insist on incorporating related files into a message, they can use aggregate documents. Now that remote content is used so widely, we have to live with the above drawbacks.
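If you’re curious which remote resources an HTML message would load, you can scan it with a few lines of Python. The following sketch is a rough approximation under my own assumptions: the class name RemoteContentFinder is made up, it only looks at the elements and CSS constructs mentioned above, and it treats every http, https, or protocol-relative URL as remote.

```python
import re
from html.parser import HTMLParser

REMOTE = re.compile(r"^(?:https?:)?//", re.IGNORECASE)
# Matches url(...) and quoted @import targets in CSS.
CSS_URL = re.compile(
    r"""url\(\s*["']?([^"')\s]+)|@import\s+["']([^"']+)["']""", re.IGNORECASE
)


class RemoteContentFinder(HTMLParser):
    URL_ATTRIBUTES = {
        "img": "src", "link": "href", "audio": "src", "video": "src", "iframe": "src",
    }

    def __init__(self):
        super().__init__()
        self.found = []  # list of (element, URL) pairs
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        url = attributes.get(self.URL_ATTRIBUTES.get(tag, ""))
        if url and REMOTE.match(url):
            self.found.append((tag, url))
        self._scan_css(tag, attributes.get("style") or "")
        if tag == "style":
            self._in_style = True

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:
            self._scan_css("style", data)

    def _scan_css(self, tag, css):
        for match in CSS_URL.finditer(css):
            url = match.group(1) or match.group(2)
            if REMOTE.match(url):
                self.found.append((tag, url))


# Hypothetical HTML body for illustration only.
HTML = """<html><head>
<style>@import url("https://fonts.example.com/font.css"); body { color: black; }</style>
</head><body>
<div style="background-image: url('https://img.example.com/bg.png')">
<img src="https://tracker.example.com/pixel.gif?id=123" width="1" height="1">
<img src="cid:logo@example.com">
</div></body></html>"""

finder = RemoteContentFinder()
finder.feed(HTML)
finder.close()
for element, url in finder.found:
    print(element, url)
```

The image which is referenced with a cid URL is part of the message itself and is therefore not reported.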
Proxying remote content
Figure (1. Fetch email, 2. Fetch content, 3. Fetch content): How remote content can be fetched via a proxy server in order to protect the user’s privacy to some degree.
How to disable remote content
Link tracking
Emails often contain links to websites. Instead of linking to the target site directly, the sender can rewrite the link in such a way that your web browser sends a request to their tracking server, which in turn redirects your browser to the actual web server:
Figure (1. Fetch, 2. Open, 3. Request, 4. Redirect, 5. Request): How clicks on links can be tracked by the sender of a message. In general, you cannot trust the servers in orange.
Unlike tracking pixels, link tracking also works in plaintext emails and when remote content is disabled. If the target website isn’t identifiable in the tracking link, you have no other choice than to request its address from the tracking server if you really want to see the advertised content. The sender can use tracking links to measure what percentage of recipients opened the link, which is known as the click-through rate (CTR). The same technique is often used on social media in combination with URL shortening to determine the reach of a post.
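Whether the target website is identifiable in a tracking link can sometimes be checked mechanically. The following Python sketch looks for complete URLs in the query parameters of a link. The function embedded_targets and both example links are hypothetical. Tracking services which encode the target in the path or behind an opaque identifier are not covered.

```python
from urllib.parse import parse_qsl, urlsplit


def embedded_targets(link: str) -> list[str]:
    """Return the URLs which are embedded in the query parameters of a link."""
    targets = []
    # parse_qsl already undoes the percent-encoding of the parameter values.
    for _, value in parse_qsl(urlsplit(link).query):
        parts = urlsplit(value)
        if parts.scheme in ("http", "https") and parts.netloc:
            targets.append(value)
    return targets


transparent = "https://click.tracker.example/r?u=https%3A%2F%2Fshop.example.com%2Foffer&id=abc123"
opaque = "https://click.tracker.example/r/9f8e7d"

print(embedded_targets(transparent))
print(embedded_targets(opaque))
```

In the first case, you could visit the target directly and skip the tracking server. In the second case, only the tracking server knows where the link leads.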
Since seeing is believing, I wrote a little tool to track emails. You can generate a unique token and then subscribe to the associated events below. You can send the tracking link and the tracking image to someone using your mail client or the ESMTP tool above. I’ve deployed the tracking server on heroku.com. As you can see in its source code, my server doesn’t store anything, but Heroku logs the last 1’500 requests, which includes the token, the link, and your IP address. I don’t persist the log file, but I might check it from time to time for troubleshooting. In order to determine where a request was made from, the tool uses the free API from ipinfo.io. You can also use the tool to see from where social media apps request a URL to generate a link preview or to convince yourself that the Tor browser indeed connects from a different location every time you restart it.
Security
Security and the lack thereof have been a topic throughout this article. In this section, I shine a light on some additional aspects.
Spoofing
As we saw earlier, the sender of an email can easily be spoofed because, at least historically, emails weren’t authenticated. Somewhat frustratingly, RFC 5321 and some companies see forged sender addresses more as a feature than as a bug. Criminals abuse this “feature” to trick unsuspecting users into performing actions or disclosing information, which they wouldn’t do otherwise. Exploiting the credulity of people is known as social engineering. Besides impersonating a trusted organization for phishing, a common attack is to send a victim an email which seemingly comes from their own address. In the message, the attacker claims that they’ve compromised the victim’s computer and that they’ve recorded the victim masturbating to porn. The attacker threatens to send the recording to all the victim’s contacts unless they receive a payment, usually in Bitcoin, within a couple of days. This form of blackmail is known as sextortion. If you receive such an email yourself, how do you know that the attacker’s claim is wrong? First of all, you now know that the sender address of emails can easily be forged and that there is no reason to assume that your account has been compromised. But more importantly, if there were an easy way to increase the fraction of people who pay the ransom, criminals would certainly make use of it. In the case of sextortion, they would just have to include a screenshot of the recording and the addresses of some contacts to make presumably the large majority of people pay. Given that this is (usually) not the case, there’s no reason to worry. Do people fall for this crap? The answer is yes, unfortunately. The first time I received such a message was on 13 January 2019. The fraudster demanded 356 euros in Bitcoin to remain silent and was stupid enough to provide the same Bitcoin address to several victims. Since all Bitcoin transactions are public, we know exactly how much money they made: 5.379 BTC, which was worth around 20’000 USD at the time.
This also means that they had no way to know who of their victims actually paid, which made their threat even less credible to anyone who has a basic understanding of blockchains.
Besides social engineering, spoofed sender addresses can be abused where emails are used for authentication. For example, people can often unsubscribe from mailing lists via email. Even if this is not the case, many mailing lists automatically remove subscribers to whom several messages in a row couldn’t be delivered. Unless a mailing list uses unpredictable variable envelope return paths (VERP), bounce messages can easily be forged, which means that you can unsubscribe other people from the mailing list. Similarly, it’s often the case that only approved senders can send a message to all subscribers of a mailing list. Anyone who knows how to spoof emails can easily bypass this restriction and spam the mailing list.
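One way to make variable envelope return paths unpredictable is to include a keyed hash of the subscriber’s address in the return path. The following Python sketch shows the idea. The address layout, the secret key, the domain lists.example.org, and the function names are assumptions of mine and not taken from any particular mailing list software.

```python
import hashlib
import hmac

# Hypothetical key which only the mailing list server knows.
SECRET_KEY = b"replace-with-a-random-secret"


def _tag(subscriber: str) -> str:
    mac = hmac.new(SECRET_KEY, subscriber.lower().encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


def return_path(subscriber: str, list_domain: str = "lists.example.org") -> str:
    """Encode the subscriber and an unpredictable tag into the envelope sender."""
    local, _, domain = subscriber.partition("@")
    return f"bounce+{local}={domain}+{_tag(subscriber)}@{list_domain}"


def subscriber_from_bounce(recipient: str):
    """Return the subscriber a bounce refers to, or None if the tag is invalid."""
    local_part = recipient.partition("@")[0]
    try:
        prefix, rest = local_part.split("+", 1)
        encoded, tag = rest.rsplit("+", 1)
    except ValueError:
        return None
    if prefix != "bounce":
        return None
    local, _, domain = encoded.rpartition("=")
    subscriber = f"{local}@{domain}"
    expected = _tag(subscriber).encode("ascii")
    # Constant-time comparison so that the tag cannot be guessed byte by byte.
    return subscriber if hmac.compare_digest(tag.encode("utf-8"), expected) else None


genuine = return_path("alice@example.com")
forged = "bounce+alice=example.com+0000000000000000@lists.example.org"

print(genuine)
print(subscriber_from_bounce(genuine))
print(subscriber_from_bounce(forged))
```

Since an attacker doesn’t know the secret key, they cannot construct a valid bounce address for another subscriber.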
Email address spoofing can be prevented by enforcing domain authentication, which I’ll cover in the last chapter of this article.
Phishing
Impersonating a trusted organization to obtain sensitive information or payments from gullible users is known as phishing. Phishing emails often direct their victims to a fraudulent website which looks exactly like the legitimate website. By providing a pretext, the attacker tries to get the victims to perform a specific action, such as entering their username and password or initiating a payment with their credit card. Phishing attacks can target specific individuals or a diverse group of people. If they’re not just an advance-fee scam, they usually require some technical skills to execute them. This is why most phishing attacks are motivated by financial gain rather than a desire to harass or stalk the victim. While requesting a payment leads to a direct success for the criminals, usernames and passwords can be used to launch further attacks from the victim’s account. For example, the credentials of an employee can be used to infiltrate a company in order to obtain trade secrets or to install ransomware on their computers.
Phishing attacks come in all shapes and sizes, but you can reduce your risk by sticking to the following principles:
- Always be suspicious: If an email prompts you to perform a certain action, your alarm bells should ring. Have you been prompted for similar actions before? Is the time frame to perform the action unusually short? Is there a reasonable default option if you don’t perform the action? Does the action involve the disclosure of sensitive information or a payment?
- Don’t click on links: Phishing attacks require that you take the bait. Create a bookmark for all the websites where you have an account. Make it a habit to navigate to these websites yourself instead of following links. If an email says that a subscription is about to expire, log in to the website of the service provider with the bookmark and not the link. Using a bookmark (or a search engine) to navigate to a website is better than relying on the address autocompletion of your browser. If you clicked on a dubious link by mistake in the past, the fraudulent URL is still in your browser’s history and you may not be able to recognize it as such.
- Hover over links: If you can’t suppress your urge to click on a link,
move your mouse over the link first and verify whether the status bar
at the bottom of the window indeed displays the address you want to visit.
You should always do this because the text of a link can be misleading.
For example, a link whose text reads www.google.com can take you to
Bing, not Google.
You should check the destination of a link before you click on it.
If you check the destination of a link only in the opened browser window,
you have already confirmed to the attacker that you click on links,
and the visited website might have already infected your computer
with malware.
Unfortunately, link tracking can make it quite difficult to recognize
whether the destination of a link is legitimate.
Furthermore, not all companies prime their users to trust only a single domain.
For example, PayPal, of all companies,
directs their users to
paypal-communication.com
instead of paypal.com
when informing them about changes to the general terms and conditions. Additionally, homograph attacks can make it difficult or even impossible to recognize that the target domain is not the legitimate one. This is one more reason why you shouldn’t click on links in the first place. The only exceptions to this rule are links to articles on which you won’t perform any actions. However, this means that you have to remember for each tab of your browser whether the address came from a trusted or an untrusted source. Anything you open on an untrusted page also cannot be trusted. Some mail clients, such as Apple Mail, don’t have a status bar and show the destination address in a tooltip instead. And yes, Apple Mail is smart enough to override any tooltips that a sender provided with the title
attribute. I’ve tested this.
- Use a password manager: Seriously. Password managers not only allow you to have a long, randomly-generated password for each website, they also prevent you from entering them on the wrong websites. To be precise, you can still paste your passwords into any input fields you like. Password managers just won’t do this for you if the domain is different. This is just one more level of defense, which is especially useful for innocuous-looking actions that don’t trigger your alarm bells. For example, some websites require you to log in before you can unsubscribe from their newsletter, and such an email and login can be bogus, of course.
- Verify the sender: Who sent you the email?
Since the sender of an email can (still) be spoofed,
a trusted sender shouldn’t lower your level of suspicion much.
If the domain in the
From
address doesn’t belong to the impersonated organization, though, you should almost certainly ignore and delete the message. Mail clients could do way more to protect their users from phishing attacks. For example, changing the policy for incoming emails from “allowed unless blocked” to “blocked unless allowed” would likely help a lot in shifting the mindset of users. Mail clients could also display the country of origin for each message, warn the user if the message isn’t authenticated or if the clicked link leads to a domain which is different from the From
address, etc.
- Disable display names: While spoofing sender addresses can be prevented by technical means, the sender can choose their display name at will. Since sender-chosen means attacker-chosen, users shouldn’t be confronted with unverified display names. Unfortunately, all the mail clients I’ve checked handle this aspect so badly that I had to write a separate box on this topic.
- Confirm out-of-band: If a known sender asks you to perform an action which has far-reaching or irreversible consequences, contact the sender through a different communication channel and let them confirm the request before executing it. Obeying orders blindly is dangerous from a security perspective and subordinates should be trained and encouraged to question them.
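The check behind the “Hover over links” principle, namely whether the text of a link names a different host than its destination, is easy to automate. Here is a minimal Python sketch under my own assumptions: the class MisleadingLinkFinder is hypothetical, it compares host names exactly (so google.com and www.google.com count as different), and it cannot detect homograph attacks or tracking links.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Link text which starts with something that looks like a domain name.
DOMAIN_IN_TEXT = re.compile(
    r"^(?:https?://)?([a-z0-9-]+(?:\.[a-z0-9-]+)+)", re.IGNORECASE
)


class MisleadingLinkFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.misleading = []  # list of (link text, actual destination) pairs
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            shown = DOMAIN_IN_TEXT.match(text)
            actual = urlsplit(self._href).hostname or ""
            if shown and shown.group(1).lower() != actual.lower():
                self.misleading.append((text, self._href))
            self._href = None


# Hypothetical HTML body for illustration only.
HTML = (
    '<p><a href="https://www.bing.com/">www.google.com</a></p>'
    '<p><a href="https://www.google.com/">www.google.com</a></p>'
    '<p><a href="https://example.com/docs">Read the docs</a></p>'
)

finder = MisleadingLinkFinder()
finder.feed(HTML)
finder.close()
print(finder.misleading)
```

Only the first link is reported because its text names a host which differs from the host in its destination.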
Confidentiality and integrity
As we saw earlier, the percentage of emails which are encrypted and authenticated in transit increased significantly over the last decade. When you send an email, though, there is no guarantee that the confidentiality and integrity of your message is protected when it is relayed from your outgoing mail server to the incoming mail server of the recipient. This is especially problematic when email is used to perform security-critical operations, such as password resets. Due to backward compatibility, the email protocols are secure only against passive attackers. I will cover the efforts to make email secure against active attackers in the last chapter.
In my opinion, mail clients should warn their users if the incoming mail server of one of the recipients doesn’t support strict transport security. You can increase the pressure on email service providers only by increasing the awareness of users. Gmail provides an easy way to see whether a received message has been authenticated and encrypted in transit, which allows users to assess the authenticity and, somewhat misleadingly, the confidentiality of a message at least after it has been transmitted:
You can click on the little triangle to see more details in Gmail’s web interface. mailed-by
indicates a successful SPF check, signed-by
a valid DKIM signature. security
indicates that the outgoing mail server of the sender used STARTTLS.
Reliable delivery (availability)
Besides confidentiality and integrity, information security is also concerned with the availability of a service. Since your message might be silently discarded as spam or land in the recipient’s spam folder, which they don’t check on a regular basis, you can never be certain that a (new) recipient received your message in their inbox. Most people minimize this risk by not hosting their emails themselves. Once domain authentication is commonplace, which solves the problem of backscatter, we can hopefully fight spam with other techniques so that self-hosting becomes feasible again.
Custom email filters are another source of unreliability.
When users receive too many emails which they don’t want or can’t handle,
they are tempted to set up a rule which moves or deletes them automatically.
Personally, I have a rule which deletes all messages
which contain certain keywords, such as “lotto winner”, in their subject.
I’ve recently also added some top-level domains,
such as .cheap
and .city
, to this list.
If the From
address ends in one of these domains, the message is deleted immediately.
My custom anti-spam rule, which also includes the domains of sales companies, does wonders for my inbox.
The problem with custom email filters, though, is that they often work
like shotguns:
They certainly hit the messages you wanted to remove from your inbox,
but due to their simplicity, they likely bring down legitimate messages as well.
As long as senders send emails automatically, recipients will remove them automatically.
Casualties are to be expected in such a setup.
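A rule like the one described above can be written in a few lines of Python, which also makes its bluntness apparent. This is only a sketch of the rule, not the filter I actually use: the function should_delete and the example messages are made up, and encoded subjects are not decoded before matching.

```python
from email import message_from_string
from email.utils import parseaddr

BLOCKED_KEYWORDS = ("lotto winner",)
BLOCKED_TLDS = (".cheap", ".city")


def should_delete(raw_message: str) -> bool:
    """Apply the keyword rule to the subject and the domain rule to the From address."""
    msg = message_from_string(raw_message)
    subject = (msg.get("Subject") or "").lower()
    if any(keyword in subject for keyword in BLOCKED_KEYWORDS):
        return True
    _, address = parseaddr(msg.get("From") or "")
    domain = address.rpartition("@")[2].lower()
    return domain.endswith(BLOCKED_TLDS)


# Hypothetical messages for illustration only.
spam = "From: Lucky <promo@example.com>\nSubject: You are a LOTTO WINNER\n\nHi\n"
casualty = "From: Town Hall <info@townhall.city>\nSubject: Your appointment\n\nHi\n"
normal = "From: Alice <alice@example.org>\nSubject: Lunch?\n\nHi\n"

print(should_delete(spam), should_delete(casualty), should_delete(normal))
```

The second message shows the shotgun effect: a legitimate sender is hit just because of its top-level domain.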
Quoting HTML messages
HTML emails can be styled. When you reply to or forward such an email, your mail client has to make sure that the quoted message cannot change the appearance of your own message. If the quoted message isn’t escaped properly, an attacker can inject text into the victim’s response. When quoting an HTML message, mail clients need to ensure the following two things:
- Scoped styles: The style of the quoted message may not leak into the surrounding message.
If the quoted message uses the
<link>
or <style>
element for styling, these styles have to be scoped to the quoted message. Achieving this would be trivial if the scoped
attribute hadn’t been removed from the HTML specification. If we’re lucky, we might get an @scope
at-rule in a future CSS standard. Browsers started to support the Shadow DOM API, which can be used to encapsulate components with JavaScript. Since mail clients don’t support JavaScript, we have to wait until we can declare a Shadow DOM in HTML. So how do mail clients handle this? Gmail simply removes the <style>
element when quoting an HTML message. If you want to make sure that your message is still displayed properly when it is replied to or forwarded, you have to keep inlining the styles. Apple Mail inlines internal CSS when quoting an HTML message. Yahoo Mail moves the <style>
element from the <head>
into the <body>
and prefixes each rule with an ID, which it also assigns to the <div>
element which contains the quoted message. Thunderbird only moves the <style>
element from the <head>
into the <body>
and thus fails to scope the styles. Outlook.com behaves differently for replies and forwarded messages: It fails like Thunderbird in the former case and inlines styles incorrectly in the latter case.
- No overlays: CSS can be used to move HTML elements away from their default position in a document.
This becomes a problem when HTML elements in the quoted message can be moved above the
attribution line
since email users are trained to perceive everything above the attribution line
as coming from the sender of the message.
I can think of three ways in which HTML elements can be moved around with CSS,
but I wouldn’t be surprised if CSS has more ways to achieve this.
Firstly, there is the
position
property, with which elements can be moved relative to their default position or to an absolute position in the document. Secondly, the transform
property can be used to translate
HTML elements (and to scale
and to rotate
them). Thirdly, negative margins have a similar effect to position: relative;
. Without having tested all possibilities, I have the impression that webmail clients handle this quite well. For example, Gmail doesn’t list position
and transform
under supported CSS properties and also removes negative margins before displaying a message. Desktop clients, on the other hand, struggle with this. position: absolute;
and margin-top: -200px;
work in Apple Mail and in Thunderbird. Restricting styles to inline CSS isn’t enough to scope the styles. Doing so would make it more difficult, though, to show the injected text only in the reply but not when composing the reply.
If analyzing the raw message before forwarding or replying to a message is too much to ask from you, you have only two options to avoid these issues: Choose a mail client which cares about your security or enforce that all messages are composed in plaintext. Apple Mail allows you to configure this in the “Composing” tab of your “Preferences”: Change the “Message format” to “Plain text” and disable “Use the same message format as the original message”. If you use Thunderbird, you can disable “Compose messages in HTML format” under the “Composition & Addressing” tab of your “Account Settings”.
Thunderbird example exploit
How to exploit Thunderbird’s failure to scope the styles of the quoted message.
The attack uses the ::before pseudo-element to inject the text, and p[_moz_dirty] to hide the injected text during composition.
Outlook.com example exploit
How to exploit Outlook.com’s failure to scope the styles of the quoted message.
Since Outlook.com doesn’t copy styles like div[style]:before and div:first-child:before to the reply, I had to abuse the <hr> element to make the injected text appear only once.
Different appearances
Another issue with email is that the same message can appear differently to different recipients.
This is a problem whenever you refer to the content of an earlier message,
no matter whether you quote the message
or reference it in the In-Reply-To
header field.
Until mail clients address this issue, you must repeat the content you refer to.
Emails can appear differently for three reasons:
- multipart/alternative: Multipart messages can include different versions of the same content so that the mail client of the recipient can display the last version whose content type it supports. However, nothing guarantees that the various parts contain the same content. Spam filters might flag messages whose alternative parts diverge too much from one another, but determining whether different parts contain the same content is more difficult than it seems. Let’s look at an example:

A simple HTML message whose plaintext version conveys different content.

```html
<html>
  <body>
    Hi boss, can you confirm to our accountant in Cc that my monthly salary
    is increased by USD 100<span style="font-size: 0;">0</span> starting next month?
  </body>
</html>
```

If your boss uses an HTML-capable mail client, they will see USD 100 in the message. When your boss replies to this message with “Yes, that’s what we agreed.”, all the mail clients I usually mention in this article generate a Content-Type: text/plain version of the reply, which includes USD 1000. If you know that your accountant uses a plaintext-only mail client, this attack will work. On most HTML-capable mail clients, you can see the plaintext version only by inspecting the raw message. Thunderbird, however, allows you to change which part is being displayed by switching the “Message Body As” in the “View” menu. If you use display: none; instead of font-size: 0;, Apple Mail won’t include the additional zero in the plaintext reply as it uses something like innerText to determine the plaintext content. There are plenty of ways to hide content with CSS, though, and the plaintext conversion algorithm would have to consider them all. Since computing what is actually being displayed is impractical, the solution has to be to force all content to render in HTML messages by disabling those CSS properties. Since the original message could already have contained conflicting alternative parts, mail clients which take security seriously should probably warn their users when they reply to multipart/alternative messages because most mail clients hide the quoted messages in email conversations. All the mail clients I’ve tested generate the quoted message in the plaintext part of the reply from the HTML part that they’ve displayed to the user. If it wasn’t for malicious CSS styles, mail clients wouldn’t prepend your reply to content you haven’t seen. The only problem that remains is that the In-Reply-To header field doesn’t specify which alternative part your message refers to.
- Conditional styles: Even without alternative parts, the same message can be rendered differently on different devices due to media queries. The following message shows a different text on devices with a small screen than on devices with a large screen:

A simple HTML message which is displayed differently on different devices.

```html
<html>
  <head>
    <style type="text/css">
      @media (max-width: 599px) {
        .large { display: none; }
      }
      @media (min-width: 600px) {
        .small { display: none; }
      }
      .touch { display: none; }
      @media (pointer: coarse) {
        .touch { display: inline; }
      }
    </style>
  </head>
  <body>
    <p>
      You have a
      <span class="large">large</span>
      <span class="small">small</span>
      <span class="touch">touch</span>
      screen.
    </p>
  </body>
</html>
```

Media queries are useful to design websites for various screen sizes, which is known as responsive web design. Since emails are read on a wide variety of devices, media queries are an important technique to make them look good on all devices. Since media queries and selectors aren’t allowed in the style attribute, conditional rendering is much easier in mail clients which support internal or external CSS, which is the vast majority by now. In order to prevent this attack, Thunderbird no longer supports media queries. In my opinion, this is the wrong approach and the fix should rather be to force all content to render. Styles should affect only how content is displayed, not which content is being displayed. The supported media features vary greatly among clients. For example, the screen width media queries are supported by Gmail, Outlook.com, Yahoo Mail, and Apple Mail (also on iOS). The pointer media query, which can be used to detect a touch screen, is removed by the Gmail and Yahoo Mail web clients.
- Different implementations: As long as different users use different mail clients which sanitize emails differently, attackers can draft messages which are displayed differently to different recipients. Since it’s easy to learn which mail client someone uses, it’s often not difficult to have some part of a message be shown or hidden for a specific recipient. I’ve drafted such a message for you:

A simple HTML message which is displayed differently by different clients.

```html
<html>
  <head>
    <style type="text/css">
      .apple-mail, .outlook { display: none; }
      @media (pointer) {
        .apple-mail { display: inline; }
        div .apple-mail { display: none; }
        div .outlook { display: inline; }
      }
      @media (min-width: 0px) {
        .thunderbird { display: none; }
      }
      p:first-child .gmail { display: none; }
      .yahoo-mail { display: none; }
      p:first-child .yahoo-mail { display: inline; }
      body .yahoo-mail { display: none !important; }
    </style>
  </head>
  <body>
    <p>
      You're reading this message
      <span class="apple-mail">in Apple Mail.</span>
      <span class="thunderbird">in Thunderbird.</span>
      <span class="gmail">on mail.google.com.</span>
      <span class="outlook">on outlook.live.com.</span>
      <span class="yahoo-mail">on mail.yahoo.com.</span>
    </p>
  </body>
</html>
```
As long as not all mail clients prevent senders from hiding content with CSS,
email styling can be abused.
Don’t we have the same problem with websites?
In principle, yes, but the difference lies in the expectation of users.
On the Web, you know that pages are often customized and that their content can change at any moment.
In the case of email, however, you expect that everyone sees the same content,
especially when you quote another message.
If you reply to messages without quoting them,
an attacker can deliver a different message with the same
Message-ID
to each of the recipients.
As I wrote earlier:
Just because someone is listed as another recipient
doesn’t mean that they received the same message as you.
The abuse of conditional CSS rules as a signing oracle was discovered and published by
Jens Müller and his colleagues in 2019.
The problem with diverging multipart/alternative
parts was discussed thereafter
in this Thunderbird issue.
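To make the multipart/alternative problem more tangible, here is a minimal Python sketch which relies only on the standard library. It converts HTML to text the naive way, by collecting all text nodes and ignoring CSS, which is an assumption about how a simple converter might work and not a description of any particular mail client. The helper names `html_to_text`, `build_message`, and `alternatives_diverge` are made up for this illustration.

```python
from email.message import EmailMessage
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects all text nodes and ignores CSS, like a naive HTML-to-text converter."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def text(self):
        # Collapse runs of whitespace roughly the way a renderer would.
        return " ".join("".join(self.chunks).split())


def html_to_text(html):
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.text()


def build_message(plain, html):
    """Builds a multipart/alternative message with a plaintext and an HTML part."""
    message = EmailMessage()
    message.set_content(plain)                     # text/plain part
    message.add_alternative(html, subtype="html")  # text/html part
    return message


def alternatives_diverge(message):
    """Heuristic: compares the plaintext part with the CSS-agnostic text of the HTML part."""
    plain_part = message.get_body(preferencelist=("plain",))
    html_part = message.get_body(preferencelist=("html",))
    plain = " ".join(plain_part.get_content().split())
    return plain != html_to_text(html_part.get_content())


html = (
    '<html><body>USD 100<span style="font-size: 0;">0</span>'
    " starting next month</body></html>"
)
# The hidden zero ends up in the extracted text even though readers never see it.
print(html_to_text(html))
```

Note the limitation: because the comparison ignores CSS, it flags a plaintext part which says USD 100 but accepts one which says USD 1000, even though the rendered HTML shows USD 100. This is exactly why comparing alternative parts is more difficult than it seems.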
Hide content with CSS
The many ways to hide text and images with CSS. If you can think of another way, let me know.
Complexity
Since you made it to this paragraph, I probably don’t need to convince you that email is incredibly complex. Email is a system that has been retrofitted to modern requirements for 40 years. It’s no wonder then that what we have today is a complicated patchwork of extensions. Just to be clear: I don’t want to criticize anyone in this section. Most of the design decisions that led us to the current situation were reasonable at the time. I still think it’s a good idea to assess what brought us here as this allows us to appreciate what we have now. In my view, the following limitations of early email are responsible for most of today’s complexity:
- Text-based protocols: Using characters to delimit the various parts of protocols and messages makes it easy for us to interact with servers manually, but it also prevents us from sending arbitrary content without escaping it first. SMTP and POP3 require periods to be escaped, IMAP, Sieve, and ManageSieve require user-provided arguments to be escaped, and multipart messages require unique boundaries. None of this is tragic but conversions are always a potential source of errors and incompatibilities.
- Line-length limit: Since each line of a message may consist of at most 1’000 characters, folding whitespace is required for header fields, long text lines have to be broken without conflicting with quoting conventions, and system-specific newline characters must be converted to {CR}{LF}.
- ASCII-only characters: Email is older than ISO 8859 and Unicode. To remain backward compatible, non-ASCII characters have to be encoded in the message body, in header fields, in domain names, in parameter values, and in URLs. To make things even more complicated, all these encodings are different. When the involved servers support SMTPUTF8, UTF-8 can be used in the local part of an email address, but internationalized messages have to be downgraded for clients which don’t support UTF-8.
- No submission protocol: In the early years of email, mail clients could submit outgoing messages to any mail server without authentication. As a consequence, mail submission and mail retrieval were handled completely differently. In order to make the change for existing mail clients as small as possible, the mail submission protocol was forked from ESMTP rather than being incorporated into access protocols, such as POP3 and IMAP. Unless their mail service provider is in a configuration database, users have to configure both their incoming mail server and their outgoing mail server to this day. If they change their password, they usually have to enter it twice in their settings. Furthermore, it can happen that they can receive messages but cannot send them and vice versa. For ordinary users, this is really confusing. This distinction between incoming mail server and outgoing mail server is also the reason why messages have to be submitted twice if you want to record the sent messages in your mailbox. After two decades of little progress with regard to access protocols, JMAP finally addresses this and many other issues.
- No transport security: Emails and account passwords couldn’t be secured in transit for more than a decade. Once Transport Layer Security (TLS) became popular, existing protocols were retrofitted so that all communication could be encrypted and authenticated. All the protocols were extended to support Explicit TLS, but all of them require different commands to activate TLS, which makes it difficult to use some of them from the command line. The introduction of protocol variants which use Implicit TLS required additional port numbers, which confuses ordinary users even more. Since mail servers don’t know whether other mail servers support TLS, the communication between them is still vulnerable to downgrade attacks. I’ll cover in the last chapter of this article how such attacks can be prevented.
- No sender authentication: Since emails aren’t authenticated, it’s quite easy to spoof the sender of a message. This aggravates problems such as spam and phishing, and it can lead to undesirable backscatter. I’ll explain the mechanisms which are used to alleviate this issue in the last chapter.
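To illustrate the point about ASCII-only characters, the following Python sketch encodes the same non-ASCII text for the five places mentioned in the list above, using only the standard library. The sample text, the domain `müller.example`, and the file name are arbitrary choices for this illustration.

```python
import quopri
import urllib.parse
from email.header import Header, decode_header, make_header
from email.utils import encode_rfc2231

text = "Grüße"

# Message body: quoted-printable content transfer encoding.
body = quopri.encodestring(text.encode("utf-8"))  # b'Gr=C3=BC=C3=9Fe'

# Header field: encoded word of the form =?charset?encoding?text?=.
header = Header(text, "utf-8").encode()

# Domain name: Punycode with the "xn--" prefix.
domain = "müller.example".encode("idna")  # b'xn--mller-kva.example'

# Parameter value, such as the file name of an attachment.
parameter = encode_rfc2231(text + ".txt", charset="utf-8")  # "utf-8''Gr%C3%BC%C3%9Fe.txt"

# URL: percent-encoding.
url = urllib.parse.quote(text)  # 'Gr%C3%BC%C3%9Fe'

for label, value in [("body", body), ("header", header), ("domain", domain),
                     ("parameter", parameter), ("url", url)]:
    print(f"{label:>9}: {value}")
```

Five contexts lead to five different representations of the same characters, and a mail client has to implement all of them in both directions.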
Benign inconsistencies
Unreasonable decisions
Innovation
Besides JMAP, dynamic content, and what we’ll discuss in the last chapter, there was barely any innovation over the last two decades. This is a pity given that email is the only decentralized communication service with global adoption. I can only speculate about the reasons for the lack of innovation:
- Complexity: The enormous complexity of email can deter software engineers from entering the field. Patching a heavily patched system further is also not appealing to many young talents. I hope this article can motivate more people to shape the future of email in a positive way.
- Fragmentation: The email ecosystem is so fragmented that no single organization can push the industry forward. The innovation that we see, such as email markup and dynamic content, often remains limited to just a few companies. If you want to write a mail client for a general audience, you have to support IMAP. If you have to deal with the intricacies of IMAP anyway, you don’t gain anything by implementing a newer access protocol such as JMAP as well. As long as all mail clients which people want to use support IMAP, existing email service providers have little incentive to support JMAP.
- Saturation: The email market is saturated with free solutions for clients, servers, and hosting. The low willingness to pay for a product or service makes it really hard to build an innovative business in this space. Combined with the inertia of users, there is almost no economic pressure to innovate. Email service providers with a strong focus on privacy are the only exception to this rule because more and more people realize that if they don’t pay for a service, they’re the product and not the customer.
Format innovation
Since Skype failed to innovate, it was superseded by Zoom. The same is happening to WhatsApp: Telegram is showing us how much room for innovation there is for a messaging app. There are plenty of features I would like to see in email. For a start, we still have no No-Reply header field, no Proof-Of-Work header field, no header field to reference the previous message by its hash (ideally using a Merkle tree for MIME parts so that attachments can be removed from a message without invalidating its hash), no header fields for the sender’s contact details to replace email signatures, no content type to initiate and reply to surveys, etc. Some features, such as message compression, exist in theory but not in practice. Other features, which originated in the alternative email system X.400, were formally specified as IETF email header fields in order to increase compatibility between the two systems but were never recommended for general use. Among these header fields are Supersedes to replace a sent message with a revised version, Expires to indicate when a message loses its validity, and Reply-By to request a response in the specified time period.
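The idea of referencing a message by a hash which survives the removal of attachments can be sketched in a few lines of Python. This is purely hypothetical: no such header field exists, the two-level hash tree below is the simplest possible variant of a Merkle tree, and a real design would first have to define how MIME parts are canonicalized before hashing.

```python
import hashlib


def leaf_hash(part: bytes) -> bytes:
    """Hashes a single MIME part (the prefix separates leaves from the root)."""
    return hashlib.sha256(b"\x00" + part).digest()


def root_hash(leaf_hashes) -> str:
    """Hashes the ordered list of leaf hashes into the value to be referenced."""
    return hashlib.sha256(b"\x01" + b"".join(leaf_hashes)).hexdigest()


# Placeholder contents for a message with a text part and an attachment.
parts = [b"Hello Bob, see the attached report.", b"<bytes of a large attachment>"]
leaves = [leaf_hash(part) for part in parts]
reference = root_hash(leaves)

# The recipient deletes the attachment but keeps its 32-byte hash in its place.
stripped = [leaf_hash(parts[0]), leaves[1]]
print(root_hash(stripped) == reference)
```

Since the root is computed from the hashes of the parts rather than from the parts themselves, the reference remains verifiable after the attachment is gone, while any change to the remaining text part changes the root.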
Client innovation
Given the decentralized nature of email, protocol and format innovations are difficult to achieve.
However, nothing hinders mail clients from innovating at the edge of the network.
I’ve mentioned plenty of ideas throughout this article.
Among them are sender approval, automatic challenges,
Bcc
recovery, privacy features such as
proxying remote content via Tor
(and even submitting emails via Tor as long as email service providers leak the IP addresses of their users),
and security features such as preventing malicious display names
and different appearances of messages.
It would be great if my mail client displayed whether a received message was successfully
authenticated with SPF and DKIM
(just like Gmail).
I would like to see native support for DNS-based autoconfiguration,
Sieve and ManageSieve,
as well as PGP.
I don’t understand why mail clients separate the outbox from the inbox.
(I don’t know of any other messaging app which does this,
and just because IMAP uses folders doesn’t mean you have to display them.)
I think it would be great if my mail client could
timestamp all the emails that I send.
Whenever I submit a responsible disclosure, I do this manually.
Fixes
The last chapter of this article is dedicated to recent standards which address some of the aforementioned security issues. We’ll study how spoofing is prevented with domain authentication and how confidentiality and integrity are ensured in the presence of an active attacker with strict transport security. Many of the approaches rely on the Domain Name System (DNS) to provide additional information. This is secure only if the records are authenticated with DNSSEC. I will no longer mention this aspect in the remaining subsections. Some of the steps have to be performed by the owner of the domain rather than the email service provider. If you use a custom domain for your emails, you should definitely read the part about domain authentication to make sure that your domain is configured properly. Since email is a decentralized service, we can improve its security only in a collective effort.
Domain authentication
Historically, the sender of an email was not authenticated:
Anyone could relay a message to anyone
using any From
address they wanted.
Impersonating another sender is known as spoofing.
While the prevention of spoofing won’t eliminate spam and phishing on its own
because spammers can implement the following standards as well and phishing remains possible
with similar domains and malicious display names,
it’s an important prerequisite for other techniques, such as flagging unknown senders.
As we’ve seen earlier, email spoofing is addressed in two steps:
The incoming mail server of the recipient verifies
that the other party is authorized to send emails on behalf of the sender’s domain
and the outgoing mail server of this domain ensures that the local part of the From
address
belongs to the user who submitted the message.
[Figure with the labels: mail client of sender, outgoing mail server of sender, incoming mail server of sender, incoming mail server of recipient, outgoing mail server of recipient, mail client of recipient, user authentication (three times), and domain authentication.]
The incoming mail server of the recipient authenticates the outgoing mail server of the sender and the outgoing mail server of the sender authenticates the user who submits the message.
As the title suggests, this subsection is only about the first part of the problem, namely how a domain owner can specify which mail servers are authorized to send messages on behalf of the domain and how receiving mail servers can verify whether the sending mail server is indeed authorized for the claimed domain. The second part is usually solved with password-based authentication mechanisms. The following techniques don’t prevent spoofing if the outgoing mail server of the sender is compromised or if the attacker can create an account at the same email service provider and impersonate another user during submission.
Before you continue, make sure that you understand the difference between a message and its envelope.
There are three complementary standards for domain authentication:
- Sender Policy Framework (SPF): List the IP addresses of your outgoing mail servers in a DNS record at your domain. SPF protects only the MAIL FROM address, which is used for bounce messages, and can cause problems with email forwarding.
- DomainKeys Identified Mail (DKIM): Let the outgoing mail servers sign outgoing messages and publish the public key in a DNS record at the sender’s domain. The signature usually survives email forwarding but introduces non-repudiation.
- Domain-based Message Authentication, Reporting and Conformance (DMARC): Publish a policy, which tells recipients what to do with emails that fail the SPF or DKIM check, in a DNS record at your domain. Without a DMARC record, the recipient cannot know whether the sender uses DKIM. By publishing a DMARC policy, you also require the domain in the From address to match the domain in the MAIL FROM address and the DKIM header field. Moreover, you can specify an email address to which receiving mail servers can send aggregate reports so that you know how your DMARC policy affects the delivery of your own messages.
Adoption benefits
Domain owner
How to perform WHOIS queries yourself.
Once the TCP connection is established, you just enter the domain name of interest followed by a new line.
(Telnet should convert “return”
into {CR}{LF}
automatically.)
As soon as the server has sent the answer, it closes the connection.
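The same exchange can be scripted instead of typed into Telnet. The following Python sketch opens a TCP connection to port 43, sends the domain name followed by {CR}{LF}, and reads until the server closes the connection. The default server `whois.iana.org` is just one possible choice for this illustration, and the function names are made up.

```python
import socket


def whois_query(domain: str) -> bytes:
    """The request consists of just the domain name followed by {CR}{LF}."""
    return domain.encode("ascii") + b"\r\n"


def whois(domain: str, server: str = "whois.iana.org", port: int = 43,
          timeout: float = 10.0) -> str:
    """Sends the query and reads the answer until the server closes the connection."""
    with socket.create_connection((server, port), timeout=timeout) as connection:
        connection.sendall(whois_query(domain))
        chunks = []
        while True:
            chunk = connection.recv(4096)
            if not chunk:  # An empty read means that the server closed the connection.
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Usage (requires network access):
# print(whois("example.com"))
```

The answer of such a server often refers you to the WHOIS server of the responsible registry, which you can then query in the same way by passing its name as the `server` argument.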
Privacy implications
Name chaining attacks
Sender Policy Framework (SPF)
The Sender Policy Framework (SPF)
is specified in RFC 7208.
As a domain owner, you list the IP addresses of your outgoing mail servers in a TXT
record at your domain.
Incoming mail servers then check whether the IP address of the sending mail server is included
in the SPF record of the domain which was used in the MAIL
FROM
command.
If this is not the case, incoming mail servers can reject the message
during the SMTP session.
RFC 7372 defines
enhanced status codes
with which the server can indicate failed SPF validation.
Since IP addresses cannot be spoofed
without access to the target’s local network,
this procedure authenticates the sender’s domain.
So how do you create the SPF record for your domain? If you don’t run your email server yourself, your SPF record will consist of:
- Version: Every SPF record has to start with v=spf1.
- Includes: An SPF record can include the IP addresses of another SPF record. Search for the appropriate record from your email service provider. For example, put include:_spf.google.com (source) into your SPF record if you use Google Workspace. Since mailing list providers such as Mailchimp use an address of their own in the MAIL FROM command so that they can handle bounce messages for you, you don’t need to add the addresses of their servers to your SPF record.
- Default: Provide an explicit default result for any sender which didn’t match one of the previous mechanisms. If you want incoming mail servers to reject messages with a spoofed MAIL FROM domain, use -all. If you want incoming mail servers to just flag such messages as potentially fraudulent, use ~all. In order not to disrupt email forwarding, incoming mail servers are unlikely to enforce your SPF policy. They are much more likely to enforce the domain policy of your DMARC record.
An SPF record created according to the above steps looks as follows: v=spf1 include:_spf.google.com -all. On domains from which you don’t send any emails, you should use v=spf1 -all.
The full syntax of SPF records is much more powerful than this but rarely needed.
I will cover SPF in more detail in the boxes below.
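To show how an incoming mail server might evaluate such a record, here is a deliberately small Python sketch. It supports only the ip4, ip6, include, and all mechanisms, it replaces DNS with a dictionary passed as `dns`, and it ignores modifiers, macros, and lookup limits. The domain `_spf.provider.example` and its record are invented for this illustration and use IP addresses reserved for documentation.

```python
import ipaddress

# The four qualifiers and the results to which they lead.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record):
    """Splits an SPF record into (qualifier, mechanism, value) triples."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("Not an SPF record.")
    parsed = []
    for term in terms[1:]:
        qualifier = "+"  # A missing qualifier defaults to "+".
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        parsed.append((qualifier, name.lower(), value))
    return parsed


def evaluate(record, client_ip, dns=None):
    """Returns the result of the first mechanism which matches the client's IP address."""
    dns = dns or {}  # Stand-in for TXT lookups: maps domain names to SPF records.
    ip = ipaddress.ip_address(client_ip)
    for qualifier, name, value in parse_spf(record):
        if name == "all":
            return QUALIFIERS[qualifier]
        elif name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        elif name == "include":
            # An include matches only if the included record evaluates to "pass".
            if evaluate(dns[value], client_ip, dns) == "pass":
                return QUALIFIERS[qualifier]
        else:
            raise NotImplementedError(f"Unsupported term: {name}")
    return "neutral"  # No mechanism matched.


dns = {"_spf.provider.example": "v=spf1 ip4:192.0.2.0/24 ~all"}
record = "v=spf1 include:_spf.provider.example -all"
print(evaluate(record, "192.0.2.25", dns))    # An address of the provider.
print(evaluate(record, "198.51.100.7", dns))  # Some other address.
```

The second call shows why the default matters: the softfail of the included record doesn’t count as a match, so the evaluation continues until it reaches the -all of the outer record.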
There are a lot of things
that can go wrong when configuring an SPF record.
For a start, a domain may have at most one SPF record
and the number of additional DNS lookups an SPF record may trigger
is limited.
Instead of listing all the pitfalls here, I’ve built a tool
which performs 30 different checks on your SPF record.
It uses Google’s DNS API to query the records.
Please note that this tool warns you only about common mistakes,
it doesn’t verify whether your outgoing mail servers are included in the record.
You still have to test your setup by sending emails
and checking the Received-SPF
header field.
By not evaluating whether an IP address passes SPF validation,
the tool is also limited in other regards.
Received-SPF header field
The format of the Received-SPF header field as specified in RFC 7208. The curly brackets need to be replaced with actual values. The content in square brackets is optional and the asterisk indicates that the preceding content can be repeated.
An example Received-SPF header field. The values are intended to make the result verifiable.
Protecting subdomains
Email forwarding
SPF qualifiers
The curly brackets need to be replaced with a qualifier, a mechanism or a modifier. The content in square brackets is optional. The asterisk indicates that the preceding content can be repeated.
The four qualifiers and the evaluation results to which they lead.
SPF mechanisms
SPF modifiers
SPF macros
HELO identity
TXT size limits
How dig displays the OPT pseudo resource record. dig increases the UDP payload limit to 4096. As you can also see, dig sets the do (DNSSEC OK) flag of EDNS there to ask for DNSSEC records.
SPF record type
DomainKeys Identified Mail (DKIM)
DomainKeys Identified Mail (DKIM)
is specified in RFC 6376.
(DomainKeys was a predecessor
designed by Yahoo
and the name survived the IETF standardization process.)
DKIM allows a domain owner to take responsibility for a message
by signing its body and selected header fields.
Any mail server through which a message passes can add a DKIM signature.
Unlike S/MIME and PGP, which alter the body of a message,
DKIM signatures are added in a new header field,
which makes them unobtrusive for users whose mail clients don’t support DKIM.
Also unlike S/MIME and PGP, DKIM uses the Domain Name System
as its public-key infrastructure,
which is secure only in combination with DNSSEC.
The owner of a domain can publish several public keys
so that different servers can use different private keys for signing.
The ability to publish several keys is also useful for introducing a new key before revoking an old one.
Each public key is identified by a unique Selector within its Domain and is published in a TXT record at {Selector}._domainkey.{Domain}.
The selector can contain periods, which allows large organizations to split the namespace
of their DKIM keys into several administrative zones.
Both the Domain
and the Selector
are included in the
DKIM-Signature
header field
so that verifiers know how to retrieve the appropriate public key.
Since the public key used to verify a signature is retrieved from the stated domain,
a valid DKIM signature authenticates this domain.
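How a verifier derives the name of the DNS record from the header field can be sketched as follows in Python. The header value below is a made-up example with placeholder values for the hashes, and the parser is simplified: it removes all whitespace from tag values, which is fine for the d and s tags used here but not necessarily for every tag.

```python
def parse_tag_list(value):
    """Parses "name=value" pairs separated by semicolons into a dictionary."""
    tags = {}
    for item in value.split(";"):
        if not item.strip():
            continue  # Ignore the empty item after a trailing semicolon.
        name, _, tag_value = item.partition("=")
        # Simplification: drop the whitespace introduced by folding long header fields.
        tags[name.strip()] = "".join(tag_value.split())
    return tags


def dkim_record_name(tags):
    """Builds {Selector}._domainkey.{Domain} from the s and d tags."""
    return f"{tags['s']}._domainkey.{tags['d']}"


# Hypothetical header value; the bh and b tags contain placeholders instead of real data.
signature = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.com; s=2021.mail;\r\n"
    "  h=from:to:subject:date; bh=PLACEHOLDER; b=PLACE\r\n  HOLDER;"
)
tags = parse_tag_list(signature)
print(dkim_record_name(tags))  # 2021.mail._domainkey.example.com
```

The selector in this example contains a period, which results in an additional label in the name of the record.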
The standard doesn’t specify
which entities add and verify DKIM signatures.
Since DKIM keys are valid for the whole domain and cannot be restricted to individual users,
messages are usually signed by the outgoing mail server after authenticating the user.
If you are the only user of your domain and your email service provider doesn’t support DKIM,
you could add a DKIM signature to a message before submitting it to the outgoing mail server
but I’m not aware of any mail client which supports this.
Since DKIM keys can be revoked at any time after a message has been delivered,
DKIM signatures are typically verified by incoming mail servers, which record the result in the
Authentication-Results
header field for later use.
While the mail client of the recipient could verify DKIM signatures as well,
it would have to record the result before the signature expires.
Since emails are usually synchronized to new mail clients via IMAP,
the DKIM-verifying mail client would have to replace the messages in the user’s remote mailbox for archiving.
As noted earlier, Gmail displays the domain which signed a message.
Similar functionality can be added to Thunderbird through
an add-on.
Another reason for verifying DKIM signatures on incoming mail servers is
to filter spam and phishing emails before they reach the user’s mailbox.
DKIM doesn’t specify how to handle messages which don’t have a valid signature from a suitable domain.
It simply allows the recipient to use the reputation of a domain to assess a given message.
How a domain owner can ask recipients to reject emails which don’t pass domain authentication
is the topic of the next subsection.
If your email service provider supports DKIM for custom domains,
it likely has an article on how to generate a DKIM key for your domain
and how to publish the public key in your DNS zone.
For example, this guide
shows you how to set up DKIM for Google Workspace.
If you have to generate the signing key yourself,
you find the instructions to do so below.
The following two tools help you generate and validate DKIM records.
Instead of explaining the various configuration options here,
you can simply hover your mouse over them to read a short description
of their purpose in a tooltip.
For a longer explanation, you can consult RFC 6376.
The last three options are rarely used and just there for the sake of completeness.
When RSA is used as the signing algorithm,
the DKIM record can become quite long,
which requires that the user interface of your domain name registrar
splits the TXT
data into strings of at most 255 bytes for you.
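If your registrar doesn’t split the data for you, the splitting is easy to do yourself. The following Python sketch cuts a record into strings of at most 255 bytes and formats them the way they are commonly written in a zone file. The function names are made up and the key is a dummy value of 400 characters rather than a real public key.

```python
def split_txt_data(value: str, limit: int = 255):
    """Splits the data of a TXT record into strings of at most `limit` bytes."""
    data = value.encode("ascii")
    return [data[i:i + limit].decode("ascii") for i in range(0, len(data), limit)]


def zone_file_rdata(value: str) -> str:
    """Formats the strings as space-separated quoted strings.

    Assumes that the value contains neither quotes nor backslashes,
    which would have to be escaped.
    """
    return " ".join(f'"{chunk}"' for chunk in split_txt_data(value))


record = "v=DKIM1; k=rsa; p=" + "A" * 400  # Dummy key for illustration only.
chunks = split_txt_data(record)
print([len(chunk) for chunk in chunks])  # [255, 163]
```

Verifiers concatenate the strings without any separator before parsing the record, which is why the split can happen at an arbitrary position.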
Unlike SPF and DMARC,
you don’t have to configure a DKIM record for unused domains.
You should configure a TXT record with a value of v=DKIM1; p= at *._domainkey in your DNS zone only if you have a wildcard CNAME record and don’t trust the target domain not to spoof emails.
The second tool uses Google’s DNS API
to query the DKIM records.
If it finds a record, it loads its content into the first tool
to make it easier to analyze and modify existing DKIM records.
Non-repudiation
DKIM-Signature header field
An example DKIM-Signature header field from a message sent with Gmail.
The various tags of the DKIM-Signature header field. The last four tags are less common and the IANA registry lists even more tags.
Body and header canonicalization
Allow others to extend the body
Signing messages of subdomains
Replay attacks for spamming
Generating the signing key
Authorized Third-Party Signatures (ATPS)
Author Domain Signing Practices (ADSP)
Domain-based Message Authentication, Reporting and Conformance (DMARC)
Increasing the security of a decentralized system is always difficult because enforcing new requirements prematurely disrupts the reliability of the system. In order to minimize the disruption for users, all changes to the system have to be backward compatible. As we will see in the next section, some improvements involve only two parties. Email authentication, on the other hand, involves many parties and is, therefore, quite difficult to deploy. To authenticate emails, the outgoing mail server of the sender has to implement SPF and/or DKIM, email forwarders may not break the authentication, and the incoming mail server of the recipient has to verify the authentication. It only makes sense to authenticate emails if unauthentic emails are somehow penalized. In the short term, this could mean that unauthentic emails are quarantined as potential spam. In the long term, the goal should be to reject or discard all unauthentic emails even if they are delivered by a reputable mail server. While it is always up to the mail system of the recipient to decide what to do with incoming messages, it can enforce domain authentication only if just a small percentage of legitimate mail is affected by it. Domain-based Message Authentication, Reporting and Conformance (DMARC), which is specified in RFC 7489, allows domain owners to deploy domain authentication gradually, to monitor its effect on the delivery of their emails, to detect overlooked sources of legitimate mail, and to ask for strict enforcement when the amount of disruption seems acceptable.
There are three aspects to understanding DMARC:
- Authentication: A message is considered to be authentic if the domain of the From address matches the SPF-authenticated domain of the MAIL FROM address or the domain of a valid DKIM signature. The owner of the sending domain can require the matching to be strict, in which case the domains have to be identical, or relaxed, in which case only the organizational domains after removing any subdomains have to be the same. This is known as identifier alignment and the alignment can be configured separately for SPF and DKIM. If the From field consists of several addresses, which is valid according to RFC 5322, the recipient can either reject the message or authenticate all domains and apply the most strict policy among the unauthentic domains.
- Reports: Domain owners can ask receiving mail servers to send them aggregate reports in regular intervals and failure reports for messages which failed authentication. These DMARC reports allow domain owners to monitor their deployment of domain authentication, to detect unauthorized sources of legitimate messages, such as webshops and continuous integration systems, and to be informed immediately when their domain is abused for phishing.
- Policies: Domain owners can specify how receiving mail servers shall handle unauthentic messages.
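The alignment rule from the first bullet can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and `organizational_domain` simply keeps the last two labels, which is an assumption that fails for suffixes such as `co.uk` (real implementations consult the Public Suffix List).

```python
from typing import Optional

def organizational_domain(domain: str) -> str:
    # Assumption: the organizational domain is the last two labels.
    # Real implementations use the Public Suffix List instead.
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])

def is_aligned(from_domain: str, authenticated_domain: str, mode: str = "relaxed") -> bool:
    a = from_domain.lower().rstrip(".")
    b = authenticated_domain.lower().rstrip(".")
    if mode == "strict":
        return a == b  # The domains have to be identical.
    return organizational_domain(a) == organizational_domain(b)

def dmarc_authentic(from_domain: str,
                    spf_domain: Optional[str] = None,
                    dkim_domain: Optional[str] = None,
                    aspf: str = "relaxed",
                    adkim: str = "relaxed") -> bool:
    # spf_domain and dkim_domain are the domains which passed SPF and DKIM
    # (None if the respective check failed or was not performed).
    spf_ok = spf_domain is not None and is_aligned(from_domain, spf_domain, aspf)
    dkim_ok = dkim_domain is not None and is_aligned(from_domain, dkim_domain, adkim)
    return spf_ok or dkim_ok
```

One aligned mechanism is enough: a message whose MAIL FROM domain belongs to a mailing service still counts as authentic if it carries a valid DKIM signature of the From domain.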
If your domain doesn’t yet have a DMARC record (see below), you should start with aggregate reports and a domain policy of none. This allows you to be informed about authentication failures without affecting how unauthentic messages are handled. Once you’re confident that you’ve authorized all legitimate sources of email with SPF, you should set the domain policy to quarantine, which requests receiving mail servers to treat unauthentic emails as suspicious. Since it’s not under your control whether your recipients use alias addresses, you should move to the reject policy only once you’ve also deployed DKIM. Otherwise, your messages might not even reach the spam folder of your recipients if they use email forwarding.
Domain owners publish their preferences in a TXT record at _dmarc.{Domain}.
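As a rough sketch of how such a record can be located and read, the following Python splits a DMARC record into its tags. The function names and the reporting address in the usage note are hypothetical, and the parser checks only the basic `tag=value` syntax, not the allowed values of each tag.

```python
def dmarc_record_name(domain: str) -> str:
    # The DMARC record lives in a TXT record at this name.
    return f"_dmarc.{domain}"

def parse_dmarc_record(txt: str) -> dict:
    tags = {}
    for part in txt.split(";"):
        part = part.strip()
        if not part:
            continue  # Tolerate a trailing semicolon.
        name, separator, value = part.partition("=")
        if not separator:
            raise ValueError(f"Malformed tag: {part!r}")
        tags[name.strip()] = value.strip()
    # The version tag has to come first and must have the value DMARC1.
    if list(tags)[:1] != ["v"] or tags["v"] != "DMARC1":
        raise ValueError("Not a DMARC record")
    return tags
```

For example, `parse_dmarc_record("v=DMARC1; p=none; rua=mailto:reports@example.com")` yields a dictionary with the keys `v`, `p`, and `rua`.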
The following two tools help you generate and validate the DMARC record for your domain.
Given the remarks above, most parameters should be self-explanatory.
If this is not the case, you can hover your mouse over them to read a short description and
all options are also documented in RFC 7489, of course.
Domains which aren’t used to send emails from should have a DMARC record of v=DMARC1; p=reject.
If you want to be informed about spoofing attempts, you can also include a reporting address.
The second tool uses Google’s DNS API
to query the DMARC record of the given domain.
If it finds a record, it loads its content into the first tool
to make it easier to analyze and modify existing records.
Even if you use the first tool to generate your DMARC record,
you should check your record with the second tool as there are still many mistakes that you can make.
For example, if reports shall be sent to a different domain,
the report receiver has to approve this.
Another example is that the subdomain policy has an effect only
in DMARC records of organizational domains.
The second tool performs more than twenty such checks and warns you about potential configuration errors.
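To illustrate the first of these checks: according to RFC 7489, a report receiver in a different domain approves reports by publishing a TXT record starting with `v=DMARC1` at `{policy domain}._report._dmarc.{receiver domain}`. The sketch below only derives that name. The function names are made up, and comparing the last two labels is a simplifying assumption in place of a proper organizational-domain lookup.

```python
from typing import Optional
from urllib.parse import urlparse

def naive_organizational_domain(domain: str) -> str:
    # Assumption: last two labels; real code uses the Public Suffix List.
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])

def approval_record_name(policy_domain: str, report_uri: str) -> Optional[str]:
    """Returns the name of the approval record that has to exist,
    or None if the reports stay within the same organizational domain."""
    parsed = urlparse(report_uri)
    if parsed.scheme != "mailto":
        raise ValueError("Only mailto URIs are handled in this sketch")
    address = parsed.path.split("!")[0]  # Drop an optional size limit like "!10m".
    receiver_domain = address.rpartition("@")[2].lower()
    if naive_organizational_domain(receiver_domain) == naive_organizational_domain(policy_domain):
        return None
    return f"{policy_domain}._report._dmarc.{receiver_domain}"
```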
v=DMARC1; p=reject
Organizational domain
Subdomain policy
Unix time
Aggregate reports
An aggregate report, which was triggered by a Mailchimp signup and sent by Google. As you can see, the message passed both DKIM and SPF authentication. However, the MAIL FROM address didn’t align with the From address, which caused SPF to fail in the DMARC evaluation. As written earlier, this failure is intentional since Mailchimp wants to handle bounce messages for you, which prevents SPF from aligning.
Failure reports
The structure of failure reports. [A|B] means either A or B and […] stands for more header fields. multipart/report and text/rfc822-headers are specified in RFC 6522, message/rfc822 is specified in RFC 2046, and message/feedback-report is specified in RFC 5965. IANA maintains a list of feedback report header fields. The DMARC-specific Identity-Alignment header field is defined in RFC 7489.
Report approval
Authentication-Results header field
The format of the Authentication-Results header field as specified in RFC 8601. The curly brackets need to be replaced with actual values. The content in square brackets is optional and the asterisk indicates that the preceding content can be repeated.
The Authentication-Results header field added by Google when I send an email to Gmail. […] is the same comment as in the Received-SPF header field. Gmail doesn’t quite adhere to the standard. To begin with, it adds the Authentication-Results below the Received header field. Since SPF doesn’t authenticate the local part of the MAIL FROM address, it should not be included in the smtp.mailfrom property. Furthermore, I have no idea why Gmail includes [email protected] rather than header.d=ef1p.com in the DKIM result. To be fair, the RFC has one example using the d tag and one example using the i tag.
Authenticated Received Chain (ARC)
The ARC header fields that Gmail added to the message from which I took the Authentication-Results header field in the previous box.
DNS queries from your command line
Brand Indicators for Message Identification (BIMI)
Brand Indicators for Message Identification (BIMI) is an emerging standard, which allows mail clients to display the logo of the sending company for emails which passed DMARC authentication. It is specified in various drafts. Unlike SPF, DKIM, and DMARC, BIMI is not a domain authentication mechanism. The idea is that companies can refer to an SVG image, which needs to be certified by a certification authority, in a DNS record at their domain. By ensuring that trademarks cannot be abused by scammers, BIMI has the potential to eliminate homograph attacks and phishing. Another goal of BIMI is to increase DMARC adoption among companies which value marketing more than security.
The tool below uses Google’s DNS API to query the BIMI record of the given domain. BIMI records are identified with a selector just like DKIM records so that companies can use different logos for different purposes. The default selector is default. Google, Yahoo, and Fastmail are running BIMI pilots. Companies which already have a BIMI record include cnn.com, linkedin.com, and ebay.com.
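A lookup like the one performed by the tool can be sketched as follows. The function names are invented for this example. The code assumes that BIMI records live at `{selector}._bimi.{domain}` as described in the BIMI draft and that Google’s DNS API is reachable at `https://dns.google/resolve`. It is an approximation, not the code of the tool itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def bimi_record_name(domain: str, selector: str = "default") -> str:
    return f"{selector}._bimi.{domain}"

def google_dns_url(name: str, record_type: str = "TXT") -> str:
    return "https://dns.google/resolve?" + urlencode({"name": name, "type": record_type})

def query_txt(name: str) -> list:
    # Requires network access; returns the data of all TXT answers (type 16).
    with urlopen(google_dns_url(name), timeout=10) as response:
        reply = json.load(response)
    return [answer["data"] for answer in reply.get("Answer", []) if answer.get("type") == 16]
```

Calling `query_txt(bimi_record_name("cnn.com"))` would then return the published BIMI record, if there still is one.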
BIMI DNS record
An imaginary BIMI record with reasonably short addresses. The files can be hosted on different domains.
BIMI header fields
How the BIMI results should be recorded. [a|b] means a or b. The values are described in the BIMI draft.
Verified mark certificate (VMC)
SVG Tiny Portable/Secure profile
Transport security
As discussed earlier, ESMTP uses the STARTTLS extension to upgrade an insecure TCP connection to a secure TLS connection:
Client → Server: (Open TCP connection)
Server → Client: 220 server.example.com
Client → Server: EHLO client.example.org
Server → Client: 250-server.example.com
Server → Client: 250-PIPELINING
Server → Client: 250 STARTTLS
Client → Server: STARTTLS
Server → Client: 220 Go ahead
Client → Server: (Start TLS negotiation)
Client ↔ Server: (Negotiate a TLS session)
Client ↔ Server: (Continue with TLS)
The sequence diagram of STARTTLS.
In order to remain backward compatible, the client can use the STARTTLS extension only if the server supports it. Since the server indicates support for STARTTLS over the insecure channel, an attacker who can intercept and alter the packets between the client and the server can simply strip the STARTTLS capability from the server’s response to the EHLO command:
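Client → Attacker: (Open TCP connection)
Attacker → Server: (Open TCP connection)
Server → Attacker: 220 server.example.com
Attacker → Client: 220 server.example.com
Client → Attacker: EHLO client.example.org
Attacker → Server: EHLO client.example.org
Server → Attacker: 250-server.example.com
Server → Attacker: 250-PIPELINING
Server → Attacker: 250 STARTTLS
Attacker → Client: 250-server.example.com
Attacker → Client: 250 PIPELINING
Client ↔ Attacker: (Continue without TLS)
Attacker ↔ Server: (Continue with or without TLS)
How a man in the middle can prevent the two parties from upgrading their connection to TLS.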
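What the attacker does to the EHLO response can be imitated with a few lines of Python. This is a toy model with made-up function names which operates on a list of reply lines rather than on real network packets:

```python
def strip_starttls(ehlo_response: list) -> list:
    # Drop the line which advertises the STARTTLS capability.
    kept = [line for line in ehlo_response if line[4:].strip().upper() != "STARTTLS"]
    # In a multiline reply, all lines but the last use "250-" and the last uses "250 ".
    return [
        line[:3] + ("-" if index < len(kept) - 1 else " ") + line[4:]
        for index, line in enumerate(kept)
    ]

def supports_starttls(ehlo_response: list) -> bool:
    # What an opportunistic client checks before issuing the STARTTLS command.
    return any(line[4:].strip().upper() == "STARTTLS" for line in ehlo_response)

genuine = ["250-server.example.com", "250-PIPELINING", "250 STARTTLS"]
tampered = strip_starttls(genuine)
```

Since the client cannot tell `tampered` apart from the response of a server which doesn’t support TLS, it continues in cleartext.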
This attack is known as STRIPTLS. The problem is not Explicit TLS but rather the opportunistic use of TLS for the sake of backward compatibility. If the client is willing to continue without the security provided by TLS, Implicit TLS suffers from the same problem:
Client → Attacker: (Open TLS connection)
Client → Attacker: (Open TCP connection)
Attacker → Server: (Open TCP or TLS connection)
Server → Attacker: 220 server.example.com
Attacker → Client: 220 server.example.com
Client → Attacker: EHLO client.example.org
Attacker → Server: EHLO client.example.org
The attacker can drop the client’s TLS connection until it gives up and connects to the server with TCP.
While the opportunistic use of TLS is also a problem for submission, access, and filtering protocols, mail clients always communicate with the same few servers and should not fall back to insecure communication after the initial configuration. Additionally, cleartext is considered obsolete for email submission and access. For these reasons, we’re interested only in securing ESMTP for Relay between the outgoing mail server of the sender and the incoming mail server of the recipient in this section. As discussed earlier, there are three ways to achieve secure transport in the presence of an active adversary without sacrificing backward compatibility:
- Previous connections: If previous connections were secure, abort when TLS is no longer available. MTA-STS works like this.
- Authenticated channel: The recipient can indicate support for TLS through an authenticated channel. DANE works like this.
- User configuration: Let the user require that their messages may be delivered only with TLS. REQUIRETLS makes this possible.
Server authentication
REQUIRETLS extension
How a client asks an ESMTP server to forward the message only with TLS to other servers which support REQUIRETLS as well.
How a client asks an ESMTP server to forward the message even if DANE or MTA-STS fails. At the moment, No is the only valid value.
DNS-Based Authentication of Named Entities (DANE)
DNS-Based Authentication of Named Entities (DANE) is specified in RFC 6698. RFC 7671 updates and clarifies some aspects of DANE and RFC 7672 specifies how DANE is applied to SMTP. DANE relies on DNSSEC for three different purposes:
- DNS authentication: Domain names are used to reference services, which are often provided by external service providers. Since changes are easier if the service providers can manage their address records themselves, indirections with MX, SRV, and CNAME records are quite common in the Domain Name System. The same is true for security-related DNS records, such as TLSA records, which are introduced by DANE. (Officially, TLSA is not an acronym but simply the name of the record type. Personally, I like to think of TLSA as Transport Layer Security Anchor.) Letting the service providers configure the necessary TLSA records at their domains has some advantages. However, the TLSA records can be trusted only if the DNS records are authenticated with DNSSEC both in the zone of the customer and in the zone of the service provider. If the reply to the MX, SRV, or CNAME query can be spoofed by an attacker, the attacker can pose as the legitimate service provider to unsuspecting clients.
- Downgrade resistance: By configuring TLSA records at the appropriate subdomain, a service provider indicates that its server supports TLS. Thanks to DNSSEC’s authenticated denial of existence, an attacker cannot suppress the retrieval of the TLSA records, which makes DANE resistant to downgrade attacks. Before you can deploy DANE, you have to deploy DNSSEC. If a client encounters an unsigned domain, it continues with opportunistic encryption. If a client learns from the superzone that the subzone is signed but cannot retrieve the signed TLSA records or a signed statement of their absence, it aborts the connection.
- Trust anchor: In order to prevent a man-in-the-middle attack, the client has to authenticate the server. Instead of relying on the traditional public-key infrastructure (PKI), DANE requires service providers to put the public key of their server or the public key of a trust anchor of their choosing into their TLSA records. DANE clients then verify whether the server’s public key is confirmed directly or indirectly by one of the server’s TLSA records. Relying on DNSSEC rather than on traditional certification authorities (CAs) has several advantages.
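The direct confirmation mentioned in the last bullet boils down to a hash comparison. The following sketch assumes that you already have the DER-encoded certificate or public key which the TLSA selector refers to. Extracting it from the TLS handshake and validating the DNSSEC signatures are left out, and the function names are invented for this example. The matching types 0, 1, and 2 stand for an exact match, SHA-256, and SHA-512 according to RFC 6698.

```python
import hashlib
import hmac

def tlsa_record_name(host: str, port: int = 25, protocol: str = "tcp") -> str:
    return f"_{port}._{protocol}.{host}"

def association_data(selected: bytes, matching_type: int) -> bytes:
    # `selected` is the DER encoding of the full certificate (selector 0)
    # or of its SubjectPublicKeyInfo (selector 1).
    if matching_type == 0:
        return selected
    if matching_type == 1:
        return hashlib.sha256(selected).digest()
    if matching_type == 2:
        return hashlib.sha512(selected).digest()
    raise ValueError(f"Unknown matching type: {matching_type}")

def matches_tlsa(selected: bytes, matching_type: int, record_data_hex: str) -> bool:
    expected = bytes.fromhex(record_data_hex)
    return hmac.compare_digest(association_data(selected, matching_type), expected)
```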
The tool below queries the MX records of the given domain and the TLSA records of each mail server. It uses Google’s DNS API for the DNS queries and performs only rudimentary checks on the format of the TLSA records. It doesn’t validate whether DNSSEC and DANE are deployed correctly. If you want to check this, you can use this validator. I cover how you can generate and verify TLSA records yourself below. You can deploy DANE only if your email service provider supports it. If your email service provider has configured TLSA records for their servers, all that you have to do is to enable DNSSEC on your custom domain.
PKI comparison
TLSA record type
One of the TLSA records at _25._tcp.mail.protonmail.ch. (I cover the location of the record in the next box.)
Which checks clients have to perform for each certificate usage according to RFC 7671.
Which combinations of DANE parameters you should and should not use. The verbs in the recommendation column are specified in RFC 2119. The last two rows are applicable only to SMTP for Relay on port 25.
TLSA record location
Multiple TLSA records
Name matching
Client behavior
How the client has to handle the various situations according to RFC 7672 and RFC 7673.
How to generate a TLSA record
How to verify a TLSA record
How to compute the certificate association data directly from the server certificate. 2>/dev/null suppresses the error output.
DANE on outgoing mail server
HTTP Public-Key Pinning (HPKP)
A Public-Key-Pins response header field with a validity period of 30 days and a report URI for validation failures.
Mail Transfer Agent Strict Transport Security (MTA-STS)
Mail Transfer Agent Strict Transport Security (MTA-STS) is specified in RFC 8461. MTA-STS is a PKIX-based alternative to DANE for those who cannot or don’t want to deploy DNSSEC on their domain. It lets receiving domains indicate their support for PKIX-authenticated TLS with the following two resources:
- DNS record: A TXT record is used to inform the sender that the receiving domain has an MTA-STS policy and whether the policy has been changed since the last time the sender retrieved it. Since small DNS records are retrieved with UDP, this is much faster than retrieving the policy file, which requires a TCP and a TLS handshake.
- Policy file: The sender fetches the MTA-STS policy with HTTPS from the receiving domain. The MTA-STS policy indicates what the sender shall do if it cannot authenticate the incoming mail server of the recipient with the presented PKIX certificate. Since MTA-STS doesn’t require that DNS records are authenticated with DNSSEC, the policy file is also used to authenticate the MX records of the receiving domain. This allows clients to match the presented certificate against the name of the mail server.
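How a sender might use the policy file to authenticate MX records can be sketched as follows. The parser is deliberately minimal, the function names are invented, and the example policy in the test is hypothetical. According to RFC 8461, a wildcard such as `*.example.com` matches exactly one additional label.

```python
def parse_mta_sts_policy(text: str) -> dict:
    # The policy file consists of "key: value" lines; "mx" can occur several times.
    policy = {"mx": []}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "mx":
            policy["mx"].append(value.lower())
        else:
            policy[key] = value
    return policy

def mx_is_authorized(mx_host: str, patterns: list) -> bool:
    host = mx_host.lower().rstrip(".")
    for pattern in patterns:
        if pattern.startswith("*."):
            # Remove exactly one label from the host and compare the rest.
            _, _, parent = host.partition(".")
            if parent == pattern[2:]:
                return True
        elif host == pattern:
            return True
    return False
```

If the mode of the policy is `enforce`, the sender must not deliver the message to a mail server whose name is not authorized by the policy or whose certificate cannot be validated.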
The tool below queries the MTA-STS record and the policy file of the given domain. It uses Google’s DNS API for the DNS query and the email tracking server, which I’ve deployed on Heroku, as a proxy server. This is necessary because the policy file is usually served without the header field which is required for cross-origin resource sharing (CORS). As you can see in its source code, my proxy server doesn’t store anything but Heroku logs the last 1’500 requests, which includes the queried domain and your IP address. I don’t persist the log file but I might check it from time to time for troubleshooting. The tool checks the syntax of the DNS record and the policy file but it verifies neither the MX records nor whether the mail server has a valid PKIX certificate.
Comparison to DANE
Coexistence with DANE
Can DANE override MTA-STS validation? Unfortunately, the standard is silent on this.
MTA-STS DNS record
The TXT record at _mta-sts.gmail.com in April 2021.
MTA-STS policy file
The policy file at https://mta-sts.gmail.com/.well-known/mta-sts.txt in April 2021.
HTTP Strict Transport Security (HSTS)
Client → Attacker: (Open TCP connection)
Client → Attacker: GET / HTTP/1.0
Client → Attacker: Host: www.example.com
Attacker → Server: (Open TCP connection)
Attacker → Server: GET / HTTP/1.0
Attacker → Server: Host: www.example.com
Server → Attacker: HTTP/1.0 301 Moved Permanently
Server → Attacker: Location: https://www.example.com/
Attacker ↔ Server: (Close TCP connection)
Attacker → Server: (Open TLS connection)
Attacker → Server: GET / HTTP/1.0
Attacker → Server: Host: www.example.com
Server → Attacker: HTTP/1.0 200 OK
Server → Attacker: [Headers and body]
Attacker ↔ Server: (Close TLS connection)
Attacker → Client: HTTP/1.0 200 OK
Attacker → Client: [Rewritten headers and body]
Client ↔ Attacker: (Close TCP connection)
How a man in the middle can prevent the client from upgrading its TCP connection to TLS. If the attacker knows that the server redirects http://www.example.com/ to https://www.example.com/, it can skip the first connection to the server, of course.
A Strict-Transport-Security response header field with a validity period of 365 days.
STARTTLS Policy List
SMTP TLS Reporting (TLSRPT)
SMTP TLS Reporting (TLSRPT) is specified in RFC 8460. With it, domain owners can ask sending mail servers to report transport security failures to them, which allows them to detect misconfigurations and attacks. If you’re certain that all emails are still being delivered to you, you’re much more likely to enforce strict transport security. Just like DMARC reporting, TLSRPT uses a DNS record to specify the endpoints to which reports should be sent once a day.
The following tool queries the TLSRPT record with Google’s DNS API and checks its format with a regular expression:
TLSRPT DNS record
The TXT record at _smtp._tls.gmail.com. Whitespace is allowed before and after semicolons.
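A format check with a regular expression could look roughly like this. The pattern is a simplification of the grammar in RFC 8460 and not the expression used by the tool. The function name and the addresses in the test are hypothetical.

```python
import re

# Simplified pattern: the version, followed by one or more "name=value" fields,
# which are separated by semicolons with optional whitespace around them.
TLSRPT_PATTERN = re.compile(
    r"^v=TLSRPTv1"
    r"(?:[ \t]*;[ \t]*[A-Za-z0-9][A-Za-z0-9_.-]*=[^;]+)+"
    r"(?:[ \t]*;[ \t]*)?$"
)

def parse_tlsrpt_record(record: str) -> list:
    """Returns the list of report URIs from the rua field."""
    if not TLSRPT_PATTERN.match(record):
        raise ValueError("Not a valid TLSRPT record")
    fields = {}
    for part in record.split(";")[1:]:
        if part.strip():
            name, _, value = part.partition("=")
            fields[name.strip()] = value.strip()
    # Several URIs can be separated by commas.
    return [uri.strip() for uri in fields.get("rua", "").split(",") if uri.strip()]
```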
TLSRPT report format
A report which I’ve received from Google with the filename google.com!ef1p.com!1617926400!1618012799!001.json.gz in an email with the subject Report Domain: ef1p.com Submitter: google.com Report-ID: <[email protected]>. I’m not sure what total-successful-session-count means if the policy-type is no-policy-found. Would it count as a failure if TLS cannot be negotiated? Or are all sessions successful if no policy was found? You find an example report with failure details in RFC 8460.
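The filename of such a report encodes who sent it, which domain it is about, and which period it covers as Unix timestamps. A small sketch for taking it apart, assuming the filename format from RFC 8460 with an optional unique identifier (the function name is made up):

```python
from datetime import datetime, timezone

def parse_report_filename(filename: str) -> dict:
    parts = filename.split("!")
    if len(parts) not in (4, 5):
        raise ValueError("Unexpected number of components")
    # The extension follows the first dot of the last component.
    last, _, extension = parts[-1].partition(".")
    parts[-1] = last
    sender, policy_domain, begin, end = parts[:4]
    return {
        "sender": sender,
        "policy_domain": policy_domain,
        "begin": datetime.fromtimestamp(int(begin), tz=timezone.utc),
        "end": datetime.fromtimestamp(int(end), tz=timezone.utc),
        "unique_id": parts[4] if len(parts) == 5 else None,
        "extension": extension,
    }

report = parse_report_filename("google.com!ef1p.com!1617926400!1618012799!001.json.gz")
```

For the filename above, the period covers 9 April 2021 from 00:00:00 to 23:59:59 UTC, which fits the daily reporting interval.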
TLSRPT report conditions
Reporting in header fields
End-to-end security
Instead of relying on mail servers to perform domain authentication and enforce transport security, senders and recipients can take matters into their own hands and secure their communication themselves. This idea is often referred to as end-to-end encryption (E2EE). Since protecting the authenticity of the content is usually just as important as protecting its confidentiality, I prefer the term end-to-end security. As we’ve seen earlier, arbitrary content can be sent via email. As long as the sender and the recipient agree on which cryptographic algorithms and which encoding they want to use, they can use any technique they want, such as one-time pad encryption combined with a message authentication code (MAC). While end-to-end security doesn’t have to be standardized, doing so is valuable for two reasons: The more people use the same technique, the more useful it becomes for each user, which is known as a network effect, and if everyone uses the same technique, it can be integrated into mail clients, which makes it easier to use.
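To make the example of a one-time pad combined with a message authentication code a bit more tangible, here is a minimal sketch which uses only Python’s standard library. The function names are made up. The construction (XOR with the pad, then HMAC-SHA256 over the ciphertext) is one possible choice, and it is secure only if both parties share the pad and the MAC key in advance and never use the same pad twice.

```python
import hashlib
import hmac
import secrets

def encrypt_then_mac(message: bytes, pad: bytes, mac_key: bytes):
    if len(pad) < len(message):
        raise ValueError("The pad has to be at least as long as the message")
    ciphertext = bytes(m ^ p for m, p in zip(message, pad))
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext, tag

def verify_then_decrypt(ciphertext: bytes, tag: bytes, pad: bytes, mac_key: bytes) -> bytes:
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("The message has been modified")
    return bytes(c ^ p for c, p in zip(ciphertext, pad))

# Both values have to be exchanged over a secure channel beforehand.
pad = secrets.token_bytes(64)
mac_key = secrets.token_bytes(32)
ciphertext, tag = encrypt_then_mac(b"Meet me at noon.", pad, mac_key)
```

The resulting ciphertext and tag could then be sent as the body of an ordinary email, for example in Base64 encoding.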
[Figure: The mail client of the sender uses SMTP for message submission to the outgoing mail server of the sender, which uses SMTP for message relay to the incoming mail server of the recipient, from which the mail client of the recipient retrieves the message with IMAP or POP3. The mail client of the sender uses IMAP for message storage on the incoming mail server of the sender. The outgoing mail server of the recipient is shown as well.]
If the mail client of the sender encrypts and authenticates the message for the mail client of the recipient, none of the mail servers have to be trusted (beyond delivering or storing the message).
End-to-end security has two advantages:
- No trust in mail servers required: In theory, if you don’t trust any email service provider, you can just host your emails yourself. In practice, however, running your own mail server on your own hardware is a hassle. Beyond the technical complexity of running a mail server, you may want to share the infrastructure with other users in order to reduce costs. Without end-to-end security, everyone has to trust the administrator of the mail servers. If you employ end-to-end security on all your messages, you can choose any free email service provider who delivers your messages reliably.
- Secrets on clients instead of servers: In order to receive emails from anyone, incoming mail servers have to be reachable from anywhere on the Internet. If a security hole is found in the used software, mail servers become vulnerable immediately. Given that servers are typically shared by many users, they are a prime target for attacks. If you employ end-to-end security with all your contacts, an attacker who compromised your mail server can neither read your communication nor send messages in your name. If, on the other hand, your mail client or your computer is compromised, it’s over with and without end-to-end security.
End-to-end security also has some disadvantages:
- No remote search: Your mail client has to store all messages locally if you want to be able to search for a message based on its content. Without end-to-end encryption, your mail client can ask your incoming mail server to perform the search using the SEARCH command of IMAP. While storing all messages locally is no longer a problem for modern smartphones, it might still be one for smartwatches. End-to-end security requires thick clients instead of thin clients.
- No partial downloads: IMAP allows clients to fetch just certain parts of multipart messages. Since the body of a message is usually signed and encrypted as a single unit, bandwidth-constrained mail clients cannot download the text of an end-to-end secured message without its attachments.
- Archiving of messages: If you lose your decryption key, you can no longer access your messages (unless your mail client stores them in plaintext on your computer). While this can be an annoyance for individuals, it can be a real problem for companies, who must archive their electronic communication in order to avert spoliation of potential evidence.
- Message filtering: Mail servers cannot scan encrypted messages for malware or discard them as spam based on their content.
There are two main standards for end-to-end security: Secure/Multipurpose Internet Mail Extensions (S/MIME) and Pretty Good Privacy (PGP), which was standardized as OpenPGP. Unlike the other fixes in this chapter, both of them have existed since the 90s. The main difference between them is how public keys are authenticated, distributed, and revoked. Otherwise, they are quite similar.
| Aspect | Secure/Multipurpose Internet Mail Extensions (S/MIME) | Pretty Good Privacy (PGP) |
|---|---|---|
| Standards | RFC 8551: S/MIME formats; RFC 8550: Certificate handling; RFC 5652: Message syntax; RFC 3370: Algorithms | RFC 4880: OpenPGP formats; RFC 3156: Content types |
| Certificate format | RFC 5280: X.509 | Specific to OpenPGP |
| Public-key binding | Certification authorities | Web of trust |
| Public-key distribution (before relying on DNSSEC) | Attached to the message as part of the signature or through an internal directory | Public key server, personal website, or Autocrypt header field |
| Public-key revocation (before relying on DNSSEC) | Certificate Revocation Lists (CRL) or Online Certificate Status Protocol (OCSP) by the issuing certification authority | Key revocation signature by the owner of the key |
| DANE resource record type | SMIMEA | OPENPGPKEY |
| Content type for encryption | application/pkcs7-mime | application/pgp-encrypted |
| Content type for signature | application/pkcs7-signature | application/pgp-signature |
| Primary user group | Business world | Security specialists |
| Costs for users | You have to pay for the certificate but there are free offers for personal use | None |
Comparison of Secure/Multipurpose Internet Mail Extensions (S/MIME) and Pretty Good Privacy (PGP).
Modes of operation
[Figure: The sender signs the message with the private key of the sender and encrypts the signed message with the public key of the recipient. The signed and encrypted message passes the attacker on its way to the recipient, who decrypts it with the private key of the recipient and verifies the signed message with the public key of the sender, which results in the message and its status.]
How a message is signed and encrypted by the sender and then decrypted and verified by the recipient. If you don’t know the public key of the recipient, you can only sign your message.
Deniable authentication
Which mechanism ensures which properties. More is not always better.
Compression before encryption
Multipart message nesting
S/MIME signature using the application/pkcs7-mime format.
S/MIME signature using the multipart/signed format. micalg stands for message integrity check (MIC) algorithm. The advantage of this format is that users can read the message even if their mail client doesn’t support S/MIME.
S/MIME encryption with integrity protection.
PGP signature with the message in the first part. The checksum is a 24-bit cyclic redundancy check (CRC). Some implementations use BEGIN PGP SIGNATURE instead of BEGIN PGP MESSAGE.
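The 24-bit checksum can be computed with a few lines of Python. The sketch follows the reference algorithm in RFC 4880 with the initial value 0xB704CE and the generator 0x1864CFB. The function names are my own. The checksum is computed over the binary data before it is Base64-encoded and then appended as its own line, which starts with an equals sign.

```python
import base64

CRC24_INIT = 0xB704CE
CRC24_POLY = 0x1864CFB

def crc24(data: bytes) -> int:
    crc = CRC24_INIT
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= CRC24_POLY
    return crc & 0xFFFFFF

def armor_checksum(data: bytes) -> str:
    # The three checksum bytes are Base64-encoded and prefixed with "=".
    return "=" + base64.b64encode(crc24(data).to_bytes(3, "big")).decode("ascii")
```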
PGP encryption with metadata in the first part. If the plaintext has been signed, you get the format of the previous example without MIME-Version: 1.0 after decryption.
Securing header fields
PGP signature with protected header fields. The original subject is replaced with three dots only when the message is encrypted.
SMIMEA resource record
OPENPGPKEY resource record
SSHFP resource record
The copyright of this article and its graphics belongs to Kaspar Etter. You can share this article in any form as long as you give proper attribution.