
Email explained from first principles at SSHFP resource record

source link: https://explained-from-first-principles.com/email/#sshfp-resource-record

Email

explained from first principles

This article and its code were first published on 7 May 2021.

If you are visiting this website for the first time, then please first read the front page, where I explain the intention of this blog and how to best make use of it. As far as your privacy is concerned, all data entered on this page is stored locally in your browser unless noted otherwise. While I researched the content on this page thoroughly, you take or omit actions based on it at your own risk. In no event shall I as the author be liable for any damages arising from information or advice on this website or on referenced websites.

Preface

Being one of the oldest services on the Internet, email has been with us for decades and will remain with us for at least another decade. Even though email plays an important role in everyday life, most people know very little about how it works. Before we roll up our sleeves and change this, here are a few things that you should know:

Terminology

Concepts

Before diving into the technical aspects of email, let’s first look at email from the perspective of its users.

Message

The purpose of email is to send messages over the Internet. A message is a recorded piece of information which is delivered asynchronously from a sender to one or several recipients. Asynchronous communication means that a message can be consumed at an arbitrary point after it has been produced, without the recipient having to interact with the sender concurrently. A message can be transmitted with a physical object, such as a letter, or with a physical signal, such as an acoustic or electromagnetic wave. While humans have delivered messages in the form of objects for millennia with couriers and pigeons, it’s only since the invention of the optical telegraph in the late 18th century and the invention of the electrical telegraph in the middle of the 19th century that we can signal arbitrary messages over long distances. The fundamental principle of communication stayed the same over all those years: You can either start a new conversation or continue an existing one by replying to a previous message.

Mailbox

A mailbox is a box for incoming mail (also called an inbox), into which everyone can deposit messages but ideally only the intended recipient can retrieve them. In some countries, the privacy of such messages is legally protected by the secrecy of correspondence.

Provider

There are three things that set email apart from the traditional postal system, which is sometimes also referred to as snail mail:

  1. Email conveys digital data, whereas a letter is an analog item. The former is much more useful for further processing.
  2. Email enables instant global delivery at a marginal cost of zero. The only fee you pay is for your access to the Internet.
  3. Mailboxes for email are provided and operated by companies, which are called email service providers. While you could operate your own server since email is an open and decentralized system, this is rarely done in practice for reasons we discuss later on.
Which are the most popular email service providers?

Address

Email addresses are used to identify the sender and the recipient(s) of a message. They consist of a username followed by the @ symbol and a domain name. The domain name allows the sender to first determine and then connect to the mail server of each recipient. The username allows the mail server to determine the mailbox to which a message should be delivered. The hierarchical Domain Name System ensures that the domain name is unique, whereas the email service provider has to ensure that the name of each user is unique within its domain. There doesn’t have to be a one-to-one correspondence between addresses and mailboxes: A mailbox can be identified by several addresses, and an email sent to a single address can be delivered to multiple mailboxes.
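As a minimal sketch of this split, the following Python snippet separates an address at its last @ symbol. The function name `split_address` and the sample address are made up for illustration. Lowercasing the domain reflects that domain names are case-insensitive, while the username is left untouched because only the responsible mail server decides how to interpret it.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an email address into (username, domain) at the last @ symbol."""
    username, separator, domain = address.rpartition("@")
    if not separator or not username or not domain:
        raise ValueError(f"Not an email address: {address!r}")
    # Domain names are case-insensitive; the username is kept as is.
    return username, domain.lower()


username, domain = split_address("alice@Example.ORG")
print(username)  # used by the mail server to determine the mailbox
print(domain)    # used by the sender to determine the mail server
```

The sketch ignores display names and the more exotic forms of the address syntax.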

Display name

How Apple Mail shows the display name in the To and From fields – if you have Smart Addresses disabled, which you totally should.

The @ symbol

Normalization

Subaddressing

Go to the Accounts and Import tab of your settings and click on “Add another email address” under “Send mail as”.

Afterwards, enter the preferred display name and subaddress in the new window. You can leave the box “Treat as an alias” checked. (In either case, Gmail asks the recipient to reply to your subaddress, while the main address is used in the Return-Path header field.)

Click on the button “Next Step” and you’re done. You can now select a different From address the next time you compose a message.

Alias address

Mailing list

Address syntax

[Figure labels: “What a standard allows” and “What is actually being used”.]

Often only a subset of a standard finds adoption, while some things become convention without a formal standard.

Common addresses

Most of these addresses are encouraged by RFC 2142: “Mailbox names for common services, roles, and functions”. Role-based addresses are usually configured as aliases so that incoming emails can be forwarded to several people.

Recipients

You can address the recipients of a message in three different ways:

  • The To field contains the address(es) of the primary recipient(s). As a sender, you expect the primary recipient(s) to read and often to react to your message. The expected reaction can be a reply or that they perform the requested task.
  • The Cc field contains the address(es) of the secondary recipient(s). As a sender, you want to keep the secondary recipient(s) informed without expecting them to read or react to your message. (Cc stands for carbon copy.)
  • The Bcc field contains the address(es) of the hidden recipient(s). Their address(es) are not to be revealed to other recipients of the message. The field is usually fully preserved in your folder of sent messages but fully removed in the version of the email that is delivered to others. Alternatively, a different message could be delivered to each hidden recipient where their address alone is listed in the Bcc field. The standard also allows hidden recipients to see each other; they just have to be removed for the primary and secondary recipients. The vague semantics of this feature leads to several problems. (Bcc stands for blind carbon copy.)

Important: Just because someone is listed as another recipient doesn’t mean that they received the same message as you. The reason for this could be innocuous or malicious. On the one hand, it may be that the email could simply not be delivered to them. On the other hand, the sender might have delivered the message only to you in order to mislead you. Your email service provider has no way of verifying that the same message has also been delivered to the other recipients. This allows a fraudster to fake a relationship that they do not have or to lead you to believe that they have done the introduction you asked them for, even when this is not the case. If you reply to all, your reply would also be sent to the faked recipients, of course.

Group construct

Sender

There are two relevant fields to indicate the originator of a message:

  • The From field contains the address of the person who is responsible for the content of the message.
  • The Reply-To field indicates the address(es) to which replies should be sent. If absent, replies are sent to the From address.
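The rule for replies can be sketched in a few lines of Python with the standard `email` package. The function name `reply_addresses` and the sample message are assumptions for illustration; real mail clients handle more cases, such as mailing lists and reply-to-all.

```python
from email import message_from_string
from email.utils import getaddresses


def reply_addresses(raw_message: str) -> list[str]:
    """Return the addresses to which a reply should be sent."""
    message = message_from_string(raw_message)
    # Reply-To takes precedence; fall back to From if it is absent.
    reply_to = message.get_all("Reply-To", [])
    source = reply_to if reply_to else message.get_all("From", [])
    return [address for _name, address in getaddresses(source)]


RAW = (
    "From: Alice <alice@example.org>\n"
    "Reply-To: Team <team@example.org>\n"
    "\n"
    "Hello!\n"
)
print(reply_addresses(RAW))
```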

Important: The core email protocols do not authenticate the sender of an email. It’s called spoofing when the sender uses a From address which doesn’t belong to them. Forged sender addresses are a huge problem for the security of email. There are additional standards to authenticate emails. For them to have the desired effect, though, both the sender and the recipients have to use them.

Sender field

No reply

Subject

The Subject field identifies the topic of a message. Its content is restricted to a single line but the line can be of arbitrary length. (We’ll talk about encoding later.) RFC 5322 also defines other informational fields, namely Comments and Keywords, but I’ve never seen them being used. All informational fields are optional, which means an email doesn’t need a subject line. The mail clients I’ve checked, though, include the Subject field even when it’s empty. While the message is transmitted with an empty Subject field, mail clients usually display “(No subject)” instead of nothing.

Prefixes

Body

Last but not least, an email has a body (which is strictly speaking optional). The body contains the actual content of a message. It can be formatted in different ways and can consist of different parts. Splitting the body into several parts is useful, for example, to send a plaintext version alongside an HTML-encoded message or to attach files to an email. We’ll discuss later how all of this works.
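To make the idea of a body with several parts more tangible, here is a small sketch using Python’s standard `email` package. The addresses, the texts, and the attachment are invented for illustration; the snippet only builds the message in memory and doesn’t send anything.

```python
from email.message import EmailMessage

message = EmailMessage()
message["From"] = "Alice <alice@example.org>"  # hypothetical addresses
message["To"] = "Bob <bob@example.com>"
message["Subject"] = "A body with several parts"

# The plaintext version of the content.
message.set_content("Hello Bob,\nthis is the plaintext version.\n")
# An HTML version of the same content as an alternative.
message.add_alternative(
    "<p>Hello Bob,<br>this is the <b>HTML</b> version.</p>", subtype="html"
)
# A file attached to the email.
message.add_attachment(
    b"column1,column2\n1,2\n",
    maintype="application",
    subtype="octet-stream",
    filename="data.csv",
)

# List the nested parts of the body.
for part in message.walk():
    print(part.get_content_type())
```

The loop lists a mixed container, which holds the two alternative text versions and the attachment.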

Size limit

Architecture

There are four separate aspects to understand email from a technical perspective:

  • Format: What is the syntax of email messages?
  • Protocols: How are these messages transmitted?
  • Entities: Who transmits these messages to whom?
  • Architecture: How are these entities arranged?

Let’s go through them one by one in the opposite order.

Simplified architecture

One reason why email is so hard to grasp is that the official terminology is unnecessarily complicated in most circumstances. Throughout this article, we’ll work with a much simpler version. Email follows the client-server model: A client opens a connection to a server in order to request some service. In all the graphics where arrows represent an exchange of data, the arrows point from the client to the server; i.e. in the direction of the request, not the response. The following entities and protocols are involved in the transmission of a message from a sender to a recipient:

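[Figure: Mail client of sender → outgoing mail server of sender (SMTP for message submission) → incoming mail server of recipient (SMTP for message relay) ← mail client of recipient (IMAP or POP3 for message retrieval). The mail client of the sender also connects to the incoming mail server of the sender (IMAP for message storage). The outgoing mail server of the recipient is shown as well.]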

The simplified email architecture. We’ll discuss each entity in the next section and the protocols thereafter.

Standardization

[Figure: Sender side (user, client, server) and recipient side (server, client, user).]

How emails are submitted and accessed (in blue) is independent from how emails are exchanged between servers (in green).

Webmail

[Figure: Web (web server, web browser, user) plus mail (mail server, mail client, user) results in webmail (mail server, mail client, web server, web browser, user).]

In the case of webmail, the mail client is accessed via a web server using a web browser.

[Figure: Web versus mail. Web: code and data flow between the web server and the web browser. Mail: only data flows between the mail server and the mail client, which contains the code.]

On the left, the code to interact with the data comes from the server. On the right, the logic is inside the client and only data is exchanged.

Official architecture

For the sake of completeness and to enable you to understand the linked articles, this subsection covers the official terminology as used, for example, in RFC 5598. In the official documents, there are five instead of three entities, with each of them having a more complicated name and, of course, an associated three-letter acronym (TLA):

TLA | Name | Description
MUA | Mail user agent | Client to compose, send, receive, and read emails, such as Microsoft Outlook, Apple Mail, and Mozilla Thunderbird.
MSA / MSS | Mail submission agent / Mail submission server | Server to receive outgoing emails from authenticated users and to queue them for delivery by the mail transfer agent (MTA).
MTA | Mail transfer agent | Server to deliver the queued emails and to receive them on the other end. It then forwards the received emails to the mail delivery agent (MDA).
MDA | Mail delivery agent | Server to receive emails from the local mail transfer agent (MTA) and to store them in the message store (MS) of the recipient.
MS / MAS | Message store / Mail access server | Server to store the emails received from the mail delivery agent (MDA) and to deliver them to the mail user agent (MUA) of the recipient.

The terminology used by the Internet Engineering Task Force (IETF) in its official documents, such as RFC 5598. The alternative terms MSS (mail submission server) and MAS (mail access server) are used in some newer documents. I added them because I like them better.

These terms are not as precise as they seem to be and the boundaries are often fluid in practice. Having more entities also changes the architecture. What follows is a nicer version of this ASCII graphic, which is a masterpiece to be appreciated in its own right.

[Figure: On both the sender’s and the recipient’s side, the entities are the mail user agent (MUA), the mail submission agent (MSA), the mail transfer agent (MTA), the mail delivery agent (MDA), and the message store (MS).]

The official Internet Mail Architecture with SMTP connections in green and IMAP connections in blue.

None of the servers have to be a single machine. In addition, the incoming MTA and the outgoing MTA don’t have to be the same.

Entities

There are three entities in the simplified architecture: the mail client, the outgoing mail server, and the incoming mail server.

Mail client

The mail client is a computer program to compose, send, retrieve, and read emails. It provides the interface through which users handle email. The mail client runs either locally on the user’s device or remotely on a web server. Examples of the former kind are Microsoft Outlook, Apple Mail, and Mozilla Thunderbird. Examples of the latter are Google Gmail and Yahoo! Mail when accessed through a web browser. (Both companies also provide mobile apps for Android and iOS, which fall into the former category.)

The mail client connects to the outgoing mail server to submit messages for delivery to other users and to the incoming mail server to fetch new messages from the user’s mailbox. Both servers authenticate the user, typically with a username and a password. The mail client connects to the incoming mail server through a different interface than outgoing mail servers do, which can be seen on the recipient’s side of the simplified mail architecture:

[Figure: The simplified email architecture. On the recipient’s side, the incoming mail server of the recipient accepts SMTP for message relay from outgoing mail servers and IMAP or POP3 for message retrieval from the mail client of the recipient.]

The recipient’s mail client connects to the incoming mail server using a different port and protocol than outgoing mail servers. It’s usually also a different machine with a different domain name and IP address than the one outgoing mail servers connect to.

This distinction is apparent in the official mail architecture, where the message store (MS) and the mail transfer agent (MTA) reside in different boxes. By giving the impression that the incoming mail server is a single machine, the simplified model doesn’t explain why the incoming mail server needs to be configured in the mail client of its user but not in the outgoing mail servers of other users. Since the simplified architecture is less confusing in every other regard, it’s still the preferred model for the scope of this article.

Configuration

The simplified email architecture corresponds to what mail clients like Apple Mail display to you. The domain of the address (ef1p.com) is different from the domain of the servers (mail.gandi.net). The host names of the incoming mail server and the outgoing mail server are usually not the same.

Custom domains

Autoconfiguration

Insert the appropriate Domain and use 0 0 0 . for all the services which are not supported by your email service provider.

Configuration database

Outgoing mail server

The outgoing mail server accepts messages from mail clients and queues them for delivery. It then determines the incoming mail server of each recipient and delivers the message to them. The outgoing mail server acts as a server in the interaction with mail clients but assumes the role of a client when relaying the message to incoming mail servers. (Connections are always initiated by clients.) If the outgoing mail server cannot deliver a message, it sends a bounce message to the user who submitted the message. While the outgoing mail server should not change the content of a message, it adds information about the submitter at the top. Before accepting a message, the outgoing mail server authenticates the user, typically based on a username and a password.

Why do we need outgoing mail servers when mail clients could simply deliver the messages directly?

[Figure: The mail client of the sender connects directly to the incoming mail server of the recipient (SMTP for direct message delivery?) instead of submitting the message to the outgoing mail server of the sender.]

A hypothetical email architecture without outgoing mail servers.

[Figure: User authentication between the mail clients and their mail servers; domain authentication between the outgoing mail server of the sender and the incoming mail server of the recipient.]

The incoming mail server verifies that the outgoing mail server is authorized to send messages from the claimed domain, while the outgoing mail server of the sender ensures that each user uses their own address in the From field.

How to avoid submitting the same message to both the outgoing mail server and the incoming mail server?

[Figure: The simplified email architecture.] The mail client submits the same message to both the outgoing mail server and the incoming mail server.

  • 1. Submit, 2. Store: Gmail automatically stores sent messages.
  • 1. Store, 2. Submit: The Courier IMAP server can deliver emails.
  • 1. Store, 2. Reference, 3. Fetch: The lemonade profile enables the outgoing mail server to fetch content for delivery from the incoming mail server.

Incoming mail server

The incoming mail server waits for connections from outgoing mail servers of other users. When an outgoing mail server connects to transmit a message, the incoming mail server records the message together with other information from the session, such as the sender’s IP address. The incoming mail server can reject the incoming message for a number of reasons: The recipient might not exist, their mailbox might be full, the message might be too long, or the sender might not be trusted. If the message is rejected, the outgoing mail server can either try to retransmit it at some later point or inform the user about the failed delivery. If, on the other hand, the incoming mail server accepts the message, it also assumes responsibility for delivering the message. If it fails to do so, for example when the message needs to be forwarded, then the incoming mail server should notify the author of the message.

Once the session with the outgoing mail server is over, the incoming mail server adds the additional information collected during the session to the top of the accepted message. It then evaluates whether the message is likely spam. Depending on the score of this evaluation, the message is either delivered to the recipient’s inbox, quarantined to the recipient’s spam folder, or discarded without notifying the author. While the last option violates the principle that mail is either delivered or returned, the alternative is often worse. This is why the standard explicitly allows incoming mail servers to drop received messages silently. If the receiving address is an alias, the incoming mail server forwards the message to the configured email address instead of delivering it to an inbox. In case the address denotes a mailing list, the incoming mail server sends the message to all subscribers of the list. The incoming mail server also applies filters and generates automatic responses, such as delivery failures and out-of-office replies.

The incoming mail server waits for connections from mail clients on a different interface. In order to access the mailbox of its user, the mail client has to present appropriate credentials. The user’s email address and password are often used to authenticate the client, which is granted unlimited access to the mailbox on success. If the incoming mail server supports OAuth, the mail client can present an access token to gain potentially limited access to the user’s mailbox. The scopes offered by Gmail are an example of what limited access can look like. While restricted authorization is common for other services, it’s not yet the norm for email. Once the client is authenticated, it can retrieve, deposit, and delete messages. It can also mark them as read or flag them for later attention.

Address resolution

How do outgoing mail servers find the incoming mail server of a recipient? As we learned above, an email address consists of a username and a domain name, separated by the @ symbol. A sender finds the incoming mail server of a recipient by querying the Domain Name System (DNS) for mail exchange (MX) records of the used domain name. If no such records exist, the sender queries for address records (A or AAAA) of the domain name instead. If the DNS response is not authenticated with DNSSEC, mail might be sent to the server of an attacker. TLS can prevent this only if the sender requires that the recipient’s domain is included in the server certificate, which is usually not the case. A standard for securing MX records with TLS exists, though.

A domain can list several servers that handle incoming mail. MX records assign a priority to each incoming mail server. The lower the number, the higher its priority. This is useful for providing redundancy in case the most preferred server is not responding. Several servers with the same priority can be used for load balancing. You can use the following tool to look up the incoming mail servers of a domain you are interested in. It uses an API by Google to query the Domain Name System and an API by ipinfo.io to determine the geographic location of each server. The latter is just to remind you that the Internet is a physical infrastructure. Outgoing mail servers need to know only the IP address of the incoming mail server, of course. (A remark on the subdomains you might encounter: spool is a synonym for buffer/queue, fb probably stands for fallback and alt for alternative.)
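The selection logic described in this subsection can be sketched as a pure function. The name `delivery_order`, the record format (a list of priority and host name pairs), and the host names are assumptions for illustration; the actual DNS lookup is left out on purpose.

```python
def delivery_order(domain: str, mx_records: list[tuple[int, str]]) -> list[str]:
    """Return the hosts to try for a domain, most preferred first.

    mx_records is a list of (priority, host) pairs as found in MX records.
    """
    if not mx_records:
        # No MX records: fall back to the address records of the domain itself.
        return [domain]
    # The lower the number, the higher the priority. A real sender would
    # also randomize hosts of equal priority to balance the load.
    return [host for _priority, host in sorted(mx_records, key=lambda record: record[0])]


records = [(20, "alt.mail.example.com"), (10, "mail.example.com")]
print(delivery_order("example.com", records))
print(delivery_order("example.com", []))
```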


Null MX record

Dotless domains

Name collisions

Protocols

The above entities communicate with two kinds of protocols: They use delivery protocols to deliver messages and access protocols to access the user’s mailbox. As discussed earlier, only SMTP for message relay is mandatory. All other protocols can be replaced in a proprietary setup. For example, there are efforts to combine message submission and mailbox access in a standardized way.

[Figure: The simplified email architecture. SMTP for message submission and SMTP for message relay are the delivery protocols; IMAP for message storage and IMAP or POP3 for message retrieval are the access protocols.]

The simplified email architecture with delivery protocols in green and access protocols in blue.

Use of TLS

Historically, SMTP, POP3, and IMAP ran directly on top of the transport layer using the Transmission Control Protocol (TCP), which means that the communication was neither encrypted nor authenticated. Anyone with access to one of the networks through which the communication was routed could therefore read and potentially alter your messages. Even your user password might have been transmitted in the clear. In theory, the solution is straightforward: Use Transport Layer Security (TLS) to encrypt and authenticate the communication between each pair of entities. In practice, however, you want to be backward compatible: A server that expects requests to be in a specific format cannot suddenly handle a request for a TLS handshake. There are two ways around this problem:

  • Implicit TLS: Introduce a new port number for each service on which the communication starts directly with a TLS handshake. The protocol variant which uses TLS implicitly is denoted by appending an S to its name. For example, IMAP becomes IMAPS.
  • Explicit TLS or STARTTLS, sometimes mistakenly called opportunistic TLS: Allow the client to upgrade an insecure connection to a secure connection with a command once the server has indicated that it supports TLS. The communication is secured only if the client requests this explicitly. The server cannot require the upgrade to TLS as this would break backward compatibility.

With one notable exception, most longstanding email protocols were adapted to support both Implicit TLS and Explicit TLS.
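From the perspective of a client, the two approaches differ only in how the connection is opened. The following sketch uses Python’s `smtplib` for message submission; the function names are made up, and the default ports 465 and 587 are the ones listed in the port table of this article. The functions are only defined here, since calling them requires a reachable mail server.

```python
import smtplib
import ssl


def open_implicit_tls(host: str, port: int = 465) -> smtplib.SMTP:
    """Implicit TLS: the connection starts directly with a TLS handshake."""
    context = ssl.create_default_context()
    return smtplib.SMTP_SSL(host, port, context=context)


def open_explicit_tls(host: str, port: int = 587) -> smtplib.SMTP:
    """Explicit TLS: start in cleartext and upgrade with STARTTLS."""
    context = ssl.create_default_context()
    client = smtplib.SMTP(host, port)
    client.ehlo()
    # The server has to indicate that it supports the upgrade.
    if not client.has_extn("starttls"):
        client.quit()
        raise RuntimeError("The server does not offer STARTTLS.")
    client.starttls(context=context)
    client.ehlo()  # Repeat the greeting over the now secured connection.
    return client
```

Refusing to continue when STARTTLS is not offered is a choice made by this sketch: the client insists on the upgrade even though the server cannot require it.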

Implicit TLS versus Explicit TLS

TLS settings in mail clients

Mail clients often use other names for Implicit TLS and Explicit TLS.

The server settings in Apple Mail when “Automatically manage connection settings” is disabled. Also somewhat disappointingly, Apple Mail uses Explicit TLS rather than Implicit TLS by default.

Encryption on the Web

Deployment statistics

Port numbers

Every protocol specifies a default port on which servers listen for incoming requests. Instead of scattering the port numbers used by various email protocols throughout the following subsections, here is a table with all the relevant information for future reference:

Protocol | Port for Implicit TLS | Port for Explicit TLS
SMTP for Submission | 465 | (587)
SMTP for Relay | – | 25
POP3 | 995 | (110)
IMAP | 993 | (143)
JMAP via HTTPS | 443 | –
ManageSieve | – | 4190

The port numbers used by the various email protocols.
Since RFC 8314, Implicit TLS is the preferred option and cleartext is considered obsolete on the port for Explicit TLS.
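If you want to use these port numbers in code, the table can be captured as data. In the sketch below, the names `PORTS` and `preferred_endpoint` are hypothetical; the function simply prefers the port for Implicit TLS whenever one exists.

```python
# (port for Implicit TLS, port for Explicit TLS); None means "no such port".
PORTS = {
    "SMTP for Submission": (465, 587),
    "SMTP for Relay": (None, 25),
    "POP3": (995, 110),
    "IMAP": (993, 143),
    "JMAP via HTTPS": (443, None),
    "ManageSieve": (None, 4190),
}


def preferred_endpoint(protocol: str) -> tuple[int, str]:
    """Return the port and the TLS variant to try first for a protocol."""
    implicit_port, explicit_port = PORTS[protocol]
    if implicit_port is not None:
        return implicit_port, "implicit"
    return explicit_port, "explicit"


print(preferred_endpoint("IMAP"))
print(preferred_endpoint("SMTP for Relay"))
```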

Why has SMTP for Relay no port for Implicit TLS?

Delivery protocols

Submission versus relay

Header fields and body

We’ll have a closer look at the format of messages in the next section but, because we already want to transmit messages in this section, we have to cover the basics now. A message consists of several header fields and an optional body, which follows after an empty line. Each header field has to be on a separate line but can, if necessary, span several lines. As in HTTP, header fields are formatted as Name: Value. What follows is a simple example message. You can find more examples in RFC 5322.

From: Alice <alice@example.org>
To: Bob <bob@example.com>
Cc: Carol <carol@example.com>
Bcc: IETF <ietf@ietf.org>
Subject: A simple example message
Date: Thu, 01 Oct 2020 14:56:37 +0200
Message-ID: <[email protected]>

Hello Bob,

I think we should switch our roles.
How about you contact me from now on?

Best regards,
Alice

A simple example message with a sender, three recipients, a subject, a date, a message ID, and a body.
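To see how such a message breaks down into header fields and a body, you can parse it with Python’s standard `email` package. The message below is a shortened variant of the example with made-up addresses, and its Subject field deliberately spans two lines.

```python
from email import message_from_string, policy

RAW_MESSAGE = """\
From: Alice <alice@example.org>
To: Bob <bob@example.com>
Subject: A simple
 example message

Hello Bob,
"""

message = message_from_string(RAW_MESSAGE, policy=policy.default)

# Everything before the first empty line is parsed as header fields.
for name, value in message.items():
    print(f"{name}: {value}")

# The continuation line is joined with the first line of the Subject field.
subject = str(message["Subject"])
# Everything after the first empty line is the body.
body = message.get_content()
print(repr(body))
```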

Message versus envelope

While outgoing mail servers may add missing header fields and sign each message, incoming mail servers should only add trace information to the top of a message and leave the message as is otherwise. The information relevant for handling the message, such as the addresses to deliver the message to and the address to report failures to, belongs to the so-called envelope. The envelope is specific to the Simple Mail Transfer Protocol (SMTP) and it can change completely during the delivery of a message. The message, on the other hand, mostly stays the same during delivery and its format is also used by two access protocols. The important thing to remember is that emails are delivered based on the addresses in the envelope and not the addresses in the header section of the message. Somewhat unfortunately, the fields in the envelope have names similar to some header fields in the message: MAIL FROM for the address to report failures to and RCPT TO for each address to deliver the message to.
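How a mail client might derive the envelope from a message can be sketched as follows. The function name `prepare_submission` and the addresses are assumptions for illustration; using the From address as the address to report failures to is a simplification that fits the common case.

```python
from email.message import EmailMessage
from email.utils import getaddresses, parseaddr


def prepare_submission(message: EmailMessage) -> tuple[str, list[str]]:
    """Return (MAIL FROM, RCPT TO list) and remove the Bcc field in place."""
    recipients = []
    for field in ("To", "Cc", "Bcc"):
        values = [str(value) for value in message.get_all(field, [])]
        recipients.extend(address for _name, address in getaddresses(values))
    mail_from = parseaddr(str(message["From"]))[1]
    # Hidden recipients stay in the envelope but leave the message.
    del message["Bcc"]
    return mail_from, recipients


message = EmailMessage()
message["From"] = "Alice <alice@example.org>"
message["To"] = "Bob <bob@example.com>"
message["Cc"] = "Carol <carol@example.com>"
message["Bcc"] = "IETF <ietf@ietf.org>"
message.set_content("Hello Bob,\n")

mail_from, recipients = prepare_submission(message)
print("MAIL FROM:", mail_from)
print("RCPT TO:", recipients)
print("Bcc still in message:", "Bcc" in message)
```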

Diverging envelope example

Submission (from the mail client of Alice to the outgoing mail server of example.org):
  • Envelope: MAIL FROM with the address of Alice; RCPT TO with the addresses of Bob, Carol, and the IETF.
  • Message: From: Alice, To: Bob, Cc: Carol.
The mail client of Alice removes the Bcc header field from the message and submits the message with all recipient addresses in the envelope, including the ones of Bcc recipients, to the outgoing mail server. Automatic responses shall be sent to the mailbox of Alice.

First relay (from the outgoing mail server of example.org to the incoming mail server of example.com):
  • Envelope: MAIL FROM with the address of Alice; RCPT TO with the addresses of Bob and Carol.
  • Message: From: Alice, To: Bob, Cc: Carol.
The outgoing mail server is now responsible for delivering the message to the recipients that the mail client specified. It sees that two recipient addresses are handled by the same domain and delivers the message in a single envelope to the incoming mail server of this domain. The outgoing mail server could also connect to the incoming mail server twice, delivering the message once for Bob and once for Carol. I don’t know which approach is more common in practice. RFC 5321 just says that, when the same message is delivered to multiple recipients in the same session, it should be delivered with a command sequence of MAIL FROM, RCPT TO, RCPT TO, DATA rather than MAIL FROM, RCPT TO, DATA, MAIL FROM, RCPT TO, DATA. We’ll discuss how the envelope corresponds to protocol messages soon.

Alias (from the incoming mail server of example.com to the incoming mail server of example.net):
  • Envelope: MAIL FROM with the address of Alice; RCPT TO with the address of Carol at example.net.
  • Message: From: Alice, To: Bob, Cc: Carol.
The incoming mail server of example.com knows that Carol’s address at example.com is an alias for an address at example.net. It thus forwards the original message without any modifications to the incoming mail server of example.net. A potential delivery failure is still reported to Alice.

Second relay (from the outgoing mail server of example.org to the incoming mail server of ietf.org):
  • Envelope: MAIL FROM with the address of Alice; RCPT TO with the address of the IETF.
  • Message: From: Alice, To: Bob, Cc: Carol.
The outgoing mail server of Alice also has to deliver the message to the IETF address at ietf.org, so it does that.

Mailing list (from the incoming mail server of ietf.org to the incoming mail server of example.net):
  • Envelope: MAIL FROM with the bounce address of the mailing list; RCPT TO with the address of a subscriber at example.net.
  • Message: From: Alice, To: Bob, Cc: Carol.
It turns out that the IETF address is a mailing list. It is now the task of the mail server of ietf.org to deliver the message to all subscribers of this list, with one of them having an address at example.net. RFC 5321 requires that the bounce address as specified in the MAIL FROM field of the envelope is changed to the entity who administers the mailing list. The entity can be a person but is typically a piece of software, which keeps track of delivery failures in order to revise the list. The RFC also demands that the From field in the message remains the same. Mailing list tools often modify the message in some ways, for example by adding a field to the header and a footer to the body in order to let recipients of the message unsubscribe from the mailing list. Alias addresses and mailing lists cause difficulties for domain authentication.
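The way the envelope diverges from the message during these steps can be sketched in a few lines. The function names, the dictionary layout, and all addresses are hypothetical; grouping by domain is a simplification because several domains can share the same incoming mail server.

```python
from collections import defaultdict


def envelopes_per_domain(mail_from: str, recipients: list[str]) -> list[dict]:
    """Relay: build one envelope per recipient domain."""
    by_domain = defaultdict(list)
    for recipient in recipients:
        by_domain[recipient.rpartition("@")[2].lower()].append(recipient)
    return [
        {"domain": domain, "MAIL FROM": mail_from, "RCPT TO": addresses}
        for domain, addresses in by_domain.items()
    ]


def expand_mailing_list(bounce_address: str, subscribers: list[str]) -> list[dict]:
    """Mailing list: one envelope per subscriber with a new bounce address."""
    return [
        {"MAIL FROM": bounce_address, "RCPT TO": [subscriber]}
        for subscriber in subscribers
    ]


relays = envelopes_per_domain(
    "alice@example.org",
    ["bob@example.com", "carol@example.com", "ietf@ietf.org"],
)
for envelope in relays:
    print(envelope)

# The bounce address of the list replaces the address of Alice.
print(expand_mailing_list("list-bounces@ietf.org", ["david@example.net"]))
```

In both functions, only the envelope changes; the message itself is not touched.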

Who removes the Bcc header field?

Another example message with several Bcc recipients.

Each of the following diagrams shows a submission from the mail client to the outgoing mail server of example.org:

  • [Diagram: envelope with MAIL FROM and three RCPT TO commands; message with From: Alice and To: Bob.] The Bcc field is removed from the message for all recipients.
  • [Diagram: envelope with MAIL FROM and one RCPT TO command; message with From: Alice and To: Bob.] The non-Bcc recipients get the redacted message.
  • [Diagram: envelope with MAIL FROM and two RCPT TO commands; message with From: Alice, To: Bob, and Bcc: Carol, David.] The Bcc recipients get the original message.
  • [Diagram: envelope with MAIL FROM and one RCPT TO command; message with From: Alice and To: Bob.] All non-Bcc recipients get the same redacted message.
  • [Diagram: envelope with MAIL FROM and one RCPT TO command; message with From: Alice, To: Bob, and Bcc: Carol.] Carol in Bcc gets her own version of the message.
  • [Diagram: envelope with MAIL FROM and one RCPT TO command; message with From: Alice, To: Bob, and Bcc: David.] And so does David.
  • [Diagram: envelope with MAIL FROM and three RCPT TO commands; message with From: Alice, To: Bob, and an empty Bcc field.] An empty Bcc field indicates that there were hidden recipients without disclosing them.
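One of the strategies above, where the visible recipients share a redacted copy and each Bcc recipient gets a personal copy, can be sketched with Python's standard `email` package. The function name `bcc_versions` and all addresses are assumptions for illustration; the sketch doesn't claim to mirror what any particular mail client or server does.

```python
import copy
from email.message import EmailMessage
from email.utils import parseaddr

def bcc_versions(message, visible, hidden):
    """Return (envelope recipients, message) pairs, one per message version.

    visible: bare addresses of the To and Cc recipients.
    hidden:  Bcc recipients, for example "Carol <carol@example.com>".
    """
    redacted = copy.deepcopy(message)
    del redacted["Bcc"]  # Does nothing if there is no Bcc field.
    versions = [(list(visible), redacted)]
    for recipient in hidden:
        personal = copy.deepcopy(redacted)
        personal["Bcc"] = recipient  # Only this recipient is disclosed.
        versions.append(([parseaddr(recipient)[1]], personal))
    return versions

message = EmailMessage()
message["From"] = "Alice <alice@example.org>"
message["To"] = "Bob <bob@example.com>"
message["Bcc"] = "Carol <carol@example.com>, David <david@example.net>"
message.set_content("Hello")

hidden = ["Carol <carol@example.com>", "David <david@example.net>"]
for recipients, version in bcc_versions(message, ["bob@example.com"], hidden):
    print(recipients, "Bcc:", version["Bcc"])
```

Every returned pair corresponds to one envelope: the recipients go into RCPT TO commands and the message version follows the DATA command.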

How does Gmail recover the Bcc field of sent messages?

Simple Mail Transfer Protocol (SMTP)

The Simple Mail Transfer Protocol (SMTP) was first specified in RFC 821 in 1982. As its name suggests, it is a fairly simple protocol:

(Open connection)
Server: 220 server.example.com
Client: HELO client.example.org
Server: 250 server.example.com
Client: MAIL FROM:<[email protected]>
Server: 250 Ok
Client: RCPT TO:<[email protected]>
Server: 250 Ok
Client: DATA
Server: 354 Go ahead
Client: (Actual message)
Server: 250 Ok
Client: QUIT
Server: 221 Bye
(Close connection)

After opening a TCP connection on port 25, the client sends commands and the server responds with status codes. Once they have greeted each other, the client transmits the envelope, followed by the DATA command and the message.
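The exchange above can be written down as a list of client commands together with the reply code the client hopes to receive for each of them. This is only a sketch of the happy path: the function name and the example addresses are assumptions, and a real client has to read the actual reply after every step.

```python
def smtp_dialog(client_name, sender, recipients, message):
    """Return (command, expected reply code) pairs for a basic SMTP session."""
    steps = [
        (f"HELO {client_name}", 250),
        (f"MAIL FROM:<{sender}>", 250),
    ]
    # One RCPT TO command per recipient of the envelope.
    steps += [(f"RCPT TO:<{recipient}>", 250) for recipient in recipients]
    steps.append(("DATA", 354))
    # The message is terminated by a line which contains only a period.
    steps.append((message + "\r\n.", 250))
    steps.append(("QUIT", 221))
    return steps

for command, code in smtp_dialog(
    "client.example.org",
    "alice@example.org",
    ["bob@example.com"],
    "Subject: Hi\r\n\r\nHello",
):
    print(code, repr(command))
```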

Command syntax

Field terminology

Extended Simple Mail Transfer Protocol (ESMTP)

A framework for extending SMTP was introduced in RFC 1425 in 1993. The extensible protocol, which is backward compatible with SMTP, is called the Extended Simple Mail Transfer Protocol (ESMTP). ESMTP was revised in RFC 1651 (1994), RFC 1869 (1995), RFC 2821 (2001), and most recently in RFC 5321 (2008). The basic idea behind ESMTP is that the client greets the server with the “extended hello” command EHLO instead of the old “hello” command HELO. This indicates to the server that the client understands ESMTP. The server responds with all the SMTP extensions it supports. For the rest of the session, the client can then make use of the server’s advertised capabilities.
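As a rough illustration of how a client might learn the server's capabilities, the following sketch parses the multiline reply to EHLO. It assumes the usual reply format in which all lines but the last start with `250-` and the last line starts with `250 `; the function name is made up.

```python
def parse_ehlo_response(lines):
    """Extract the advertised extensions from a multiline 250 reply.

    The first line is the greeting of the server. Each of the remaining
    lines names one extension, optionally followed by parameters.
    """
    extensions = {}
    for line in lines[1:]:
        code, separator, rest = line[:3], line[3:4], line[4:]
        if code != "250" or separator not in ("-", " "):
            raise ValueError(f"Unexpected reply line: {line!r}")
        keyword, *parameters = rest.split()
        extensions[keyword.upper()] = parameters
    return extensions

reply = [
    "250-submission.example.org at your service, localhost",
    "250-ENHANCEDSTATUSCODES",
    "250 AUTH PLAIN",
]
print(parse_ehlo_response(reply))
```

A client could then check `"AUTH" in extensions` before attempting to authenticate.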

ESMTP tool

Let’s put theory into practice. The following tool generates the command sequence to submit or relay an email with parameters of your choice. One way of using the tool is simply to observe how parameter changes affect the protocol flow. The reason why I built this tool, however, is that you can copy the commands to your command-line interface and send messages without the assistance of a mail client.

Since you shouldn’t enter your email password on a random website like this one, I recommend that you use the mode for submission only with demo accounts which you’ve created for this purpose. The password is stored in the local storage of your browser without any protections until you erase the history. Having said that, the tool is open source like the rest of this website and if you don’t trust me that this website is served from those files, you can also build and run this website locally. The tool uses Thunderbird’s database and Google’s DNS API to resolve the server you want to connect to and the API by ipinfo.io to determine your IP address when you click on Determine next to the Client field.

The text in gray mimics what the responses from the server likely look like. What you actually receive from the server will be different. As long as the returned status code starts with a 2 or a 3, you should be fine. If the returned status code starts with a 4 or a 5, something went wrong. I list some ideas for things you can try out after the tool. The boxes after that provide you with more information on various aspects, which are useful for troubleshooting problems you might run into. If you need more help, send me an email (probably with your mail client rather than with this tool). 🙂

Tool parameters: Mode, Security, Recipients, Domain, Server, Port, Client, Pipelining, Username, Credential, Password, Challenge, From, To, Cc, Bcc, Subject, Date, Identifier, Content, Body.
$ openssl s_client -quiet -crlf -connect submission.example.org:465
220 submission.example.org ESMTP Implementation
EHLO localhost
250-submission.example.org at your service, localhost
250-ENHANCEDSTATUSCODES
250 AUTH PLAIN
AUTH PLAIN AGFsaWNlQGV4YW1wbGUub3JnAHBhc3N3b3Jk
235 2.7.0 Authentication successful
MAIL FROM:<[email protected]>
250 2.1.0 Ok
250 2.1.5 Ok
250 2.1.5 Ok
250 2.1.5 Ok
DATA
354 End data with <CR><LF>.<CR><LF>

From: Alice <[email protected]>
To: Bob <[email protected]>
Cc: Carol <[email protected]>
Subject: Yet another message
Date: Sat, 08 May 2021 07:31:38 +0000
Message-ID: <[email protected]>

Hello, It's me. Again. Alice
.

250 2.0.0 Ok
QUIT
221 2.0.0 Bye
Tool instructions
  1. Create a new account at an email service provider of your choice. If you opt for Gmail, you should read this box first.
  2. Enter the address of your account in the From field and your password in the Password field. Set the Mode to Submission.
  3. After composing the message (To, Subject, and Body), try to submit it to the outgoing mail server with the listed commands.
  4. The first line opens a TLS channel to the specified Server. All other commands are sent to the server inside this channel.
  5. You can copy each line in bold to your clipboard by clicking on it, which includes the newline character to submit the command.
  6. If the mail was submitted successfully, you can add more To or Cc recipients. By copying only some of the generated RCPT TO commands but the full message, you suppress the delivery of the message to the skipped recipients. For those that receive the message, it looks as if the message was delivered to all the recipients in the message. I already mentioned this problem above.
  7. Besides faking recipients, you can also try to fake the sender. Switch the mode from Submission to Relay and change the From field to an address that you don’t own. Now try to send the message directly to the incoming mail server of one of the recipients. If the incoming mail server and the domain which you try to send the email from are properly configured, your message should make it at best into the spam folder of the recipient. Chances are that your message will be rejected during the SMTP session or silently dropped thereafter. The incoming mail server might also graylist or blacklist your IP address. Since you usually don’t relay email from your computer, this is nothing to worry about. Forging the sender address is known as spoofing. Be careful which domains you try to impersonate. If the domain owner configured a DMARC record, they might be informed about your spoofing attempt and even receive the content of your message.

Important: Be a nice person and don’t scam others! If you spoof the sender of an email in bad faith, you likely commit a crime in most countries. I showed you this attack for educational purposes only because I believe that seeing is believing. We can improve the state of email security only if consumers start demanding better security. In this spirit, I encourage you to relay spoofed emails only to your own mailbox. If such a spoofed email lands in your inbox, ask your email service provider to be more rigorous in filtering scam emails or use the service of a different provider. You’re hopefully also more motivated now to read the rest of this article. In short, have fun with the above tool but always remember that with great power comes great responsibility!

Tool explanations

Command-line interface

Clipboard verification

How to watch your clipboard in your command-line interface. Press “control c” to exit the program.

OpenSSL versus LibreSSL

Common SMTP extensions

A transcript of a session with the outgoing mail server of Gmail when using Implicit TLS. [Brackets indicate redacted information.]

Backward compatibility

STARTTLS extension

The STARTTLS extension is listed when we connect without TLS to Gmail’s outgoing mail server.

When using -starttls smtp, openssl starts with a TCP connection and upgrades it to a TLS connection by issuing the STARTTLS command.

User authentication

Gmail authentication failure

Reverse DNS entry

Newline characters

Message termination

Origination date

Spoofed sender during submission

Limitations of the above tool

Other SMTP commands

These commands can be used at any time during a session. VRFY and EXPN are usually disabled for security reasons.

What a response to the VRFY command usually looks like. (The reply code of a disabled VRFY command should be 252, though.)

How Gmail responds to the HELP command. 😄

Automatic responses

In certain configurations, mail servers send a message in response to an incoming message, which leads to the following problems.

Mail loops

Bounce messages

[Diagram: mail client, outgoing mail server, and incoming mail server, connected by the steps 1. Submit, 2. Report, and 3. Retrieve.] How the user learns about a delivery failure. If the delivery of a message fails on the recipient’s side, the bounce message (in red) is generated by a different system.

Backscatter

Password-based authentication mechanisms

The following boxes focus on password-based authentication mechanisms, which allow users to authenticate themselves to servers with only their username and password. Due to the nature of the topic, some of the later information boxes are fairly advanced. If you’re not interested in cryptography, you may want to skip them.

Dangerous reliance on TLS

[Diagram: client, proxy, server.] While the communication between the client and the proxy is protected (indicated by the blue lines), the communication between the proxy and the server is exposed in the company’s private network.

[Diagram: client, attacker, server.] The client is misconfigured and connects directly to the attacker, who forwards all communication without raising any suspicion.

[Diagram: client and two servers.] The malicious server (in red) has the same name as the legitimate server (in green). An attacker can impersonate the server by getting a certificate issued for their public key or by gaining access to the private key of the server and using the original server certificate.

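Cryptographic hash functions

[Diagram: An input of any size is mapped to an output of fixed size. The direction from input to output is labeled “Efficient”, the reverse direction “Infeasible”.]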

Cryptographic hash functions are efficient to compute and infeasible to invert. For the same hash function, the same input always maps to the same output. The output is also called image and the corresponding input its preimage.
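The properties in the previous paragraph are easy to observe with Python's standard `hashlib` module. SHA-256 is used here merely as one example of a cryptographic hash function; the helper name is made up.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

short_input = b"Hello"
long_input = b"Hello" * 1_000_000

# Inputs of any size map to outputs of the same fixed size (64 hex digits).
print(len(sha256_hex(short_input)), len(sha256_hex(long_input)))

# The same input always maps to the same output.
print(sha256_hex(short_input) == sha256_hex(b"Hello"))

# A slightly different input results in a completely different output.
print(sha256_hex(b"Hello"))
print(sha256_hex(b"hello"))
```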

[Diagram: find an input for a given output (preimage resistance).] In these graphics, the given values are displayed in blue and the values to find in green.

[Diagram: given input 1, find a different input 2 with the same output (second-preimage resistance).] Knowing one input may not be useful to find another input which hashes to the same output.

[Diagram: find two different inputs 1 and 2 with the same output (collision resistance).] Due to the birthday paradox and attack, this is a stronger requirement than the previous one.

Secure Hash Algorithms (SHA)

Salts against pooled brute-force attacks

Nonces against replay attacks

Applications of cryptographic hash functions

  • [Diagram: A content producer gives the file to a storage provider and the hash to the consumer, who checks hash(File) = Hash.] As long as you get the hash from a trusted source, the delivery of the file can be outsourced to an untrusted third party.
  • [Diagram: The client of the user sends the password to the server of the provider, which loads the salt and the hash from the database of the provider and checks hash(Password + Salt) = Hash.] A server can reduce the damage of a leaked database by storing individually salted hashes instead of passwords.
  • [Diagram: A password is turned into a cryptographic key by repeated hashing (pbkdf).] By hashing an input repeatedly, you can turn an efficient hash function into an inefficient one.
  • [Diagram: Value1: hash(1 + Seed), Value2: hash(2 + Seed), ValueX: hash(X + Seed).] Values can easily be derived from a seed (in green) but they cannot be related to one another (in red).
  • [Diagram: Alice sends hash(CoinFlipAlice + Nonce) to Bob, Bob sends CoinFlipBob to Alice, and Alice reveals CoinFlipAlice and Nonce.] If CoinFlipAlice = CoinFlipBob, Alice wins. If CoinFlipAlice ≠ CoinFlipBob, Bob wins.
  • [Diagram: The client sends ClientMessage, hmac(Key, ClientMessage); the server sends ServerMessage, hmac(Key, ServerMessage).] A message authentication code is appended to each message.
  • [Diagram: Node1: hash(Leaf1), Node2: hash(Leaf2), Node3: hash(Leaf3), Node4: hash(Leaf4), Node5: hash(Node1 + Node2), Node6: hash(Node3 + Node4), Root: hash(Node5 + Node6).] In order to verify that the green leaf is included in the root, a verifier needs to know only the hashes and positions of the blue nodes.
  • [Diagram: hash(Message + 1), hash(Message + 2), hash(Message + 3), hash(Message + X).] Finding a nonce which makes the hash of a message fall into a certain range requires many attempts.
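To illustrate the hash tree from the list above, here is a small sketch which computes the root as hash(Node5 + Node6) and verifies the inclusion of a leaf from sibling hashes and their positions. SHA-256, the function names, and the proof format are my assumptions; real systems usually also distinguish leaves from inner nodes when hashing.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Compute the root for a number of leaves which is a power of two."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("The number of leaves has to be a power of two.")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        # Each parent is the hash of its two concatenated children.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_inclusion(leaf, proof, root):
    """Check that the leaf is included in the root.

    proof: (sibling hash, sibling is on the left) pairs from bottom to top.
    """
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

leaves = [b"Leaf1", b"Leaf2", b"Leaf3", b"Leaf4"]
root = merkle_root(leaves)
node5 = h(h(b"Leaf1") + h(b"Leaf2"))
# To verify Leaf3, only the hashes of Leaf4 (Node4) and Node5 are needed.
proof = [(h(b"Leaf4"), False), (node5, True)]
print(verify_inclusion(b"Leaf3", proof, root))
```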

Exclusive-or operation for perfect encryption

The truth table of the exclusive-or operation. The output is 1 if and only if the two inputs are unequal.

[Diagram: Alice encrypts the plaintext with the key, the ciphertext is sent to Bob, and Bob decrypts it with the same key.] Eve has access to the ciphertext and knows the algorithms in blue, while the information in green is known only to Alice and Bob.
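A minimal sketch of encryption with the exclusive-or operation: Since applying the same key twice cancels out, encryption and decryption are the same function. The sketch assumes that the key is random, as long as the message, and used only once; the function name is made up.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Combine the data with the key bit by bit using exclusive-or."""
    if len(key) != len(data):
        raise ValueError("The key has to be exactly as long as the data.")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"Hello Bob"
key = secrets.token_bytes(len(plaintext))  # Known only to Alice and Bob.

ciphertext = xor_bytes(plaintext, key)  # What Eve gets to see.
decrypted = xor_bytes(ciphertext, key)  # Bob applies the same key again.
print(decrypted == plaintext)
```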

Desirable properties of authentication mechanisms

A comparison between password-based authentication mechanisms. The three symbols used in the comparison mean, respectively, that the authentication mechanism is resistant to the attack, that the resistance depends on choices made by programmers, or that the authentication mechanism is vulnerable to the attack.

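Challenge-Response Authentication Mechanism (CRAM)

[Diagram: The client connects, the server sends a challenge, and the client returns a response.] How challenge–response authentication works.

[Diagram: The client connects to the attacker, who connects to the server. The attacker passes the challenge on to the client and the response on to the server.]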

A man-in-the-middle attack on a challenge-response authentication mechanism.
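The following sketch shows the general idea of challenge-response authentication with a keyed hash: The password itself is never transmitted, and a fresh challenge makes old responses useless. CRAM-MD5 uses HMAC-MD5 for this purpose; the sketch uses HMAC-SHA-256 as a stand-in, and all function names are assumptions.

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """Server: generate a fresh, unpredictable challenge."""
    return secrets.token_hex(16).encode()

def respond(password: bytes, challenge: bytes) -> str:
    """Client: prove knowledge of the password without revealing it."""
    return hmac.new(password, challenge, hashlib.sha256).hexdigest()

def verify(password: bytes, challenge: bytes, response: str) -> bool:
    """Server: recompute the response and compare in constant time."""
    return hmac.compare_digest(respond(password, challenge), response)

challenge = make_challenge()
response = respond(b"password", challenge)
print(verify(b"password", challenge, response))         # Correct password.
print(verify(b"password", make_challenge(), response))  # Replayed response.
```

Note that the server has to know the password (or an equivalent secret) to recompute the response, and that nothing in this sketch stops an attacker from simply forwarding the challenge and the response as in the diagram above.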

Salted Challenge-Response Authentication Mechanism (SCRAM)

[Diagram: client and server, connected by an inner and an outer channel.] Mutual authentication guarantees only that the inner channel (in green) reaches the counterparty. Channel binding can be used to ensure that the outer channel (in blue) isn’t interrupted by an attacker.

[Diagram: messages between client and server, in order: Username, ClientNonce; ServerNonce, Salt, IterationCount; KeyXorHashedKeyMac; KeyMac.] The sequence diagram of Simplified-SCRAM.

TLS channel bindings (SCRAM-PLUS)
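As a rough companion to the sequence diagram, here is a sketch of how the client's proof could be computed and verified. It follows the structure of SCRAM in RFC 5802 (salted password, client key, stored key, proof as an exclusive-or) but is heavily simplified: The server's proof towards the client is omitted, the authentication message is just an opaque byte string, and the function names are assumptions.

```python
import hashlib
import hmac

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def derive_client_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Stretch the salted password and derive the client key from it."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.new(salted, b"Client Key", hashlib.sha256).digest()

def stored_key_for(password: str, salt: bytes, iterations: int) -> bytes:
    """What the server stores: the hash of the client key, not the password."""
    return hashlib.sha256(derive_client_key(password, salt, iterations)).digest()

def client_proof(password, salt, iterations, auth_message: bytes) -> bytes:
    client_key = derive_client_key(password, salt, iterations)
    stored_key = hashlib.sha256(client_key).digest()
    signature = hmac.new(stored_key, auth_message, hashlib.sha256).digest()
    return xor(client_key, signature)  # Hides the client key from observers.

def server_verify(stored_key: bytes, auth_message: bytes, proof: bytes) -> bool:
    signature = hmac.new(stored_key, auth_message, hashlib.sha256).digest()
    client_key = xor(proof, signature)  # Recover the client key.
    return hmac.compare_digest(hashlib.sha256(client_key).digest(), stored_key)

salt, iterations = b"hypothetical-salt", 4096
stored = stored_key_for("password", salt, iterations)
auth_message = b"alice,client-nonce,server-nonce"  # Made-up format.
proof = client_proof("password", salt, iterations, auth_message)
print(server_verify(stored, auth_message, proof))
```

Since the nonces of both parties enter the authentication message, a recorded proof fails verification in a later session.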

Authentication on the Web

Password-authenticated key exchange (PAKE)

Access protocols

Besides proprietary protocols, most incoming mail servers allow mail clients to access the user’s mailbox with POP3 or IMAP. If your mail client and your mail server support both protocols, you should choose the latter as it’s much more powerful. The main reason for including POP3 in this article is that it’s much easier to use from the command line.

Communication logging in Thunderbird

Post Office Protocol Version 3 (POP3)

The Post Office Protocol Version 3 (POP3) is specified in RFC 1939. Similar to ESMTP, POP3 is a text-based application-layer protocol, which can be used with Implicit TLS or with Explicit TLS. POP3 with Implicit TLS is also known as POP3S. Just like SMTP, POP3 commands consist of four letters and an extension mechanism was introduced after the initial release of the standard. After authenticating the user, POP3 allows the client to list, retrieve, and delete messages. POP3 is designed to move messages from a remote queue into a local queue. It doesn’t support read statuses, mailbox folders, message uploads, or partial fetches.

The following POP3 tool works in the same way as the ESMTP tool above. Most of the remarks I made earlier therefore still apply. In particular, I advise you to use it only with accounts created for this purpose. The tool uses Thunderbird’s configuration database and Google’s DNS API to resolve the server you want to connect to. Copy the commands in bold to your command-line interface by clicking on them. The text in gray mimics what the responses from the server look like. The actual responses will be different. Each response starts with either +OK or -ERR. The former indicates that your command was successful, the latter indicates that an error occurred. If necessary, you can always kill the current process and thereby the connection by pressing ^C (control + c). If you use Gmail, you have to enable POP3 access in your account settings and allow access from insecure apps.

Tool parameters: Address, Password, Security, Server, Port, Username, Credential, Challenge, Delete.
$ openssl s_client -quiet -crlf -connect pop3.example.org:995
+OK Implementation ready.
PASS password
+OK Logged in.
LIST
+OK 1 messages:
RETR 1
+OK Message follows:
{HeaderAndBody}
QUIT
+OK Logging out.
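Since each POP3 response starts with either +OK or -ERR, a client can decide with very little code whether a command succeeded. A minimal sketch (the function name and the error text are made up):

```python
def parse_pop3_status(line: str):
    """Return (succeeded, text) for the first line of a POP3 response."""
    if line.startswith("+OK"):
        return True, line[3:].strip()
    if line.startswith("-ERR"):
        return False, line[4:].strip()
    raise ValueError(f"Not a POP3 status line: {line!r}")

print(parse_pop3_status("+OK Logged in."))
print(parse_pop3_status("-ERR Authentication failed."))
```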

POP3 commands

The mandatory commands of POP3. (The USER and PASS commands are strictly speaking optional.)

POP3 extensions

Some optional commands of POP3. These commands extend the basic functionality of POP3. Unlike the message numbering, the IDs are guaranteed to stay the same across sessions.

The server can indicate additional behavior in its response to the CAPA command. LOGIN-DELAY and EXPIRE allow the server to conserve its resources.

Just like STARTTLS, STLS is advertised only in TCP connections. Gmail doesn’t support POP3 with Explicit TLS but Gandi does.

APOP authentication

Internet Message Access Protocol (IMAP)

The Internet Message Access Protocol (IMAP) is specified in RFC 3501. IMAP works similarly to ESMTP and POP3 but has many more commands and options. An IMAP mailbox acts as a remote drive for messages instead of files, where the drive is being shared among several clients. IMAP allows users to create, delete, and rename folders, to upload and move messages between them, to mark messages as read or as flagged, to search the mailbox remotely, and to download messages without their attachments.

The following IMAP tool works just like the ESMTP and POP3 tools above. As you might mess up your mailbox or delete messages you still wanted by accident, you should run the following commands on test accounts only. If you want to use your real account, you do so at your own risk. Certain commands have side effects, such as marking messages as read. Make sure you fully understand a command before using it. This tool also uses Thunderbird’s configuration database and Google’s DNS API to resolve the server you want to connect to. Neither IMAP nor the tool is self-explanatory. You can find more information in the tooltips and the boxes below.

Tool parameters: Address, Password, Security, Server, Port, Username, List, Write, Fetch, Search, Messages, Criterion, Value, Date, Delete, Append, Idle.
$ openssl s_client -quiet -crlf -connect imap.example.org:993
* OK Implementation ready
I LOGIN [email protected] password
I OK LOGIN completed
E EXAMINE "INBOX"
* 4 EXISTS
* 2 RECENT
* OK [UNSEEN 3]
* OK [UIDNEXT 5]
* OK [UIDVALIDITY 1]
* OK [PERMANENTFLAGS ()]
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
E OK [READ-ONLY] EXAMINE completed
F FETCH 4 ((BODY[]))
* 4 FETCH ((BODY[]) {123}
{Data}
)
F OK FETCH completed
O LOGOUT
* BYE Logging out
O OK LOGOUT completed

After the initial greeting by the server, the client sends commands, to which the server responds. Since multiple commands can be in progress at the same time, the client tags each command with a unique identifier, such as A, B, C, or a dot (.). The server prefixes each line of its response with * and completes its response with a line which starts with the tag chosen by the client. The tag is followed by a status response: OK for success, NO for failure, or BAD for protocol errors. Don’t worry about reusing tags in a single session: You can run a command repeatedly with the same tag. If you want to fetch another message, for example, just enter another message number and copy the generated command again. If you use Gmail, you have to enable IMAP access in your account settings and allow access from insecure apps.
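The tagging scheme can be sketched as follows: A client collects the lines it receives until it encounters the line which starts with the tag of its command. The function name is made up, and the sketch deliberately ignores literals (such as the message data in a FETCH response), whose content could by coincidence start with the tag.

```python
def read_response(lines, tag):
    """Collect server lines until the completion line for the given tag.

    Returns (status, text, untagged lines).
    """
    untagged = []
    for line in lines:
        if line.startswith(tag + " "):
            status, _, text = line[len(tag) + 1:].partition(" ")
            if status not in ("OK", "NO", "BAD"):
                raise ValueError(f"Unknown status response: {status!r}")
            return status, text, untagged
        untagged.append(line)
    raise ValueError("The response was not completed by a tagged line.")

response = [
    "* 4 EXISTS",
    "* 2 RECENT",
    "* OK [UNSEEN 3]",
    "E OK [READ-ONLY] EXAMINE completed",
]
status, text, untagged = read_response(response, "E")
print(status, text, len(untagged))
```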

Protocol states

[Diagram: the states Not authenticated, Authenticated, Selected, and Logout, with the transitions LOGIN and AUTHENTICATE, UNAUTHENTICATE, SELECT and EXAMINE, CLOSE and UNSELECT, and LOGOUT.] The protocol states and how to transition between them. LOGOUT can be called in any state except the logout state. UNAUTHENTICATE can also be called in the selected state.

Data formats

Message numbers

Message sets

Message flags

How a custom flag can be created and set if the IMAP server supports it.

Internal date

How to fetch when a message was received.

IMAP commands

IMAP extensions

JSON Meta Application Protocol (JMAP)

Over the last forty years, email in general and IMAP in particular became a patchwork of extensions. Given the complexity and the varying support of these extensions, writing a mail client is much more difficult than it should be. While there are efforts to unify the patchwork somewhat, there has also been a fresh start over the last couple of years. An IETF working group designed a modern protocol for client to server interaction: The JSON Meta Application Protocol (JMAP). JSON itself stands for JavaScript Object Notation, which is a popular format for storing and exchanging human-readable data. JMAP is specified in RFC 8620 and it can be used for more than just email. The data model for synchronizing email is specified in RFC 8621. If you don’t like the RFC formatting, you can also read the two standards here and here.

JMAP is designed to be interoperable with IMAP mailboxes and thus shares the concepts of folders and flags with IMAP. The protocol itself, however, is completely new and addresses the following shortcomings of IMAP (and message submission):

  • Permanent identifiers: JMAP servers assign permanent identifiers to all objects. In the case of messages, these identifiers can no longer be invalidated and they no longer change when a message is moved from one folder to another. In the case of folders, JMAP clients can detect when a folder has been renamed and no longer need to fetch all the messages in it again.
  • Efficient synchronization: JMAP provides a simple method for getting the identifiers of created, updated, and destroyed messages and folders. As we have seen above, synchronizing a mailbox with IMAP is easy only if you stay connected to the server, which isn’t an option for mobile clients.
  • Push mechanism: In order to be informed immediately about changes to a folder, such as newly arrived messages, IMAP clients use the IDLE command. If they want to be informed about changes to several folders, they have to open a separate connection for each folder. JMAP, on the other hand, allows clients to subscribe to all changes on the server at once. Clients which can keep a connection to the server open can subscribe via the EventSource interface. Other clients, such as those on mobile phones, can register a callback URL, which allows them to use their platform-specific push technology.
  • Batching of chained commands: When the IMAP server doesn’t support certain extensions such as SEARCHRES, IMAP clients often need to wait for the response to one command before they can construct the followup command. JMAP allows clients to batch several commands and to reference the results from earlier commands in the same request. Doing so avoids round trips and makes updates more atomic (i.e. it becomes less likely that only some of the issued commands are being executed).
  • Widespread data format: JMAP data doesn’t have to be encoded as JSON and future standards can specify other data formats. The same is true for the transport protocol: While JMAP currently uses HTTPS as its transport protocol, other protocols can be added in the future. The choice of JSON and HTTPS is mostly due to their widespread adoption: There are suitable libraries for all relevant programming languages and software engineers know how to use those. It’s worth mentioning that JMAP doesn’t wrap binary data in JSON. Binary data is exchanged in separate connections.
  • Complexity on server: JMAP moves the complexity of handling email’s message format from the client to the server. While clients can still fetch the raw message if needed, for example when implementing end-to-end security, the server has to deal with multipart messages, content encodings, line-length limits, etc. Clients can download and upload messages as a simple JSON object. Please note that this affects neither how messages are stored on servers nor how they are relayed to others. It just relieves programmers who want to integrate email from having to take care of encoding and decoding messages correctly.
  • Message submission: The previous point only makes sense if clients can also submit messages for delivery in the same format. If the JMAP server supports submission, a client can instruct it to send a stored message to its recipients. The client can generate the envelope itself or let the server do it. By first storing the message as a draft and then moving it to the sent folder after sending it (see this example), JMAP also solves the double-submission problem.
  • Flood control: Since it’s not always possible to anticipate how much data the server will send back, JMAP lets clients restrict the size of responses. This feature is especially valuable on devices with limited bandwidth or expensive roaming.
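To give an impression of the batching of chained commands, the following sketch builds a JMAP request in which the second method call references the result of the first one. The structure follows RFC 8620 and RFC 8621 as far as I can tell, but the account identifier, the filter, and the chosen properties are placeholders, and the function name is made up.

```python
import json

def build_request(account_id, sender_address, limit=10):
    """Query messages from a sender and fetch them in the same request."""
    return {
        "using": ["urn:ietf:params:jmap:core", "urn:ietf:params:jmap:mail"],
        "methodCalls": [
            # First call: find the identifiers of the matching messages.
            ["Email/query", {
                "accountId": account_id,
                "filter": {"from": sender_address},
                "limit": limit,
            }, "c0"],
            # Second call: "#ids" refers to the "/ids" result of call "c0".
            ["Email/get", {
                "accountId": account_id,
                "#ids": {"resultOf": "c0", "name": "Email/query", "path": "/ids"},
                "properties": ["subject", "receivedAt"],
            }, "c1"],
        ],
    }

request = build_request("hypothetical-account-id", "alice@example.org")
print(json.dumps(request, indent=2))
```

The client would send this JSON document in a single HTTPS request, so the server can resolve the back-reference without an additional round trip.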

Support for JMAP is still quite rare, which is not surprising given that the standard was published only in 2019. We have yet to see whether it will become a relevant protocol for accessing one’s mailbox. I certainly hope so, but email is really resistant to innovation.

Email filtering

It can be useful to filter incoming messages according to custom rules. For example, you may want to move certain messages to a certain folder, mark certain messages as read, or delete certain messages automatically. Most mail clients allow their users to configure such rules, which are executed when the mail client receives a new message. There are several advantages of filtering incoming mail on the server rather than on the client, though:

  • Synchronization: If the filtering rules are stored on the incoming mail server, they can be inspected and edited through any of the user’s mail clients. Otherwise, users have to remember on which client they’ve created the rule that they want to modify now.
  • No race conditions: If the filtering rules are stored on a mail client, then the rules are not applied when this mail client is offline. In this situation, other mail clients see unfiltered messages. If these mail clients apply rules of their own, you might run into race conditions, where the order in which clients see incoming messages determines the outcome of the filtering.
  • Rules for absence: Some rules, such as sending out-of-office replies, shall run precisely when all mail clients are offline. This is not possible when the rules must be executed by mail clients.
  • Rejection during delivery: Unlike clients, incoming mail servers can reject a message during its delivery. By sending the 550 response code during the SMTP session, the incoming mail server can inform the sender about the rejection without causing backscatter with bounce messages.

To achieve server-side filtering, we need a standardized mail filtering language and a standardized filter management protocol.

[Diagram: mail client of sender, outgoing mail server of sender, incoming mail server of recipient, offline mail client of recipient, online mail client of recipient.] How a message is delivered from the mail client of the sender to the mail clients of the recipient. Messages can be filtered by the incoming mail server (in green) or by an online mail client (in blue).

Mail filtering language (Sieve)

Sieve is a language for filtering messages on the incoming mail server. It is specified in RFC 5228 and it is fairly simple: Using the control commands if, elsif, and else, you can specify under which conditions a specific action shall be applied. You can find plenty of examples throughout the RFC as well as here and here. There are just a couple of things you should know to understand them:

  • Arguments: Most commands in the Sieve language take arguments. Mandatory arguments are determined by their position, optional arguments are identified by a colon followed by their name. Some optional arguments can take arguments themselves: :name value. This is similar to arguments in the command-line interface but with : instead of - before the name. When optional arguments are not provided, their default values are used instead.
  • Extensions: The Sieve language is extensible. A script has to list the extensions which it uses at the top of its code with require.
  • Implicit keep: Each message is stored in the inbox unless it is moved to a folder, forwarded to an address, or discarded explicitly.
  • String lists: Wherever a list of strings is expected, such as ["To", "Cc"], a string without brackets, such as "To", can be used.
  • Prefix notation: Commands and arguments are nested not with parentheses but by earlier tokens consuming later ones. For example, the negation of the condition exists "Date" is not exists "Date". This is similar to the prefix notation.
  • Comments: If you use # outside of double quotes, the incoming mail server ignores all characters including this one until the end of the line. Comments which span less or more than a line have to be enclosed in /* and */.
  • No loops: The Sieve language doesn’t support loops. Each block is executed once or not at all.

You can generate simple filtering rules with the following tool. Make sure that the Argument makes sense for the chosen Action. Move requires the name of a folder, Forward an email address, Flag the name of a flag, and Reply the text of the reply.

Tool parameters: Condition, Address part, Match type, Value, Negation, Action, Argument.
require "imap4flags";
if header :contains "Subject" "Test" {
    addflag "\\Seen";
}
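A mail client with a graphical rule editor has to generate such scripts from the user's choices. The following sketch produces the rule above from a header name, a substring, and a flag. The function names are assumptions; the escaping covers backslashes and double quotes in quoted strings.

```python
def sieve_string(value: str) -> str:
    """Quote a value as a Sieve string, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

def flag_rule(header: str, substring: str, flag: str) -> str:
    """Generate a script which flags messages whose header contains a value."""
    return "\n".join([
        'require "imap4flags";',  # The addflag action is an extension.
        f"if header :contains {sieve_string(header)} {sieve_string(substring)} {{",
        f"    addflag {sieve_string(flag)};",
        "}",
    ])

# "\\Seen" is the Python literal for the flag \Seen.
print(flag_rule("Subject", "Test", "\\Seen"))
```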

Users don’t have to learn the Sieve language. Mail clients can offer a graphical user interface (GUI) similar to the tool above, where users don’t have to see the generated code. You can find a list of all the extensions to the Sieve mail filtering language on Wikipedia.

Out-of-office replies

A simple Sieve script for an automatic vacation response, which I’ve adapted from Gandi.

Support by email service providers and mail clients

Gmail users can go to the Filters and Blocked Addresses tab of their settings and click on “Create a new filter”. While Gmail has an API for managing filters, other mail clients won’t support such a proprietary protocol.

Filter management protocol (ManageSieve)

ManageSieve is a protocol for managing Sieve scripts remotely. It is specified in RFC 5804 and works similarly to the protocols we have seen so far. After an initial greeting from the server, the client sends commands to which the server responds. Just like IMAP, responses are completed with a line which starts with OK or NO; but unlike IMAP, the commands are not preceded with a tag. Just like IMAP, multiline strings are prefixed with their length; but unlike IMAP, the client can include a plus to continue with the string without having to wait for a continuation response from the server. Just like SMTP for Relay, there’s no variant of ManageSieve which can be used with Implicit TLS. The server sends its capabilities automatically in its greeting and after successful STARTTLS and AUTHENTICATE commands. As part of the capabilities, the server indicates which extensions to the Sieve language and which SASL mechanisms it supports. According to RFC 5804, ManageSieve servers have to support PLAIN over TLS and SCRAM-SHA-1.
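The length-prefixed strings mentioned above can be sketched as follows. The client announces the number of bytes in curly brackets, adds a plus to indicate that it won't wait for the server, and sends the script on the next line. The function name is made up, and the script name is assumed to contain no characters which would require escaping.

```python
def putscript_command(name: str, script: str) -> bytes:
    """Encode a PUTSCRIPT command with the script as a length-prefixed string."""
    data = script.encode("utf-8")
    # The length counts bytes, not characters.
    header = f'PUTSCRIPT "{name}" {{{len(data)}+}}\r\n'.encode("utf-8")
    return header + data + b"\r\n"

print(putscript_command("MyScript", "keep;"))
```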

The following tool shows you how to use the ManageSieve commands from your command-line interface. Unlike the previous tools, you have to configure the address and the port number of the server manually as this information is not included in Thunderbird’s configuration files. The standard describes how to locate the ManageSieve server with SRV records and the autoconfiguration tool above does query the _sieve._tcp subdomain. However, since virtually no one configures such SRV records (at least not for the ManageSieve protocol), I didn’t bother to implement this discovery mechanism here. ManageSieve servers listen on port 4190 by default. The Thunderbird plugin, which I mentioned earlier, simply probes this port on the IMAP server in order to configure itself.

Important: Since LibreSSL doesn’t support the ManageSieve STARTTLS command, you have to use OpenSSL (see the boxes below).

$ openssl s_client -quiet -crlf -starttls sieve -connect sieve.example.org:4190
"IMPLEMENTATION" "{NameAndVersion}"
"NOTIFY" "{Methods}"
"SASL" "PLAIN"
"SIEVE" "{Extensions}"
"VERSION" "1.0"
OK "STARTTLS completed."
AUTHENTICATE "PLAIN" "AGFsaWNlQGV4YW1wbGUub3JnAHBhc3N3b3Jk"
OK "AUTHENTICATE completed."
CHECKSCRIPT {62+}
require "body";
if body :contains "Test" {
    discard;
}
OK "CHECKSCRIPT completed."
PUTSCRIPT "MyScript" {62+}
require "body";
if body :contains "Test" {
    discard;
}
OK "PUTSCRIPT completed."
SETACTIVE "MyScript"
OK "SETACTIVE completed."
LOGOUT
OK "LOGOUT completed."

Explanation: While you can have multiple scripts on the server, at most one of them can be active. You cannot delete the active script. You can deactivate the active script by activating another script or by using an empty script name to set no script active.
You can also generate the argument to PLAIN yourself with echo -ne '\0000username\0000password' | openssl base64.
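
As a rough illustration, the two values which a client has to compute in the session above can be sketched in Python. The function names are made up for this example. The sketch assumes that the script is laid out on four CRLF-terminated lines with a four-space indent, which yields exactly the 62 bytes announced in the transcript.

```python
import base64


def sasl_plain_argument(username: str, password: str, authzid: str = "") -> str:
    """Return the Base64-encoded argument for AUTHENTICATE "PLAIN"."""
    # PLAIN joins the (usually empty) authorization identity, the username,
    # and the password with NUL bytes before the Base64 encoding is applied.
    raw = "\0".join([authzid, username, password]).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def literal(script: str) -> bytes:
    """Prefix a script with its length in bytes: {length+} CRLF data."""
    data = script.encode("utf-8")
    # The plus sign tells the server that the client continues immediately.
    return b"{%d+}\r\n" % len(data) + data


# Assumed layout of the example script (hypothetical line breaks and indent).
script = 'require "body";\r\nif body :contains "Test" {\r\n    discard;\r\n}\r\n'

print('AUTHENTICATE "PLAIN" "%s"' % sasl_plain_argument("alice@example.org", "password"))
print("PUTSCRIPT \"MyScript\" " + literal(script).split(b"\r\n", 1)[0].decode("ascii"))
```

The length counts bytes rather than characters, which is why the script is encoded before it is measured.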

LibreSSL doesn’t support ManageSieve

How to install OpenSSL on macOS

Format

The format of an email message is specified in RFC 5322. The goal of this chapter is to make you comfortable reading raw messages.

How to display the raw message

Mail clients don’t display all header fields by default. Here is how you can display the raw message as it arrived in your mailbox:

  • Gmail: Open a message, click on ⋮ in the upper right corner, then on “Show original”.
  • Yahoo: Open a message, click on ⋯ in the bottom middle, then on “View raw message”.
  • Outlook:
    • Web: Click on ⋯ in the upper right corner, then on “View” and “View message source”.
    • Desktop: Double-click a message, click on the “File” menu and then select “Properties”.
  • Thunderbird:
    • Raw message: Select a message, click on the “More” button and then “View Source” (or use ⌘U).
    • All header fields: Click on the “View” menu, then on “Headers” and “All” (or on “Normal” to go back).
  • Apple Mail:
    • Raw message: Click on the “View” menu, then on “Message” and “Raw Source” (or use the shortcut ⌥⌘U).
    • All header fields: Click on the “View” menu, then on “Message” and “All Headers” (or use the shortcut ⇧⌘H).
    • Change preferences: In the “Viewing” tab of the preferences, you can configure which header fields are displayed.

File format

Since messages, including attachments, are just text, they can be stored as simple text files. A common filename extension for emails is .eml. Such files can be viewed with any text editor. Desktop clients usually have an option to save a message as a file, and among Web clients, at least Gmail allows you to download a message in the “⋮” menu, which is located in the upper right corner.

Storage format

Apple Mail storage format

Line-length limit

According to RFC 5322, each line of a message may consist of at most 1’000 ASCII characters, including CR + LF. Implementations are free to accept longer lines, but since some implementations cannot handle longer lines, you shouldn’t send them. The RFC even recommends limiting lines to 80 characters to accommodate clients that truncate longer lines in violation of the standard. In order to leave the line wrapping to the mail client of the recipient, the mail client of the sender has to encode the body if the body contains lines which are too long. If a header field is too long, it must be broken into several lines with folding whitespace: {CR}{LF} followed by at least one space or tab. If a line in the header section of a message starts with whitespace, its content belongs to the header field on the previous line. The procedure of breaking lines as done by the sender is called folding; the procedure of joining lines as done by the recipient is called unfolding. When unfolding, runs of whitespace characters are replaced with a single space character.
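
Folding and unfolding can be sketched in a few lines of Python. This is a simplified illustration rather than a complete implementation: the function names are invented for this example, the field is only folded at single spaces, and a word which is longer than the limit is left on a line of its own.

```python
import re


def fold(name: str, value: str, limit: int = 78) -> str:
    """Break a header field at spaces so that no line exceeds `limit` characters."""
    lines = []
    current = name + ":"
    for word in value.split():
        if len(current) + 1 + len(word) > limit and current != name + ":":
            lines.append(current)
            current = " " + word  # continuation lines start with whitespace
        else:
            current += " " + word
    lines.append(current)
    return "\r\n".join(lines)


def unfold(field: str) -> str:
    """Remove every CRLF which is followed by a space or a tab."""
    return re.sub(r"\r\n(?=[ \t])", "", field)


subject = " ".join(["word"] * 40)
folded = fold("Subject", subject)
print(folded)
print(unfold(folded) == "Subject: " + subject)
```

The lookahead in unfold keeps the whitespace which follows the line break, so the words remain separated by a single space after joining.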

Message identification

There are three header fields to identify the current message and the previous messages in the same thread:

  • Message-ID: The Message-ID identifies the current message. Its format is <{Value}@{Domain}>. Although outgoing mail servers may add this field if it’s missing, the Message-ID should be chosen by the mail client. Otherwise, the copy stored in the sent folder on the incoming mail server lacks this field, which defeats its purpose. Whoever chooses the Message-ID should make sure that it’s unique. Mail clients often choose the Value as a universally unique identifier (UUID) and the Domain as the domain part of the user’s email address. The sender has to decide whether two messages are the same and thus share the same Message-ID. If the client generates different versions of the same message due to Bcc recipients, it should use the same Message-ID for all of them.
  • In-Reply-To: If a user replies to a message, the Message-ID of the replied-to message is put into the In-Reply-To header field.
  • References: While In-Reply-To refers only to the direct parent message, the References field lists the Message-IDs of all ancestor messages, including the direct parent message. This is useful to reconstruct a conversation even if not all intermediary messages were sent to you. Clients compose this field by adding the Message-ID of the replied-to message to the References of the replied-to message. When determining which messages belong to the same thread, clients use additional heuristics, such as comparing the Subject line after stripping common prefixes, to avoid grouping messages where a person replies to a message just to send an unrelated message to the sender of the message.
Message-ID:  <[email protected]>
In-Reply-To: <[email protected]>
References:  <[email protected]>
             <[email protected]>

An example of what the three message identification header fields look like. The References field contains the message ID of the In-Reply-To field.
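
How a client might derive the three fields when replying can be sketched as follows. The helper name and the message IDs are hypothetical, and the header fields are modeled as a plain dictionary to keep the example short. make_msgid from Python’s standard library generates a unique value for the given domain.

```python
from email.utils import make_msgid


def reply_headers(parent: dict, domain: str) -> dict:
    """Derive the identification fields of a reply from the replied-to message."""
    parent_id = parent["Message-ID"]
    # The References of the reply are the References of the parent
    # followed by the Message-ID of the parent.
    ancestors = parent.get("References", "").split()
    return {
        "Message-ID": make_msgid(domain=domain),
        "In-Reply-To": parent_id,
        "References": " ".join(ancestors + [parent_id]),
    }


parent = {
    "Message-ID": "<second@example.org>",
    "In-Reply-To": "<first@example.org>",
    "References": "<first@example.org>",
}
for name, value in reply_headers(parent, "example.com").items():
    print(f"{name}: {value}")
```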

Mandatory header fields

Quoting the previous message

How text is usually quoted at different nesting levels. Quoting the text allows you to reply below each paragraph of the original message.

Universally Unique Identifier (UUID)

Trace information

According to RFC 5321, whenever a mail server receives a message, it must add a Received header field at the beginning of the message without changing or deleting already existing Received header fields. Received header fields have the following format:

Received: from {EhloArgument} ({DnsReverseLookup} {IpAddressOfClient})
    by {DomainNameOfServer}
    with {Protocol}
    id {SessionId}
    for {AddressOfRecipient};
    {DayOfWeek}, {Day} {Month} {Year} {Hour}:{Minute}:{Second} {TimeZone}

The format of Received header fields. The curly brackets stand for values which need to be inserted. The with, id, and for clauses are optional. The newlines can be in other places, and additional information is often added as comments in parentheses in various places.

According to RFC 5321, the Protocol is either SMTP or ESMTP. RFC 3848 specified additional values: ESMTPA when ESMTP is used with successful user authentication, ESMTPS when ESMTP is used with Implicit or Explicit TLS, and ESMTPSA when the session has been secured and the user has been authenticated. RFC 8314 specifies an additional tls clause, which can be used after the for clause to record the TLS ciphersuite which has been used. Gmail adds such information as a comment instead: (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256). Checking the Received header fields of a received message gives you an idea whether the message was secured during transport. Note, however, that Received header fields are not authenticated: The mail servers through which a message passes can change the Received header fields that were added by mail servers through which the message already passed. In addition, not all mail servers might support the newer protocol values, and relays over a private network are often not protected with TLS. A message typically has at least four Received header fields, which only makes sense if you look at the official architecture instead of the simplified architecture. A Received header field is added by the mail submission agent (MSA), the outgoing mail transfer agent (MTA), the incoming mail transfer agent (MTA), and the mail delivery agent (MDA). Here is a Received header field, which was added by my outgoing mail server:

Received: from [192.168.1.2] (unknown [203.0.113.167])
    (Authenticated sender: [email protected])
    by relay12.mail.gandi.net (Postfix) with ESMTPSA id 7974D200009
    for <[email protected]>; Thu, 3 Dec 2020 14:14:48 +0000 (UTC)

What an actual Received header field looks like. I’ve only replaced my IP address with an address reserved for documentation. Due to Network Address Translation (NAT), the private IP address that my computer used in the EHLO command and the public IP address that the outgoing mail server saw were different. Since my public IP address doesn’t have a reverse DNS entry, the server recorded the name of the client as unknown. (Authenticated sender: [email protected]) is a comment, which the server added to indicate which user submitted the message. (Postfix) is also a comment, indicating the name of the server implementation. Everything else matches the format which I’ve described above.
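
To make the format more tangible, here is a naive sketch which pulls the clauses out of a Received header field like the one above. The email addresses in the sample are placeholders, the function name is invented, and real-world fields vary so much that a regular expression like this will not handle all of them. Keep in mind that the extracted values are not authenticated.

```python
import re
from email.utils import parsedate_to_datetime

# Sample modeled on the field above; both email addresses are hypothetical.
received = (
    "from [192.168.1.2] (unknown [203.0.113.167])\r\n"
    "    (Authenticated sender: alice@example.org)\r\n"
    "    by relay12.mail.gandi.net (Postfix) with ESMTPSA id 7974D200009\r\n"
    "    for <bob@example.com>; Thu, 3 Dec 2020 14:14:48 +0000 (UTC)"
)


def parse_received(value: str) -> dict:
    """Extract the clauses, the client address, and the timestamp."""
    clauses, _, date = value.rpartition(";")  # the date follows the last semicolon
    clauses = " ".join(clauses.split())       # unfold and normalize whitespace
    info = {}
    ip = re.search(r"\[([0-9A-Fa-f.:]+)\]\)", clauses)  # "(name [address])"
    if ip:
        info["client_ip"] = ip.group(1)
    plain = re.sub(r"\([^()]*\)", " ", clauses)         # drop the comments
    for key in ("from", "by", "with", "id", "for"):
        match = re.search(rf"(?:^|\s){key}\s+(\S+)", plain)
        if match:
            info[key] = match.group(1)
    info["date"] = parsedate_to_datetime(re.sub(r"\([^()]*\)", "", date).strip())
    protocol = info.get("with", "").upper()
    info["tls"] = protocol in {"ESMTPS", "ESMTPSA"}            # values from RFC 3848
    info["authenticated"] = protocol in {"ESMTPA", "ESMTPSA"}
    return info


for key, value in parse_received(received).items():
    print(f"{key}: {value}")
```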

An incoming mail server which delivers a message must add the MAIL FROM address of the envelope in a Return-Path header field to the message. While a message can have several Received header fields, it may have at most one Return-Path header field. If a message is resubmitted, for example by a filtering rule, the Return-Path header field should be removed, and its value should be used as the MAIL FROM address. As we discussed earlier, the Return-Path header field can be different from the From header field.

Return-Path: <[email protected]>

What a Return-Path header field looks like.

Recover why you received a message

Local Mail Transfer Protocol (LMTP)

Content encoding

RFC 5322 specifies a format for text messages, whose lines may consist of at most 1’000 ASCII characters. Whenever the content of a message doesn’t fulfill this requirement, it must be encoded according to the Multipurpose Internet Mail Extensions (MIME) as specified in RFC 2045. When mail clients encode messages according to MIME, they indicate this with the following header field:

MIME-Version: 1.0

The header field used to indicate that a message is formatted using MIME.

In theory, the version number allows the Internet community to make changes to the standard. In practice, however, the standard didn’t specify how mail clients are supposed to handle messages with an unknown MIME version. As a consequence, you cannot change the version number without breaking email communications, which makes this header field completely useless. The version 1.0 survived the last 30 years and will likely survive the next 30 years. MIME also introduced additional message header fields, which we’ll cover in this and the following subsections.

Unless all involved SMTP servers support the BINARYMIME extension as specified in RFC 3030, which is rarely the case, content containing non-ASCII characters or lines longer than 1’000 characters must be encoded with one of the following two methods:

  • Quoted-Printable: Any byte which doesn’t represent a printable ASCII character is encoded with the equality sign followed by the value of the byte encoded as two hexadecimal digits. Since = is used as the escape character, it has to be encoded with its hexadecimal ASCII value as =3D. Lines may be at most 78 characters long, including {CR}{LF}. Longer lines have to be broken by inserting ={CR}{LF}. All sequences of these three characters are removed when decoding the Quoted-Printable encoding. Since some mail servers add or remove trailing whitespace, tabs and spaces which are followed by {CR}{LF} also need to be encoded with hexadecimal digits. Any sequence of bytes can be encoded with this method. The Quoted-Printable encoding only makes sense, though, if most of the bytes are printable ASCII characters. This is the case for those European languages which share most of their characters with the English alphabet. Texts in such languages remain largely readable when using the Quoted-Printable encoding. The probability that a random byte falls into the range of printable ASCII characters is just a bit bigger than one third, though. Thus, the size of binary data, such as images, more than doubles with this encoding. The following tool allows you to encode and decode Quoted-Printable:
  • Base64: Binary data and non-Western-European languages are best encoded with Base64. While hexadecimal digits encode 4 bits each, Base64 digits encode 6 bits each. 6 bits can represent 2^6 = 64 different values. Base64 uses the characters A – Z, a – z, 0 – 9, +, and / to encode these 64 values. What makes the Base64 encoding special is that bytes and digits don’t align: Three bytes are encoded with four Base64 digits. If you shift the input by one or two bytes, the Base64 encoding looks completely different. If the size of the input is not a multiple of three, one or two equality signs are appended to the output in order to make the output a multiple of four. This procedure is known as padding. In order to respect the line-length limit, a line break is inserted after at most 76 Base64 characters. Base64 encoding increases the size of the content by 33% and the line breaks add another 2.6% on top of that. You can encode and decode Base64 with the following tool:

The mail client of the sender informs the mail client of the recipient with the following header field that the content is encoded:

Content-Transfer-Encoding: {Value}

This header field indicates with which method the content of the message has to be decoded.
The Value can be quoted-printable, base64, or 7bit if no content encoding has been used.
(If the 8BITMIME or BINARYMIME extensions are supported, the value can also be 8bit or binary.)
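
The two encoding methods can be tried out with Python’s quopri and base64 modules. The following sketch uses an arbitrary sample text and operates on raw bytes, which means that the character encoding (UTF-8 here) has to be applied first.

```python
import base64
import quopri

raw = "Zürich = schön".encode("utf-8")

# Quoted-Printable: non-ASCII bytes and the equality sign are escaped.
qp = quopri.encodestring(raw)
print(qp.decode("ascii"))
print(quopri.decodestring(qp) == raw)

# Base64: three bytes become four digits, with padding if necessary.
for sample in (b"a", b"ab", b"abc"):
    print(sample, base64.b64encode(sample).decode("ascii"))

# encodebytes inserts a line break after at most 76 Base64 characters.
# It uses a bare LF, so the line endings still have to be converted to CRLF.
wrapped = base64.encodebytes(bytes(200))
print(max(len(line) for line in wrapped.splitlines()))
```

The Quoted-Printable output stays largely readable because only the umlauts and the equality sign are escaped, whereas the Base64 output bears no visible resemblance to the input.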

If the message already consists of only printable ASCII characters, the line-length limit can also be achieved with soft line breaks.

Character encoding (charset)

Figure: ASCII (7 bits), ISO-8859-1 (8 bits), and UTF-8 (8 to 32 bits).

ASCII encodes only a subset of the characters defined in ISO-8859-1, and ISO-8859-1 encodes only a fraction of the characters available in UTF-8.

Percent encoding (URL encoding)

Decoding on the command line

How to encode (-e) and decode (-d) Quoted-Printable with qprint. Use brew install qprint to install this command on macOS.

How to encode (-e) and decode (-d) Quoted-Printable with the quoted-printable package if you already have Node.js.

How to encode (-e) and decode (-d) Base64 with OpenSSL. Use the option -A to have no newline characters inserted or expected.

How to encode and decode Quoted-Printable, Base64, and Percent with Perl, which is likely preinstalled on your computer. You can use explainshell.com to learn more about the used options. The code uses the MIME::QuotedPrint, MIME::Base64, and URI::Escape modules.

How to encode and decode Quoted-Printable, Base64, and Percent with Python, which is likely preinstalled on your computer.
The commands use the quopri, base64, and urllib.parse modules and the first four commands operate on the raw bytes.
If you want to use a character encoding other than UTF-8, you can just save the file which you use as the input accordingly.

Header encoding

RFC 2047 specifies how one can use non-ASCII characters in certain header field values, such as the subject and the display names. Instead of introducing new header fields to specify the encoding of existing header fields, encodings in header fields indicate which character encoding and which content encoding has been used. This results in the so-called Encoded-Word encoding. Its format is as follows: =?{CharacterEncoding}?{ContentEncoding}?{EncodedText}?=, where CharacterEncoding is usually either ISO-8859-1 or UTF-8, ContentEncoding is either Q for Quoted-Printable or B for Base64, and EncodedText is the field value encoded according to the previous parameters. The Quoted-Printable encoding is slightly modified when used to encode header field values: Question marks, tabs, and underscores are escaped with their hexadecimal representation and spaces are encoded with underscores. In order to adhere to the line-length limit, whitespace between adjacent Encoded Words is removed completely, which allows the encoder to break long words with a newline (and also to mix different character encodings). The following tool does all of that for you. It uses Quoted-Printable or Base64 depending on which encoding is shorter, and it supports only ISO-8859-1 and UTF-8.
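
Python’s email.header module implements the Encoded-Word encoding. The following sketch uses arbitrary sample values. Which of the two content encodings the library picks is up to its implementation, so the example only prints the result instead of relying on a particular output.

```python
from email.header import Header, decode_header, make_header

# Encoding: the library chooses between Q and B for the given charset.
encoded = Header("Grüße aus Zürich", "utf-8").encode()
print(encoded)


def decode(value: str) -> str:
    """Decode all Encoded Words in a header field value."""
    return str(make_header(decode_header(value)))


print(decode(encoded))

# Decoding a handwritten Encoded Word: Q stands for Quoted-Printable,
# =FC is the letter ü in ISO-8859-1, and the underscore stands for a space.
print(decode("=?ISO-8859-1?Q?Z=FCrich_HB?="))
```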


In case you haven’t noticed yet: The ESMTP tool above automatically encodes the Subject and the Body if necessary. If you want to use non-ASCII characters in display names, you have to paste the Encoded Word into the address field yourself. The following boxes explain how non-ASCII characters are supported in domain names, which is really interesting but also fairly advanced.

Punycode encoding

Unicode normalization

The four normalization forms of Unicode. Compatibility equivalence includes canonical equivalence.

Unicode case folding

Internationalized domain names (IDNs)

Arbitrary user input → Remove certain characters → Case fold all characters → NFKC-normalize the labels → Reject certain characters → Encode with Punycode → Domain name in ASCII

How IDNA2003 normalizes user input. The normalization fails only if the output contains prohibited characters or violates the rules for bidirectional text.

Arbitrary user input → Reject symbols and punctuation marks → Lowercase or reject uppercase characters → NFKC-normalize or reject non-normalized labels → Accept only valid characters → Encode with Punycode → Domain name in ASCII

How IDNA2008 normalizes user input. The steps in gray are required but not standardized.

IDNA2008 validation

Homograph attack

Email address internationalization (EAI)

Content type

Now that we can encode arbitrary content, we need a way to inform the client how to interpret the decoded content. This is done with the Content-Type header field, which has the following format:

Content-Type: {Type}/[{Tree}.]{Subtype}[+{Suffix}][; {Parameter}]*

The curly brackets need to be replaced as described below, the content in the square brackets is optional, and the asterisk indicates that there may be several parameters.

The content type is also called media type. IANA maintains a long list of registered media types. A content type consists of:

The type, the subtype, and the parameter names are case-insensitive. RFC 6838 doesn’t specify whether the tree and the suffix are also case-insensitive but I assume that this is the case. Whether a parameter value is case sensitive depends on the parameter. The default content type for emails is text/plain; charset=us-ascii. As specified in RFC 1945, HTTP uses the same header field with the same media types.

Example content types: text/csv, text/html, image/png, image/svg+xml, image/vnd.adobe.photoshop, audio/mpeg, video/mp4, font/otf, application/javascript, application/pdf, application/vnd.apple.pages, and application/vnd.ms-excel.
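
The structure of a content type can be explored with Python’s email package. In this sketch, the helper name is made up, and the splitting of the subtype into tree, subtype, and suffix is done by hand with simplified rules, since the standard library only separates the type from the rest.

```python
from email import message_from_string


def split_media_type(value: str) -> dict:
    """Split a Content-Type value into its components."""
    message = message_from_string(f"Content-Type: {value}\n\n")
    rest = message.get_content_subtype()          # lowercased by the library
    name, _, suffix = rest.partition("+")         # optional suffix after the plus
    tree, subtype = name.split(".", 1) if "." in name else ("", name)
    return {
        "type": message.get_content_maintype(),
        "tree": tree,
        "subtype": subtype,
        "suffix": suffix,
        "parameters": dict(message.get_params()[1:]),  # skip the media type itself
    }


for example in (
    "text/plain; charset=us-ascii",
    "image/svg+xml",
    "Application/VND.ms-excel",
):
    print(split_media_type(example))
```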

Enriched Text

An example Enriched Text message.

HTML emails

An example HTML message.

Email styling

External CSS: Load an external style sheet with a <link> element.

Internal CSS: Embed the style with a <style> element inside the <head> element.

Inline CSS: Repeat the style with the style attribute for every element.

Styling the color of links.

The styles that Gmail applies to the link in the above message.

How to solve Gmail’s link rendering problem. While the ID and the descendant selectors are supported by almost everyone, the child selector is not.

Email markup

Dynamic content

Soft line breaks

How to respect the line-length limit and message quoting without having to resort to Quoted-Printable encoding.
In order to make whitespace visible, I’ve replaced each space with and marked each newline with .
Note that the leading space on the last line is required. Otherwise, this line would be a quoted 3.

Message compression

Web client → Web server: GET /file HTTP/1.1, Accept-Encoding: br, gzip
Web server → Web client: HTTP/1.1 200 OK, Content-Encoding: gzip

How HTTP compression works. The most used compressions are gzip and br. You can inspect these header fields with the developer tools of your browser.

How S/MIME can be used to compress messages using the ZLIB compression format as specified in RFC 1950.

The pipeline of commands to Base64-decode and ZLIB-decompress the string from the above example message.

Internationalized parameter values

Multipart messages

Now that we can send arbitrary files via email, we can design file formats to include several files in a single message body. RFC 2046 defines various content types to split a message into multiple parts. What all the multipart formats have in common is that they are text-based. This means that the various parts have to be separated with a character sequence which may not appear in any of the parts themselves. The character sequence is chosen by the sending mail client for each message and provided to the recipient in a content-type parameter called boundary. Let’s look at the two most common multipart types and leave the rest for the boxes below.

  • multipart/mixed bundles independent parts into a single message. This content type is used to attach files to a message. If a client doesn’t recognize a multipart subtype, it should treat the content as multipart/mixed and show the recognized parts.
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="UniqueBoundary"
    
    --UniqueBoundary
    Content-Type: text/plain
    Content-Transfer-Encoding: 7bit
    
    This message has an attachment.
    
    --UniqueBoundary
    Content-Type: image/png
    Content-Transfer-Encoding: base64
    
    iVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAYAAACNiR0NAAAACXBIWXMAAAsSAAAL
    EgHS3X78AAAB4klEQVQ4y5VVwU7CQBDdKl5IVz8Fg95ZQH+BhBP/UEAxKv/AN3jG
    /1ATPeqlJr169WRpi/NmZ8uCEGqTl53OzrzO7MxOVdiKFB6s2gwDkeuEAeGRkBAW
    gkR02KvDFj4+hxMCpUq5T4h1d7LU3Zulbo+X5GQBGToC2XzC1vML1lmtPGMiciQ5
    I/xgBRmtBSFHpPS+0O0rRzzz/JWf5kxf3CKSQneusea8whGyjbIQ4kI+lMHHkTou
    TpMjA5kZpoQpoUnoEeJ1UiKzdiDKmdRG2ldeAWI+HxvZvfIeem9wihKhO09EKWuG
    D4KDC4WKITpJAUbnQnREugOR3+RjWVkgkMkR8AdtlAMQzj1C4ExIHNkh4XkbIaff
    YlJHOAdhos3IpbCN8IDw4hHmshZeoXLpjERJG7jNfYSpd9ZloUIj52mixT8IqdId
    rvYX4bXsxVVLlYRVUn46vryD0wPhRPSnrqVC+Hkp7ytKwFU2w29CKK1Wk70e0seN
    8otSpW0+CO9MZqIa9kSP5s85QssxqNrYLvrGxt7UXs/xqrGrXb2xmzqx6Jpik6IX
    Jbq+8i90heGQs+zvwXZzOFQdX+7ess5EmZ2Nk7/ja/+AHXkDdiQDduLObPeA3fEL
    mMvYTwWJ6Hb+An4Bgrjq/fe5+zgAAAAASUVORK5CYII=
    
    --UniqueBoundary--
    

    An example multipart/mixed message.

  • multipart/alternative bundles alternative versions of the same content into a single message. This content type is used to provide a fallback version of the content for mail clients that don’t support the preferred content type. The versions are to be listed in increasing order of preference, which means that the preferred format comes last. This has the advantage that users of mail clients which don’t support multipart messages see the simplest version of the message first. Mail clients usually display the last part which has a content type that they support unless the user configured a different preference. multipart/alternative is most commonly used to provide a plaintext version of HTML messages for users of text-based mail clients, such as Elm, Pine, and Mutt, which cannot render HTML. To give you another example, I could have included a plaintext version of the Enriched-Text message so that Gmail could display that instead of offering to let me download the unrecognized content.
    MIME-Version: 1.0
    Content-Type: multipart/alternative; boundary="UniqueBoundary"
    
    --UniqueBoundary
    Content-Type: text/plain
    
    Roses are red.
    
    --UniqueBoundary
    Content-Type: text/enriched
    
    <bold>Roses</bold> <italic>are</italic>
    <color><param>red</param>red</color>.
    
    --UniqueBoundary
    Content-Type: text/html
    
    <html>
      <body>
        <b>Roses</b> <i>are</i>
        <span style="color:red;">red</span>.
      </body>
    </html>
    
    --UniqueBoundary--
    

    An example multipart/alternative message.

Since multipart/mixed and multipart/alternative are content types like any other, they can be nested, which results in a tree of message parts. The content encoding of multipart parts has to be 7bit, 8bit, or binary, and the boundary between the inner parts has to be different from the boundary between the outer parts.
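
Such a tree of message parts can be built with Python’s email package, which chooses the boundaries and the content encodings on its own when the message is serialized. The following sketch nests a multipart/alternative part inside a multipart/mixed message. The attachment consists of just the eight-byte PNG signature as a placeholder, and the filename is made up.

```python
from email.message import EmailMessage

message = EmailMessage()
message["Subject"] = "Roses"

# The plaintext version comes first, the preferred HTML version last.
message.set_content("Roses are red.")
message.add_alternative(
    "<html><body><b>Roses</b> are <span style=\"color:red;\">red</span>.</body></html>",
    subtype="html",
)

# Adding an attachment wraps the alternatives in a multipart/mixed message.
placeholder = b"\x89PNG\r\n\x1a\n"  # not a complete image
message.add_attachment(placeholder, maintype="image", subtype="png", filename="rose.png")

# Walk the tree of message parts depth-first.
structure = [part.get_content_type() for part in message.walk()]
print(structure)
print(message.as_string())
```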

Boundary delimiter

The various parts of a multipart message.

Content disposition

How to attach a file with a name.

Aggregate documents

<img src="cid:[email protected]"> references the part with Content-ID: <[email protected]>.

Other multipart subtypes

After many encoding-related sections, I want to mention two more format-related aspects before moving on to issues with email.

One-click unsubscribe

If you are subscribed to a mailing list, you may want to unsubscribe from the list after having received a message you no longer want to receive. Most mailing lists include a link at the bottom of each sent message, which you can click to unsubscribe from the mailing list. Since this is a link like any other in the message, a browser window is opened and you might have to click on additional buttons there to finally unsubscribe from the list. This can be a bit of a hassle, especially on mobile phones. Fortunately, RFC 2369 specifies an easier way to achieve the same. Mailing lists should include a List-Unsubscribe header field so that mail clients can provide a uniform unsubscribe experience across mailing lists: You simply click on “Unsubscribe” and your mail client takes care of the rest.

List-Unsubscribe: <https://example.com/unsubscribe?token=XYZ>,
  <mailto:[email protected]?subject=Unsubscribe>

The List-Unsubscribe header field provides a standardized way to unsubscribe from a mailing list. If there are several options in angle brackets, the mail client should use the first one that it supports.

To be precise, RFC 2369 didn’t require that there is no additional user interaction. In fact, user confirmation was often necessary in order to prevent accidental unsubscriptions triggered by anti-spam programs which simply fetch all the links in a message. For this reason, RFC 8058 defines an additional header field with more precise semantics: When the user clicks on “Unsubscribe”, the mail client sends a POST request to the HTTPS resource specified in the List-Unsubscribe header field with the value of the new List-Unsubscribe-Post header field in the body of the request. The List-Unsubscribe-Post header field has to contain List-Unsubscribe=One-Click, and both header fields must be covered by a valid DKIM signature.

List-Unsubscribe-Post: List-Unsubscribe=One-Click

The List-Unsubscribe-Post header field informs the receiving mail client that it can unsubscribe the user with a simple POST request.

The body of the POST request is encoded with the content type multipart/form-data as specified in RFC 7578 or application/x-www-form-urlencoded as specified by the Web Hypertext Application Technology Working Group (WHATWG) in their URL spec. The request has to be sent without context information such as cookies. The user has to be authenticated with a token in the URL.

POST /unsubscribe?token=XYZ HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 26

List-Unsubscribe=One-Click

The POST request to unsubscribe a user from a mailing list as generated by Gmail. You can find an example POST request using multipart/form-data in RFC 8058.
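
How a mail client could derive this request from the two header fields is sketched below. The function name and the mailto address are hypothetical. The sketch only builds the request without sending it, and it skips the DKIM verification, which a real client has to perform first.

```python
import re
import urllib.request
from typing import Optional


def one_click_request(list_unsubscribe: str, list_unsubscribe_post: str) -> Optional[urllib.request.Request]:
    """Build the POST request for a one-click unsubscription, if possible."""
    if list_unsubscribe_post.strip() != "List-Unsubscribe=One-Click":
        return None
    # The options are listed in angle brackets; use the first HTTPS one.
    for uri in re.findall(r"<([^>]+)>", list_unsubscribe):
        if uri.lower().startswith("https://"):
            return urllib.request.Request(
                uri,
                data=b"List-Unsubscribe=One-Click",
                headers={"Content-Type": "application/x-www-form-urlencoded"},
                method="POST",
            )
    return None


request = one_click_request(
    "<https://example.com/unsubscribe?token=XYZ>,\r\n"
    "  <mailto:unsubscribe@example.com?subject=Unsubscribe>",
    "List-Unsubscribe=One-Click",
)
print(request.get_method(), request.full_url, len(request.data))
```

Since the request is created from scratch, no cookies or other context information are attached to it.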

These two header fields are not only convenient for users, they also make unsubscribing more secure since mail clients don’t include them when forwarding a message. If you want to prevent others from unsubscribing you from a mailing list, you have to remove the unsubscribe link at the bottom of a message manually before forwarding the message.

Custom header fields

IANA maintains a long list of registered message header fields. The ones specified in an RFC and thus endorsed by IETF are called permanent header fields. The ones registered for private use without official recognition are called provisional header fields. RFC 3864 outlines the registration procedure for header fields. It’s common to start the name of custom header fields with X-, but unlike in the case of content types, there is no requirement for this. RFC 822 just promised that official header fields will never start with X-. This provision regarding extension header fields was dropped in later revisions, though. During my research for this article, I’ve inspected a ton of messages in their raw format. The funniest header field I came across is the following one from Booking.com:

X-Recruiting: Like mail headers? Come write ours: https://careers.booking.com

This is one way to reach nerds like me. I’m not that excited about email headers, though. 😅

Issues

Email is both a blessing and a curse. On the one hand, email is by far the most important decentralized messaging service that we have, which should be reason enough to cherish it. The only other decentralized messaging service which comes close to email in terms of ubiquity is the Short Message Service (SMS). On the other hand, email has become so dysfunctional that many of us would like to leave it behind. In this section, we’ll look at the issues that plague modern email. In the last chapter, we’ll discuss how some of the security-related issues are being addressed.

Unsolicited messages which are sent in large quantities are called spam or junk mail. Spam is a brand of canned pork, which was introduced in 1937. Spam is likely an abbreviation for spiced ham. It became ubiquitous during and after World War II when food was rationed. The British comedy group Monty Python made fun of this fact in a famous sketch in 1970. The term got adopted to refer to undesirable things which come in excessive quantities – including junk mail.

Any messaging service which is popular, open, and free will have spam sooner or later. Thus, spam isn’t a result of the shortcomings of email but rather a consequence of its desirable properties. Since unsolicited messages are annoying, people try to eliminate junk mail from their inboxes with heuristics, blacklists, and challenges. While such techniques make spam bearable, they don’t solve the underlying problem of unsolicited mail: Anyone in the world can add tasks to the to-do list which is your inbox. In my opinion, mail clients should separate messages from unapproved senders from your inbox so that the messages you actually want to receive don’t drown in the noise. This is similar to how I almost never accept calls from numbers that I haven’t stored in my phone. Even though this feature has to be tremendously useful for anyone who doesn’t want to be bothered by random sales people and their never-ending follow-ups, HEY is the only mail client I know of which lets you screen your email senders. And just like I block call centers, I also block email senders, of course. However, the default shouldn’t be “allowed unless blocked” but rather “blocked unless allowed”. Additionally, messages are typically blocked on the client-side because most mail clients still don’t support server-side filtering.

Heuristics

How binary classifiers are evaluated. When labelling spam, you want to have as few false positives as possible, even if this increases the rate of false negatives.
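To make the trade-off between false positives and false negatives concrete, here is a minimal Python sketch of how a spam classifier could be evaluated. The class, the function, and the sample labels are hypothetical and only serve as an illustration; they are not taken from any real spam filter.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConfusionMatrix:
    true_positives: int   # spam correctly labelled as spam
    false_positives: int  # legitimate mail wrongly labelled as spam
    true_negatives: int   # legitimate mail correctly delivered
    false_negatives: int  # spam which reached the inbox

    def precision(self) -> float:
        """Fraction of flagged messages which really are spam."""
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0

    def recall(self) -> float:
        """Fraction of all spam which was flagged."""
        spam = self.true_positives + self.false_negatives
        return self.true_positives / spam if spam else 0.0

    def false_positive_rate(self) -> float:
        """Fraction of legitimate messages which were flagged."""
        legitimate = self.false_positives + self.true_negatives
        return self.false_positives / legitimate if legitimate else 0.0


def confusion_matrix(is_spam: Sequence[bool], flagged: Sequence[bool]) -> ConfusionMatrix:
    """Counts the four outcomes for the true labels and the classifier's verdicts."""
    pairs = list(zip(is_spam, flagged))
    return ConfusionMatrix(
        true_positives=sum(1 for actual, verdict in pairs if actual and verdict),
        false_positives=sum(1 for actual, verdict in pairs if not actual and verdict),
        true_negatives=sum(1 for actual, verdict in pairs if not actual and not verdict),
        false_negatives=sum(1 for actual, verdict in pairs if actual and not verdict),
    )


# Hypothetical sample: three spam messages followed by five legitimate ones.
labels = [True, True, True, False, False, False, False, False]
verdicts = [True, True, False, False, False, False, False, True]
matrix = confusion_matrix(labels, verdicts)
print(matrix.precision(), matrix.recall(), matrix.false_positive_rate())
```

For a spam filter, the false positive rate is the number to keep as low as possible: a missed spam message costs a few seconds, whereas a legitimate message in the spam folder may never be read.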

Blacklists

Graylisting

Patience

Challenges

Reputation

Address munging

Legal requirements

Privacy

If you send an email to someone, you want to share certain information with that person. Mail clients and mail servers, however, share a lot more information than what the users intended to share. In this subsection, I list all the subtle information disclosures that users likely aren’t aware of. If you know of other privacy leaks, please let me know.

Sender towards recipients

The recipient of a message often learns the following information about the sender:

  • IP address: When you submit an email to an outgoing mail server, most mail servers add your IP address to the message as part of the trace information. As a recipient, you find the IP address from which a message was sent in an x-originating-ip header field or in the square brackets in the first parentheses of the last Received header field. (Each mail server through which an email passes adds an additional Received header field at the top, which means that the first Received header field, which was added by the outgoing mail server, is at the bottom.) There are three important implications of this. Firstly, your outgoing mail server leaks your rough physical location to all recipients. In other words, never email your boss from your holiday apartment that you’re sick at home. Similarly, recipients can tell whether you’re still at work or went home already. Secondly, recipients can launch a denial-of-service attack against your IP address. Due to network address translation (NAT), the target would typically be your router rather than your machine, but your Internet connection goes down either way. Thirdly, if you visited the website of an email recipient anonymously or pseudonymously, the recipient now knows who this user on their website is. To find out whether your outgoing mail server includes your IP address in the messages that you send, send a message to yourself and search for your IP address. You can use the following tool with an empty input field to determine your IP address. You can also use the tool to locate the IP address of someone who sent you an email. The tool uses the geolocation API of ipinfo.io.

    If you don’t want your email service provider to leak your own IP address, you can use a Virtual Private Network (VPN) or an overlay network for anonymous communication, such as Tor. Alternatively, you can use an email service provider which values your privacy, such as ProtonMail or Tutanota. Sending messages from the web interface of an email service provider usually also helps. For example, if you compose an email on gmail.com, your IP address is not included in the outgoing message. If you submit a message from your desktop client to Gmail using SMTP, on the other hand, your IP address is added by smtp.gmail.com in a Received header field. While RFC 5321 does say that the IP address of the source should be included in the Received header field, email service providers should ignore the standard in this regard, in my opinion. I understand that email service providers may want to record the IP address of the sender to prevent abuse of their service, I just see no reason to share this information with the recipients of a message. In fact, it might even be illegal to do so. Many privacy acts, such as the European General Data Protection Regulation (GDPR), forbid service providers to share personal data without the user’s explicit consent. Since the third party with whom the personal data is being shared can be different for every email, the user’s consent would be required every time they send an email. If you’re a lawyer and you think that this reasoning has some merit, let me know so that we can file a class-action lawsuit to bring this industry practice to an end.

  • Device name: Mail servers also include the client’s argument to the EHLO command in the Received header field. RFC 5321 requires that the client uses its fully qualified domain name if it has one or its IP address otherwise. In spite of this, Thunderbird and maybe other clients use the name of your device in the local network as the argument. On macOS, you find the name of your device in the “Sharing” tab of your “System Preferences”. By default, it starts with the first name of your user account. In my case, my computer is reachable under Kaspars-MacBook-Pro.local in the local network. As a whistleblower, I might create a new email address and even use an anonymization service, such as Tor, just to have my mail client and mail server leak my real name. RFC 5321 even warns about exactly this problem. I reported this privacy bug to Mozilla Thunderbird on 2 December 2020. Until a fix is available, you can set the mail.smtpserver.default.hello_argument option in the config editor to [192.168.1.1]. Such a value is typical for the vast majority of people due to network address translation (NAT).
  • Timezone: The sent date is usually encoded in the timezone of the sender. By looking at the offset from Greenwich Mean Time (GMT), the recipient learns roughly at which longitude a message was sent. In my opinion, mail clients should always encode the Date field in Greenwich Mean Time.
  • Mail client: Many mail clients put their name with their current version into a User-Agent or X-Mailer header field. Some mail clients even include the name and the version of the operating system on which they run. While such data is usually harmless, it can provide valuable information to someone who wants to attack you. Given the intricacies of email, mail clients can also be identified by how they delimit parts, how they label files, how they style messages, how they quote messages, and so on. This is known as fingerprinting, and it allows a recipient to determine whether separate messages were sent from the same client.
  • Display names: Your mail client not just adds your name as a display name in the From address, it also adds a display name for each recipient it knows. This can leak how you’ve stored a recipient in your address book (i.e. be careful under what name you store the colleague you’re having an affair with) and with whom of the recipients you’ve been in contact before (because mail clients usually add display names from earlier conversations automatically). As a recipient, you have to inspect the raw message to see what the sender provided because your mail client typically overwrites the display names with the information from its own address book. In my opinion, mail clients should remove the display names of recipients before sending a message.
  • Hidden recipients: The Received header field has an optional for clause, which contains the address of the specified recipient. As recommended by RFC 5321, the for clause is skipped when there are several recipients in the envelope of the message. As a consequence, a single non-hidden recipient learns that the message was also sent to hidden recipients if the for clause is missing in the bottommost Received header field. This means that the empty Bcc field approach is used more often than intended.
  • Attachments: The content disposition of attachments can include information such as when the file was created and when it was last modified. While it can be useful to preserve such information when mailing a file, sharing such information with the recipient can also be unexpected and undesirable. I don’t know how mail clients can determine the preferred option without cluttering the user experience. By default, they should err on the side of caution, which many do.
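Several of the disclosures listed above can be read directly from the header of a raw message. The following Python sketch shows one way to do this with the standard library. The sample message, the host names, the addresses, and the name of the function are made up for illustration, and real Received header fields vary considerably between mail servers, so the regular expressions are only a rough heuristic.

```python
import re
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical raw message; all names and addresses are examples.
RAW_MESSAGE = """\
Received: from mx.example.net (mx.example.net [203.0.113.7])
 by mx.example.org; Mon, 1 Mar 2021 10:15:32 +0000
Received: from Alices-MacBook-Pro.local (dyn.example.com [198.51.100.23])
 by smtp.example.net; Mon, 1 Mar 2021 11:15:30 +0100
Date: Mon, 1 Mar 2021 11:15:29 +0100
From: Alice <alice@example.net>
To: Grumpy Boss <bob@example.org>
User-Agent: ExampleMail/1.0 (Macintosh)
Subject: Sick today

I'm staying in bed.
"""


def leaked_metadata(raw: str) -> dict:
    """Collects information about the sender which the header reveals."""
    message = message_from_string(raw)
    leaks = {'x_originating_ip': message.get('X-Originating-IP')}

    received = message.get_all('Received', [])
    if received:
        first_hop = received[-1]  # The bottommost field was added by the outgoing mail server.
        ehlo = re.match(r'from\s+(\S+)', first_hop)
        ip = re.search(r'\(.*?\[(?:IPv6:)?([0-9A-Fa-f.:]+)\]', first_hop, re.DOTALL)
        leaks['ehlo_name'] = ehlo.group(1) if ehlo else None
        leaks['client_ip'] = ip.group(1) if ip else None

    if message['Date']:
        offset = parsedate_to_datetime(message['Date']).utcoffset()
        leaks['utc_offset_hours'] = offset.total_seconds() / 3600 if offset is not None else None

    leaks['mail_client'] = message.get('User-Agent') or message.get('X-Mailer')
    recipients = getaddresses(message.get_all('To', []) + message.get_all('Cc', []))
    leaks['recipient_display_names'] = [name for name, _ in recipients if name]
    return leaks


for key, value in leaked_metadata(RAW_MESSAGE).items():
    print(f'{key}: {value}')
```

Running such a check on a message that you sent to yourself is a quick way to see what your own mail client and outgoing mail server disclose.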

Assuming that all recipients can be trusted is foolish. If someone pretends to be interested in your work, you’ll likely reply to them.

Recipient towards sender

There are two ways in which the sender can track the recipient: By including remote content in the message and by redirecting external links. If the sender can trick you into replying to them or your mail client sends a read receipt, then all the above privacy issues also apply, of course.

Remote content

HTML emails can include remote content, which is fetched by the mail client when it renders the message. Images are by far the most common type of remote content. They are usually included with the <img> element or with the background-image property. Some mail clients support external style sheets through the <link> element, but internal CSS can also have @import statements to load Web fonts and other styles with the url() function. There are other elements, such as <audio>, <video>, and <iframe>, which can also be used to include remote content, but not all mail clients support them.

After fetching the email from the trusted mail server (step 1), the mail client fetches the remote content from the untrusted web server (step 2).

Remote content violates three fundamental principles of email:

  1. Offline reading: Since mail clients usually fetch the remote content only when you open the message, substantial parts of the message can be missing when your computer is not connected to the Internet. Since most mail clients don’t cache the remote resources, being online when you open the message for the first time isn’t enough.
  2. Immutable content: Most users probably think that once they have received an email, the sender can no longer modify it because their inbox contains an independent copy of the message, to which the sender has no access. Unfortunately, this assumption doesn’t hold for HTML emails with remote content. Since remote content isn’t cached, different content can be provided every time you open a message. Some clever engineers used this circumstance to include a dynamic Twitter feed in an email. If you’re not aware of this “feature”, though, you might fall for a scammer who seemingly predicted the development of some market accurately. And even if the remote content isn’t modified, you can no longer view the original message once the sender stops hosting it. The situation gets even worse if the domain on which the remote content is hosted is transferred to a new owner or if the web server which hosts the remote content is compromised by an attacker. Furthermore, remote content isn’t covered by message signatures. In theory, some of these security issues could be addressed with a technique known as subresource integrity (SRI). If the sender included the hash of the resource in the original message, then the resource could no longer be modified afterwards. Unfortunately, subresource integrity is specified only for <script> and <link> elements. While a future revision of the specification might add support for integrity checks to other elements, there are no plans for this yet.
  3. Reading privacy: Whenever your mail client fetches a remote resource, the web server operator learns when and from where the resource has been accessed. Since most mail clients include a User-Agent header field in their HTTP request, the web server operator also learns which mail client you use. For the reasons mentioned in the previous point, senders should reference only remote content which they control. Email newsletters often include remote content with a personalized URL just to track who opened the message when and from where. Based on this data, the sender can determine what percentage of recipients opened the email, which is known as the open rate. It’s important to note that your privacy when reading emails is not worse than when browsing the Web. The crucial difference is that on the Web, you go to a website, whereas in the case of email, the website comes to you. Since you don’t want to provide your IP address to anyone, you should disable remote content in your mail client.
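To get a feeling for how much remote content a given HTML email references, you can list the URLs which a mail client would fetch when rendering it. The following Python sketch covers the elements and CSS constructs mentioned above. The class name, the function name, and the sample markup are made up, and the regular expressions are a simplification which ignores, for example, protocol-relative URLs.

```python
import re
from html.parser import HTMLParser
from typing import List

# Elements and attributes which commonly reference remote content.
REMOTE_ATTRIBUTES = {
    ('img', 'src'), ('link', 'href'), ('audio', 'src'), ('video', 'src'), ('iframe', 'src'),
}
CSS_URL = re.compile(r'url\(\s*[\'"]?(https?://[^\'")\s]+)', re.IGNORECASE)
CSS_IMPORT = re.compile(r'@import\s+[\'"](https?://[^\'"]+)', re.IGNORECASE)


class RemoteContentFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.urls: List[str] = []
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        if tag == 'style':
            self._in_style = True
        for name, value in attrs:
            if value is None:
                continue
            if (tag, name) in REMOTE_ATTRIBUTES and value.lower().startswith(('http://', 'https://')):
                self.urls.append(value)
            elif name == 'style':  # inline CSS such as background-image
                self.urls.extend(CSS_URL.findall(value))

    def handle_endtag(self, tag):
        if tag == 'style':
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:  # internal CSS in a <style> element
            self.urls.extend(CSS_URL.findall(data))
            self.urls.extend(CSS_IMPORT.findall(data))


def find_remote_content(html: str) -> List[str]:
    finder = RemoteContentFinder()
    finder.feed(html)
    finder.close()
    return finder.urls


# Hypothetical newsletter with a personalized tracking pixel.
SAMPLE = (
    '<html><head><style>@import "https://fonts.example/css";</style></head>'
    '<body><img src="https://track.example/pixel.gif?u=42" width="1" height="1">'
    '<img src="cid:logo"><p style="background-image: url(https://track.example/p.png)">Hi</p>'
    '</body></html>'
)
for url in find_remote_content(SAMPLE):
    print(url)
```

The image which is referenced with cid:logo is not reported because it is delivered as part of the message itself.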

In my opinion, remote content should never have been supported by mail clients. If people insist on incorporating related files into a message, they can use aggregate documents. Now that remote content is used so widely, we have to live with the above drawbacks.

Proxying remote content

How remote content can be fetched via a proxy server in order to protect the user’s privacy to some degree: The mail client fetches the email from the mail server (step 1) and requests the remote content from the proxy server (step 2), which fetches it from the web server (step 3).

How to disable remote content

Link tracking

Emails often contain links to websites. Instead of linking to the target site directly, the sender can rewrite the link in such a way that your web browser sends a request to their tracking server, which in turn redirects your browser to the actual web server:

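How clicks on links can be tracked by the sender of a message: The mail client fetches the email from the mail server (step 1), the link is opened in the web browser (step 2), the web browser sends a request to the tracking server (step 3), the tracking server redirects the web browser (step 4), and the web browser sends a request to the web server (step 5). In general, you cannot trust the servers in orange.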

Unlike tracking pixels, link tracking also works in plaintext emails and when remote content is disabled. If the target website isn’t identifiable in the tracking link, you have no other choice than to request its address from the tracking server if you really want to see the advertised content. The sender can use tracking links to measure what percentage of recipients opened the link, which is known as the click-through rate (CTR). The same technique is often used on social media in combination with URL shortening to determine the reach of a post.
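The rewriting of links can be sketched in a few lines of Python. The tracking endpoint and the parameter names token and url are assumptions for this illustration; real tracking services use their own formats and often hide the target entirely.

```python
import secrets
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

TRACKING_ENDPOINT = 'https://tracker.example/click'  # hypothetical tracking server


def tracking_link(target: str, token: Optional[str] = None) -> str:
    """Wraps the target in a link to the tracking server with a per-recipient token."""
    token = token or secrets.token_urlsafe(8)
    return TRACKING_ENDPOINT + '?' + urlencode({'token': token, 'url': target})


def visible_target(link: str) -> Optional[str]:
    """Returns the target if it is identifiable in the query of the link."""
    query = parse_qs(urlsplit(link).query)
    for values in query.values():
        for value in values:
            if value.startswith(('http://', 'https://')):
                return value
    return None  # The target is known only to the tracking server.


link = tracking_link('https://example.com/offer?id=1', token='abc')
print(link)
print(visible_target(link))
print(visible_target('https://tracker.example/c/8f3a'))
```

If the target is identifiable as in the first link, you can visit it directly and skip the tracking server. In the second case, the tracking server has to be asked.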

Since seeing is believing, I wrote a little tool to track emails. You can generate a unique token and then subscribe to the associated events below. You can send the tracking link and the tracking image to someone using your mail client or the ESMTP tool above. I’ve deployed the tracking server on heroku.com. As you can see in its source code, my server doesn’t store anything but Heroku logs the last 1’500 requests, which includes the token, the link, and your IP address. I don’t persist the log file, but I might check it from time to time for troubleshooting. In order to determine where a request was made from, the tool uses the free API from ipinfo.io. You can also use the tool to see from where social media apps request a URL to generate a link preview or to convince yourself that the Tor browser indeed connects from a different location every time you restart it.


Security

Security and the lack thereof have been a topic throughout this article. In this section, I shine a light on some additional aspects.

Spoofing

As we saw earlier, the sender of an email can easily be spoofed because, at least historically, emails weren’t authenticated. Somewhat frustratingly, RFC 5321 and some companies see forged sender addresses more as a feature than as a bug. Criminals abuse this “feature” to trick unsuspecting users into performing actions or disclosing information, which they wouldn’t do otherwise. Exploiting the credulity of people is known as social engineering. Besides impersonating a trusted organization for phishing, a common attack is to send a victim an email which seemingly comes from their own address. In the message, the attacker claims that they’ve compromised the victim’s computer and that they’ve recorded the victim masturbating to porn. The attacker threatens to send the recording to all the victim’s contacts unless they receive a payment, usually in Bitcoin, within a couple of days. This form of blackmailing is known as sextortion. If you receive such an email yourself, how do you know that the attacker’s claim is wrong? First of all, you know now that the sender address of emails can easily be forged and that there is no reason to assume that your account has been compromised. But more importantly, if there was an easy way to increase the fraction of people who pay the ransom, criminals would certainly make use of it. In the case of sextortion, they would just have to include a screenshot of the recording and the addresses of some contacts to make presumably the large majority of people pay. Given that this is (usually) not the case, there’s no reason to worry. Do people fall for this crap? The answer is yes, unfortunately. The first time I received such a message was on 13 January 2019. The fraudster demanded 356 Euro in Bitcoin to remain silent and was stupid enough to provide the same Bitcoin address to several victims. Since all Bitcoin transactions are public, we know exactly how much money they made: 5.379 BTC, which was worth around 20’000 USD at the time. This also means that they had no way to know which of their victims actually paid, which made their threat even less credible to anyone who has a basic understanding of blockchains.

Besides social engineering, spoofed sender addresses can be abused where emails are used for authentication. For example, people can often unsubscribe from mailing lists via email. Even if this is not the case, many mailing lists automatically remove subscribers to whom several messages in a row couldn’t be delivered. Unless a mailing list uses unpredictable variable envelope return paths (VERP), bounce messages can easily be forged, which means that you can unsubscribe other people from the mailing list. Similarly, it’s often the case that only approved senders can send a message to all subscribers of a mailing list. Anyone who knows how to spoof emails can easily bypass this restriction and spam the mailing list.
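One way to make variable envelope return paths unpredictable is to append a keyed hash of the subscriber’s address. The following Python sketch illustrates the idea. The address format, the secret key, the domain, and the function names are assumptions for this example and not the scheme of any particular mailing list software.

```python
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b'replace-with-a-random-secret'  # hypothetical key, known only to the list server
LIST_DOMAIN = 'lists.example'                 # hypothetical domain of the mailing list


def _tag(subscriber: str) -> str:
    """Keyed hash which only the list server can compute."""
    digest = hmac.new(SECRET_KEY, subscriber.lower().encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


def return_path(subscriber: str) -> str:
    """Encodes the subscriber and an unpredictable tag into the envelope sender."""
    local, _, domain = subscriber.rpartition('@')
    return f'bounces+{local}={domain}+{_tag(subscriber)}@{LIST_DOMAIN}'


def subscriber_from_bounce(recipient: str) -> Optional[str]:
    """Returns the subscriber only if the tag of the bounce address is valid."""
    local_part, _, domain = recipient.rpartition('@')
    parts = local_part.split('+')
    if domain != LIST_DOMAIN or len(parts) < 3 or parts[0] != 'bounces':
        return None
    local, _, subscriber_domain = '+'.join(parts[1:-1]).rpartition('=')
    subscriber = f'{local}@{subscriber_domain}'
    if hmac.compare_digest(parts[-1].encode(), _tag(subscriber).encode()):
        return subscriber
    return None


path = return_path('alice@example.org')
print(path)
print(subscriber_from_bounce(path))
print(subscriber_from_bounce('bounces+alice=example.org+0000000000000000@lists.example'))
```

Without knowledge of the secret key, an attacker cannot construct a bounce address which the list server accepts for another subscriber.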

Email address spoofing can be prevented by enforcing domain authentication, which I’ll cover in the last chapter of this article.

Phishing

Impersonating a trusted organization to obtain sensitive information or payments from gullible users is known as phishing. Phishing emails often direct their victims to a fraudulent website which looks exactly like the legitimate website. By providing a pretext, the attacker tries to get the victims to perform a specific action, such as entering their username and password or initiating a payment with their credit card. Phishing attacks can target specific individuals or a diverse group of people. If they’re not just an advance-fee scam, they usually require some technical skills to execute them. This is why most phishing attacks are motivated by financial gain rather than a desire to harass or stalk the victim. While requesting a payment leads to a direct success for the criminals, usernames and passwords can be used to launch further attacks from the victim’s account. For example, the credentials of an employee can be used to infiltrate a company in order to obtain trade secrets or to install ransomware on their computers.

Phishing attacks come in all shapes and sizes, but you can reduce your risk by sticking to the following principles:

  • Always be suspicious: If an email prompts you to perform a certain action, your alarm bells should ring. Have you been prompted for similar actions before? Is the time frame to perform the action unusually short? Is there a reasonable default option if you don’t perform the action? Does the action involve the disclosure of sensitive information or a payment?
  • Don’t click on links: Phishing attacks require that you take the bait. Create a bookmark for all the websites where you have an account. Make it a habit to navigate to these websites yourself instead of following links. If an email says that a subscription is about to expire, log in to the website of the service provider with the bookmark and not the link. Using a bookmark (or a search engine) to navigate to a website is better than relying on the address autocompletion of your browser. If you clicked on a dubious link by mistake in the past, the fraudulent URL is still in your browser’s history and you may not be able to recognize it as such.
  • Hover over links: If you can’t suppress your urge to click on a link, move your mouse over the link first and verify whether the status bar at the bottom of the window indeed displays the address you want to visit. You should always do this because the text of a link can be misleading. For example, a link whose text reads www.google.com can take you to Bing, not Google. You should check the destination of a link before you click on it. If you check the destination of a link only in the opened browser window, you have already confirmed to the attacker that you click on links, and the visited website might have already infected your computer with malware. Unfortunately, link tracking can make it quite difficult to recognize whether the destination of a link is legitimate. Furthermore, not all companies prime their users to trust only a single domain. For example, PayPal, of all companies, directs their users to paypal-communication.com instead of paypal.com when informing them about changes to the general terms and conditions. Additionally, homograph attacks can make it difficult or even impossible to recognize that the target domain is not the legitimate one. This is one more reason why you shouldn’t click on links in the first place. The only exceptions to this rule are links to articles on which you won’t perform any actions. However, this means that you have to remember for each tab of your browser whether the address came from a trusted or an untrusted source. Anything you open on an untrusted page also cannot be trusted. Some mail clients, such as Apple Mail, don’t have a status bar and show the destination address in a tooltip instead. And yes, Apple Mail is smart enough to override any tooltips that a sender provided with the title attribute. I’ve tested this.
  • Use a password manager: Seriously. Password managers not only allow you to have a long, randomly-generated password for each website, they also prevent you from entering them on the wrong websites. To be precise, you can still paste your passwords into any input fields you like. Password managers just won’t do this for you if the domain is different. This is just one more level of defense, which is especially useful for innocuous-looking actions that don’t trigger your alarm bells. For example, some websites require you to log in before you can unsubscribe from their newsletter, and such an email and login can be bogus, of course.
  • Verify the sender: Who sent you the email? Since the sender of an email can (still) be spoofed, a trusted sender shouldn’t lower your level of suspicion much. If the domain in the From address doesn’t belong to the impersonated organization, though, you should almost certainly ignore and delete the message. Mail clients could do way more to protect their users from phishing attacks. For example, changing the policy for incoming emails from “allowed unless blocked” to “blocked unless allowed” would likely help a lot in shifting the mindset of users. Mail clients could also display the country of origin for each message, warn the user if the message isn’t authenticated or if the clicked link leads to a domain which is different from the From address, etc.
  • Disable display names: While spoofing sender addresses can be prevented by technical means, the sender can choose their display name at will. Since sender-chosen means attacker-chosen, users shouldn’t be confronted with unverified display names. Unfortunately, all the mail clients I’ve checked handle this aspect so badly that I had to write a separate box on this topic.
  • Confirm out-of-band: If a known sender asks you to perform an action which has far-reaching or irreversible consequences, contact the sender through a different communication channel and let them confirm the request before executing it. Obeying orders blindly is dangerous from a security perspective and subordinates should be trained and encouraged to question them.
Malicious display names

How Gmail warns its users when a known display name is used by an unknown sender.
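The check which you perform when hovering over a link can be partially automated. The following Python sketch flags links whose text looks like a domain name but whose destination is a different domain. The class and function names are made up for this illustration. The comparison is a crude heuristic: it doesn’t detect homograph attacks, and it flags legitimate tracking links as well.

```python
import re
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urlsplit

# Matches link texts which start like a domain name, optionally with a scheme.
DOMAIN_IN_TEXT = re.compile(r'^(?:https?://)?([a-z0-9-]+(?:\.[a-z0-9-]+)+)', re.IGNORECASE)


def _normalize(host: str) -> str:
    host = host.lower()
    return host[4:] if host.startswith('www.') else host


class LinkAuditor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.suspicious: List[Tuple[str, str]] = []
        self._href = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            text = ''.join(self._text).strip()
            match = DOMAIN_IN_TEXT.match(text)
            host = urlsplit(self._href).hostname or ''
            if match and _normalize(match.group(1)) != _normalize(host):
                self.suspicious.append((text, self._href))
            self._href = None


def misleading_links(html: str) -> List[Tuple[str, str]]:
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.suspicious


SAMPLE = (
    '<a href="https://www.bing.com/">www.google.com</a> '
    '<a href="https://example.org/article">example.org</a> '
    '<a href="https://example.com/">Click here</a>'
)
print(misleading_links(SAMPLE))
```

Only the first link of the sample is reported. The last link is not flagged because its text makes no claim about the destination, which is exactly why such links still have to be inspected manually.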

Confidentiality and integrity

As we saw earlier, the percentage of emails which are encrypted and authenticated in transit increased significantly over the last decade. When you send an email, though, there is no guarantee that the confidentiality and integrity of your message is protected when it is relayed from your outgoing mail server to the incoming mail server of the recipient. This is especially problematic when email is used to perform security-critical operations, such as password resets. Due to backward compatibility, the email protocols are secure only against passive attackers. I will cover the efforts to make email secure against active attackers in the last chapter.

In my opinion, mail clients should warn their users if the incoming mail server of one of the recipients doesn’t support strict transport security. You can increase the pressure on email service providers only by increasing the awareness of users. Gmail provides an easy way to see whether a received message has been authenticated and encrypted in transit, which allows users to assess the authenticity and, somewhat misleadingly, the confidentiality of a message at least after it has been transmitted:

gmail-message-details.png

You can click on the little triangle to see more details in Gmail’s web interface. mailed-by indicates a successful SPF check, signed-by a valid DKIM signature. security indicates that the outgoing mail server of the sender used STARTTLS.

Reliable delivery (availability)

Besides confidentiality and integrity, information security is also concerned with the availability of a service. Since your message might be silently discarded as spam or land in the recipient’s spam folder, which they don’t check on a regular basis, you can never be certain that a (new) recipient received your message in their inbox. Most people minimize this risk by not hosting their emails themselves. Once domain authentication is commonplace, which solves the problem of backscatter, we can hopefully fight spam with other techniques so that self-hosting becomes feasible again.

Custom email filters are another source of unreliability. When users receive too many emails which they don’t want or can’t handle, they are tempted to set up a rule which moves or deletes them automatically. Personally, I have a rule which deletes all messages which contain certain keywords, such as “lotto winner”, in their subject. I’ve recently also added some top-level domains, such as .cheap and .city, to this list. If the From address ends in one of these domains, the message is deleted immediately. My custom anti-spam rule, which also includes the domains of sales companies, does wonders for my inbox. The problem with custom email filters, though, is that they often work like shotguns: They certainly hit the messages you wanted to remove from your inbox, but due to their simplicity, they likely bring down legitimate messages as well. As long as senders send emails automatically, recipients will remove them automatically. Casualties are to be expected in such a setup.
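A rule like the one described above can be sketched in a few lines of Python. The function name and the sample messages are hypothetical, and the lists contain only the examples mentioned in the text. The third message shows how easily such a shotgun hits a legitimate sender.

```python
from email.utils import parseaddr

BLOCKED_SUBJECT_KEYWORDS = ('lotto winner',)
BLOCKED_TOP_LEVEL_DOMAINS = ('.cheap', '.city')


def should_delete(from_field: str, subject: str) -> bool:
    """Returns whether the simple filter deletes a message."""
    _, address = parseaddr(from_field)
    if address.lower().endswith(BLOCKED_TOP_LEVEL_DOMAINS):
        return True
    return any(keyword in subject.lower() for keyword in BLOCKED_SUBJECT_KEYWORDS)


# Hypothetical messages: two unwanted ones, a legitimate casualty, and a normal one.
messages = [
    ('Promo <win@deals.cheap>', 'Hello'),
    ('alice@example.org', 'You are our LOTTO WINNER'),
    ('Town Hall <office@springfield.city>', 'Your residence permit'),
    ('bob@example.org', 'Lunch?'),
]
for from_field, subject in messages:
    print(should_delete(from_field, subject), from_field, subject)
```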

Quoting HTML messages

HTML emails can be styled. When you reply to or forward such an email, your mail client has to make sure that the quoted message cannot change the appearance of your own message. If the quoted message isn’t escaped properly, an attacker can inject text into the victim’s response. When quoting an HTML message, mail clients need to ensure the following two things:

  • Scoped styles: The style of the quoted message may not leak into the surrounding message. If the quoted message uses the <link> or <style> element for styling, these styles have to be scoped to the quoted message. Achieving this would be trivial if the scoped attribute hadn’t been removed from the HTML specification. If we’re lucky, we might get an @scope selector in a future CSS standard. Browsers have started to support the Shadow DOM API, which can be used to encapsulate components with JavaScript. Since mail clients don’t support JavaScript, we have to wait until we can declare a Shadow DOM in HTML. So how do mail clients handle this? Gmail simply removes the <style> element when quoting an HTML message. If you want to make sure that your message is still displayed properly when it is replied to or forwarded, you have to keep inlining the styles. Apple Mail inlines internal CSS when quoting an HTML message. Yahoo Mail moves the <style> element from the <head> into the <body> and prefixes each rule with an ID, which it also assigns to the <div> element which contains the quoted message. Thunderbird only moves the <style> element from the <head> into the <body> and thus fails to scope the styles. Outlook.com behaves differently for replies and forwarded messages: It fails like Thunderbird in the former case and inlines styles incorrectly in the latter case.
  • No overlays: CSS can be used to move HTML elements away from their default position in a document. This becomes a problem when HTML elements in the quoted message can be moved above the attribution line since email users are trained to perceive everything above the attribution line as coming from the sender of the message. I can think of three ways in which HTML elements can be moved around with CSS, but I wouldn’t be surprised if CSS has more ways to achieve this. Firstly, there is the position property, with which elements can be moved relative to their default position or to an absolute position in the document. Secondly, the transform property can be used to translate HTML elements (and to scale and to rotate them). Thirdly, negative margins have a similar effect to position: relative;. Without having tested all possibilities, I have the impression that webmail clients handle this quite well. For example, Gmail doesn’t list position and transform under supported CSS properties and also removes negative margins before displaying a message. Desktop clients, on the other hand, struggle with this. position: absolute; and margin-top: -200px; work in Apple Mail and in Thunderbird. Restricting styles to inline CSS isn’t enough to scope the styles. Doing so would make it more difficult, though, to show the injected text only in the reply but not when composing the reply.
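A mail client could defuse the three techniques mentioned above by filtering the declarations of inline styles before quoting a message. The following Python sketch shows the idea for a single style attribute. The function name is made up, the splitting on semicolons is naive (it breaks on values which contain semicolons, such as data URLs), and the list of dangerous properties is certainly incomplete.

```python
import re

# Properties which can move an element away from its default position.
BLOCKED_PROPERTIES = ('position', 'transform')
NEGATIVE_LENGTH = re.compile(r'(^|\s)-\s*[\d.]')


def strip_overlay_styles(style: str) -> str:
    """Removes declarations from an inline style which can be used for overlays."""
    kept = []
    for declaration in style.split(';'):
        name, separator, value = declaration.partition(':')
        name, value = name.strip().lower(), value.strip()
        if not separator or not name:
            continue  # Skip empty or malformed declarations.
        if name in BLOCKED_PROPERTIES:
            continue
        if name.startswith('margin') and NEGATIVE_LENGTH.search(value):
            continue
        kept.append(f'{name}: {value}')
    return '; '.join(kept)


print(strip_overlay_styles(
    'color: red; position: absolute; top: 0; margin-top: -200px; '
    'transform: translateY(-50px); margin: 0 auto'
))
```

The top: 0 declaration is kept because it has no effect once the position property is gone.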

If analyzing the raw message before forwarding or replying to a message is too much to ask from you, you have only two options to avoid these issues: Choose a mail client which cares about your security or enforce that all messages are composed in plaintext. Apple Mail allows you to configure this in the “Composing” tab of your “Preferences”: Change the “Message format” to “Plain text” and disable “Use the same message format as the original message”. If you use Thunderbird, you can disable “Compose messages in HTML format” under the “Composition & Addressing” tab of your “Account Settings”.

Thunderbird example exploit

How to exploit Thunderbird’s failure to scope the styles of the quoted message. The attack uses the ::before pseudo-element to inject the text, and p[_moz_dirty] to hide the injected text during composition.

Outlook.com example exploit

How to exploit Outlook.com’s failure to scope the styles of the quoted message. Since Outlook.com doesn’t copy styles like div[style]:before and div:first-child:before to the reply, I had to abuse the <hr> element to make the injected text appear only once.

Different appearances

Another issue with email is that the same message can appear differently to different recipients. This is a problem whenever you refer to the content of an earlier message, no matter whether you quote the message or reference it in the In-Reply-To header field. Until mail clients address this issue, you must repeat the content you refer to. Emails can appear differently for three reasons:

  • multipart/alternative: Multipart messages can include different versions of the same content so that the mail client of the recipient can display the last version whose content type it supports. However, nothing guarantees that the various parts contain the same content. Spam filters might flag messages whose alternative parts diverge too much from one another, but determining whether different parts contain the same content is more difficult than it seems. Let’s look at an example:

    <html>
      <body>
        Hi boss, can you confirm to our accountant in Cc that my monthly salary is
        increased by USD 100<span style="font-size: 0;">0</span> starting next month?
      </body>
    </html>
    
    A simple HTML message whose plaintext version conveys a different content.

    If your boss uses an HTML-capable mail client, they will see USD 100 in the message. When your boss replies to this message with “Yes, that’s what we agreed.”, all the mail clients I usually mention in this article generate a Content-Type: text/plain version of the reply, which includes USD 1000. If you know that your accountant uses a plaintext-only mail client, this attack will work. On most HTML-capable mail clients, you can see the plaintext version only by inspecting the raw message. Thunderbird, however, allows you to change which part is being displayed by switching the “Message Body As” option in the “View” menu. If you use display: none; instead of font-size: 0;, Apple Mail won’t include the additional zero in the plaintext reply as it uses something like innerText to determine the plaintext content. There are plenty of ways to hide content with CSS, though, and the plaintext conversion algorithm would have to consider them all. Since computing what is actually being displayed is impractical, the solution has to be to force all content to render in HTML messages by disabling those CSS properties.

    Since the original message could already have contained conflicting alternative parts, mail clients which take security seriously should probably warn their users when they reply to multipart/alternative messages because most mail clients hide the quoted messages in email conversations. All the mail clients I’ve tested generate the quoted message in the plaintext part of the reply from the HTML part that they’ve displayed to the user. If it weren’t for malicious CSS styles, mail clients wouldn’t prepend your reply to content you haven’t seen. The only problem that remains is that the In-Reply-To header field doesn’t specify which alternative part your message refers to.

  • Conditional styles: Even without alternative parts, the same message can be rendered differently on different devices due to media queries. The following message shows a different text on devices with a small screen than on devices with a large screen:

    <html>
      <head>
        <style type="text/css">
          @media (max-width: 599px) {
            .large { display: none; }
          }
          @media (min-width: 600px) {
            .small { display: none; }
          }
          .touch { display: none; }
          @media (pointer: coarse) {
            .touch { display: inline; }
          }
        </style>
      </head>
      <body>
        <p>
          You have a
          <span class="large">large</span>
          <span class="small">small</span>
          <span class="touch">touch</span>
          screen.
        </p>
      </body>
    </html>
    
    A simple HTML message which is displayed differently on different devices.

    Media queries are useful to design websites for various screen sizes, which is known as responsive web design. Since emails are read on a wide variety of devices, media queries are an important technique to make them look good on all devices. Since media queries and selectors aren’t allowed in the style attribute, conditional rendering is much easier in mail clients which support internal or external CSS, which is the vast majority by now. In order to prevent this attack, Thunderbird no longer supports media queries. In my opinion, this is the wrong approach and the fix should rather be to force all content to render. Styles should affect only how content is displayed, not which content is being displayed. The supported media features vary greatly among clients. For example, the screen width media queries are supported by Gmail, Outlook.com, Yahoo Mail, and Apple Mail (also on iOS). The pointer media query, which can be used to detect a touch screen, is removed by the Gmail and Yahoo Mail webclients.

  • Different implementations: As long as different users use different mail clients which sanitize emails differently, attackers can draft messages which are displayed differently to different recipients. Since it’s easy to learn which mail client someone uses, it’s often not difficult to have some part of a message be shown or hidden for a specific recipient. I’ve drafted such a message for you:

    <html>
      <head>
        <style type="text/css">
          .apple-mail, .outlook { display: none; }
          @media (pointer) {
            .apple-mail { display: inline; }
            div .apple-mail { display: none; }
            div .outlook { display: inline; }
          }
          @media (min-width: 0px) {
            .thunderbird { display: none; }
          }
          p:first-child .gmail { display: none; }
          .yahoo-mail { display: none; }
          p:first-child .yahoo-mail { display: inline; }
          body .yahoo-mail { display: none !important; }
        </style>
      </head>
      <body>
        <p>
          You're reading this message
          <span class="apple-mail">in Apple Mail.</span>
          <span class="thunderbird">in Thunderbird.</span>
          <span class="gmail">on mail.google.com.</span>
          <span class="outlook">on outlook.live.com.</span>
          <span class="yahoo-mail">on mail.yahoo.com.</span>
        </p>
      </body>
    </html>
    
    A simple HTML message which is displayed differently by different clients.

As long as not all mail clients prevent senders from hiding content with CSS, email styling can be abused. Don’t we have the same problem with websites? In principle, yes, but the difference lies in the expectation of users. On the Web, you know that pages are often customized and that their content can change at any moment. In the case of email, however, you expect that everyone sees the same content, especially when you quote another message. If you reply to messages without quoting them, an attacker can deliver a different message with the same Message-ID to each of the recipients. As I wrote earlier: Just because someone is listed as another recipient doesn’t mean that they received the same message as you. The abuse of conditional CSS rules as a signing oracle was discovered and published by Jens Müller and his colleagues in 2019. The problem with diverging multipart/alternative parts was discussed thereafter in this Thunderbird issue.
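
To make the plaintext conversion problem more tangible, here is a minimal sketch in Python. It is not the algorithm of any actual mail client. It extracts the text of the salary example twice: once while ignoring all styles and once while skipping elements whose inline style contains display: none or font-size: 0. The names TextExtractor and to_plaintext as well as the short list of hiding patterns are assumptions made for this illustration. CSS offers many more ways to hide content, which is exactly why such a filter is hard to get right.

```python
import re
from html.parser import HTMLParser

# Only two of the many ways to hide content with CSS (an assumption of this sketch).
HIDING_STYLE = re.compile(r"display\s*:\s*none|font-size\s*:\s*0(?![.\d])", re.IGNORECASE)
# Void elements have no end tag and thus must not affect the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextExtractor(HTMLParser):
    """Collects the text of an HTML document, optionally skipping hidden elements."""

    def __init__(self, skip_hidden):
        super().__init__()
        self.skip_hidden = skip_hidden
        self.hidden_depth = 0  # Greater than zero while inside a hidden element.
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        if self.hidden_depth > 0 or (self.skip_hidden and HIDING_STYLE.search(style)):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth == 0:
            self.chunks.append(data)


def to_plaintext(html, skip_hidden):
    parser = TextExtractor(skip_hidden)
    parser.feed(html)
    parser.close()
    # Collapse all whitespace so that the result fits on one line.
    return " ".join("".join(parser.chunks).split())


MESSAGE = """<html><body>
Hi boss, can you confirm to our accountant in Cc that my monthly salary is
increased by USD 100<span style="font-size: 0;">0</span> starting next month?
</body></html>"""

print(to_plaintext(MESSAGE, skip_hidden=False))  # Style-agnostic conversion.
print(to_plaintext(MESSAGE, skip_hidden=True))   # Conversion that honors the two patterns.
```

The two calls disagree about the amount, which mirrors the difference between what an HTML-capable client displays and what a naive plaintext conversion quotes in the reply.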

Hide content with CSS

The many ways to hide text and images with CSS. If you can think of another way, let me know.

Complexity

Since you made it to this paragraph, I probably don’t need to convince you that email is incredibly complex. Email is a system that has been retrofitted to modern requirements for 40 years. It’s no wonder then that what we have today is a complicated patchwork of extensions. Just to be clear: I don’t want to criticize anyone in this section. Most of the design decisions that led us to the current situation were reasonable at the time. I still think it’s a good idea to assess what brought us here as this allows us to appreciate what we have now. In my view, the following limitations of early email are responsible for most of today’s complexity:

Benign inconsistencies

Unreasonable decisions

Innovation

Besides JMAP, dynamic content, and what we’ll discuss in the last chapter, there was barely any innovation over the last two decades. This is a pity given that email is the only decentralized communication service with global adoption. I can only speculate about the reasons for the lack of innovation:

  • Complexity: The enormous complexity of email can deter software engineers from entering the field. Patching a heavily patched system further is also not appealing to many young talents. I hope this article can motivate more people to shape the future of email in a positive way.
  • Fragmentation: The email ecosystem is so fragmented that no single organization can push the industry forward. The innovation that we see, such as email markup and dynamic content, often remains limited to just a few companies. If you want to write a mail client for a general audience, you have to support IMAP. If you have to deal with the intricacies of IMAP anyway, you don’t gain anything by implementing a newer access protocol such as JMAP as well. As long as all mail clients which people want to use support IMAP, existing email service providers have little incentive to support JMAP.
  • Saturation: The email market is saturated with free solutions for clients, servers, and hosting. The low willingness to pay for a product or service makes it really hard to build an innovative business in this space. Combined with the inertia of users, there is almost no economic pressure to innovate. Email service providers with a strong focus on privacy are the only exception to this rule because more and more people realize that if they don’t pay for a service, they’re the product and not the customer.

Format innovation

Since Skype failed to innovate, it was superseded by Zoom. WhatsApp is facing the same fate: Telegram is showing us how much room for innovation there is for a messaging app. There are plenty of features I would like to see in email. For a start, we still have no No-Reply header field, no Proof-Of-Work header field, no header field to reference the previous message by its hash (ideally using a Merkle tree for MIME parts so that attachments can be removed from a message without invalidating its hash), no header fields for the sender’s contact details to replace email signatures, no content type to initiate and reply to surveys, etc.
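
The Merkle tree idea can be sketched in a few lines of Python. This is purely hypothetical: no such header field exists, and the function names, the one-byte prefixes, and the choice of SHA-256 are assumptions of this sketch. The point is only that the root hash is computed from the hashes of the parts, so a part can later be replaced by its hash without changing the root.

```python
import hashlib


def leaf_hash(part):
    # The prefix distinguishes leaves from inner nodes of the tree.
    return hashlib.sha256(b"\x00" + part).digest()


def merkle_root(hashes):
    """Computes the root of a binary Merkle tree over a non-empty list of leaf hashes."""
    if not hashes:
        raise ValueError("at least one leaf hash is required")
    level = list(hashes)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest())
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # An unpaired hash is promoted unchanged.
        level = next_level
    return level[0]


body = b"Please find the report attached."
attachment = b"%PDF-1.7 (imagine a large file here)"

original_root = merkle_root([leaf_hash(body), leaf_hash(attachment)])

# Later, the attachment is deleted from the mailbox and only its hash is kept.
kept_attachment_hash = leaf_hash(attachment)
pruned_root = merkle_root([leaf_hash(body), kept_attachment_hash])

print(original_root == pruned_root)
```

A reply could then reference the root hash, and the reference would remain verifiable after the attachment has been removed.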

Some features, such as message compression, exist in theory but not in practice. Other features, which originated in the alternative email system X.400, were formally specified as IETF email header fields in order to increase compatibility between the two systems but were never recommended for general use. Among these header fields are Supersedes to replace a sent message with a revised version, Expires to indicate when a message loses its validity, and Reply-By to request a response in the specified time period.

Client innovation

Given the decentralized nature of email, protocol and format innovations are difficult to achieve. However, nothing hinders mail clients from innovating at the edge of the network. I’ve mentioned plenty of ideas throughout this article. Among them are sender approval, automatic challenges, Bcc recovery, privacy features such as proxying remote content via Tor (and even submitting emails via Tor as long as email service providers leak the IP addresses of their users), and security features such as preventing malicious display names and different appearances of messages. It would be great if my mail client displayed whether a received message was successfully authenticated with SPF and DKIM (just like Gmail). I would like to see native support for DNS-based autoconfiguration, Sieve and ManageSieve, as well as PGP. I don’t understand why mail clients separate the outbox from the inbox. (I don’t know any other messaging app which does this, and just because IMAP uses folders doesn’t mean you have to display them.) I think it would be great if my mail client could timestamp all the emails that I send. Whenever I submit a responsible disclosure, I do this manually.

Fixes

The last chapter of this article is dedicated to recent standards which address some of the aforementioned security issues. We’ll study how spoofing is prevented with domain authentication and how confidentiality and integrity are ensured in the presence of an active attacker with strict transport security. Many of the approaches rely on the Domain Name System (DNS) to provide additional information. This is secure only if the records are authenticated with DNSSEC. I will no longer mention this aspect in the remaining subsections. Some of the steps have to be performed by the owner of the domain rather than the email service provider. If you use a custom domain for your emails, you should definitely read the part about domain authentication to make sure that your domain is configured properly. Since email is a decentralized service, we can improve its security only in a collective effort.

Domain authentication

Historically, the sender of an email was not authenticated: Anyone could relay a message to anyone using any From address they wanted. Impersonating another sender is known as spoofing. The prevention of spoofing won’t eliminate spam and phishing on its own: Spammers can implement the following standards as well, and phishing remains possible with similar domains and malicious display names. It is, however, an important prerequisite for other techniques, such as flagging unknown senders. As we’ve seen earlier, email spoofing is addressed in two steps: The incoming mail server of the recipient verifies that the other party is authorized to send emails on behalf of the sender’s domain, and the outgoing mail server of this domain ensures that the local part of the From address belongs to the user who submitted the message.

The incoming mail server of the recipient authenticates the outgoing mail server of the sender and the outgoing mail server of the sender authenticates the user who submits the message.

As the title suggests, this subsection is only about the first part of the problem, namely how a domain owner can specify which mail servers are authorized to send messages on behalf of the domain and how receiving mail servers can verify whether the sending mail server is indeed authorized for the claimed domain. The second part is usually solved with password-based authentication mechanisms. The following techniques don’t prevent spoofing if the outgoing mail server of the sender is compromised or if the attacker can create an account at the same email service provider and impersonate another user during submission.

Before you continue, make sure that you understand the difference between a message and its envelope.

There are three complementary standards for domain authentication:

Adoption benefits

Domain owner

How to perform WHOIS queries yourself. Once the TCP connection is established, you just enter the domain name of interest followed by a new line. (Telnet should convert “return” into {CR}{LF} automatically.) As soon as the server has sent the answer, it closes the connection.

Privacy implications

Name chaining attacks

Sender Policy Framework (SPF)

The Sender Policy Framework (SPF) is specified in RFC 7208. As a domain owner, you list the IP addresses of your outgoing mail servers in a TXT record at your domain. Incoming mail servers then check whether the IP address of the sending mail server is included in the SPF record of the domain which was used in the MAIL FROM command. If this is not the case, incoming mail servers can reject the message during the SMTP session. RFC 7372 defines enhanced status codes with which the server can indicate failed SPF validation. Since IP addresses cannot be spoofed without access to the target’s local network, this procedure authenticates the sender’s domain.

So how do you create the SPF record for your domain? If you don’t run your email server yourself, your SPF record will consist of:

  • Version: Every SPF record has to start with v=spf1.
  • Includes: An SPF record can include the IP addresses of another SPF record. Search for the appropriate record from your email service provider. For example, put include:_spf.google.com into your SPF record if you use Google Workspace. Since mailing list providers such as Mailchimp use an address of their own in the MAIL FROM command so that they can handle bounce messages for you, you don’t need to add the addresses of their servers to your SPF record.
  • Default: Provide an explicit default result for any sender which didn’t match one of the previous mechanisms. If you want incoming mail servers to reject messages with a spoofed MAIL FROM domain, use -all. If you want incoming mail servers to just flag such messages as potentially fraudulent, use ~all. In order not to disrupt email forwarding, incoming mail servers are unlikely to enforce your SPF policy. They are much more likely to enforce the domain policy of your DMARC record.

An SPF record created according to the above steps looks as follows: v=spf1 include:_spf.google.com -all. On domains from which you don’t send any emails, you should use v=spf1 -all. The full syntax of SPF records is much more powerful than this but rarely needed. I will cover SPF in more detail in the boxes below. There are a lot of things that can go wrong when configuring an SPF record. For a start, a domain may have at most one SPF record, and the number of additional DNS lookups an SPF record may trigger is limited to ten by RFC 7208. Instead of listing all the pitfalls here, I’ve built a tool which performs 30 different checks on your SPF record. It uses Google’s DNS API to query the records. Please note that this tool warns you only about common mistakes; it doesn’t verify whether your outgoing mail servers are included in the record. You still have to test your setup by sending emails and checking the Received-SPF header field. Since the tool doesn’t evaluate whether an IP address passes SPF validation, it is also limited in other regards.
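
The evaluation performed by an incoming mail server can be sketched as follows. This is a strongly simplified illustration rather than a compliant implementation of RFC 7208: It supports only the ip4, ip6, include, and all mechanisms, it ignores the lookup limit, macros, and error results, and it reads the records from a hypothetical in-memory table instead of the DNS. The function name check_spf and all domain names and addresses are made up.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

# Hypothetical records which would normally be retrieved as TXT records from the DNS.
RECORDS = {
    "example.org": "v=spf1 include:_spf.example.net -all",
    "_spf.example.net": "v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 ~all",
    "unused.example": "v=spf1 -all",
}


def check_spf(ip, domain, records):
    """Returns the SPF result for the given client IP address and MAIL FROM domain."""
    terms = records.get(domain, "").split()
    if not terms or terms[0] != "v=spf1":
        return "none"  # The domain has no SPF record.
    address = ipaddress.ip_address(ip)
    for term in terms[1:]:
        qualifier = "+"  # The default qualifier is pass.
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        if name == "all":
            matched = True
        elif name in ("ip4", "ip6"):
            matched = address in ipaddress.ip_network(value, strict=False)
        elif name == "include":
            # An include matches only if the included record evaluates to pass.
            matched = check_spf(ip, value, records) == "pass"
        else:
            continue  # Mechanisms and modifiers not covered by this sketch.
        if matched:
            return QUALIFIERS[qualifier]
    return "neutral"  # The result if no mechanism matched.


print(check_spf("192.0.2.10", "example.org", RECORDS))
print(check_spf("198.51.100.7", "example.org", RECORDS))
```

Note that the ~all of the included record doesn’t leak into the including record: For the second address, the include simply doesn’t match and the evaluation continues with -all.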


Received-SPF header field

The format of the Received-SPF header field as specified in RFC 7208. The curly brackets need to be replaced with actual values. The content in square brackets is optional and the asterisk indicates that the preceding content can be repeated.

An example Received-SPF header field. The values are intended to make the result verifiable.

Protecting subdomains

Email forwarding

SPF qualifiers

The curly brackets need to be replaced with a qualifier, a mechanism or a modifier. The content in square brackets is optional. The asterisk indicates that the preceding content can be repeated.

The four qualifiers and the evaluation results to which they lead.

SPF mechanisms

SPF modifiers

SPF macros

HELO identity

TXT size limits

How dig displays the OPT pseudo resource record. dig increases the UDP payload limit to 4096. As you can also see, dig sets the do (DNSSEC OK) flag of EDNS there to ask for DNSSEC records.

SPF record type

DomainKeys Identified Mail (DKIM)

DomainKeys Identified Mail (DKIM) is specified in RFC 6376. (DomainKeys was a predecessor designed by Yahoo and the name survived the IETF standardization process.) DKIM allows a domain owner to take responsibility for a message by signing its body and selected header fields. Any mail server through which a message passes can add a DKIM signature. Unlike S/MIME and PGP, which alter the body of a message, DKIM signatures are added in a new header field, which makes them unobtrusive for users whose mail clients don’t support DKIM. Also unlike S/MIME and PGP, DKIM uses the Domain Name System as its public-key infrastructure, which is secure only in combination with DNSSEC. The owner of a domain can publish several public keys so that different servers can use different private keys for signing. The ability to publish several keys is also useful for introducing a new key before revoking an old one. Each public key is identified by a unique Selector within its Domain and is published in a TXT record at {Selector}._domainkey.{Domain}. The selector can contain periods, which allows large organizations to split the namespace of their DKIM keys into several administrative zones. Both the Domain and the Selector are included in the DKIM-Signature header field so that verifiers know how to retrieve the appropriate public key. Since the public key used to verify a signature is retrieved from the stated domain, a valid DKIM signature authenticates this domain.
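
How a verifier derives the location of the public key from a signature can be sketched as follows. The d and s tags are the ones defined by RFC 6376 for the domain and the selector. The function names and the signature value are made up for this illustration, and the parsing is simplified (it removes all whitespace in tag values and performs no validation).

```python
def parse_tag_list(value):
    """Parses 'name=value; name=value' into a dictionary, removing whitespace in values."""
    tags = {}
    for part in value.split(";"):
        name, separator, tag_value = part.partition("=")
        if separator:
            tags[name.strip()] = "".join(tag_value.split())
    return tags


def dkim_record_name(signature_value):
    """Returns the domain name at which the public key of the signature is published."""
    tags = parse_tag_list(signature_value)
    return f"{tags['s']}._domainkey.{tags['d']}"


# A made-up signature value; the body hash and the signature are placeholders.
SIGNATURE = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.org; s=2021.mail;\r\n"
    " h=from:to:subject:date; bh=PLACEHOLDER; b=PLACEHOLDER"
)

print(dkim_record_name(SIGNATURE))
```

The selector in this example contains a period, which means that the key is published in the mail._domainkey.example.org subtree and could be delegated to a separate zone.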

The standard doesn’t specify which entities add and verify DKIM signatures. Since DKIM keys are valid for the whole domain and cannot be restricted to individual users, messages are usually signed by the outgoing mail server after authenticating the user. If you are the only user of your domain and your email service provider doesn’t support DKIM, you could add a DKIM signature to a message before submitting it to the outgoing mail server but I’m not aware of any mail client which supports this. Since DKIM keys can be revoked at any time after a message has been delivered, DKIM signatures are typically verified by incoming mail servers, which record the result in the Authentication-Results header field for later use. While the mail client of the recipient could verify DKIM signatures as well, it would have to record the result before the signature expires. Since emails are usually synchronized to new mail clients via IMAP, the DKIM-verifying mail client would have to replace the messages in the user’s remote mailbox for archiving. As noted earlier, Gmail displays the domain which signed a message. Similar functionality can be added to Thunderbird through an add-on. Another reason for verifying DKIM signatures on incoming mail servers is to filter spam and phishing emails before they reach the user’s mailbox. DKIM doesn’t specify how to handle messages which don’t have a valid signature from a suitable domain. It simply allows the recipient to use the reputation of a domain to assess a given message. How a domain owner can ask recipients to reject emails which don’t pass domain authentication is the topic of the next subsection.

If your email service provider supports DKIM for custom domains, it likely has an article on how to generate a DKIM key for your domain and how to publish the public key in your DNS zone. For example, this guide shows you how to set up DKIM for Google Workspace. If you have to generate the signing key yourself, you will find the instructions to do so below. The following two tools help you generate and validate DKIM records. Instead of explaining the various configuration options here, you can simply hover your mouse over them to read a short description of their purpose in a tooltip. For a longer explanation, you can consult RFC 6376. The last three options are rarely used and just there for the sake of completeness. When RSA is used as the signing algorithm, the DKIM record can become quite long, which requires that the user interface of your domain name registrar splits the TXT data into strings of at most 255 bytes for you. Unlike SPF and DMARC, you don’t have to configure a DKIM record for unused domains. You should configure a TXT record with a value of v=DKIM1; p= at *._domainkey in your DNS zone only if you have a wildcard CNAME record and don’t trust the target domain not to spoof emails. The second tool uses Google’s DNS API to query the DKIM records. If it finds a record, it loads its content into the first tool to make it easier to analyze and modify existing DKIM records.
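
If your registrar doesn’t split long TXT data for you, the splitting can be done with a few lines of Python. The following is a minimal sketch with made-up function names. It assumes that the record consists of ASCII characters only and contains no quotation marks or backslashes, which would have to be escaped in a zone file. The key is a placeholder of 392 characters, which is the length of a Base64-encoded 2048-bit RSA public key.

```python
def split_txt_data(data, limit=255):
    """Splits the data of a TXT record into strings of at most `limit` bytes."""
    raw = data.encode("ascii")
    return [raw[i:i + limit].decode("ascii") for i in range(0, len(raw), limit)]


def to_zone_file_rdata(data):
    """Formats the strings as they are commonly written in zone files."""
    return " ".join(f'"{chunk}"' for chunk in split_txt_data(data))


record = "v=DKIM1; k=rsa; p=" + "A" * 392  # Placeholder instead of a real public key.
chunks = split_txt_data(record)

print([len(chunk) for chunk in chunks])
print(to_zone_file_rdata(record))
```

Verifiers concatenate the strings without inserting any separator, which is why the position of the split doesn’t matter.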


Non-repudiation

DKIM-Signature header field

An example DKIM-Signature header field from a message sent with Gmail.

The various tags of the DKIM-Signature header field. The last four tags are less common and the IANA registry lists even more tags.

Body and header canonicalization

Allow others to extend the body

Signing messages of subdomains

Replay attacks for spamming

Generating the signing key

Authorized Third-Party Signatures (ATPS)

Author Domain Signing Practices (ADSP)

Domain-based Message Authentication, Reporting and Conformance (DMARC)

Increasing the security of a decentralized system is always difficult because enforcing new requirements prematurely disrupts the reliability of the system. In order to minimize the disruption for users, all changes to the system have to be backward compatible. As we will see in the next section, some improvements involve only two parties. Email authentication, on the other hand, involves many parties and is, therefore, quite difficult to deploy. To authenticate emails, the outgoing mail server of the sender has to implement SPF and/or DKIM, email forwarders must not break the authentication, and the incoming mail server of the recipient has to verify the authentication. It only makes sense to authenticate emails if unauthentic emails are somehow penalized. In the short term, this could mean that unauthentic emails are quarantined as potential spam. In the long term, the goal should be to reject or discard all unauthentic emails even if they are delivered by a reputable mail server. While it is always up to the mail system of the recipient to decide what to do with incoming messages, it can enforce domain authentication only if just a small percentage of legitimate mail is affected by it. Domain-based Message Authentication, Reporting and Conformance (DMARC), which is specified in RFC 7489, allows domain owners to deploy domain authentication gradually, to monitor its effect on the delivery of their emails, to detect overlooked sources of legitimate mail, and to ask for strict enforcement when the amount of disruption seems acceptable.

There are three aspects to understanding DMARC:

  • Authentication: A message is considered to be authentic if the domain of the From address matches the SPF-authenticated domain of the MAIL FROM address or the domain of a valid DKIM signature. The owner of the sending domain can require the matching to be strict, in which case the domains have to be identical, or relaxed, in which case only the organizational domains after removing any subdomains have to be the same. This is known as identifier alignment and the alignment can be configured separately for SPF and DKIM. If the From field consists of several addresses, which is valid according to RFC 5322, the recipient can either reject the message or authenticate all domains and apply the strictest policy among the unauthentic domains.
  • Reports: Domain owners can ask receiving mail servers to send them aggregate reports in regular intervals and failure reports for messages which failed authentication. These DMARC reports allow domain owners to monitor their deployment of domain authentication, to detect unauthorized sources of legitimate messages, such as webshops and continuous integration systems, and to be informed immediately when their domain is abused for phishing.
  • Policies: Domain owners can specify how receiving mail servers shall handle unauthentic messages. If your domain doesn’t yet have a DMARC record (see below), you should start with aggregate reports and a domain policy of none. This allows you to be informed about authentication failures without affecting how unauthentic messages are handled. Once you’re confident that you’ve authorized all legitimate sources of email with SPF, you should set the domain policy to quarantine, which requests receiving mail servers to treat unauthentic emails as suspicious. Since it’s not under your control whether your recipients use alias addresses, you should move to the reject policy only once you’ve also deployed DKIM. Otherwise, your messages might not even reach the spam folder of your recipients if they use email forwarding.
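
Identifier alignment as described in the list above can be sketched as follows. Please note the loud assumption in organizational_domain: It simply keeps the last two labels of a domain, which is wrong for suffixes such as co.uk. Real implementations consult the Public Suffix List. All function names and domains are made up for this illustration.

```python
def normalize(domain):
    return domain.lower().rstrip(".")


def organizational_domain(domain):
    # ASSUMPTION: The public suffix consists of a single label (such as org or com).
    return ".".join(normalize(domain).split(".")[-2:])


def is_aligned(from_domain, authenticated_domain, strict=False):
    if strict:
        return normalize(from_domain) == normalize(authenticated_domain)
    return organizational_domain(from_domain) == organizational_domain(authenticated_domain)


def passes_dmarc(from_domain, spf_domain=None, dkim_domains=(), strict_spf=False, strict_dkim=False):
    """spf_domain is the MAIL FROM domain if SPF passed and None otherwise.
    dkim_domains are the domains of all valid DKIM signatures."""
    if spf_domain is not None and is_aligned(from_domain, spf_domain, strict_spf):
        return True
    return any(is_aligned(from_domain, domain, strict_dkim) for domain in dkim_domains)


# A newsletter service uses its own MAIL FROM domain but signs with the customer's key.
print(passes_dmarc("example.org", spf_domain="bounces.newsletter.example", dkim_domains=["example.org"]))
# After forwarding, SPF fails. Without a DKIM signature, nothing aligns.
print(passes_dmarc("example.org", spf_domain=None, dkim_domains=[]))
```

The second call illustrates why you shouldn’t move to the reject policy before deploying DKIM: A forwarded message loses its SPF authentication and has nothing else to fall back on.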

Domain owners publish their preferences in a TXT record at _dmarc.{Domain}. The following two tools help you generate and validate the DMARC record for your domain. Given the remarks above, most parameters should be self-explanatory. If this is not the case, you can hover your mouse over them to read a short description. All options are also documented in RFC 7489, of course. Domains which aren’t used to send emails should have a DMARC record of v=DMARC1; p=reject. If you want to be informed about spoofing attempts, you can also include a reporting address. The second tool uses Google’s DNS API to query the DMARC record of the given domain. If it finds a record, it loads its content into the first tool to make it easier to analyze and modify existing records. Even if you use the first tool to generate your DMARC record, you should check your record with the second tool as there are still many mistakes that you can make. For example, if reports shall be sent to a different domain, the report receiver has to approve this. Another example is that the subdomain policy has an effect only in DMARC records of organizational domains. The second tool performs more than twenty such checks and warns you about potential configuration errors.
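
To show how a receiver might interpret such a record, here is a minimal parsing sketch. The tag names and default values are the ones from RFC 7489, while the function name is made up and the validation is reduced to a minimum. In particular, this sketch insists on the p tag and doesn’t check the values of the other tags.

```python
# Default values according to RFC 7489 for tags which are absent from the record.
DEFAULTS = {"adkim": "r", "aspf": "r", "pct": "100", "ri": "86400", "fo": "0", "rf": "afrf"}


def parse_dmarc_record(record):
    """Returns the tags of a DMARC record with defaults applied."""
    tags = {}
    for part in record.split(";"):
        name, separator, value = part.partition("=")
        if separator:
            tags[name.strip()] = value.strip()
    # The version tag has to come first (dictionaries preserve the insertion order).
    if next(iter(tags), None) != "v" or tags["v"] != "DMARC1":
        raise ValueError("not a DMARC record")
    if "p" not in tags:
        raise ValueError("the domain policy is missing")
    policy = {**DEFAULTS, **tags}
    policy.setdefault("sp", policy["p"])  # Subdomains inherit the domain policy.
    return policy


policy = parse_dmarc_record("v=DMARC1; p=none; sp=quarantine; rua=mailto:dmarc@example.org")
print(policy["p"], policy["sp"], policy["aspf"], policy["pct"])
```

Receivers look for this record at _dmarc.{Domain} and fall back to the record of the organizational domain if the domain of the From address has none.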


Organizational domain

Subdomain policy

Unix time

Aggregate reports

An aggregate report, which was triggered by a Mailchimp signup and sent by Google. As you can see, the message passed both DKIM and SPF authentication. However, the MAIL FROM address didn’t align with the From address, which caused SPF to fail in the DMARC evaluation. As written earlier, this failure is intentional since Mailchimp wants to handle bounce messages for you, which prevents SPF from aligning.

Failure reports

The structure of failure reports. [A|B] means either A or B and […] stands for more header fields. multipart/report and text/rfc822-headers are specified in RFC 6522, message/rfc822 is specified in RFC 2046, and message/feedback-report is specified in RFC 5965. IANA maintains a list of feedback report header fields. The DMARC-specific Identity-Alignment header field is defined in RFC 7489.

Report approval

Authentication-Results header field

The format of the Authentication-Results header field as specified in RFC 8601. The curly brackets need to be replaced with actual values. The content in square brackets is optional and the asterisk indicates that the preceding content can be repeated.

The Authentication-Results header field added by Google when I send an email to Gmail. […] is the same comment as in the Received-SPF header field. Gmail doesn’t quite adhere to the standard. To begin with, it adds the Authentication-Results header field below the Received header field. Since SPF doesn’t authenticate the local part of the MAIL FROM address, it should not be included in the smtp.mailfrom property. Furthermore, I have no idea why Gmail includes header.i=@ef1p.com rather than header.d=ef1p.com in the DKIM result. To be fair, the RFC has one example using the d tag and one example using the i tag.

Authenticated Received Chain (ARC)

The ARC header fields that Gmail added to the message from which I took the Authentication-Results header field in the previous box.

DNS queries from your command line

Brand Indicators for Message Identification (BIMI)

Brand Indicators for Message Identification (BIMI) is an emerging standard, which allows mail clients to display the logo of the sending company for emails which passed DMARC authentication. It is specified in various drafts. Unlike SPF, DKIM, and DMARC, BIMI is not a domain authentication mechanism. The idea is that companies can refer to an SVG image, which needs to be certified by a certification authority, in a DNS record at their domain. By ensuring that trademarks cannot be abused by scammers, BIMI has the potential to eliminate homograph attacks and phishing. Another goal of BIMI is to increase DMARC adoption among companies which value marketing more than security.

The tool below uses Google’s DNS API to query the BIMI record of the given domain. BIMI records are identified with a selector just like DKIM records so that companies can use different logos for different purposes. The default selector is default. Google, Yahoo, and Fastmail are running BIMI pilots. Companies which already have a BIMI record include cnn.com, linkedin.com, and ebay.com.
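
Such a lookup can be sketched with Python’s standard library. The _bimi label is taken from the BIMI drafts rather than from the text above, and the JSON interface at dns.google/resolve is Google’s public DNS-over-HTTPS API. The function names are made up, and query_txt_records performs a network request, which is why it isn’t called in this sketch.

```python
import json
import urllib.parse
import urllib.request


def bimi_record_name(domain, selector="default"):
    """Returns the domain name at which the BIMI record is published."""
    return f"{selector}._bimi.{domain}"


def dns_query_url(name, record_type="TXT"):
    query = urllib.parse.urlencode({"name": name, "type": record_type})
    return f"https://dns.google/resolve?{query}"


def query_txt_records(name):
    """Queries Google's DNS API and returns the data of all TXT records in the answer."""
    with urllib.request.urlopen(dns_query_url(name), timeout=10) as response:
        result = json.load(response)
    # 16 is the numeric type of TXT records. There is no 'Answer' if the name has no records.
    return [answer["data"] for answer in result.get("Answer", []) if answer.get("type") == 16]


print(dns_query_url(bimi_record_name("example.org")))
```

The same two helper functions also work for the other records discussed in this chapter, for example with the names _dmarc.example.org or {Selector}._domainkey.example.org.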


BIMI DNS record

An imaginary BIMI record with reasonably short addresses. The files can be hosted on different domains.

BIMI header fields

How the BIMI results should be recorded. [a|b] means a or b. The values are described in the BIMI draft.

Verified mark certificate (VMC)

SVG Tiny Portable/Secure profile

Transport security

As discussed earlier, ESMTP uses the STARTTLS extension to upgrade an insecure TCP connection to a secure TLS connection:

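Client → Server: (Open TCP connection)
Server → Client: 220 server.example.com
Client → Server: EHLO client.example.org
Server → Client: 250-server.example.com
Server → Client: 250-PIPELINING
Server → Client: 250 STARTTLS
Client → Server: STARTTLS
Server → Client: 220 Go ahead
Client → Server: (Start TLS negotiation)
Client ↔ Server: (Negotiate a TLS session)
Client ↔ Server: (Continue with TLS)

The sequence diagram of STARTTLS.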

In order to remain backward compatible, the client can use the STARTTLS extension only if the server supports it. Since the server indicates support for STARTTLS over the insecure channel, an attacker who can intercept and alter the packets between the client and the server can simply strip the STARTTLS capability from the server’s response to the EHLO command:

Client → Attacker: (Open TCP connection)
Attacker → Server: (Open TCP connection)
Server → Attacker: 220 server.example.com
Attacker → Client: 220 server.example.com
Client → Attacker: EHLO client.example.org
Attacker → Server: EHLO client.example.org
Server → Attacker: 250-server.example.com, 250-PIPELINING, 250 STARTTLS
Attacker → Client: 250-server.example.com, 250 PIPELINING
Client ↔ Attacker: (Continue without TLS)
Attacker ↔ Server: (Continue with or without TLS)

How a man in the middle can prevent the two parties from upgrading their connection to TLS.

This attack is known as STRIPTLS. The problem is not Explicit TLS but rather the opportunistic use of TLS for the sake of backward compatibility. If the client is willing to continue without the security provided by TLS, Implicit TLS suffers from the same problem:

Client → Attacker: (Open TLS connection)
Client → Attacker: (Open TCP connection)
Attacker → Server: (Open TCP or TLS connection)
Server → Attacker: 220 server.example.com
Attacker → Client: 220 server.example.com
Client → Attacker: EHLO client.example.org
Attacker → Server: EHLO client.example.org

The attacker can drop the client’s TLS connection until it gives up and connects to the server with TCP.
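The difference between opportunistic and mandatory TLS can be sketched in a few lines of Python. This is only an illustration of the decision the client faces after the EHLO response; the function names and the `require_tls` flag are assumptions of this sketch, not part of any standard.

```python
def advertises_starttls(ehlo_response):
    """Check whether a multiline EHLO response lists the STARTTLS extension."""
    for line in ehlo_response.splitlines():
        # Each line is '250-keyword ...' or, for the last line, '250 keyword ...'.
        if len(line) >= 4 and line[:3] == "250" and line[3] in "- ":
            keyword = line[4:].split(" ")[0]
            if keyword.upper() == "STARTTLS":
                return True
    return False


def next_step(ehlo_response, require_tls):
    """Decide how the client proceeds after the EHLO response."""
    if advertises_starttls(ehlo_response):
        return "STARTTLS"
    # The extension is missing: either the server lacks it or an attacker removed it.
    return "abort" if require_tls else "continue in cleartext"


honest = "250-server.example.com\n250-PIPELINING\n250 STARTTLS"
stripped = "250-server.example.com\n250 PIPELINING"
print(next_step(honest, require_tls=False))    # STARTTLS
print(next_step(stripped, require_tls=False))  # continue in cleartext
print(next_step(stripped, require_tls=True))   # abort
```

The client cannot distinguish a stripped response from a server without TLS support, which is why it needs to learn from somewhere else whether TLS is to be expected.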

While the opportunistic use of TLS is also a problem for submission, access, and filtering protocols, mail clients always communicate with the same few servers and should not fall back to insecure communication after the initial configuration. Additionally, cleartext is considered obsolete for email submission and access. For these reasons, we’re interested only in securing ESMTP for Relay between the outgoing mail server of the sender and the incoming mail server of the recipient in this section. As discussed earlier, there are three ways to achieve secure transport in the presence of an active adversary without sacrificing backward compatibility:

  1. Previous connections: If previous connections were secure, abort when TLS is no longer available. MTA-STS works like this.
  2. Authenticated channel: The recipient can indicate support for TLS through an authenticated channel. DANE works like this.
  3. User configuration: Let the user require that their messages may be delivered only with TLS. REQUIRETLS makes this possible.

Server authentication

REQUIRETLS extension

How a client asks an ESMTP server to forward the message only with TLS to other servers which support REQUIRETLS as well.

How a client asks an ESMTP server to forward the message even if DANE or MTA-STS fails. At the moment, No is the only valid value.

DNS-Based Authentication of Named Entities (DANE)

DNS-Based Authentication of Named Entities (DANE) is specified in RFC 6698. RFC 7671 updates and clarifies some aspects of DANE and RFC 7672 specifies how DANE is applied to SMTP. DANE relies on DNSSEC for three different purposes:

  • DNS authentication: Domain names are used to reference services, which are often provided by external service providers. Since changes are easier if the service providers can manage their address records themselves, indirections with MX, SRV, and CNAME records are quite common in the Domain Name System. The same is true for security-related DNS records, such as TLSA records, which are introduced by DANE. (Officially, TLSA is not an acronym but simply the name of the record type. Personally, I like to think of TLSA as Transport Layer Security Anchor.) Letting the service providers configure the necessary TLSA records at their domains has some advantages. However, the TLSA records can be trusted only if the DNS records are authenticated with DNSSEC both in the zone of the customer and in the zone of the service provider. If the reply to the MX, SRV, or CNAME query can be spoofed by an attacker, the attacker can pose as the legitimate service provider to unsuspecting clients.
  • Downgrade resistance: By configuring TLSA records at the appropriate subdomain, a service provider indicates that its server supports TLS. Thanks to DNSSEC’s authenticated denial of existence, an attacker cannot suppress the retrieval of the TLSA records, which makes DANE resistant to downgrade attacks. Before you can deploy DANE, you have to deploy DNSSEC. If a client encounters an unsigned domain, it continues with opportunistic encryption. If a client learns from the superzone that the subzone is signed but cannot retrieve the signed TLSA records or a signed statement of their absence, it aborts the connection.
  • Trust anchor: In order to prevent a man-in-the-middle attack, the client has to authenticate the server. Instead of relying on the traditional public-key infrastructure (PKI), DANE requires service providers to put the public key of their server or the public key of a trust anchor of their choosing into their TLSA records. DANE clients then verify whether the server’s public key is confirmed directly or indirectly by one of the server’s TLSA records. Relying on DNSSEC rather than on traditional certification authorities (CAs) has several advantages.
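The downgrade resistance described above boils down to a small decision table, sketched here in Python. The four lookup outcomes and the returned strings are names chosen for this illustration; a real client obtains the outcome from a validating DNSSEC resolver, which is not shown. That a validated proof of absence leads to opportunistic TLS follows my reading of RFC 7672.

```python
from enum import Enum


class Lookup(Enum):
    UNSIGNED = "the zone is not signed with DNSSEC"
    SIGNED_RECORDS = "validated TLSA records were retrieved"
    SIGNED_ABSENCE = "validated proof that no TLSA records exist"
    FAILURE = "signed zone, but neither records nor a proof of absence"


def dane_action(result):
    """Map the outcome of the TLSA lookup to the behavior of the client."""
    if result is Lookup.SIGNED_RECORDS:
        return "require TLS and authenticate the server with DANE"
    if result is Lookup.FAILURE:
        # An attacker may have suppressed the records, so the client must not fall back.
        return "abort the connection"
    # Without usable TLSA records, the domain simply doesn't use DANE.
    return "opportunistic TLS"


for outcome in Lookup:
    print(f"{outcome.name}: {dane_action(outcome)}")
```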

The tool below queries the MX records of the given domain and the TLSA records of each mail server. It uses Google’s DNS API for the DNS queries and performs only rudimentary checks on the format of the TLSA records. It doesn’t validate whether DNSSEC and DANE are deployed correctly. If you want to check this, you can use this validator. I cover how you can generate and verify TLSA records yourself below. You can deploy DANE only if your email service provider supports it. If your email service provider has configured TLSA records for their servers, all that you have to do is to enable DNSSEC on your custom domain.


PKI comparison

TLSA record type

One of the TLSA records at _25._tcp.mail.protonmail.ch. (I cover the location of the record in the next box.)
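A TLSA record consists of three small numbers followed by the certificate association data. The following Python sketch decodes this presentation format and builds the name at which the record is located. The acronyms are the ones registered in RFC 7218; the function names and the example value are made up for illustration.

```python
from typing import NamedTuple

USAGES = {0: "PKIX-TA", 1: "PKIX-EE", 2: "DANE-TA", 3: "DANE-EE"}
SELECTORS = {0: "Cert", 1: "SPKI"}
MATCHING_TYPES = {0: "Full", 1: "SHA2-256", 2: "SHA2-512"}


class TLSA(NamedTuple):
    usage: str
    selector: str
    matching_type: str
    data: bytes


def tlsa_name(host, port=25, protocol="tcp"):
    """Return the name at which the TLSA records of a service are located."""
    return f"_{port}._{protocol}.{host.rstrip('.')}."


def parse_tlsa(rdata):
    """Parse the presentation format 'usage selector matching-type hex-data'."""
    usage, selector, matching_type, *hex_parts = rdata.split()
    return TLSA(
        USAGES[int(usage)],
        SELECTORS[int(selector)],
        MATCHING_TYPES[int(matching_type)],
        bytes.fromhex("".join(hex_parts)),  # The hex data may contain spaces.
    )


# Hypothetical record data, not the value published by any real server.
record = parse_tlsa("3 1 1 " + "ab" * 32)
print(tlsa_name("mail.protonmail.ch"))
print(record.usage, record.selector, record.matching_type, len(record.data))
```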

Which checks clients have to perform for each certificate usage according to RFC 7671.

Which combinations of DANE parameters you should and should not use.
The verbs in the recommendation column are specified in RFC 2119.
The last two rows are applicable only to SMTP for Relay on port 25.

TLSA record location

Multiple TLSA records

Name matching

Client behavior

How the client has to handle the various situations according to RFC 7672 and RFC 7673.

How to generate a TLSA record

How to verify a TLSA record

How to compute the certificate association data directly from the server certificate. 2> /dev/null suppresses the error output.
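For readers who prefer a script over a command-line pipeline, here is a minimal Python sketch of the same computation using only the standard library. It is not the command from the box above, and it covers only selector 0 (the full certificate): hashing just the public key (selector 1) requires extracting the SubjectPublicKeyInfo from the certificate, which the standard library cannot do. The function name is an assumption of this sketch.

```python
import hashlib
import ssl


def association_data(pem_certificate, matching_type=1):
    """Compute the TLSA association data for selector 0 (full certificate)."""
    der = ssl.PEM_cert_to_DER_cert(pem_certificate)
    if matching_type == 0:
        return der.hex()  # Full: the certificate itself.
    if matching_type == 1:
        return hashlib.sha256(der).hexdigest()  # SHA2-256
    if matching_type == 2:
        return hashlib.sha512(der).hexdigest()  # SHA2-512
    raise ValueError("Unknown matching type.")


# Placeholder bytes instead of a real certificate, just to make the sketch runnable.
placeholder = ssl.DER_cert_to_PEM_cert(b"not a real certificate")
print(association_data(placeholder))
```

To check a published record, compare the output with the hexadecimal data of a TLSA record whose selector is 0 and whose matching type equals the one you passed.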

DANE on outgoing mail server

HTTP Public-Key Pinning (HPKP)

A Public-Key-Pins response header field with a validity period of 30 days and a report URI for validation failures.

Mail Transfer Agent Strict Transport Security (MTA-STS)

Mail Transfer Agent Strict Transport Security (MTA-STS) is specified in RFC 8461. MTA-STS is a PKIX-based alternative to DANE for those who cannot or don’t want to deploy DNSSEC on their domain. It lets receiving domains indicate their support for PKIX-authenticated TLS with the following two resources:

  • DNS record: A TXT record is used to inform the sender that the receiving domain has an MTA-STS policy and whether the policy has been changed since the last time the sender retrieved it. Since small DNS records are retrieved with UDP, this is much faster than retrieving the policy file, which requires a TCP and a TLS handshake.
  • Policy file: The sender fetches the MTA-STS policy with HTTPS from the receiving domain. The MTA-STS policy indicates what the sender shall do if it cannot authenticate the incoming mail server of the recipient with the presented PKIX certificate. Since MTA-STS doesn’t require that DNS records are authenticated with DNSSEC, the policy file is also used to authenticate the MX records of the receiving domain. This allows clients to match the presented certificate against the name of the mail server.
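How a sender might use the policy file to authenticate the MX records can be sketched as follows. The key names (`version`, `mode`, `mx`, `max_age`) are those of RFC 8461, and the wildcard is assumed to stand for exactly one leftmost label. The function names and the example policy are made up for illustration; fetching the file with HTTPS is left out.

```python
def parse_policy(text):
    """Parse the 'key: value' lines of an MTA-STS policy file."""
    policy = {"mx": []}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if not separator:
            continue  # Skip blank or malformed lines.
        key, value = key.strip(), value.strip()
        if key == "mx":
            policy["mx"].append(value.lower())  # 'mx' can appear several times.
        else:
            policy[key] = value
    return policy


def mx_allowed(host, patterns):
    """Check whether the host of an MX record matches one of the policy's patterns."""
    host = host.rstrip(".").lower()
    for pattern in patterns:
        if pattern.startswith("*."):
            # The wildcard replaces exactly one label at the very left.
            _, _, parent = host.partition(".")
            if parent == pattern[2:]:
                return True
        elif host == pattern:
            return True
    return False


# Hypothetical policy, not the one of a real domain.
policy = parse_policy(
    "version: STSv1\nmode: enforce\nmx: mail.example.com\nmx: *.example.net\nmax_age: 86400\n"
)
print(policy["mode"], policy["mx"])
print(mx_allowed("mx1.example.net.", policy["mx"]))
```

If no pattern matches and the mode is `enforce`, the sender must not deliver the message to that mail server.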

The tool below queries the MTA-STS record and the policy file of the given domain. It uses Google’s DNS API for the DNS query and the email tracking server, which I’ve deployed on Heroku, as a proxy server. This is necessary because the policy file is usually served without the header field which is required for cross-origin resource sharing (CORS). As you can see in its source code, my proxy server doesn’t store anything but Heroku logs the last 1’500 requests, which includes the queried domain and your IP address. I don’t persist the log file but I might check it from time to time for troubleshooting. The tool checks the syntax of the DNS record and the policy file but it verifies neither the MX records nor whether the mail server has a valid PKIX certificate.


Comparison to DANE

Coexistence with DANE

Can DANE override MTA-STS validation? Unfortunately, the standard is silent on this.

MTA-STS DNS record

The TXT record at _mta-sts.gmail.com in April 2021.

MTA-STS policy file

The policy file at https://mta-sts.gmail.com/.well-known/mta-sts.txt in April 2021.

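HTTP Strict Transport Security (HSTS)

Client → Attacker: (Open TCP connection)
Client → Attacker: GET / HTTP/1.0, Host: www.example.com
Attacker → Server: (Open TCP connection)
Attacker → Server: GET / HTTP/1.0, Host: www.example.com
Server → Attacker: HTTP/1.0 301 Moved Permanently, Location: https://www.example.com/
Attacker ↔ Server: (Close TCP connection)
Attacker → Server: (Open TLS connection)
Attacker → Server: GET / HTTP/1.0, Host: www.example.com
Server → Attacker: HTTP/1.0 200 OK, [Headers and body]
Attacker ↔ Server: (Close TLS connection)
Attacker → Client: HTTP/1.0 200 OK, [Rewritten headers and body]
Client ↔ Attacker: (Close TCP connection)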

How a man in the middle can prevent the client from upgrading its TCP connection to TLS. If the attacker knows that the server redirects http://www.example.com/ to https://www.example.com/, it can skip the first connection to the server, of course.

A Strict-Transport-Security response header field with a validity period of 365 days.

STARTTLS Policy List

SMTP TLS Reporting (TLSRPT)

SMTP TLS Reporting (TLSRPT) is specified in RFC 8460. With it, domain owners can ask sending mail servers to report transport security failures to them, which allows them to detect misconfigurations and attacks. If you’re certain that all emails are still being delivered to you, you’re much more likely to enforce strict transport security. Just like DMARC reporting, TLSRPT uses a DNS record to specify the endpoints to which reports should be sent once a day.

The following tool queries the TLSRPT record with Google’s DNS API and checks its format with a regular expression:


TLSRPT DNS record

The TXT record at _smtp._tls.gmail.com. Whitespace is allowed before and after semicolons.

TLSRPT report format

A report which I’ve received from Google with the filename google.com!ef1p.com!1617926400!1618012799!001.json.gz in an email with the subject Report Domain: ef1p.com Submitter: google.com Report-ID: <[email protected]>. I’m not sure what total-successful-session-count means if the policy-type is no-policy-found. Would it count as a failure if TLS cannot be negotiated? Or are all sessions successful if no policy was found? You find an example report with failure details in RFC 8460.
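The filename of such a report encodes who sent it, which domain it covers, and which period it spans. The following Python sketch takes the filename from above apart, assuming the `sender!policy-domain!begin!end!unique-id` layout suggested by RFC 8460. The function name is made up for illustration.

```python
from datetime import datetime, timezone


def parse_report_filename(filename):
    """Split a TLSRPT report filename into its components."""
    name = filename
    for suffix in (".gz", ".json"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    sender, policy_domain, begin, end, *unique_id = name.split("!")
    return {
        "sender": sender,
        "policy_domain": policy_domain,
        # The timestamps are seconds since 1 January 1970 in UTC.
        "begin": datetime.fromtimestamp(int(begin), tz=timezone.utc),
        "end": datetime.fromtimestamp(int(end), tz=timezone.utc),
        "unique_id": unique_id[0] if unique_id else None,
    }


report = parse_report_filename("google.com!ef1p.com!1617926400!1618012799!001.json.gz")
print(report["sender"], report["policy_domain"])
print(report["begin"].isoformat())  # 2021-04-09T00:00:00+00:00
print(report["end"].isoformat())    # 2021-04-09T23:59:59+00:00
```

The two timestamps confirm that the report covers exactly one day.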

TLSRPT report conditions

Reporting in header fields

End-to-end security

Instead of relying on mail servers to perform domain authentication and enforce transport security, senders and recipients can take matters into their own hands and secure their communication themselves. This idea is often referred to as end-to-end encryption (E2EE). Since protecting the authenticity of the content is usually just as important as protecting its confidentiality, I prefer the term end-to-end security. As we’ve seen earlier, arbitrary content can be sent via email. As long as the sender and the recipient agree on which cryptographic algorithms and which encoding they want to use, they can use any technique they want, such as one-time pad encryption combined with a message authentication code (MAC). While end-to-end security doesn’t have to be standardized, doing so is valuable for two reasons: The more people use the same technique, the more useful it becomes for each user, which is known as a network effect, and if everyone uses the same technique, it can be integrated into mail clients, which makes it easier to use.
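To make the example with the one-time pad concrete, here is a minimal Python sketch. It assumes that the sender and the recipient have exchanged the pad and the MAC key over a secure channel beforehand. The choice of HMAC with SHA-256 and of authenticating the ciphertext rather than the plaintext are assumptions of this sketch, not something prescribed by the text above.

```python
import hashlib
import hmac
import secrets


def encrypt_then_mac(message, pad, mac_key):
    """Encrypt the message with the one-time pad and authenticate the ciphertext."""
    if len(pad) < len(message):
        raise ValueError("The pad must be at least as long as the message.")
    ciphertext = bytes(m ^ p for m, p in zip(message, pad))
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def verify_then_decrypt(ciphertext, tag, pad, mac_key):
    """Verify the authentication code before decrypting the ciphertext."""
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("The message was modified or the key is wrong.")
    return bytes(c ^ p for c, p in zip(ciphertext, pad))


message = "Hello, Bob!".encode("utf-8")
pad = secrets.token_bytes(len(message))  # Must never be used for another message.
mac_key = secrets.token_bytes(32)

ciphertext, tag = encrypt_then_mac(message, pad, mac_key)
print(verify_then_decrypt(ciphertext, tag, pad, mac_key).decode("utf-8"))
```

The resulting ciphertext and tag can be sent as an ordinary attachment. The hard part, namely exchanging a fresh pad for every message, is exactly what standardized techniques based on public keys avoid.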

[Figure: Mail client of sender → (SMTP for message submission) → Outgoing mail server of sender → (SMTP for message relay) → Incoming mail server of recipient → (IMAP or POP3 for message retrieval) → Mail client of recipient. The mail client of the sender also uses IMAP for message storage on the incoming mail server of the sender. The outgoing mail server of the recipient is shown as well.]

If the mail client of the sender encrypts and authenticates the message for the mail client of the recipient, none of the mail servers have to be trusted (beyond delivering or storing the message).

End-to-end security has two advantages:

  • No trust in mail servers required: In theory, if you don’t trust any email service provider, you can just host your emails yourself. In practice, however, running your own mail server on your own hardware is a hassle. Beyond the technical complexity of running a mail server, you may want to share the infrastructure with other users in order to reduce costs. Without end-to-end security, everyone has to trust the administrator of the mail servers. If you employ end-to-end security on all your messages, you can choose any free email service provider who delivers your messages reliably.
  • Secrets on clients instead of servers: In order to receive emails from anyone, incoming mail servers have to be reachable from anywhere on the Internet. If a security hole is found in the used software, mail servers become vulnerable immediately. Given that servers are typically shared by many users, they are a prime target for attacks. If you employ end-to-end security with all your contacts, an attacker who compromised your mail server can neither read your communication nor send messages in your name. If, on the other hand, your mail client or your computer is compromised, it’s over with and without end-to-end security.

End-to-end security also has some disadvantages:

  • No remote search: Your mail client has to store all messages locally if you want to be able to search for a message based on its content. Without end-to-end encryption, your mail client can ask your incoming mail server to perform the search using the SEARCH command of IMAP. While storing all messages locally is no longer a problem for modern smartphones, it might still be one for smartwatches. End-to-end security requires thick clients instead of thin clients.
  • No partial downloads: IMAP allows clients to fetch just certain parts of multipart messages. Since the body of a message is usually signed and encrypted as a single unit, bandwidth-constrained mail clients cannot download the text of an end-to-end secured message without its attachments.
  • Archiving of messages: If you lose your decryption key, you can no longer access your messages (unless your mail client stores them in plaintext on your computer). While this can be an annoyance for individuals, it can be a real problem for companies, who must archive their electronic communication in order to avert spoliation of potential evidence.
  • Message filtering: Mail servers cannot scan encrypted messages for malware or discard them as spam based on their content.

There are two main standards for end-to-end security: Secure/Multipurpose Internet Mail Extensions (S/MIME) and Pretty Good Privacy (PGP), which was standardized as OpenPGP. Unlike the other fixes in this chapter, both of them have existed since the 90s. The main difference between them is how public keys are authenticated, distributed, and revoked. Otherwise, they are quite similar.

| Aspect | Secure/Multipurpose Internet Mail Extensions (S/MIME) | Pretty Good Privacy (PGP) |
| Standards | RFC 8551: S/MIME formats; RFC 8550: Certificate handling; RFC 5652: Message syntax; RFC 3370: Algorithms | RFC 4880: OpenPGP formats; RFC 3156: Content types |
| Certificate format | RFC 5280: X.509 | Specific to OpenPGP |
| Public-key binding | Certification authorities | Web of trust |
| Public-key distribution (before relying on DNSSEC) | Attached to the message as part of the signature or through an internal directory | Public key server, personal website, or Autocrypt header field |
| Public-key revocation (before relying on DNSSEC) | Certificate Revocation Lists (CRL) or Online Certificate Status Protocol (OCSP) by the issuing certification authority | Key revocation signature by the owner of the key |
| DANE resource record type | SMIMEA | OPENPGPKEY |
| Content type for encryption | application/pkcs7-mime | application/pgp-encrypted |
| Content type for signature | application/pkcs7-signature | application/pgp-signature |
| Primary user group | Business world | Security specialists |
| Costs for users | You have to pay for the certificate but there are free offers for personal use | None |

Comparison of Secure/Multipurpose Internet Mail Extensions (S/MIME) and Pretty Good Privacy (PGP).

Modes of operation

[Figure: Sender: Message → Sign (private key of sender) → Signed message → Encrypt (public key of recipient) → Signed and encrypted message, which is what an attacker sees in transit. Recipient: Signed and encrypted message → Decrypt (private key of recipient) → Signed message → Verify (public key of sender) → Message, status.]

How a message is signed and encrypted by the sender and then decrypted and verified by the recipient. If you don’t know the public key of the recipient, you can only sign your message.

Deniable authentication

Which mechanism ensures which properties. More is not always better.

Compression before encryption

Multipart message nesting

S/MIME signature using the application/pkcs7-mime format.

S/MIME signature using the multipart/signed format. micalg stands for message integrity check (MIC) algorithm. The advantage of this format is that users can read the message even if their mail client doesn’t support S/MIME.

S/MIME encryption with integrity protection.

PGP signature with the message in the first part. The checksum is a 24-bit cyclic redundancy check (CRC). Some implementations use BEGIN PGP SIGNATURE instead of BEGIN PGP MESSAGE.
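The 24-bit checksum is simple enough to compute yourself. The following Python sketch is a port of the reference algorithm in RFC 4880 with its initial value and generator polynomial; the function names are my own. In an armored block, the checksum appears on its own line as an equals sign followed by four Base64 characters.

```python
import base64

CRC24_INIT = 0xB704CE
CRC24_POLY = 0x1864CFB


def crc24(data):
    """Compute the 24-bit cyclic redundancy check over the given bytes."""
    crc = CRC24_INIT
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= CRC24_POLY  # Also clears the bit which was shifted out.
    return crc & 0xFFFFFF


def armor_checksum(data):
    """Return the checksum line as it appears at the end of the armored data."""
    return "=" + base64.b64encode(crc24(data).to_bytes(3, "big")).decode("ascii")


print(hex(crc24(b"123456789")))  # 0x21cf02
print(armor_checksum(b""))       # =twTO
```

Note that the checksum is computed over the binary data before it is encoded with Base64. It detects transmission errors but provides no protection against deliberate modification.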

PGP encryption with metadata in the first part. If the plaintext has been signed, you get the format of the previous example without MIME-Version: 1.0 after decryption.

Securing header fields

PGP signature with protected header fields. The original subject is replaced with three dots only when the message is encrypted.

SMIMEA resource record

OPENPGPKEY resource record

SSHFP resource record

If you like my work, please consider supporting me with a donation so that I can keep publishing articles which are freely available. To be informed about new articles, follow this blog on Twitter, Reddit, or Telegram, or subscribe to its news feed using RSS/Atom. The copyright of this article and its graphics belongs to Kaspar Etter. You can share this article in any form as long as you give proper attribution.

