Network (Transport) Encryption for MongoDB

Why do I need Network encryption?

In our previous blog post MongoDB Security vs. Five ‘Bad Guys’ there’s an overview of five main areas of security functions.

Let’s say you’ve enabled #1 and #2 (Authentication, Authorization) and #4 (Storage encryption a.k.a. encryption-at-rest and Auditing) mentioned in the previous blog post. Only authenticated users will be connecting, and they will only be doing what they’re authorized to do. With storage encryption configured properly, the database data can’t be decrypted even if the server’s disk is stolen or accidentally given away.

You will have some pretty tight database servers indeed. However, consider the following movement of user data over the network:

Clients sending updates to the database (to a mongos, or mongod if unsharded).
A mongos or mongod sending query results back to a client.
Between replica set members as they replicate to each other.
mongos nodes retrieving collection data from the shards before relaying it to the user.
Shards exchanging data with each other when chunks of a sharded collection are being moved between them.

As it moves, the user collection data is no longer within the database ‘fortress’. It’s riding in plain, unencrypted TCP packets, and it can be read straight out of them with tcpdump or a similar tool, as shown here:

~$ #mongod primary is running on localhost:28051 in this example.
~$ #
~$ #In a different terminal I run:
~$ #  mongo --port 28051 -u akira -p secret  --quiet --eval 'db.getSiblingDB("payments").TheImportantCollection.findOne()'
~$ 
~$ sudo ngrep -d lo . 'port 28051'
interface: lo (127.0.0.0/255.0.0.0)
filter: ( port 28051 ) and ((ip || ip6) || (vlan && (ip || ip6)))
match: .
####
...
...
T 127.0.0.1:51632 -> 127.0.0.1:28051 [AP] #19
  ..........................find.....TheImportantCollection..filter.......lim
  it........?.singleBatch...lsid......id........-H..HN.n.`..}{..$clusterTime.
  X....clusterTime...../%9].signature.3....hash.........>.9...(.j. ..H4. .key
  Id.....fK.]...$db.....payments..                                           
#
T 127.0.0.1:28051 -> 127.0.0.1:51632 [AP] #20
  ....4................s....cursor......firstBatch......0......_id..........c
  ustomer.d....fn.....Smith..gn.....Ken..city.....Georgeville..street1.....1 
  Wishful St...postcode.....45031...order_ids.........id..........ns. ...paym
  ents.TheImportantCollection...ok........?.operationTime...../%9].$clusterTi
  me.X....clusterTime...../%9].signature.3....hash.........>.9...(.j. ..H4. .
  keyId.....fK.]...                                                          
#
T 127.0.0.1:51632 -> 127.0.0.1:28051 [AP] #21
  \....................G....endSessions.&....0......id........-H..HN.n.`..}{.
  ..$db.....admin..                                                          
#
T 127.0.0.1:28051 -> 127.0.0.1:51632 [AP] #22
  ....5.....................ok........?.operationTime...../%9].$clusterTime.X
  ....clusterTime...../%9].signature.3....hash.........>.9...(.j. ..H4. .keyI
  d.....fK.]...                                                              
###^Cexit

The key names and strings such as customer name and address are visible at a glance. This is proof that the TCP data isn’t encrypted. It is moving around in the plain. (You can use “mongoreplay monitor” if you want to see numeric and other non-ascii-string data in a fully human-readable way.)

(If you can unscramble the ascii soup above and see the whole BSON document in your head – great! But you failed the “I am not a robot” test so now you have to stop reading this web page.)

For comparison, this is what the same ngrep command prints when I change to using TLS in the client <-> database connection.

~$ #ngrep during: mongo --port 28051 --ssl --sslCAFile /data/tls_certs_etc/root.crt \
~$ #  --sslPEMKeyFile /data/tls_certs_etc/client.foo_app.pem -u akira -p secret --quiet \
~$ #  --eval 'db.getSiblingDB("payments").TheImportantCollection.findOne()'
~$ 
~$ sudo ngrep -d lo . 'port 28051'
interface: lo (127.0.0.0/255.0.0.0)
filter: ( port 28051 ) and ((ip || ip6) || (vlan && (ip || ip6)))
match: .
####
...
...
T 127.0.0.1:51612 -> 127.0.0.1:28051 [AP] #23
  .........5nYe.).I.M..H.T..j...r".4{.1\.....>...N.Vm.........C..m.V....7.nP.
  f..Z37......}..c?...$.......edN..Qj....$....O[Zb...[...v.....<s.T..m8..u.u3
  R.?....5;...$.F.h...][email protected]....."..F.M(^.b.....cv.../............\.z..9
  hY........Bz.QEu...`z.W..O@...\.K..C.....N..=.......}.                     
#
T 127.0.0.1:28051 -> 127.0.0.1:51612 [AP] #24
  .....*......4...p.t...G5!.Od...e}.......b.dt.\.xo..^0g.F'.;""..a.....L....#
  DXR.H..)..b.3`.y.vB{@...../..;lOn.k.$7R.]?.M.!H..BC.7........8..k..Rl2.._..
  .pa..-.u...t..;7T8s. z4...Q.....+.Y.\B.............B`.R.(.........~@f..^{.s
  .....\}.D[&..?..m.j#jb.....*.a..`. J?".........Z...J.,....B6............M>.
  ....J....N.H.).!:...B.g2...lua.....5......L9.?.a3....~.G..:...........VB..v
  ........E..[f.S."+...W...A...3...0.G5^.                                    
#
T 127.0.0.1:51612 -> 127.0.0.1:28051 [AP] #25
  ....m..m.5...u...i.H|..L;...M..~#._.v.....7..e...7w.0.......[p..".E=...a.?.
  G{{TS&.........s\..).U.vwV.t...t..2.%..                                    
#
T 127.0.0.1:28051 -> 127.0.0.1:51612 [AP] #26
  .....?..u.*.j...^.LF]6...I.5..5...X.....?..IR(v.T..sX.`....>..Vos.v...z.3d.
  .z.(d.DFs..j.SIA.d]x..s{7..{....n.[n{z.'e'...r............\..#.<<.Y.5.K....
  .....[......[6.....2......[w.5....H                                        
###^Cexit

No more plain data to see! The high-loss ascii format being printed by ngrep above can’t provide genuine satisfaction that this is perfectly encrypted binary data, but I hope I’ve demonstrated a quick, useful way to do a ‘sanity check’ that you are using TLS and are not still sending data in the plain.

Note: I’ve used ngrep because I found it made the shortest example. If you prefer tcpdump you can capture the dump with tcpdump <interface filter> <bpf filter> -w <dump file>, then open with the Wireshark GUI or view it with tshark -r <dump file> -V on the command line. And for real satisfaction that the TLS traffic is cryptographically protected data, you can print the captured data in hexadecimal / binary format (as opposed to ‘ascii’) and run an entropy assessment on it.
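
As a rough sketch of the entropy assessment mentioned above, the function below estimates the Shannon entropy (in bits per byte) of a file using only od, sort, uniq and awk. The function name byte_entropy is my own invention, and the input file is assumed to contain just the bytes you want to assess (for example a payload extracted from a capture), not a whole dump file with its headers.

```shell
# byte_entropy FILE
# Prints the Shannon entropy of FILE in bits per byte (0.000 to 8.000).
# Hypothetical helper for illustration; not part of any MongoDB tooling.
byte_entropy() {
    # od prints every byte as an unsigned decimal number; tr puts one
    # number per line so that sort | uniq -c can count each byte value.
    od -An -v -tu1 "$1" | tr -s ' ' '\n' | grep -v '^$' | sort -n | uniq -c |
        LC_ALL=C awk '
            { count[NR] = $1; total += $1 }
            END {
                if (total == 0) { print "0.000"; exit }
                for (i in count) {
                    p = count[i] / total
                    h -= p * log(p) / log(2)   # log() is the natural log in awk
                }
                printf "%.3f\n", h
            }'
}
```

Well-encrypted traffic should score close to 8 bits per byte on a reasonably large sample, while BSON full of readable field names scores noticeably lower. Treat it as a sanity check only: compressed data also scores high, so a high number does not prove the cryptography is sound.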

What’s the risk, really?

It’s probably a very difficult task for a hypothetical spy who was targeting you 1-to-1 to find and occupy a place in your network where they can just read the TCP traffic as a man-in-the-middle. But wholesale network scanners, who don’t know or care who any target is beforehand, will find any place that happens to have a gate open on the day they were passing by.

The scrambled look of raw TCP data to the human eye is not a distraction to them as it is to you, the DBA or server or application programmer. They’ve already scripted the unpacking of all the protocols. I assume the technical problem for the blackhat hackers is more a big-data one (too much copied data to process). As an aside, I hypothesize that they are already pioneering a lot of edge-computing techniques.

It is true that data captured on the move between servers might be only a tiny fraction of the whole data. But if you are making backups by the dump method once a day, and the traffic between the database server and the backup store server is being listened to, then it would be the full database.

How to enable MongoDB network encryption

MongoDB traffic is not encrypted until you create a set of TLS/SSL certificates and keys, and apply them in the mongod and mongos configuration files of your entire cluster (or non-sharded replica set). If you are an experienced TLS/SSL admin, or you are a DBA who has been given a certificate and key set by security administrators elsewhere in your organization, then I think you will find enabling MongoDB’s TLS easy: just distribute the files, reference them in the net.ssl.* options, and stop all the nodes and restart them. Enabling it gradually without downtime takes longer but is still possible with rolling restarts, stepping net.ssl.mode from disabled -> allowSSL -> preferSSL -> requireSSL (doc link), one step per round of restarts.
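
To make the net.ssl.* options a little more concrete, here is a minimal sketch of a shell helper that prints the relevant configuration fragment for one step of the rolling sequence. The function name print_net_ssl_conf and the file paths in the usage line are assumptions for illustration; only the option names and the four mode values come from the text above.

```shell
# print_net_ssl_conf MODE PEM_KEY_FILE CA_FILE
# Prints a net.ssl fragment for a mongod/mongos YAML config file.
# Hypothetical helper; merge the output into your real config by hand.
print_net_ssl_conf() {
    mode=$1 pem=$2 ca=$3
    case "$mode" in
        disabled|allowSSL|preferSSL|requireSSL) ;;
        *) echo "unknown net.ssl.mode: $mode" >&2; return 1 ;;
    esac
    cat <<EOF
net:
  ssl:
    mode: $mode
    PEMKeyFile: $pem
    CAFile: $ca
EOF
}
```

For example, print_net_ssl_conf preferSSL /data/tls_certs_etc/server.pem /data/tls_certs_etc/root.crt prints the fragment for the middle step. In a rolling upgrade you would change only the mode value between rounds of restarts and leave the file paths as they are.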

Conversely, if you are an experienced DBA and it will be your first time creating and distributing TLS certificates and keys, be prepared to spend some time learning about it first.

Possibly you thought you already enabled network encryption, at least for the internal TCP traffic between mongod and mongos nodes, when you created and deployed the keyfile for the security.keyFile option.

Admittedly it does look something like an SSH private key, but it is not. The long string value you made by extracting from /dev/random per the tutorial is just a string. It’s not part of an asymmetric cryptography key pair, where one binary value can be mathematically confirmed to have been encrypted or signed by another.

In short, it is a misunderstanding: it is not the net.ssl.PEMKeyFile option or a key like it. The security.keyFile file just holds a password (like a user password) that the nodes use to connect as the internal “__system” user to each other. Setting it hasn’t enabled TLS/SSL. For example, from a secondary’s server, or some hypothetical network node between it and the primary, you could listen to every write op as it comes through the oplog replication process.

The first ngrep network sniffing example in this article was from a replicaset with the security.keyFile option set whilst the net.ssl.* options were unset.

The way the certificates and PEM key files are created varies according to the following choices:

Whether you use an external certificate authority or make a new root certificate just for these MongoDB clusters
Whether you use them just for the internal system authentication between mongod and mongos nodes, or enable TLS for clients too
How strict you make the certificates (e.g. with host limitations)
Whether you need the ability to revoke certificates

To repeat the first point in this section: if you have a security administration team who already know and control these public key infrastructure (PKI) components – ask them for help, in the interests of saving time and being more certain you’re getting certificates that conform with internal policy.

Self-made test certificates

Percona Security Team note: This is not a best practice, even though it is in the documentation as a tutorial; we recommend you do not use this in production deployments.

So you want to get hands-on with TLS configuration of MongoDB a.s.a.p.? You’ll need certificates and PEM key files. Having the patience to fully master certificate administration would be a virtue, but you are not that virtuous. So you are going to use the existing tutorials (links below) to create self-signed certificates.

The quickest way to create certificates is:

Make a new root certificate
Generate server certificates (i.e. the ones the mongod and mongos nodes use for net.ssl.PEMKeyFile) from that root certificate
Generate client certificates from the new root certificate too
- Skip setting CN / “subject” fields that limit the hosts or domains the client certificate can be used on
Self-sign those certificates
Skip making revocation certificates
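
The steps above can be sketched with plain openssl commands. This is a test-only sketch under several assumptions: the function name make_test_certs, the file names, and the subject fields (O=MongoTestCorp, OU=testmongocluster, borrowed from the log sample later in this article) are illustrative, and it only covers the root and one server certificate. A client certificate would repeat steps 2 to 4 with its own names. Unlike the quickest route in the list, it does set the CN and a SAN entry to the host name you pass in.

```shell
# make_test_certs DIR HOSTNAME
# Creates a throwaway root certificate plus one server PEM file in DIR.
# For test clusters only; hypothetical helper, not a production procedure.
make_test_certs() {
    dir=$1 host=$2
    subj_base="/O=MongoTestCorp/OU=testmongocluster"
    mkdir -p "$dir" || return 1

    # 1. New root key and self-signed root certificate.
    openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
        -subj "$subj_base/CN=test-root-ca" \
        -keyout "$dir/root.key" -out "$dir/root.crt" || return 1

    # 2. Server key and certificate signing request.
    openssl req -newkey rsa:2048 -nodes \
        -subj "$subj_base/CN=$host" \
        -keyout "$dir/server.key" -out "$dir/server.csr" || return 1

    # 3. Sign the request with the root key, adding the host name as a SAN.
    printf 'subjectAltName=DNS:%s\n' "$host" > "$dir/server.ext"
    openssl x509 -req -in "$dir/server.csr" -days 365 \
        -CA "$dir/root.crt" -CAkey "$dir/root.key" -CAcreateserial \
        -extfile "$dir/server.ext" -out "$dir/server.crt" || return 1

    # 4. Key and certificate together in one file, the layout that
    #    net.ssl.PEMKeyFile expects.
    cat "$dir/server.key" "$dir/server.crt" > "$dir/server.pem"
}
```

The private keys are written unencrypted (-nodes) to keep the sketch short, which is one more reason to keep these files away from anything resembling production.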

The weakness in these certificates is:

A man-in-the-middle attack is possible (MongoDB doc link): “MongoDB can use any valid TLS/SSL certificate issued by a certificate authority or a self-signed certificate. If you use a self-signed certificate, although the communications channel will be encrypted, there will be no validation of server identity. Although such a situation will prevent eavesdropping on the connection, it leaves you vulnerable to a man-in-the-middle attack. Using a certificate signed by a trusted certificate authority will permit MongoDB drivers to verify the server’s identity.”
What will happen if someone gets a copy of one of them?
- If they get a client or server certificate, they will be able to decrypt traffic, or pose as a legitimate SSL peer, on the network edges to those nodes.
- When using self-signed certificates you distribute a copy of the root certificate with the server or client certificate to every mongod, mongos, and client app. I.e. it’s as likely to be misplaced or stolen as a single client or server certificate. With the root certificate, spoofing can be done on any edge in the network.
- You can’t revoke a stolen client or server certificate and cut them off from further access. You’re stuck with it. You’ll have to completely replace all the server-side and client certificates with cluster-wide downtime (at least for MongoDB < 4.2).

Examples on how to make self-signed certificates:

This snippet from MongoDB’s Kevin Adistimba is the most concise I’ve seen.
This replicaset setup tutorial from Percona’s Corrado Pandiani includes similar instructions with more MongoDB context on the page.

Reference in the MongoDB docs:

Various configuration file examples

Three detailed appendix entries on how to make OpenSSL Certificates for Testing.

Troubleshooting

I like the brevity of the SSL Troubleshooting page in Gabriel Ciciliani’s MongoDB administration cool tips presentation from Percona Live Europe ’18. Speaking from my own experience: before enabling them in the MongoDB config, it’s crucial to make sure the PEM files (both server and client ones) pass the ‘openssl verify’ test command against the root / CA certificate they’re derived from. Absolutely, 100% do this before trying to use them in your MongoDB config.
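
A small sketch of that pre-flight check, assuming a hypothetical helper name verify_pems. It runs openssl verify on each file against the given root / CA certificate and looks for the “OK” line in the output instead of trusting the exit status.

```shell
# verify_pems CA_FILE PEM_FILE...
# Prints OK or FAIL per file; returns non-zero if any file fails.
# Hypothetical helper wrapping the 'openssl verify' command.
verify_pems() {
    ca=$1; shift
    failed=0
    for pem in "$@"; do
        # openssl verify prints "<file>: OK" on success.
        if openssl verify -CAfile "$ca" "$pem" 2>/dev/null | grep -q ': OK$'; then
            echo "OK   $pem"
        else
            echo "FAIL $pem"
            failed=1
        fi
    done
    return "$failed"
}
```

Usage would look like verify_pems /data/tls_certs_etc/root.crt followed by each server and client PEM file you intend to deploy. The root.crt path is the one from the earlier example. The PEM file names are yours to fill in.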

If “openssl verify”-confirmed certificates still create a MongoDB replicaset or cluster that is unconnectable, then add the --sslAllowInvalidHostnames option when connecting with the mongo shell, and/or net.ssl.allowInvalidHostnames in the mongod/mongos configuration. This is a differential diagnosis to see if the hostname requirements of the certificates are the only thing causing the SSL rules to reject the certificates.

If you find it takes --sslAllowInvalidHostnames to make it work it means the CN subject field and/or SAN fields in the certificate need to be edited until they match the hostnames and domains that the SSL lib identifies the hosts as. Don’t be tempted to just conveniently forget about it; disabling hostname verification is a gap that might be leveraged into a man-in-the-middle attack.
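
To see which names a certificate actually carries before you start editing, something like the following sketch can help. The helper name show_cert_names is made up. It only prints the subject line and any Subject Alternative Name entry so you can compare them by eye with the hostnames the nodes use for each other.

```shell
# show_cert_names PEM_FILE
# Prints the certificate subject (which contains the CN) and the SAN
# entries, if any. Hypothetical helper for eyeballing hostname mismatches.
show_cert_names() {
    openssl x509 -in "$1" -noout -subject || return 1
    openssl x509 -in "$1" -noout -text |
        grep -A1 'Subject Alternative Name' ||
        echo "(no SAN extension found)"
}
```

Run it against each server PEM file and check that the CN or a SAN entry matches the hostname the client or peer node connects to.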

If you are still experiencing trouble, my next step would be to check the mongod logs. You will find lines matching the grep expression ‘NETWORK .*SSL’ in the log if there are rejections. (This might become “TLS” later.) E.g.

2019-07-25T16:34:49.981+0900 I NETWORK [conn11] Error receiving request from client: SSLHandshakeFailed: SSL peer certificate validation failed: self signed certificate in certificate chain. Ending connection from 127.0.0.1:33456 (connection id: 11)

You might also try grepping for '[EW] NETWORK' to look for all network errors and warnings.
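
Both grep expressions can be combined into one pass over the log. A minimal sketch follows. The function name tls_log_problems is hypothetical and the log path is whatever your systemLog configuration points to.

```shell
# tls_log_problems LOGFILE
# Prints log lines that mention SSL in the NETWORK component, plus any
# NETWORK line logged at error (E) or warning (W) severity.
tls_log_problems() {
    grep -E 'NETWORK .*SSL|[EW] NETWORK' "$1"
}
```

An empty result (grep exits non-zero) means neither pattern matched, which is what you hope to see once the certificates are being accepted.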

For SSL there is no need to raise the logging verbosity to see errors and warnings. From what I can see in ssl_manager_openssl.cpp those all come at the default log verbosity of 0. Only if you want to confirm normal, successful connections would I advise briefly raising log verbosity in the config file to level 2 for the exact log ‘component’ (in this case this is the “network”). (Don’t forget to turn it off soon after – forgetting you set log level 2 is a great way to fill your disk.) But for this topic the only thing I think log level 2 will add is “Accepted TLS connection from peer” confirmations like the following.

2019-07-25T16:29:41.779+0900 D NETWORK [conn18] Accepted TLS connection from peer: [email protected],CN=svrA80v,OU=testmongocluster,O=MongoTestCorp,L=Tokyo,ST=Tokyo,C=JP

Take a peek in the code

Certificate acceptance rules are a big topic and I am not the author to cover it. But take a look at the SSLManagerOpenSSL::parseAndValidatePeerCertificate(…) function in ssl_manager_openssl.cpp as a starting point if you’d like to be a bit more familiar with how MongoDB applies them.
