MongoDB Backup: How and When To Use PSMDB hotbackup and mongodb_consistent_backup


We have many methods to back up a MongoDB database, using the native mongodump or external tools. In this article, however, we'll take a look at the backup tools offered by Percona, keeping in mind the restoration scenarios for MongoDB replicaSet and Sharded Cluster environments. We'll explore how and when to use the tool mongodb-consistent-backup from Percona Lab to back up the database consistently in Sharded Cluster/replicaSet environments. We'll also take a look at hotbackup, a tool that's available in the Percona Server for MongoDB (PSMDB) packages.

Backup is done – What about Restore?

Those who are responsible for data almost always think about how to back up the database and store the backups securely. But they often fail to foresee the scenario where the backup needs to be used to restore data. For example, I have unfortunately seen many companies schedule the backups of their config servers and shard servers separately, so that the backups start and complete at different times depending on data volumes. But can we use such a backup when we need to restore and start the cluster with it? The answer is no. Well, maybe yes if you can tweak the metadata, but data inconsistency may occur. With this backup schedule, the backup is not consistent for the whole cluster: there is no single point to which we can restore the data for all shards and config dbs so that we can start the cluster from that point. Consequently, we face a difficult situation exactly when we really need to use that backup!

Let’s explore the two tools/features available to backup MongoDB from Percona, and look at which method to choose based on your restoration plan.

Hot backup for both replicaSet and Sharded Cluster:

The main problem with backup is maintaining consistency, as the application keeps writing to the DB while the backup is in progress. To maintain consistency throughout the backup, and get a reliable full backup of all the data needed to restore the database, the backup tool needs to track changes via the oplog as well. Using the mongodump utility along with an oplog backup achieves this easily in a replicaSet environment, since you only need consistency for that one replicaSet.
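For a single replicaSet, a minimal sketch of that approach looks like the following; the hostnames, replicaSet name, and output path here are hypothetical:

```
# Dump the replicaSet; --oplog also captures the writes made while the
# dump is running, so the restore reflects a single point in time
# (the moment the dump finished).
mongodump --host rs0/db1.example.com:27017 --oplog --out /backup/rs0

# On restore, replay the captured oplog entries on top of the dump.
mongorestore --oplogReplay /backup/rs0
```

Note that --oplog works only when dumping an entire replicaSet member, not with a per-database or per-collection dump.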

But when we need a consistent backup of a Sharded Cluster, total cluster consistency is very difficult to achieve, as it involves backing up all shards and config servers together up to a particular point, so the backup can be reused in failover cases. Even if you run mongodump manually against each shard and the config servers separately, taking a consistent backup of the whole cluster while writes are being made is a very tedious job: the backup of each shard ends at a different point depending on factors such as load and data volume.

To remedy this, we can take a consistent hot backup of the Sharded Cluster by using our utility mongodb-consistent-backup. In other words, a point-in-time backup for the sharded cluster environment. This utility internally uses mongodump and tails the oplog changes from each node until the backups from all data nodes and config servers are complete. This ensures that the backup of the total Sharded Cluster is consistent! You have to make sure you are using a replicaSet for your config servers too. In fact, this tool also helps you take a consistent backup in a replicaSet environment.

This utility is available in Percona Lab, but please note that it is not yet officially supported. To install this package, make sure you install all the dependency packages, and follow the steps mentioned in this link to complete the installation process.

If you have enabled authentication in your environment, then create a user like below:

db.createUser({
	user: "backup_usr",
	pwd: "backup_pass",
	roles: [
		{ role: "clusterMonitor", db: "admin" }
	]
})

The backup can be taken as follows by connecting to one of the mongos nodes in the Sharded Cluster. Here mongos is running on port 27051, and the cluster has one config replicaSet cfg and two shards, s1 and s2.

[root@app mongodb_consistent_backup-master]# ./bin/mongodb-consistent-backup -H localhost \
> -P 27051 \
> -u backup_usr \
> -p backup_pass \
> -a admin \
> -n clusterFullBackup \
> -l backup/mongodb
[2018-12-05 18:57:38,863] [INFO] [MainProcess] [Main:init:144] Starting mongodb-consistent-backup version 1.4.0 
(git commit: unknown)
[2018-12-05 18:57:38,864] [INFO] [MainProcess] [Main:init:145] Loaded config: {"archive": {"method": "tar", "tar": 
{"binary": "tar", "compression": "gzip"}, "zbackup": {"binary": "/usr/bin/zbackup", "cache_mb": 128, "compression": "lzma"}}, 
"authdb": "admin", "backup": {"location": "backup/mongodb", "method": "mongodump", "mongodump": {"binary": "/usr/bin/mongodump", 
"compression": "auto"}, "name": "clusterFullBackup"}, "environment": "production", "host": "localhost", "lock_file": 
"/tmp/mongodb-consistent-backup.lock", "notify": {"method": "none"}, "oplog": {"compression": "none", "flush": {"max_docs": 100, 
"max_secs": 1}, "tailer": {"enabled": "true", "status_interval": 30}}, "password": "******", "port": 27051, "replication": 
{"max_lag_secs": 10, "max_priority": 1000}, "sharding": {"balancer": {"ping_secs": 3, "wait_secs": 300}}, "upload": {"method": 
"none", "retries": 5, "rsync": {"path": "/", "port": 22}, "s3": {"chunk_size_mb": 50, "region": "us-east-1", "secure": true}, 
"threads": 4}, "username": "backup_usr"}
...
...
[2018-12-05 18:57:40,715] [INFO] [MongodumpThread-5] [MongodumpThread:run:204] Starting mongodump backup of s2/127.0.0.1:27043
[2018-12-05 18:57:40,722] [INFO] [MongodumpThread-7] [MongodumpThread:run:204] Starting mongodump backup of cfg/127.0.0.1:27022
[2018-12-05 18:57:40,724] [INFO] [MongodumpThread-6] [MongodumpThread:run:204] Starting mongodump backup of s1/127.0.0.1:27032
[2018-12-05 18:57:40,800] [INFO] [MongodumpThread-5] [MongodumpThread:wait:130] s2/127.0.0.1:27043:	Enter password:
[2018-12-05 18:57:40,804] [INFO] [MongodumpThread-6] [MongodumpThread:wait:130] s1/127.0.0.1:27032:	Enter password:
[2018-12-05 18:57:40,820] [INFO] [MongodumpThread-7] [MongodumpThread:wait:130] cfg/127.0.0.1:27022:	Enter password:
...
...
[2018-12-05 18:57:54,880] [INFO] [MainProcess] [Mongodump:wait:105] All mongodump backups completed successfully
[2018-12-05 18:57:54,892] [INFO] [MainProcess] [Stage:run:95] Completed running stage mongodb_consistent_backup.Backup with task 
Mongodump in 14.21 seconds
[2018-12-05 18:57:54,913] [INFO] [MainProcess] [Tailer:stop:86] Stopping all oplog tailers
[2018-12-05 18:57:55,955] [INFO] [MainProcess] [Tailer:stop:118] Waiting for tailer s2/127.0.0.1:27043 to stop
[2018-12-05 18:57:56,889] [INFO] [TailThread-2] [TailThread:run:177] Done tailing oplog on s2/127.0.0.1:27043, 2 oplog changes, 
end ts: Timestamp(1544036268, 1)
[2018-12-05 18:57:59,967] [INFO] [MainProcess] [Tailer:stop:118] Waiting for tailer s1/127.0.0.1:27032 to stop
[2018-12-05 18:58:00,801] [INFO] [TailThread-3] [TailThread:run:177] Done tailing oplog on s1/127.0.0.1:27032, 3 oplog changes, 
end ts: Timestamp(1544036271, 1)
[2018-12-05 18:58:03,985] [INFO] [MainProcess] [Tailer:stop:118] Waiting for tailer cfg/127.0.0.1:27022 to stop
[2018-12-05 18:58:04,803] [INFO] [TailThread-4] [TailThread:run:177] Done tailing oplog on cfg/127.0.0.1:27022, 8 oplog changes, 
end ts: Timestamp(1544036279, 1)
[2018-12-05 18:58:06,989] [INFO] [MainProcess] [Tailer:stop:125] Oplog tailing completed in 27.85 seconds
...
...
[2018-12-05 18:58:09,478] [INFO] [MainProcess] [Rotate:symlink:83] Updating clusterFullBackup latest symlink to current backup 
path: backup/mongodb/clusterFullBackup/20181205_1857
[2018-12-05 18:58:09,480] [INFO] [MainProcess] [Main:run:461] Completed mongodb-consistent-backup in 30.49 sec

where,

-H – hostname
-P – port
-u – user
-p – password
-a – authentication database
-n – name of the backup directory to be created
-l – backup location directory

The log above shows the backup in progress: the tool captures the state of the oplog and applies the changes as they come in. The same command can be used to connect to a replicaSet by supplying the proper hostname. The tool also identifies whether it is talking to a replicaSet or a Sharded Cluster before proceeding with the backup. This can be seen in the log output written by the tool when running the backup:

For a Sharded Cluster:

[2018-12-05 19:05:02,453] [INFO] [MainProcess] [Main:run:299] Running backup in sharding mode using seed node(s): localhost:27051

For a replicaSet:

[2018-12-05 19:23:05,070] [INFO] [MainProcess] [Main:run:257] Running backup in replset mode using seed node(s): localhost:27041

You can check out a couple of our blog posts here and here for more details about the utility.

Hot but Cold backup

You may be wondering about the title "Hot but Cold backup". Yes, the Percona Server for MongoDB (PSMDB) packages include a feature to take a binary hot backup, called hotbackup. Those who know the MySQL world will already know Percona XtraBackup, our free and open source binary hot backup utility for MySQL. PSMDB hotbackup works in a similar way: when you back up with hotbackup, you get a binary backup from which you can start an instance directly, using the backup directory as the data directory. You don't need to worry about restoring from scratch and recreating indexes. However, this solution works for replicaSet and standalone MongoDB instances only.
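Per the PSMDB documentation, hotbackup is triggered from the mongo shell with the createBackup admin command; the backup directory below is just an example path, which must already exist and be writable by the mongod process:

```
// Connect to the PSMDB node you want to back up, then:
use admin
db.runCommand({ createBackup: 1, backupDir: "/data/backups/node1" })
// A reply of { "ok" : 1 } indicates the binary backup completed.
```

The resulting directory is a full copy of the data files, so a new mongod can be pointed at it directly with --dbpath.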

If you plan well, you could feasibly use this feature to back up a Sharded Cluster: bring down one of the secondaries from every shard and the config servers at the same time (ideally when there is low or no write traffic), then start them on a different port and without the replicaSet option, so that those instances won't rejoin their replicaSet. Now run hotbackup on all of those instances. Once they have finished, revert the changes in the config file and allow the nodes to rejoin their replicaSet.
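For one shard, that procedure might be sketched as follows; the ports, dbpath, and replicaSet name are assumptions for illustration only:

```
# On a hidden/low-priority secondary of shard s1:
mongod --shutdown --dbpath /data/s1        # clean shutdown (never kill -9)

# Restart standalone: different port, no --replSet, so it cannot rejoin s1
mongod --dbpath /data/s1 --port 28017 --fork --logpath /data/s1/standalone.log

# Take the binary hot backup on the standalone instance
mongo --port 28017 admin --eval 'db.runCommand({ createBackup: 1, backupDir: "/backup/s1" })'

# Shut down again and restart with the original options to rejoin the replicaSet
mongod --shutdown --dbpath /data/s1
mongod --dbpath /data/s1 --port 27017 --replSet s1 --fork --logpath /data/s1/mongod.log
```

Repeat this on one secondary from each shard and from the config replicaSet at the same time, so all the backups correspond to roughly the same moment.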

Cautionary notes: Please make sure you use low-priority or hidden nodes for this purpose, so that an election is not triggered when they leave and rejoin the replicaSet, and don't use SIGKILL (kill -9) to stop the db, as it shuts the database down abruptly. Also, plan to have at least as much free disk space as your shard occupies: a hotbackup takes approximately the same amount of space as your node.

My colleague Tim Vaillancourt has written a great blog post on this. See here.

Conclusion

So, between these two methods, you can now choose the backup approach that best fits your RTO and RPO, as explained here. Hope this helps you! Please share your comments and feedback below, and tell me what you think!

References:

https://www.percona.com/doc/percona-server-for-mongodb/LATEST/hot-backup.html

https://www.percona.com/forums/questions-discussions/percona-server-for-mongodb/53006-percona-mongodb-difference-between-hot-backup-and-backup-using-mongo-dump

https://www.percona.com/blog/2016/07/25/mongodb-consistent-backups/

https://www.percona.com/blog/2018/04/06/free-fast-mongodb-hot-backup-with-percona-server-for-mongodb/

https://www.bluelock.com/blog/rpo-rto-pto-and-raas-disaster-recovery-explained/

https://en.wikipedia.org/wiki/Disaster_recovery

https://www.druva.com/blog/understanding-rpo-and-rto/

https://www.percona.com/live/e17/sites/default/files/slides/Running%20MongoDB%20in%20Production%20-%20FileId%20-%20115299.pdf

https://major.io/2010/03/18/sigterm-vs-sigkill/

https://docs.mongodb.com/manual/core/sharded-cluster-config-servers

Photo by Designecologist from Pexels

