Bare Systemd Method to Create an XFS Mount

XFS is the only filesystem recommended for MongoDB data directories. The ext4 filesystem isn’t bad, but under a very high rate of random access (which WiredTiger can reach) it can become a bottleneck. To be fair, most deployments will never hit this bottleneck, but MongoDB’s official production recommendation is still to use only XFS, and you get annoying warnings until you do.

For fresh cloud server instances that will be your MongoDB hosts, it would be helpful if they always booted up with a flexibly attached XFS mount for the MongoDB data directory. Your cloud service may not make this easy, though. For example, you can get a fresh, network-attached block device on demand with each new virtual server instance, but there is no “xfs” option available in the template configuration.

If you script or configure something at the cloud service API level (e.g. launch with AWS CLI scripts in AWS EC2, or use cloud-init for a multi-vendor approach), this is achievable. But let’s assume you have some one-time testing, or something similar, where the time investment in a cloud service script or recipe won’t pay off.

The Ideal Method – Which Doesn’t Work Yet

Ideally, you would create a systemd mount unit, specify the filesystem type, and systemd would take care of formatting a freshly attached block device if the filesystem wasn’t initialized yet.

But this is not supported so far (systemd GitHub issue #10014). I have also not been able to make the ‘x-systemd.makefs’ option, added around systemd-fstab-generator, work on Amazon Linux 2 instances.
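
One likely reason the option has no effect on some images is simply an older systemd. The sketch below compares a systemd version number against the release where, as far as I recall, x-systemd.makefs first appeared. Both the function name supports_makefs and the threshold 236 are assumptions made for illustration; check the systemd NEWS file for your distribution before relying on it.

```shell
#!/usr/bin/env bash
# Succeeds if the given systemd version number is at least MIN_MAKEFS_VERSION.
# ASSUMPTION: 236 is the first release that understands x-systemd.makefs.
MIN_MAKEFS_VERSION=236

supports_makefs() {
    local version="$1"
    # A non-numeric or empty argument makes the test fail, which counts as "no".
    [ "$version" -ge "$MIN_MAKEFS_VERSION" ] 2>/dev/null
}
```

On a live host the version number can be taken from the first line of "systemctl --version", for example: supports_makefs "$(systemctl --version | awk 'NR==1 {print $2}')" && echo "worth trying x-systemd.makefs".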

I assume you’re in the same situation if you’ve landed here.

Bare systemd Units to Do it All

Each of these steps gets its own systemd unit: mkfs.xfs, mount, and chown mongod:mongod. The key points are:

Run the mkfs.xfs command as a oneshot service once systemd has loaded the block device, and have it pulled in by “local-fs.target” (or “local-fs-pre.target”) so that it is started on every boot
Make a mount unit to mount the block device at the desired directory (e.g. /data)
After the mount unit is up, run the chown command

The following example assumes:

“mongod” user already exists. (Create manually, or get it incidentally as a MongoDB package is installed.)
/dev/xvdb is the device path.
/data is the path it will be mounted at.
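
Before saving the image, it can help to confirm those three assumptions in one go. This is a minimal sketch; the function name check_prereqs is hypothetical, and the arguments simply mirror the user, device, and mount point assumed above.

```shell
#!/usr/bin/env bash
# Print one line per unmet assumption; return non-zero if anything is missing.
check_prereqs() {
    local user="$1" dev="$2" mnt="$3" missing=0
    # The account that will own the data directory must already exist.
    id -u "$user" >/dev/null 2>&1 || { echo "user $user does not exist"; missing=1; }
    # The device path must be a block device.
    [ -b "$dev" ] || { echo "$dev is not a block device"; missing=1; }
    # The mount point directory must exist in the root filesystem.
    [ -d "$mnt" ] || { echo "mount point $mnt does not exist"; missing=1; }
    return "$missing"
}
```

For the example in this post the call would be: check_prereqs mongod /dev/xvdb /data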

/etc/systemd/system/mkfs.xfs_xvdb.service

[Unit]
Description=oneshot systemd service to XFS format /dev/xvdb device
After=dev-xvdb.device
Requires=dev-xvdb.device

[Service]
Type=oneshot
#Note the leading "-" in ExecStart. In systemd exec directives this means ignore non-zero exit code.
#systemd init will continue peacefully this way, even if mkfs.xfs error-exits in subsequent restarts because the block device was formatted already.
ExecStart=-/usr/sbin/mkfs.xfs /dev/xvdb

[Install]
WantedBy=local-fs.target

Enable with: sudo systemctl enable mkfs.xfs_xvdb.service

/etc/systemd/system/data.mount

(!) Don’t forget to first create the /data directory in your server image’s root filesystem to be the mount point for the data.mount unit.

[Unit]
Description=systemd unit to mount /dev/xvdb at /data
After=mkfs.xfs_xvdb.service
Requires=mkfs.xfs_xvdb.service

[Mount]
What=/dev/xvdb
#N.b. "Where" must be reflected in the unit name.
#Eg. if it is for path "/data" we must name this unit file "data.mount".
#Substitute "-" in place of non-root "/" path delimiters. Eg. /srv/xyz --> "srv-xyz.mount"
Where=/data
Type=xfs

[Install]
WantedBy=multi-user.target

Enable with: sudo systemctl enable data.mount
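
The naming rule in the comments of data.mount can be sketched as a small shell function. The name mount_unit_name is hypothetical, and the function only handles simple paths made of letters, digits, and underscores; paths containing “-” or other special characters need extra escaping, which the real systemd-escape tool (with its --path and --suffix=mount options) takes care of.

```shell
#!/usr/bin/env bash
# Derive a mount unit file name from a simple absolute path.
# LIMITATION: no escaping of "-" or other special characters is attempted.
mount_unit_name() {
    local path="${1%/}"    # drop one trailing slash, if any
    path="${path#/}"       # drop the leading slash
    if [ -z "$path" ]; then
        # The root filesystem itself is named "-.mount".
        printf '%s\n' '-.mount'
        return 0
    fi
    # Remaining "/" delimiters become "-".
    printf '%s.mount\n' "$(printf '%s' "$path" | tr '/' '-')"
}
```

For example, mount_unit_name /srv/xyz prints srv-xyz.mount, which matches the rule described in the unit file comments.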

/etc/systemd/system/set_mongodb_data_dir_owner.service

[Unit]
Description=oneshot systemd service to chown mongod:mongod /data
After=data.mount
Requires=data.mount

[Service]
Type=oneshot
#Using -v (verbose) to produce message that can be seen in the systemd journal. This is optional.
ExecStart=/usr/bin/chown -v mongod:mongod /data

[Install]
WantedBy=multi-user.target

Enable with: sudo systemctl enable set_mongodb_data_dir_owner.service

As you’re building a server image at this stage, you don’t have to start the units above: just enable them, then save the server image. Of course it should be tested, but the real goal is making it work in new server instances, so confirm that these systemd units run automatically when a new instance starts up.
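
One way to confirm the result on a freshly started instance is to check the owner and filesystem type of the mount point directly. The helper names below (owner_of, fstype_of, check_data_dir) are hypothetical; the sketch assumes GNU coreutils stat and util-linux findmnt are installed.

```shell
#!/usr/bin/env bash
# Print "user:group" for a path (GNU coreutils stat).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Print the type of the filesystem that holds a path (util-linux findmnt).
fstype_of() {
    findmnt -n -o FSTYPE --target "$1"
}

# Compare a directory's owner and filesystem type against expected values.
check_data_dir() {
    local dir="$1" want_owner="$2" want_fs="$3" owner fs
    owner="$(owner_of "$dir")"
    fs="$(fstype_of "$dir")"
    if [ "$owner" != "$want_owner" ]; then
        echo "$dir owner is $owner, expected $want_owner"
        return 1
    fi
    if [ "$fs" != "$want_fs" ]; then
        echo "$dir filesystem is $fs, expected $want_fs"
        return 1
    fi
    echo "$dir OK"
}
```

For this post’s example the call would be check_data_dir /data mongod:mongod xfs. Running systemctl is-enabled on the three unit names is a useful companion check before the image is saved.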

The Output in systemd Journal

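When a new server instance is started, the journal messages for these three units should look something like the following. The text after “Starting” and “Started” is each unit’s Description= value, so it will match whatever description you wrote in your own unit files.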

mkfs.xfs_xvdb.service

$ journalctl -u mkfs.xfs_xvdb.service
<timestamp> <hostname> systemd[1]: Starting Formats /dev/xvdb device with XFS filesystem...
<timestamp> <hostname> mkfs.xfs[2494]: meta-data=/dev/xvdb isize=512 agcount=4, agsize=524288 blks
<timestamp> <hostname> mkfs.xfs[2494]: = sectsz=512 attr=2, projid32bit=1
<timestamp> <hostname> mkfs.xfs[2494]: = crc=1 finobt=1, sparse=0
<timestamp> <hostname> mkfs.xfs[2494]: realtime =none extsz=4096 blocks=0, rtextents=0
<timestamp> <hostname> systemd[1]: Started Formats /dev/xvdb device with XFS filesystem.

Or, after a reboot:

<timestamp> <hostname> systemd[1]: Starting Formats /dev/xvdb device with XFS filesystem...
<timestamp> <hostname> mkfs.xfs[2497]: mkfs.xfs: /dev/xvdb contains a mounted filesystem
<timestamp> <hostname> mkfs.xfs[2497]: Usage: mkfs.xfs
<timestamp> <hostname> mkfs.xfs[2497]: /* blocksize */ [-b log=n|size=num]
<timestamp> <hostname> mkfs.xfs[2497]: xxxm (xxx MiB), xxxg (xxx GiB), xxxt (xxx TiB) or xxxp (xxx PiB).
<timestamp> <hostname> mkfs.xfs[2497]: <value> is xxx (512 byte blocks).
<timestamp> <hostname> systemd[1]: Started Formats /dev/xvdb device with XFS filesystem.

Note that mkfs.xfs failing and being ignored in the second or later restarts is planned and expected given the way this systemd service unit was written.

data.mount

$ journalctl -u data.mount
<timestamp> <hostname> systemd[1]: Mounting Mount block device xvdb at /data...
<timestamp> <hostname> systemd[1]: Mounted Mount block device xvdb at /data.

set_mongodb_data_dir_owner.service

$ journalctl -u set_mongodb_data_dir_owner.service
<timestamp> <hostname> systemd[1]: Starting Ensures mongod is owner of mounted XFS directory at /data...
<timestamp> <hostname> chown[2549]: changed ownership of ‘/data’ from root:root to mongod:mongod
<timestamp> <hostname> systemd[1]: Started Ensures mongod is owner of mounted XFS directory at /data

Or, after a reboot:

<timestamp> <hostname> systemd[1]: Starting oneshot systemd service to chown mongod:mongod /data...
<timestamp> <hostname> chown[2549]: ownership of ‘/data’ retained as mongod:mongod
<timestamp> <hostname> systemd[1]: Started oneshot systemd service to chown mongod:mongod /data.

The Wrap-Up

systemd unit types and activation rules are tightly coupled with core Linux. You can use them to do the right thing, at the right time.

A server setup job that can be reduced to single commands such as /usr/bin/mkdir, /usr/sbin/mkfs*, /usr/bin/chown etc. is an opportunity for you to implement a minimalist systemd config project.

Scripts work with systemd too: make the script the command that is run by ExecStart=…. But that feels different from being able to see everything with just “systemctl cat <unit_name>” and “systemctl status”.

Typically systemd units run on every bootup, not just the first one. A command such as mkfs.xfs should run only once, however, so a trick is needed. This example relies on the fact that mkfs.xfs will not overwrite an existing filesystem (at least not without the -f force option). The “-” prefix on the /usr/sbin/mkfs.xfs command in ExecStart= is how the resulting non-zero ‘filesystem already exists’ exit code is ignored.
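
If you would rather not depend on mkfs.xfs failing, the script route mentioned above can make the check explicit. The following is a minimal sketch, not what this post’s units use: the function name format_if_blank and the BLKID and MKFS_XFS override variables are hypothetical, and it assumes blkid prints a filesystem type only when the device already holds one.

```shell
#!/usr/bin/env bash
# Commands can be overridden through the environment, e.g. for dry runs.
BLKID="${BLKID:-blkid}"
MKFS_XFS="${MKFS_XFS:-mkfs.xfs}"

# Format a device with XFS only if no filesystem signature is detected on it.
format_if_blank() {
    local dev="$1" fstype
    # blkid exits non-zero when it finds nothing, so tolerate that here.
    fstype="$("$BLKID" -o value -s TYPE "$dev" 2>/dev/null)" || true
    if [ -n "$fstype" ]; then
        echo "$dev already holds a $fstype filesystem; leaving it alone"
        return 0
    fi
    "$MKFS_XFS" "$dev"
}
```

A script that sources this function and calls format_if_blank /dev/xvdb could then be the ExecStart= command, without the “-” prefix, so that a genuine mkfs.xfs failure on a blank device is no longer silently ignored.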
