Bad utmp implementations in Glibc and FreeBSD

I recently released another version – 0.5.0 – of Dinit, the service manager / init system. There were a number of minor improvements, including to the build system (just running “make” or “gmake” should be enough on any of the systems which have a pre-defined configuration, no need to edit mconfig by hand), but the main features of the release were S6-compatible readiness notification, and support for updating the utmp database.

At this point, I’d expect, there might be one or two readers wondering what this “utmp” database might be. On Linux you can find out easily enough via “man utmp” in the terminal:

“The utmp file allows one to discover information about who is currently using the system. There may be more users currently using the system, because not all programs use utmp logging.”

The OpenBSD man page clarifies:

“The utmp file is used by the programs users(1), w(1) and who(1).”

In other words, utmp is a record of who is currently logged in to the system (another file, “wtmp”, records all logins and logouts, as well as, potentially, certain system events such as reboots and time updates). This is a hint at the main motivation for having utmp support in Dinit – I wanted the “who” command to correctly report current logins (and I wanted boot time to be correctly recorded in the wtmp file).

However, when I began to implement the support for utmp and wtmp in Dinit, I also started to think about how these databases worked. I knew already that they were simply flat file databases – i.e. each record occupies a fixed number of bytes, the size of the “struct utmp” structure. The files are normally readable by unprivileged users, so that utilities such as who(1) don’t need to be setuid/setgid. Updating and reading the database is done (behind the scenes) via normal file system reads and writes, through the getutent(3)/pututline(3) family of functions, their getutxent/pututxline POSIX equivalents, or the higher-level login(3) and logout(3) functions (found in libutil; in OpenBSD, only the latter are available, and the lower-level routines don’t exist).
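To make the "flat file of fixed-size records" idea concrete, here is a minimal sketch of a reader. It assumes the record layout of glibc's struct utmp on x86-64 Linux (384 bytes); other platforms lay the record out differently, and the names `read_entries` and `Entry` are invented for this illustration.

```python
import struct
from collections import namedtuple

# ASSUMPTION: glibc's struct utmp layout on x86-64 Linux (384 bytes).
# "<" disables implicit padding, so the two pad bytes after ut_type are explicit.
UTMP_FORMAT = "<hxxi32s4s32s256shhiii4i20s"
UTMP_SIZE = struct.calcsize(UTMP_FORMAT)
USER_PROCESS = 7  # ut_type value glibc uses for a normal login session

Entry = namedtuple("Entry", "type pid line user host tv_sec")


def _text(raw):
    # String fields are fixed-width byte arrays padded with NUL bytes.
    return raw.split(b"\0", 1)[0].decode("utf-8", "replace")


def read_entries(path):
    """Yield one Entry per complete fixed-size record in the file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(UTMP_SIZE)
            if len(chunk) < UTMP_SIZE:
                break  # end of file, or a truncated trailing record
            fields = struct.unpack(UTMP_FORMAT, chunk)
            ut_type, pid, line, _ut_id, user, host = fields[:6]
            tv_sec = fields[9]  # seconds part of the entry's timestamp
            yield Entry(ut_type, pid, _text(line), _text(user), _text(host), tv_sec)
```

On a system matching that assumption, iterating over `read_entries("/var/run/utmp")` and keeping the entries whose type is `USER_PROCESS` gives roughly what who(1) prints. Note that this reader takes no lock at all, which is exactly the situation the consistency question is about.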

I wondered: if the files consist of fixed-size records, and are readable by regular users, how is consistency maintained? That is – how can a process ensure that, when it updates the database, it doesn’t conflict with another process also attempting to update the database at the same time? Similarly, how can a process reading an entry from the database be sure that it receives a consistent, full record and not one that has been partially updated? After all, POSIX allows a write(2) call to return without having written all the requested bytes, and I’m not aware of Linux or any of the *BSDs documenting that this cannot happen for regular files. Clearly, some kind of locking is needed; a process that wants to write to or read from the database locks it first, performs its operation, and then unlocks the database. Once again, this happens under the hood, in the implementation of the getutent/pututline functions or their equivalents.
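The lock, operate, unlock pattern can be sketched as follows. This is an illustration only, not the code of any libc: `write_all` and `locked_update` are hypothetical helpers, and POSIX record locks (via Python's `fcntl.lockf`) are just one possible lock type.

```python
import fcntl
import os


def write_all(fd, data, offset):
    """Write all of data at offset, retrying after any short write."""
    view = memoryview(data)
    while view:
        written = os.pwrite(fd, view, offset)
        view = view[written:]
        offset += written


def locked_update(path, record, offset):
    """Overwrite the bytes at offset with record while holding an exclusive lock."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            write_all(fd, record, offset)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

The retry loop in `write_all` covers the short-write case, but only the lock stops another process from seeing or overwriting a half-written record in the meantime. The blocking `lockf` call is also where the trouble starts: it waits for as long as someone else holds a conflicting lock.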

Then I wondered: if a user process is able to lock the utmp file, and this prevents updates, what’s to stop a user process from manually acquiring and then holding such a lock for a long – even practically infinite – duration? This would prevent the database from being updated, and would perhaps even prevent logins/logouts from completing. Unfortunately, the answer is – nothing; and yes, depending on the system, it is possible to prevent the database from being correctly updated or even to prevent all other users – including root – from logging in to the system.

Specifically:

On Linux with Glibc (or, I suppose, any other system with Glibc), updates to the database can be prevented and logins delayed by 10 seconds (bug filed).
On FreeBSD, updates to the database can be prevented and logins prevented indefinitely (bug filed). Note that on FreeBSD the file is named “utx.active” but is otherwise the same as “utmp” on other systems. A patch was quickly put together after I filed this bug, but progress on it has seemingly stalled.
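The underlying mechanism, a lock taken through a read-only descriptor blocking a would-be writer, can be shown on a scratch file. This sketch uses flock(2)-style locks purely because they let a single process demonstrate the conflict; the lock type a given libc actually uses may differ, and both function names are hypothetical.

```python
import fcntl
import os


def hold_shared_lock(path):
    """Open path read-only and take a shared lock; it lasts until the fd is closed."""
    fd = os.open(path, os.O_RDONLY)  # read permission is all that is needed
    fcntl.flock(fd, fcntl.LOCK_SH)
    return fd


def try_exclusive_lock(path):
    """Return True if an exclusive lock could be taken without waiting."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False  # someone else holds a conflicting lock
    finally:
        os.close(fd)  # closing the descriptor also drops any lock it held
```

While the descriptor returned by `hold_shared_lock` stays open, `try_exclusive_lock` on the same file reports failure. A writer that waits for the lock instead of giving up stalls for as long as the holder likes, unless it applies a timeout of its own.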

I haven’t checked all other systems but suspect that various other BSDs could be susceptible to related problems. On the other hand, some systems are immune:

Linux with Musl, because Musl doesn’t implement the utmp functions (though it has no-op stubs). I don’t understand why the Musl FAQ claims that you need a setuid program to update the database: it seems perfectly reasonable to simply limit modification to daemons already running as root or in a particular group. (Perhaps it is referring to having terminal emulators create utmp entries, which the Linux “utmp” manpage suggests is something that happens, though this also seems unnecessary to me.)
OpenBSD, because it structures the utmp file so that there is exactly one entry per tty device, and so avoids the need for locking on writes (writes to the same tty entry should naturally be serialised, since they are either for login or logout). It performs no locking for reading, which leaves open the possibility of reading a partially written entry, though this is certainly a less severe problem than the ones affecting Glibc/FreeBSD.
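The one-entry-per-tty layout amounts to treating the file as an array indexed by a slot number. A minimal sketch of the idea, with a made-up record size and hypothetical helper names (this is not OpenBSD's code):

```python
import os

RECORD_SIZE = 64  # hypothetical record size, for illustration only


def write_slot(path, slot, record):
    """Overwrite the record for one tty slot; all other slots are left untouched."""
    if len(record) != RECORD_SIZE:
        raise ValueError("record must be exactly RECORD_SIZE bytes")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        view = memoryview(record)
        offset = slot * RECORD_SIZE
        while view:  # retry after a short write
            written = os.pwrite(fd, view, offset)
            view = view[written:]
            offset += written
    finally:
        os.close(fd)


def read_slot(path, slot):
    """Read one slot without locking; a torn read is possible in principle."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, RECORD_SIZE, slot * RECORD_SIZE)
    finally:
        os.close(fd)
```

Because each writer touches only the byte range of its own slot, two writers conflict only if they update the same tty at the same time, which the login/logout sequence should already rule out.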

The whole thing isn’t an issue for single-user systems, but for multiple-user systems it is more of a concern. On such systems, I’d recommend making /var/run/utmp and /var/log/wtmp (or their equivalents) readable only by the owner and group, or removing them altogether, and forgoing the ability for unprivileged users to run the “who” command. Otherwise, you risk users being able to deny logins or prevent them being recorded, as per above.

As for fixes which still allow unprivileged processes to read the database, I’ve come to the conclusion that the best option is to use locking (on a separate, root-only file) only for write operations, and live with the limitation that it is theoretically possible for a program to read a partially-updated entry; this seems unlikely to ever happen, let alone actually cause a significant problem, in practice. To completely solve the problem, you’d either need atomic read and write support on files, or a secondary mechanism for accessing the database which obviates the concurrency problem (e.g. accessing the database via communication with a running daemon which can serialize requests). Or, perhaps Musl is taking the right approach by simply excluding the functionality.
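A minimal sketch of that write-side-only locking scheme follows. The separate lock file, its path and both function names are assumptions made for illustration; this is not how any particular libc or Dinit implements it.

```python
import fcntl
import os


def update_with_side_lock(db_path, lock_path, record, offset):
    """Write record at offset in db_path, serialised through a separate lock file."""
    # Mode 0600: only the owner (root, in the scenario above) can open the lock
    # file, so unprivileged users cannot take or hold this lock.
    lock_fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(lock_fd, fcntl.LOCK_EX)
        db_fd = os.open(db_path, os.O_WRONLY)
        try:
            view = memoryview(record)
            while view:  # retry after a short write
                written = os.pwrite(db_fd, view, offset)
                view = view[written:]
                offset += written
        finally:
            os.close(db_fd)
    finally:
        os.close(lock_fd)  # closing the descriptor releases the lock


def read_unlocked(db_path, size, offset):
    """Readers take no lock, so a partially-updated record is theoretically visible."""
    fd = os.open(db_path, os.O_RDONLY)
    try:
        return os.pread(fd, size, offset)
    finally:
        os.close(fd)
```

The database itself can stay world-readable: readers never touch the lock file, so they cannot interfere with writers, and writers only ever wait on each other.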
