Landlock: Unprivileged Access Control

source link: https://docs.kernel.org/userspace-api/landlock.html

Landlock rules

A Landlock rule describes an action on an object which the process intends to perform. A set of rules is aggregated in a ruleset, which can then restrict the thread enforcing it, and its future children.

The two existing types of rules are:

Filesystem rules

For these rules, the object is a file hierarchy, and the related filesystem actions are defined with filesystem access rights.

Network rules (since ABI v4)

For these rules, the object is a TCP port, and the related actions are defined with network access rights.

Defining and enforcing a security policy

We first need to define the ruleset that will contain our rules.

For this example, the ruleset will contain rules that only allow filesystem read actions and establish a specific TCP connection. Filesystem write actions and other TCP actions will be denied.

The ruleset then needs to handle both these kinds of actions. This is required for backward and forward compatibility (i.e. the kernel and user space may not know each other’s supported restrictions), hence the need to be explicit about the denied-by-default access rights.

struct landlock_ruleset_attr ruleset_attr = {
    .handled_access_fs =
        LANDLOCK_ACCESS_FS_EXECUTE |
        LANDLOCK_ACCESS_FS_WRITE_FILE |
        LANDLOCK_ACCESS_FS_READ_FILE |
        LANDLOCK_ACCESS_FS_READ_DIR |
        LANDLOCK_ACCESS_FS_REMOVE_DIR |
        LANDLOCK_ACCESS_FS_REMOVE_FILE |
        LANDLOCK_ACCESS_FS_MAKE_CHAR |
        LANDLOCK_ACCESS_FS_MAKE_DIR |
        LANDLOCK_ACCESS_FS_MAKE_REG |
        LANDLOCK_ACCESS_FS_MAKE_SOCK |
        LANDLOCK_ACCESS_FS_MAKE_FIFO |
        LANDLOCK_ACCESS_FS_MAKE_BLOCK |
        LANDLOCK_ACCESS_FS_MAKE_SYM |
        LANDLOCK_ACCESS_FS_REFER |
        LANDLOCK_ACCESS_FS_TRUNCATE,
    .handled_access_net =
        LANDLOCK_ACCESS_NET_BIND_TCP |
        LANDLOCK_ACCESS_NET_CONNECT_TCP,
};

Because we may not know on which kernel version an application will be executed, it is safer to follow a best-effort security approach. Indeed, we should try to protect users as much as possible whatever the kernel they are using. To avoid binary enforcement (i.e. either all security features or none), we can leverage a dedicated Landlock command to get the current version of the Landlock ABI and adapt the handled accesses. Let’s check if we should remove access rights which are only supported in higher versions of the ABI.

int abi;

abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
if (abi < 0) {
    /* Degrades gracefully if Landlock is not handled. */
    perror("Landlock is not available on the running kernel");
    return 0;
}
switch (abi) {
case 1:
    /* Removes LANDLOCK_ACCESS_FS_REFER for ABI < 2 */
    ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
    __attribute__((fallthrough));
case 2:
    /* Removes LANDLOCK_ACCESS_FS_TRUNCATE for ABI < 3 */
    ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
    __attribute__((fallthrough));
case 3:
    /* Removes network support for ABI < 4 */
    ruleset_attr.handled_access_net &=
        ~(LANDLOCK_ACCESS_NET_BIND_TCP |
          LANDLOCK_ACCESS_NET_CONNECT_TCP);
}
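The snippets in this section call landlock_create_ruleset(), landlock_add_rule() and landlock_restrict_self() as if they were library functions. The C library in use may not provide them, in which case a program has to define thin wrappers around syscall(2). The following is a minimal sketch, assuming kernel headers recent enough to ship <linux/landlock.h> and the __NR_landlock_* syscall numbers. The #ifndef guards and the landlock_abi_version() helper are assumptions made for this illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/landlock.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrappers: each one forwards its arguments to the raw system call. */
#ifndef landlock_create_ruleset
static inline int
landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
                        const size_t size, const __u32 flags)
{
    return syscall(__NR_landlock_create_ruleset, attr, size, flags);
}
#endif

#ifndef landlock_add_rule
static inline int
landlock_add_rule(const int ruleset_fd, const enum landlock_rule_type rule_type,
                  const void *const rule_attr, const __u32 flags)
{
    return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                   rule_attr, flags);
}
#endif

#ifndef landlock_restrict_self
static inline int
landlock_restrict_self(const int ruleset_fd, const __u32 flags)
{
    return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
}
#endif

/*
 * Hypothetical convenience helper (not part of any API): returns the
 * Landlock ABI version, or -1 with errno set if Landlock cannot be used.
 */
static inline int landlock_abi_version(void)
{
    return landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
}
```

With these definitions in scope, the calls shown in this section compile as written. On failure, each wrapper returns -1 and sets errno, like any other use of syscall(2).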

This makes it possible to create an inclusive ruleset that will contain our rules.

int ruleset_fd;

ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
if (ruleset_fd < 0) {
    perror("Failed to create a ruleset");
    return 1;
}

We can now add a new rule to this ruleset thanks to the returned file descriptor referring to this ruleset. The rule will only allow reading the file hierarchy /usr. Without another rule, write actions would then be denied by the ruleset. To add /usr to the ruleset, we open it with the O_PATH flag and fill the struct landlock_path_beneath_attr with this file descriptor.

int err;
struct landlock_path_beneath_attr path_beneath = {
    .allowed_access =
        LANDLOCK_ACCESS_FS_EXECUTE |
        LANDLOCK_ACCESS_FS_READ_FILE |
        LANDLOCK_ACCESS_FS_READ_DIR,
};

path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
if (path_beneath.parent_fd < 0) {
    perror("Failed to open file");
    close(ruleset_fd);
    return 1;
}
err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                        &path_beneath, 0);
close(path_beneath.parent_fd);
if (err) {
    perror("Failed to update ruleset");
    close(ruleset_fd);
    return 1;
}

It may also be required to create rules following the same logic as explained for the ruleset creation, by filtering access rights according to the Landlock ABI version. In this example, this is not required because all of the requested allowed_access rights are already available in ABI 1.
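When such filtering is needed, one possible approach is to mask each rule's allowed_access with the handled_access_fs value that remains after the ABI adjustment, because landlock_add_rule() rejects a rule whose access rights are not a subset of the rights handled by the ruleset. A minimal sketch follows. The helper name filter_rule_access() is hypothetical, and the fallback definition is only there for older headers.

```c
#include <linux/landlock.h>
#include <linux/types.h>

/* Fallback for older UAPI headers; the value matches the kernel's definition. */
#ifndef LANDLOCK_ACCESS_FS_REFER
#define LANDLOCK_ACCESS_FS_REFER (1ULL << 13)
#endif

/*
 * Hypothetical helper: keeps only the requested rights that the ruleset
 * actually handles on the running kernel.
 */
static inline __u64 filter_rule_access(__u64 requested, __u64 handled)
{
    return requested & handled;
}
```

A caller would then write something like path_beneath.allowed_access = filter_rule_access(wanted, ruleset_attr.handled_access_fs) before calling landlock_add_rule().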

For network access control, we can add a set of rules that allow the use of a port number for a specific action: HTTPS connections.

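struct landlock_net_port_attr net_port = {
    .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
    .port = 443,
};

/* Network rules only exist since ABI v4; skip them on older kernels. */
if (abi >= 4) {
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
                            &net_port, 0);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }
}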

The next step is to restrict the current thread from gaining more privileges (e.g. through a SUID binary). We now have a ruleset with the first rule allowing read access to /usr while denying all other handled accesses for the filesystem, and a second rule allowing HTTPS connections.

if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
    perror("Failed to restrict privileges");
    close(ruleset_fd);
    return 1;
}

The current thread is now ready to sandbox itself with the ruleset.

if (landlock_restrict_self(ruleset_fd, 0)) {
    perror("Failed to enforce ruleset");
    close(ruleset_fd);
    return 1;
}
close(ruleset_fd);

If the landlock_restrict_self system call succeeds, the current thread is now restricted and this policy will be enforced on all its subsequently created children as well. Once a thread is landlocked, there is no way to remove its security policy; only adding more restrictions is allowed. These threads are now in a new Landlock domain, which is the merge of their parent's domain (if any) with the new ruleset.

Full working code can be found in samples/landlock/sandboxer.c.

Good practices

It is recommended to set access rights on file hierarchy leaves as much as possible. For instance, it is better to have ~/doc/ as a read-only hierarchy and ~/tmp/ as a read-write hierarchy than to have ~/ as a read-only hierarchy and ~/tmp/ as a read-write hierarchy. Following this practice leads to self-sufficient hierarchies that do not depend on their location (i.e. parent directories). This is particularly relevant when we want to allow linking or renaming. Indeed, having consistent access rights per directory makes it possible to move such a directory without relying on the access rights of the destination directory (except those that are required for the operation itself, see the LANDLOCK_ACCESS_FS_REFER documentation). Having self-sufficient hierarchies also helps to tighten the required access rights to the minimal set of data. It also helps avoid sinkhole directories, i.e. directories where data can be linked to but not linked from. However, this depends on data organization, which might not be controlled by developers. In this case, granting read-write access to ~/tmp/, instead of write-only access, would potentially allow moving ~/tmp/ to a non-readable directory while still keeping the ability to list the content of ~/tmp/.
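The ~/doc/ and ~/tmp/ example can be written down as a small table of per-leaf rules. This is only a sketch: the struct leaf_rule type, the table and the two grouping macros are assumptions made for illustration, and the paths are meant to be relative to the user's home directory (for instance opened with openat(2) and O_PATH). Rights that the running kernel does not support would still have to be masked out as described for the ruleset creation.

```c
#include <linux/landlock.h>
#include <linux/types.h>

/* Fallbacks for older UAPI headers; the values match the kernel's definitions. */
#ifndef LANDLOCK_ACCESS_FS_REFER
#define LANDLOCK_ACCESS_FS_REFER (1ULL << 13)
#endif
#ifndef LANDLOCK_ACCESS_FS_TRUNCATE
#define LANDLOCK_ACCESS_FS_TRUNCATE (1ULL << 14)
#endif

/* Illustrative groupings of access rights (names chosen for this sketch). */
#define ACCESS_FS_ROUGHLY_READ ( \
    LANDLOCK_ACCESS_FS_EXECUTE | \
    LANDLOCK_ACCESS_FS_READ_FILE | \
    LANDLOCK_ACCESS_FS_READ_DIR)

#define ACCESS_FS_ROUGHLY_WRITE ( \
    LANDLOCK_ACCESS_FS_WRITE_FILE | \
    LANDLOCK_ACCESS_FS_REMOVE_DIR | \
    LANDLOCK_ACCESS_FS_REMOVE_FILE | \
    LANDLOCK_ACCESS_FS_MAKE_DIR | \
    LANDLOCK_ACCESS_FS_MAKE_REG | \
    LANDLOCK_ACCESS_FS_MAKE_SYM | \
    LANDLOCK_ACCESS_FS_REFER | \
    LANDLOCK_ACCESS_FS_TRUNCATE)

/* Hypothetical description of one rule: a leaf hierarchy and its rights. */
struct leaf_rule {
    const char *path;       /* relative to the home directory */
    __u64 allowed_access;
};

/* One rule per leaf, instead of a broad rule on the home directory itself. */
const struct leaf_rule leaf_rules[] = {
    { "doc", ACCESS_FS_ROUGHLY_READ },
    { "tmp", ACCESS_FS_ROUGHLY_READ | ACCESS_FS_ROUGHLY_WRITE },
};
```

Because each entry carries all the rights its hierarchy needs, neither directory depends on a rule attached to a parent directory.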

Layers of file path access rights

Each time a thread enforces a ruleset on itself, it updates its Landlock domain with a new layer of policy. Indeed, this complementary policy is stacked with any other rulesets already restricting this thread. A sandboxed thread can then safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules encountered on the path grants the access. A sandboxed thread can only access a file path if all its enforced policy layers grant the access as well as all the other system access controls (e.g. filesystem DAC, other LSM policies, etc.).
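The evaluation described above can be pictured with a toy model. This is not the kernel's implementation: the struct layer type, the function names and the two access bits are hypothetical, and the model assumes that a layer does not restrict access rights it does not handle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access bits used only by this model. */
#define ACC_READ  (UINT64_C(1) << 0)
#define ACC_WRITE (UINT64_C(1) << 1)

/* Toy view of one policy layer, as seen along one file path. */
struct layer {
    uint64_t handled;             /* rights this layer restricts */
    const uint64_t *rule_access;  /* allowed_access of each rule met on the path */
    size_t num_rules;
};

/* A layer grants a right if it does not handle it or if a rule on the path allows it. */
static inline bool layer_grants(const struct layer *l, uint64_t request)
{
    uint64_t granted = ~l->handled;

    for (size_t i = 0; i < l->num_rules; i++)
        granted |= l->rule_access[i];
    return (granted & request) == request;
}

/* The domain grants the request only if every stacked layer grants it. */
static inline bool domain_grants(const struct layer *layers, size_t num_layers,
                                 uint64_t request)
{
    for (size_t i = 0; i < num_layers; i++)
        if (!layer_grants(&layers[i], request))
            return false;
    return true;
}
```

In this model, stacking a read-only layer on top of a read-write layer leaves only read access, which illustrates why enforcing a new ruleset can only tighten the policy. Other system access controls still apply on top of this.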

Bind mounts and OverlayFS

Landlock makes it possible to restrict access to file hierarchies, which means that these access rights can be propagated with bind mounts (cf. Shared Subtrees) but not with Overlay Filesystem.

A bind mount mirrors a source file hierarchy to a destination. The destination hierarchy is then composed of the exact same files, on which Landlock rules can be tied, either via the source or the destination path. These rules restrict access when they are encountered on a path, which means that they can restrict access to multiple file hierarchies at the same time, whether these hierarchies are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers. These layers are combined in a merge directory, which is the result of the mount point. This merge hierarchy may include files from the upper and lower layers, but modifications performed on the merge hierarchy are only reflected in the upper layer. From a Landlock policy point of view, each OverlayFS layer and merge hierarchy is standalone and contains its own set of files and directories, which is different from bind mounts. A policy restricting an OverlayFS layer will not restrict the resulting merged hierarchy, and vice versa. Landlock users should then only think about file hierarchies they want to allow access to, regardless of the underlying filesystem.

Inheritance

Every new thread resulting from a clone(2) inherits Landlock domain restrictions from its parent. This is similar to the seccomp inheritance (cf. Seccomp BPF (SECure COMPuting with filters)) or any other LSM dealing with task’s credentials(7). For instance, one process’s thread may apply Landlock rules to itself, but they will not be automatically applied to other sibling threads (unlike POSIX thread credential changes, cf. nptl(7)).

When a thread sandboxes itself, we have the guarantee that the related security policy will stay enforced on all this thread’s descendants. This allows creating standalone and modular security policies per application, which will automatically be composed between themselves according to their runtime parent policies.
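Inheritance can be observed with a small experiment. The sketch below forks a throwaway process that sandboxes itself with a ruleset handling only LANDLOCK_ACCESS_FS_WRITE_FILE and containing no rule, then forks again and lets the grandchild, which never calls landlock_restrict_self() itself, try to open a file for writing. The function names, the exit codes and the choice of /dev/null as the probe file are assumptions made for this illustration, and the raw syscall(2) form is used to keep the block self-contained.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/landlock.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Exit codes exchanged between the processes of this sketch (arbitrary). */
enum { DENIED = 0, ALLOWED = 1, INCONCLUSIVE = 2 };

/* Runs in a throwaway process: sandbox itself, then let a child try to write. */
static void sandbox_then_fork(void)
{
    const struct landlock_ruleset_attr attr = {
        .handled_access_fs = LANDLOCK_ACCESS_FS_WRITE_FILE,
    };
    int status;
    int ruleset_fd = syscall(__NR_landlock_create_ruleset, &attr,
                             sizeof(attr), 0);

    if (ruleset_fd < 0)
        _exit(INCONCLUSIVE); /* Landlock is not usable here. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) ||
        syscall(__NR_landlock_restrict_self, ruleset_fd, 0))
        _exit(INCONCLUSIVE);
    close(ruleset_fd);

    pid_t child = fork();
    if (child < 0)
        _exit(INCONCLUSIVE);
    if (child == 0) {
        /* The child did not sandbox itself; it only inherits the domain. */
        int fd = open("/dev/null", O_WRONLY);

        if (fd >= 0)
            _exit(ALLOWED);
        _exit(errno == EACCES ? DENIED : INCONCLUSIVE);
    }
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        _exit(INCONCLUSIVE);
    _exit(WEXITSTATUS(status));
}

/*
 * Returns 1 if the write was denied in the child of the sandboxed process,
 * 0 if it was allowed, and -1 if the experiment could not be run.
 */
static inline int child_inherits_sandbox(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        sandbox_then_fork(); /* never returns */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case DENIED:
        return 1;
    case ALLOWED:
        return 0;
    default:
        return -1;
    }
}
```

Running the experiment in a forked process keeps the caller itself unrestricted, since a Landlock domain cannot be removed once enforced.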

Ptrace restrictions

A sandboxed process has fewer privileges than a non-sandboxed process and must therefore be subject to additional restrictions when manipulating another process. To be allowed to use ptrace(2) and related syscalls on a target process, a sandboxed process must have a subset of the target process's rules, which means the tracee must be in a sub-domain of the tracer.
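The sub-domain condition can be sketched with a toy model in which every enforced ruleset creates a new domain node pointing to the previous one. The struct domain type and the may_ptrace() name are hypothetical. The model assumes that a tracee in the tracer's own domain also qualifies, and it only covers the Landlock part of the decision, not the other checks that apply to ptrace(2).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy domain: NULL stands for "not sandboxed", parent is the previous domain. */
struct domain {
    const struct domain *parent;
};

/*
 * True if, in this model, Landlock would let a tracer in domain `tracer`
 * attach to a process in domain `tracee`: the tracee's domain must be the
 * tracer's domain or one nested inside it.
 */
static inline bool may_ptrace(const struct domain *tracer,
                              const struct domain *tracee)
{
    if (!tracer)
        return true; /* An unsandboxed tracer is not limited by Landlock. */
    for (const struct domain *d = tracee; d; d = d->parent)
        if (d == tracer)
            return true;
    return false;
}
```

A process that sandboxes a child further can therefore still trace it, while the more restricted child cannot trace its parent.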

Truncating files

The operations covered by LANDLOCK_ACCESS_FS_WRITE_FILE and LANDLOCK_ACCESS_FS_TRUNCATE both change the contents of a file and sometimes overlap in non-intuitive ways. It is recommended to always specify both of these together.

A particularly surprising example is creat(2). The name suggests that this system call requires the rights to create and write files. However, it also requires the truncate right if an existing file under the same name is already present.

It should also be noted that truncating files does not require the LANDLOCK_ACCESS_FS_WRITE_FILE right. Apart from the truncate(2) system call, this can also be done through open(2) with the flags O_RDONLY | O_TRUNC.
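The two cases above can be summarized by mapping open(2) flags to the rights they need on an existing regular file. The helper below is a hypothetical sketch, not a kernel or library function. It ignores O_PATH, execution and the rights needed on the parent directory when a file is created.

```c
#include <fcntl.h>
#include <linux/landlock.h>
#include <linux/types.h>

/* Fallback for older UAPI headers; the value matches the kernel's definition. */
#ifndef LANDLOCK_ACCESS_FS_TRUNCATE
#define LANDLOCK_ACCESS_FS_TRUNCATE (1ULL << 14)
#endif

/* Hypothetical helper: rights needed to open an existing file with `flags`. */
static inline __u64 access_for_open(int flags)
{
    __u64 access = 0;

    switch (flags & O_ACCMODE) {
    case O_RDONLY:
        access |= LANDLOCK_ACCESS_FS_READ_FILE;
        break;
    case O_WRONLY:
        access |= LANDLOCK_ACCESS_FS_WRITE_FILE;
        break;
    case O_RDWR:
        access |= LANDLOCK_ACCESS_FS_READ_FILE |
                  LANDLOCK_ACCESS_FS_WRITE_FILE;
        break;
    }
    /* Truncation is a separate right, independent of the access mode. */
    if (flags & O_TRUNC)
        access |= LANDLOCK_ACCESS_FS_TRUNCATE;
    return access;
}
```

Since creat(2) is equivalent to open(2) with O_CREAT | O_WRONLY | O_TRUNC, the helper reports both the write and the truncate rights for it, whereas O_RDONLY | O_TRUNC yields the truncate right without the write right.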

When opening a file, the availability of the LANDLOCK_ACCESS_FS_TRUNCATE right is associated with the newly created file descriptor and will be used for subsequent truncation attempts using ftruncate(2). The behavior is similar to opening a file for reading or writing, where permissions are checked during open(2), but not during the subsequent read(2) and write(2) calls.

As a consequence, it is possible to have multiple open file descriptors for the same file, where one grants the right to truncate the file and the other does not. It is also possible to pass such file descriptors between processes, keeping their Landlock properties, even when these processes do not have an enforced Landlock ruleset.

