Why polkit (or, how to mount a disk on modern Linux)

← Discworld Noir in a Windows 98 virtual machine on Linux | Background noise | still aiming to be the universal operating system →

I've recently found myself explaining polkit (formerly PolicyKit) to one of Collabora's clients, and thought that blogging about the same topic might be useful for other people who are confused by it; so, here is why udisks2 and polkit are the way they are.

As always, opinions in this blog are my own, not Collabora's.

Privileged actions

Broadly, there are two ways a process can do something: it can do it directly (i.e. ask the kernel directly), or it can use inter-process communication to ask a service to do that operation on its behalf. If it does it directly, the components that say whether it can succeed are the Linux kernel's normal permissions checks (DAC), and if configured, AppArmor, SELinux or a similar MAC layer. All very simple so far.

Unfortunately, the kernel's relatively coarse-grained checks are not sufficient to express the sorts of policies that exist on a desktop/laptop/mobile system. My favourite example for this sort of thing is mounting filesystems. If I plug in a USB stick with a FAT filesystem, it's reasonable to expect my chosen user interface to either mount it automatically, or let me press a button to mount it. Similarly, to avoid data loss, I should be able to unmount it when I'm finished with it. However, mounting and unmounting a USB stick is fundamentally the same system call as mounting and unmounting any other filesystem - and if ordinary users can do arbitrary mount system calls, they can cause all sorts of chaos, for instance by mounting a filesystem that contains setuid executables (privilege escalation), or umounting a critical OS filesystem like /usr (denial of service). Something needs to arbitrate: “you can mount filesystems, but only under certain conditions”.

The kernel developer motto for this sort of thing is “mechanism, not policy”: they are very keen to avoid encoding particular environments' policies (the sort of thing you could think of as “business rules”) in the kernel, because that makes it non-generic and hard to maintain. As a result, direct mount/unmount actions are only allowed by privileged processes, and it's up to user-space processes to arrange for a privileged process to make the desired mount syscall.

Here are some other privileged actions which laptop/desktop users can reasonably expect to “just work”, with or without requiring a sysadmin-like (root-equivalent) user:

reconfiguring networking (privileged because, in the general case, it's an availability and potentially integrity issue)
installing, upgrading or removing packages (privileged because, in the general case, it can result in arbitrary root code execution)
suspending or shutting down the system (privileged because you wouldn't want random people doing this on your server, but should normally be allowed on e.g. laptops for people with physical access, because they could just disconnect the power anyway)

In environments that use a MAC framework like AppArmor, actions that would normally be allowed can become privileged: for instance, in a framework for sandboxed applications, most apps shouldn't be allowed to record audio. This prevents carrying out these actions directly, again resulting in the only way to achieve them being to ask a service to carry out the action.

Ask a system service to do it

On to the next design, then: I can submit a request to a privileged process, which does some checks to make sure I'm not trying to break the system (or alternatively, that I have enough sysadmin rights that I'm allowed to break the system if I want to), and then does the privileged action for me.

You might think I'm about to start discussing D-Bus and daemons, but actually, a prominent earlier implementation of this was mount(8), which is normally setuid root:

% ls -l /bin/mount
-rwsr-xr-x 1 root root 40000 May 22 11:37 /bin/mount

If you look at it from an odd angle, this is inter-process communication across a privilege boundary: I run the setuid executable, creating a process. Because the executable has the setuid bit set, the kernel makes the process highly privileged: its effective uid is root, and it has all the necessary capabilities to mount filesystems. I submit the request by passing it in the command-line arguments. mount does some checks - specifically, it looks in /etc/fstab to see whether the filesystem I'm trying to mount has the “user” or “users” flag - then carries out the mount system call.

There are a few obvious problems with this:

When machines had a static set of hardware devices (and a sysadmin who knew how to configure them), it might have made sense to list them all in /etc/fstab; but this is not a useful solution if you can plug in any number of USB drives, or if you are a non-expert user with Linux on your laptop. The decision ought to be based on general attributes of devices, such as “is removable?”, and on the role of the machine.
Setuid executables are alarmingly easy to get wrong so it is not necessarily wise to assume that mount(8) is safe to be setuid.
One fact that a reasonable security policy might include is “users who are logged in remotely should have less control over physically present devices than those who are physically present” - but that sort of thing can't be checked by mount(8) without specifically teaching the mount binary about it.

Ask a system service to do it, via D-Bus or other IPC

To avoid the issues of setuid, we could use inter-process communication in the traditional sense: run a privileged daemon (on boot or on-demand), make it listen for requests, and use the IPC channel as our privilege boundary.

udisks2 is one such privileged daemon, which uses D-Bus as its IPC channel. D-Bus is a commonly-used inter-process system; one of its intended/designed uses is to let user processes and system services communicate, especially this sort of communication between a privileged daemon and its less-privileged clients.

People sometimes criticize D-Bus as not doing anything you couldn't do yourself with some AF_UNIX sockets. Well, no, of course it doesn't - the important bit of the reference implementation and the various interoperable reimplementations consists of a daemon and some AF_UNIX sockets, and the rest is a simple matter of programming. However, it's sufficient for most uses in its problem space, and is usually better than inventing your own.

The advantage of D-Bus over doing your own thing is precisely that you are not doing your own thing: good IPC design is hard, and D-Bus makes some structural decisions so that fewer application authors have to think about them. For instance, it has a central “hub” daemon (the dbus-daemon, or “message bus”) so that n communicating applications don't need O(n²) sockets; it uses the dbus-daemon to provide a total message ordering so you don't have to think about message reordering; it has a distributed naming model (which can also be used as a distributed mutex) so you don't have to design that; it has a serialization format and a type system so you don't have to design one of those; it has a framework for “activating" run-on-demand daemons so they don't have to use resources initially, implemented using a setuid helper and/or systemd; and so on.

If you have religious objections to D-Bus, you can mentally replace “D-Bus” with “AF_UNIX or something” and most of this article will still be true.

Is this OK?

In either case - exec'ing a privileged helper, or submitting a request to a privileged daemon via IPC - the privileged process has two questions that it needs to answer before it does its work:

what am I being asked to do?
should I do it?

It needs to make some sort of decision on the latter based on the information available to it. However, before we even get there, there is another layer:

did the request get there at all?

In the setuid model, there is a simple security check that you can apply: you can make /bin/mount only executable by a particular group, or only executable by certain AppArmor profiles, or similar. That works up to a point, but cannot distinguish between physically-present and not-physically-present users, or other facts that might be interesting to your local security policy. Similarly, in the IPC model, you can make certain communication channels impossible, for instance by using dbus-daemon's ability to decide which messages to deliver, or AF_UNIX sockets' filesystem permissions, or a MAC framework like AppArmor.

Both of these are quite “coarse-grained” checks which don't really understand the finer details of what is going on. If the answer to "is this safe?” is something of the form “maybe, it depends on...”, then they can't do the right thing: they must either let it through and let the domain-specific privileged process do the check, or deny it and lose potentially useful functionality.

For instance, in an AppArmor environment, some applications have absolutely no legitimate reason to talk to udisks2, so the AppArmor policy can just block it altogether. However, once again, this is a coarse-grained check: the kernel has mechanism, not policy, and it doesn't know what the service does or why. If the application does need to be able to talk to the service at all, then finer-grained access control (obeying some, but not all, requests) has to be the service's job.

dbus-daemon does have the ability to match messages in a relatively fine-grained way, based on the object path, interface and member in the message, as well as the routing information that it uses itself (i.e. the source and destination). However, it is not clear that this makes a great deal of sense conceptually: these are facts about the mechanics of the IPC, not facts about the domain-specific request (because the mechanics of the IPC are all that dbus-daemon understands). For instance, taking the udisks2 example again, dbus-daemon can't distinguish between an attempt to adjust mount options for a USB stick (probably fine) and an attempt to adjust mount options for /usr (not good).

To have a domain-specific security policy, we need a domain-specific component, for instance udisks2, to get involved. Unlike dbus-daemon, udisks2 knows that not all disks are equal, knows which categories make sense to distinguish, and can identify which categories a particular disk is in. So udisks2 can make a more informed decision.

So, a naive approach might be to write a function in udisks2 that looks something like this pseudocode:

may_i_mount_this_disk (user, disk, mount options) → boolean
{
    if (user is root || user is root-equivalent)
        return true;

    if (disk is not removable)
       return false;

    if (mount options are scary)
       return false;

    if (user is in “manipulate non-local disks” group)
        return true;

    if (user is not logged-in locally)
       return false;

    # https://en.wikipedia.org/wiki/Multiseat_configuration
    if (user is not logged-in on the same seat where the disk is
            plugged in)
       return false;

    return true;
}

Delegating the security policy to something central

The pseudocode security policy outlined above is reasonably complicated already, and doesn't necessarily cover everything that you might want to consider.

Meanwhile, not every system is the same. A general-purpose Linux distribution like Debian might run on server/mainframe systems with only remote users, personal laptops/desktops with one root-equivalent user, locked-down corporate laptops/desktops, mobile devices and so on; these systems should not necessarily all have the same security policy.

Another interesting factor is that for some privileged operations, you might want to carry out interactive authorization: ask the requesting user to confirm that the action (which might have come from a background process) should take place (like Windows' UAC), or to prove that the person currently at the keyboard is the same as the person who logged in by giving their password (like sudo).

We could in principle write code for all of this in udisks2, and in NetworkManager, and in systemd, ... - but that clearly doesn't scale, particularly if you want the security policy to be configurable. Enter polkit (formerly PolicyKit), a system service for applying security policies to actions.

The way polkit works is that the application does its domain-specific analysis of the request - in the case of udisks2, whether the device to be mounted is removable, whether the mount options are reasonable, etc. - and converts it into an action. The action gives polkit a way to distinguish between things that are conceptually different, without needing to know the specifics. For instance, udisks2 currently divides up filesystem-mounting into org.freedesktop.udisks2.filesystem-mount, org.freedesktop.udisks2.filesystem-mount-fstab, org.freedesktop.udisks2.filesystem-mount-system and org.freedesktop.udisks2.filesystem-mount-other-seat.

The application also finds the identity of the user making the request. Next, the application sends the action, the identity of the requesting user, and any other interesting facts to polkit. As currently implemented, polkit is a D-Bus service, so this is an IPC request via D-Bus. polkit consults its database of policies in order to choose one of several results:

yes, allow it
no, do not allow it
ask the user to either authenticate as themselves or as a privileged (sysadmin) user to allow it, or cancel authentication to not allow it
ask the user to authenticate the first time, but if they do, remember that for a while and don't ask again

So how does polkit decide this? The first thing is that it reads the machine-readable description of the actions, in /usr/share/polkit-1/actions, which specifies a default policy. Next, it evaluates a local security policy to see what that says. In the current version of polkit, the local security policy is configured by writing JavaScript in /etc/polkit-1/rules.d (local policy) and /usr/share/polkit-1/rules.d (OS-vendor defaults). In older versions such as the one currently shipped in Debian unstable, there was a plugin architecture; but in practice nobody wrote plugins for it, and instead everyone used the example local authority shipped with polkit, which was configured via files in /etc/polkit-1/localauthority and /etc/polkit-1/localauthority.d.

These policies can take into account useful facts like:

what is the action we're talking about?
is the user logged-in locally? are they active, i.e. they are not just on a non-current virtual console?
is the user in particular groups?

For instance, gnome-control-center on Debian installs this snippet:

polkit.addRule(function(action, subject) {
        if ((action.id == "org.freedesktop.locale1.set-locale" ||
             action.id == "org.freedesktop.locale1.set-keyboard" ||
             action.id == "org.freedesktop.hostname1.set-static-hostname" ||
             action.id == "org.freedesktop.hostname1.set-hostname" ||
             action.id == "org.gnome.controlcenter.datetime.configure") &&
            subject.local &&
            subject.active &&
            subject.isInGroup ("sudo")) {
                    return polkit.Result.YES;
            }
});

which is reasonably close to being pseudocode for “active local users in the sudo group may set the system locale, keyboard layout, hostname and time, without needing to authenticate”. A system administrator could of course override that by dropping a higher-priority policy for some or all of these actions into /etc/polkit-1/rules.d.

Summary

Kernel-based permission checks are not sufficiently fine-grained to be able to express some quite reasonable security policies
Fine-grained access control needs domain-specific understanding
The kernel doesn't have that information (and neither does dbus-daemon)
The privileged service that does the domain-specific thing can provide the domain-specific understanding to turn the request into an action
polkit evaluates a configurable policy to determine whether privileged services should carry out requested actions