This is Simon McVittie's software development blog. Main site: pseudorandom.co.uk

Why polkit (or, how to mount a disk on modern Linux)

I've recently found myself explaining polkit (formerly PolicyKit) to one of Collabora's clients, and thought that blogging about the same topic might be useful for other people who are confused by it; so, here is why udisks2 and polkit are the way they are.

As always, opinions in this blog are my own, not Collabora's.

Privileged actions

Broadly, there are two ways a process can do something: it can do it directly (i.e. ask the kernel directly), or it can use inter-process communication to ask a service to do that operation on its behalf. If it does it directly, the components that say whether it can succeed are the Linux kernel's normal permissions checks (DAC), and if configured, AppArmor, SELinux or a similar MAC layer. All very simple so far.

Unfortunately, the kernel's relatively coarse-grained checks are not sufficient to express the sorts of policies that exist on a desktop/laptop/mobile system. My favourite example for this sort of thing is mounting filesystems. If I plug in a USB stick with a FAT filesystem, it's reasonable to expect my chosen user interface to either mount it automatically, or let me press a button to mount it. Similarly, to avoid data loss, I should be able to unmount it when I'm finished with it. However, mounting and unmounting a USB stick is fundamentally the same system call as mounting and unmounting any other filesystem - and if ordinary users can do arbitrary mount system calls, they can cause all sorts of chaos, for instance by mounting a filesystem that contains setuid executables (privilege escalation), or umounting a critical OS filesystem like /usr (denial of service). Something needs to arbitrate: “you can mount filesystems, but only under certain conditions”.

The kernel developer motto for this sort of thing is “mechanism, not policy”: they are very keen to avoid encoding particular environments' policies (the sort of thing you could think of as “business rules”) in the kernel, because that makes it non-generic and hard to maintain. As a result, direct mount/unmount actions are only allowed by privileged processes, and it's up to user-space processes to arrange for a privileged process to make the desired mount syscall.

Here are some other privileged actions which laptop/desktop users can reasonably expect to “just work”, with or without requiring a sysadmin-like (root-equivalent) user:

  • reconfiguring networking (privileged because, in the general case, it's an availability and potentially integrity issue)
  • installing, upgrading or removing packages (privileged because, in the general case, it can result in arbitrary root code execution)
  • suspending or shutting down the system (privileged because you wouldn't want random people doing this on your server, but should normally be allowed on e.g. laptops for people with physical access, because they could just disconnect the power anyway)

In environments that use a MAC framework like AppArmor, actions that would normally be allowed can become privileged: for instance, in a framework for sandboxed applications, most apps shouldn't be allowed to record audio. This prevents carrying out these actions directly, again resulting in the only way to achieve them being to ask a service to carry out the action.

Ask a system service to do it

On to the next design, then: I can submit a request to a privileged process, which does some checks to make sure I'm not trying to break the system (or alternatively, that I have enough sysadmin rights that I'm allowed to break the system if I want to), and then does the privileged action for me.

You might think I'm about to start discussing D-Bus and daemons, but actually, a prominent earlier implementation of this was mount(8), which is normally setuid root:

% ls -l /bin/mount
-rwsr-xr-x 1 root root 40000 May 22 11:37 /bin/mount

If you look at it from an odd angle, this is inter-process communication across a privilege boundary: I run the setuid executable, creating a process. Because the executable has the setuid bit set, the kernel makes the process highly privileged: its effective uid is root, and it has all the necessary capabilities to mount filesystems. I submit the request by passing it in the command-line arguments. mount does some checks - specifically, it looks in /etc/fstab to see whether the filesystem I'm trying to mount has the “user” or “users” flag - then carries out the mount system call.

There are a few obvious problems with this:

  • When machines had a static set of hardware devices (and a sysadmin who knew how to configure them), it might have made sense to list them all in /etc/fstab; but this is not a useful solution if you can plug in any number of USB drives, or if you are a non-expert user with Linux on your laptop. The decision ought to be based on general attributes of devices, such as “is removable?”, and on the role of the machine.
  • Setuid executables are alarmingly easy to get wrong so it is not necessarily wise to assume that mount(8) is safe to be setuid.
  • One fact that a reasonable security policy might include is “users who are logged in remotely should have less control over physically present devices than those who are physically present” - but that sort of thing can't be checked by mount(8) without specifically teaching the mount binary about it.

Ask a system service to do it, via D-Bus or other IPC

To avoid the issues of setuid, we could use inter-process communication in the traditional sense: run a privileged daemon (on boot or on-demand), make it listen for requests, and use the IPC channel as our privilege boundary.

udisks2 is one such privileged daemon, which uses D-Bus as its IPC channel. D-Bus is a commonly-used inter-process system; one of its intended/designed uses is to let user processes and system services communicate, especially this sort of communication between a privileged daemon and its less-privileged clients.

People sometimes criticize D-Bus as not doing anything you couldn't do yourself with some AF_UNIX sockets. Well, no, of course it doesn't - the important bit of the reference implementation and the various interoperable reimplementations consists of a daemon and some AF_UNIX sockets, and the rest is a simple matter of programming. However, it's sufficient for most uses in its problem space, and is usually better than inventing your own.

The advantage of D-Bus over doing your own thing is precisely that you are not doing your own thing: good IPC design is hard, and D-Bus makes some structural decisions so that fewer application authors have to think about them. For instance, it has a central “hub” daemon (the dbus-daemon, or “message bus”) so that n communicating applications don't need O(n²) sockets; it uses the dbus-daemon to provide a total message ordering so you don't have to think about message reordering; it has a distributed naming model (which can also be used as a distributed mutex) so you don't have to design that; it has a serialization format and a type system so you don't have to design one of those; it has a framework for “activating" run-on-demand daemons so they don't have to use resources initially, implemented using a setuid helper and/or systemd; and so on.

If you have religious objections to D-Bus, you can mentally replace “D-Bus” with “AF_UNIX or something” and most of this article will still be true.

Is this OK?

In either case - exec'ing a privileged helper, or submitting a request to a privileged daemon via IPC - the privileged process has two questions that it needs to answer before it does its work:

  • what am I being asked to do?
  • should I do it?

It needs to make some sort of decision on the latter based on the information available to it. However, before we even get there, there is another layer:

  • did the request get there at all?

In the setuid model, there is a simple security check that you can apply: you can make /bin/mount only executable by a particular group, or only executable by certain AppArmor profiles, or similar. That works up to a point, but cannot distinguish between physically-present and not-physically-present users, or other facts that might be interesting to your local security policy. Similarly, in the IPC model, you can make certain communication channels impossible, for instance by using dbus-daemon's ability to decide which messages to deliver, or AF_UNIX sockets' filesystem permissions, or a MAC framework like AppArmor.

Both of these are quite “coarse-grained” checks which don't really understand the finer details of what is going on. If the answer to "is this safe?” is something of the form “maybe, it depends on...”, then they can't do the right thing: they must either let it through and let the domain-specific privileged process do the check, or deny it and lose potentially useful functionality.

For instance, in an AppArmor environment, some applications have absolutely no legitimate reason to talk to udisks2, so the AppArmor policy can just block it altogether. However, once again, this is a coarse-grained check: the kernel has mechanism, not policy, and it doesn't know what the service does or why. If the application does need to be able to talk to the service at all, then finer-grained access control (obeying some, but not all, requests) has to be the service's job.

dbus-daemon does have the ability to match messages in a relatively fine-grained way, based on the object path, interface and member in the message, as well as the routing information that it uses itself (i.e. the source and destination). However, it is not clear that this makes a great deal of sense conceptually: these are facts about the mechanics of the IPC, not facts about the domain-specific request (because the mechanics of the IPC are all that dbus-daemon understands). For instance, taking the udisks2 example again, dbus-daemon can't distinguish between an attempt to adjust mount options for a USB stick (probably fine) and an attempt to adjust mount options for /usr (not good).

To have a domain-specific security policy, we need a domain-specific component, for instance udisks2, to get involved. Unlike dbus-daemon, udisks2 knows that not all disks are equal, knows which categories make sense to distinguish, and can identify which categories a particular disk is in. So udisks2 can make a more informed decision.

So, a naive approach might be to write a function in udisks2 that looks something like this pseudocode:

may_i_mount_this_disk (user, disk, mount options) → boolean
{
    if (user is root || user is root-equivalent)
        return true;

    if (disk is not removable)
       return false;

    if (mount options are scary)
       return false;

    if (user is in “manipulate non-local disks” group)
        return true;

    if (user is not logged-in locally)
       return false;

    # https://en.wikipedia.org/wiki/Multiseat_configuration
    if (user is not logged-in on the same seat where the disk is
            plugged in)
       return false;

    return true;
}

Delegating the security policy to something central

The pseudocode security policy outlined above is reasonably complicated already, and doesn't necessarily cover everything that you might want to consider.

Meanwhile, not every system is the same. A general-purpose Linux distribution like Debian might run on server/mainframe systems with only remote users, personal laptops/desktops with one root-equivalent user, locked-down corporate laptops/desktops, mobile devices and so on; these systems should not necessarily all have the same security policy.

Another interesting factor is that for some privileged operations, you might want to carry out interactive authorization: ask the requesting user to confirm that the action (which might have come from a background process) should take place (like Windows' UAC), or to prove that the person currently at the keyboard is the same as the person who logged in by giving their password (like sudo).

We could in principle write code for all of this in udisks2, and in NetworkManager, and in systemd, ... - but that clearly doesn't scale, particularly if you want the security policy to be configurable. Enter polkit (formerly PolicyKit), a system service for applying security policies to actions.

The way polkit works is that the application does its domain-specific analysis of the request - in the case of udisks2, whether the device to be mounted is removable, whether the mount options are reasonable, etc. - and converts it into an action. The action gives polkit a way to distinguish between things that are conceptually different, without needing to know the specifics. For instance, udisks2 currently divides up filesystem-mounting into org.freedesktop.udisks2.filesystem-mount, org.freedesktop.udisks2.filesystem-mount-fstab, org.freedesktop.udisks2.filesystem-mount-system and org.freedesktop.udisks2.filesystem-mount-other-seat.

The application also finds the identity of the user making the request. Next, the application sends the action, the identity of the requesting user, and any other interesting facts to polkit. As currently implemented, polkit is a D-Bus service, so this is an IPC request via D-Bus. polkit consults its database of policies in order to choose one of several results:

  • yes, allow it
  • no, do not allow it
  • ask the user to either authenticate as themselves or as a privileged (sysadmin) user to allow it, or cancel authentication to not allow it
  • ask the user to authenticate the first time, but if they do, remember that for a while and don't ask again

So how does polkit decide this? The first thing is that it reads the machine-readable description of the actions, in /usr/share/polkit-1/actions, which specifies a default policy. Next, it evaluates a local security policy to see what that says. In the current version of polkit, the local security policy is configured by writing JavaScript in /etc/polkit-1/rules.d (local policy) and /usr/share/polkit-1/rules.d (OS-vendor defaults). In older versions such as the one currently shipped in Debian unstable, there was a plugin architecture; but in practice nobody wrote plugins for it, and instead everyone used the example local authority shipped with polkit, which was configured via files in /etc/polkit-1/localauthority and /etc/polkit-1/localauthority.d.

These policies can take into account useful facts like:

  • what is the action we're talking about?
  • is the user logged-in locally? are they active, i.e. they are not just on a non-current virtual console?
  • is the user in particular groups?

For instance, gnome-control-center on Debian installs this snippet:

polkit.addRule(function(action, subject) {
        if ((action.id == "org.freedesktop.locale1.set-locale" ||
             action.id == "org.freedesktop.locale1.set-keyboard" ||
             action.id == "org.freedesktop.hostname1.set-static-hostname" ||
             action.id == "org.freedesktop.hostname1.set-hostname" ||
             action.id == "org.gnome.controlcenter.datetime.configure") &&
            subject.local &&
            subject.active &&
            subject.isInGroup ("sudo")) {
                    return polkit.Result.YES;
            }
});

which is reasonably close to being pseudocode for “active local users in the sudo group may set the system locale, keyboard layout, hostname and time, without needing to authenticate”. A system administrator could of course override that by dropping a higher-priority policy for some or all of these actions into /etc/polkit-1/rules.d.

Summary

  • Kernel-based permission checks are not sufficiently fine-grained to be able to express some quite reasonable security policies
  • Fine-grained access control needs domain-specific understanding
  • The kernel doesn't have that information (and neither does dbus-daemon)
  • The privileged service that does the domain-specific thing can provide the domain-specific understanding to turn the request into an action
  • polkit evaluates a configurable policy to determine whether privileged services should carry out requested actions
still aiming to be the universal operating system

Debian's latest round of angry mailing list threads have been about some combination of init systems, future direction and project governance. The details aren't particularly important here, and pretty much everything worthwhile in favour of or against each position has already been said several times, but I think this bit is important enough that it bears repeating: the reason I voted "we didn't need this General Resolution" ahead of the other options is that I hope we can continue to use our normal technical and decision-making processes to make Debian 8 the best possible OS distribution for everyone. That includes people who like systemd, people who dislike systemd, people who don't care either way and just want the OS to work, and everyone in between those extremes.

I think that works best when we do things, and least well when a lot of time and energy get diverted into talking about doing things. I've been trying to do my small part of the former by fixing some release-critical bugs so we can release Debian 8. Please join in, and remember to write good unblock requests so our hard-working release team can get through them in a finite time. I realise not everyone will agree with my idea of which bugs, which features and which combinations of packages are highest-priority; that's fine, there are plenty of bugs to go round!

Regarding init systems specifically, Debian 'jessie' currently works with at least systemd-sysv or sysvinit-core as pid 1 (probably also Upstart, but I haven't tried that) and I'm confident that Debian developers won't let either of those regress before it's released as Debian 8.

I expect the freeze for Debian 'stretch' (presumably Debian 9) to be a couple of years away, so it seems premature to say anything about what will or won't be supported there; that depends on what upstream developers do, and what Debian developers do, between now and then. What I can predict is that the components that get useful bug reports, active maintenance, thorough testing, careful review, and similar help from contributors will work better than the things that don't; so if you like a component and want it to be supported in Debian, you can help by, well, supporting it.


PS. If you want the Debian 8 installer to leave you running sysvinit as pid 1 after the first reboot, here's a suitable incantation to add to the kernel command-line in the installer's bootloader. This one certainly worked when KiBi asked for testing a few days ago:

preseed/late_command="in-target apt-get install -y sysvinit-core"

I think that corresponds to this line in a preseeding file, if you use those:

d-i preseed/late_command string in-target apt-get install -y sysvinit-core

A similar apt-get command, without the in-target prefix, should work on an installed system that already has systemd-sysv. Depending on other installed software, you might need to add systemd-shim to the command line too, but when I tried it, apt-get was able to work that out for itself.

If you use aptitude instead of apt-get, double-check what it will do before saying "yes" to this particular switchover: its heuristic for resolving conflicts seems to be rather more trigger-happy about removing packages than the one in apt-get.

Posted
Debian activities

Most of my recent Debian work has been the usual pkg-telepathy stuff (mainly in experimental while we're frozen), and hacking on the Quake III engine.

Having started working on OpenArena DFSG-freeness as random bug-squashing a year ago, I've accidentally ended up taking over the package. The situation for Debian 6.0 (squeeze) looks something like this:

  • openarena contains both the engine (a modified ioquake3, which I modified further for Debian) and the game logic compiled to native code
  • openarena-data has had the bytecode game logic stripped out, since no Free compiler can produce it; the bytecode has been replaced with stub files that direct our modified engine to load native code

The plan for Quake III in Debian 7.0 (wheezy) has already started in experimental:

  • ioquake3 contains the ioquake3.org port of the Quake III engine, which I've modified to be able to run either Quake III Arena or OpenArena based on runtime options
  • openarena contains the game logic compiled to native code (in experimental it still contains source code for the engine, but it's no longer patched or compiled, and I'll hopefully drop it entirely after the next upstream release), and launcher scripts to run it under the ioquake3 engine
  • openarena-data is much like it is now, although it might move to data.debian.org if that happens
  • quake3 (currently in the NEW queue) contains launcher scripts to run Quake III Arena under the ioquake3 engine, requiring a non-distributable quake3-data package
  • game-data-packager version 23 or later can build the quake3-data package for local use, if you give it pak0.pk3 (from the CD-ROM) and the latest Quake III patch

I haven't been doing as much QA work as the RCBW crowd, but here's some halfway-recent bug squashing in the hope that it motivates others:

I also started looking at the series of bugs regarding Flash not compiled from source, but I mostly got distracted by all the webapps having a contrib/ directory containing a million embedded code copies...

Announcing gfcombinefs

gfcombinefs is a side project I did some work on a few weeks ago. It combines several "shares" of a secret file previously split using Shamir secret sharing, to produce the original secret, and presents it as a file in a FUSE filesystem. It uses libgfshare for the actual mathematics, and expects its input to have been split with the gfsplit tool shipped with libgfshare.

At this stage of development, I suggest not trusting it with important data, like the GnuPG secret keyrings it's designed for. However, I hope that with some feedback from others I can get it into a state where it's ready to be released and packaged (perhaps I'm being unnecessarily conservative, but for something that deals with GnuPG keys, it seems wise to be careful).

Source code is available in a git repository, and I'd welcome contributions, bug reports (other than the limitations that are already listed in the documentation) and in particular, a systematic code audit from someone (the good news is that there isn't very much code, so that shouldn't actually take long).

Posted
RC Bugs of W48-W49

More release critical bug squashing of the "week" (fortnight, really):

Removals from testing and proposed removals from unstable:

Possible candidates for removal:

  • bfm (unmaintained upstream since 2004, declining popcon score)

Other drive-by QA:

Older posts:

RC Bugs of W47
Posted
telepathy-qt4 0.2.0: first shared library release
Posted
DSpam and how not to do it
Posted
Converting Debian packaging from bzr to git
Posted
Why you shouldn't block on D-Bus calls
Posted
Encrypted root filesystem on a Debian laptop
Posted
Compiling against uninstalled versions of libraries
Posted
Integrating Enemies of Carlotta with Postfix
Posted
Speeding up builds with ccache and icecc
Posted
Spec-writing On A Plane
Posted
\o/
Posted
Implementing D-Bus services with telepathy-glib
Posted
Here's how it works
Posted
A magical xrandr incantation
Posted

This blog is powered by ikiwiki.