Wednesday, November 17, 2010

The Linux Programming Interface is released

I'm happy to announce that my book, The Linux Programming Interface (TLPI), is now available. TLPI is a detailed guide and reference for system programming on Linux and UNIX systems, 1552 pages in length, with 115 diagrams, 88 tables, nearly 200 example programs, and over 200 exercises.

The TLPI web site contains the detailed table of contents (PDF or HTML), preface, sample chapters, and index for download. It also provides complete source code for the book (both as a tarball for download and browsable as individual files online). You can find a detailed description of TLPI on the web site here.

A few reviews (all extremely positive) have already appeared. You can find pointers to reviews here.

For information on ordering (a nice way to support the work of the man-pages maintainer!), look here.

(Post updated 2010-11-20, to fix a typo and add a detail to the description.)

Sunday, November 14, 2010

man-pages-3.31 is released

I've uploaded man-pages-3.31 into the release directory (or view the online pages). This is a fairly small release. The most notable changes in man-pages-3.31 are as follows:

  • The getrlimit(2) man page adds documentation of the prlimit() system call, which was new in Linux 2.6.36.
  • The inotify(7) man page adds documentation of the IN_EXCL_UNLINK flag, also new in Linux 2.6.36.

Sunday, November 7, 2010

System call credential checking (a tale of inconsistency)

While looking at the new prlimit() system call in Linux 2.6.36, I surveyed the various system calls that allow one process to change the operation or attributes of another (arbitrary) process. In general, these system calls require either that the caller is privileged (i.e., has some capability) or that there is a match between the credentials (user or group IDs) of the calling process and the target process.

There's a great deal of inconsistency. As at 2.6.36, here's what we have (in the following, uid means the real UID of the caller, euid means the effective UID, and suid means the saved set-user-ID; a similar convention applies for the group IDs--thus gid, egid, sgid; and a "t-" prefix means the corresponding credentials of the target process):

  • setpriority(), sched_setscheduler(), sched_setparam(), sched_setaffinity(): CAP_SYS_NICE || euid == t-uid || euid == t-euid. This is sane: you can make changes to another process if you have the right capability or you own the process. Here, "you" means the user identified by the caller's effective UID: you can change the attributes of a process that was originally created by you (euid == t-uid) or one that has assumed (via the set-user-ID mechanism) your identity (euid == t-euid). POSIX specifies that the checks for setpriority() are uid == t-euid || euid == t-euid; the Linux semantics are arguably saner (and are consistent with historical BSD behavior). POSIX specifies sched_setscheduler() and sched_setparam() but does not specify their permission-checking semantics.
  • ioprio_set(): CAP_SYS_NICE || uid == t-uid || euid == t-uid. The caller is privileged, or the caller's real or effective UID matches the target process's UID. There's no obvious reason for the inconsistency with setpriority().
  • migrate_pages(), move_pages(): CAP_SYS_NICE || uid == t-uid || uid == t-suid || euid == t-uid || euid == t-suid. Like setpriority(), but you can also make changes if your real UID matches target credentials. Again, there's no obvious reason for the inconsistency with setpriority().
  • kill(), killpg(): CAP_KILL || uid == t-uid || uid == t-suid || euid == t-uid || euid == t-suid. The UID-matching semantics are as required by POSIX: the real or effective UID of the caller must match the real or saved set-user-ID of the target.
  • prlimit(): CAP_SYS_RESOURCE || ((uid == t-uid && uid == t-euid && uid == t-suid) && (gid == t-gid && gid == t-egid && gid == t-sgid)). Now we start to get into strange territory. Using CAP_SYS_RESOURCE makes sense, because CAP_SYS_RESOURCE is used for the privilege checks in the setrlimit() system call. However, requiring that all of the UIDs of the target match the real UID of the caller is quite inconsistent with any of the other APIs. Adding an analogous check for the group IDs further compounds the inconsistency.
One thing to note: the behavior of most of the Linux-specific system calls (i.e., ioprio_set(), move_pages(), migrate_pages(), and prlimit()) was documented only after the implementation, which I'd argue was a contributing factor to the inconsistencies described above.
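
To make the differences concrete, here is a minimal sketch that models the checks listed above as plain Python predicates and applies them to two sample targets. It is only a model of the expressions as written in this post, not kernel code: the Creds class, the may_*() function names, and the sample IDs are all hypothetical, and any further checks the kernel may apply (for example, by security modules) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Creds:
    """Hypothetical container for the user and group IDs of one process."""
    uid: int    # real UID
    euid: int   # effective UID
    suid: int   # saved set-user-ID
    gid: int    # real GID
    egid: int   # effective GID
    sgid: int   # saved set-group-ID


# In each predicate, c is the caller and t is the target; the boolean
# keyword argument stands in for "the caller has the capability".

def may_setpriority(c, t, cap_sys_nice=False):
    # Also models sched_setscheduler(), sched_setparam(), sched_setaffinity()
    return cap_sys_nice or c.euid == t.uid or c.euid == t.euid


def may_ioprio_set(c, t, cap_sys_nice=False):
    return cap_sys_nice or c.uid == t.uid or c.euid == t.uid


def may_move_pages(c, t, cap_sys_nice=False):
    # Also models migrate_pages()
    return (cap_sys_nice
            or c.uid in (t.uid, t.suid)
            or c.euid in (t.uid, t.suid))


def may_kill(c, t, cap_kill=False):
    # Also models killpg()
    return (cap_kill
            or c.uid in (t.uid, t.suid)
            or c.euid in (t.uid, t.suid))


def may_prlimit(c, t, cap_sys_resource=False):
    # All three target UIDs must equal the caller's real UID, and all
    # three target GIDs must equal the caller's real GID.
    uids_match = c.uid == t.uid == t.euid == t.suid
    gids_match = c.gid == t.gid == t.egid == t.sgid
    return cap_sys_resource or (uids_match and gids_match)


# Sample credentials (assumed values, chosen only for illustration).
caller = Creds(uid=1000, euid=1000, suid=1000, gid=1000, egid=1000, sgid=1000)

# A set-user-ID-root program that was started by user 1000.
setuid_root = Creds(uid=1000, euid=0, suid=0, gid=1000, egid=1000, sgid=1000)

# A process started by root that has assumed user 1000's effective UID.
assumed_user = Creds(uid=0, euid=1000, suid=0, gid=0, egid=0, sgid=0)

CHECKS = [may_setpriority, may_ioprio_set, may_move_pages, may_kill, may_prlimit]


def survey(c, t):
    """Return a dict mapping each check's name to its (unprivileged) result."""
    return {check.__name__: check(c, t) for check in CHECKS}


if __name__ == "__main__":
    for label, target in [("setuid_root", setuid_root),
                          ("assumed_user", assumed_user)]:
        print(label, survey(caller, target))
```

Under these assumptions, an unprivileged caller passes every check except the prlimit() one against the set-user-ID-root target, and passes only the setpriority()-style check against the process that has merely assumed the caller's effective UID.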

Monday, November 1, 2010

man-pages-3.30 is released

I've uploaded man-pages-3.30 into the release directory (or view the online pages). The most notable changes in man-pages-3.30 are as follows:

  • A new kexec_load(2) man page documents the kexec_load() system call. Thanks to Andi Kleen.
  • A new lio_listio(3) page documents the lio_listio() library function.
  • The reboot(2) page adds documentation of the LINUX_REBOOT_KEXEC command.
  • The unshare(2) page adds documentation of CLONE_NEWIPC, CLONE_NEWNET, CLONE_SYSVSEM, and CLONE_NEWUTS.
  • Various consistency fixes were made across a wide range of pages.