Wednesday, December 3, 2008

Linux Foundation fellowship, 6 months in

Not quite 6 months since I started the Linux Foundation fellowship, it's time to analyze and reflect on what has (or hasn't) been accomplished.

Some statistics

I took over maintainership of man-pages at the start of November 2004, with the first release being man-pages-2.00. From then until the fellowship started in the middle of May this year (a period of 185 weeks), I probably spent between 0 and 2.5 days a week on man-pages, most of it done as private, volunteer work. (For a period of around a year, I probably managed about half a day a week as part of my day job; thanks Google!) I'd guess it was a bit better than a day a week on average (let's say 1.25 days), and we could roughly estimate that as the equivalent of 45 working weeks.
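The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the five-day working week is an assumption that the text does not state, and the names are made up for illustration.

```python
# Rough reconstruction of the pre-fellowship effort estimate.
CALENDAR_WEEKS = 185         # November 2004 to mid-May 2008 (from the text)
DAYS_PER_WEEK = 1.25         # guessed average days/week spent on man-pages (from the text)
WORKING_DAYS_PER_WEEK = 5    # assumption: a five-day working week


def working_weeks(calendar_weeks, days_per_week, week_length=WORKING_DAYS_PER_WEEK):
    """Convert a part-time effort over many calendar weeks into full working weeks."""
    return calendar_weeks * days_per_week / week_length


estimate = working_weeks(CALENDAR_WEEKS, DAYS_PER_WEEK)
print(f"{estimate:.2f} working weeks")
```

Under that assumption the result is a little over 46 working weeks, which is consistent with the rough figure of 45 used in the rest of this post.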

Since the fellowship started, I've worked for about 25 weeks on man-pages; that is, somewhat more than half of the estimated time that I spent on man-pages in the preceding 3.5 years. The first release during my tenure of the fellowship was man-pages-2.80, and since then there have been 15 more (man-pages-3.00 through man-pages-3.14).

What I'm expecting is that the limiting factor in the progress of man-pages is the availability of my time. If I get to work at around four times the rate I did before, then we should see a corresponding increase in the progress of man-pages. Very roughly, in the last 6 months, progress should have been somewhat more than 50% of what it was in the previous 3.5 years.

So here's a first comparison:






Period               Number of releases
Pre-fellowship       80
During fellowship    16

Well, that doesn't look so good. But there's no question that there's more work going on for each release nowadays. Here's another simple statistic, derived from the commit logs:






Period               Number of commits
Pre-fellowship       3610
During fellowship    1852


Commits in the last 6 months were about 51% of the total during the previous 3.5 years. That seems roughly in line with expectations, and supports the theory that there's a lot more work going into each man-pages release nowadays. Of course, commits vary a lot in size, ranging from a spelling fix, to a complete new page, and going through to some of the enormous global formatting fixes that took place in the man-pages-2.* series, so this is a very rough measure. (One of the commits cleaning up source file layout in man-pages-2.47 had a diff size of more than 60000 lines(!). There were many other large formatting commits in the man-pages-2.* series, which is why trying to compare the volume of diffs before and during the fellowship doesn't produce a useful metric.)

Another rough measure is how many man pages were added to the set over time:





Period               Number of new pages added
Pre-fellowship       93
During fellowship    56


Again, that's roughly in line with expectations, with the number of pages added during the fellowship being somewhat more than 50% of the previous period.
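To put the three comparisons so far side by side, here is a small Python sketch that computes each "during fellowship" figure as a percentage of the pre-fellowship figure and compares it with what the working-week estimates would predict. All numbers come from this post; the variable and function names are just illustrative.

```python
# Figures taken from the text and tables in this post.
WORK_WEEKS = {"pre": 45, "during": 25}        # estimated full working weeks
CALENDAR_WEEKS = {"pre": 185, "during": 28}   # calendar weeks in each period
METRICS = {
    "releases": (80, 16),
    "commits": (3610, 1852),
    "new pages": (93, 56),
}


def ratio_percent(pre, during):
    """Express the fellowship-period figure as a percentage of the earlier one."""
    return 100.0 * during / pre


# Expected ratio if progress simply tracks the time available.
expected = ratio_percent(WORK_WEEKS["pre"], WORK_WEEKS["during"])
print(f"expected from working weeks: {expected:.1f}%")

for name, (pre, during) in METRICS.items():
    print(f"{name:>10}: {ratio_percent(pre, during):.1f}%")

# Relative work rate: working weeks per calendar week, during vs. before.
rate_pre = WORK_WEEKS["pre"] / CALENDAR_WEEKS["pre"]
rate_during = WORK_WEEKS["during"] / CALENDAR_WEEKS["during"]
speedup = rate_during / rate_pre
print(f"work rate is about {speedup:.1f}x the pre-fellowship rate")
```

With these inputs the expected ratio is about 56%, against roughly 20% for releases, 51% for commits, and 60% for new pages, and the work rate comes out a little under four times the earlier one.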

But where did the new pages come from?





Period               By mtk    By others    By mtk + other(s)    Imports
Pre-fellowship       52        22           13                   6
During fellowship    50        4            1                    1



"mtk" is me. "Other(s)" is someone else. "Imports" are pages under a free license that I scooped up from some other source (e.g., found on the net, in a distro, or in BSD).

On the negative side, I wrote the vast majority of new pages that have been added so far during the fellowship. On the positive side, Paul Jackson contributed the single biggest page, cpuset(7), which became the fourth largest page in man-pages. (Also worth noting: in the man-pages-2.* releases, a total of 28 pages were deleted, mainly obsolete pages in Section 1.) In fact, I had hoped to be able to get even more pages written, but other tasks, such as testing, API review, and kernel patches, have also taken up a significant fraction of my time during the fellowship. When considered as a (calendar) monthly rate, contributions of new pages by others are, unfortunately, essentially unchanged since before the fellowship.

So, progress towards improving contributions by others, at least in terms of new pages, has not been good. However, my gut feeling has been that more people are actually contributing to man-pages than before: the fact that there is a full-time maintainer means people are rather more likely to send bug reports, suggestions, and patches for existing pages. Here's a statistic that bears it out:






Period               Average contributors/week
Pre-fellowship       2.8
During fellowship    5.9

This was calculated by summing the number of contributors in each of the change logs in all of the releases over the two periods and then dividing by the number of calendar weeks in each period (185 and 28 respectively). 5.9 contributors per week is still much lower than I'd like, but my feeling is that the rate has increased steadily over the time of the fellowship, so that the current rate is already higher than 5.9, and set to increase further. (Another factor that may also have helped boost the number of reports is that in December 2007 I started adding a COLOPHON to each man page describing how to report bugs, and this change would have filtered into distribution CDs a few months later.)
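The calculation just described can be sketched in Python as follows. The per-release counts in the example are hypothetical, made up purely to show the method; only the averages and week counts come from this post. Because the method sums over change logs, someone who contributed to several releases is counted once per release.

```python
def contributors_per_week(per_release_counts, calendar_weeks):
    """Sum the contributors listed in each release's change log,
    then divide by the number of calendar weeks in the period."""
    return sum(per_release_counts) / calendar_weeks


# Hypothetical per-release contributor counts, for illustration only
# (not real man-pages data): four releases over four calendar weeks.
example_counts = [4, 7, 10, 7]
example_rate = contributors_per_week(example_counts, 4)
print(f"example rate: {example_rate:.1f} contributors/week")

# Working backwards from the published averages gives the approximate
# number of per-release contributor mentions in each period.
published = {
    "Pre-fellowship": (2.8, 185),      # (average per week, calendar weeks)
    "During fellowship": (5.9, 28),
}
implied_totals = {label: rate * weeks for label, (rate, weeks) in published.items()}
for label, total in implied_totals.items():
    print(f"{label}: about {total:.0f} contributor mentions")
```

Since the published averages are rounded to one decimal place, the implied totals (roughly 518 and 165) are approximate.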

Timeliness of documentation

Things have definitely got better during the fellowship. Most additions and changes to the kernel-userland interface during the time of the fellowship have been documented in man-pages pretty much as they occur. (This contrasts with earlier times, where interface changes have sometimes been followed only months (or in extreme cases years) later by man page updates.) Most notably, Ulrich Drepper's new system calls in Linux 2.6.27 saw man pages go out a few days after the release of that kernel.

Testing and bug reporting

I've done a fair bit of this over the course of the fellowship. Most new system calls and system call extensions got tested by me before they hit mainline. This uncovered a few bugs which were then fixed. The biggest single piece of work here was for the utimensat(2) system call, producing a test suite (later integrated into LTP), along with patches that fixed the 5 or so bugs in the interface (details here).

Many existing glibc functions also got tested as I updated the man pages for them. Most notably, updates to the man pages produced about 35 bug reports related to error reporting by the math functions. The addition of man pages for various pthreads functions has also been accompanied by a lot of testing, and a half dozen or so bug reports.

API design review

Most new system calls and system call extensions got reviewed before going into mainline. (My record on other kernel interfaces, such as /proc files, was more spotty, though.) Among other things, this resulted in a redesign of the proposed extension of the accept() system call (originally proposed as paccept(), with a signal set argument whose necessity was dubious, later revised to accept4(), which should appear in kernel 2.6.28).

Miscellaneous

A summary of other man-pages work that I've done during the time of the fellowship:
  • After my presentation at LPC for the kernel-userland interface track, I finally got round to an idea I'd been considering for a while: creating the linux-api mailing list. The rationale for the list is that all patches that cause API/ABI changes should be CCed to the list, so that the many parties who are interested in API/ABI changes (e.g., man-pages, LSB, libc developers, kernel developers, testers such as the folk at LTP, and of course userland developers) can get an idea of what's going on. Most people still don't read Documentation/SubmitChecklist, to know they should be using this list, so I try to regularly chase people to use it (and some others also help in that regard), and by now at least some people do so without prompting.
  • I continue to respond to many bug reports in the manpages and manpages-dev components of Debian's bug tracking system. This has mutual benefits: on the one hand, although I'm not actually a member of Debian, I'm by far the most active fixer of their bug reports; on the other hand, most Debian bug reports for man pages really apply to the upstream pages (I ignore the ones that don't), and so the reports provide a valuable source of pointers to things that need fixing in man-pages. A big thank you to Debian users, who produce far more (and more useful) man-pages bug reports than all of the other distributions put together!
  • Working on man-pages led me to find various deficiencies in POSIX.1 specifications, resulting in around a half dozen bug reports to the Austin group.
