Not quite 6 months since I started the Linux Foundation fellowship, it's time to analyze and reflect on what has (or hasn't) been accomplished.
Some statistics
I took over maintainership of man-pages at the start of November 2004, with the first release being man-pages-2.00. From then until the fellowship started in the middle of May this year (a period of 185 weeks), I probably spent between 0 and 2.5 days a week on man-pages, most of it done as private, volunteer work. (For a period of around a year, I probably managed up about half day a week as part of my day job; thanks Google!) I'd guess it was a bit better than day a week on average (let's say 1.25 days), and we could roughly estimate that as the equivalent of 45 working weeks.
Since the fellowship started, I've worked for about 25 weeks on man-pages; that is, somewhat more than half of the estimated time that I spent on man-pages in the preceding 3.5 years. The first release during my tenure of the fellowship was man-pages-2.80, and since then there have been 15 more (man-pages-3.00 through man-pages-3.14).
What I'm expecting is that the limiting factor in the progress of man-pages is the availability of my time. If I get to work at around four times the rate I did before, then we should see a corresponding increase in the progress of man-pages. Very roughly, in the last 6 months, progress should have been somewhat more than 50% of what it was in the previous 3.5 years.So here's a first comparison:
Period | Number of releases |
Pre-fellowship | 80 |
During fellowship | 16 |
Well, that doesn't look so good. But there's no question that there's more work going on for each release nowadays. Here's another simple statistic, derived from the commit logs:
Period | Number of commits |
Pre-fellowship | 3610 |
During fellowship | 1852 |
Commits in the last 6 months were nearly 50% of the total during the previous 3.5 years. That seems roughly in line with expectations, and supports the theory that there's a lot more work going into each man-pages release nowadays. Of course, commits vary a lot in size, ranging from a spelling fix, to a complete new page, and going through to some of the enormous global formatting fixes that took place in the man-pages-2.* series, so this is a very rough measure. (One of the commits cleaning up source files layout in man-pages-2.47 had a diff size of more than 60000 lines(!). There were many other large formatting commits in the man-pages-2.*, which is why trying to compare the volume of diffs before and during the fellowship doesn't produce a useful metric.)Another rough measure is how many man pages were added to the set over time:
Period | Number of new pages added |
Pre-fellowship | 93 |
During fellowship | 56 |
Again, that's roughly in line with expectations, with the number of pages added during the fellowship being somewhat more than 50% of the previous period.
But where did the new pages come from?
Period | By mtk | By others | By mtk + other(s) | Imports |
Pre-fellowship | 52 | 22 | 13 | 6 |
During fellowship | 50 | 4 | 1 | 1 |
"mtk" is me. "Other(s)" is someone else. "Imports" are pages under a free license that I scooped up from some other source (e.g., found on the net, in a distro, or in BSD).
On the negative side, I wrote the vast majority of new pages that have been added so far during the fellowship. On the positive side, Paul Jackson contributed the single biggest page, cpuset(7), which became the fourth largest page in man-pages. (Also worth noting: in the man-pages-2.* releases, a total of 28 pages were deleted, mainly obsolete pages in Section 1.) In fact, I had hoped to be able to get even more pages written, but other tasks, such as testing, API review, and kernel patches have also taken up a significant fraction of my time during the fellowship. When considered as a (calendar) monthly rate, contributions of new pages by others are, unfortunately, essentially unchanged since before the fellowship.
So, progress towards improving contributions by others, at least in terms of new pages, has not been good. However, my gut feeling has been that more people are actually contributing to man-pages than before: the fact that there is a full-time maintainer means people are rather more likely to send bug reports, suggestions, and patches for existing pages. Here's a statistic that bears it out:
Period | Average contributors/week |
Pre-fellowship | 2.8 |
During fellowship | 5.9 |
This was calculated by summing the number of contributors in each of the change logs in all of the releases over the two periods and then dividing by the number of calendar weeks in each period (185 and 28 respectively). 5.9 contributors per week is still much lower than I'd like, but my feeling is that the rate has increased steadily over the time of the fellowship, so that the current rate is already higher than 5.9, and set to increase further. (Another factor that may also have helped boost the number of reports is that in December 2007 I started adding a COLOPHON to each man page describing how to report bugs, and this change would have filtered into distribution CDs a few months later.)
Timeliness of documentation
Things have defintely got better during the fellowship. Most additions and changes to the kernel-userland interface during the time of the fellowship have been documented in man-pages pretty much as they occur. (This contrasts with earlier times, where interface changes have sometimes been followed only months (or in extreme cases years) later by man page updates.) Most notably, Ulrich Drepper's new system calls in Linux 2.6.27 saw man pages go out a few days after the release of that kernel.
Testing and bug reporting
I've done a fair bit of this over the course of the fellowship. Most new system calls and system call extensions got tested by me before they hit mainline. This uncovered a few bugs which were then fixed. The biggest single piece of work here was for the utimensat(2) system call, producing a test suite (later integrated into LTP), along with patches that fixed the 5 or so bugs in the interface (details here).
Many existing glibc functions also got tested as I updated the man pages for them. Most notably, updates to the man pages produced about 35 bug reports related to error reporting by the math functions. The addition of man pages for various pthreads functions has also been accompanied by a lot of testing, and a half dozen or so bug reports.
API design review
Most new system calls and system call extensions got reviewed before going into mainline. (My record on other kernel interfaces, such as /proc files, was a more spotty though.) Among other things, this resulted in a redesign of the proposed extension of the accept() system call (originally proposed as paccept(), with a signal set argument whose necessity was dubious, later revised to accept4(), which should appear in kernel 2.6.28).
Miscellaneous
- man-pages moved from a private Subversion repository to a public git repository on kernel.org.
- In general, I'm blogging a bit more actively nowadays, and in addition to posts summarizing releases, there have been longer posts on topics such as: problems with kernel-userland interface design and implementation; the state of error reporting for glibc's math functions; and a few articles describing or summarizing Linux kernel-userland interface changes.
- After my presentation at LPC for the kernel-userland interface track, I finally got round to an idea I'd been considering for a while: creating the linux-api mailing list. The rationale for the list is that all patches that cause API/ABI changes should be CCed to the list, so that the many parties who are interested in API/ABI changes (e.g., man-pages, LSB, libc developers, kernel developers, testers such as the folk at LTP, and of course userland developers) can get an idea of what's going on. Most people still don't read Documentation/SubmitChecklist, to know they should be using this list, so I try to regularly chase people to use it (and some others also help in that regard), and by now at least some people do so without prompting.
- I've helped out LSB on a number of occasions, filing a few bug reports against the spec, but also helping out by writing man pages for functions that they wanted to specify that were currently undocumented (see, e.g., here and here). This resulted in new man pages such as getprotoent_r(3), getservent_r(3), getnetent_r(3), and pthread_getattr_np().
- I continue to respond to many bug reports in the manpages and manpages-dev components of Debian's bug tracking system. This has mutual benefits: on the one hand, although I'm not actually a member of Debian, I'm by far the most active fixer of their bug reports; on the other hand, most Debian bug reports for man pages really apply to the upstream pages (I ignore the ones that don't), and so the reports provide a valuable source of pointers to things that need fixing in man-pages. A big thank you to Debian users, who produce far more (and more useful) man-pages bug reports than all of the other distributions put together!
- Working on man-pages led me to find various deficiencies in POSIX.1 specifications, resulting in around a half dozen bug reports to the Austin group.
No comments:
Post a Comment