by Matt Frye
If you’re reading this article in Red Hat Magazine, it’s hard to imagine that you don’t know the story of RPM, the package manager that is the core of so much of Red Hat’s Linux experience. From a beginner’s first installation to the Free and Open Source Software (FOSS) developer’s latest Fedora release, RPM is inherently part of the Linux user interaction. But what happens when a core piece of software suffers from politics and agendas, cruft, and bad decisions–or no decisions at all?
RPM has endured a little of each and is now at a crossroads. This article examines some of the decisions, indecisions, and bungles that led to the current state of RPM. Where is RPM headed? What led to its current (forked) state, and what should Red Hat, Fedora, or any other major stakeholder do about it? In whose hands does RPM rest? This article is not meant to be a technical guide or primer on RPM, but presents the history of RPM in context, including some project background.
For those just getting started with Linux, RPM refers to the RPM Package Manager, formerly the Red Hat Package Manager. On the most basic level, RPM is a powerful command-line-driven package management system capable of installing, uninstalling, verifying, querying, and updating software packages. RPM is free software, released under the GPL, and is a core component of many Linux distributions. Red Hat® Enterprise Linux®, the Fedora™ Project, SUSE, openSUSE, CentOS, Mandriva, and many others use RPM. RPM is also available on other operating systems and is part of the Linux Standard Base.
Where did RPM come from? The functionality that we know as RPM came from a few different projects. RPP, which was used in Red Hat Linux prior to version 2.0, supported one-command installation and package verification and had a powerful query mechanism. The major problem with RPP was that it didn’t work from pristine sources: its packages were based on source code that had been specifically modified for RPP. This was a problem because different versions of source code had to be released for the same product. If you’re a developer managing dozens of packages, keeping track of all the different versions can complicate your project management and cause you to devote more time to package management than you need to.
Another project that eventually became part of RPM was PMS. PMS was developed at the same time as RPP and was part of the BOGUS Linux release. PMS had a fairly feeble query mechanism and no package verification. Its saving grace was that it used pristine sources. Under contract to Red Hat Software, Rik Faith and Doug Hoffman created PM from the best features of RPP and PMS. Although PM was very close to a viable package management system, it was never used in a commercially available project.
With a few attempts at package management behind them, Marc Ewing and Erik Troan developed RPM. Although it was built on the experiences of RPP, PMS, and PM, RPM was innately different. It was written in Perl for fast development and attempted to solve many of the problems of its predecessors. This led to a host of other problems, not the least of which was trying to cram a Perl implementation on a boot floppy. By version 2, RPM had been completely rewritten in C, the database format had been redesigned to provide reliability and improve performance, and rpmlib–the library of RPM routines–allowed developers to use RPM functionality in their applications.
Red Hat realized early on that decisions made on Red Hat’s behalf would affect developers up- and downstream. As such, development of RPM was guided by five design goals:
- Make it easy to get packages on and off the system
- Make it easy to verify a package was installed correctly
- Make it easy for the package builder
- Make it start with the original source code
- Make it work on different computer architectures
The building and maintenance of packages had to be kept as simple as the installation of packages. Also, to make it easy for developers to keep track of changes made to code in a package, the elemental components of a package had to be simplified.
With these goals in place, RPM proved a relatively tractable problem, and by 1998 Red Hat effectively had its package manager. However, with the major engineering problems out of the way, the remaining issue became innovating on RPM while maintaining legacy functionality. While engineers tried to work out bugs and get new releases of RPM through quality assurance (QA), innovations were saddled with an ever-deepening set of requirements. As a result, development of new RPM functionality was stifled.
“It’s hard to innovate on such a central piece of software,” said Greg DeKoenigsberg, Red Hat’s Community Development Manager. The RPM implementation, DeKoenigsberg said, is “10 years worth of cruft. It may well be that, upon closer examination, a number of RPM’s features must work in the exact way in which they’ve been coded. But there’s also complexity that we don’t think we need.”
Recognizing this issue was a big step towards getting RPM back on track. On December 14, 2006, Max Spevack (the Fedora Project leader) posted to the fedora-announce mailing list stating that Red Hat was ready to focus on RPM again. In the announcement, Spevack spoke of the technology of RPM, the relationship Red Hat and Fedora shared with the community that existed around RPM, and their renewed commitment. Perhaps most telling of the innovation struggle was Spevack’s fourth major point in the email:
“RPM, as an application, has a fairly mature feature set that we are very interested in stabilizing and bug fixing. Furthermore, we want to make sure that RPM is a stable and simplified base for the building of other technologies on top of it. Down the road, we might be interested in exploring a variety of new features, but we don’t believe that should be the initial focus of our efforts.”
So where does RPM go from here? “We need a strong technical community around RPM that believes what we believe: that stability and community are paramount,” DeKoenigsberg said. “We see RPM as being under Red Hat’s influence, rather than control, and that’s an important distinction.”
It’s clear that one side effect of the RPM experience is that Red Hat has learned things can get significantly off track when accountability isn’t clear in a FOSS project. As the company changed and FOSS became a larger part of the general computing landscape, many of the key engineers (and much of the RPM team) left to pursue other opportunities. In their absence, Red Hat grew complacent about exerting influence over RPM, and so when Fedora finally made its move to reclaim RPM, it had to use 4.4.2–the common base for Novell and Red Hat, but several releases behind the latest work.
One can argue that this effective fork of RPM is a good thing. While the general perception about forks is that they fragment the development community and are detrimental to its growth, forks also result in new projects and new competition. Consider the example of the X.org/XFree86 split. When version 4.4 of the XFree86 server was released, its licensing changed, and most major Linux distributions threw XFree86 overboard in favor of the X.org fork. Prior to this, there was an additional split led by Keith Packard following dissatisfaction with XFree86 development. Among the problems Packard cited were limited development resources, slow release schedules, lack of cooperation with other projects (notably GNOME and KDE), and opacity of the development process.
In the end, this created competition for XFree86, and XFree86 obtained the resources they needed from work with freedesktop.org. The competitive drive made both projects better. This illustrates a fundamental truth about Free and Open Source Software: when end users are included in the process of building software and making decisions, FOSS succeeds. In the course of any project, there are eventually divergent views on how things should go, what path to take, and where to stop working on one piece of code over another. Project leaders will disagree and a decision has to be made. This is fertile ground for a fork. Managed properly, and undertaken for the right reasons, a fork can improve both projects while increasing diversity and preserving cooperation and competition.
With FOSS, there are additional benefits to forking. If, after a fork, one branch is innovating more than the other–making the “right” decisions–it may attract a larger portion of the user base over time. Since the code is freely available, one branch can borrow ideas from the other and the best ideas are replicated across both projects. Ultimately, good ideas propagate throughout the community.
This also applies to RPM. Despite the awesome success of Ubuntu, there are still millions of users of RPM. And perhaps Red Hat is now ready for the best ideas from the community to flow into RPM. This willingness is what you already see in projects like Yellowdog Updater Modified (YUM) and what you will see in the near future in projects like the Community Package Manager (CPM), a proposed stripped-down rewrite of RPM.
RPM has a long way to go to build back its previous momentum, but it also has a lot going for it; among other things, a great community. For now, you can find more information at www.RPM.org, including a wiki, mailing lists, information about the #rpm IRC channel and more. There are many ways to contribute to RPM and now’s probably the best time to join in. While there may be some bad blood and scar tissue left over from the previous flame wars, it’s in every user’s best interest to foster growth in the community.