
The story of RPM

by Matt Frye

If you’re reading this article in Red Hat Magazine, it’s hard to imagine that you don’t know the story of RPM, the package manager that is the core of so much of Red Hat’s Linux experience. From a beginner’s first installation to the Free and Open Source Software (FOSS) developer’s latest Fedora release, RPM is inherently part of the Linux user interaction. But what happens when a core piece of software suffers from politics and agendas, cruft, and bad decisions–or no decisions at all?

RPM has endured a little of each and is now at a crossroads. This article examines some of the decisions, indecisions, and bungles that led to the current state of RPM. Where is RPM headed? What led to its current (forked) state, and what should Red Hat, Fedora, or any other major stakeholder do about it? In whose hands does RPM rest? This article is not meant to be a technical guide or primer on RPM, but presents the history of RPM in context, including some project background.

For those just getting started with Linux, RPM refers to the RPM Package Manager, formerly the Red Hat Package Manager. On the most basic level, RPM is a powerful command-line-driven package management system capable of installing, uninstalling, verifying, querying, and updating software packages. RPM is free software, released under the GPL, and is a core component of many Linux distributions. Red Hat® Enterprise Linux®, the Fedora™ Project, SUSE, openSUSE, CentOS, Mandriva, and many others use RPM. RPM is also available on other operating systems and is part of the Linux Standard Base.
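As a rough illustration of those five operations, the sketch below maps each one to the mode flag of the rpm command-line tool (-i, -e, -V, -q, -U). It only builds the argument lists and runs nothing. The helper name rpm_command and the package name "hello" are made up for this example.

```python
from typing import Dict, List

# Mode flags of the rpm command-line tool for the five basic operations.
RPM_MODES: Dict[str, str] = {
    "install": "-i",    # install a package file
    "uninstall": "-e",  # erase an installed package
    "verify": "-V",     # compare installed files against the database
    "query": "-q",      # ask the database about a package
    "update": "-U",     # upgrade to a newer package file
}


def rpm_command(operation: str, target: str) -> List[str]:
    """Build (but do not run) the argument list for one rpm operation.

    rpm_command is a hypothetical helper written for this illustration.
    """
    try:
        flag = RPM_MODES[operation]
    except KeyError:
        raise ValueError(f"unknown operation: {operation}") from None
    return ["rpm", flag, target]


if __name__ == "__main__":
    # "hello" is a placeholder package name, not a reference to a real build.
    for op in RPM_MODES:
        needs_file = op in ("install", "update")
        target = "hello-1.0-1.i386.rpm" if needs_file else "hello"
        print(" ".join(rpm_command(op, target)))
```

Installing and updating take a package file, while uninstalling, verifying, and querying take the name of a package that is already in the database.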

Where did RPM come from? The functionality that we know as RPM came from a few different projects. RPP, used in Red Hat Linux prior to version 2.0, supported one-command installation and package verification and had a powerful query mechanism. The major problem with RPP was that it didn’t build from pristine sources: its packages were based on source code that had been specifically modified for RPP. As a result, different versions of the source code had to be released for the same product. If you’re a developer managing dozens of packages, keeping track of all the different versions complicates your project management and forces you to devote more time to package management than you should need to.

Another project that eventually became part of RPM was PMS. PMS was developed at the same time as RPP and was part of the BOGUS Linux release. PMS had a fairly feeble query mechanism and no package verification. Its saving grace was that it used pristine sources. Under contract to Red Hat Software, Rik Faith and Doug Hoffman created PM from the best features of RPP and PMS. Although PM was very close to a viable package management system, it was never used in a commercially available project.

With a few attempts at package management behind them, Marc Ewing and Erik Troan developed RPM. Although it was built on the experiences of RPP, PMS, and PM, RPM was innately different. It was written in Perl for fast development and attempted to solve many of the problems of its predecessors. This led to a host of other problems, not the least of which was trying to cram a Perl implementation on a boot floppy. By version 2, RPM had been completely rewritten in C, the database format had been redesigned to provide reliability and improve performance, and rpmlib–the library of RPM routines–allowed developers to use RPM functionality in their applications.

Red Hat realized early on that decisions made on Red Hat’s behalf would affect developers up- and downstream. As such, development of RPM was guided by five design goals:

  • Make it easy to get packages on and off the system
  • Make it easy to verify a package was installed correctly
  • Make it easy for the package builder
  • Make it start with the original source code
  • Make it work on different computer architectures

The building and maintenance of packages had to be kept as simple as the installation of packages. Also, to make it easy for developers to keep track of changes made to code in a package, the elemental components of a package had to be simplified.
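The "start with the original source code" goal can be pictured with a small model: a package keeps the upstream archive untouched and records the packager's changes as separate patches. This is a minimal sketch of the idea, not RPM's actual data structures. The class, its fields, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class SourcePackage:
    """Hypothetical model of a package built from pristine sources."""
    name: str
    version: str
    pristine_source: str                              # upstream archive, never edited
    patches: List[str] = field(default_factory=list)  # packager changes, applied in order

    def build_inputs(self) -> List[str]:
        """Everything the build needs: the untouched source, then the patches."""
        return [self.pristine_source, *self.patches]

    def rebase(self, new_version: str, new_source: str) -> "SourcePackage":
        """Move to a new upstream release while keeping the same patch list."""
        return replace(self, version=new_version, pristine_source=new_source)


if __name__ == "__main__":
    pkg = SourcePackage("hello", "1.0", "hello-1.0.tar.gz", ["fix-paths.patch"])
    newer = pkg.rebase("1.1", "hello-1.1.tar.gz")
    print(newer.build_inputs())
```

Because the packager's changes live apart from the upstream code, picking up a new release means swapping one archive, not re-editing a modified copy of the source.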

These goals made RPM a relatively tractable problem, and by 1998 Red Hat effectively had its package manager. However, with the major engineering problems out of the way, the remaining challenge became innovating on RPM while maintaining legacy functionality. While engineers tried to work out bugs and get new releases of RPM through quality assurance (QA), innovations were saddled with an ever-growing list of requirements to satisfy. As a result, development of new RPM functionality was stifled.

“It’s hard to innovate on such a central piece of software,” said Greg DeKoenigsberg, Red Hat’s Community Development Manager. The RPM implementation, says Greg, is “10 years worth of cruft. It may well be that, upon closer examination, a number of RPM’s features must work in the exact way in which they’ve been coded. But there’s also complexity that we don’t think we need.”

Recognizing this issue was a big step towards getting RPM back on track. On December 14, 2006, Max Spevack (the Fedora Project leader) posted to the fedora-announce mailing list stating that Red Hat was ready to focus on RPM again. In the announcement, Spevack spoke of the technology of RPM, the relationship Red Hat and Fedora shared with the community that existed around RPM, and their renewed commitment. Perhaps most telling of the innovation struggle was Spevack’s fourth major point in the email:

“RPM, as an application, has a fairly mature feature set that we are very interested in stabilizing and bug fixing. Furthermore, we want to make sure that RPM is a stable and simplified base for the building of other technologies on top of it. Down the road, we might be interested in exploring a variety of new features, but we don’t believe that should be the initial focus of our efforts.”

So where does RPM go from here? “We need a strong technical community around RPM that believes what we believe: that stability and community are paramount.” Again, Greg DeKoenigsberg. “We see RPM as being under Red Hat’s influence, rather than control, and that’s an important distinction.”

It’s clear that one of the side effects of the RPM experience is that Red Hat has learned things can get significantly off track when accountability isn’t clear in a FOSS project. As the company changed and FOSS became a larger part of the general computing landscape, many of the key engineers (and much of the RPM team) left to pursue other opportunities. Red Hat grew complacent about exerting influence over RPM in their absence, and so when Fedora finally made its move to reclaim RPM, it had to use 4.4.2, the common base for Novell and Red Hat, but several releases behind the latest work.

One can argue that this effective fork of RPM is a good thing. While the general perception about forks is that they fragment the development community and are detrimental to its growth, forks also result in new projects and new competition. Consider the example of the X.org/XFree86 split. When version 4.4 of the XFree86 server was released, its licensing changed and most major Linux distributions threw XFree86 overboard in favor of the X.org fork. Prior to this, there was an additional split led by Keith Packard following dissatisfaction with XFree86 development. Among the problems Packard cited were limited development resources, slow release schedules, lack of cooperation with other projects (notably GNOME and KDE), and opacity of the development process.

In the end, this created competition for XFree86, and XFree86 obtained the resources they needed from work with freedesktop.org. The competitive drive made both projects better. This illustrates a fundamental truth about Free and Open Source Software: when end users are included in the process of building software and making decisions, FOSS succeeds. In the course of any project, there are eventually divergent views on how things should go, what path to take, and where to stop working on one piece of code over another. Project leaders will disagree and a decision has to be made. This is fertile ground for a fork. Managed properly, and undertaken for the right reasons, a fork can improve both projects while increasing diversity and preserving cooperation and competition.

With FOSS, there are additional benefits to forking. If, after a fork, one branch is innovating more than the other–making the “right” decisions–it may attract a larger portion of the user base over time. Since the code is freely available, one branch can borrow ideas from the other and the best ideas are replicated across both projects. Ultimately, good ideas propagate throughout the community.

This also applies to RPM. Despite the awesome success of Ubuntu, there are still millions of users of RPM. And, perhaps now Red Hat is ready for the best ideas from the community to flow into RPM. This willingness is what you already see in projects like Yellowdog Updater Modified (YUM) and what you will see in the near future in projects like the Community Package Manager (CPM), a proposed strip down and rewrite of RPM.

RPM has a long way to go to build back its previous momentum, but it also has a lot going for it; among other things, a great community. For now, you can find more information at www.RPM.org, including a wiki, mailing lists, information about the #rpm IRC channel and more. There are many ways to contribute to RPM and now’s probably the best time to join in. While there may be some bad blood and scar tissue left over from the previous flame wars, it’s in every user’s best interest to foster growth in the community.

25 responses to “The story of RPM”

  1. bvbvc says:

    g

  2. Robert says:

    A nice read, I didn’t know the early history of RPM. A follow up down the track on how things are progressing would be great.

  3. winsnomore says:

    My god .. Is this history or “random thoughts”. I guess that’s why RPM

    I guess the author knows as much about writing history as much RPM knows about circular-dependencies/dead-ends/spurious stuff.

  4. Mark says:

    Didn’t Caldera have the first rpm software? Didn’t RedHat take rpm from a project at Caldera?

  5. AJ says:

    I like corn

  6. Tim says:

    rpm fsfs backend as an alternative or replacement for db backend

    rpm uses db for its database backend, just as svn originally only used db.
    Now svn can be configured to use the file system, fsfs backend.
    The biggest problem with using db is when it is interrupted, it can crash and require recovery

    dpkg has been using the file system backend for quite some time and does not crash or require recovery if interrupted.

    The biggest advantage of the file system backend is that the package database is more accessible to both users and other software.

  7. Fabio says:

    Why a package manager can not be divided as gcc in three parts, the first one which can manage packages of several types (rpm, deb, tgz, etc) a second central part common to install the software and a third to connect to a already done database (mysql, postgresql, etc.) to manage the file positions, package installed and so on ? Why in a linux distribution i can not install rpm, deb, tgz all the same?

  8. Sandeep K. says:

    “And, perhaps now Red Hat is ready for the best ideas from the community to flow into RPM. This willingness is what you already see in projects like Yellowdog Updater Modified (YUM) and what you will see in the near future in projects like the Community Package Manager (CPM), a proposed strip down and rewrite of RPM”

    What prevents the likes of Red Hat, Novel-OpenSuse, Mandriva etc. etc. from switching over to *.deb format- one of the best ideas in linux community. Perhaps it might be an ego problem, perhaps it might take the identity out package management, but customers are not buying Red Hat Enterprise linux for rpm but for the significant advantages it offers over non-professional distributions and its established identity as reliable software vendor.

  9. ozonehole says:

    I agree with Sandeep K. above. I just don’t understand why RedHat sticks with RPM. People use RedHat not for RPM or because of YUM, but for other things like the installer, support, nicely-done interface, etc. I would be using RedHat if they would adapt APT, but sadly, the big egos don’t seem to be able to make the jump.

  10. Jef says:

    “What prevents the likes of Red Hat, Novel-OpenSuse, Mandriva etc. etc. from switching over to *.deb format- one of the best ideas in linux community”

    Why should they? You lose all possibility of upgrading from a previous version sanely. .deb format does not support even multilib. Its only recently that even apt added gpg signatures. Wake me up when you have a proper list of arguments as to why a switch would be advantageous.

  11. Andrew Schott says:

    In reference to switching to DEBs, I use Red Hat because RPM is far simpler to work with and to develop RPMs for. DEBs are a PITA and take me forever in comparison to the creation of RPMs. The documentation is another thing: RPMs have full HUMAN READABLE documentation that DEB just lacks. Following only the documentation, it is impossible to create a DEB. Outside resources are needed. In my case it’s a friend in IRC ;D

    Anyway, decent article. I have been with the RPMs for about 10 years now, and its always interesting to see what history really was like, instead of that of my foggy memory.

  12. x says:

    Note that original rpm is still developed at http://wraptastic.org/ (version 4.4.7)

    Recently RH and some other folks took quite old 4.4.2 version (but commonly used) and created a fork at http://www.rpm.org.

  13. y says:

    and i thought that wraptastic was marc ewing continuing his work on rpm after leaving redhat.
    and since rpm was at that time still “redhat package manager” and not “rpm package manager” i guess that would make it a fork of the original redhat package manager. no?

  14. me says:

    ignorance abounds

    You can do nothing more to demonstrate your lack of understanding about linux and package management than compare rpm to apt. They are not the same type of program!! One (rpm) handles install, building and verification. The other (apt) handles dependency resolution and package repositories. Please compare apples to apples and oranges to oranges.

    APT is the same class of program as YUM. I will agree that apt is better at what it does than yum. It is easier to manage, and much faster at what it does.

    RPM is the same class of program as DPKG. RPM is way better than dpkg. dpkg is a PITA to use (as pointed out above). This is mostly due to the _STUPID_ restrictions and rules that the debian folks have written into it. RPMs have one ‘control’ file, the spec file. dpkg has upto 10 or 12. Every time you update your package you can easily have to touch 6 files with dpkg. With rpm its one.

    please, never compare rpm to apt. If you do, you are doing nothing more than exposing your total lack of knowledge on the subject

  15. Jason says:

    For all the talk of community, a lot of folks have missed the fact that a) work on RPM was already being done outside of Red Hat, and b) the rpm.org revamp was really a hostile takeover of sorts. The following is taken from http://www.oldrpm.org/ (what rpm.org used to be):

    “Note: On December 14, 2006, Red Hat decided to take complete control of editorial content at the formerly community maintained website which content was maintainted in an ‘open to the community’ process manner. This was done without advance communication to the long time maintainer of the RPM website.”

  16. T L says:

    Yes, i prefer RPM to deb for a lot of reasons, easier to create, the RPM-philosophy (Always non-interactive) as opposed to preseeding .deb-packages that are poorly made (eg. not able to be preseeded or changing preseed-questions from time to time)

  17. Steff Davies says:

    While “me” above has a point, I feel he/she/it hasn’t quite grasped the wider situation. The reason apt and RPM are so often compared is that due to the lack of an apt-level tool of apt’s quality on RPM-based distributions, the average RPM-distro user spends a hell of a lot more time playing with RPM than a similar apt-distro user would spend on dpkg. It may well be true that RPM is in some ways a better format than deb, but it’s irrelevant as long as the overall toolset for debs remains better.

    Besides which, we all know that the ports system is vastly superior to either. (Anyone who rises to this last should be ridiculed mercilessly ;-)

  18. Dan McDonald says:

    >The reason apt and RPM are so often compared is that due to the lack of an apt-level tool >of apt’s quality on RPM-based distributions, the average RPM-distro user spends a hell of a >lot more time playing with RPM than a similar apt-distro user would spend on dpkg.

    Except for Mandriva users, who have the very efficient urpmi to manage package repositories and dependencies for them. And if you haven’t tried urpmi in a few years, yes it is much better than it used to be. The latest version now has urpme, which will uninstall all packages dependent upon a package you wish to uninstall.

  19. Veronica Wright says:

    This is a perfect example of what’s wrong with Linux. Not only do we have different distributions, but within a distro we have different ways of doing the same thing, e.g. rpm and yum, and even worse should it be written in perl, C, bash, or perhaps Fortran. Didn’t Germany lose World War II because it decided to fight on two fronts? . Here we trot out all the same tired old excuses about backward compatibility and the need for competition. My dream: One Linux distro, one way of installing, one desktop — all maintained by the best programmers concentrating on their chosen area. Oh yes, one web site to go to for answers to problems by people that really know. Only then will MS shake in its boots. I’d better get the Kevlar on. This is going to hurt.

  20. Magnus says:

    Veronica tries and fails to evoke Godwin’s Law. She shoots! She misses!

  21. Kevin Otte says:

    Ahh, WWII Germany… Yeah, Hitler had a dream: one master race. That didn’t work out so well, did it. How about a little open mindedness here people?

    And besides, let’s look at this pragmatically. Microsoft mandated One Way of Computing, and look where that’s gotten us. Choice is good! The distros will evolve on their own and the best will survive. Unless of course you’re one of those Intelligent Design folk, in which case I have no frame of reference to continue an argument with you.

    OK, I’ve managed to invoke politics and religion. Anyone got any sex to throw into this thread?

  22. Anonymous says:

    May I remind her that there are many companies that are willing to help hold your hand to figure out which distro is best for you to use and implement in your environment.

  23. abuog says:

    i think the rpm concept should be allowed to flow smoothly into the Community Package Manager (CPM) because it makes life better for any Linux system administrator: with a few commands like yum install … or rpm -ivh … your software is installed or updated instead of manually modifying your free open source (source code), and not allowing this transition will be a big blow to the fundamental principles of Richard Stallman’s free software (GNU)

  24. Anonymous says:

    Conary package management should also be mentioned as a next generation packaging methods.

  25. asennadas » Why isn’t Debian’s package system standard? says:

    [...] You can read more about this in The story of RPM, dpkg, rpm, APT, and from there the links that follow [...]