
How to write really good documentation: Four rules and an axiom

by Brian Forte

The question came in via IRC.

"non built-in modules" or "non-built-in" or "non-builtin" or...?

I didn’t like any of these constructions, and neither did my colleague, David O’Brien.

What David was asking about was a section of the forthcoming Fortitude documentation for Red Hat Enterprise Linux 5. He was re-writing the following paragraphs:

When Apache Web server is started, any modules that are not built-in are loaded, unloaded, and then reloaded.

These modules are loaded the first time to verify that the configuration is correct. They are then unloaded and reloaded when the server is actually ready to receive connections. After the first module load, Apache Web server closes access to the terminal. This prevents the system from prompting for the NSS token passwords (it would also be annoying to have to authenticate twice). Because the module is loaded and unloaded, the NSS certificate database also needs to be loaded and unloaded, causing any PINs entered during the first load to be lost and causing the server to be unstartable.

The solution is the PassPhraseHelper. This is a stand-alone program that also opens the NSS certificate database and stores a copy of the encrypted token password entered during the first load of the NSS module. When mod_nss needs to open the certificate database during subsequent reloads, it queries the PassPhraseHelper for the token password.

There’s not a lot to like about these paragraphs. Start with the ugly neologism: unstartable.

(This is Rule 0 for technical writers, by the way. The rule you have in place before you contemplate the existence of other rules: don’t make up your own words.)

Follow that with a plethora of passive voice: [t]hese modules are loaded; [t]hey are then unloaded; causing any PINs entered.

And finish with no information about the context for the described behavior. Apache may well do all that it’s said to above, but a spoonful of ‘why’ makes the ‘what’ go down a lot easier.

The initial question over IRC was part of an effort to include a bit of context in the re-write.

Text around the paragraphs quoted above made it easy to infer the behavior described was consequent to mod_nss not being built-in to Apache. Hence the question that started all this.

Feedback from Rob Crittenden, however, made it clear that David’s (and my) comprehension problem was both more basic and more far-reaching.

(Quick aside: Rob Crittenden is ‘the Web Server engineering “team” from Netscape working on Netscape Enterprise Server (NES)’ at Red Hat. NES isn’t Apache, but it is a web server, and Rob knows a lot more about all web servers than David or me.)

While considering our word choice options, David sent Rob a few questions, including the following:

The following section of the Fortitude docs has me confused:

       "After the first module load, Apache Web server closes
       access to the terminal."

After the first module is loaded, or after the first cycle of loading any non built-in modules? What are these modules?

And Rob replied:

All loadable modules. Apache first loads all of the modules only to get the list of directives it supports. During this load all of the module initialization and shutdown routines are executed.

What Apache does, roughly is this:

 1. Initialize itself a bit (open ports to listen to, etc).
 2. Load all modules that are specified by the
    LoadModule directive
 3. As each module is loaded, run its initialization
    routine
 4. Identify the list of configuration directives
    supported by the module
 5. Unload the module, running its shutdown directive
 6. Close stdin, stdout, stderr
 7. Load all modules that are specified by the
    LoadModule directive
 8. As each module is loaded, run its initialization
    routine
 9. Wait for requests

Which made it clear the section needed a re-think, not just a re-write. The behavior being described in the original documentation is germane to mod_nss but not specific to it.

Apache’s own documentation makes this even clearer:

If the Listen specified in the configuration file is default of 80 (or any other port below 1024), then it is necessary to have root privileges in order to start apache, so that it can bind to this privileged port. Once the server has started and performed a few preliminary activities such as opening its log files, it will launch several child processes which do the work of listening for and answering requests from clients. The main httpd process continues to run as the root user, but the child processes run as a less privileged user.

More notes from Rob Crittenden showed other problems with the original documentation:

In order to use an NSS token you have to [authenticate] to it (assuming it has a password associated with it). During the Apache module first load phase the tty is accessible so if we need to prompt for a token PIN we can.

After this initial loadable module load and unload stdin, stdout and stderr are closed. So now Apache is unable to communicate with the user via the tty/command-line.

The problem is that during the module shutdown we have to shut down NSS. We can't leave it resident because there is no guarantee that we could find its old pointer during the next load. So what we have to do is cache the PIN that was obtained during the first module load. That is the PassPhraseHelper solution.
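The sequence Rob describes can be sketched in a few lines of Python. This is an illustrative model only, not Apache or mod_nss code: the names PinCache, NoHelper, NssLikeModule and start_server are all hypothetical, and the real PassPhraseHelper is a separate program that stores an encrypted copy of the token password rather than an in-memory dictionary. The sketch assumes a single module and a prompt function standing in for the terminal.

```python
from typing import Callable, Dict, List, Optional


class PinCache:
    """Hypothetical stand-in for the PassPhraseHelper: it outlives module unloads."""

    def __init__(self) -> None:
        self._pins: Dict[str, str] = {}

    def store(self, token: str, pin: str) -> None:
        self._pins[token] = pin

    def lookup(self, token: str) -> Optional[str]:
        return self._pins.get(token)


class NoHelper(PinCache):
    """Models the failure case: nothing survives the unload."""

    def store(self, token: str, pin: str) -> None:
        pass  # deliberately forget the PIN


class NssLikeModule:
    """Hypothetical module that needs a token PIN each time it is initialized."""

    def __init__(self, token: str, helper: PinCache) -> None:
        self.token = token
        self.helper = helper
        self.pin: Optional[str] = None

    def init(self, prompt: Optional[Callable[[str], str]]) -> None:
        cached = self.helper.lookup(self.token)
        if cached is not None:
            self.pin = cached              # reload: ask the helper
        elif prompt is not None:
            self.pin = prompt(self.token)  # first load: terminal is available
            self.helper.store(self.token, self.pin)
        else:
            raise RuntimeError("no terminal and no cached PIN")

    def shutdown(self) -> None:
        self.pin = None  # module state is lost when the module is unloaded


def start_server(module: NssLikeModule, prompt: Callable[[str], str]) -> List[str]:
    """Walk through the load, unload, close-terminal, reload sequence."""
    log: List[str] = []
    module.init(prompt)
    log.append("first load")
    module.shutdown()
    log.append("unload")
    log.append("terminal closed")  # stdin, stdout, stderr are gone from here on
    module.init(None)              # no prompt is possible any more
    log.append("second load")
    log.append("wait for requests")
    return log


if __name__ == "__main__":
    module = NssLikeModule("internal", PinCache())
    for step in start_server(module, lambda token: "1234"):
        print(step)
```

Swapping PinCache for NoHelper makes the second load raise an error, which is the situation the original documentation tried to describe with the word unstartable.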

With this new information in hand, David and I swapped a few re-writes back-and-forth until David settled on the current version, which is as follows:

When Apache is started, it performs some preliminary housekeeping tasks, detaches itself from the terminal and then creates several child processes.

During this preliminary housekeeping, all of the modules specified by the LoadModule directive in httpd.conf are loaded. Each module is initialized as it is loaded, and any of the module's configuration directives are identified.

At this point, Apache detaches itself from the terminal. That is, stdin, stdout, and stderr are all closed. This prevents the system from prompting for the token passwords. Because the NSS module is loaded and unloaded, the certificate database also needs to be loaded and unloaded. This means that any previously entered PINs are lost, which leaves Apache in a non-working state.

The solution to losing these PINs is the PassPhraseHelper. This is a stand-alone program that also opens the certificate database and stores a copy of the encrypted token password that was entered during the first load of the module. When mod_nss needs to open the certificate database during subsequent reloads, it queries the PassPhraseHelper for the token password.

I’m hardly an unbiased judge. Nonetheless, I’ll argue for these four paragraphs being more useful to a new Fortitude user than the original three above.

Moreover, these four paragraphs aren’t the only consequence of that quick IRC interruption.

Other versions of the re-write may yet end up in our general documentation of Apache for Red Hat Enterprise Linux users. And a question has been raised about Apache’s startup process that doesn’t appear to be documented anywhere: Is the load-unload-reload process described by Rob Crittenden equivalent to the ‘launch as root; perform a few preliminary activities; launch child processes as less-privileged users’ process described in Apache’s official documentation?

As I type this, I don’t know. But I think it’s worth finding out and documenting, one way or another.

As well, and beyond the specific improvements produced above, there are some more general conclusions (or rules, if you will) to draw from this.

1. Poor grammar and bad writing are often a sign of poor comprehension.

You’ll occasionally hear people say, “I know what I want to say, I just can’t say it.” They may not like to be told this, but what they really mean is, “I don’t know what I want to say.”

Similarly with a user manual: if a reader doesn’t understand a section, don’t assume the reader’s comprehension skills are to blame. It’s at least as likely the writer didn’t comprehend the topic well enough to explain it clearly.

Which leads us to rule:

2. Good documentation takes time.

The re-write above took about 1.5 work days, including research, between David O’Brien and myself. And it isn’t the end of the work that emerged from the initial question. The question arising from the re-write is still open (and, no, I haven’t raised a Bugzilla bug as yet).

Internally, Red Hat has a metric of one A4 page of finished documentation per working day. Like the ‘one-page-of-a-screenplay equals one-minute-of-screen-time’ rule, this is a rule-of-thumb that probably doesn’t hold up too well under scrutiny.

Even as a rule-of-thumb, however, it doesn’t bode well for anyone expecting high-quality, thousand-page user manuals to emerge in the fortnight between final beta testing and burning the Golden Master disc.

So, is the solution to both rules 1 and 2 to only use experts for your documentation?

No, it’s not. In fact, today’s tale is a clear example of rule:

3. Deep expertise is not automatically a prerequisite for good documentation.

I’m no expert on Apache. David O’Brien knows a lot more about Apache than I, but doesn’t consider himself an expert either. And it was our shared ignorance that led to the questions we asked, which led to both of us having a better understanding of the application, which led to the improved documentation.

Socrates made an (incredibly annoying when it happens to you) virtue out of asking apparently dumb questions, but the underlying purpose of Socratic irony remains: to force people to re-consider their assumptions.

Socrates, of course, was using dialectic inquiry to come to a deeper understanding of the great questions of meaning and existence. But his methods are no less useful for such mundane goals as more clearly documenting Apache and Fortitude.

That said, Socratic irony doesn’t work as well if the questioner is genuinely ignorant.

So, if deep expertise isn’t a prerequisite, how much expertise is? My personal rule-of-thumb: given any topic, you need to know enough to conduct worthwhile interviews with experts in that field before you can write about it effectively.

It’s a good rule of thumb for general journalism and no less useful for technical writers, who won’t be interviewing in the journalistic sense but will be asking questions of experts to get the documentation right.

Which leads us straight to rule:

4. Don’t let working cultures that put too great a premium on knowing everything dominate.

(Yes, I’m talking to all you self-satisfied Linux and Unix folk.)

Eric Raymond’s How to ask Questions the Smart Way has many virtues, but it’s only a small smart-arse attitude change away from an environment in which cleverness and being ‘in the know’ are cudgels to beat people down rather than tools for helping them up.

Five rules. (Or, as I prefer to think of them: four rules and an axiom that should be tattooed on the back of every technical writer’s hands so they can’t accidentally forget it.) Are they enough to guarantee good documentation?

Hardly. Good docs come from good writers.

But keeping to these four rules—and never forgetting the axiom—will definitely improve your documentation. If nothing else, recognizing and observing these rules will raise the status of documentation and the people producing it. And they’ll use that raised status in at least two ways.

First, they’ll engage in political posturing with other parts of your company, because that’s what all humans always do, no matter who or where they are.

Second, and more usefully, they’ll put more energy, effort, and enthusiasm into their work. Which will automatically improve every user manual you produce.

9 responses to “How to write really good documentation: Four rules and an axiom”

  1. Brian Forte says:

    The Computer Book Publishing List, a mailing list run by StudioB, had a thread on the question of technical writing vs book authoring in late-January 2007.

    The thread was started by a question from Stuart Mudie: What makes book authors different from tech writers, in your view?

    Dee-Ann Le Blanc defined the difference thus:

    Most tech writers are not Subject Matter Experts. They
    interview the SMEs. Book authors have to be the SMEs more often
    than not.

    Judyth Mermelstein got almost adversarial, suggesting

    In the computer world, technical writers work for the Company
    that pays them. They turn information the company provides into
    documentation that makes the product look good to the customer
    so it will sell well. That can mean using impenetrable jargon
    the boss is proud of inventing or omitting to mention known
    bugs and whatnot.

    Ideally, technical authors work for their Readers. They
    thoroughly examine both the product and its documentation. Then
    they try to provide all the information the reader really
    needs, in a better (more accurate, more readable, more
    logically organized or whatever) [form] than the documentation
    the manufacturer’s technical writer(s) produced.

    In the first case, you’re working for the industry; in the
    second, for the public good and your own satisfaction.

    Margy Levine Young takes a similar, if less polemical, view to Mermelstein:

    Computer book authors write from the point of view of the user:
    what do you want to do, and how do you do it. Manuals usually
    write from the point of view of the program: here’s what it can
    do, here are the menus, here are the options. It’s a big
    difference.

    Laura Lemay comes in on the tech writer’s side, noting

    There has been a push for “user-centered design” of
    documentation for at least 25 years. Anyone in corporate
    technical writing who is still writing program-centered docs is
    not paying attention to industry practice.

    (It’s worth noting some of Lemay’s career history here. According to her FAQ she ‘gave up writing computer books’ and is now a ‘contract technical writer in Silicon Valley [who writes] programmer documentation, mainly.’)

    Back to her comments in defence of tech writers, she closes her post to the CBP list with the almost standard refrain regarding companies and the low priority documentation has:

    a lot of documentation in companies is still written by whoever
    the company can scrape up at the last minute to write it, eg,
    the lowest status engineer, QA, an intern, the receptionist…

    In a second, longer post, Lemay expands on Mermelstein’s point, without getting as adversarial:

    [I]n documentation you represent the company and the company’s
    point of view. Documentation in some ways is a sales tool of
    the product. You’re expected to ignore or gloss over possible
    problems in the product you’re writing about, or advantages of
    the competition. In computer books you can be more honest, more
    broad, and more critical.

    In documentation, you have to document all parts of the product
    evenly, even the stupid features that no one uses (every
    product has some). In computer books, you can focus on the real
    features that people actually use most often.

    She also re-states her point regarding the importance, or lack thereof, accorded documentation by many companies as well as explaining why computer books aren’t seen in the same light:

    Many corporations consider documentation a necessarily evil but
    don’t give it much attention or respect. As a tech writer you
    may be brought onto a product too late to do a good job and
    then get no access to either the product or to anyone who will
    explain how the product works because they’re too busy to waste
    their time with you. This goes a long way toward explaining why
    a lot of documentation sucks so badly. As a computer book
    author the book is the product and is thus the center of
    attention.

    All well and good, but what has this to do with the four rules and an axiom above?

    Quite a lot. The CBP discussion thread provides an alternative way of expressing the underlying goal of the rules: to the extent it is practical to make documentation more like a computer book, make it so.

    There are constraints, of course. Lemay’s point about documenting ‘all parts of the product’ occasionally has legal force behind it.

    And the authorial voice several contributors are fond of isn’t especially welcome in formal documentation.

    That said, this doesn’t mean there should be no voice in good documentation. Like news reports, documentation is the better for being mostly short, declarative sentences. Using nouns and verbs in preference to adjectives and adverbs makes for clearer instructional prose. And avoiding the passive voice makes for text that is easier to comprehend.

    In short, the so-called ‘transparent voice’ of the good news reporter, praised because they don’t seem to be there at all, is the technical writer’s preferred voice.

    But, voice aside, making documentation more like a computer book is still a useful way of working towards the same goal as the four rules, especially in light of Lemay’s last point.

    How much better would the average user manual be if it was considered a product in its own right?

  2. Hoyt Duff, former co-author, Fedora Core Unleashed says:

    Rule 0 might help the truthiness of the documentation, but it may stifle
    natural changes in language usage. Rule Zero would best be restated
    as “Avoid jargon — or at least explain it.”
    As to the voice of the documentation, that is ultimately decided by the policy
    of the publisher or business. As such, the voice can often be subdued or
    even terse and dry, but good writing always reads well.

    Finally, I suggest:

    5. Explain “why” whenever possible; learning is incomplete without that
    understanding.

  3. Don in Brooklyn says:

    Speaking of jargon – numbering your five rules 0)-4) is a problem! Especially when it is followed by reference to the “five rules” when you have just finished with Rule 4.

  4. Tigger23505 says:

    Don,
    While I have to agree with you that having a rule 0 smacks of jargonism. What do you do when you have a rule that absolutely must head the list. The common solution is to call it rule 0. The other thing to keep in mind particularly in technical writing, is that many in the target audience are used to things like numbering the bits in a 32 bit word 0-31. As a result going with rule 0 is not going to be jarring to most readers, besides if you take all the jargon out anybody can understand what we’re saying.

    Anthony

  5. Brian Forte says:

    Don in Brooklyn wrote:

    Speaking of jargon – numbering your five rules 0)-4) is a
    problem! Especially when it is followed by reference to the
    “five rules” when you have just finished with Rule 4.

    and Tigger23505 wrote:

    having a rule 0 smacks of jargonism. What do you do when you
    have a rule that absolutely must head the list. The common
    solution is to call it rule 0. The other thing to keep in mind
    particularly in technical writing, is that many in the target
    audience are used to things like numbering the bits in a 32 bit
    word 0-31.

    FWIW, although I’m aware of the programmer’s habit of ordering from 0 rather than 1, I wasn’t primarily thinking of that when I decided to call my axiom ‘rule zero’.

    I was mostly thinking of the Zeroth law of thermodynamics and, to a lesser extent, Isaac Asimov’s Zeroth law of Robotics.

    In both cases, the zeroth law serves as a foundation stone or fundamental beginning point from which further laws can be derived (in the case of thermodynamics) or newly understood (in the case of Asimov’s Robotics Laws).

    High falutin’ company I’m aiming to put my laws in, I’ll admit. Never let it be said I lacked ambition, however. Capability? Well that’s a whole ‘nother question.

  6. Brian Forte says:

    Inclined as I am to put lots of links in my web-writing, the (unfortunately necessary) policy here of not allowing links in comments means my two comments above don’t appear quite as I intended.

    For the sake of completeness, I’ve re-posted both comments — complete with anchor tags and title attributes — to my own blog at http://nonstandarddeviation.com.

    Copy; paste; see my tendency to link to all and sundry in all its glory.

  7. Four rules for good documentation by Communications from DMN says:

    [...] But you can avoid bad documentation. How? By following the following four rules from an article in Red Hat Magazine: [...]

  8. Tom Johnson says:

    I really enjoyed the post and comment thread here. I think a good, aggressive technical writer can overcome the cultural influences that often prevent user guides from being truly useful. You have to be persistent with people and refuse to accept it when they won’t meet with you, won’t review the documentation, or won’t include a visible help link in the application. Eventually you can overturn these cultural influences and deliver good documentation.

    I’d add one more rule here — provide numbered steps that are accurate. I don’t know how many online tutorials and other documentation I read lacks simple numbered steps for completing a task. It’s tech writing 101.

  9. Mark Crocker says:

    The funny thing is this is a form of documentation…
    and I found it really hard to read and was really demotivated to read it due to the confusing layout…

    I’m sure the points are good but I almost didn’t get to them