
How to write really good documentation: Donald Knuth was wrong

by Brian Forte

Donald Knuth was wrong.

While it’s true we still need ‘better documentation of programs’ it isn’t true, or it is no longer true, ‘that we can best achieve this by considering programs to be works of literature.’

Huh?

James L Quirk chimed in with a comment on my last article regarding semi-definite rules for the indefinite article. I’ve responded to several of his points in a comment of my own but I did not address his core point, which was the following:

The notion that software is written in one corner and documented in another, was challenged by Knuth with his introduction of literate-programming.

And there’s no gainsaying this point. Donald Knuth’s Literate Programming is an uncompromising and considered attack on the idea that code should be written by one person and documented by another.

And it’s got a lot going for it as an idea: the programmer as essayist, striving

for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other.

Unfortunately, there’s an unspoken assumption behind Literate Programming which undermines almost all the value to be had from Knuth’s insight. And that assumption is that everyone who reads the documentation accompanying a program is a programmer.

An assumption that is, now, almost entirely wrong. Whatever might have been the case in the past, today the safest thing to assume when writing documentation is that none of your readers are programmers. (Documentation of programmer-specific tools such as IDEs and programming languages is clearly an exception to this.)

So, if they aren’t programmers, who are your readers?

It’s not a very good answer, but the standard answer is: they’re users.

Now, drug-dealers think of the consumers of their product as ‘users’. Cynical marketing managers think of the whole population as ‘users’.

In each case the supplier sees themself as above and better than the people being supplied. And they see no value in these people except as consumers of a good that they will supply, again and again, for their own profit.

To call people who fire up Firefox and OpenOffice and even the bash shell ‘users’ is insulting and demeaning. Of course it’s insulting and demeaning when drug dealers and cynical marketing managers call their targets ‘users’ as well, but they tend to be openly contemptuous of their customers — current and potential — to begin with. Unfortunately, it’s still safe to assume many programmers have a similar attitude to their targets.

Which doesn’t make it any more accurate. People don’t use software the way people use cocaine.

For a start, software doesn’t run out (Microsoft’s and Google’s dreams of a never-ending rental stream for software as a ‘service’ notwithstanding).

Second, people don’t use software for its own sake; they use software to make and do other things.

I don’t fire up a text editor to admire its menus or command structure. I fire it up to write articles attacking the thinking of one of the greatest minds computing has ever produced. I don’t fire up a web-browser to admire the chrome and read the built-in bookmarks (all of which I’ve deleted anyway). I fire it up to gain access to documents and files and people on the Web.

And games are played for the fun and excitement they engender, not for the sake of their 3D-graphics and physics engines (which helps explain Nintendo’s success with their supposedly underpowered Wii).

Software is, then, a tool. And the overwhelming majority of people are tool-users: making, implementing and doing things with these tools.

Hardly a great insight.

Except that, in more mature human endeavours, the toolmaker and the tool-user aren’t adversaries. Luthiers don’t look down on guitarists for ‘merely’ playing the instruments a luthier makes. Aviation engineers don’t look down on pilots for ‘merely’ flying the planes they design and build. Just as important, while guitarists may kvetch about or heap praise upon the work of a luthier, they don’t look down on them for being toolmakers.

In the continuing absence of maturity in the software world, it’s the documentation that has to treat the tool-user with respect. Which is a further argument against Knuth’s Literate Programming. Since it’s all too common to see software toolmakers give tool-users short shrift, it’s a useful caution to keep software ‘written in one corner and documented in another’.

Even if you’ve got access to grown-up programmers who recognise there isn’t some magical status associated with being a toolmaker, it’s probably a good idea to have documentation written by tool-users not toolmakers. After all, we don’t assume a luthier is the best person to write a how-to manual for guitarists.

A person reading a user-manual or how-to is trying to do something useful with a complex tool. They aren’t trying to understand how and why the tool was built the way it was. Write your documentation with this foremost in your mind. Documentation is not about comprehending the tool’s design and structure, it’s about understanding the tool’s implementation and use.

Long listings of a program’s capabilities serve a base purpose, but they aren’t enough. The admonition ‘read the man page’, and less polite responses to tool-user questions, miss the key point: all too often documentation describes what a program does without contextualising this information in any useful way.

A listing of a program’s options is like an index in a book: not much use if you don’t know what terms to look for in the first place.

Documentation should include real-world examples of what the program can do. And explanations of how a tool-user can use the program to do useful things.

For example, the man page for fstab starts as follows:

The file fstab contains descriptive information about the various file systems. It is the duty of the system administrator to properly create and maintain this file. fstab can be modified by special utils (e.g. fstab-sync(8)). Each filesystem is described on a separate line; fields on each line are separated by tabs or spaces. Lines starting with ’#’ are comments. The order of records in fstab is important because fsck(8), mount(8), and umount(8) sequentially iterate through fstab doing their thing.

By contrast the Red Hat Enterprise Linux 5.0 Deployment Guide has a chapter on Implementing Disk Quotas which begins thus:

Disk space can be restricted by implementing disk quotas which alert a system administrator before a user consumes too much disk space or a partition becomes full.

Disk quotas can be configured for individual users as well as user groups. This makes it possible to manage the space allocated for user-specific files (such as email) separately from the space allocated to the projects a user works on (assuming the projects are given their own groups).

In addition, quotas can be set not just to control the number of disk blocks consumed but to control the number of inodes (data structures that contain information about files in UNIX file systems). Because inodes are used to contain file-related information, this allows control over the number of files that can be created.

The man page starts by describing what the fstab file is and how it is structured.
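To make that structure concrete, here is a minimal Python sketch of the format the man page describes: one filesystem per line, fields separated by tabs or spaces, and ‘#’ lines treated as comments. The sample entries, the function name parse_fstab and the field labels are all invented for illustration; they are assumptions, not anything taken from the man page or the Deployment Guide. Note that the second sample entry carries the usrquota and grpquota mount options, which is roughly where an administrator setting up quotas would end up.

```python
# A sketch only: the sample entries and field labels below are assumptions
# made for illustration, not copied from any real system.

SAMPLE_FSTAB = """\
# device        mount point  type  options                     dump  pass
/dev/sda1       /            ext3  defaults                    1     1
/dev/sda2       /home        ext3  defaults,usrquota,grpquota  1     2
"""

# Hypothetical, human-friendly labels for the six whitespace-separated fields.
FIELD_NAMES = ("device", "mount_point", "fs_type", "options", "dump", "fsck_pass")


def parse_fstab(text):
    """Return one dict per filesystem entry, preserving file order."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments describe no filesystem
        fields = line.split()  # tabs and spaces both separate fields
        entry = dict(zip(FIELD_NAMES, fields))
        if "options" in entry:
            # Mount options are a comma-separated list within one field.
            entry["options"] = entry["options"].split(",")
        entries.append(entry)
    return entries


if __name__ == "__main__":
    for entry in parse_fstab(SAMPLE_FSTAB):
        print(entry["mount_point"], entry.get("options", []))
```

Which rather proves the point: the sketch tells you how the file is laid out, and nothing about why you would want to change it.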

The Deployment Guide begins by talking about what a tool-user — in this case a system administrator — can do with the software and why they might want to do these things in the first place.

Just as important, the Deployment Guide doesn’t talk about other uses of the tool. The fstab file can also be used to manage the mounting of remote filesystems, for example. Which isn’t relevant to an administrator trying to implement disk quotas. The Deployment Guide does discuss fstab’s utility with regard to remote filesystems, but it does so in Chapter 18, which is dedicated to remote filesystems.

Write about the software from the outside in, not the inside out. Don’t answer the question ‘What can this do?’ Answer the questions ‘What can I do with this? And why would I want to?’

9 responses to “How to write really good documentation: Donald Knuth was wrong”

  1. Anonymous says:

    You completely misunderstand the purpose of literate programming. It’s an approach to making program source code readable (to people who want to understand the code — other developers, or later maintainers). It’s not for manuals, end-user documentation, etc.

  2. Stephen Smoogen says:

    There are many audiences that have to be written for. A programmer’s main audience is herself and those she works with when they look at the code shortly after it is written. Her secondary audiences are those who will continue the work much later when the code is in maintenance, and she is somewhere far elsewhere. The third audience that the programmer has to deal with are those that will use the tool. Literate programming is meant to deal with the first two audiences as they are the primary people the programmer must really deal with.

    The majority of programmers are much like the engineer who designs the carburetor on the car. The amount of work that takes is quite large, and making sure that the other engineers and the repair mechanic know how to fix it later is what she is tasked with documenting. Automobile companies have entire other divisions of engineers and technical documentation people who deal with the end user who is going to drive the car. The driver of the car rarely cares whether the air/fuel placement needs to be looked at in the next model car. The other engineers do.

  3. Stephen Smoogen says:

    Another item you need to be aware of is that many of the terms of user, administrator, programmer, operator, etc are deeply ingrained in the psyche of people who program for Unix and Linux. These are religious constructs that we use to “understand and unlock” the mysteries of how the computer system really works. People who have led the way are going to be treated like Saints, Prophets and Angels because the human brain is designed that way.

    Be prepared for a lot of dismissal, flames, and weird cries of fear/anger/hatred for questioning/changing those precepts and members of the Holy Orders.

  4. Glen Turner says:

    Literate programming is not a tool for documenting the use of a program and Knuth never claimed that it was. Knuth never held the attitude you indirectly attribute to him, but actually wrote fine manuals for using his programs, such as the TeXbook.

    The TeXbook is so readable as to almost be a work of literature. I doubt the Red Hat documentation is that good.

    Your comparison between fstab and quota documentation is apples to oranges. Let’s take quota. It has two sets of documentation: the manual page, which is detailed engineering documentation to assist fault finding; and Rob Elz’s paper “Disk quota in a UNIX environment”, which describes the motivation and use of quota. You are comparing the man page with your documentation. You should be comparing Rob’s paper with your documentation.

    Interestingly, the introductory comments in your documentation strongly echo the abstract in Rob’s paper. Perhaps you are missing an attribution? Or perhaps your notion of what is needed in documentation and that of the quota programmers isn’t that far apart?

    Dissing the man page is unfair. Your documentation doesn’t even attempt to give necessary technical information such as program status codes or the complete list of program arguments. That information needs to be recorded somewhere, and that is the role of the man pages.

  5. Anonymous says:

    > software doesn’t run out (Microsoft’s and Google’s dreams
    > of a never-ending rental stream for software as a ‘service’
    > notwithstanding)

  6. Anonymous says:

    >cough<

  7. Winnie says:

    still pro and cons

  8. anonymous says:

    If a programmer is making comments for other programmers perhaps they should consider creating fully automated tests for every method used. This allows new people to pick up the programming and when they break it they will know what went wrong and where. As for end user documentation, it is healthy to keep a distinction between audiences or you may try to write documentation covering all aspects and confuse/bore/annoy all readers.

  9. William Howell Sr. says:

    You touch on a very important part of a real-world problem, programming and documenting. Two very separate issues.

    As a programmer I learned years ago, don’t let the code writer write the documentation for the end-user’s education.

    No coder, in my opinion, should write any “How To Install & Use This Program” documentation for the “end-user”.