Chapter 12. Documentation

Table of Contents

12.1. Documentation for Users of the Markup
12.1.1. Reference Manual
12.1.2. User's Guide
12.1.3. Tool Guides
12.1.4. Quick Reference and Online Help
12.2. Documentation for Readers of the DTD

It's not enough to have developed the perfect markup model for your document type. DTD documentation is essential for explaining the logic behind the model and for helping authors and application developers use the DTD properly. In fact, because markup declarations consist only of the syntax rules for markup and cannot convey the markup's “semantic” intent, the ISO 8879 standard actually requires documentation to be provided for the markup declarations, and technically defines a DTD to comprise both the declarations and their documentation. Here, we describe the kinds of documentation that are useful to supply.

Chapter 4, Document Type Needs Analysis discussed how to prepare a document analysis report, which can be thought of as the documentation of the document type design process. Here, we concentrate on how to develop documentation for an actual DTD in the various stages of its development and use. To prepare this material, you need to take into account the following factors:

In this chapter, we'll focus on the documentation needed by document authors and DTD maintainers.

12.1. Documentation for Users of the Markup

DTD user documentation is essential for two reasons.

First, authors must have access to a thorough description of every distinction that the DTD makes between kinds of document data (that is, every component represented in the markup), so that they can apply the markup correctly and consistently. If they don't make these distinctions in the documents, it will be impossible to take advantage of all the uses that had been planned for the information.

Second, computer software users from all backgrounds have come to expect easy-to-use software and readily available documentation and online help. Technical writers of product documentation, who make up a large segment of the SGML-authoring population, have even higher expectations because producing such material is their business. Providing complete and accurate user documentation helps ease the transition of authors to an SGML document production environment.

Although good documentation is always labor-intensive to produce, over time the documentation pays for itself by reducing the need for user support. Even in the short term it will be a good investment, because it can serve as the basis for some of the training materials needed immediately, and because it can be used in the testing of the draft DTD. If the document analysis report has been properly completed and reviewed, much of the material in it can be used as the basis for the documentation, which can save significantly on time and money.

Following are the minimum components of a complete set of DTD user documentation:

  • Reference manual

  • Task-oriented user's guide

  • Quick reference and online help

  • Guide to using tools specifically with this DTD

12.1.1. Reference Manual

A reference manual for a DTD is similar to a software reference manual. It consists of a series of standalone modules or “man pages,” each describing a single markup construct, with the modules in each logical grouping being arranged alphabetically. Following are the typical topics of the modules, in priority order:

  1. Each element type and its attributes

    Don't document end-tags separately from start-tags, even if omitted-tag minimization will be used heavily in your environment. Doing so makes the documentation unnecessarily large and may be a barrier to authors' understanding of the hierarchical nature of SGML.

  2. Each “common attribute”

    Any attribute that appears on multiple elements in substantially similar form should be documented on its own.

  3. Each available entity set for boilerplate text, special symbols, and so on

    You don't need to create a reference module for each individual entity.

  4. Each element collection that is allowed in multiple contexts

    These reference modules are necessary if the collections are identified by name in the rest of the reference modules.

The reference module for each element type should contain at least the following information:

Short name

The actual generic identifier, such as olist.

Full name

A descriptive phrase of arbitrary length that explicates the short name.

For example:

olist: An ordered list of related items.


Synopsis

A brief presentation of the rules for using the element. For elements in the document hierarchy, it's useful to provide one or more tree diagram fragments that show the direct parents and children of the element being described. For elements at the top level of information units, a tree diagram for the whole information unit is best. Alternatively, you can use a text-based synopsis like the following:

text, e.g., paragraphs

If the element allows minimization, this section can explain how to use it. In SGML-aware editing environments, usually minimization is not used.


Description

A complete description of the element's purpose, how and where it should be used, and rules for distinguishing this element from other elements that might be used for similar purposes. The description should also include information on choosing attribute values properly.

For example, the description of olist might include the ways in which it is different from ulist (“unordered list”), which is for lists of items that are in an insignificant order.


Attributes

A reference description of each attribute allowed on the element with its purpose, its allowed values, its default value (if any), and whether a value is required to be supplied.


Content and context

It may be useful to list explicitly the elements allowed inside the current element and the elements in which the current element is allowed, if this information is not regularly part of the Description section or has not been clearly conveyed by the Synopsis section. In environments where SGML-aware editors will be used, however, this information is not as useful as it may seem.

In the case of elements that are members of collections that appear in many contexts, you may want to name the collections in which they appear, and provide a separate reference module for the whole collection, explaining where its members can appear.


Examples

Practical examples convey information about proper markup usage more effectively than any other kind of documentation. Show all the major configurations of the element, with various contents, attribute values, and so on. If its usage changes substantially in different contexts, demonstrate each one. Also, if minimization is allowed, you can demonstrate it here.

For each example of marked-up content, show a corresponding example of formatted or otherwise processed output, if possible. While it may seem inappropriate to show processed output, since this approach does not focus on information potential and longevity, usually authors have many other goals—for example, ensuring information accuracy and meeting deadlines. The most efficient way to help authors mark up information thoroughly and consistently is to demonstrate the effects of markup and, thereby, provide a solid rationale for that markup.

If your document production environment has not yet attached processing to some markup in your DTD, documenting the use of the markup will be difficult, not only because the effect of the markup can't be demonstrated in the DTD documentation, but also because the actual processing environment won't provide natural constraints on what the authors do. For example, if your DTD has a data-level element that produces no typographical change or other processing behavior, the motivation of authors to use that element will be extremely low.

Processing notes

This section can discuss the practical use of the element in a particular editing or processing environment. This is the place to explain how to work around problems in the tools. Segregating the information in this fashion makes it easier to maintain the documentation and update it when the tools or conventional markup practices change.
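The lists of contained and containing elements described above can be derived mechanically from the declarations. The following Python sketch shows one possible approach. The sample declarations, the function name, and the regular expressions are all assumptions made for illustration; a pattern match is not a real SGML parser, and this one does not expand parameter entities or handle inclusions and exclusions.

```python
import re
from collections import defaultdict

# Illustrative declarations only; a real DTD would also use parameter
# entities, inclusions, and exclusions, which this sketch does not handle.
SAMPLE_DTD = """
<!ELEMENT section - - (title, (para | olist | ulist)+)>
<!ELEMENT olist   - - (item+)>
<!ELEMENT ulist   - - (item+)>
<!ELEMENT item    - O (para+)>
<!ELEMENT para    - O (#PCDATA)>
<!ELEMENT title   - O (#PCDATA)>
"""

# Element name, two minimization indicators, then the content model.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+[-O]\s+[-O]\s+(.*?)>", re.DOTALL)
# A name token that is not part of a reserved word such as #PCDATA.
NAME = re.compile(r"(?<![#A-Za-z0-9.-])[A-Za-z][A-Za-z0-9.-]*")
DECLARED_CONTENT = {"EMPTY", "ANY", "CDATA", "RCDATA"}

def content_and_context(dtd_text):
    """Map each element to the elements it contains and is contained in."""
    children = {}
    parents = defaultdict(set)
    for name, model in ELEMENT_DECL.findall(dtd_text):
        if model.strip().upper() in DECLARED_CONTENT:
            kids = set()
        else:
            kids = set(NAME.findall(model))
        children[name] = sorted(kids)
        for kid in kids:
            parents[kid].add(name)
    return {
        name: {"contains": children[name], "contained_in": sorted(parents[name])}
        for name in children
    }

for name, info in sorted(content_and_context(SAMPLE_DTD).items()):
    print(name, info)
```

A table produced this way is only a starting point for a reference module; the prose explaining when each context is appropriate still has to be written by hand.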

In general, the entire reference DTD should be documented in the reference manual, though if an authoring sub-DTD is being used, it may be sufficient to document for this audience only the portions of the model that the authors will see. In any case, the reference manual can document the whole model without any confusion because users will look up only the markup that interests them, rather than reading the whole document cover to cover. If you manage the reference documentation modularly, you can combine just the reference descriptions needed by each DTD in the family.

Software tools are available that can help you produce some DTD reference information directly from a DTD. This generated information can be useful, particularly for authors who do not have an SGML-aware editing environment available. However, keep in mind that high-quality examples and descriptions must still be provided by a human. Following are some types of information that can be generated:

  • Graphical diagrams or outline representations of the DTD rules

  • Alphabetical lists of elements with their short and full names (if the full names have been provided in structured comments in the DTD or in some other file)

  • Hypertext representations of a DTD that allow users to travel, for example, from the mention of an element in a content model to the declaration for that element

    This form of reference documentation is useful for authors who choose to read the actual DTD, as well as for DTD maintainers.
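As a rough illustration of the second kind of generated information, the Python sketch below builds an alphabetical list of short and full names. It assumes a hypothetical convention in which each element declaration is immediately preceded by a comment of the form `<!-- FULLNAME: ... -->`; that convention, the sample DTD, and the function name are inventions for this example rather than features of any particular tool.

```python
import re

# Hypothetical convention: a structured comment holding the full name
# sits directly before each element declaration.
SAMPLE_DTD = """
<!-- FULLNAME: Unordered list -->
<!ELEMENT ulist - - (item+)>
<!-- FULLNAME: Ordered list -->
<!ELEMENT olist - - (item+)>
<!-- FULLNAME: List item -->
<!ELEMENT item  - O (#PCDATA)>
"""

# Match a FULLNAME comment followed by the element declaration it describes.
DECL = re.compile(
    r"<!--\s*FULLNAME:\s*(?P<full>.*?)\s*-->\s*"
    r"<!ELEMENT\s+(?P<short>[A-Za-z][A-Za-z0-9.-]*)",
    re.DOTALL,
)

def element_names(dtd_text):
    """Return (short name, full name) pairs sorted by short name."""
    pairs = [(m.group("short"), m.group("full")) for m in DECL.finditer(dtd_text)]
    return sorted(pairs, key=lambda pair: pair[0].lower())

for short, full in element_names(SAMPLE_DTD):
    print(f"{short:<8}{full}")
```

Keeping the full names next to the declarations means the list can be regenerated whenever the DTD changes, which is the main attraction of this kind of tooling.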

12.1.2. User's Guide

Providing a conceptual and task-oriented user's guide gives you the opportunity to explain the DTD from the authors' perspective rather than from the perspective of elements and attributes. The following basic topics should be covered in the user's guide:

  • General SGML and DTD concepts

  • How to insert elements, attribute values, comments, entity declarations, and entity references

  • The job of applying markup as both a power and a responsibility, and the consequences of Tag Abuse Syndrome

In addition, the user's guide should describe all the tasks that authors most need to know about. The best way to discover what specific task-oriented topics must be addressed is to ask authors what their problems and concerns are. The periods of DTD and software testing provide a perfect opportunity to make these inquiries and discover all the problematic tasks that authors face. These topics, in order to be effective, may need to refer explicitly to the tools and environment that authors will use.

The problems typically fall into one of the following areas:

  • I have some information; how do I put it in a document?

    For example, an author might need to document a procedure. If the DTD uses highly content-based markup, the user's guide could explain how to apply the elements meant specifically for procedures and provide real-life examples. Otherwise, it should explain the proper conventional usage of less precise markup, such as using a numbered list in a certain way or with certain attribute values.

    A useful adjunct to descriptions of markup choices is a set of “markup cookbooks,” whole documents (mock or real) demonstrating the proper use of all the available markup and the resulting processing, provided along with a task-oriented index into the documents.

  • How do I achieve a certain formatting or processing effect?

    The topics arising from this question might compare sets of markup that result in similar formatting, and discuss how to choose among them. For example, an author may want to make a phrase appear in italic type, providing the opportunity to explain that the reason for the typographical distinction (for example, because the phrase is in a foreign language) is as important as the appearance for the purposes of information processing.

  • Why must I use so much markup? How can I reduce the amount and do my job efficiently?

    For example, an author might be struggling with information that requires several nested layers of markup. This problem provides an opportunity to explain minimization techniques or explain the use of the available markup templates.

Don't neglect to build an index and a glossary. An index of tasks, employing terminology used by authors rather than that suggested by SGML or the DTD's generic identifiers, is especially helpful. Likewise, it is useful to provide a glossary that defines SGML terms in relation to terms already used in the authors' culture.

Your inquiries will identify not only tasks that need explanation, but probably also any sequences of elements that authors commonly use. You can then turn these sequences into markup templates that help authors insert consistent and correct markup.
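Editors usually have their own template or macro facilities, but the idea can be sketched independently of any tool. The Python example below fills a template for a commonly used sequence of elements. The element names (procedure, title, step, para) and the function name are hypothetical, chosen only for illustration.

```python
from string import Template

# Hypothetical element names; substitute the ones your DTD defines.
STEP = Template("<step><para>$text</para></step>")
PROCEDURE = Template("<procedure>\n<title>$title</title>\n$steps\n</procedure>")

def procedure_markup(title, steps):
    """Fill the procedure template with a title and one step per entry.

    Text is inserted as-is, so any markup delimiters it contains
    are the caller's responsibility.
    """
    body = "\n".join(STEP.substitute(text=step) for step in steps)
    return PROCEDURE.substitute(title=title, steps=body)

print(procedure_markup("Replacing the filter",
                       ["Switch off the pump.", "Remove the cover."]))
```

Because every author starts from the same skeleton, the resulting markup is more likely to be consistent across documents.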

12.1.3. Tool Guides

In addition to documentation that has the DTD as its focus, you also need tool guides—not the documentation provided with the editors and other software that authors will use, but rather a set of guides that are shorter and more focused on the creation of documents with a specific DTD. They contain the basic information authors need in order to do real work. Popular components of such documentation are:

  • A guided tour of standard procedures

  • Troubleshooting questions and answers

  • Tips on becoming a “power user” of the tools

In some environments, authors are responsible only for the creation of textual document content. In others, they may be responsible for many more facets of document production. When a company switches to SGML, authors may be asked to:

  • Key in markup as well as document content

  • Get pieces of information (such as IDs, file names, trademarks, bibliographical references, and so on) from a database

  • Produce and include nontextual objects, such as graphics

  • Keep track of the workflow, validate partial and whole documents against a particular DTD, and store the results in an SGML database

  • Format their documents for paper printing and prepare them for other kinds of distribution

Depending on the sophistication of the SGML document production system and the tasks assigned to them, authors may have to learn how to use the following tools:

  • The editor

  • The workflow, storage, and archiving system and any additional databases

  • The formatting engines

  • The indexing engine for electronically distributed documents

This type of documentation will keep evolving and growing as you gather the needs of new users, and will have to be rewritten each time you change tools.

12.1.4. Quick Reference and Online Help

Although high-quality user documentation is crucial to any SGML project, most people hate using it. The usual complaints are that it is too bulky to handle and store, and too slow to use when you are looking for short, basic information. For this reason, users should also be provided with a quick reference card or sheet, as well as online help available from within the tools they use.

Unfortunately, most DTD quick reference cards we've seen are quite dull and not very helpful, for obvious reasons. For DTDs of over 100 elements, all that will fit on a single sheet of paper is an alphabetical list of the short and full names of the elements. For the sake of efficiency, some quick reference cards organize elements by category so that users can look in different places according to their problem: document hierarchy, information units, data-level elements grouped by function, and so on. The result is still not very useful.

However, there are ways to be more creative about the format and content of a DTD reference card, with positive effects on its efficiency. Figure 12.1, “Quick Reference Card for Troubleshooting DTD” shows a reference card designed for the troubleshooting DTD used in a nuclear plant, with the “help frame” side of the card being presented.

Figure 12.1. Quick Reference Card for Troubleshooting DTD


This reference card is made of two parts: a plastic envelope composing the front and the back of the reference card and a plastic card that slides inside the envelope.

On the front of the card are a diagram of the upper levels of the hierarchical structure of the DTD, a brief user's guide, and two lists of element collections, referred to by the names ELo and ELi. These names are used elsewhere on the card wherever the author may choose freely among the elements in a collection.

The back of the card has two parts. On the right, there is the list of the most important elements, each with its name and occurrence rule, and on the left, the help frame. The help frame has a transparent window behind which slides the printed plastic card. By sliding the card behind the open window, you display the last element you typed and can read what the valid parent, sibling, and child elements are.

Of course, not all DTDs are small enough to be represented on a small plastic reference card, but this one was very popular with users because it is simple and because they can carry it on-site in the plant for manually tagging their reports. One alternative for a larger DTD is to provide a package consisting of an outline of the major choices of document hierarchy, reduced copies of the tree diagrams representing the information units, and synopses of the usage of all the data-level elements. Usually, authors need more help with the lower levels of a document type than with the document hierarchy, where there is often much less choice about markup.

Other useful reference materials might be a set of two tables that relate the short element names to their descriptive names, possibly also providing the page number of the reference module where the element is described. Also, an equivalence table mapping the elements to the markup previously used is usually popular with authors, although such a table can easily be misused because the equivalences are often just rough approximations. If you provide this information, make sure to combine it with sufficient training.

If your word processor or SGML-aware editor allows for the possibility of online help, it can be very effective to provide reference information in this form, so that authors can look up individual elements and other markup constructs as they work. The modules written for the reference manual (discussed in Section 12.1.1, “Reference Manual”) make natural starting points for online help that can be made available in an editing environment.

12.2. Documentation for Readers of the DTD

In addition to the user documentation, every DTD needs to be accompanied by documentation meant for reviewers of the draft DTD, DTD maintainers and customizers, developers of processing applications, interchange partners, and other readers of the DTD. This information constitutes the DTD's maintenance documentation. Usually, the starting point is the reference manual (discussed in Section 12.1.1, “Reference Manual”). The following checklist shows the additional items that are essential for maintenance documentation.

  • Design Specification

    The implementor should explain any complex or tricky content models and describe all choices made by the implementor after the initial work by the document type design team was done. If there are any significant contexts that applications need to query on, these should be described (if they haven't been covered sufficiently in the user documentation). This information can be thought of as a supplement to the document analysis report; it completes the design specification of the DTD.

  • SGML Declaration Documentation

    Any SGML declaration characteristics that the DTD relies on (especially if it is not accompanied by a declaration) should be documented.

  • Architecture Report

    This specification should include an explanation of the organization of the markup declarations, the architecture of its modules and parameter entities, and how to customize the DTD. The module-dependency notation used in Chapter 10, Techniques for DTD Reuse and Customization is useful here for conveying information about the number of modules, their organization, and their overall levels of dependency on each other.

  • Differences Report

    The implementor should document the particulars of any variant DTDs and the differences between the reference DTD and the interchange DTD, and the expectations for transforming from one to another.

  • Interchange Report

    All information needed for other organizations to duplicate the intended processing should be documented. For example:

    • The expected defaults for #IMPLIED attribute values

    • How CDATA attribute values should be handled

    • Expected processing for empty elements, ENTITY attribute values, and non-SGML notations

    • Places where an application is required to generate and output text

    • Markup that is intended to control the behavior of formatting applications directly (for example, presentational attributes for font choice)

    • The precedence of processing for document content, where conflicting markup has been supplied

  • Administrative Information and Change Reports

    The contact information for the current maintainer and information on how to submit bug reports and enhancement requests should be recorded. Also, for each update, a report should be provided listing the changes made, their rationales, and the number of the bug report or enhancement request that each change responds to.

  • Tool-Specific Information

    The implementor should document how to use or compile the DTD with particular software tools available in the environment, where the handling differs between tools. Often this information includes an entity catalog file that specifies the system locations of various pieces of the DTD and SGML declaration.
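To make the role of an entity catalog concrete, the Python sketch below resolves public identifiers to system locations. It assumes catalog entries of the form `PUBLIC "public identifier" "system identifier"`, as used in SGML Open style catalogs; the public identifiers, file paths, and function name are made up for the example, and entry types other than PUBLIC are simply skipped.

```python
import shlex

# Hypothetical catalog content; the identifiers and paths are made up.
SAMPLE_CATALOG = '''
PUBLIC "-//Example Corp//DTD Troubleshooting Report//EN" "dtd/trouble.dtd"
PUBLIC "-//Example Corp//ENTITIES Common Attributes//EN" "dtd/common.ent"
SGMLDECL "dtd/trouble.dcl"
'''

def public_entries(catalog_text):
    """Map each public identifier to its system identifier."""
    tokens = shlex.split(catalog_text)  # keeps quoted literals together
    entries = {}
    i = 0
    while i < len(tokens):
        if tokens[i].upper() == "PUBLIC" and i + 2 < len(tokens):
            entries[tokens[i + 1]] = tokens[i + 2]
            i += 3
        else:
            i += 1  # skip keywords this sketch does not interpret
    return entries

for public_id, system_id in public_entries(SAMPLE_CATALOG).items():
    print(f"{public_id} -> {system_id}")
```

Documenting the catalog alongside the DTD lets interchange partners see at a glance which files make up the DTD and where each tool is expected to find them.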