A document type definition (DTD) forms the foundation of an SGML edifice. The goal of Developing SGML Document Type Definitions is to help individuals and organizations develop high-quality, effective DTDs.

We have been involved in a variety of Standard Generalized Markup Language (SGML) projects, including design and development of large DTDs, in computer companies and industry-wide forums. Through these efforts, we've refined a DTD development methodology that can help anyone embarking on SGML projects develop DTDs that meet the goals of those projects. In this book, we describe our methodology and techniques for designing, implementing, and documenting DTDs.


Developing SGML Document Type Definitions is intended to serve as a workbook for anyone who is, or might soon be, responsible for developing DTDs. The audience for this book includes people in the following roles:

  • Publications and MIS project managers and project leaders responsible for the successful migration to and implementation of SGML-based systems in their environments. If you're a manager in this position, this book will show you how to make the DTD development phase of your SGML project successful through resource and project planning. Part I, “Introduction and Overview” and Part IV, “Documentation, Training, and Support” are meant especially for people in this role.

  • Document authors, editors, and other subject matter experts who create, edit, or assemble the targeted information. They are authorities on the required form and content of that information and are typically in the best position to describe many document type requirements. If you're a subject matter expert, this book will show you how to uncover, express, and justify your requirements in clear, usable document analysis reports. Part II, “Document Type Design” is meant especially for people in this role.

  • DTD implementors, developers of document-processing applications, and system and database administrators responsible for implementing and maintaining DTDs and the systems and software tools that process the targeted information. If you're a developer, this book will show you how to contribute to the DTD requirements work and how to design and implement DTDs for readability, maintainability, and flexibility. Part III, “DTD Development” and some notes in Chapter 5, Document Type Modeling and Specification are meant especially for people in this role.

If you're embarking on a relatively small SGML project or a pilot for a larger effort, you may find yourself filling all three roles; this is often the case for the “SGML champion” in an organization.

Prerequisite Knowledge

If you're a subject matter expert participating in a document type design team or if you're a manager, you need no special knowledge before beginning to make use of this book. We'll teach the methodology, formalisms, and techniques you need to know for the design portion of the work, and Chapter 1, Introduction to SGML provides some basic information on SGML concepts. However, document type design team members will probably need additional training in SGML concepts to participate fully in the team's work.

If you're a DTD implementor, Part III, “DTD Development” is directed solely to you. Before reading these chapters, you need to be able to read and write SGML markup declarations, and you should be familiar with technical SGML terminology for common concepts, only a few of which are explained in the text. (Appendix A, DTD Implementor's Quick Reference provides a quick reference to SGML syntax.) We strongly suggest that you have on hand one of the various SGML reference books and other hardcopy and electronic sources as you read and work (see Appendix E, Bibliography and Sources for information on sources).

Scope of This Book

Developing SGML Document Type Definitions covers the breadth of the DTD development process for SGML (ISO 8879) applications, and it uses general-purpose examples in order to help you reach a beginning or intermediate level. For the most part, it doesn't discuss specific existing applications of SGML, such as industry-standard DTDs, or applications of the HyTime standard (ISO 10744) for hypertext and multimedia.

Our discussions and examples are based mostly on document production and publishing applications, that is, creation, processing, and delivery of documents in a business environment. Other types of applications, such as marking up documents for personal use or performing computerized analysis on documents written outside an organization's control, are not covered specifically. However, such projects can use our methodology and techniques with few modifications.

DTD development is an important part, but by no means the only part, of an SGML project. We stick closely to describing DTD-related tasks, leaving aside such topics as making the SGML decision in the first place, implementing conversion software and processes, and developing formatting stylesheets.

Appendix E, Bibliography and Sources suggests sources that address some of these additional topics.

Organization of This Book

This book is organized into four main parts and also has five appendices and a glossary.

Part I, “Introduction and Overview” introduces SGML and our methodology and discusses the management of a DTD development project.

Part II, “Document Type Design” explains the basic steps for the document type design team to use in analyzing the target document class and developing design requirements and rationales, resulting in the production of a “document analysis report” that can be used by a DTD implementor.

Part III, “DTD Development” describes how to implement markup requirements in a high-quality SGML DTD that takes into account any needs for customization and maintenance. It also discusses how to test the results.

Part IV, “Documentation, Training, and Support” explains how to document a DTD and train authors to apply markup correctly using it.

Appendix A, DTD Implementor's Quick Reference provides reference information for DTD implementors on constructing DTDs and SGML declarations.

Appendix B, Tree Diagram Reference provides reference information on the graphical tree diagram formalism we introduce for SGML information modeling.

Appendix C, DTD Reuse and Customization Sample provides an extended example of DTD technique for reuse and customization.

Appendix D, ISO Character Entity Sets summarizes the ISO character entity sets.

Appendix E, Bibliography and Sources suggests further reading and sources.

The Glossary explains phrases introduced in the text, including some terms that are defined in ISO 8879.

Conventions Used in This Book

This book uses the following typographical conventions:

  • In text, element type generic identifiers and attribute names are in fixed-width font, and attribute values are shown in the same font with quotation marks.

  • In text, general &entity; names and parameter %entity; names are in fixed-width font and are delimited as if they were references to the named entities.

  • Computer-related literal strings that must be used exactly as shown are in fixed-width font; variable parts of computer strings are in fixed-width oblique font. SGML reserved name keywords, such as ATTLIST, are shown in all capitals.

  • New terms are shown in boldface type where they are introduced and defined in text; these terms are also defined in the glossary.

  • Checklists containing practical advice appear throughout this book. Items in checklists have this special symbol next to them:

DTD implementors in particular should note the following conventions that we use for DTD examples, primarily in Part III, “DTD Development”:

  • Where real elements, attributes, and so on must be declared and used in DTD examples, we usually assume a straightforward structure-oriented book DTD and use simple descriptive names for the parts of the structure: div or division for a division element, para for a paragraph, and so on. These declarations don't correspond to any existing DTD, and any two examples are likely to be incompatible with each other.

  • The DTD examples assume the reference concrete syntax and quantity set, with the exception of the NAMELEN quantity, which we assume is 32. We assume OMITTAG minimization is set to YES, so we generally include omitted-tag specification characters in element declarations. In order not to distract from whatever issue is under discussion, we usually provide hyphen ( - ) specification characters to indicate that tags cannot be omitted. SGML declarations are discussed in Section A.9, “SGML Declarations”.

  • Ambiguous and otherwise invalid DTD and markup examples have the following symbol next to them, to remind you not to use them as templates for your own DTDs:
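
As a minimal sketch of the conventions in this list, element declarations for a structure-oriented book DTD might look like the following. This fragment is valid and is offered only as an illustration: the element names follow the simple descriptive style mentioned above, but the content models are assumptions made for this sketch and, like our other examples, correspond to no existing DTD.

```sgml
<!-- Illustrative sketch only: the content models are assumptions. -->
<!-- The two hyphens after each element name are the omitted-tag   -->
<!-- specification characters; they indicate that neither the      -->
<!-- start-tag nor the end-tag can be omitted.                     -->
<!ELEMENT division - - (title, para+)>
<!ELEMENT title    - - (#PCDATA)>
<!ELEMENT para     - - (#PCDATA)>
```

With OMITTAG set to YES, replacing the second hyphen with the letter O would instead allow the end-tag of that element to be omitted. Assuming a NAMELEN of 32 also permits names longer than the eight characters allowed by the reference quantity set.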


Drafts of this book were written in the DocBook DTD and variants thereof with the SoftQuad Author/Editor™ and ArborText ADEPT•Editor™ products for Microsoft Windows™, were reviewed with ADEPT ElectronicReview™, and were formatted with ADEPT•Publisher™ for UNIX™ systems. The graphics were prepared with Visio™. The final typeset output was produced with FrameBuilder® and ADEPT•Publisher™. OmniMark™ was used to validate the SGML examples.


You always write the book you wish you'd had the first time around. A great many people were involved in the events that shaped our experiences, opinions, and working styles, and we owe them a large debt for their contributions to the methodology and this book. Of course, any errors herein are ours alone.

Many thanks to ArborText and SoftQuad for providing the software we used in writing the book. We also gratefully acknowledge Berger-Levrault/Advanced Information Systems, and Dominique Vigneaud in particular, for introducing us to an early version of the tree diagram notation as a way to document DTDs. Thanks go to Andrew Rogers, creator of the rcard recipe-formatting program, for allowing us to use the rcard markup and output in examples in Chapter 4, Document Type Needs Analysis. Tim Allen kindly helped us validate the SGML examples in the book.

The proposal and, later, the book were given attention by many thoughtful reviewers. We are especially grateful to Terry Allen, Lee Fogal, Charles Goldfarb, Kathy Greenleaf, Paul Grosso, Eduardo Gutentag, Dominique Péré, Russ Rauhauser, Yuri Rubinsky, and Rich Yampell for their helpful contributions. We would also like to thank Deborah Dormitzer for recipes and advice, and Earl Grey for inventing “writing juice.”

Jeanne says:

Many thanks go to my former colleagues at Groupe Bull, for their generosity in allowing me to use real-life examples; Jacques Rousseaux, for his help in modeling the project workflow tasks using the Mallet methodology; Christophe Lecluse, for technical advice; and Eve, for agreeing to do this project, as it provided the perfect excuse for me to spend some quality time with her in the States. Most of all, I thank Jean Charles Burou and our son Alexandre, for their incredible patience. I fondly hope that they can forgive me for all the nights we missed going out on the town and for the succession of babysitters (respectively). Alexandre has heard SGML being spoken since before he was born; it's a wonder his first word wasn't “pee-cee-data.”

Eve says:

I owe much to Aidan Killian; Ludo Van Vooren; Aviva Bock; the many people at ArborText and Digital Equipment Corporation who gave me moral support; Jeanne and her family, for putting me up and putting up with me; my own family and friends, for being patient even when I got really boring; and especially my husband and true love Elias Israel, not only for coming up with the title, not only for providing an example of how one could write a book and survive, but for his ceaseless aid and comfort.

We dedicate this book to M'Hammed El Andaloussi and to the memory of Ned Maler.