Table of Contents

Prerequisite Knowledge
Scope of This Book
Organization of This Book
Conventions Used in This Book
I. Introduction and Overview
1. Introduction to SGML
1.1. SGML, Document Types, and Documents
1.2. SGML and Other Markup Systems
1.2.1. Procedural Markup Versus Declarative Markup
1.2.2. System-Specific Markup Versus Generic Markup
1.2.3. Noncontextual Markup Versus Contextual Markup
1.2.4. SGML Markup Strengths
1.3. SGML Constructs
1.3.1. Elements
1.3.2. Attributes
1.3.3. Entities
1.3.5. Putting the Pieces Together
1.4. SGML Document Processing
2. Introduction to DTD Development
2.1. DTD Development Phases
2.2. SGML Information Modeling Tools and Formalisms
3. DTD Project Management
3.1. The Global Picture
3.1.1. Types of Interaction with Documents
3.1.2. Components of an SGML-Based Production System
3.1.3. The Reference DTD and Its Variants
3.2. Preparing to Launch the Project
3.2.1. Defining the Project Goals and Strategic Directions
3.2.2. Controlling the Project Risks
3.2.3. Staffing the Project
3.2.4. Listing the Project Deliverables
3.2.5. Planning the Schedule and Budget
3.2.6. Writing the Project Plan
3.3. Launching the Project
3.3.1. Setting Up the Project Group
3.3.2. Identifying Future Users
3.3.3. Defining the Scope of Documents
3.3.4. Listing the Project Constraints
3.3.5. Planning the Project Workflow
3.4. Handling Project Politics
II. Document Type Design
4. Document Type Needs Analysis
4.1. Preparing for the Design Work
4.1.1. Learning Basic SGML Concepts
4.1.2. Learning to Recognize Semantic Components
4.1.3. Learning the Tree Diagram Notation
4.1.4. Scoping the Work
4.1.5. Planning to Prepare Deliverables
4.1.6. Learning About Teamwork Norms
4.1.7. Gathering Analysis Input
4.2. Performing the Needs Analysis
4.2.1. Step 1: Identifying Potential Components
4.2.2. Step 2: Classifying Components
4.2.3. Step 3: Validating the Needs Against Similar Analyses
5. Document Type Modeling and Specification
5.1. Preparing for the Modeling Work
5.2. Performing the Modeling Work
5.2.1. Step 4: Selecting Semantic Components
5.2.2. Step 5: Building the Document Hierarchy
5.2.3. Step 6: Building the Information Units
5.2.4. Step 7: Building the Data-Level Elements
5.2.5. Step 8: Populating the Branches
5.2.6. Step 9: Making Connections
5.2.7. Step 10: Validating and Reviewing the Design
5.3. Producing the Document Analysis Report
5.4. Updating the Model
6. Modeling Considerations
6.1. Distinctions Between Components
6.1.1. Multiple Elements
6.1.2. Single Element in Different Contexts
6.1.3. Single Element with Partitioned Content Models
6.1.4. Single Element with Multiple Attribute Values
6.2. Container Elements Versus Flat Structures
6.3. Documents as Databases
6.4. Strictness of Models
6.5. Divisions
6.6. Paragraphs
6.7. Generated Text
6.8. Augmented Text
6.9. Graphics
7. Design Under Special Constraints
7.1. Customizing an Existing DTD
7.2. Designing Document Types as an Industry-Wide Effort
III. DTD Development
8. Markup Model Design and Implementation
8.1. Determining the Number of DTDs
8.1.1. Creating DTDs for Nested Document Types
8.1.2. Creating Variant Element and Attribute Declarations
8.2. Interpreting and Handling Element Content Model Specifications
8.2.1. Handling Specifications That Specify Ambiguous Content Models
8.2.2. Forcing the Occurrence of One of Several Optional Elements
8.2.3. Limiting the Occurrence of Any-Order Elements
8.2.4. Handling Specifications for Mixed Content
8.3. Handling Specifications for Attributes
8.3.1. Designing Enumerated-Type Attributes
8.3.2. Designing ID and ID Reference Attributes
8.3.3. Designing Attributes with Implied Values
8.4. Useful Markup to Consider
8.4.1. Semantic Extension Markup
8.4.2. Markup That Eases Document Conversion
8.5. Designing Markup Names
8.6. Designing Markup Minimization
8.7. Addressing Other Factors in Markup Design
8.7.1. Allowing Markup Characters as Document Content
8.7.2. Defining Entities for Special Symbols and Characters
8.7.3. Creating Text Databases and Templates
8.7.4. Supplying a Default Entity Declaration
9. Techniques for DTD Maintenance and Readability
9.1. Using Good Coding Style
9.1.1. Comment Style
9.1.2. White Space Style
9.2. Organizing Element and Attribute Declarations
9.3. Managing Parameter Entities for Element Collections
9.4. Synchronizing the Content Models and Attributes of Multiple Elements
9.5. Creating New Attribute Keywords
10. Techniques for DTD Reuse and Customization
10.1. Categories of Customization
10.1.1. Subsetted Markup Models
10.1.2. Extended Markup Models
10.1.3. Renamed Markup Models
10.2. Facilitating Customization
10.2.1. Making DTDs Modular
10.2.2. Making Content Models Customizable
10.2.3. Including Markup Declarations Conditionally
10.2.4. Making Markup Names Customizable
10.3. Customizing Existing DTDs
11. Validation and Testing
11.1. Setting Up and Managing a Bug-Reporting System
11.2. Validating the DTD
11.3. Validating the Markup Model
11.3.1. Wrong or Overly Constrained Model
11.3.2. Overly Broad Model
11.4. Testing the Use of the DTD in the Real World
11.4.1. Usability with Applications
11.4.2. Usability with People
IV. Documentation, Training, and Support
12. Documentation
12.1. Documentation for Users of the Markup
12.1.1. Reference Manual
12.1.2. User's Guide
12.1.3. Tool Guides
12.1.4. Quick Reference and Online Help
12.2. Documentation for Readers of the DTD
13. Training and Support
13.1. Audiences for the Training
13.2. User Support
13.3. Phase 1: Initial Training
13.3.1. Introduction
13.3.2. Lectures and Paper Exercises
13.3.3. Computer Labs
13.3.4. Conclusion of Initial Training
13.4. Phase 2: Training Followup
13.5. Phase 3: Refresher Course
13.6. Phase 4: Quality Inspection of Documents
13.7. Phase 5: Information and Training on DTD Updates
13.8. Training Program Administration
13.8.1. Prerequisites
13.8.2. Number of Participants
13.8.3. Choice of Trainers
13.8.4. Length and Organization of the Training
13.8.5. Training Materials
13.9. The Learning Curve
13.9.1. Time Span
13.9.2. Productivity Assessment
13.10. Training Challenges
A. DTD Implementor's Quick Reference
A.1. Element Declarations
A.2. Attribute Definition List Declarations
A.3. Entities
A.3.1. General Entity Declarations
A.3.2. Parameter Entity Declarations
A.4. Comments
A.5. Marked Section Declarations
A.6. Notation Declarations
A.7. Processing Instructions
A.8. Document Type Declarations
A.9. SGML Declarations
A.9.1. Document Character Set
A.9.2. Capacity Set
A.9.3. Concrete Syntax Scope
A.9.4. Concrete Syntax
A.9.5. Feature Use
A.9.6. Application-Specific Information
A.10. Formal Public Identifiers and Catalogs
B. Tree Diagram Reference
B.1. Elements
B.2. Sequential and Either-Or Relationships
B.3. Occurrence Specifications
B.4. Collections and Any-Order Groups
B.5. Groups
B.6. Attributes
B.7. Additional Notations
B.8. Tree Diagram Building Process
C. DTD Reuse and Customization Sample
C.1. Original DTD Structure
C.2. Modified DTD Structure
C.2.1. Main DTD Files
C.2.2. Document Hierarchy and Metainformation Modules
C.2.3. Information Pool Module
C.2.4. Markup Model Changes Made
D. ISO Character Entity Sets
E. Bibliography and Sources

List of Figures

1.1. SGML, DTDs, and Document Instances
1.2. SGML Documents and Presentation Instances
1.3. Recipe Elements
1.4. Recipe Attributes
1.5. Recipe Entity and Reference
1.6. Recipe Elements, Attributes, and Entity
1.7. Two SGML Documents Conforming to the Recipe DTD
2.1. Some Potential Tree Structures for Recipe Documents
2.2. DTD-Level Graphical Description of Recipe Containment Rules
2.3. Railroad Diagrams for Recipe Content Models
2.4. Recipe DTD Tree Diagram
3.1. Document Interaction Classification
3.2. Conversion and Transformation Processes
3.3. Derivation Pattern for Variant DTDs
3.4. Conversion and Transformation Data Flow
3.5. Project Staff
3.6. Typical DTD Project Workflow
4.1. Restaurant Menu for Component Exercise
4.2. Restaurant Menu with Flat Structure Identified
4.3. Restaurant Menu with Nested Structure Identified
4.4. Semantic Component Form
4.5. Sample Chicken Recipe
4.6. Sample Cookie Recipe
4.7. Sample Cake Recipe
5.1. Element Form
5.2. Identifying the Document Hierarchy Components
5.3. Initial Cookbook Document Hierarchy
5.4. Initial Recipe Document Hierarchy
5.5. Complete Cookbook Document Hierarchy
5.6. Identifying the Information Units in the Information Pool
5.7. Tree Diagram for Ingredient List
5.8. Tree Diagrams for Recipe Data-Level Components
5.9. Context Population Matrix
5.10. Element Collection Form
5.11. Identifying Links
8.1. Tree Diagrams for Nested Document Types
8.2. Specification for Part Numbers Resulting in Content Model Ambiguity
8.3. Specification for Jokes Resulting in Content Model Ambiguity
8.4. Specification for Back Matter That Can Be Empty
8.5. Final Tree Diagram for Back Matter That Always Has Content
8.6. Specification for Song Collection Including Ballads
8.7. Final Tree Diagram for Song Collection Including Ballads
8.8. Specification for Dictionary Entry Collection
8.9. Specification for List Items with Problematic Mixed Content
9.1. Onion Approach to Collection Parameter Entities
9.2. Building Block Approach to Collection Parameter Entities
10.1. Relationship of Subsetted, Extended, and Renamed Markup Models to the Original
10.2. Modular DTD Structure for Sharing One Information Pool Among Several Document Hierarchies
10.3. Modular DTD Structure for Nested Document Types
10.4. Modular DTD Structure for Incorporating Standard Fragments
11.1. Change Request Form
11.2. Bug List for SGML Project
12.1. Quick Reference Card for Troubleshooting DTD
A.1. Element Declaration Syntax
A.2. Attribute Definition List Declaration Syntax
A.3. Functional Entity Types
A.4. General Entity Declaration Syntax
A.5. Parameter Entity Declaration Syntax
A.6. Comment Syntax
A.7. Marked Section Declaration Syntax
A.8. Notation Declaration Syntax
A.9. Processing Instruction Syntax
A.10. Document Type Declaration Syntax
A.11. SGML Declaration Syntax
A.12. CHARSET Parameter Syntax
A.13. CAPACITY Parameter Syntax
A.14. SCOPE Parameter Syntax
A.15. SYNTAX Parameter Syntax
A.16. FEATURES Parameter Syntax
A.17. APPINFO Parameter Syntax
A.18. Formal Public Identifier Syntax
B.1. Tree Diagram Notation Summary
C.1. Original Structure of the Memo, Letter, and Report DTDs
C.2. Modified Structure of the Memo, Letter, and Report DTDs
C.3. Precise Modular Relationships of the Reorganized Memo, Letter, and Report DTDs

List of Tables

3.1. Interdependencies in the Components of a Document Production System
4.1. rcard Markup Language Documentation
7.1. Comparison of Design Steps for New and Customized Document Types
13.1. Materials for a Typical DTD Training Program
A.1. Attribute Declared Values
A.2. Attribute Default Values
A.3. Reference Capacity Set
A.4. Reference General Delimiter Set
A.5. Reference Reserved Name Set
A.6. Reference Quantity Set
A.7. Formal Public Identifier Keywords
C.1. Information Unit Classes and Collections
C.2. Data-Level Classes and Collections
D.1. ISO Entity Sets
D.2. ISO Entities Sorted by Name

List of Examples

1.1. SGML Document for Pudding Recipe
1.2. Recipe DTD
1.3. SGML Document for Fudge Recipe
4.1. Contents of a Typical Document Analysis Report
A.1. Sample SGML Declaration
A.2. Reference Concrete Syntax Specification
A.3. Core Concrete Syntax Specification
C.1. Memo DTD Before Reorganization
C.2. Letter DTD Before Reorganization
C.3. Report DTD Before Reorganization
C.4. New Memo DTD Driver File
C.5. New Letter DTD Driver File
C.6. New Report DTD Driver File
C.7. Memo Hierarchy Module
C.8. Letter Hierarchy Module
C.9. Report Hierarchy Module
C.10. Memo and Letter Metainformation Module
C.11. Information Pool Module