Chapter 7. Design Under Special Constraints

Table of Contents

7.1. Customizing an Existing DTD
7.2. Designing Document Types as an Industry-Wide Effort

If you're designing a document type from scratch for your own organization, Chapter 4, Document Type Needs Analysis, and Chapter 5, Document Type Modeling and Specification, describe what you need to know. However, if your project involves customizing an existing DTD or designing a document type as an industry-wide effort, this chapter explains how the document type design process might differ.

7.1. Customizing an Existing DTD

It can be advantageous to use an existing DTD or DTD fragment as the basis for a new document type, because the similarity to the existing model may allow you to use processing applications that are already developed, and you may be able to benefit from the availability of existing expertise, documentation, and training courses. If you must interchange documents with business partners, basing your reference DTD on an industry-standard DTD will almost certainly make sense, if such a standard DTD exists for your industry segment. (The relationship between these DTDs is described in Section 3.1.3, “The Reference DTD and Its Variants”.) The document type design process will need to change to reflect this reality.

One obvious difference is that the supporting documentation for the existing DTD will play a major part in both analysis and specification. Therefore, it is essential to assess the available documentation and ensure that it is complete and understandable before you begin. In a customization project, usually the DTD implementor, or another person fluent in SGML or in that particular DTD, must be available to give a “running interpretation” of the DTD. In fact, it's not uncommon for SGML consultants to be hired to reorganize and redocument a standard DTD so that its assumptions about scope, constraints, usage, and processing can be made clear (or reconstructed, if they were documented insufficiently in the first place). An extremely effective technique for starting to read and understand a DTD is to sketch out its tree diagrams; the effect of recreating the model by hand is very powerful.

Another difference is that the design team will need to become familiar with the different types of variation between markup models: subsetting, extension, and renaming (discussed in Section 10.1, “Categories of Customization”). The team members may need some help from the DTD implementor for this relatively technical topic, but it's important to know the costs and benefits of each type of variation so that they can be factored into design decisions. If certain constraints along these lines are known at the outset, the project documents will make this clear. For example, it may already be dictated that some parts of the original markup model must be subsetted out, or that the model must be extended to account for proprietary information.

Often, it will be stated as a goal that the reference DTD must be a proper subset of the interchange DTD, meaning that all instances of the former should conform to the latter. However, be aware that this is often an unrealistic goal. What usually happens is that authors will be instructed to use the standard DTD in ways that are unique to that organization, in effect creating semantic extensions (a notion discussed in Section 8.4.1, “Semantic Extension Markup”) while conforming to the “letter of the law.” Thus, document instances might still require transformation of some sort when they are exchanged with business partners.
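
The difference between subsetting, extension, and renaming can be made concrete with a small comparison of element collections. The following is only a minimal sketch in Python, under several assumptions: each markup model is reduced to a flat set of element names (ignoring order and occurrence rules), the element names are hypothetical, and the function names `classify_variation` and `is_pure_subset` are invented for this illustration. It cannot detect semantic extensions, since those leave the declared markup unchanged.

```python
def classify_variation(original, variant, renames=None):
    """Compare two element collections given as sets of element names.

    `renames` maps a name used in the variant to the name it replaces in
    the original (a hypothetical convention for this sketch).
    """
    renames = renames or {}
    # Translate renamed elements back to their original names before comparing.
    normalized = {renames.get(name, name) for name in variant}
    return {
        "subsetted": sorted(original - normalized),
        "extended": sorted(normalized - original),
        "renamed": {new: old for new, old in renames.items() if new in variant},
    }


def is_pure_subset(report):
    """True when the variant only removes elements (no extension or renaming)."""
    return not report["extended"] and not report["renamed"]


# Hypothetical element names, for illustration only.
interchange = {"para", "list", "figure", "table"}
reference = {"p", "list", "figure", "warning"}
report = classify_variation(interchange, reference, renames={"p": "para"})
```

In this sketch, only a variant for which `is_pure_subset` returns true has a chance of producing instances that conform to the original model without transformation; any extension or renaming implies a conversion step at interchange time.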

Table 7.1, “Comparison of Design Steps for New and Customized Document Types” compares the design steps in projects for new document types versus customized document types.

Table 7.1. Comparison of Design Steps for New and Customized Document Types

Step | New DTD | Customized DTD
1 | Identify potential semantic components. | You should place a heavy priority on identifying component ideas from the original DTD, while taking note of your unique requirements.
2 | Classify components. | If the documentation for the original DTD already mentions some classifications, consider whether they make sense for the current analysis. Using them allows you to stay within the terminology and “mindset” of the interchange DTD, while leaving you free to recognize additional patterns.
3 | Validate components. | Check carefully to make sure the original DTD is well represented.
4 | Select components. | Your project may already hold the assumption that certain types of components are “in” or “out,” but you should ensure that your own legitimate needs aren't rejected. Make sure to state the rationale for each rejection of a standard component.
5–7 | Build document hierarchy, information units, and data-level elements. | For each component that has a representative element in the original DTD, instead of starting with a blank slate, use its model as a proposed starting point and modify it as necessary. For components with no representation in the original DTD, model them the usual way. Document the rationale for every extension of the original model, in addition to documenting the “absolute” rationale for your modeling choices.
8 | Populate the branches. | Pay close attention to the element collections used in the original DTD, and try to subset rather than extend them, except where you have added your own unique elements that must be included. Extensions in this area can make it very difficult to transform documents into compliance with the interchange DTD. If the original DTD doesn't already use a building-block approach for managing its element collections, you may want to model them in a matrix just to clarify the existing choices.
9 | Connect the model to the outside world. | Except for linking mechanisms, such connections tend to be idiosyncratic to each variant DTD.
10 | Validate and complete the design. | Same.
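
The matrix suggested for step 8 can be drafted by hand, but a short script is one way to keep it current while the collections are still changing. The following Python sketch is illustrative only: the context names, the element names, and the function `collection_matrix` are hypothetical, and each collection is treated as a simple set of allowed elements.

```python
def collection_matrix(collections):
    """Return text rows showing which elements each context allows.

    `collections` maps a context name to the set of element names
    permitted there (hypothetical data, for illustration only).
    """
    # Columns are every element mentioned in any collection, in sorted order.
    elements = sorted({e for members in collections.values() for e in members})
    width = max(len(name) for name in collections)
    rows = [" " * width + "  " + "  ".join(elements)]
    for context, members in collections.items():
        # Mark allowed elements with "x" and disallowed ones with ".".
        cells = [("x" if e in members else ".").center(len(e)) for e in elements]
        rows.append(context.ljust(width) + "  " + "  ".join(cells))
    return rows


rows = collection_matrix({
    "section": {"para", "list", "table"},
    "entry": {"para"},
})
print("\n".join(rows))
```

Reading across a row shows what one context allows; reading down a column shows everywhere an element may appear, which makes candidates for subsetting easier to spot.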

Doing a proper job of customizing will demand just as much skill and expertise as a ground-up effort, maybe even more, and the results will be unsatisfying unless you do a real analysis, not just one that assumes the standard DTD describes the entire universe of potential semantic components. It may even be best to perform an analysis and only then choose a DTD to serve as a base, if you have this flexibility.

It usually takes less time to customize an existing DTD than to design one from scratch because the modeling phase can be completed more quickly. However, because of the time and skills needed for analysis, customizing an existing DTD should be done primarily for reasons of interchange and application availability, not for reasons of cutting the DTD development schedule in half.

7.2. Designing Document Types as an Industry-Wide Effort

Quite a few efforts have already been conducted for the creation of DTDs that can be used across an entire industry, for example, for aircraft maintenance, semiconductor manufacturing, and computers. Many more such efforts will be undertaken as the use of SGML grows. If you participate in an effort to design a document type for a whole industry, be prepared for the process to be more complex, more costly, and more lengthy than any single-company project. It's not uncommon for the project to take 18 to 24 months.

Because the participants usually represent competitors, they tend to have strongly conflicting perspectives and they may be reluctant to share analysis data that they feel is proprietary or sensitive. This situation can result in an air of suspicion over the proceedings, as well as temporary political alliances to serve business agendas unrelated to DTD development. On the other hand, the participants are usually counterparts in their respective organizations and may have more in common with each other than with their colleagues back at the home office, which can make the work a pleasurable meeting of the minds and a chance to stay current in the field.

Following are specific suggestions to help an industry-wide effort succeed. (Section 3.4, “Handling Project Politics”, discusses handling the politics of DTD projects in general.)

  • Keep Participation Stable and of a Controllable Size

    For every company added as a participant, the potential for complexity increases. To make the work as efficient as possible, it's ideal to have only one representative from each company on the design team, and some limited number of additional representatives from each company on the steering committee (which is likely to be larger than for a single-company project). If the enormous effort required of participants is made clear from the start, you have a better chance of keeping the participant population to a reasonable number and of having the same people participate throughout the project.

  • Clarify and Document the Process Beforehand

    The design team should have a clear decision model, and may even need to go as far as signing a written description of the processes for offering suggestions, making decisions, and handling deadlocks. Each design team representative should be officially empowered by his or her company to participate in the decision-making, so that the discussions and results will have validity.

    The design team's facilitator and recordist can be picked from among the participants, if it is generally agreed that they can be impartial. If such people can't be found, the consortium of companies may need to resort to hiring consultants.

  • Be Organized and Disciplined

    It probably seems like dull advice, but you won't regret any of your efforts to plan the work, resources, and schedule; stick to the plan; document every single decision you make; and insist on thorough reviews and explicit signoffs. Because of the toll that cultural conflicts can take on the effort, be sure to define and check your understanding of all terms, even those that seem most obvious. You may need to resort to using neutral terms that no company uses in real life, so as not to favor any particular culture.

  • Consider Building Some Flexibility into the Result

    Depending on the project's goals, the markup model might appropriately be either prescriptive or descriptive. Regardless of its general approach, though, the model for an industry DTD often must be relatively forgiving in order to accommodate divergent cultural and structural styles. If some of the participating companies already use SGML, some parts of the modeling work can resemble a massive DTD customization effort, where all the existing models are unified into a new model that accommodates their essential characteristics.

    Usually, the main goal of an industry-wide DTD project is to ease interchange between business partners. A subsidiary goal may be to encourage software vendors to support the resulting DTD, which will be widely used, in a robust way. There is potential for conflict in these goals because the perfect interchange DTD will probably not be a perfect authoring or presentation DTD, even though some participants may expect to use off-the-shelf software directly with the DTD sanctioned by the industry group.

    Even for DTDs whose only ostensible purpose is interchange, often a segment of the user population will plan to author and process documents using the interchange DTD. It's essential to be realistic and accurate in your goal-setting. Particularly for very large models and user populations, you may want to consider casting your effort as the modeling of a “library” of elements, of which a subset can be chosen for actual use, an approach that is particularly useful for elements in the information pool. The DTD implementor can then structure the DTD with appropriate kinds of customization in mind. (Techniques for DTD reuse and customization are discussed in Chapter 10, Techniques for DTD Reuse and Customization.)

  • Recognize Special Interchange Needs

    Industry interchange situations may require that information about the original presentation or delivery of the document travel with it. For example, it may be necessary to ship augmented information (as discussed in Section 6.8, “Augmented Text”) that has been folded back into the instance. This requirement makes the interchange DTD act somewhat like a presentation DTD.

  • Don't Give Up Till All Decisions Have Been Made

    Because the effort is so expensive and can drag on for such a long time, often the participants run out of steam and some work is left undone. Before you start, recognize that this failure to finish can seriously jeopardize the whole effort, and commit to finishing the job properly. Not only will the project have a greater chance of success, but all the participants will have a sense of personal satisfaction.