Determine the root element of the document. The root element in this example is CATALOG.
List all the other elements used in the document.
The other elements used in the catalog are BOOK, TITLE, AUTHOR, YEAR-PUBLISHED, and ISBN.
Create the document tree structure
After listing all the elements, create a document tree structure for them. This figure represents the document tree for the elements in the book catalog:
Note whether elements need to appear in a specific order. In the book catalog example, assume the elements TITLE, AUTHOR, YEAR-PUBLISHED, and ISBN must appear in that order. Also note whether each element is required, is optional, or may appear more than once. In this example, each BOOK element needs exactly one TITLE and exactly one YEAR-PUBLISHED element. For now, you will also use exactly one AUTHOR element for each BOOK element, although that could change.
Any number of BOOK elements can exist, though there should be at least one book present to consider this a catalog.
Finally, note which elements will contain text and which will contain only other elements.
The elements that will only contain other elements in this example are the root element CATALOG and the BOOK element. All other elements will contain character data.
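To make the analysis above concrete, here is a minimal sketch of what one catalog instance might look like before any DTD is written. The element names come from the example; the text values are placeholders invented for illustration, not real catalog data.

```xml
<!-- CATALOG is the root element and contains only BOOK elements -->
<CATALOG>
  <!-- BOOK contains only other elements, in a fixed order -->
  <BOOK>
    <!-- The remaining elements contain character data (placeholder values) -->
    <TITLE>Example Title</TITLE>
    <AUTHOR>Example Author</AUTHOR>
    <YEAR-PUBLISHED>2001</YEAR-PUBLISHED>
    <ISBN>0-000-00000-0</ISBN>
  </BOOK>
</CATALOG>
```

The nesting mirrors the document tree: CATALOG at the top, BOOK beneath it, and the four text-bearing elements as the leaves.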
Creating the DTD
From the information gathered, you can now create a comprehensive DTD that can be used to validate this catalog and any others expected to have the same format.
<!DOCTYPE CATALOG [
<!ELEMENT CATALOG (BOOK)+>
<!ELEMENT BOOK (TITLE,AUTHOR,YEAR-PUBLISHED,ISBN?)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
<!ELEMENT YEAR-PUBLISHED (#PCDATA)>
<!ELEMENT ISBN (#PCDATA)>
]>
This DTD indicates the following:
The root element is CATALOG.
The CATALOG element contains one or more BOOK elements. This is indicated by the + suffix.
The BOOK element contains TITLE, AUTHOR, YEAR-PUBLISHED, and ISBN, in this order.
Each of these can contain only character data.
ISBN is optional, as indicated by the ? suffix. If it is present, no more than one ISBN element can be used per BOOK element, and it must be the last element inside the BOOK element.
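As a rough illustration of these rules, the sketch below embeds the DTD above as an internal subset of a complete document. It contains two BOOK elements, one with an ISBN and one without, and both should be accepted by a validating parser. The text values are placeholders invented for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE CATALOG [
<!ELEMENT CATALOG (BOOK)+>
<!ELEMENT BOOK (TITLE,AUTHOR,YEAR-PUBLISHED,ISBN?)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
<!ELEMENT YEAR-PUBLISHED (#PCDATA)>
<!ELEMENT ISBN (#PCDATA)>
]>
<CATALOG>
  <!-- All four child elements present, in the declared order -->
  <BOOK>
    <TITLE>First Example Title</TITLE>
    <AUTHOR>First Example Author</AUTHOR>
    <YEAR-PUBLISHED>2001</YEAR-PUBLISHED>
    <ISBN>0-000-00000-0</ISBN>
  </BOOK>
  <!-- ISBN omitted, which the ? suffix permits -->
  <BOOK>
    <TITLE>Second Example Title</TITLE>
    <AUTHOR>Second Example Author</AUTHOR>
    <YEAR-PUBLISHED>2002</YEAR-PUBLISHED>
  </BOOK>
</CATALOG>
```

By contrast, a BOOK that placed AUTHOR before TITLE, repeated ISBN, or left out YEAR-PUBLISHED would fail validation against this DTD. If the document were saved under a hypothetical name such as catalog.xml, one way to check it would be a validating parser, for example xmllint --valid --noout catalog.xml.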
DTD syntax characters
DTD syntax characters appear throughout this module. In most of the examples we will use, the vertical white space (line breaks) in the DTD is optional. Indentation has been used to make the code more readable, but you need not indent your DTDs to use them. The next lesson shows you how to declare basic elements in a DTD.
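As a small sketch of the point about white space, the catalog DTD from this lesson can be laid out with indentation, blank lines, and spaces after the commas in a content model without changing its meaning. The layout below is only one possible style, not a required one.

```xml
<!DOCTYPE CATALOG [

  <!-- Structural elements: these contain only other elements -->
  <!ELEMENT CATALOG (BOOK)+>
  <!ELEMENT BOOK    (TITLE, AUTHOR, YEAR-PUBLISHED, ISBN?)>

  <!-- Leaf elements: these contain only character data -->
  <!ELEMENT TITLE          (#PCDATA)>
  <!ELEMENT AUTHOR         (#PCDATA)>
  <!ELEMENT YEAR-PUBLISHED (#PCDATA)>
  <!ELEMENT ISBN           (#PCDATA)>

]>
```

A parser should treat this layout the same as the unindented version. The white space that separates a keyword such as <!ELEMENT from the element name is still required.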