In this module, you learned how to recognize XML structure and how to write custom tags to denote this structure in an XML document.
You learned the five basic rules for well-formedness, and you know that well-formedness is only the minimum requirement for a document to be considered an XML document.
You have seen that tag names can help or hinder how easily people can read an XML document. Finally, you learned that the potential for interoperability is lost unless authors adhere to common vocabularies when creating XML documents.
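As a rough illustration of what "well-formed" means in practice, the sketch below feeds a few short documents to Python's standard `xml.etree.ElementTree` parser and reports which ones it accepts. The helper name `is_well_formed` and the sample documents are hypothetical, chosen for this example; they show common well-formedness violations and are not a restatement of the module's list of five rules.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents: one good, the rest each break one rule.
samples = {
    "well-formed": "<note><to>Ann</to></note>",
    "missing end tag": "<note><to>Ann</note>",
    "improper nesting": "<note><b><i>hi</b></i></note>",
    "unquoted attribute": "<note id=1>hi</note>",
    "case mismatch": "<Note>hi</note>",
    "two root elements": "<a/><b/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

A parser that accepts a document only confirms well-formedness; it says nothing about whether the tag names are readable or follow a shared vocabulary.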
Learning Objectives
Having completed this module, you should now be able to:
- Construct XML documents
- Clarify the differences between tags in XML and HTML
- Specify the five rules of creating a well-formed XML document
- Create an HTML document that follows the well-formed XML rules
- Create a well-formed XML document
- Add clarity and information to XML documents using comments, CDATA sections, and encoding
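The last objective can be made concrete with a small sketch. It assumes Python's standard `xml.etree.ElementTree` parser and a hypothetical `<snippet>` document invented for this example. The document declares its encoding, carries a comment, and wraps markup-like text in a CDATA section so that the parser treats it as plain characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical document (bytes, so the encoding declaration is honored).
# \xc3\xa9 is the UTF-8 byte sequence for the character "é".
doc = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- A comment: a note for human readers, not document content -->
<snippet>
  <title>Caf\xc3\xa9 markup</title>
  <code><![CDATA[if (a < b && b > 0) { return "<ok/>"; }]]></code>
</snippet>
"""

root = ET.fromstring(doc)

title = root.find("title").text
code_text = root.find("code").text

print(title)      # decoded according to the declared encoding
print(code_text)  # the CDATA content arrives as literal text
print(len(root))  # number of child elements of <snippet>
```

Without the CDATA section, the `<` and `&` characters inside `<code>` would have to be written as `&lt;` and `&amp;` for the document to remain well-formed.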
XML documents tend toward one of two extremes. At one extreme are documents that are primarily text, with markup inserted occasionally to bring out the structure of the text and to select ranges of text for special treatment. Elements in this kind of XML frequently have a mixed-content model: text and elements mixed together as siblings (children of the same element). At the other extreme are documents that look as if they came out of a relational database, and frequently they did. This kind of XML is highly structured, with repeating elements that share the same internal structure. Text appears only in the leaf elements of the tree, never at the same level as child elements. See Listing 3-7.
Listing 3-7. Data-Oriented XML
<?xml version="1.0"?>
<pdata>
  <person id="CM123" access-level="customer">
    <name>
      <title>Mr.</title>
      <last>Monster</last>
      <first>Cookie</first>
      <middle>C</middle>
    </name>
    <address>
      <street>123 Sesame Street</street>
      <city>New York</city>
      <state>NY</state>
      <zip>10023</zip>
    </address>
    <bdate>
      <year>1969</year>
      <month>11</month>
      <day>2</day>
    </bdate>
    <email>cookie@sesamestreet.com</email>
    <favorites>
      <color>red</color>
      <drink>cookie juice</drink>
    </favorites>
  </person>
</pdata>
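To contrast the two extremes, the following sketch checks whether an element has mixed content. It assumes Python's standard `xml.etree.ElementTree`, which exposes the text before an element's first child as `.text` and the text after each child as that child's `.tail`. The helper `has_mixed_content` and the `<para>` fragment are hypothetical; the `<address>` fragment reuses element names from Listing 3-7.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(elem) -> bool:
    """True if `elem` has child elements with non-whitespace text beside them."""
    if len(elem) == 0:
        return False  # a leaf element holds text only, not mixed content
    # Text directly inside `elem`: before the first child, then after each child.
    pieces = [elem.text] + [child.tail for child in elem]
    return any(p and p.strip() for p in pieces)


# Text-oriented fragment (hypothetical): text and elements are siblings.
narrative = ET.fromstring(
    "<para>Cookie <em>loves</em> cookies, especially on "
    "<place>Sesame Street</place>.</para>"
)

# Data-oriented fragment: text appears only in the leaf elements.
data = ET.fromstring(
    "<address><street>123 Sesame Street</street><city>New York</city></address>"
)

print(has_mixed_content(narrative))
print(has_mixed_content(data))
```

Whitespace-only text between child elements, such as the indentation in a pretty-printed listing, is deliberately ignored by this check.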
This module introduced the following terms:
- Empty Element: An element that has no content. It can be written as a start tag immediately followed by an end tag, such as <br></br>, or as a single empty-element tag, such as <br/>. An empty element may still carry attributes.
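- Mixed Content Model: A content model in which an element contains character data interspersed with child elements, so that text and elements appear as siblings.
- CDATA: Short for character data. The term is used for distinct but related purposes in SGML and XML. In XML, a CDATA section (<![CDATA[ ... ]]>) marks text that the parser treats as literal characters rather than as markup.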
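As a brief check on the first term, the sketch below parses the two spellings of an empty element with Python's standard `xml.etree.ElementTree` and compares the results. The element name `middle` is borrowed from Listing 3-7 purely for illustration.

```python
import xml.etree.ElementTree as ET

# Two spellings of the same empty element.
short_form = ET.fromstring("<middle/>")
long_form = ET.fromstring("<middle></middle>")

for elem in (short_form, long_form):
    # An empty element has no text and no child elements.
    print(elem.tag, elem.text, len(elem))

# Both spellings serialize identically once parsed.
same = ET.tostring(short_form) == ET.tostring(long_form)
print(same)
```

To an XML parser the two forms are equivalent; the choice between them is a matter of style.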
In the next module, DTD basics will be discussed.