Lesson 4	Discerning Inherent Structure
Objective	Determine the Inherent Structure of Information within XML Documents

Determine Inherent Structure of Information in XML Documents

The first step in creating any XML document should be to determine the inherent structure of the information it will contain. That structure is partly a matter of judgment: even a simple document can be structured in many ways. The frames below walk through the structure of two sample documents, a business letter and a product catalog.

Advantages of creating letters as XML documents

Consider a simple business letter sent to a law office. Attorneys, among other business professionals, need to be able to track all correspondence. Some law offices have developed complex systems that store letters for retrieval using complex, coded directory structures. Others put letters into a database. Some perform full-text searches of the data by looking for names until they find the needed document.
If attorneys created their letters as XML documents, however, they could search on more specific criteria and find what they need more quickly. Such a system would reduce the time required to serve each client and free up time to pursue additional revenue.
XML is not appropriate for every task. As in carpentry, you should use the tool best suited to the job at hand. If a letter is never going to be searched for, referenced, or read again, there is no point in making it an XML document. But if the document will be kept and viewed in the future, then thought should be given to structuring the information for maximum future usability.

Discerning the structure of sample documents

1) Examine this basic business letter. What do you see in terms of structure? Remember that for this course, we are not concerned with the internal formatting of the document.

2) One very basic way of structuring the data in the business letter.

3) An XML structure for the document could be written like this (tags are in bold print to make them easy to see). This example shows a proper, albeit general, structure. A more specific structure could bring even greater benefits; the next example adds a great deal more specificity.
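As a rough illustration of such a general structure, here is a minimal sketch in Python. The element names (`letter`, `sender`, `recipient`, and so on) and the sample text are assumptions made up for this example, not the markup shown in the original frame.

```python
import xml.etree.ElementTree as ET

# Hypothetical general structure for a business letter. The element names
# and the sample text are illustrative assumptions, not the course's figure.
LETTER_XML = """\
<letter>
  <date>March 3, 2024</date>
  <sender>Jane Smith, Smith Consulting</sender>
  <recipient>John Doe, Doe and Associates</recipient>
  <salutation>Dear Mr. Doe:</salutation>
  <body>Thank you for meeting with us last week.</body>
  <closing>Sincerely,</closing>
  <signature>Jane Smith</signature>
</letter>
"""

letter = ET.fromstring(LETTER_XML)

# List the top-level parts of the letter in document order.
parts = [child.tag for child in letter]
print(parts)
```

Each major block of the letter becomes one element, which is enough to tell the parts apart but not enough to search on, say, the recipient's last name alone.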

4) The same letter structured into more specific pieces of information. This example adds much more specificity. However, that specificity requires much more work from the user, who has to encode the letter.
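The trade-off can be sketched as follows. This is only one possible finer-grained structure; every element name and value below is invented for illustration. The payoff for the extra encoding work is that individual facts can be looked up directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical, more specific structure; all names and values are invented.
LETTER_XML = """\
<letter>
  <date><month>March</month><day>3</day><year>2024</year></date>
  <recipient>
    <name><first>John</first><last>Doe</last></name>
    <firm>Doe and Associates</firm>
    <address>
      <street>100 Main Street</street>
      <city>Springfield</city>
      <state>IL</state>
      <zip>62701</zip>
    </address>
  </recipient>
  <body>
    <paragraph>Thank you for meeting with us last week.</paragraph>
  </body>
</letter>
"""

letter = ET.fromstring(LETTER_XML)

# Finer-grained elements allow precise lookups instead of full-text search.
last_name = letter.findtext("recipient/name/last")
year = letter.findtext("date/year")
city = letter.findtext("recipient/address/city")
print(last_name, year, city)
```

A search such as "all letters sent to someone named Doe in 2024" becomes a query on two elements rather than a scan of the full text.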

5) XML code for the version of the business letter in the previous frame. Remember that XML is case-sensitive. You can choose any capitalization scheme you like, but you must adhere to it strictly.
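The effect of inconsistent capitalization can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` module; the helper `is_well_formed` and the two sample strings are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Same capitalization in the start and end tags.
consistent = "<Letter><Date>March 3, 2024</Date></Letter>"
# </date> does not close <Date>, because element names are case-sensitive.
mismatched = "<Letter><Date>March 3, 2024</date></Letter>"

print(is_well_formed(consistent))
print(is_well_formed(mismatched))
```

The parser treats `Date` and `date` as two different element names, so the second string is rejected as a mismatched tag.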

6) A document that is well formed but cannot be interpreted by a human. While the document in this frame is well formed in a strictly syntactic sense, it means nothing from a human point of view. What is the significance of Hrumph or bL?
Well-formedness is of little use without a sensible structure and a humanly recognizable context.
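To make the point concrete, the following sketch parses a small document that reuses the names Hrumph and bL from the frame. The third name, `xq7`, and the content are assumptions added for illustration. The parser accepts the document without complaint, because it checks syntax only.

```python
import xml.etree.ElementTree as ET

# Well-formed, but the element names carry no meaning for a human reader.
# "xq7" and the text content are invented for this example.
CRYPTIC_XML = "<Hrumph><bL>March 3, 2024</bL><xq7>John Doe</xq7></Hrumph>"

root = ET.fromstring(CRYPTIC_XML)  # parses without error
tags = [child.tag for child in root]
print(root.tag, tags)
```

Nothing in the markup says that `bL` holds a date or that `xq7` holds a name; only meaningful element names can convey that.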

7) One of the most appropriate uses of XML is describing the items in a product list or catalog.

8) The catalog data represented in XML.
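A catalog of this kind might look like the sketch below. The element names, attributes, titles, and prices are all hypothetical stand-ins, not the data from the original frame. Because each item has the same regular structure, software can filter the entries easily.

```python
import xml.etree.ElementTree as ET

# Hypothetical book catalog; all names and values are invented.
CATALOG_XML = """\
<catalog>
  <book id="b1">
    <title>First Sample Book</title>
    <author>A. Author</author>
    <price currency="USD">29.95</price>
  </book>
  <book id="b2">
    <title>Second Sample Book</title>
    <author>B. Writer</author>
    <price currency="USD">45.00</price>
  </book>
</catalog>
"""

catalog = ET.fromstring(CATALOG_XML)

# Select the titles of all books that cost less than 40.
cheap_titles = [
    book.findtext("title")
    for book in catalog.findall("book")
    if float(book.findtext("price")) < 40
]
print(cheap_titles)
```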

9) Well-formed documents create natural tree-like structures that stem from the root. Consider the tree structure of the book catalog you just examined. This figure demonstrates one way in which you could represent this tree visually.
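A text outline is another way to represent the tree. The sketch below assumes a small, made-up catalog with a single book, and the helper name `outline` is hypothetical; it indents each element by its depth below the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-book catalog used only to show the tree shape.
CATALOG_XML = (
    "<catalog><book><title>First Sample Book</title>"
    "<author>A. Author</author><price>29.95</price></book></catalog>"
)


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


catalog = ET.fromstring(CATALOG_XML)
print("\n".join(outline(catalog)))
```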

Structure of XML Documents

XML documents form a tree structure that starts at "the root" and branches to "the leaves".

An XML tree starts at a root element and branches from the root to its child elements.
Any element can have sub-elements (child elements); an element with no children is a leaf.
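The root, child, and leaf relationships can be inspected programmatically. In this minimal sketch, the element names `root`, `child`, and `subchild` are generic placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

# Generic placeholder tree: one root, two children, one grandchild.
TREE_XML = "<root><child><subchild>text</subchild></child><child/></root>"

root = ET.fromstring(TREE_XML)

# Direct children of the root element.
children_of_root = [child.tag for child in root]

# Leaves are elements that have no child elements of their own.
leaves = [element.tag for element in root.iter() if len(element) == 0]

print(children_of_root)
print(leaves)
```

Note that the second `child` element is itself a leaf, because it is empty, while the first `child` is not, because it contains `subchild`.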

Data-Centric Versus Document-Centric
The examples you have seen concentrated on what are known as data-centric uses of XML, in which raw data is combined with markup to help give it meaning, make it easier to use, and enable greater interoperability. There is a second major use of XML, and of markup in general, known as document-centric, in which more loosely structured content is annotated with metadata. HTML is usually considered to be a document-centric use of SGML (and XHTML is similarly a document-oriented application of XML) because HTML is generally content designed to be read by humans rather than data to be consumed by a piece of software. XML is designed to be read and understood by both humans and software but, as you will see later, the ways of processing the different styles of XML can vary considerably.
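One practical sign of the difference is mixed content: in document-centric XML, text and elements are interleaved inside the same parent. The sketch below uses an invented sentence and shows how Python's `xml.etree.ElementTree` exposes the pieces through the `text` and `tail` attributes.

```python
import xml.etree.ElementTree as ET

# Data-centric: an element holds a single value, like a field in a record.
price = ET.fromstring("<price currency='USD'>29.95</price>")

# Document-centric: running text with inline markup (mixed content).
# The sentence itself is an invented example.
para = ET.fromstring(
    "<p>XML is <em>case-sensitive</em>, so tags must match exactly.</p>"
)

emphasis = para[0]
before = para.text          # text before the <em> element
inside = emphasis.text      # text inside <em>
after = emphasis.tail       # text following </em>, still inside <p>
full_text = "".join(para.itertext())

print(float(price.text), price.get("currency"))
print(full_text)
```

Data-centric code usually reads whole values from known elements, while document-centric code has to reassemble text that is spread across several nodes.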
Document-centric XML is generally used to facilitate multiple publishing channels and provide ways of reusing content. This is useful for instances in which regular content changes need to be applied to multiple forms of media at once. A few years ago I worked on a system that produced training materials for the online sector. A database held a large number of articles, quizzes, and revision aids that could be collated into general training materials. These were all in an XML format very similar to XHTML, the XML version of HTML. Once the content was finalized in this database, it was transformed using XSLT into media suitable for both the Web and a traditional printed output. When using document-centric XML in this sort of system, whenever content changes, it is only necessary to alter the underlying data for changes to be propagated to all forms of media in use. Additionally, when a different form of the content is needed, to support mobile web browsers for example, a new transformation is the only necessary action.
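The single-source idea can be sketched without XSLT. The example below is a simplified stand-in written in plain Python, not the XSLT pipeline described above; the `article` format and the functions `to_html` and `to_plain_text` are hypothetical. One XML source feeds two output formats, so a content change only has to be made once.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article format; the content is invented for this example.
ARTICLE_XML = """\
<article>
  <title>Well-Formed XML</title>
  <p>Every start tag needs a matching end tag.</p>
  <p>Element names are case-sensitive.</p>
</article>
"""


def to_html(article):
    """Render the article as an HTML fragment for the Web."""
    parts = [f"<h1>{escape(article.findtext('title'))}</h1>"]
    parts += [f"<p>{escape(p.text)}</p>" for p in article.findall("p")]
    return "\n".join(parts)


def to_plain_text(article):
    """Render the article as underlined plain text, for example for print."""
    title = article.findtext("title")
    lines = [title, "=" * len(title)]
    lines += [p.text for p in article.findall("p")]
    return "\n".join(lines)


article = ET.fromstring(ARTICLE_XML)
print(to_html(article))
print(to_plain_text(article))
```

Supporting another channel, such as mobile browsers, would mean adding one more rendering function while leaving the source content untouched.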
The next lesson shows you how to create a well-formed document from text.

Discerning XML - Quiz

Click the Quiz link below to check your understanding of rules for XML documents.
Discerning XML - Quiz