Determine the Inherent Structure of Information within XML Documents
The first step in creating any XML document should be to determine the inherent structure of the information within the document.
Structure depends on individual preferences. You can structure a simple document in many ways.
This lesson considers the structure of two sample documents, a business letter and a product catalog.
Advantages to creating letters as XML documents
Consider a simple business letter sent to a law office. Attorneys, among other business professionals, need to be able to track all correspondence.
Some law offices have developed elaborate systems that store letters for retrieval in coded directory structures.
Others put letters into a database. Some perform full-text searches of the data, looking for names until they find the needed document.
But if attorneys created their letters as XML documents, they could search on more specific criteria, such as sender, recipient, or date, and find what they need more quickly.
This system would reduce the amount of time required to service one client and free up time to pursue additional revenue.
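The difference between a full-text search and a search on specific criteria can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (letters, letter, sender, subject, and so on) and the sample letters are hypothetical, not part of any standard or of a real law office system.

```python
import xml.etree.ElementTree as ET

# Hypothetical letters; the element names are illustrative only.
LETTERS = """
<letters>
  <letter>
    <date>2001-03-14</date>
    <sender>Jane Doe</sender>
    <recipient>Smith and Jones Law Office</recipient>
    <subject>Lease renewal</subject>
    <body>Please find the signed lease enclosed.</body>
  </letter>
  <letter>
    <date>2001-04-02</date>
    <sender>John Roe</sender>
    <recipient>Smith and Jones Law Office</recipient>
    <subject>Invoice dispute</subject>
    <body>Jane Doe referred me to your office.</body>
  </letter>
</letters>
"""


def subjects_by_sender(root, name):
    """Return the subjects of letters whose <sender> is exactly `name`."""
    return [
        letter.findtext("subject")
        for letter in root.findall("letter")
        if letter.findtext("sender") == name
    ]


def subjects_by_full_text(root, phrase):
    """Return the subjects of letters that mention `phrase` anywhere."""
    return [
        letter.findtext("subject")
        for letter in root.findall("letter")
        if phrase in "".join(letter.itertext())
    ]


root = ET.fromstring(LETTERS)
structured_hits = subjects_by_sender(root, "Jane Doe")
full_text_hits = subjects_by_full_text(root, "Jane Doe")
print(structured_hits)  # only the letter Jane Doe sent
print(full_text_hits)   # also the letter that merely mentions her
```

The full-text search returns every letter that contains the name, while the structured search returns only the letter in which the name appears as the sender.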
As you can imagine, XML is not appropriate for every task. As in carpentry, users are encouraged to use the most appropriate tool to complete the job at hand. If this letter is never going to be searched for, referenced or read again, there is no point in making it an XML document.
But if this document will be kept and viewed in the future,
then thought should be given to structuring the information for maximum future usability.
Discerning Structure and the Structure of Sample Documents
Structure of XML Documents
XML documents form a tree of elements that starts at "the root" and branches to "the leaves".
Every XML document has exactly one root element, and the tree branches from that root to its child elements.
Any element can have sub-elements (child elements) of its own.
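As a minimal sketch of this tree, the following Python snippet parses a small product catalog and prints one indented line per element. The catalog, its element names, and the `outline` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical product catalog; <catalog> is the root element.
CATALOG = """
<catalog>
  <product id="p1">
    <name>Desk lamp</name>
    <price currency="USD">19.99</price>
  </product>
  <product id="p2">
    <name>Bookshelf</name>
    <price currency="USD">74.50</price>
  </product>
</catalog>
"""


def outline(element, depth=0):
    """Return one indented line per element, each parent before its children."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CATALOG)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Here catalog is the root, each product is a child of the root, and the name and price elements are the leaves.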
Data-Centric Versus Document-Centric
The examples you have seen concentrated on what are known as data-centric uses of XML.
This is where raw data is combined with markup to help give it meaning, make it easier to use, and enable greater interoperability. There is a second major use of XML, and of markup in general, which is known as document-centric. This is where more loosely structured content is annotated with metadata. HTML is usually considered to be a document-centric use of SGML (and XHTML is similarly a document-oriented application of XML) because HTML is generally content that is designed to be read by humans rather than data that will be consumed by a piece of software. XML is designed to be read and understood by both humans and software but, as you will see later, the ways of processing the different styles of XML can vary considerably.
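One common, though not defining, sign of document-centric XML is mixed content, where running text and elements sit side by side inside the same parent. The sketch below contrasts two hypothetical fragments; the fragments and the `has_mixed_content` helper are assumptions for illustration, not a formal test of either style.

```python
import xml.etree.ElementTree as ET

# Data-centric: each element holds either child elements or a single value.
DATA_CENTRIC = "<order><item>Desk lamp</item><quantity>2</quantity></order>"

# Document-centric: readable text is annotated in place with markup.
DOCUMENT_CENTRIC = "<p>The <em>signed</em> lease is enclosed.</p>"


def has_mixed_content(root):
    """True if some element holds non-whitespace text alongside child elements."""
    for node in root.iter():
        children = list(node)
        if not children:
            continue
        # Text before the first child, plus the text following each child.
        texts = [node.text] + [child.tail for child in children]
        if any(t and t.strip() for t in texts):
            return True
    return False


data_is_mixed = has_mixed_content(ET.fromstring(DATA_CENTRIC))
document_is_mixed = has_mixed_content(ET.fromstring(DOCUMENT_CENTRIC))
print(data_is_mixed, document_is_mixed)
```

Software that consumes the first fragment can read each value by element name, whereas the second fragment has to be processed in document order to keep the sentence intact.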
Document-centric XML is generally used to facilitate multiple publishing channels and provide ways of reusing content.
This is useful when regular content changes need to be applied to multiple forms of media at once. A few years ago I worked on a system that produced training materials for the online sector. A database held a large number of articles, quizzes, and revision aids that could be collated into general training materials. These were all in an XML format very similar to XHTML, the XML version of HTML. Once the content was finalized in this database, it was transformed using XSLT into media suitable for both the Web and traditional printed output. When using document-centric XML in this sort of system, whenever content changes, it is only necessary to alter the underlying data for the changes to be propagated to all forms of media in use. Additionally, when a different form of the content is needed, to support mobile web browsers for example, a new transformation is the only necessary action.
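The single-source idea can be sketched as follows. Note that the system described above used XSLT; this sketch swaps in two plain Python functions as stand-ins for the transformations, and the article markup (article, title, para) is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical source content, stored once.
ARTICLE = """
<article>
  <title>Well-formed XML</title>
  <para>Every start tag needs a matching end tag.</para>
  <para>Elements must nest properly.</para>
</article>
"""


def to_html(article):
    """Render the article as a small HTML fragment for the Web."""
    parts = ["<h1>" + escape(article.findtext("title")) + "</h1>"]
    parts += ["<p>" + escape(p.text) + "</p>" for p in article.findall("para")]
    return "\n".join(parts)


def to_plain_text(article):
    """Render the same article as underlined plain text, e.g. for print."""
    title = article.findtext("title")
    lines = [title, "=" * len(title), ""]
    lines += [p.text for p in article.findall("para")]
    return "\n".join(lines)


article = ET.fromstring(ARTICLE)
html_output = to_html(article)
text_output = to_plain_text(article)
print(html_output)
print(text_output)
```

Editing ARTICLE changes both outputs at once, and supporting another channel means adding one more function rather than touching the content.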
The next lesson shows you how to create a well-formed document from text.