|
Instructional Module X11c
|
|
|
Definition from W3C
|
The specifications (rules) for XML are maintained and published at the W3C site: http://www.w3.org/TR/REC-xml. These specifications are very formal and technical; they are intended for software developers rather than Web-page coders. For a taste of formal specifications, here is the definition of a well-formed XML document: [Definition: A textual object is a well-formed XML document if:]
Document
[1] document ::= prolog element Misc*
Matching the document production implies that:
[Definition: As a consequence of this, for each non-root element C in the document, there is one other element P in the document such that C is in the content of P, but is not in the content of any other element that is in the content of P. P is referred to as the parent of C, and C as a child of P.]
What does this mean? From this very abstract series of definitions we get to very specific rules. The following section explains how... |
|
|
Document structure
|
This refers to the definition of Document:
document ::= prolog element Misc*
This means a document needs to be made up of a prolog, followed by a single element, and may optionally have miscellaneous content at the end. |
|
The Prolog is the part that gives information about the type of document this is. In XHTML, these lines would be considered the prolog:
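As a minimal sketch, the page text below assumes a typical XHTML 1.0 Transitional prolog (an XML declaration followed by a DOCTYPE line that names the Transitional DTD). The sample page and the helper name `describe_prolog` are illustrative assumptions, not part of this module; the sketch only uses Python's standard `xml.dom.minidom` parser to show what a parser reads from the prolog.

```python
from xml.dom import minidom

# Hypothetical sample page: a typical XHTML 1.0 Transitional prolog
# (XML declaration plus DOCTYPE) followed by a minimal root element.
SAMPLE_PAGE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
    '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">\n'
    '  <head><title>Sample</title></head>\n'
    '  <body><p>Hello</p></body>\n'
    '</html>\n'
)


def describe_prolog(page_text):
    """Return the DOCTYPE details the parser found, or None if there are none."""
    document = minidom.parseString(page_text)
    doctype = document.doctype  # None when the page has no DOCTYPE line
    if doctype is None:
        return None
    return {
        "root_name": doctype.name,
        "public_id": doctype.publicId,
        "system_id": doctype.systemId,
    }


prolog_info = describe_prolog(SAMPLE_PAGE)
print(prolog_info["root_name"])
print(prolog_info["system_id"])
```

Note that the prolog lines have no end tags, and that the DOCTYPE names `html` as the root element the parser should expect next.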
|
|
|
Root Element
|
Element is defined here:
This means that a well-formed XHTML document must have a root element, which contains all the other elements. The root element in an XHTML document is the <html> element.
This can't be inside any other element - that is, there can't be any other tags that enclose the <html> </html> tags. (The tags that make up the Prolog don't have end tags.) |
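The single-root rule can be checked with a short sketch. It assumes Python's standard `xml.etree.ElementTree` parser as a stand-in for a browser's XML parser; the helper name `root_tag` and the two sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def root_tag(page_text):
    """Return the root element's tag, or None if the text is not well-formed."""
    try:
        return ET.fromstring(page_text).tag
    except ET.ParseError:
        return None


# Exactly one root element that encloses everything else.
one_root = "<html><head><title>T</title></head><body><p>Hi</p></body></html>"
# A second top-level element after the first one has closed.
two_roots = "<html><body><p>Hi</p></body></html><html></html>"

print(root_tag(one_root))
print(root_tag(two_roots))
```

The first string yields `html`; the second is rejected because a second element follows the root instead of sitting inside it.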
|
|
The Misc (miscellaneous) section is defined mainly to allow the file to contain "white space" after the element; the formal grammar also permits comments and processing instructions there. White space can be spaces (generated by pressing the space bar), tabs, or new-lines (from pressing the Enter or Return keys). These characters are often encountered at the ends of text files and may make it easier for text editing software to manipulate the text; the formal definition allows for this as a convenience. |
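A hedged sketch of what may follow the root element, again using Python's standard `xml.etree.ElementTree` parser. The helper name `is_well_formed` and the sample strings are assumptions made for illustration. The XML grammar for Misc covers white space, comments, and processing instructions, but not ordinary text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Trailing new-lines, tabs, and spaces after the root element are allowed.
trailing_space_ok = is_well_formed("<html></html>\n\t  \n")
# A trailing comment is also allowed.
trailing_comment_ok = is_well_formed("<html></html><!-- end of page -->")
# Ordinary text after the root element is not.
trailing_text_ok = is_well_formed("<html></html> stray text")

print(trailing_space_ok, trailing_comment_ok, trailing_text_ok)
```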
|
|
Parents and Children
|
The final part of the definition describes the concept of nesting. It means you can put one element inside another, but one element can't be partly inside and partly outside another. Think of a set of kitchen mixing bowls. Organizing the document this way results in a hierarchical structure - a bit like a family, more like a company. The final definition of the well-formedness series goes into detail about this:
The reason for this parent-child arrangement is partly to take advantage of a very efficient data structure that can be used in computers, called a tree structure. When a browser reads an XHTML file, rather than looking at it simply as a string of letters, it organizes it into a tree in which each element can have any number of children. This actually simplifies the way the files are processed and speeds up displaying the Web page. But if the elements (tags) aren't properly nested, processing is slowed down and confused, and the Web page may be displayed wrong. Here is a diagram of what a well-formed XHTML document looks like to a browser: |
|
This example is described in more detail in module X20d "CSS Anatomy". |
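The parent-child relationships described in this section can be explored with a small sketch. It assumes Python's standard `xml.etree.ElementTree` parser; the sample page `PAGE` and the helper `parent_map` are hypothetical, and the helper relies on each tag appearing only once in the sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page in which every tag name appears exactly once.
PAGE = (
    "<html>"
    "<head><title>Demo</title></head>"
    "<body><h1>Heading</h1><p>Some <em>emphasized</em> text</p></body>"
    "</html>"
)


def parent_map(root):
    """Map each element's tag to its parent's tag (tags are unique in PAGE)."""
    return {child.tag: parent.tag for parent in root.iter() for child in parent}


parents = parent_map(ET.fromstring(PAGE))
print(parents["em"])     # the parent of <em>
print(parents["title"])  # the parent of <title>

# Overlapping tags: <em> opens inside <p> but closes outside it.
try:
    ET.fromstring("<body><p>Some <em>text</p></em></body>")
    overlapping_accepted = True
except ET.ParseError:
    overlapping_accepted = False
print(overlapping_accepted)
```

Every element except the root has exactly one parent, which is why the overlapping example cannot be turned into a tree and is rejected.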
||
| Summary of Well-Formedness | Well-Formedness can be summarized in these three simple rules:
|
|
| |
|
|---|---|
|
XML Definition
|
A document is valid when all its elements (tags) and attributes conform to the formal definition of the language. These definitions are put into Document Type Definitions (DTDs). The DTD for XHTML is the document referred to in the Prolog of the Web page:
Don't worry - I won't quote any of the Document Type Definition here! But take a look at the DTD yourself: browse to http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd. This is a document designed, not primarily for humans, but for computer software to read. However, more than any human-oriented description, textbook or instructional material, it is the ultimate authority on what XHTML is. |
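To illustrate the difference between well-formed and valid, here is a minimal sketch. It assumes Python's standard `xml.etree.ElementTree` parser, which checks well-formedness only and does not validate against a DTD; the `<banana>` element is made up for the example.

```python
import xml.etree.ElementTree as ET

# <banana> is not an element defined for XHTML, so this page is not valid
# XHTML. Its tags nest properly, though, so it is still well-formed XML.
WELL_FORMED_NOT_VALID = "<html><body><banana>yellow</banana></body></html>"

root = ET.fromstring(WELL_FORMED_NOT_VALID)  # parses without error
banana_text = root.find("body/banana").text
print(banana_text)
```

A non-validating parser accepts this page without complaint; only a check against the DTD would reject it.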
|
Elements and Attributes
|
XML languages are built of elements, which are the chunks of text that include:
For example:
Empty elements consist of:
Empty element examples:
Attributes are modifiers or characteristics of elements. The attributes belonging to each type of element are described in the DTD. XML attributes consist of two parts:
Attribute examples:
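As a minimal sketch of how elements and attributes look to a parser, the snippet below uses Python's standard `xml.etree.ElementTree`. The sample markup (including the file name `logo.png`) and the helper name `parses` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# An ordinary element: start tag, content, end tag, plus one attribute.
paragraph = ET.fromstring('<p class="intro">Hello, world</p>')
print(paragraph.tag, paragraph.attrib, paragraph.text)

# An empty element: no content and no separate end tag.
line_break = ET.fromstring("<br />")
print(line_break.text, len(line_break))

# An empty element carrying two attributes, each a name and a quoted value.
image = ET.fromstring('<img src="logo.png" alt="Logo" />')
print(image.get("alt"))


def parses(markup):
    """Return True if the markup is accepted as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Attribute values must be quoted; HTML-style bare values are rejected.
unquoted_ok = parses("<p class=intro>Hello</p>")
print(unquoted_ok)
```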
|
|
Validation Service
|
Because XHTML is somewhat more complex and picky than HTML, W3C offers a validation service on-line. This determines whether a file is valid XHTML or not. (And remember that to be valid, a file must be well-formed as well.) At the validation service Website, you can have a file validated either on the Web or from a file on your computer. Selecting the file is quite easy. The hard part is understanding the error messages that come up! They can be very frustrating, and the best guide is experience. Validate all your XHTML files at W3C's MarkUp Validation Service: |
| |
|
|---|---|
|
| Module X11c: How to Do XHTML Right |
This document is part of a modular instruction series
in Computer Instruction. For more information, see the overview
or the list of modules in this series, X, XML, etc.
This document has been used in the following classes: INP 150. |
| History: |
Original: 9 September 2003, by Laurence J. Krieg
Last modification: Monday, 31-Aug-2009 11:48:07 EDT |
| Copyright |
Copyright © 2003, Laurence
J. Krieg, Washtenaw Community College Instructors: You may point to this file in your Web-based materials; however, its location may change without notice. Students: You are welcome to make a copy for your personal use. All other uses: Please contact the author, Laurence J. Krieg, for permission: krieg@ieee.org. |