|
Instructional Module X51
Document Type Definitions for XML |
In a Nutshell: A Document Type Definition (DTD) is a formal set of declarations, referenced from or contained in a document's DOCTYPE statement, that defines the elements and attributes found in the document type. The purpose of this module is to make it
possible to understand DTDs; other learning resources are available
for learning to create DTDs. Learn more in this section about...
|
Referring to DTDs |
DTDs can occur either within an XML document or in a separate document of their own. Except in textbook examples, though, document type definitions usually need to be public documents available through the Internet for all to use, so almost all XML documents based on DTDs refer to them in their prolog area. This module does not discuss the use of DTD statements within an XML document.
To get from the XML document to the DTD, there has to be a reference, known as a Document Type Declaration. Here is what each part of the declaration for XHTML means:
<! is the opening delimiter for DTD tags
DOCTYPE is the keyword that marks this as a document type declaration
html is the root element of the document
PUBLIC indicates that the DTD has a public identifier, reflecting the intention that it be available for everyone to use. The alternative is SYSTEM, which gives only a location for the DTD and is typical of a DTD meant for some organization's internal use.
> is the closing DTD tag delimiter |
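For reference, the document type declaration that the W3C publishes for XHTML 1.0 Strict is shown below. Which XHTML variant the original example used (Strict, Transitional, or Frameset) is an assumption; the parts described above are the same in all three.

```xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```

The first quoted string is the public identifier, and the second is the URL from which the DTD itself can be retrieved.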
Elements Defined |
The overall structure of DTDs is a series of element and attribute definitions. The order doesn't matter, and attributes can be listed separately, because they are linked to their elements by name. Here is a brief example:
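The declaration that originally appeared at this point is not preserved. Judging from the explanation that follows, it would have been a single element declaration along these lines (a reconstruction, not necessarily the author's exact text):

```xml
<!ELEMENT example (#PCDATA)>
```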
ELEMENT (in upper-case) is the keyword introducing the definition of an element type.
example (case-sensitive) is, in this case, the name of the root element of this type of document.
(#PCDATA) is the type of data accepted in this element. PCDATA (parsed character data) is the most common data type: plain text, including entities, the character sequences that can be rendered as other characters or sequences, like &eacute; rendered as é.
Because nesting elements is such an important part of XML, element definitions include any nested elements they might (or might not) contain:
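A declaration consistent with the description that follows can be sketched as below. It is reconstructed from the text, since the original example is missing; the declarations for the three child elements are an assumption added so that the fragment is complete.

```xml
<!-- The parent lists its children in the order they must appear -->
<!ELEMENT example (title, description, date)>
<!-- Assumed: each child element simply holds plain text -->
<!ELEMENT title (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT date (#PCDATA)>
```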
Here, element example must contain exactly one title element, one description element, and one date element, in that order. What if one or more contained elements are optional?
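Based on the explanation that follows (the original example is not preserved), a content model with optional, repeatable children could be written as:

```xml
<!-- title and date may appear any number of times, or not at all -->
<!ELEMENT example (title*, description, date*)>
```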
The * after an element name means the parent element may have zero or more of that element. In this case, there need not be a title or a date, but there could be any number of them; and there must be exactly one description. Some of the other possibilities: + means one or more, and ? means zero or one. Instead of a comma between child elements, you can put a vertical bar character | which means there's a choice of elements; when a choice is combined with a comma-separated sequence, the choice needs its own set of parentheses. Ask yourself: What does this mean?
<!ELEMENT example (title*, description+, date?, (author | source))>
Answer: the example element can have:
zero or many title elements
at least one, but potentially many description elements
date is optional, but no more than one can occur
and either one author or one source element must be present
Two more features of DTDs: first, it's possible to mix text and elements; second, you can create groups that repeat.
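A mixed-content declaration of the kind described here might look like the following sketch. The element names furthermore and heading come from the text; the declaration for heading is an assumption. XML requires #PCDATA to come first in such a group and the group to end with *.

```xml
<!-- Mixed content: text and heading elements, repeated in any order -->
<!ELEMENT furthermore (#PCDATA | heading)*>
<!-- Assumed: a heading holds plain text -->
<!ELEMENT heading (#PCDATA)>
```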
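Element furthermore can contain either text or a heading element, and these can be repeated any number of times in any order. (In XML, mixed content of this kind must use the * indicator, so the DTD itself cannot require that at least some text or a heading be present.) |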
Attributes Defined |
Attributes
Recall that an attribute is a modifier of an element, contained in the first tag of the element, like type in the <furthermore> element:
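An element of this shape could look like the line below. The attribute value "note" and the text content are hypothetical, chosen only for illustration.

```xml
<furthermore type="note">Additional remarks go here.</furthermore>
```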
Here's how a DTD would set this up:
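A minimal sketch of such a declaration follows. The data type (CDATA) and the #IMPLIED keyword are assumptions, since the original example is not preserved.

```xml
<!-- ATTLIST, then the element name, attribute name, data type, and default rule -->
<!ATTLIST furthermore type CDATA #IMPLIED>
```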
Notice these points:
Attributes Required?
DTDs offer several options for indicating whether an attribute is required, not required, or has a default value.
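The options XML provides can be sketched in one attribute list. The attribute names and values here are hypothetical; the keywords #REQUIRED, #IMPLIED, and #FIXED are part of XML itself.

```xml
<!-- Hypothetical attributes on the furthermore element -->
<!ATTLIST furthermore
  type     CDATA            #REQUIRED
  lang     CDATA            #IMPLIED
  status   (draft | final)  "draft"
  version  CDATA            #FIXED "1.0">
```

Here type must always be supplied, lang may be omitted, status is treated as "draft" when omitted, and version always has the value "1.0" whether or not it is written out.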
Attribute Data Types
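The most commonly seen attribute data types can be illustrated with one declaration. The movie element and its attribute names are hypothetical; the type keywords are the ones XML defines.

```xml
<!-- Hypothetical attributes showing the main XML attribute types -->
<!ATTLIST movie
  title      CDATA         #REQUIRED
  id         ID            #REQUIRED
  sequel-of  IDREF         #IMPLIED
  rating     (G | PG | R)  #IMPLIED
  genre      NMTOKEN       #IMPLIED
  trailer    ENTITY        #IMPLIED>
```

CDATA is any text; ID is a name that must be unique within the document; IDREF must match some element's ID; a parenthesized list restricts the value to the choices shown; NMTOKEN is a single name-like token; ENTITY names an unparsed entity. XML also defines the plural forms IDREFS, NMTOKENS, and ENTITIES, plus NOTATION.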
|
Entities Defined |
What is an Entity? What is it For?
An entity in XML (and before that, in SGML) is a type of abbreviation or short string that can be used for a number of purposes:
How Entities are Defined
Let's define an entity that expands to "film studio, production company, or sponsoring organization" for our movies ontology.
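Using the entity name fps and the replacement text given in the text, the declaration can be written as follows (a reconstruction, since the original example is missing):

```xml
<!ENTITY fps "film studio, production company, or sponsoring organization">
```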
We can then use the entity fps this way in an XML document:
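A reference to the entity is written as an ampersand, the entity name, and a semicolon. The definition element and its wording below are hypothetical.

```xml
<definition>The &fps; responsible for releasing the movie.</definition>
```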
The XML software that reads this file will interpret it as:
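Taking a hypothetical input line such as `<definition>The &fps; responsible for releasing the movie.</definition>`, and the replacement text given earlier for fps, the application would receive:

```xml
<definition>The film studio, production company, or sponsoring organization responsible for releasing the movie.</definition>
```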
Unparsed Entities
Suppose we want to include movie trailers in our movie XML file. The trailer is in a video format such as QuickTime or AVI, and XML software doesn't (normally) handle this type of data. Since XML software does not try to parse ("understand") these files, there is a special way to include media files: video, sound files, and graphic images of all kinds. This uses a type of entity called an unparsed entity. We could define an element and attribute to refer to trailers, and we would also have to indicate what medium it is or what application can handle it (for example, a QuickTime player). This could be declared in a DTD file:
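A sketch of such declarations is shown below. The notation name quicktime, its identifier, and the trailer element with its source attribute are all hypothetical names chosen for illustration.

```xml
<!-- Tell the parser what kind of data the name "quicktime" stands for -->
<!NOTATION quicktime SYSTEM "video/quicktime">
<!-- The trailer element is empty; its source attribute names an unparsed entity -->
<!ELEMENT trailer EMPTY>
<!ATTLIST trailer source ENTITY #REQUIRED>
```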
Notes about these declarations:
The entity that contains the media object reference can be declared in the DTD file, but since a separate entity is needed for each media object, it makes more sense to declare it in the XML file where it will be used.
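One way to do this is in the internal subset of the document's DOCTYPE declaration. The sketch below assumes a hypothetical movies.dtd that declares a movie element, a trailer element with a source attribute of type ENTITY, and a notation named quicktime; the entity name and file path are also made up.

```xml
<?xml version="1.0"?>
<!DOCTYPE movie SYSTEM "movies.dtd" [
  <!-- One unparsed entity per media file; NDATA links it to a notation -->
  <!ENTITY casablanca-trailer SYSTEM "trailers/casablanca.mov" NDATA quicktime>
]>
<movie>
  <trailer source="casablanca-trailer"/>
</movie>
```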
Notes about this code:
How are Entities Used Within DTDs?
Entities can be used to store part, or all, of a DTD file. This can be useful for:
When entities are used this way, they are known as parameter entities. A couple of brief examples will illustrate what this looks like:
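The two common forms can be sketched as follows. The entity names and the file common.dtd are hypothetical, and the fragment is assumed to live in an external DTD file, which is where XML allows parameter entity references inside other declarations.

```xml
<!-- Internal parameter entity: a reusable piece of a content model -->
<!ENTITY % text-or-heading "#PCDATA | heading">
<!ELEMENT furthermore (%text-or-heading;)*>

<!-- External parameter entity: pulls in declarations from another file -->
<!ENTITY % common SYSTEM "common.dtd">
%common;
```

Note the percent sign: it appears in the declaration after ENTITY and replaces the ampersand when the entity is referenced.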
In effect, file furthermore.dtd will look like this:
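As an illustration under assumed names: if furthermore.dtd declared a hypothetical parameter entity text-or-heading with the replacement text `#PCDATA | heading` and used it as `(%text-or-heading;)*`, the parser would treat the file as if it contained:

```xml
<!ELEMENT furthermore (#PCDATA | heading)*>
```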
Notes about parameter entities:
So as you can see, entities are used in several different ways. |
More about DTDs |
This is just a quick overview of DTDs. Get more information! Consult these references:
Tutorials
References
List |
Interpreting DTDs | |
|---|---|
|
Now it's time to practice what you know about DTDs and try interpreting some. The general idea is to read a fairly simple DTD, and explain in English what the elements, attributes, and entities are for. There are three tasks in this section; the first two are fairly simple, and the third a little more challenging, but not too difficult for starters. |
|
Task 1 |
We'll start with a simple DTD from a presentation to the Reuters News Service coders, at http://www.fisd.net/presentations/Reuters500/tsld004.htm. Questions to answer:
|
Task 2 |
Let's take a look at another simple example and work through it. Thanks to Elizabeth Castro's Cookwood Press Web site and her book, XML for the World Wide Web (Peachpit Press, 2001). Browse to the Endangered Species DTD at http://www.cookwood.com/xml/examples/dtd_creating/end_species.dtd. Study the DTD to find answers to these questions:
|
Task 3 |
You may be already familiar with the Dublin Core Metadata Initiative (DCMI) and its 15 elements, if you used them in completing assignment module X65h. Though DCMI does not use its DTD as the primary definition of its elements, it makes one available for information purposes. Browse to the DCMI DTD at http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-xml-dtd.dtd and answer the following questions:
|
| |
|
|---|---|
|
Audience
|
|
| Objectives | |
| Module X51: Document Type Definitions for XML |
This document is part of a modular instruction series in Computer Instruction. For more information, see the overview or the list of modules in this series, X: XML, XHTML, DHTML, CSS. This document has been used in the following classes: CIS 179. |
| History |
Original: 7 November 2006, by Laurence J. Krieg
Last modification: Monday, August 31, 2009 |
| Copyright |
Copyright © 2007, Laurence J. Krieg, Washtenaw Community College
Instructors: You may point to this file in your Web-based materials; however, its location may change without notice. Students: You are welcome to make a copy for your personal use. All other uses: Please contact the author, Laurence J. Krieg, for permission: krieg@ieee.org. |