Instructional Module X50c
What are XML Namespaces? | |
|---|---|
In a Nutshell: A "namespace" is simply a collection of names used for some purpose or other. When XML files are put on the Internet, names of elements need to be kept unique regardless of who created them. To do that, namespaces are created using URIs as a way to distinguish one namespace from another. Within a document, an abbreviation is created to make it easy to distinguish element names from different namespaces. Learn more in this section about...
What is a Namespace? |
OK, let's start by understanding in general what a "namespace" is: A namespace is a collection of names used for some purpose or other. Just what does this mean?
Ask yourself: How do humans get around the problem of multiple people with the same name?
Answer: Lots of ways, including nicknames and Social Security Numbers.
Ask yourself: How do computers (computer scientists, really) get around the problem of multiple variables with the same name?
Answer: By inventing languages with local variables, and then avoiding global variables; by prefixing unique identifiers to variable names when variable names must be global. |
How does XML Create and Use Namespaces? |
XML uses names for two things: elements, and attributes of the elements. If XML files all contained element and attribute names invented by the same person who creates the file, and if XML files were not shared on the Internet, there would be no problem. Since neither of these conditions is true, we've got a problem! How did XML get into this trouble?
How did W3C Get Us Out of the Problem?
W3C's XML Core Working Group provided a two-part solution in 1999: declare a namespace, then use its "nickname" to mark the names that belong to it.
Declaring a Namespace
First, you need to declare the namespace. There are some variations: you can declare it on any element, and you can define a "nickname" or not, at your convenience. The Namespaces Recommendation declares namespaces only through xmlns attributes on elements, not in the prolog (the part before the root element).
More commonly, the namespace is declared in the root element - the element that contains all the others. First, without a nickname:
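A minimal sketch of such a declaration, assuming a root element named movies and the namespace URI used throughout this module:

```xml
<!-- Default namespace: nothing appears between "xmlns" and the equals sign -->
<movies xmlns="http://poggin.wccnet.edu/xml/movies">
  <!-- child elements go here -->
</movies>
```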
This uniquely identifies the namespace and, because there is no "nickname", makes it the default namespace for all unprefixed element names between <movies> and </movies>. (A default namespace does not apply to attribute names; an unprefixed attribute is in no namespace.) In the root element with a "nickname":
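A minimal sketch of a declaration with a "nickname", assuming the prefix m and the same root element name:

```xml
<!-- "m" is the nickname; it follows "xmlns:" in the declaration -->
<m:movies xmlns:m="http://poggin.wccnet.edu/xml/movies">
  <!-- child elements go here -->
</m:movies>
```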
This makes it possible to use the "nickname" m to identify an element name as part of the namespace declared as "http://poggin.wccnet.edu/xml/movies".
Using Namespace "Nicknames"
The "nickname" is put in front of an element or attribute name, separated from it by a colon ":" (and no space).
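As an illustration, here is a small sketch in which every name carries the prefix. The element names movie and title and the attribute year are hypothetical, chosen only for this example:

```xml
<m:movies xmlns:m="http://poggin.wccnet.edu/xml/movies">
  <!-- Both the element names and the attribute name are prefixed with m: -->
  <m:movie m:year="1942">
    <m:title>Casablanca</m:title>
  </m:movie>
</m:movies>
```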
In the example above, every element and every attribute is prefixed with the "nickname" m we gave to the namespace. This is good, but it can get a bit tiresome if you're typing it all yourself. Another option is to declare the namespace as the default, so you don't have to use the "nickname". Here's how:
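A sketch of the same kind of document using a default namespace instead of a prefix, again with the hypothetical names movie, title, and year:

```xml
<movies xmlns="http://poggin.wccnet.edu/xml/movies">
  <!-- Unprefixed element names now belong to the movies namespace -->
  <movie year="1942">
    <title>Casablanca</title>
  </movie>
</movies>
```

One design point to keep in mind: the default namespace covers the unprefixed element names only. The unprefixed year attribute is not placed in the movies namespace.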
Let's look at an example that uses two namespaces: our own for movies, and the Dublin Core namespace.
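A minimal sketch of such a document. The element name movie and the attribute year are hypothetical; the Dublin Core namespace URI is the published one for the Dublin Core element set:

```xml
<movies xmlns="http://poggin.wccnet.edu/xml/movies"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
  <movie year="1942">
    <!-- title comes from Dublin Core, so it carries the dc prefix -->
    <dc:title>Casablanca</dc:title>
  </movie>
</movies>
```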
What we've done here is to declare our movies namespace as the default. We did this by leaving :m out of the xmlns declaration. But we used the Dublin Core title element, prefixing it with the "nickname" dc declared in the movies element. A note about the terminology:
Official and Unofficial Namespace Use |
The W3C Namespace Recommendation gives the syntax for creating namespaces. Its sole purpose is to distinguish between names used in different ontologies. The URI used in declaring a namespace looks as if it should lead to a definition of that namespace, but officially it doesn't. In fact, the URI does not even have to be a URL (a Uniform Resource Locator) because it need not have any location associated with it. If you prefer, you could use your telephone number as a URI - that's officially legitimate, as defined in RFC 3986, which defines URIs!
Officially, any URI that is unique in the global context is a legitimate namespace identifier. But... widely accepted "best practice" is to use a URL that points to the definition of the namespace - either a DTD or a schema. This is helpful for two reasons:
So whenever possible, a namespace URI should point to a definition of the namespace. Again: this is best practice, not an official requirement. |
What is an XML DTD? | |
|---|---|
In a Nutshell: A DTD is a Document Type Definition. The DTD defines each element, its data type, and its attributes. A DTD's statements begin with <! and end with >. Learn more in this section about...
DTDs: the Concept |
DTD: Document Type Definition
DTDs, in their present form, emerged as part of SGML in the 1980s. They use Augmented Backus-Naur Form, a system for expressing technical specifications and rules, widely used in computer science. The purpose of a DTD is to define the elements and attributes of a document type - the variety of markup, whether XML or SGML - in such a way that both humans and software can use it. |
Recognizing DTDs |
DTDs can be either separate documents, or part of the XML file. Either way, you can recognize them by their delimiters: each statement begins with <! and ends with >.
XML documents often have a Document Type Declaration in their prolog. This refers to the Document Type Definition elsewhere. For example, an XHTML document declares its type this way:
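For instance, the Document Type Declaration for XHTML 1.0 Strict, one of the XHTML variants, looks like this:

```xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```

The first quoted string is the public identifier of the document type; the second is the URL of the DTD file itself.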
The Document Type Definition itself (either in a separate document or in the XML file) typically consists of definitions of entities, elements, and their attributes, in the general form:
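A minimal sketch of a DTD placed inside the XML file itself. The entity name wcc and the element and attribute names are hypothetical, invented for this illustration:

```xml
<?xml version="1.0"?>
<!DOCTYPE movies [
  <!-- An entity: wcc will be replaced by the quoted text -->
  <!ENTITY wcc "Washtenaw Community College">
  <!-- Elements and what each may contain -->
  <!ELEMENT movies (movie*)>
  <!ELEMENT movie (title, owner)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT owner (#PCDATA)>
  <!-- An optional attribute of the movie element -->
  <!ATTLIST movie year CDATA #IMPLIED>
]>
<movies>
  <movie year="1942">
    <title>Casablanca</title>
    <owner>&wcc;</owner>
  </movie>
</movies>
```

When the file is processed, &wcc; in the owner element is replaced by the text given in the entity definition.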
More about DTDs in module X51c. |
What is an XML Schema? | |
|---|---|
In a Nutshell: An XML schema is a document that uses XML to define an XML language type, or document type. You can recognize a schema because it uses standard XML delimiters, begins with the <xs:schema> element as its root, and generally refers to the namespace of http://www.w3.org/2001/XMLSchema. Learn more in this section about...
The Idea of Schemas |
The idea behind schemas is to define types of XML using XML syntax. This makes simpler software able to validate XML files, because it only needs to understand XML syntax, rather than having to interpret ABNF as it would with a DTD. In addition, schemas are able to define more types of data format than DTDs do. This makes it possible for software to check much more precisely the validity of data entered in these files. In turn, this makes schema-defined XML more compatible with database management systems and other software that requires strict adherence to data types and formats. |
Recognizing Schemas |
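XML schema documents are easy to recognize: they use standard XML delimiters, have <xs:schema> as their root element, and generally refer to the namespace http://www.w3.org/2001/XMLSchema.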
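A minimal sketch of a W3C schema, assuming the movies namespace from earlier in this module and hypothetical element and attribute names (movies, movie, title, year):

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://poggin.wccnet.edu/xml/movies"
           xmlns="http://poggin.wccnet.edu/xml/movies"
           elementFormDefault="qualified">
  <!-- The root element: a sequence of zero or more movie elements -->
  <xs:element name="movies">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="movie" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
            </xs:sequence>
            <!-- xs:gYear restricts the value to a calendar year -->
            <xs:attribute name="year" type="xs:gYear"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The year attribute shows the kind of data typing that schemas add: a validator can reject a value that is not a year.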
Schemas can be constructed using any of several schema systems, the most widely used of which is that provided by W3C. More about schemas in module X52c. |
What's the Difference between DTDs and Schemas? | |
|---|---|
In a Nutshell: DTDs and schemas both provide ways to describe XML languages. DTDs are an older way that uses ABNF; schemas are a newer way that uses XML itself. Schemas provide greater simplicity and flexibility, but lack the ability to define "entities". Learn more in this section about the difference between DTDs and schemas, and recommendations about which to use.
Comparing DTDs and Schemas |
DTDs appeared with SGML, the ancestor of XML, in the early 1980s. Schemas, on the other hand, began appearing in the late 1990s along with XML itself. This chart outlines the differences:
Ask yourself, before looking at the next section: Which would you recommend using for a new XML project: DTD, or schema?
Answer: See "Recommendations" below. |
Recommendations |
Given the "competition" between DTDs and the various types of schema languages, what do you think is a wise course to follow when beginning a new project?
First of all, most large projects don't start with totally new XML definitions. You would probably want to use DTDs if you're extending a large standard based on DTDs, or schemas if you're extending one based on schemas. But that's not a hard-and-fast rule, since nothing prevents using a DTD in one namespace and a schema in another. There's a lot to be said for consistency, though!
The added data typing and checking available with schemas makes it possible to keep the data more consistent and accurate. All newer software is able to deal with schemas, particularly the W3C standard variety.
Among the varieties of schema available, the W3C standard is almost always the best type to choose. Some varieties offer special features or capabilities not available in the W3C standard, but as with any choice between standard and non-standard, the standard offers the best long-term compatibility and path to the future.
What about DTDs? The one feature offered by DTDs that is not available in schemas is the ability to define entities. Entities are short sequences of characters beginning with & and ending with ; that are defined to become other sequences of characters when an XML file is processed. The most familiar of these are the ones "built-in" to the XML standard, such as: &lt; &gt; &amp;
which are rendered as: < > & (XHTML documents also use entities such as &eacute;, rendered as é, but those are defined by the XHTML DTD rather than built into XML. For more detail on standard entities, see module W22f.)
You don't need a DTD to use the built-in entities. However, DTDs let you custom-build your own entities, so that longer sequences of characters - such as standard names and addresses that are repeated frequently in a data file - can be abbreviated. Although convenient, there is nothing in custom-built entities that can't be handled fairly easily by software in other ways.
The most frequent circumstance making it necessary to use DTDs is older, specialized software. All the general purpose XML editors are comfortable with schemas, but older, specialized software may be dependent on DTDs. This might be the case if your project calls for building on an existing ontology that defines its XML using DTDs, and has not been updated to handle schemas.
Bottom line: unless there is some circumstantial reason not to, use W3C schemas. |
Audience
|
|
| Objectives | |
| Module X50c: DTDs, Schemas, and Namespaces |
This document is part of a modular instruction series in Computer Instruction. For more information, see the overview or the list of modules in this series, X: XML, XHTML, DHTML, CSS. This document has been used in the following classes: CIS 179. |
| History |
Original: 6 November 2006, by Laurence J. Krieg
Last modification: |
| Copyright |
Copyright © 2007, Laurence J. Krieg, Washtenaw Community College
Instructors: You may point to this file in your Web-based materials; however, its location may change without notice. Students: You are welcome to make a copy for your personal use. All other uses: Please contact the author, Laurence J. Krieg, for permission: krieg@ieee.org. |