Coding XHTML
|
How XHTML evolved from HTML and XML
|
HTML was Developed From SGML

SGML, the Standard Generalized Markup Language, was designed as a way to build information-formatting languages. HTML, designed by Tim Berners-Lee in 1989-90, is one of the languages derived from SGML.

XML is a "Meta-Language"

HTML and the World Wide Web showed the potential for sharing information among millions of computers around the world. XML was derived from SGML as a simplified and extended set of rules for designing information mark-up languages. XML isn't used directly, but is used to create new languages for specific information-sharing purposes. Because it's a language for creating languages, it is called a "meta-language".

XML works by having each language derived from it defined in a Document Type Definition (DTD). This is a file that defines each of the tags of the language and the rules for using them. In addition, information about how each tag appears is put into a Stylesheet.

XHTML adapts HTML to XML

With XML providing more possibilities for information-sharing, a new version of HTML was created to take advantage of the new features. That's XHTML. But the new capabilities come at a price: certain things have to be done differently, so for people who already know HTML, there are old habits to break and new ways to adopt. One of the main differences between HTML and XHTML is the need to separate structure from format...
|
Separating Structure from Format
|
What are "Structure" and "Format"?

Structure is the outline of the information - the skeleton. It consists of things like headings, paragraphs, lists, and tables.

Format is how the information is presented. It could include things like fonts, bold or italic type, and alignment.
Reasons for Separating Them

The main reason for separating structure and format is that there are so many ways information can be presented. The Web, and more generally the Internet, is used to transmit and present information in a stunning variety of media, from computer screens to text readers.

And more interestingly, computerized agents ("bots") are evolving to become information-search tools with the potential to leave today's search engines far behind.

For all but computer screens, HTML is a hassle to use, because it is very difficult or impossible to tell which tags can be ignored by (for example) a text reader, and which tags have information that helps make the meaning and structure clear.

How the Separation Works

XHTML asks us to avoid tags that are purely formatting devices, such as <font>, <b>, and <i>. Instead, we use "styles" - Cascading Style Sheets (CSS) that work somewhat like the styles in Microsoft Word. Styles allow us to define the formatting for each of the structural tags - the headings, paragraphs, lists, table parts, and the "strong" and "emphasis" tags. By changing the definition of the formatting for each style, one document can be used in a wide variety of ways.
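A minimal sketch of this idea, assuming an embedded style sheet in the document head. The page content, colors, and fonts are illustrative choices, not taken from the course materials:

```html
<head>
  <title>Structure and format, kept apart</title>
  <style type="text/css">
    /* Format: how each structural tag should look */
    h1     { font-family: Georgia, serif; color: #003366; }
    p      { font-family: Verdana, sans-serif; line-height: 1.4; }
    strong { font-weight: bold; color: #990000; }
    em     { font-style: italic; }
  </style>
</head>
<body>
  <!-- Structure: what each piece of information is -->
  <h1>Library Hours</h1>
  <p>The library is <strong>closed</strong> on holidays.
     Please return books <em>before</em> the break.</p>
</body>
```

Changing only the rules inside the style element changes how the page looks; the structural tags in the body stay exactly as they are.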
|
From the Top
|
What in the World is This?

When a browser - or more generally a User Agent (UA) - receives a file to display, the first thing it needs to know is what kind of file it is. "What in the world does this file contain?" A closely related second question is, "What am I supposed to do to present it to humans on this device?"

When HTML was the main language used on the Web, browsers had only to look at the <HTML> tag at the beginning, and use their own programmed instructions to read the file. With XML and its many derivative languages, the job is not so simple. Instead of being programmed directly to handle HTML, browsers are now programmed to handle XML descriptions of the language in the file. Since XML-based languages are constantly being developed and improved, browsers need to find an on-line definition of how to handle this variant of XML-based language.

So: the first thing in a document needs to be a reference to the language being used, and where to find its definition - the Document Type Definition, or DTD. In the version of XHTML we're using now, this means including a DOCTYPE statement at the beginning of the file:
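As a sketch, assuming the Strict variant of XHTML 1.0 (this text does not say which variant the module uses; the Transitional variant has its own identifier and DTD file), a DOCTYPE statement looks like this:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```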
By using this, all files that use XHTML refer to the same XML definition, located at the World Wide Web Consortium (W3C). W3C is the organization that defines HTML, XML, and XHTML - plus many other derivatives of XML. If you're curious about XML, you can look at the DTD yourself.

XHTML Namespace Declaration

In addition to the DTD, the browser uses a Namespace Declaration. This is a more complete form of the <HTML> tag used in older HTML documents:
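A typical form of this tag, assuming an English-language document (the comment line is only a placeholder for the rest of the page):

```html
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <!-- head and body go here -->
</html>
```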
This contains a reference (for human reading) to the documents that explain XHTML, and also defines the human language of the document - in this case "en" = English.

Character Set Definition

There are many types of characters out there, and you can't be too careful. ;-) Actually, the "characters" referred to are letters and symbols. If the World Wide Web is to be truly world-wide, it must accommodate the letters and symbols of all the major languages, including Greek, Hebrew, Arabic, Russian, Korean, Chinese, Japanese and many others - maybe even Elvish. Before a document can truly claim to be a citizen of the World Wide Web, it needs a "passport" defining what character set (alphabet) it uses. To specify a character set suitable for English, put this line in the Head area of your file:
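A sketch of such a line, assuming the ISO-8859-1 (Latin-1) character set, which covers English and most Western European languages. The choice of character set is an assumption here; UTF-8 is a common alternative, and the title text is a placeholder:

```html
<head>
  <!-- Declares the document's content type and character set -->
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
  <title>Page title goes here</title>
</head>
```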
|
Internal Differences
|
Once you've got those lines in the file, you can write normal HTML code, except for a few little differences...

The Case for Case

XHTML tags are all lower-case. Although most browsers understand the tags if you write them in upper-case or with initial caps, they aren't really defined that way. HTML has always had some elements that are case-sensitive, such as the character codes for accented letters.

Tags to Avoid

Since HTML was originally designed to have structure and format tags all mixed up together, we now have to avoid tags that simply format the text. The main ones to avoid include <font>, <b>, and <i>.

It's also better to use styles instead of putting align=x and valign=y in your headings, paragraphs, and table elements.

Close the @*#*%*& Tag Behind You!

The final difference is the need to close all tags. For every tag you start, you must have a closing tag as well - or put a forward-slash at the end of the tag. In HTML, a closing tag was optional for some tags, like <p> and <li>. Not in XHTML. Here's a list of tags you need to watch out for:
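As an illustration of the three differences described above (lower-case tags, styles in place of formatting tags and align attributes, and closed tags), here is a small sketch. The page content, the class name notice, and the image file sale.gif are hypothetical, and the tags shown are common cases rather than the module's own list:

```html
<head>
  <title>Sale</title>
  <style type="text/css">
    /* A style takes the place of align=center and a red <font> tag */
    p.notice { text-align: center; color: red; }
  </style>
</head>
<body>
  <!-- Tag and attribute names are written in lower case -->
  <p class="notice">Sale ends Friday<br />
    <img src="sale.gif" alt="Sale" /></p>
  <!-- br and img have no content, so they end with a forward-slash -->

  <ul>
    <!-- Every li and p gets its closing tag -->
    <li>Caf&eacute; tables</li>
    <li>Chairs</li>
  </ul>
</body>
```

The character code &eacute; is written in lower case on purpose: it produces a lower-case accented letter, while &Eacute; produces the capital.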
|
How You Can Tell
|
How can you tell if you got it right?

Most browsers still operate on the theory that people don't like to see error messages. When they run into a coding error, they just quietly ignore whatever they don't understand, and do the best they can. That may be OK for amateur Web designers, but not for professionals. Professionals need to get it right - not just for their reputation and self-esteem, but because correct, valid code is more likely to work on multiple browsers and is easier to maintain. So, we validate our code.

Various Validators

The W3C maintains an on-line validator which you can access at their site: http://validator.w3.org/. In addition to W3C, you can validate at the Web Design Group: http://www.htmlhelp.com/tools/validator/. At both places, you enter the URL of a Web page, or at W3C you can give it the path to a local file on your disk. The software goes over your page with a fine-tooth comb. In a few moments, it returns a list of errors - or if you're lucky or persistent, the good news that your file is valid.

Another way to access the validators is to use bookmarklets or favelets - bookmarks or favorites that automatically ask the browser to take the page you're looking at to a validator. One place where you can get them is at Gazingus: http://www.gazingus.org/js/?id=102.
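For a first test, a small complete page is handy. The following is a minimal sketch, assuming XHTML 1.0 Strict and the ISO-8859-1 character set; the title and text are placeholders. A page of this shape should come back from a validator as valid:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
  <title>Validator test page</title>
</head>
<body>
  <h1>Validator test page</h1>
  <p>A short paragraph, properly closed.</p>
</body>
</html>
```

Removing the closing p tag, or writing a tag in upper case, and validating again is a quick way to see what the error reports look like.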
|
Try It...
|
Play with HTML, XHTML, and Validators

The best way to understand XHTML and validators is to play with them. Here's something to play with...
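As a stand-in practice file (a hypothetical page, not necessarily the one used in class), here is a short page written with the old HTML habits described in this module: upper-case tags, formatting tags, an align attribute, unclosed tags, and none of the three opening declarations.

```html
<HTML>
<HEAD>
<TITLE>My Pets</TITLE>
</HEAD>
<BODY>
<H1 ALIGN="center">My Pets</H1>
<P><FONT COLOR="green">I have two pets.</FONT>
<P>Their names are <B>Rex</B> and <I>Tabby</I>.
<BR>
<UL>
<LI>Rex is a dog
<LI>Tabby is a cat
</UL>
</BODY>
</HTML>
```

To convert it, add the DOCTYPE, namespace, and character set lines; change the tags to lower case; replace the formatting tags and the align attribute with styles; close every tag; then run the result through a validator.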
When you have validated your code for this example, try the same with a file of your own. Choose a small file you've created for an earlier class or small project. |
| | |
|---|---|
| Audience | This module is for people who are familiar with HTML and want to learn how to code in XHTML. A knowledge of HTML is expected at least up through simple tables (module W24h). |
| Objectives | On successful completion of this module, you will be able to: |
| Module X10d: Coding XHTML | This document is part of a modular instruction series in Computer Instruction. For more information, see the overview or the list of modules in this series, X: "XML, XHTML, DHTML, and CSS". This document has been used in the following classes: INP 270. |
| History | Original: 20 January 2003. Last modification: Monday, 31-Aug-2009 11:48:07 EDT. |
| Copyright | Copyright © 2003, Laurence J. Krieg, Washtenaw Community College. Instructors: You may point to this file in your Web-based materials; however, its location may change without notice. Students: You are welcome to make a copy for your personal use. All other uses: Please contact the author, Laurence J. Krieg, for permission: krieg@ieee.org. |