|
Instructional Module X01c
|
|
The Semantic Web is a vision and a coming reality. The vision is both wide and deep; the reality is both technical and philosophical. It is a vision of computers that are much smarter and more helpful than they are today, and of a much broader purpose for the World Wide Web. The Semantic Web is the vision of Tim Berners-Lee, James Hendler, Ora Lassila, and their colleagues at the World Wide Web Consortium.
In this module, we'll be looking at the vision, how it will work, and how all Web designers will collaborate to make the vision become a reality. |
|
Agents at Work
|
Read this short "story" written by Berners-Lee, Hendler, and Lassila to illustrate a possible Semantic Web scenario in their Scientific American article of May 2001:
The entertainment system was belting out the Beatles' "We Can Work It Out" when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His sister, Lucy, was on the line from the doctor's office: "Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I'm going to have my agent set up the appointments." Pete immediately agreed to share the chauffeuring.
At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules. (The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)
In a few minutes the agent presented them with a plan. Pete didn't like it - University Hospital was all the way across town from Mom's place, and he'd be driving back in the middle of rush hour. He set his own agent to redo the search with stricter preferences about location and time. Lucy's agent, having complete trust in Pete's agent in the context of the present task, automatically assisted by supplying access certificates and shortcuts to the data it had already sorted through. Almost instantly the new plan was presented: a much closer clinic and earlier times - but there were two warning notes. First, Pete would have to reschedule a couple of his less important appointments. He checked what they were - not a problem. The other was something about the insurance company's list failing to include this provider under physical therapists: "Service type and insurance plan status securely verified by other means," the agent reassured him. "(Details?)" Lucy registered her assent at about the same moment Pete was muttering, "Spare me the details," and it was all set. (Of course, Pete couldn't resist the details and later that night had his agent explain how it had found that provider even though it wasn't on the proper list.)
[Scientific American, May 2001: "The Semantic Web"]
A lot of everyday frustrations are embodied in this story, but their solution was worked out much more quickly and with less aggravation than it would be today. This illustration, which accompanied the article, may be helpful:
|
||
|
When We Have Agents
|
Agents, of course, are not shadowy cloak-and-dagger figures. They are computer programs that are able to take our general requests for information, goods, services, or actions, and work on our behalf to provide what we want.
Agents are already in use for limited purposes. "Bots" (not game-bots or battle-bots!) are programs that can do some of these tasks - take a look at their capabilities at BotSpot (http://www.botspot.com/). One reason "bots" aren't more widely used is that they're severely limited. Why is this? Because information on the Web is primarily intended for people, not for machines. People have a much deeper fund of information about the world and about language than computers do. To make bots and agents really useful, we need to give them the kind of information they need: semantic information. So what's that? According to Webster:
In other words, if you ask the computer to set up a physical therapy appointment for your mother, how does it know what you mean? The "semantic" Web is a web of meanings - meanings computers can understand. |
||
|
Pieces of the Puzzle
|
The quest for computers with understanding has been a long and hard one. In 1968, when Arthur C. Clarke and Stanley Kubrick wrote 2001: A Space Odyssey, they fully expected that by 2001, computers and humans would be able to converse with each other. Although computers have done many things these two visionaries did not expect, understanding human language is not one of them. The gulf between human meaning and computer understanding has been too wide to bridge - so far. Have we learned enough now to bridge this gulf? Tim Berners-Lee and his colleagues believe we've come close. They believe it can be done - to a useful extent - by putting the World Wide Web together with pieces of a puzzle that are now taking shape:
Let's take a look at these "puzzle pieces" in the next section. |
||
|
Resource Description Framework (RDF)
|
The first puzzle piece is a tool for developing and extending languages that computers can understand. Another way to look at it is as a language for creating machine-readable vocabularies. The tool for the Semantic Web is known as the Resource Description Framework, RDF. These are the basic ideas behind RDF:
|
|||
|
How Does This Relate?
|
Subject - Object - Predicate
The relationship between these basic ideas is symbolized in diagrams such as this:
Subjects and Objects are shown here in the green ovals. The Predicate is the line connecting the two. The diagram represents the relationship:
Notice that all the elements in this example look like URLs. Objects can also be represented by simple text notations, like "Smith, Hieronymous". The URI references are to resources. That's because the purpose of agents is to bring resources together with the people who need them.
Sources
On the Semantic Web, resources are represented by Uniform Resource Identifiers, or URIs. A URI has the same form as a URL, but can represent more than just Web pages. A resource can be a person, a service, an image, a Web page, a database, an entry in a database, or anything that can be controlled by computers. Anyone who can create a Web page can also create a resource. Resources are created in much the same way as Web pages, using RDF/XML notation.
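To make the subject - predicate - object idea concrete, here is a minimal Python sketch of how a program might hold RDF statements in memory. It is only an illustration: the `example.org` URIs, the `fullName` term, and the helper names `Triple`, `is_uri`, and `describe` are assumptions made up for this example, not part of RDF or of any library. The Dublin Core `creator` URI is a real, published term.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a subject, a predicate, and an object."""
    subject: str    # URI of the resource being described
    predicate: str  # URI naming the relationship
    obj: str        # URI of another resource, or a plain-text literal


def is_uri(value: str) -> bool:
    """Rough check (an assumption for this sketch): known URI schemes only."""
    return value.startswith(("http://", "https://", "mailto:", "urn:"))


def describe(t: Triple) -> str:
    """Render a triple the way the diagrams do: oval, line, oval."""
    kind = "resource" if is_uri(t.obj) else "literal"
    return f"{t.subject} --{t.predicate}--> {t.obj} ({kind})"


# Hypothetical statements; only the Dublin Core predicate is a real term.
triples = [
    Triple("http://example.org/modules/X01c",
           "http://purl.org/dc/elements/1.1/creator",
           "http://example.org/staff/85740"),
    Triple("http://example.org/staff/85740",
           "http://example.org/terms/fullName",
           "Smith, Hieronymous"),
]

for t in triples:
    print(describe(t))
```

The first statement has a URI as its object, so it links two resources; the second has a simple text notation as its object, like the "Smith, Hieronymous" example above.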
|
|||
|
Machine-Readable Vocabulary
|
In order for computers to understand what we want to express in RDF, we need to turn the diagrams into a machine-readable language. This language is available: it's the eXtensible Markup Language, XML.
"The eXtensible Markup Language is accepted as THE emerging standard for data interchange on the Web. XML allows authors to create their own markup (e.g. <AUTHOR>), which seems to carry some semantics. However, from a computational perspective tags like <AUTHOR> carries as much semantics as a tag like <H1>. A computer simply does not know, what an author is and how the concept author is related to e.g. a concept person. XML may help humans predict what information might lie "between the tags" in the case of <trunk></trunk>, but XML can only help. For an XML processor, <trunk> and <i> and <bookTitle> are all equally (and totally) meaningless. Yes, meaningless. This has direct consequences for economy on the web."
XML is flexible, powerful, and relatively simple. It has already been used to create many languages that are now in use for a wide variety of purposes. The best known of these is probably XHTML, the eXtensible HyperText Markup Language.
The XML-based language for the Semantic Web is called RDF/XML. It is used in many contexts for many purposes. Several organizations have used RDF/XML to create vocabularies - technically referred to as ontologies. Best-known examples are:
Note: These are not (X)HTML documents, so if you click on these links, you'll need to view the source code to see what they really contain. Because these are on the Web, they can be referred to from any Web-connected computer in the world. And because they are in machine-readable RDF/XML, they can be used by agents. |
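As a rough illustration of what "machine-readable" means here, the sketch below uses Python's standard `xml.etree.ElementTree` module to pull subject - predicate - object statements out of a small RDF/XML fragment. The fragment and the `example.org` URI are hypothetical, and `extract_triples` is a name invented for this example. It handles only the simplest RDF/XML pattern (an `rdf:Description` with property elements); a real agent would use a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical RDF/XML describing this module with Dublin Core terms.
RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/modules/X01c">
    <dc:title>The Semantic Web</dc:title>
    <dc:creator>Laurence J. Krieg</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml: str) -> list:
    """Return (subject, predicate, object) tuples from simple RDF/XML."""
    root = ET.fromstring(rdf_xml.strip())
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes tags as "{namespace}local"; joining the
            # two parts gives the full predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            # The object is either a linked resource or literal text.
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


for triple in extract_triples(RDF_XML):
    print(triple)
```

Because each predicate expands to a full URI, an agent that knows the Dublin Core vocabulary can recognize `title` and `creator` no matter which site published the description.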
|||
|
RDF Sprouts
|
RDF has sprouted several specialized vocabularies. These links take you to a description of each vocabulary in the RDF Primer:
|
|||
What's needed before we can get software agents to work effectively for us? In addition to building RDF-based systems, individual Web designers will need to create pages that can be better understood by agents. The main thing we can do is to separate content from structure and presentation in our pages. What does this mean? We'll take a brief look in this section; more detail is found in module X13c. |
||
|
Separation of Content from Presentation
|
Content is the informational "payload" of a Web page. This includes:
Presentation is the manner in which the content is made available to the user.
Why? If we separate content from presentation, agents will be able to get the content without being confused by irrelevant presentation code. Also, the same information "payload" can be delivered in multiple contexts, to multiple devices, by multiple agents. |
|
|
Separation of Presentation from Structure
|
Structure is the logical framework of the content. It's the outline or organizational skeleton behind the contents.
Why? In traditional HTML, presentation is often used to convey an idea of the structure. For example, instead of using an <h1> tag for the main heading, a designer might use a <p> tag with a <font> tag making it large and bold. This conveys the importance of that line to a human viewer, but conveys nothing to an agent. Separating structure from presentation allows the use of structural tags, which are much more meaningful to software agents. |
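The difference can be seen from the agent's side with a small Python sketch. It uses the standard `html.parser` module to build an outline from heading tags; the class name `HeadingCollector`, the function `outline`, and the two sample pages are assumptions made up for this illustration, not a description of how any real agent works.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the text of <h1>..<h6> elements, as a simple agent might."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []    # (tag, text) pairs in document order
        self._current = None  # heading tag currently open, if any
        self._buffer = []     # text pieces seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.headings.append((tag, "".join(self._buffer).strip()))
            self._current = None


def outline(html: str) -> list:
    """Return the headings an agent can find in a page."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.headings


# Two hypothetical pages that look alike to a person in a browser.
STRUCTURAL = "<h1>The Semantic Web</h1><p>A vision and a coming reality.</p>"
PRESENTATIONAL = ('<p><font size="7"><b>The Semantic Web</b></font></p>'
                  "<p>A vision and a coming reality.</p>")

print(outline(STRUCTURAL))      # the heading is found
print(outline(PRESENTATIONAL))  # nothing marks the first line as a heading
```

Both pages show a large, bold title to a person, but only the structural version gives the program anything to build an outline from.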
|
|
XML and CSS to the Rescue
|
By redefining HTML using XML, the W3C gave us a tool to make Web pages consistent with RDF and other XML-based languages. This means agents will be able to process and display many more kinds of information. Cascading Style Sheets, CSS, allows the page designer to determine the presentation of any HTML or XML element. The designer can even specify one presentation for a normal computer screen, and another for a printer. Soon, other devices will be incorporated in the specifications of CSS as well. Let's see what future Web pages will look like... |
|
|
What Future Web Pages Will Look Like
|
Here are some general guidelines to help your Web pages look good to people as well as software agents:
|
Audience
|
|
| Objectives |
On successful completion of this module, you will be able to:
|
| Module X01c: The Semantic Web |
This document is part of a modular instruction series in Computer Instruction. For more information, see the overview or the list of modules in this series, X: XML, XHTML, DHTML, CSS. This document has been used in the following classes: INP 150.
|
| History: |
Original: 19 September 2003, by Laurence J. Krieg
Last modification: Monday, 31-Aug-2009 11:48:07 EDT |
| Copyright |
Copyright © 2003, Laurence J. Krieg, Washtenaw Community College
Instructors: You may point to this file in your Web-based materials; however, its location may change without notice. Students: You are welcome to make a copy for your personal use. All other uses: Please contact the author, Laurence J. Krieg, for permission: krieg@ieee.org. |