Box UK - be creative. be innovative. be bold.

How XML and RDF are updating HTML

By standardising methods of communication (in this case the HTTP and TCP/IP protocols), the Internet has connected different cultures, using different technologies, in geographically dispersed locations.

The next stage in this evolution is to standardise not only the way in which resources are shared, but also their underlying structure. The information they contain can then be read, and understood, by software, rather than relying on the human eye to digest and comprehend it.

HTML

Currently, the majority of documents on the Internet are constructed using HTML (HyperText Markup Language). However, HTML provides only a crude means of storing documents:

  • Common components (e.g. navigation, logos, copyright information) cannot be shared across documents; instead they must be included within every file. This can lead to inconsistent branding and difficult updates (e.g. if the navigation were to change, every page would need editing).

  • HTML, as a language, specifies the format of the information, not the meaning. As a result:

    • Accurate, relevant searches based on semantic meaning are almost impossible to build.

    • Computer-controlled content exchange is not possible. If a partner site wished to automatically query an HTML site for its latest list of services or products, no meaningful response could be given.

    • Re-formatting the collection of documents (for example to appear on digital TV, or a WAP phone, or in a PDF brochure) is extremely difficult, as the formatting has been hard-coded, rather than applied in a separate process.


XML

Extensible Markup Language (XML) is the obvious next stage for the Internet. Because an XML document uses metadata to describe the information it contains, the problems listed above can be overcome, and electronic syndication can be carried out faster and more effectively. XML is now becoming widely established in the content-management environment.
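
As a rough illustration of the partner-site query that plain HTML cannot answer, the sketch below reads a small XML catalogue and returns only the services. The element names (catalogue, product, name, category, price), the sample data and the list_products helper are all hypothetical, invented for this example; the parsing uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue: the element names and values are illustrative only.
CATALOGUE_XML = """\
<catalogue>
  <product id="p1">
    <name>Content management system</name>
    <category>software</category>
    <price currency="GBP">4500.00</price>
  </product>
  <product id="p2">
    <name>XML training day</name>
    <category>service</category>
    <price currency="GBP">600.00</price>
  </product>
</catalogue>
"""


def list_products(xml_text, category=None):
    """Return (id, name, price) tuples, optionally filtered by category."""
    root = ET.fromstring(xml_text)
    results = []
    for product in root.findall("product"):
        # The markup says what each value means, so filtering is a lookup,
        # not a guess based on how the page happens to be laid out.
        if category is not None and product.findtext("category") != category:
            continue
        results.append((
            product.get("id"),
            product.findtext("name"),
            float(product.findtext("price")),
        ))
    return results


if __name__ == "__main__":
    print(list_products(CATALOGUE_XML, category="service"))
```

Because each value is labelled with its meaning, the same document could also feed a search index or a syndication feed without any knowledge of how the site is presented.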

XML documents are stored as plain text, which makes them platform-independent by default. Because their format is structured and self-describing, they are also easily converted to other formats. Examples are HTML, the standard web-document language; PDF; and WML, the WAP (Wireless Application Protocol, for mobile phones) equivalent of HTML. This extends the potential reach of an XML-based system.
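
The conversion step can be sketched as follows: one XML source is rendered both as HTML and as a simplified WML card. The article and para element names and the to_html and to_wml helpers are assumptions made for this example, and the WML output omits the document type declaration that a real deck would need. XSLT is a common choice for this kind of transformation; plain Python is used here only to keep the sketch self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document: "article" and "para" are illustrative names.
ARTICLE_XML = """\
<article>
  <title>How XML and RDF are updating HTML</title>
  <para>XML describes what the content means.</para>
  <para>Formatting is applied in a separate step.</para>
</article>
"""


def to_html(article):
    """Render the article as a minimal HTML page."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = article.findtext("title")
    for para in article.findall("para"):
        ET.SubElement(body, "p").text = para.text
    return ET.tostring(html, encoding="unicode")


def to_wml(article):
    """Render the same article as a single, simplified WML card."""
    wml = ET.Element("wml")
    card = ET.SubElement(
        wml, "card", {"id": "main", "title": article.findtext("title")}
    )
    for para in article.findall("para"):
        ET.SubElement(card, "p").text = para.text
    return ET.tostring(wml, encoding="unicode")


if __name__ == "__main__":
    article = ET.fromstring(ARTICLE_XML)
    print(to_html(article))
    print(to_wml(article))
```

The content is written once; adding another output format means adding another rendering function, not editing every page.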

RDF

The rapid growth of the World Wide Web over the past ten years has made an enormous amount of information available. This gives users of the Internet a seemingly endless library of resources to search and explore.

In practice, however, the sheer volume of information, combined with few means of cataloguing or classifying it, creates barriers between users and any relevant content that may be available.

To overcome these barriers, a standard means of accurately describing and mapping out resources has been introduced. Written in the standard syntax of XML, an RDF (Resource Description Framework) description can accurately catalogue web-based resources. By exchanging and disseminating these descriptions, advanced Internet applications can source content with much greater accuracy and efficiency.
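
A minimal sketch of what such a description might look like, and how an application could read it, is given below. The description uses the RDF and Dublin Core element namespaces; the example.org address, the choice of properties and the simple_triples helper are assumptions made for illustration. The reader handles only the small subset of RDF/XML shown here (literal-valued properties on an rdf:Description), so a real application would more likely rely on a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical description: the example.org address is a placeholder.
DESCRIPTION = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.org/articles/xml-rdf">
    <dc:title>How XML and RDF are updating HTML</dc:title>
    <dc:date>2003-03-12</dc:date>
    <dc:subject>XML</dc:subject>
    <dc:subject>RDF</dc:subject>
  </rdf:Description>
</rdf:RDF>
"""


def simple_triples(rdf_xml):
    """Return (subject, predicate, literal) triples for literal properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; drop the braces
            # to join namespace and local name into a single property URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


if __name__ == "__main__":
    triples = simple_triples(DESCRIPTION)
    # Find every resource catalogued under the subject "XML".
    about_xml = sorted(
        {s for s, p, o in triples if p == DC_NS + "subject" and o == "XML"}
    )
    print(about_xml)
```

Each statement takes the form resource, property, value, which is what allows descriptions from different sites to be merged and searched together.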


Glossary

RDF
Resource Description Framework
XML
Extensible Markup Language
WML
Wireless Markup Language
WAP
Wireless Application Protocol
PDF
Portable Document Format
HTML
HyperText Markup Language
Metadata
Metadata is structured data about data.

About This Page

Published: 12th Mar 2003
Level: Elementary
Type: Articles
Tech: XML
Tech: RDF