SXML - XML, XML Information Set and SXML

An XML document is essentially a tree structure. The start and end tags of the root element enclose the whole content of the document, which may include other elements or arbitrary character data. Text with the familiar angle brackets is an external representation of an XML document. Applications ought to deal with an internalized form: an XML Information Set or one of its specializations (such as the DOM). This internalized form lets an application locate specific data or transform an XML tree into another tree.
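The difference between the external text and an internalized tree can be sketched in Python. This is only an illustration: the sample document is made up, and the standard-library `xml.etree.ElementTree` module stands in for an internalized form; it is not a complete Infoset implementation.

```python
import xml.etree.ElementTree as ET

# External representation: text with angle brackets (a made-up sample document).
xml_text = "<book><title>SXML</title><chapter n='1'>Intro</chapter></book>"

# Internalized form: a tree of element objects.
root = ET.fromstring(xml_text)

# Locate specific data by walking the tree rather than scanning the text.
title = root.find("title").text
chapter_numbers = [c.get("n") for c in root.iter("chapter")]

print(root.tag)         # book
print(title)            # SXML
print(chapter_numbers)  # ['1']
```

Once the text has been internalized, queries and transformations operate on elements and attributes, not on character positions.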

The World Wide Web Consortium (W3C) defines the XML Information Set (Infoset) as an abstract data set that describes the information available in a well-formed XML document. An XML document's information set consists of a number of information items, which denote elements, attributes, character data, processing instructions, and other components of the document. Each information item has a number of associated properties, e.g., name and namespace URI. Some properties (for example, "children" and "attributes") are collections of other information items. Although technically the Infoset is specified for XML, it largely applies to other semi-structured data formats, in particular HTML.

Parsing an XML document is only one of the possible ways to create an instance of the XML Infoset.
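As a rough illustration of this point, the following Python sketch reaches the same in-memory tree by two routes: parsing text and constructing the tree directly. The sample document and the helper `shape` are hypothetical, and `xml.etree.ElementTree` is again used only as a convenient stand-in for an Infoset instance.

```python
import xml.etree.ElementTree as ET

# Route 1: parse an external XML text (sample document, made up for illustration).
parsed = ET.fromstring('<note lang="en"><to>Ann</to></note>')

# Route 2: build the same tree directly in memory, with no XML text involved.
built = ET.Element("note", {"lang": "en"})
to = ET.SubElement(built, "to")
to.text = "Ann"

def shape(elem):
    """Reduce an element to a comparable (name, attributes, text, children) tuple."""
    return (elem.tag, dict(elem.attrib), elem.text, [shape(c) for c in elem])

same = shape(parsed) == shape(built)
print(same)  # True
```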

It is worth noting that the XML Information Set Recommendation does not attempt to be exhaustive, nor does it constitute a minimum set of information items and properties. Its purpose is to provide a consistent set of definitions for use in other specifications that need to refer to the information in a well-formed XML document.

The abstract data model defined in the XML Information Set Recommendation is applicable to every XML-related specification of the W3C. For instance, the Document Object Model can be considered an application programming interface (API) for dealing with information items, and the XPath data model uses the concept of nodes, which can be derived from information items. The DOM and the XPath data model are thus two instances of the XML Information Set.

The XML Information Set Recommendation itself imposes no restrictions on data structures or interfaces for accessing information items. Different interpretations of the XML Information Set abstract data model are thus possible. For example, it is convenient to consider an XML Information Set as a tree structure; the terms "information set" and "information item" are then similar in meaning to the generic terms "tree" and "node", respectively.

An information item may also be considered a container for its properties, which are either text strings (e.g., name, namespace URI) or containers themselves (e.g., child elements of an XML element). The information set is thus a hierarchy of nested containers. Such a hierarchy of containers comprising text strings and other containers lends itself well to description by an S-expression, because the latter is recursively defined as a list whose members are either atomic values or S-expressions themselves. S-expressions are easy to parse into an internal representation suitable for traversal; they also have a simple external notation, which is relatively easy to compose even by hand.
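To give a feel for how little machinery the recursive definition needs, here is a minimal S-expression reader sketched in Python. The function names `read_sexpr` and `count_containers` and the sample expression are hypothetical, atoms are kept as raw strings (quoted strings keep their quotes), and unbalanced input is not handled gracefully.

```python
import re

# A token is a parenthesis, a double-quoted string, or a bare symbol.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')

def read_sexpr(text):
    """Parse one S-expression into nested Python lists and strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if tok == ")":
            raise ValueError("unexpected ')'")
        return tok  # atomic value, kept verbatim

    return read()

def count_containers(node):
    """Traverse the parsed tree and count the nested lists (containers)."""
    if not isinstance(node, list):
        return 0
    return 1 + sum(count_containers(child) for child in node)

tree = read_sexpr('(element (name "para") (children "Hello" (element (name "em"))))')
print(tree[1])                 # ['name', '"para"']
print(count_containers(tree))  # 5
```

The parsed result is itself a hierarchy of lists containing strings and other lists, which mirrors the container view of information items described above.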

SXML is a concrete instance of the XML Infoset in the form of S-expressions. The Infoset's goal is to present, in some form, all relevant pieces of data and their abstract container-slot relationships with each other. SXML gives the nest of containers a concrete realization as S-expressions, and provides means of accessing items and their properties. SXML is a "relative" of XPath and the DOM, whose data models are two other instances of the XML Infoset. SXML is particularly suitable for Scheme-based XML/HTML authoring, XPath queries, and tree transformations.
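The correspondence can be sketched with a small Python converter that turns a parsed XML element into an SXML-style nested list, using the SXML convention of an attribute list headed by `@`. Real SXML tooling is written in Scheme; the helpers `element_to_sxml` and `write_sxml` below are hypothetical names, and the sketch ignores namespaces, comments, processing instructions, and the `*TOP*` wrapper. String quoting via `json.dumps` only approximates Scheme string syntax.

```python
import json
import xml.etree.ElementTree as ET

def element_to_sxml(elem):
    """Convert an ElementTree element to an SXML-style nested list."""
    node = [elem.tag]
    if elem.attrib:
        # Attributes go into a single list headed by the symbol "@".
        node.append(["@"] + [[name, value] for name, value in elem.attrib.items()])
    if elem.text:
        node.append(elem.text)
    for child in elem:
        node.append(element_to_sxml(child))
        if child.tail:  # character data that follows the child element
            node.append(child.tail)
    return node

def write_sxml(node):
    """Render the nested list in S-expression notation."""
    if isinstance(node, list):
        head, *rest = node
        return "(" + " ".join([head] + [write_sxml(r) for r in rest]) + ")"
    return json.dumps(node, ensure_ascii=False)  # character data as a quoted string

root = ET.fromstring('<p class="note">Hello, <em>world</em>!</p>')
sxml = element_to_sxml(root)
print(write_sxml(sxml))  # (p (@ (class "note")) "Hello, " (em "world") "!")
```

The XML text and the printed S-expression carry the same element names, attributes, and character data; only the notation differs.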

XML and SXML can thus be considered two syntactically different representations for the XML Information Set.
