Simple API For XML - Benefits

Benefits

SAX parsers have some benefits over DOM-style parsers. A SAX parser only needs to report each parsing event as it happens, and normally discards almost all of that information once reported (it does, however, keep some things, for example a list of all elements that have not been closed yet, in order to catch later errors such as end-tags in the wrong order). Thus, the minimum memory required for a SAX parser is proportional to the maximum depth of the XML file (i.e., of the XML tree) and the maximum data involved in a single XML event (such as the name and attributes of a single start-tag, or the content of a processing instruction, etc.).
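The memory behaviour described above can be illustrated with a minimal sketch using Python's standard `xml.sax` module. The handler class `DepthTracker`, the helper `track_depth`, and the sample document are hypothetical names chosen for illustration. The handler keeps only the stack of currently open elements, so the state it holds grows with nesting depth rather than with document length.

```python
import xml.sax


class DepthTracker(xml.sax.ContentHandler):
    """Illustrative handler: retains only the stack of open elements."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # names of elements not yet closed
        self.max_depth = 0       # deepest nesting seen so far
        self.events = 0          # number of start/end events reported

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        self.max_depth = max(self.max_depth, len(self.open_elements))
        self.events += 1

    def endElement(self, name):
        # The parser has already checked that end-tags match, so the
        # element being closed is the one on top of the stack.
        self.open_elements.pop()
        self.events += 1


def track_depth(xml_text):
    """Parse xml_text in one pass and return the handler with its counters."""
    handler = DepthTracker()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler


SAMPLE = (
    "<library><shelf>"
    "<book id='1'>SAX</book><book id='2'>DOM</book>"
    "</shelf></library>"
)
result = track_depth(SAMPLE)
print(result.max_depth)  # 3: library > shelf > book
```

Adding more `book` elements to the sample increases the number of events but not the size of `open_elements`, which is the point made in the paragraph above.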

This much memory is usually considered negligible. A DOM parser, in contrast, typically builds a tree representation of the entire document in memory to begin with, thus using memory that increases with the entire document length. This takes considerable time and space for large documents (memory allocation and data-structure construction take time). The compensating advantage, of course, is that once loaded any part of the document can be accessed in any order.
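As a rough illustration of that compensating advantage, the following sketch uses Python's standard `xml.dom.minidom`. The function `titles_in_order` and the sample document are assumptions made for this example. The whole document is parsed into a tree first, after which elements can be visited in any order.

```python
from xml.dom import minidom

SAMPLE_DOC = (
    "<library>"
    "<book id='1'>SAX</book>"
    "<book id='2'>DOM</book>"
    "<book id='3'>StAX</book>"
    "</library>"
)


def titles_in_order(xml_text, order):
    """Build the full tree, then read book titles in an arbitrary order."""
    doc = minidom.parseString(xml_text)          # entire document held in memory
    books = doc.getElementsByTagName("book")     # all <book> elements in the tree
    # Each <book> in the sample has a single text child holding its title.
    return [books[i].firstChild.data for i in order]


print(titles_in_order(SAMPLE_DOC, [2, 0, 1]))  # ['StAX', 'SAX', 'DOM']
```

The cost is that every node of the document exists in memory before the first title is read.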

Because SAX is event-driven, processing a document with it is generally far faster than with a DOM-style parser, so long as the processing can be done in a single start-to-end pass. Many tasks, such as indexing, conversion to other formats, and very simple formatting, can be done that way. Other tasks, such as sorting, rearranging sections, getting from a link to its target, or looking up information on one element to help process a later one, require accessing the document structure in complex orders and will be much faster with DOM than with multiple SAX passes.
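Indexing is one of the tasks that fits a single pass. The sketch below, again using Python's `xml.sax`, builds a word index over a flat list of sections. The element name `section`, its `id` attribute, and the names `WordIndexer` and `build_index` are hypothetical, and nested sections are not handled.

```python
import xml.sax
from collections import defaultdict


class WordIndexer(xml.sax.ContentHandler):
    """Illustrative one-pass indexer: word -> ids of sections containing it."""

    def __init__(self):
        super().__init__()
        self.index = defaultdict(set)
        self.current_id = None  # id of the section being read, if any
        self.buffer = []        # text chunks of the current section only

    def startElement(self, name, attrs):
        if name == "section":
            # A section without an id leaves current_id as None and is skipped.
            self.current_id = attrs.get("id")
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect them until the end-tag.
        if self.current_id is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "section" and self.current_id is not None:
            for word in "".join(self.buffer).lower().split():
                self.index[word].add(self.current_id)
            self.current_id = None  # the section text can now be discarded


def build_index(xml_text):
    """Return a dict mapping each word to a sorted list of section ids."""
    handler = WordIndexer()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return {word: sorted(ids) for word, ids in handler.index.items()}


SAMPLE_TEXT = (
    "<doc>"
    "<section id='s1'>event driven parsing</section>"
    "<section id='s2'>tree parsing</section>"
    "</doc>"
)
print(build_index(SAMPLE_TEXT)["parsing"])  # ['s1', 's2']
```

Sorting the sections by title, by contrast, could not be finished until the last section had been seen, which is why such tasks favour a tree that is already in memory.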

Some implementations do not neatly fit either category: a DOM approach can keep its persistent data on disk, cleverly organized for speed (editors such as SoftQuad Author/Editor and large-document browser/indexers such as DynaText do this); while a SAX approach can cleverly cache information for later use (any validating SAX parser keeps more information than described above). Such implementations blur the DOM/SAX tradeoffs, but are often very effective in practice.
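A small-scale example of blurring the two styles is Python's standard `xml.dom.pulldom`, which streams events and builds a DOM subtree only for the nodes the caller asks to expand. It is not one of the systems named above and is shown only as an illustration. The function `expensive_item_names` and the sample catalog are assumptions made for this sketch.

```python
from xml.dom import pulldom

SAMPLE_CATALOG = (
    "<catalog>"
    "<item price='5'><name>pen</name></item>"
    "<item price='50'><name>lamp</name></item>"
    "<item price='20'><name>book</name></item>"
    "</catalog>"
)


def expensive_item_names(xml_text, min_price):
    """Stream through the catalog, building a subtree only for matching items."""
    names = []
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            if int(node.getAttribute("price")) >= min_price:
                # Build a DOM subtree for this one <item> only.
                events.expandNode(node)
                name_node = node.getElementsByTagName("name")[0]
                names.append(name_node.firstChild.data)
    return names


print(expensive_item_names(SAMPLE_CATALOG, 20))  # ['lamp', 'book']
```

Items below the price threshold are never expanded, so no subtree is built for them.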

Due to the nature of DOM, streamed reading from disk requires techniques such as lazy evaluation, caches, virtual memory, or persistent data structures. Processing XML documents larger than main memory is sometimes thought impossible because some DOM parsers do not allow it. However, it is no less possible than sorting a dataset larger than main memory, which is done by using disk space as additional memory.
