Web Archiving - Aspects of Web Curation

Aspects of Web Curation

Web curation, like any digital curation, entails:

  • Certification of the trustworthiness and integrity of the collection content
  • Collecting verifiable Web assets
  • Providing Web asset search and retrieval
  • Semantic and ontological continuity and comparability of the collection content
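
One way to picture the integrity item in the list above is a fixity check: record a cryptographic digest for each collected asset, then recompute the digests later to detect loss or alteration. The following is only a minimal sketch using Python's standard library. The function names (`digest`, `build_manifest`, `verify`) and the in-memory dictionary of assets are assumptions made for illustration and do not come from any of the tools named in this article.

```python
import hashlib


def digest(payload):
    """Return a labelled SHA-256 hex digest of a captured payload (bytes)."""
    return "sha256:" + hashlib.sha256(payload).hexdigest()


def build_manifest(assets):
    """Map each URL to the digest of its captured content.

    `assets` is a hypothetical dict of URL -> bytes standing in for a collection.
    """
    return {url: digest(body) for url, body in assets.items()}


def verify(assets, manifest):
    """Return the sorted URLs whose content is missing or no longer matches."""
    return sorted(
        url
        for url, expected in manifest.items()
        if url not in assets or digest(assets[url]) != expected
    )
```

A curator would store the manifest separately from the collection and rerun `verify` periodically. An empty result means every recorded asset is still present and unchanged.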

Thus, besides methods of collecting the Web, a discussion of Web curation must also cover methods of providing access, certifying content, and organizing collections. A set of popular tools addresses these curation steps:

A suite of Web curation tools from the International Internet Preservation Consortium (IIPC):

  • Heritrix - collects Web assets
  • NutchWAX - searches Web archive collections
  • Wayback (open source Wayback Machine) - searches and navigates Web archive collections using NutchWAX
  • Web Curator Tool - selection and management of Web collections

Other open source tools for manipulating web archives:

  • WARC Tools - for creating, reading, parsing, and manipulating web archives programmatically
  • Search Tools - for indexing and searching full-text and metadata within web archives
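
To make "creating, reading, and parsing web archives programmatically" more concrete, the sketch below writes and reads back simple records in the WARC layout: a version line, named header fields, a blank line, the content block, and two trailing CRLF pairs. It uses only Python's standard library and is not the API of WARC Tools or of any other package listed above. The helper names `write_record` and `read_records` are hypothetical, and the sketch ignores per-record gzip compression, payload digests, and the other record types that real archives commonly use.

```python
import io
import uuid
from datetime import datetime, timezone


def write_record(out, uri, payload, content_type="text/html"):
    """Append one uncompressed WARC 'resource' record to a binary stream."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    out.write(b"WARC/1.0\r\n")
    for name, value in headers:
        out.write(f"{name}: {value}\r\n".encode("utf-8"))
    # A blank line ends the headers; two CRLF pairs end the record.
    out.write(b"\r\n" + payload + b"\r\n\r\n")


def read_records(stream):
    """Yield (headers, payload) for each record in a binary stream."""
    while True:
        version = stream.readline()
        if not version:
            return  # end of stream
        if not version.startswith(b"WARC/"):
            raise ValueError("not a WARC record")
        headers = {}
        while (line := stream.readline()) not in (b"\r\n", b""):
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The payload is read by length, so it may itself contain line breaks.
        payload = stream.read(int(headers["Content-Length"]))
        stream.read(4)  # skip the trailing CRLF CRLF
        yield headers, payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_record(buf, "http://example.org/", b"<html>hello</html>")
    buf.seek(0)
    for headers, payload in read_records(buf):
        print(headers["WARC-Target-URI"], len(payload))
```

Reading the payload by its declared `Content-Length` rather than by scanning for a delimiter is what lets arbitrary binary content, such as images, be stored in the same file as HTML.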
