Similarity Enhanced Transfer - Method

Method

The developers of SET found that if a particular piece of content has several different versions available for download from a P2P network, the files in the different releases may be similar enough that they can all serve as download sources for a single version. In particular, they found (quoted):

  • MP3 music files with identical sound content but different header bytes (artist and title metadata or headers from encoding programs) were 99% similar.
  • Movies and trailers in different languages were often 15% or more similar.
  • Media files with apparent transmission or storage errors differed in a single byte or small string of bytes in the middle of the file.
  • Identical content packaged for download in different ways (e.g., a torrent with and without a README file) were almost identical.

SET uses a technique called handprinting, which is based on an earlier technique known as "shingling" that has been used to filter junk e-mail, to seek out files that contain chunks of data similar to those in the requested file. The SET system computes a handprint for each file, and can take chunks of data from files that are either identical or merely similar to the one being searched for. The lower the similarity threshold that SET searches for, the more sources for that data are likely to be found. The authors claim that the extra overhead of locating these sources does not outweigh the benefit of using them to help saturate the recipient's available bandwidth, and that exploiting similar sources can significantly improve download time.
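The idea of a handprint can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SET implementation: it uses fixed-size chunks, SHA-1 chunk hashes, and a handprint made of the k smallest distinct chunk hashes, and the names CHUNK_SIZE, HANDPRINT_SIZE, handprint and is_candidate_source are hypothetical. Fixed-size chunks only line up when the differing bytes do not shift the rest of the file, so a real system would need a more robust way of choosing chunk boundaries.

```python
import hashlib

# Assumed parameters for illustration only; they are not taken from SET.
CHUNK_SIZE = 16       # bytes per chunk (tiny, so the demo data stays small)
HANDPRINT_SIZE = 4    # number of chunk hashes kept in a handprint


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and return the SHA-1 hex digest of each."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def handprint(data, k=HANDPRINT_SIZE, chunk_size=CHUNK_SIZE):
    """Return the k smallest distinct chunk hashes as a set.

    Every file is sampled by the same deterministic rule, so two files
    that share many chunks are likely to share handprint entries.
    """
    return set(sorted(set(chunk_hashes(data, chunk_size)))[:k])


def similarity(a, b, chunk_size=CHUNK_SIZE):
    """Fraction of distinct chunks shared by a and b (Jaccard index)."""
    ha, hb = set(chunk_hashes(a, chunk_size)), set(chunk_hashes(b, chunk_size))
    return len(ha & hb) / len(ha | hb)


def is_candidate_source(target, other):
    """Treat other as a possible source if the two handprints overlap."""
    return bool(handprint(target) & handprint(other))


# Demo data: two "releases" with the same body but different 16-byte headers,
# plus one unrelated file.
body = b"".join(("chunk-%02d-payload" % i).encode() for i in range(20))
release_a = b"HEADER-ARTIST-A".ljust(CHUNK_SIZE, b"\0") + body
release_b = b"HEADER-ARTIST-B".ljust(CHUNK_SIZE, b"\0") + body
unrelated = b"".join(("other-%02d-content" % i).encode() for i in range(21))

if __name__ == "__main__":
    print("A vs B similarity:", round(similarity(release_a, release_b), 3))
    print("B is a candidate source for A:", is_candidate_source(release_a, release_b))
    print("unrelated is a candidate source for A:", is_candidate_source(release_a, unrelated))
```

Comparing a few hashes per file, rather than every chunk of every file, is what keeps the search for similar sources cheap in this sketch. Once a candidate is found, only the chunks whose hashes match the requested file would actually be fetched from it.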

In tests, SET improved the transfer time of an MP3 music file by 71%, and a 55Mb movie trailer transferred 30% faster when SET drew on movie trailers that were 47% similar. SET is not believed to improve transfer rates much for popular data, which a large number of people are already downloading. It could help most with less popular files, where experiments suggest the improvement can be substantial.

Note, however, that SET can only improve download speed when the downloader's own connection is not the bottleneck, that is, when the sources already available cannot saturate it. This is more often the case for unpopular downloads.
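The bottleneck argument can be made concrete with a deliberately simple model. It is an assumption made here for illustration, not something taken from the SET authors: the achievable download rate is taken to be the smaller of the downloader's link capacity and the combined upload rate of all usable sources. The function name and all the rates below are hypothetical.

```python
def download_rate(downlink_kbps, source_rates_kbps):
    """Achievable rate under the simple model: limited by whichever is
    smaller, the downloader's own link or the sum of the source rates."""
    return min(downlink_kbps, sum(source_rates_kbps))


DOWNLINK = 1000  # hypothetical downlink capacity, in kbit/s

# Unpopular file: one exact source, plus two similar sources found via SET.
unpopular_exact = [200]
unpopular_with_similar = unpopular_exact + [300, 250]

# Popular file: the exact sources alone already exceed the downlink.
popular_exact = [600, 600]
popular_with_similar = popular_exact + [300, 250]

if __name__ == "__main__":
    print(download_rate(DOWNLINK, unpopular_exact))         # sources are the limit
    print(download_rate(DOWNLINK, unpopular_with_similar))  # extra sources help
    print(download_rate(DOWNLINK, popular_exact))           # downlink is the limit
    print(download_rate(DOWNLINK, popular_with_similar))    # extra sources do not help
```

In this model the similar sources raise the rate only in the unpopular case. In the popular case the downlink is already saturated, so additional sources leave the rate unchanged.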
