Merge Sort - Use With Tape Drives

Use With Tape Drives

An external merge sort is practical to run using disk or tape drives when the data to be sorted is too large to fit into memory. External sorting explains how merge sort is implemented with disk drives. A typical tape drive sort uses four tape drives. All I/O is sequential (except for rewinds at the end of each pass). A minimal implementation can get by with just 2 record buffers and a few program variables.

Naming the four tape drives as A, B, C, D, with the original data on A, and using only 2 record buffers, the algorithm is similar to Bottom-up implementation, using pairs of tape drives instead of arrays in memory. The basic algorithm can be described as follows:

  1. Merge pairs of records from A; writing two-record sublists alternately to C and D.
  2. Merge two-record sublists from C and D into four-record sublists; writing these alternately to A and B.
  3. Merge four-record sublists from A and B into eight-record sublists; writing these alternately to C and D
  4. Repeat until you have one list containing all the data, sorted --- in log2(n) passes.

Instead of starting with very short runs, the initial pass will read many records into memory, do an internal sort to create a long run, and then distribute those long runs onto the output set. The step avoids many early passes. For example, an internal sort of 1024 records will save 9 passes. The internal sort is often large because it has such a benefit. In fact, there are techniques that can make the initial runs longer than the available internal memory.

A more sophisticated merge sort that optimizes tape (and disk) drive usage is the polyphase merge sort.

Read more about this topic:  Merge Sort

Famous quotes containing the words tape and/or drives:

    We shall see but little way if we require to understand what we see. How few things can a man measure with the tape of his understanding! How many greater things might he be seeing in the meanwhile!
    Henry David Thoreau (1817–1862)

    Our drives are reducible to the will to power. The will to power is the ultimate fact at which we arrive.
    Friedrich Nietzsche (1844–1900)