Check Digit - Design

Design

Check digit algorithms are generally designed to capture human transcription errors. In order of complexity, these include the following:

  • single digit errors, such as 1 → 2
  • transposition errors, such as 12 → 21
  • twin errors, such as 11 → 22
  • jump transpositions errors, such as 132 → 231
  • jump twin errors, such as 131 → 232
  • phonetic errors, such as 60 → 16 ("sixty" to "sixteen")

In choosing a system, a high probability of catching errors is traded off against implementation difficulty; simple check digit systems are easily understood and implemented by humans but do not catch as many errors as complex ones, which require sophisticated programs to implement.

A desirable feature is that left-padding with zeros should not change the check digit. This allows variable length digits to be used and the length to be changed.

If there is a single check digit added to the original number, the system will not always capture multiple errors, such as two replacement errors (12 → 34) though, typically, double errors will be caught 90% of the time (both changes would need to change the output by offsetting amounts).

A very simple check digit method would be to take the sum of all digits (digital sum) modulo 10. This would catch any single-digit error, as such an error would always change the sum, but does not catch any transposition errors (switching two digits) as re-ordering does not change the sum.

A slightly more complex method is to take the weighted sum of the digits, modulo 10, with different weights for each number position.

To illustrate this, for example if the weights for a four digit number were 5, 3, 2, 7 and the number to be coded was 4871, then one would take 5×4 + 3×8 + 2×7 + 7×1 = 65, i.e. 5 modulo 10, and the check digit would be 5, giving 48715.

Systems with weights of 1, 3, 7, or 9, with the weights on neighboring numbers being different, are widely used: for example, 31 31 weights in UPC codes, 13 13 weights in EAN numbers (GS1 algorithm), and the 371 371 371 weights used in United States bank routing transit numbers. This system detects all single-digit errors and around 90% of transposition errors. 1, 3, 7, and 9 are used because they are coprime to 10, so changing any digit changes the check digit; using a coefficient that is divisible by 2 or 5 would lose information (because ) and thus not catch some single-digit errors. Using different weights on neighboring numbers means that most transpositions change the check digit; however, because all weights differ by an even number, this does not catch transpositions of two digits that differ by 5, (0 and 5, 1 and 6, 2 and 7, 3 and 8, 4 and 9), since the 2 and 5 multiply to yield 10.

The ISBN-10 code instead uses modulo 11, which is prime, and all the number positions have different weights . This system thus detects all single digit substitution and transposition errors (including jump transpositions), but at the cost of the check digit possibly being 10, represented by "X". (An alternative is simply to avoid using the serial numbers which result in an "X" check digit.) ISBN-13 instead uses the GS1 algorithm used in EAN numbers.

More complicated algorithms include the Luhn algorithm (1954), which captures 98% of single digit transposition errors (it does not detect 90 ↔ 09), while more sophisticated is the Verhoeff algorithm (1969), which catches all single digit substitution and transposition errors, and many (but not all) more complex errors. Both these methods use a single check digit and will therefore fail to capture around 10% of more complex errors. To reduce this failure rate, it is necessary to use more than one check digit (for example, the modulo 97 check referred to below, which uses two check digits - for the algorithm, see International Bank Account Number) and/or to use a wider range of characters in the check digit, for example letters plus numbers.

Read more about this topic:  Check Digit

Famous quotes containing the word design:

    Humility is often only the putting on of a submissiveness by which men hope to bring other people to submit to them; it is a more calculated sort of pride, which debases itself with a design of being exalted; and though this vice transform itself into a thousand several shapes, yet the disguise is never more effectual nor more capable of deceiving the world than when concealed under a form of humility.
    François, Duc De La Rochefoucauld (1613–1680)

    The reason American cars don’t sell anymore is that they have forgotten how to design the American Dream. What does it matter if you buy a car today or six months from now, because cars are not beautiful. That’s why the American auto industry is in trouble: no design, no desire.
    Karl Lagerfeld (b. 1938)

    We find that Good and Evil happen alike to all Men on this Side of the Grave; and as the principle Design of Tragedy is to raise Commiseration and Terror in the Minds of the Audience, we shall defeat this great End, if we always make Virtue and Innocence happy and successful.
    Joseph Addison (1672–1719)