Motivation and Uses
The name of the lexicographic order comes from its generalizing the order given to words in a dictionary: a sequence of letters (that is, a word)
- a1a2 ... ak
appears in a dictionary before a sequence
- b1b2 ... bk
if and only if the first ai, which is different from bi, comes before bi in the alphabet.
That comparison assumes both sequences are the same length. To ensure they are the same length, the shorter sequence is usually padded at the end with enough "blanks" (a special symbol that is treated as coming before any other symbol). This also allows ordering of phrases. For the purpose of dictionaries, etc., padding with blank spaces is always done. See alphabetical order.
For example, the word "Thomas" appears before "Thompson" in dictionaries because the letter 'a' comes before the letter 'p' in the alphabet. The 5th letter is the first that is different in the two words; the first 4 letters are "Thom" in both. Because it is the first difference, the 5th letter is the most significant difference (for an alphabetical ordering).
A lexicographical ordering may not coincide with conventional alphabetical ordering. For example, the numerical order of Unicode codepoints does not always correspond to traditional alphabetic orderings of the characters, which vary from language to language. So the lexicographic ordering induced by codepoint value sorts strings in an unambiguous canonical order, but it does not necessarily "alphabetize" them in the conventional sense.
An important property of the lexicographical order is that it preserves well-orders, that is, if A and B are well-ordered sets, then the product set A × B with the lexicographical order is also well-ordered.
An important exploitation of lexicographical ordering is expressed in the ISO 8601 date formatting scheme, which expresses a date as YYYY-MM-DD. This date ordering lends itself to straightforward computerized sorting of dates such that the sorting algorithm does not need to treat the numeric parts of the date string any differently from a string of non-numeric characters, and the dates will be sorted into chronological order. Note, however, that for this to work, there must always be four digits for the year, two for the month, and two for the day, so for example single-digit days must be padded with a zero yielding '01', '02', ..., '09'.
Another example of digits ordered lexicographically is 101,102,103,104,105,106,107,108,109,110,111,112... 200, 201, 202 etc.
Another generalization of lexical ordering occurs in social choice theory (the theory of elections). Consider an election in which there are 4 candidates A, B, C and D, each voter expresses a top-to-bottom ordering of the candidates, and the voters' orderings are as follows:
18% | 17% | 33% | 32% |
---|---|---|---|
A | B | C | D |
B | A | D | B |
C | C | A | A |
D | D | B | C |
The MinMax voting method is a simple Condorcet method that counts the votes as in a round-robin tournament (all possible pairings of candidates) and judges each candidate according to its largest "pairwise" defeat. The winner is the candidate whose largest defeat is the smallest. In the example:
- The largest defeat of A is by D: 65% (33%+32%) rank D over A.
- The largest defeat of B is by D: 65% (33%+32%) rank D over B.
- The largest defeat of C is by A (or B): 67% (18%+17%+32%) rank A over C (and B over C).
- The largest defeat of D is by C: 68% (18%+17%+33%) rank C over D.
MinMax declares a tie between A and B since the largest defeats for both are the same size, 65%. This is like saying "Thomas" and "Thompson" should be at the same position because they have the same first letter. However, if the defeats are compared lexically, we have the MinLexMax method. With MinLexMax, because the largest defeats of A and B are the same size, their next largest defeats are then compared:
- A's next largest defeat is 0%. (This is a padding, since A has only one defeat.)
- B's next largest defeat is by A: 51% (18%+33%) rank A over B.
Since B's next largest defeat is larger than A's, MinLexMax elects A, which makes more sense than the MinMax tie since a majority rank A over B.
Another usage in social choice theory is the Ranked Pairs voting method. Although usually defined by a procedure that constructs the order of finish, Ranked Pairs is equivalent to finding which of all possible orders of finish is best according to a minlexmax comparison of the majorities they reverse. In the example above, the Ranked Pairs order of finish is ABCD (which elects A). ABCD affirms the majorities who rank A over B, A over C, B over C and C over D, and reverses the majorities who rank D over A and D over B. The largest majority that ABCD reverses is 65%. The only other ordering that wouldn't reverse a larger majority is BACD (which also reverses 65%). ABCD is a better order of finish than BACD because the lexically relevant set of majorities—the majorities on which ABCD and BACD disagree—is {A over B} and BACD reverses the largest majority in this set.
Read more about this topic: Lexicographical Order
Famous quotes containing the word motivation:
“Self-determination has to mean that the leader is your individual gut, and heart, and mind or were talking about power, again, and its rather well-known impurities. Who is really going to care whether you live or die and who is going to know the most intimate motivation for your laughter and your tears is the only person to be trusted to speak for you and to decide what you will or will not do.”
—June Jordan (b. 1939)