Japanese Cryptology From The 1500s To Meiji - The Two-Letter, Ten-Chart Code

The Two-Letter, Ten-Chart Code

Hyakutake Harukichi was among the first group of Japanese officers to study in Poland and on his return was made the chief of the code section of the third department of the army general staff. This was in 1926. Naturally enough, one of his first concerns was strengthening Army codes. He started by designing a new system to replace a four-letter code used by military attachés that had been in use since around 1918. The replacement was the two-letter, ten-chart code that Yardley mentions but mistakenly attributes to Kowalefsky in about 1920. Yardley gives the following description of Hyakutake's new system and its effectiveness:

This new system was elaborate and required ten different codes. The Japanese would first encode a few words of their message in one code, then by the use of an "indicator" jump to another code and encode a few words, then to still another code, until all ten had been used in the encoding of a single message.
Messages encoded in this manner produced a most puzzling problem, but after several months of careful analysis, I discovered the fact that the messages were encoded in ten different systems. Having made this discovery, I quickly identified all the "indicators." From this point on it was not difficult to arrive at a solution.

Yardley also describes the Japanese system of sectioning their messages but does not make it clear if this applies to the two-letter, ten-chart code. Takagawa's description of Hyakutake's code does not mention any sectioning but otherwise closely matches Yardley's account. It is possible then that sectioning was not a part of Hyakutake's new system. Which code systems involved sectioning and when the systems were used is not clear. Interestingly, Michael Smith mentions in The Emperor's Codes that British codebreakers were surprised by the appearance of sectioning in Japanese codes around 1937. The British had been reading some Japanese codes since at least as far back as the Washington Naval Conference. If they did not see sectioning in Army codes until 1937, in which code did Yardley see sectioning during his time at America's Black Chamber? Further research is necessary to answer that question.

It is clear from Yardley's description that Hyakutake's new system was not very effective. The system used 10 charts, each with 26 rows and columns labeled from a to z. This gives 676 (26 × 26) two-letter code groups per chart. Most words and phrases will not be in the code and must be spelled out in kana. In this respect it is similar to, but larger than, the first Japanese code that Yardley broke in 1919. The difference is that this time there were ten codes instead of just one. Basically, Hyakutake created a poly-code system where the code changes every few words.

This is just a code version of a polyalphabetic substitution cipher. Polyalphabetic ciphers use several different enciphering alphabets and change between them at some interval, usually after every letter. The strength of a polyalphabetic cipher comes from how many alphabets it uses to encipher, how often it switches between them, and how it switches between them (at random or following some pattern, for example). The Vigenère is probably the most famous example of a polyalphabetic substitution cipher. The famous cipher machines of World War II also encipher in a polyalphabetic system. Their strength came from the enormous number of well-mixed alphabets that they used and the fairly random way of switching between them.
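As a point of comparison, the Vigenère cipher mentioned above can be sketched in a few lines. This is only an illustration of switching alphabets after every letter; the function names are invented for the example, and it assumes lowercase a to z text with no spaces.

```python
import string

ALPHABET = string.ascii_lowercase


def _shift_text(text, key, direction):
    """Shift each letter by the matching key letter (direction is +1 or -1)."""
    out = []
    for i, ch in enumerate(text):
        # The key letter selects which of the 26 shifted alphabets to use.
        shift = ALPHABET.index(key[i % len(key)])
        position = (ALPHABET.index(ch) + direction * shift) % len(ALPHABET)
        out.append(ALPHABET[position])
    return "".join(out)


def vigenere_encipher(plaintext, key):
    return _shift_text(plaintext, key, +1)


def vigenere_decipher(ciphertext, key):
    return _shift_text(ciphertext, key, -1)


print(vigenere_encipher("attackatdawn", "lemon"))  # lxfopvefrnhr
```

The key "lemon" uses only five alphabets in a fixed, repeating order, which is exactly the kind of regularity a cryptanalyst exploits.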

With a bit of luck, experienced cryptanalysts have been able to break polyalphabetic ciphers for centuries. From the late 19th century they did not even need luck: Auguste Kerckhoffs published a general solution for polyalphabetic ciphers in 1883 in his book La Cryptographie militaire.

So although Hyakutake's new code system was original, the fundamental idea underlying the system was well known, as were its weaknesses. With only 676 code groups per chart, it is more cipher than code. As mentioned above, the ten different code charts just make it a polyalphabetic cipher, one with only ten "alphabets." Methods like Kerckhoffs' superimposition can be used to convert several polyalphabetically encoded messages into ten monoalphabetically encoded message chunks, which are very easily solved. It is not surprising that the members of Yardley's Black Chamber broke the code in a few months.
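The bookkeeping behind that reduction can be sketched as follows. This is a simplified illustration rather than Kerckhoffs' full method: it assumes the analyst has already worked out (for instance from the "indicators") which chart produced each code group, and the sample groups and chart numbers are invented.

```python
from collections import Counter, defaultdict


def split_by_chart(groups, chart_sequence):
    """Bucket code groups by the chart assumed to have produced them."""
    buckets = defaultdict(list)
    for group, chart in zip(groups, chart_sequence):
        buckets[chart].append(group)
    return dict(buckets)


def frequency_tables(buckets):
    """Count group frequencies inside each bucket, as for a simple substitution."""
    return {chart: Counter(groups) for chart, groups in buckets.items()}


# Invented intercept: six code groups and the chart assumed for each one.
groups = ["aa", "qx", "aa", "bt", "qx", "aa"]
charts = [1, 2, 1, 2, 2, 1]

buckets = split_by_chart(groups, charts)
tables = frequency_tables(buckets)
print(tables[1].most_common(1))  # [('aa', 3)]
```

Once the groups are pooled per chart, each pool behaves like a single small code and can be attacked with ordinary frequency counts.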

The use of ten charts may have been an illusory complication. Rather than improve the security of the code, it probably made the code weaker. If, instead of ten different code groups for each of 676 terms, Hyakutake had used the ten charts (with slight modification to make each group unique) to provide code groups for closer to six thousand terms, the code would have been much stronger.

Including more terms means that fewer have to be spelled out in kana, which is the whole point of using a code. Further, the reduction in duplication allows more flexibility in assigning homophones. Instead of ten groups for each letter, word, or phrase, each could receive homophones based on its frequency of occurrence. For example, the cryptographer can assign an appropriately large number of homophones to high-frequency letters and words like "n," "shi," and "owari" and only one or two code groups to lower-frequency elements.
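One way such a frequency-based assignment could be worked out is sketched below. It is not a description of any historical procedure: the allocation rule (one group per term, the rest shared in proportion to frequency) and the frequency counts are assumptions made up for the example.

```python
def allocate_homophones(frequencies, total_groups):
    """Give every term one group, then share the rest in proportion to frequency.

    `frequencies` maps each term to an integer count.
    """
    if total_groups < len(frequencies):
        raise ValueError("not enough code groups for one per term")
    allocation = {term: 1 for term in frequencies}
    spare = total_groups - len(frequencies)
    total_freq = sum(frequencies.values())
    # Proportional share of the spare groups, rounded down.
    for term, freq in frequencies.items():
        allocation[term] += spare * freq // total_freq
    # Rounding leaves a few groups over; give them to the most frequent terms.
    leftover = total_groups - sum(allocation.values())
    ranked = sorted(frequencies, key=frequencies.get, reverse=True)
    for term in ranked[:leftover]:
        allocation[term] += 1
    return allocation


# Invented counts per 100 plaintext units, for illustration only.
counts = {"n": 50, "shi": 30, "owari": 15, "battle engaged": 5}
print(allocate_homophones(counts, 20))
# {'n': 10, 'shi': 6, 'owari': 3, 'battle engaged': 1}
```

With this kind of allocation the frequent elements are spread over many groups, so no single code group stands out in a frequency count.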

Likewise, if code groups were used to indicate a switch to a new chart, this could also have weakened the code unnecessarily. In fact, Yardley specifically mentions it as making the codes easier to cryptanalyze. Generally speaking, substitution systems switch alphabets as often as possible because that provides the best security. Their strength lies in how many alphabets they use and how randomly they switch between them.

So switching charts after every couple of words is not as secure as switching after every word. Also important for security is how the cryptographer switches between the charts. If Hyakutake's system required the code clerk to switch code charts pseudo-randomly, that would provide more security than requiring a set sequence of changes. This is more important if the charts are derived from one another in some predictable manner. If, for example, the plaintext "battle engaged" is aa on chart 1, ab on chart 2, and ac on chart 3, then switching between the charts in order will pose much less difficulty for the cryptanalyst than using the charts in a more random order.
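The aa, ab, ac example can be made concrete with a toy model. The derivation rule here is purely hypothetical (chart k shifts every group by k - 1 places); nothing in the sources says the real charts were built this way. It only shows why predictably derived charts collapse back into a single chart once the rule is guessed.

```python
import string

A = string.ascii_lowercase
CELLS = len(A) ** 2  # number of cells in one 26 x 26 chart


def index_to_group(i):
    """Turn a cell number (0-based) into a two-letter group: row, then column."""
    return A[i // len(A)] + A[i % len(A)]


def group_to_index(group):
    return A.index(group[0]) * len(A) + A.index(group[1])


def derived_group(base_index, chart):
    """Hypothetical rule: chart k shifts every group by k - 1 places."""
    return index_to_group((base_index + chart - 1) % CELLS)


def undo_shift(group, chart):
    """What an analyst who has guessed the rule does: map back to chart 1."""
    return index_to_group((group_to_index(group) - (chart - 1)) % CELLS)


# "battle engaged" sits in cell 0 of chart 1 in this toy model.
print([derived_group(0, chart) for chart in (1, 2, 3)])  # ['aa', 'ab', 'ac']
print(undo_shift("ac", 3))  # aa
```

If the charts are also used in a fixed order, the analyst knows the chart number for every group and the ten charts offer no more protection than one.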

Regular polyalphabetic substitution ciphers often rely on code words to determine alphabet changes. Each letter of the code word references a different alphabet. With the ten charts of Hyakutake's system, a code number would be easy to use for pseudo-random changes: "301934859762" means encode the first word or phrase with the third table, the second word or phrase with the tenth (zeroth) table, etc. The thirteenth word or phrase would be encoded with the third table again. Of course, to give maximum security this code number needs to be changed frequently.
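The code-number scheme described above can be sketched as a short function. The function name is invented for the example, and it assumes, as in the text, that the digit 0 stands for the tenth table and that the number repeats once its digits are used up.

```python
def chart_sequence(code_number, n_units):
    """Return the table number (1-10) to use for each word or phrase.

    Each digit of `code_number` picks a table; 0 means the tenth table.
    The digits repeat when the message is longer than the code number.
    """
    tables = [10 if digit == "0" else int(digit) for digit in code_number]
    return [tables[i % len(tables)] for i in range(n_units)]


print(chart_sequence("301934859762", 13))
# [3, 10, 1, 9, 3, 4, 8, 5, 9, 7, 6, 2, 3]
```

The thirteenth entry is 3 again because the twelve-digit number has wrapped around, which is why a short or rarely changed code number gives the cryptanalyst a repeating pattern to work with.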

Unfortunately, there is no information on how tables were changed except for Yardley's vague "until all ten had been used in the encoding of a single message," quoted above. This says nothing about the order in which the charts were used.
