Chapter VIII


Case 8. The Playfair cipher. This is the English military field cipher; as the method is published in English military manuals and as it is a cipher of proven reliability, it may be met with in general cipher work. The Playfair cipher operates with a key word; two letters are substituted for each two letters of the text.

The Playfair cipher may be recognized by the following points: (a) It is a substitution cipher, (b) it always contains an even number of letters, (c) when the cipher is divided into groups of two letters each, no group consists of the repetition of the same letter, as SS or BB, (d) there will be recurrence of pairs throughout the message, following, in a general way, the frequency table of digraphs or pairs, (e) in short messages there may be recurrence of cipher groups representing words or even phrases, and these will always be found in long messages.

In preparing a cipher by this method, a key word is chosen by the correspondents. A large square, divided into twenty-five smaller squares, is constructed as shown below and the letters of the key word are written in, beginning at the upper left hand corner. If any letter recurs in the key word, it is only used on the first occurrence. The remaining letters of the alphabet are used to fill up the square. It is customary to consider I and J as one letter in this cipher and they are written together in the same square.

If the key word chosen is LEAVENWORTH, then the square would be constructed as follows:

L E A V N
W O R T H
B C D F G
IJ K M P Q
S U X Y Z
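
The construction just described can be sketched in a few lines of Python. This is only an illustration; the function name build_square is an assumption of the sketch, and I and J are merged by storing I alone.

```python
import string

def build_square(key_word):
    """Return the square as 5 rows of 5 letters; J is folded into I."""
    letters = []
    # Key word first, then the rest of the alphabet; repeats are skipped.
    for ch in key_word.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in letters:
            letters.append(ch)
    return [letters[r * 5:(r + 1) * 5] for r in range(5)]

for row in build_square("LEAVENWORTH"):
    print(" ".join("IJ" if c == "I" else c for c in row))
```

Run with the key word LEAVENWORTH, the sketch prints the same five rows as the square above.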

The text of the message to be sent is then divided up into groups of two letters each, and equivalents are found for each pair.

Every pair of letters in the square must be: Either (1) in the same vertical line. Thus in the above example each letter is represented in cipher by that which stands next below it, and the bottom letter by the top one of the same column; for instance, TY is represented by FV.

Or (2) in the same horizontal line. Each letter in this case is represented by that which stands next on its right, and the letter on the extreme right by that on the extreme left of the same horizontal line with it; for instance RH is represented by TW.

Or (3) at opposite corners of a rectangle. Each letter of the pair is represented by the letter in the other corner of the rectangle in the same horizontal line with it; for instance TS is represented by WY.

If, on dividing the letters of the text into pairs, it is found that a pair consists of the same letter repeated, a dummy letter, as X, Y, or Z, should be introduced to separate the similar letters.

If the message to be sent were “The enemy moves at dawn,” it would be divided into pairs:

TH EX EN EM YM OV ES AT DA WN
and enciphered: HW AU AL AK XP TE LU VR MR HL

The message is then broken up into groups of five letters for transmission.
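
The three rules and the handling of doubled letters can be sketched as follows. The function names are assumptions of this sketch. So is the choice of X as the only dummy letter, and so is the padding of a final odd letter with the dummy, which the text does not discuss.

```python
import string

def build_square(key_word):
    """Flat list of 25 letters; index = 5 * row + column. J is folded into I."""
    letters = []
    for ch in key_word.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in letters:
            letters.append(ch)
    return letters

def split_pairs(text, dummy="X"):
    """Divide the text into pairs, separating doubled letters with a dummy."""
    letters = ["I" if c == "J" else c
               for c in text.upper() if c in string.ascii_uppercase]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != a:
            pairs.append(a + letters[i + 1])
            i += 2
        else:
            # Doubled letter, or a single letter left over (assumed padding).
            pairs.append(a + dummy)
            i += 1
    return pairs

def encipher_pair(square, pair):
    ra, ca = divmod(square.index(pair[0]), 5)
    rb, cb = divmod(square.index(pair[1]), 5)
    if ca == cb:        # rule (1): same vertical line, take the letter below
        ra, rb = (ra + 1) % 5, (rb + 1) % 5
    elif ra == rb:      # rule (2): same horizontal line, take the letter on the right
        ca, cb = (ca + 1) % 5, (cb + 1) % 5
    else:               # rule (3): rectangle, same row, other corner
        ca, cb = cb, ca
    return square[5 * ra + ca] + square[5 * rb + cb]

def encipher(key_word, text):
    square = build_square(key_word)
    cipher = "".join(encipher_pair(square, p) for p in split_pairs(text))
    # Break into groups of five letters for transmission.
    return " ".join(cipher[i:i + 5] for i in range(0, len(cipher), 5))

print(encipher("LEAVENWORTH", "The enemy moves at dawn"))
# prints: HWAUA LAKXP TELUV RMRHL
```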

To decipher such a cryptogram, (knowing the key word), the receiver divides it into pairs, and from his table finds the equivalent of these pairs, taking the letter immediately above each, when they are in the same vertical line; those immediately on the left, when in the same horizontal line; and those at opposite angles of the rectangle when this is formed.
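
Deciphering reverses the first two rules and repeats the third unchanged. A minimal sketch, with function names that are assumptions of the sketch; dummy letters are left in the output for the reader to strike out.

```python
import string

def build_square(key_word):
    """Flat list of 25 letters; index = 5 * row + column. J is folded into I."""
    letters = []
    for ch in key_word.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in letters:
            letters.append(ch)
    return letters

def decipher_pair(square, pair):
    ra, ca = divmod(square.index(pair[0]), 5)
    rb, cb = divmod(square.index(pair[1]), 5)
    if ca == cb:        # same vertical line: take the letter immediately above
        ra, rb = (ra - 1) % 5, (rb - 1) % 5
    elif ra == rb:      # same horizontal line: take the letter on the left
        ca, cb = (ca - 1) % 5, (cb - 1) % 5
    else:               # rectangle: opposite corners, as in enciphering
        ca, cb = cb, ca
    return square[5 * ra + ca] + square[5 * rb + cb]

def decipher(key_word, cipher):
    square = build_square(key_word)
    letters = ["I" if c == "J" else c
               for c in cipher.upper() if c in string.ascii_uppercase]
    pairs = zip(letters[0::2], letters[1::2])
    return "".join(decipher_pair(square, a + b) for a, b in pairs)

print(decipher("LEAVENWORTH", "HWAUA LAKXP TELUV RMRHL"))
# prints: THEXENEMYMOVESATDAWN
```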

It is evident, from the foregoing description, that any letter of the plain text may be represented in cipher by one of five letters, viz: The one next below it and the other four letters in the same horizontal line with it in the square. Take, for example, the letter D of the plain text, in combination with each of the other letters of the alphabet. We have, using the key LEAVENWORTH:

DA DB DC DE DF DG DH DI DK DL DM DN DO DP DQ DR DS DT DU DV DW DX DY DZ
MR FC FD CA FG FB GR BM CM BA MX GA CR FM GM MD BX FR CX FA BR MA FX GX

This gives D represented by   B  C  F  G  M
                              4  4  8  4  4  times,

and, connected with these five letters representing D,
we have   A  R  D  M  X  B  C  G
          5  5  2  4  5  1  1  1  times.

Note that these letters are those of the vertical column containing D plus the letters B, C and G, of the horizontal line containing D.
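
The tally for D can be checked mechanically. The sketch below pairs D with each of the other twenty-four letters of the square and counts the cipher letters produced; the helper names are assumptions of the sketch.

```python
from collections import Counter
import string

def build_square(key_word):
    """Flat list of 25 letters; index = 5 * row + column. J is folded into I."""
    letters = []
    for ch in key_word.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in letters:
            letters.append(ch)
    return letters

def encipher_pair(square, pair):
    ra, ca = divmod(square.index(pair[0]), 5)
    rb, cb = divmod(square.index(pair[1]), 5)
    if ca == cb:        # same vertical line
        ra, rb = (ra + 1) % 5, (rb + 1) % 5
    elif ra == rb:      # same horizontal line
        ca, cb = (ca + 1) % 5, (cb + 1) % 5
    else:               # rectangle
        ca, cb = cb, ca
    return square[5 * ra + ca] + square[5 * rb + cb]

square = build_square("LEAVENWORTH")
for_d = Counter()       # cipher letters standing for D itself
with_d = Counter()      # cipher letters standing for D's partner
for other in square:
    if other != "D":
        a, b = encipher_pair(square, "D" + other)
        for_d[a] += 1
        with_d[b] += 1

print(sorted(for_d.items()))
# prints: [('B', 4), ('C', 4), ('F', 8), ('G', 4), ('M', 4)]
print(sorted(with_d.items()))
```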

Lieut. Frank Moorman, U. S. Army, has developed a method for determining the letters which make up the key word in a Playfair cipher. In the first place, a key word necessarily contains vowels in the approximate proportion of two vowels to three consonants and it is also likely that a key word will contain other common letters. This key word is placed in the first row or rows. Now if a table is made, showing what letters in the cipher occur with every letter, it will be found that the letters having the greatest number of other letters in combination with them are very likely to be letters of the key word, or in other words, letters occurring in the first or second lines. An example will make this clear:

Message

DB FN EX TZ MF TO VB QB QT OB XA OF PR TZ EQ RH QK QV DX OK AB PR QI EL TV KE EX XS FS BP WD BO BY BF RO EA BO RH QK QV TX GU EL AB TH TR XN ON EA AY XH BO HN EX BS HR QB ZM SE XP HF GZ UG KC BD PO EA AY XH BO XP HF KR QI AB PR QI EL BX FZ BI SE FX PB RA PR QI WC BR XD YG TB QT EA AY XH BO HN EX BS HR QB PR QI EL BX BT HB QB NF SI SE BX NU XP BU RB XB QR OX BA TB RH BP WD RP RO GU GX QR SE ZY OX BA EL AX CW BY BA SX RK RO PR HB OP BD PI CN OX EM RP KR XT EL AX CW EQ FZ SX EL RH RO PR HB UX DA SE XN ZN GU EL BX FS DG DB TB ZL VE RH BO RQ.

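From this message, we make up the following table, considering the letters of each pair. Each row is headed by the second letter of a pair, and the figures in it belong under the first letters with which that second letter occurs: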

First Letters of Pairs

A B C D E F G H I K L M N O P Q R S T U V W X Y Z
A 3 1 4 1 1
B 3 2 3 1 1 4 1 3 1 1
C 1 1
D 2 2 1
E 1 5 1
F 1 2 1 1 1
G 1 1 1 1
H 5 1 3
I 1 1 5 1
K 1 2 1
L 8 1
M 1 1
N 1 1 2 1 2 1
O 6 1 4 1
P 2 1 2 3
Q 2
R 1 2 2 7 2 1
S 2 2 1
T 1 2 1
U 1 3 1
V 2
W 2
X 1 5 1 4 1 1 3 2 1 1
Y 5 1
Z 2 1 2
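
A table of this kind can be tallied by machine. The sketch below is an illustration only: the function names are assumptions, and ranking letters by the number of different letters they occur with is one plain reading of the criterion, not a statement of Lieut. Moorman's exact procedure. Only the opening pairs of the message are used as sample input.

```python
from collections import Counter, defaultdict

def pair_table(cipher):
    """table[second][first] = number of times the pair first+second occurs."""
    letters = [c for c in cipher.upper() if c.isalpha()]
    table = defaultdict(Counter)
    for first, second in zip(letters[0::2], letters[1::2]):
        table[second][first] += 1
    return table

def rank_by_variety(table):
    """Letters ordered by how many different letters they are paired with."""
    partners = defaultdict(set)
    for second, row in table.items():
        for first in row:
            partners[second].add(first)
            partners[first].add(second)
    return sorted(partners, key=lambda ch: (-len(partners[ch]), ch))

sample = "DB FN EX TZ MF TO VB QB QT OB XA OF PR TZ EQ RH"
table = pair_table(sample)
print(table["Z"]["T"])              # TZ occurs twice in the sample
print(rank_by_variety(table)[:5])   # letters with the most varied partners
```

Applied to the whole message, the letters at the head of the ranking are the candidates for the key word.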

From this table we pick out the letters B, E, F, O, R, T, X, as tentative letters of the key word on account of the variety of other letters with which they occur. As there are but two vowels for seven letters, we will add A to the list on account of its occurrences with B, D, E, R, and X. This leaves the letters for the bottom lines of the square as follows:

. . . . .
. . . C D
G H IJ K L
M N P Q S
U V W Y Z

Referring to the table again we find the most frequent combination to be EL, occurring 8 times, with no occurrence of LE. Now, TH is the commonest pair in plain text, and HT is not common. The fact that H occurs in the same horizontal line with L and that E and T are probably in the key, will lead us to put E in the first line over H and T in the first line over L, so as to make EL equal TH.

The next most frequent combination is PR occurring 7 times, with RP occurring twice. In the square as partially arranged, PR equals M_ or N_ or Q_ or I_. We may eliminate all these except N_, and this N_ could only be NO or NA, so that we will put, tentatively the R in the second line over H and the O and A in the same line over IJ. We have then:

. E . . T
. R AO C D
G H IJ K L
M N P Q S
U V W Y Z

Let us now check this by picking out the combinations beginning with EL and seeing if the table will solve them. We find, ELTV, ELAB, ELBXFZ, ELBXBT, ELAXCWBY, ELAXCWEQ, ELRH, ELBXFS. Now, on the assumption that the letter after EL represents E, we have it represented by A three times, B three times, R once and T once. This requires that A and B be put in the same horizontal line with E, since T is already there, and R is tentatively under E.

The combination ELTV now equals THEZ. If the T were moved one place to the left, it would be THEY, a more likely combination, but this requires the L to be moved one place to the left also, by putting I or K in the key word and taking out O, R or X and returning it to its place in the alphabetical sequence. The most frequent pairs containing O are BO six times, RO four times, and OX three times. Now these pairs equal respectively EN, ES and HE, if O is put between N and P in the fourth line. We will therefore cease to consider it as a letter of the key word. The combination ELAB can only be THE_ on the assumption that A is the first letter to the right of E. The combination ELBX occurs three times. If it represents THE_, the B must be the first letter of the first line and the X must now be placed under E where the R was tentatively put. We can get THE_ out of ELRH by putting R in the first line or leaving it where it is, but the preponderance of the BX combination should suggest the former alternative.

A new square showing these changes will look like this:

B E A T R
. X . . .
G H . L M
N O P Q S
U V W Y Z

As I put in the space under B will give the word BEATRIX and as a vowel is clearly necessary there, we will so use the IJ and leave K between H and L. This leaves C, D and F to be placed. It appeared at first that F was in the key but if it is in the second line, in proximity to the letters of the first line, it will give the same indications. Completing the square then, we have

B E A T R
IJ X C D F
G H K L M
N O P Q S
U V W Y Z

With this square, the message is deciphered without difficulty.

“It is very frequently neces(x)sary to employ ciphers and they have for many centuries been employed in the relations betwe(x)en governments, for com(x)munication betwe(x)en com(x)manders and their subordinates and particularly betwe(x)en governments and their agents in foreign countries; there are many cases in history where the capture of a message not in cipher has made the captors of the message victorious in their military movements.”

It will be seen that the method of Lieut. Moorman enabled us to pick out six letters of the key word out of eight letters chosen tentatively. The reason for the appearance of F has already been noted; the letter O occurred with many other letters because it happened to remain in the same line with N and S and to be under H. It thus was likely to represent any of these three letters which occur very frequently in any text.

Two-character Substitution Ciphers

Case 9. Two-character substitution ciphers. In ciphers of this type, two letters, numerals, or conventional signs are substituted for each letter of the text. There are many ways of obtaining the characters to be substituted but, in general, these ciphers may be considered as special varieties of Case 6 or Case 7. The ciphers which come under this case are not well suited to telegraphic correspondence because the cipher message will contain twice as many letters as the plain text. However, they are so used; an example is at hand in which two numerals are substituted for each letter and this makes transmission by telegraph very slow.

Case 9 can be recognized by some or all of the following points: the number of characters in the cipher is always an even number; often only a few, say five to ten, of the letters of the alphabet appear; either a frequency table for pairs of the cipher text resembling the normal single letter frequency table can be made, or groups of four letters will show a regular recurrence, from which the cipher can be solved as in Case 7.

Case 9-a

Message

RNTGN RAAGR NARNA GTGRA TGAAN NANGG RARAT NAANR NNNRN AAAGG AANGR NGGNN NRNAA AANRA TNANN NGGRN RNNRG TTGRG TGGRN ARNTG NNART GGRNR GRNNT GTGAA NNARN ARNRT TGAGG GAAAA NANNA RNAGA NGNAT NNNAT

This message contains 160 letters and it will be noted that the only letters used are A, G, N, R and T.

We may suspect a simple two-letter substitution cipher at once. It will simplify the work if we divide the cipher into groups of two letters and then, if we find there are 26 or fewer different groups, assign an arbitrary letter to each group and work out the cipher by the method of Case 6.

RN TG NR AA GR NA RN AG TG RA TG AA NN AN GG RA RA TN AA NR NN NR NA AA GG AA NG RN GG NN NR NA AA AN RA TN AN NN GG RN RN NR GT TG RG TG GR NA RN TG NN AR TG GR NR GR NN TG TG AA NN AR NA RN RT TG AG GG AA AA NA NN AR NA GA NG NA TN NN AT

With arbitrary letters substituted, we have

A B C D E F A G B H B D I J K H H L D C I C F D K D M A K I C F D J H L J I K A A C N B O B E F A B I P B E C E I B B D I P F A Q B G K D D F I P F R M F L I S
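
The assignment of arbitrary letters is purely mechanical: each new two-letter group receives the next unused letter of the alphabet in order of first appearance. A minimal sketch, in which the function name relabel is an assumption:

```python
import string

def relabel(cipher):
    """Replace each distinct two-letter group by an arbitrary letter, A first."""
    letters = [c for c in cipher.upper() if c.isalpha()]
    groups = [a + b for a, b in zip(letters[0::2], letters[1::2])]
    labels = {}
    out = []
    for g in groups:
        if g not in labels:
            if len(labels) == 26:
                raise ValueError("more than 26 different groups")
            labels[g] = string.ascii_uppercase[len(labels)]
        out.append(labels[g])
    return " ".join(out), labels

# The first four five-letter groups of the message.
text, labels = relabel("RNTGN RAAGR NARNA GTGRA")
print(text)
# prints: A B C D E F A G B H
```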

Now, preparing a frequency table, with note of prefixes and suffixes we have:

Frequency Prefix Suffix
A 7 1111111 FMKAFF BGKACBQ
B 10 1111111111 AGHNOAPIBQ CHDOEIEBDG
C 6 111111 BDIIAE DIFFNE
D 9 111111111 CBLFKFBKD EICKMJIDF
E 4 1111 DBBC FFCI
F 8 11111111 ECCEPDPM ADDAAIRL
G 2 11 AB BK
H 4 1111 BKJH BHLL
I 9 111111111 DCKJBEDFL JCCKPBPPS
J 3 111 IDL KHI
K 5 11111 JDAIG HDIAD
L 3 111 HHF DJI
M 2 11 DR AF
N 1 1 C B
O 1 1 B B
P 3 111 III BFF
Q 1 1 A B
R 1 1 F M
S 1 1 I
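
A table of frequencies with prefixes and suffixes can be compiled as sketched below. The function name and the layout of the result are assumptions of the sketch; the sample input is the first twenty arbitrary letters of the message.

```python
from collections import defaultdict

def contact_table(text):
    """For each letter: its count, the letters before it, the letters after it."""
    rows = defaultdict(lambda: {"count": 0, "prefix": "", "suffix": ""})
    for i, ch in enumerate(text):
        row = rows[ch]
        row["count"] += 1
        if i > 0:
            row["prefix"] += text[i - 1]
        if i + 1 < len(text):
            row["suffix"] += text[i + 1]
    return rows

sample = "ABCDEFAGBHBDIJKHHLDC"
table = contact_table(sample)
for letter in sorted(table):
    row = table[letter]
    print(letter, row["count"], "1" * row["count"], row["prefix"], row["suffix"])
```

The first letter of the text has no prefix and the last has no suffix, which is why those columns can be one letter short of the count.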

A brief study of this table and the distribution in the cipher leads to the conclusion that B, F and C are certainly vowels and are, if the normal frequency holds, equal to E, O, and A or I. Similarly D and I are consonants and we may take them as N and T. I is taken as T because of the combination IP (=possibly TH) occurring three times. The next letter in order of frequency is A; it is certainly a consonant and may be taken as R on the basis of its frequency. Let us now try these assumptions on the first two lines of the message. We have

R E A N _ O R _ E _ E N T _ _ _ _ _ N A T A O N _ N _
    I                                 I   I

This is clearly the word REINFORCEMENTS and, using the letters thus found, the rest of the line becomes AMMUNITIONAND. We have then the following letters determined:

Arbitrary letters A B C D E F G H I J K L M
Plain Text R E I N F O C M T S A U D

If these be substituted we have for the message:

REINFORCEMENTS AMMUNITION AND RATIONS MUST ARRI_E _EFORE T_E FIFTEENT_ OR _E CANNOT _O_D OUT_.

From this the remainder of the letters are determined:

Arbitrary letters N O P Q R S
Plain text V B H W L X

Now let us substitute the two-letter groups for the arbitrary letters:

Arbitrary letters K O G M B E P C R H D F A J I L N Q S
Two-letter groups GG RG AG NG TG GR AR NR GA RA AA NA RN AN NN TN GT RT AT
Plain text A B C D E F H I L M N O R S T U V W X

It is evident that the cipher was prepared with the letters of the word GRANT chosen by means of a square of this kind:

    G R A N T
G   A B C D E
R   F G H I K
A   L M N O P
N   Q R S T U
T   V W X Y Z

Thus TG=E, AN=S, etc., as we have already found.
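
The use of the square can be sketched as follows. From TG=E and AN=S it appears that the letter heading the column is written first and the letter heading the row second; the sketch assumes that order. Treating J as I is a further assumption, since the square contains no J.

```python
KEY = "GRANT"
ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"   # 25 letters, J omitted as in the square

def encipher(plain):
    out = []
    for ch in plain.upper():
        ch = "I" if ch == "J" else ch     # assumption: J is sent as I
        if ch in ALPHABET:
            row, col = divmod(ALPHABET.index(ch), 5)
            out.append(KEY[col] + KEY[row])   # column letter, then row letter
    return " ".join(out)

def decipher(cipher):
    letters = [c for c in cipher.upper() if c in KEY]
    pairs = zip(letters[0::2], letters[1::2])
    return "".join(ALPHABET[5 * KEY.index(row) + KEY.index(col)]
                   for col, row in pairs)

print(decipher("RNTGN RAAGR NARNA GTGRA TGAAN NAN"))
# prints: REINFORCEMENTS
```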

Case 9-b

Message

1950492958 3123252815 4418452815 2048115041
2252115345 5849134124 5028552526 5933195222
5245113215 6215584143 2861361265 2945565015
2342455850 6345542019 1550185311 2115415828
1124174553 4554205950 2552454132 1533492048
5018152364

An examination of the groups of two numerals each which make up this message, shows that we have 11 to 36 and 41 to 65 with eleven groups missing. Now the 11 to 36 combination is a very familiar one in numeral substitution ciphers (See Case 6-c) and it will be noted that 41 to 66 would give us a similar alphabet. Let us make a frequency table in this form:

Group  Frequency      Group  Frequency
11     11111          41     11111
12     1              42     1
13     1              43     1
14                    44     1
15     111111111      45     111111111
16                    46
17     1              47
18     111            48     11
19     111            49     111
20     1111           50     11111111
21     1              51
22     11             52     1111
23     111            53     111
24     11             54     11
25     111            55     1
26     1              56     1
27                    57
28     11111          58     11111
29     11             59     11
30                    60
31     1              61     1
32     11             62     1
33     11             63     1
34                    64     1
35                    65     1
36     1              66

Each of these tables looks like the normal frequency table except for the position of 20 and 50 which should represent T, by all our rules, and should be apparently 30 and 60. But suppose we put the alphabet and corresponding numerals in this form:

          1 2 3 4 5 6 7 8 9 0
1 or 4    A B C D E F G H I J
2 or 5    K L M N O P Q R S T
3 or 6    U V W X Y Z

Then A=11 or 41, J=10 or 40 and T=20 or 50 as we found. Using the above alphabet, the message may easily be read. Note that this cipher is made up of ten characters only, the Arabic numerals.
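
Reading the message with this alphabet amounts to a table lookup on the two figures of each group. In the sketch below the names ROWS, TENS and decode are assumptions, and any group falling outside the table is shown as a question mark.

```python
ROWS = ["ABCDEFGHIJ", "KLMNOPQRST", "UVWXYZ"]
# First figure of a group -> row of the table (1 or 4, 2 or 5, 3 or 6).
TENS = {"1": 0, "4": 0, "2": 1, "5": 1, "3": 2, "6": 2}

def decode(message):
    digits = "".join(ch for ch in message if ch.isdigit())
    out = []
    for i in range(0, len(digits) - 1, 2):
        tens, units = digits[i], digits[i + 1]
        col = (int(units) - 1) % 10    # figures 1..9 are columns 0..8; 0 is the last column
        row = TENS.get(tens)
        if row is None or col >= len(ROWS[row]):
            out.append("?")            # group not provided for in the table
        else:
            out.append(ROWS[row][col])
    return "".join(out)

print(decode("1950492958 3123252815 4418452815 2048115041"))
# prints: ITISRUMOREDHERETHATA
```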

Case 9-c

Message

1156254676 2542294432 1949294015 1423217211 2979703115
4924213511 7424147875 7646252444 5143254845 3179742533
4055461512 7573227945 1627481511 7042351944 1378252149
2514764553 1548342126 7215254075 1611257845 4642217415
4952197929 7015242143 2925444933 1970187531 4079254829
4551491411 7321171554

An examination of this message shows it to consist of forty-four different two-figure groups running from 11 to 79. Let us prepare a frequency table of these groups.

Group Frequency
11 111111
12 1
13 1
14 1111
15 111111111
16 11
17 1
18 1
19 1111
20
21 1111111
22 1
23 1
24 1111
25 11111111111
26 1
27 1
28
29 111111
30
31 111
32 1
33 11
34 1
35 11
36
37
38
39
40 1111
41
42 111
43 11
44 1111
45 11111
46 1111
47
48 1111
49 111111
50
51 11
52 1
53 1
54 1
55 1
56 1
57
58
59
70 1111
71
72 11
73 11
74 111
75 1111
76 111
77
78 111
79 11111

We at once note the resemblance between the frequency tables for the groups 11 to 19 and 21 to 29; for the groups 30 to 36 and 50 to 56; and for the groups 40 to 49 and 70 to 79. Also the groups 11 to 19 and 21 to 29 have a frequency fitting well with the normal frequency table of the letters A to I; the groups 41 to 49 and 71 to 79 have a frequency fitting well with the normal frequency table of the letters K to S; and the groups 31 to 36 and 51 to 56 have a frequency fitting well with the normal frequency table of the letters U to Z. We have J and T unaccounted for, but note what occurred in Case 9-b and that 40 and 70 would correspond well with T if they followed respectively 49 and 79. We may now make up a cipher table as follows:

          1 2 3 4 5 6 7 8 9 0
1 or 2    A B C D E F G H I J
4 or 7    K L M N O P Q R S T
3 or 5    U V W X Y Z

and this table will solve the cipher message.
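
The comparison of frequency tables that led to this arrangement, and the reading of the message by it, can both be sketched briefly. All names are assumptions of the sketch; decade_profile lists the counts of one decade in the order of the table columns, so that 40 follows 49. Only the first line of the message is used as sample input.

```python
from collections import Counter

ROWS = ["ABCDEFGHIJ", "KLMNOPQRST", "UVWXYZ"]
TENS = {"1": 0, "2": 0, "4": 1, "7": 1, "3": 2, "5": 2}

def groups(message):
    digits = "".join(ch for ch in message if ch.isdigit())
    return [digits[i:i + 2] for i in range(0, len(digits) - 1, 2)]

def decade_profile(counts, tens):
    """Counts for final figures 1..9 and then 0, for groups with one first figure."""
    return [counts[tens + units] for units in "1234567890"]

def decode(message):
    out = []
    for g in groups(message):
        row = TENS.get(g[0])
        col = (int(g[1]) - 1) % 10
        ok = row is not None and col < len(ROWS[row])
        out.append(ROWS[row][col] if ok else "?")
    return "".join(out)

line = "1156254676 2542294432 1949294015 1423217211 2979703115"
counts = Counter(groups(line))
print(decade_profile(counts, "1"))   # compare with decade_profile(counts, "2")
print(decode(line))
# prints: AZEPPELINVISITEDCALAISTUE
```

Decades whose profiles resemble one another are candidates for the same row of the table.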

In ciphers coming under Cases 9-b and 9-c, it is not uncommon to assign some of the unused numbers such as 85, 93, etc., to whole words in common use or to names of persons or places. In case such groups are found, the meaning must be guessed at from the context; but if many messages in the same cipher are available, the meaning of these groups will soon be obtained. The appearance of such odd groups of figures in a message does not interfere materially with the analysis, and it will be apparent at once on deciphering the message that they represent whole words instead of letters.

                                                                                                                                                                                                                                                                                                           
