When an unknown cipher has been put into the substitution class by the methods already described, we may proceed to decide on the variety of substitution cipher which has been used. There are a few purely mechanical ways of solving some of the simple cases of substitution ciphers but, as a general rule, some or all of the following determinations must be made:

1. By preparation of a frequency table for the message we determine whether one or more substitution alphabets have been used and, if one only has been used, this table leads to the solution.

2. By certain rules we determine how many alphabets have been used, if there are more than one, and then isolate and analyze each alphabet by means of a frequency table.

3. If the two preceding steps give no results we have to deal with a cipher with a running key, a cipher of the Playfair type, or a cipher where two or more characters are substituted for each letter of the text.

Some special cases under this third head will be given but, in general, military ciphers of the substitution class will usually be found to come under the first two heads, on account of the time and care required in the preparation and deciphering of messages by the last named methods and the necessity, in many cases, of using complicated machines for these processes.

Case 4-a. Message

OBQFO BPBRP QBAML OBHIF PILFQ FJBOX OFLNR BIXOZ EL

From the recurrence of B, F and O, we may conclude that a single substitution alphabet was used for this message. If so, and if the alphabet runs in the same order and direction as the regular alphabet, the simplest way to discover the meaning of the message is to take the first two words and write alphabets under each letter as follows, until some line makes sense:
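The column-of-alphabets procedure of case 4-a can be sketched in Python. This is only an illustration: the function name `shift_rows` and the 24-letter string `SPANISH` (the regular alphabet with K and W dropped) are assumptions of the sketch, not part of the original method. Row 1 is the cipher itself, and each later row advances every letter one more place.

```python
# Spanish alphabet as used in the text: the regular alphabet without K or W.
# (Assumption of this sketch: exactly these 24 letters, in this order.)
SPANISH = "ABCDEFGHIJLMNOPQRSTUVXYZ"


def shift_rows(cipher, alphabet=SPANISH):
    """Return one row per shift; row n advances every letter n places."""
    letters = [c for c in cipher if c in alphabet]  # drop spaces
    size = len(alphabet)
    rows = []
    for n in range(size):
        row = "".join(alphabet[(alphabet.index(c) + n) % size] for c in letters)
        rows.append(row)
    return rows


MESSAGE_4A = "OBQFO BPBRP QBAML OBHIF PILFQ FJBOX OFLNR BIXOZ EL"
rows = shift_rows(MESSAGE_4A)

# Print the rows numbered from 1, as they would be written out by hand.
for number, row in enumerate(rows, start=1):
    print(f"{number:2d}  {row}")
```

The reader scans the printed rows for the one that makes sense; nothing in the sketch recognizes Spanish words automatically.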
The word RETIRESE occurs in the fourth line and, if the whole message be handled in this way, we find the rest of the fourth line to read USTED POR EL MISMO ITINERARIO QUE MARCHO. The message was enciphered using an alphabet where A = X, B = Y, C = Z, D = A, etc. Note that, as this message is in Spanish, the letters K and W do not appear in the alphabet.

Case 4-b. Message

HUJZH UIUPN OZYTS VQXMI SMOMX MQHUD UMREI SESJU AG

This is a message in Spanish. We will handle it as in case 4-a, setting down the whole message.
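For case 4-b each word is read from a different row of the same table, and the letter standing for A in each of those rows spells the key. The sketch below works in the opposite direction as a check: it assumes the word lengths are already known (in practice they are read off the rows) and uses the key word QUEMADOS that the text reports. The names `decipher_by_word` and `WORD_LENGTHS` are hypothetical.

```python
# Spanish alphabet without K or W, as in case 4-a (assumption of this sketch).
SPANISH = "ABCDEFGHIJLMNOPQRSTUVXYZ"


def decipher_by_word(cipher, word_lengths, key, alphabet=SPANISH):
    """Decipher each word with the shifted alphabet named by one key letter."""
    letters = [c for c in cipher if c in alphabet]  # drop spaces
    size = len(alphabet)
    words, pos = [], 0
    for length, key_letter in zip(word_lengths, key):
        shift = alphabet.index(key_letter)  # plain A was written as key_letter
        chunk = letters[pos:pos + length]
        words.append(
            "".join(alphabet[(alphabet.index(c) - shift) % size] for c in chunk)
        )
        pos += length
    return words


MESSAGE_4B = "HUJZH UIUPN OZYTS VQXMI SMOMX MQHUD UMREI SESJU AG"
# Hypothetical input: word lengths as a solver would find them from the rows.
WORD_LENGTHS = [8, 5, 3, 2, 5, 10, 3, 6]

words = decipher_by_word(MESSAGE_4B, WORD_LENGTHS, "QUEMADOS")
print(" ".join(words))
```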
Here each word of the message comes out on a different line and, noting in each case the letter corresponding to A, we have the word QUEMADOS, which is the key. The cipher alphabet changed with each word of the message.

A variation of this case is where the cipher alphabet changes according to a key word but the change comes every five letters or every ten letters of the message instead of every word. The text of the message can be picked up in this case with a little study.

Note in using case 4 that if we are deciphering a Spanish message we use the alphabet without K or W as a rule, altho if the letters K or W appear in

Case 5-a. Message

DNWLW MXYQJ ANRSA RLPTE CABCQ RLNEC LMIWL XZQTT QIWRY ZWNSM BKNWR YMAPL ASDAN

This message contains K and W and therefore we expect the English alphabet to be used. The frequency of occurrence of A, L, N, R and W has led us to examine it under case 4 but without result. Let us set down the first two words and decipher them with a cipher disk set A to A and then proceed as in case 4.
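The cipher-disk step of case 5-a can be sketched as follows. The disk is modeled here as a reversed alphabet, so that a setting of "A to X" deciphers each letter as (X minus cipher letter) modulo 26. That model and the names `disk_decipher` and `disk_rows` are assumptions of the sketch. Trying all 26 settings is equivalent to deciphering with the disk set A to A and then writing out the columns of alphabets as in case 4.

```python
import string

ENGLISH = string.ascii_uppercase


def disk_decipher(cipher, setting="A"):
    """Decipher with a reversed-alphabet disk: plain = (setting - cipher) mod 26."""
    key = ENGLISH.index(setting)
    return "".join(
        ENGLISH[(key - ENGLISH.index(c)) % 26] for c in cipher if c in ENGLISH
    )


def disk_rows(cipher):
    """Return the decipherment for every disk setting, keyed by setting letter."""
    return {setting: disk_decipher(cipher, setting) for setting in ENGLISH}


MESSAGE_5A = (
    "DNWLW MXYQJ ANRSA RLPTE CABCQ RLNEC LMIWL "
    "XZQTT QIWRY ZWNSM BKNWR YMAPL ASDAN"
)

# Print every setting; the reader looks for the row that makes sense.
for setting, row in disk_rows(MESSAGE_5A).items():
    print(setting, row)
```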
The message is thus found to be enciphered with a cipher disk set A to E and the text is: BRITISH GOVERNMENT PLACED CONTRACTS WITH FOLLOWING FIRMS DURING SEPTEMBER.

Case 5-b. Same as case 4-b except that the cipher message must be deciphered by means of a cipher disk set A to A before proceeding to make up the columns of alphabets. The words of the deciphered message will be found on separate lines, the lines being indicated as a rule by a key word which can be determined as in case 4-b.

The question of alphabetic frequency has already been discussed in considering the mechanism of language. It is a convenient thing to put the frequency tables in a graphic form and to use a similar graphic form in comparing unknown alphabets with the standard frequency tables. For instance the standard Spanish frequency table put in graphic
Our first assumption might be that B = A and F = E but it is evident at once that in that case, S, T, U and V (equal to R, S, T and U) do not occur and a message even this short without R, S, T or U is practically impossible. By trying B = E we find that the two tables agree in a general way very well and this is all that can be expected with such a short message. The longer the message the nearer would its frequency table agree with the standard table.

Note that if a cipher disk has been used, the alphabet runs the other way and we must count upward in working with a graphic table. Note also that if, in a fairly long message, it is impossible to coordinate the graphic table, reading either up or down, with the standard table and yet some letters occur much more frequently than others and some do not occur at all, we have a mixed alphabet to deal with. The example chosen for case 6-a is of this character. An examination of the frequency table given under that case shows that it bears no graphic resemblance to

General Remarks

Any substitution cipher, enciphered by a single alphabet composed of letters, figures or conventional signs, can be handled by the methods of case 6. For example, the messages under case 4-a and 5-a are easily solved by these methods. But note that the messages under case 4-b and 5-b cannot so be solved because several alphabets are used. We will see later that there are methods of segregating the different alphabets in some cases where several are used and then each of the alphabets is to be handled as below.

Case 6-a. Message

QDBYP BXHYS OXPCP YSHCS EDRBS ZPTPB BSCSB PSHSZ AJHCD OSEXV HPODA PBPSZ BSVXY XSHCD

This message was received from a source which makes us sure it is in Spanish. The occurrence of B, H, P and S has tempted us to try the first two words as in case 4 and 5 but without result. We now prepare a frequency table, noting at the same time the preceding and following letter.
This latter proceeding takes little longer than the preparation of an ordinary frequency table and gives most valuable information. Frequency Table
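A table of this kind, giving each letter's count together with its prefixes and suffixes, can be built mechanically. The following is a minimal sketch for the case 6-a message. The function name `contact_table` is hypothetical, and the five-letter groups are treated as one continuous run of letters.

```python
from collections import Counter, defaultdict


def contact_table(cipher):
    """Count each letter and record the letters that precede and follow it."""
    letters = [c for c in cipher if c.isalpha()]  # ignore the group spacing
    counts = Counter(letters)
    before = defaultdict(list)  # prefixes of each letter, in message order
    after = defaultdict(list)   # suffixes of each letter, in message order
    for prev, cur in zip(letters, letters[1:]):
        before[cur].append(prev)
        after[prev].append(cur)
    return counts, before, after


MESSAGE_6A = (
    "QDBYP BXHYS OXPCP YSHCS EDRBS ZPTPB BSCSB "
    "PSHSZ AJHCD OSEXV HPODA PBPSZ BSVXY XSHCD"
)

counts, before, after = contact_table(MESSAGE_6A)
for letter, n in counts.most_common():
    print(letter, n, "prefixes:", "".join(before[letter]),
          "suffixes:", "".join(after[letter]))
```

The prefix and suffix lists are what allow observations such as a letter that is always preceded by the same letter, or a pair that recurs several times.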
It is clear from an examination of this table that we have to deal with a single alphabet but one in which the letters do not occur in their regular order. We may assume that P and S are probably A and E, both on account of the frequency with which they occur and the variety of their prefixes and suffixes. If this is so, then B and H are probably consonants and may represent R and N respectively. D and X are then vowels by the same method of analysis. Noting that HC occurs three times and taking H as N, we conclude that C is probably T. Substitute these values in the last three words of
Now Z is always prefixed by S and may be L. Taking X=I and D=O (they are certainly vowels), V=G and Y=M, we have
Substituting these values in the rest of the message we have
We may now take Q=F, O=D, E=S, R=B, T=C, A=P and J=U and the message is complete. We are assisted in our last assumption by noting that S=E and E=S, etc., and we may on that basis reconstruct the entire alphabet. The letters in parentheses do not occur in the message but may be safely assumed to be correct.
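The reconstruction by reciprocity (S=E and E=S, and so on) can be sketched as below. The dictionary `KNOWN` holds only the cipher-to-plain values stated in the text for case 6-a. The helper names are hypothetical, and the completion rests on the assumption that the alphabet is reciprocal throughout.

```python
# Cipher letter -> plain letter, as determined in the text for case 6-a.
KNOWN = {
    "P": "A", "S": "E", "B": "R", "H": "N", "D": "O", "X": "I",
    "C": "T", "Z": "L", "V": "G", "Y": "M", "Q": "F", "O": "D",
    "E": "S", "R": "B", "T": "C", "A": "P", "J": "U",
}


def complete_reciprocal(known):
    """Extend a key by assuming that if X stands for Y then Y stands for X."""
    full = dict(known)
    for cipher_letter, plain_letter in known.items():
        existing = full.setdefault(plain_letter, cipher_letter)
        if existing != cipher_letter:
            raise ValueError(f"{plain_letter} conflicts with reciprocity")
    return full


def decipher(cipher, key, blank="_"):
    """Apply the key; letters without a known value are shown as blanks."""
    return "".join(key.get(c, blank) for c in cipher if c.isalpha())


MESSAGE_6A = (
    "QDBYP BXHYS OXPCP YSHCS EDRBS ZPTPB BSCSB "
    "PSHSZ AJHCD OSEXV HPODA PBPSZ BSVXY XSHCD"
)

full_key = complete_reciprocal(KNOWN)
plain = decipher(MESSAGE_6A, full_key)
print(plain)
print(sorted(full_key.items()))
```

The pairs added by `complete_reciprocal` are the ones for cipher letters that never occur in the message, so they remain assumptions until further messages confirm them.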
It is always well to attempt the reconstruction of the entire alphabet for use in case any more cipher messages written in it are received.

Case 6-b. Message

Lt. J. B. Smith, Royal Flying Corps, Calais, France.
Graham-White.

The address and signature indicate that this message is in English. There are 250 letters in the cipher; the vowels AEIOU occur 109 times or 43.6%, the letters LNRST occur 62 times or 24.8%, and the letters KQVXZ occur 5 times or 2%. The proportion in the case of the vowels is somewhat too large and, in the case of the letters LNRST, it is too small. It is then questionable whether this is a transposition cipher altho, at first glance, it might appear to be one. On examination for parts of possible words we are at once struck by the occurrence at irregular intervals of recurring groups, viz:
This is a strong indication that the cipher is a substitution cipher, so, to make an examination, a frequency table will be constructed. Frequency Table
Superficially, this looks like a normal frequency table, but O is the dominant letter, followed by H, E, A, T, I, N, S, in the order named. It is certainly Case 6 if it is a substitution cipher at all. Let us see what can be done by assuming O=E; the triplet ENO, occurring six times, might well be THE, with E=T and N=H. A glance at the frequency table shows this to be reasonable. Now substitute these letters in some likely groups. FNOHOENO becomes _HE_ETHE; FTEN becomes __TH; ENOENHO becomes THETH_E; ENOHO becomes THE_E. A bit of study will show that F=W, T=I and H=R and the frequency table bears this out except that H(=R) seems to occur too frequently. The recurring groups containing DAC (see above) occur in such a way that we may be sure DAC is one word, FTRR is another and FTEN(=WITH) is a third. Now FTRR becomes WI__, which can only be completed by a double letter. LL fills the bill and we may say R=L. As DAC starts the message and is followed by FTRR (=WILL) it is reasonable to try DAC=YOU. Looking up DAC in the frequency table it is evident that we strain nothing by this assumption. We now have:
Now take the group ENOUTHOMEAH which occurs twice. This becomes THE_IRE_TOR and if we substitute U=D and M=C we have THE DIRECTOR. Next the group (FTRR)BHAMOOUEA becomes (WILL) _ROCEEDTO and the context gives the word with the missing letter as PROCEED, from which B=P. Next the group (ENO) IZTIETASMOSEOHIEYOCK(FNOHO) becomes (THE)__I_TIO_CE_TER_T_EU_(WHERE) and the group (FTEN)EFAPHOSMNIZTIEAHL becomes (WITH)TWO_RE_CH__I_TOR_. Now the word YOCK = (_EU_) is evidently the name of a place. We find another group containing Y, viz: ENOSTSMAYBISD which becomes THENINCO_PANY so that evidently we should substitute M for Y. The other occurrence of Y (=M) is in the group EAYOEQISU which becomes TOMET_AND. A reasonable knowledge of geography gives us the words MEUX and METZ so that X should be substituted for K and Z for Q. We now have sufficient letters for a complete deciphering of the message.
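The step-by-step substitution used in case 6-b, where unknown letters are left as blanks, can be reproduced with a partial key. This is a minimal sketch; the names `apply_partial_key` and `STAGE_1` to `STAGE_3` are hypothetical, and the key values are only those stated in the text up to each stage.

```python
def apply_partial_key(cipher, key, blank="_"):
    """Replace each cipher letter by its assumed value, or a blank if unknown."""
    return "".join(key.get(c, blank) for c in cipher)


# First assumption: O=E, and the triplet ENO read as THE.
STAGE_1 = {"O": "E", "E": "T", "N": "H"}
# After F=W, T=I, H=R, R=L and DAC=YOU.
STAGE_2 = {**STAGE_1, "F": "W", "T": "I", "H": "R", "R": "L",
           "D": "Y", "A": "O", "C": "U"}
# After U=D and M=C (THE DIRECTOR).
STAGE_3 = {**STAGE_2, "U": "D", "M": "C"}

for group in ("FNOHOENO", "ENOENHO", "ENOHO"):
    print(group, "->", apply_partial_key(group, STAGE_1))

print(apply_partial_key("ENOUTHOMEAH", STAGE_2))
print(apply_partial_key("BHAMOOUEA", STAGE_3))
print(apply_partial_key("IZTIETASMOSEOHIEYOCK", STAGE_3))
```

Keeping each stage as a separate dictionary makes it easy to withdraw an assumption that later proves wrong without disturbing the earlier ones.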
The message deciphers: YOU WILL PROCEED TO THE AVIATION CENTER AT MEUX WHERE THE DIRECTOR HAS _EEN ORDERED TO FURNISH YOU WITH A HI_H POWER _LERIOT AEROPLANE. YOU WILL THEN IN COMPANY WITH TWO FRENCH AVIATORS ASSI_NED _Y THE DIRECTOR PROCEED TO METZ AND DESTROY THE THREE ZEPPELINS REPORTED PREPARIN_ THERE FOR A RAID ON PARIS. The substitution of B for G, G for W and K for V completes the cipher.

This cipher is difficult only because the cipher alphabet is made up, not haphazard, but scientifically with proper consideration for the natural frequency of occurrence of the letters. In cipher work it is dangerous to neglect proper analysis and jump at conclusions. In the study of Mexican substitution ciphers, several alphabets have been found which are made up, in a general way, like the one discussed in this case.

Case 6-c. It is a convenience in dealing with ciphers made up of numbers or conventional signs to substitute arbitrary letters for the numbers and signs. Suppose we have the message:
By arbitrary substitution of letters this is made
This message is now in convenient shape to handle as Case 6-a and on solution is found to read: ALL PERSONS HAVE BEEN ORDERED TO LEAVE FORTIFIED AREA. In the same way the message
is found to be made up entirely of numbers between 11 and 36 with the numbers 23, 28 and 30 occurring most frequently. This immediately suggests an alphabet made up of the numbers from 11 to 36 inclusive and each cipher group of figures represents two letters. By arbitrary substitution of letters for groups of two numbers we obtain:
and this message is also in shape to handle as Case 6-a. It reads, on solution, SEVEN HUNDRED MEN LEFT YESTERDAY FOR POINTS ON LOWER RIO GRANDE.
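The arbitrary substitution of letters for two-figure groups in case 6-c can be sketched as follows. The function name `pairs_to_letters` and the sample digits are hypothetical. The assignment 11=A, 12=B, ... 36=Z is merely a convenient labeling for frequency work, not the solution of the cipher.

```python
import string


def pairs_to_letters(groups, low=11):
    """Relabel each two-figure number (low .. low+25) with an arbitrary letter."""
    digits = "".join(ch for ch in groups if ch.isdigit())  # drop spaces
    if len(digits) % 2:
        raise ValueError("odd number of figures; cannot split into pairs")
    numbers = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    labels = []
    for n in numbers:
        if not low <= n < low + 26:
            raise ValueError(f"{n} lies outside {low}..{low + 25}")
        labels.append(string.ascii_uppercase[n - low])
    return "".join(labels)


# Hypothetical groups, not taken from the message in the text.
SAMPLE = "2311 3630"
print(pairs_to_letters(SAMPLE))
```

The relabeled text can then be treated like any single-alphabet cipher, for instance with a frequency table as in case 6-a.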