Crypto Series: Vigenère’s Cipher (2)
As I promished, we're gonna see a different method to obtain the key length used to encrypt a text using Vigenère's algorithm. This is a method somewhat more difficult to understand than Kasiski's method, since it requires some mathematical analysis to obtain the recipe.
Friedman's test or the incidence of coincidences
This method, discovered by William F. Friedman in the 1920s, is based on computing the index of coincidences of the cryptogram's letters. The idea is that for two random letters from the cryptogram to be the same, there is a possibility that they were also the same in the original plaintext if the number of letters they have in between is a multiple of the key length.
Basically, we'll take the X first letters of the cryptogram and the X last letters, and count the number of coinciding letters in the same position. Finally, we'll divide this number by the number of letters taken and then we will have the index of coincidence.
Considering a source providing independent characters with the frequency distribution of English, and uniformly distributed characters for the key (i.e. all letters with the same frequency, 1/26 for the English alphabet), we have that:
- The probability that any two letters are the same is approximately 0.0385 when X is not a multiple of the key length
- The probability that any two letters are the same is approximately 0.0688 when X is a multiple of the key length.
So, with this process wi can determine that high values for the index of coincidence will mean that the shifted distance X is a multiple of the key length, and this way we will determine the most likely key length.
Let's see how we get these probabilities, so that we are able to obtain them in case of having a language different than English. We simply have to consider that for any two ciphertext characters and
to coincide, the following relation must hold:
Then, we consider two different cases: if L divides i-j, then , since in that case we have that
. So, the probability for this case is:
However, when i-j is not a multiple of L, then for the two ciphertext characters to be equal the following equation needs to hold
But as we said before, the distribution of key characters is uniform, and therefore this probability is:
That's it for today. This time there is no example, but stay tuned cause we'll see an exercise soon 🙂
And I hope next week I'm able to post some practical exercise using Cryptool to analyze a Vigenère cipher or something alike.
April 12th, 2009 - 04:49
Deberes!! 😉