Limited Entropy Dot Com Not so random thoughts on security featured by Eloi Sanfèlix

9Mar/095

Crypto Series: Classical ciphers

During some posts we're gonna get introduced into classical ciphers. From Wikipedia, "a classical cipher is a type of cipher used historically but which now have fallen, for the most part, into disuse".

This post will study one of the most known classical ciphers, the Caesar cipher, and other similar ciphers.

Caesar Cipher

Caesar's cipher, named after Julius Caesar, is a substitution cipher that simply substitutes each letter by the letter K positions to the right in the alphabet. So, for a K value of 3, A would be encrypted as D, B as E, C as F and so on.

In mathematical terms, considering an alphabet with 26 letters, where A would be letter 0 and Z letter 25, we can define these encryption and decryption operations as:

E_k(m) = (m + k) mod{26}

D_k(c) = (m - k) mod{26}

Where mod{26} means reducing the result modulo 26, or in simpler terms, if the result is above or below the 0-25 range, we would add/subtract 26 as many times as needed to make it fall into this range.

As you can see, very simple. For instance, if we encrypt the sentence CRIPTOGRAFIA PARA TODOS under key 5, we get the following ciphertext: HWNUYTLWFKNF UFWF YTITX.

Here we can already see one of the weaknesses of this cipher: the structure of the plaintext remains. As you can see, last word in the message starts with a Y, then it has two T's, one I and one X. Therefore, we know that this word has the second and fourth letter identical. Also second and fourth letters are identical in the second word, but different to the ones in the last word.

This, in a large text and within a context, could lead us to decipher great part of the text. For instance, knowing that it's a text about information security, we can try to find words with the same structure as security or information and map these letters for all the text. With this, we would have parts of other words, and with some luck we would be able to obtain more letters by guessing those words. Continuing like this, at the end we would have the complete text.

Another tool that allows us to easily analyse this kind of ciphers is frequency analysis, which we mentioned previously. If we take a text encrypted using this system and count the number of appearances of each letter, and then obtain (or generate) a table of relative frequencies for the target language, we can match the most frequent letter in the ciphertext and the most frequent letter in the target language.

Then, since the same shift is applied to all the letters, we would have the key and would be able to obtain the complete message. In case of getting a non-sense message, we could try with the second most frequent letter instead of the first one. Since it's a statistical analysis, it's possible that the character distribution in our text doesn't completely match the original distribution, but will certainly be similar.

Simple substitution ciphers

Caesar's cipher we just analysed is one of the so-called simple substitution ciphers, which always substitute each symbol of the input alphabet by a given symbol of the output alphabet.Besides Caesar's cipher, Atbash cipher is another quite famous substitution cipher, where each the alphabet is inverted: A->Z, B->Y, ... Y->B, Z->A.

But not only these two simple substitution ciphers exist. We can create any modification of the input alphabet as output alphabet. Even then, all these ciphers suffer from the same problem: the structure is maintained and they are quite easy to break using frequency analysis and word matching.

Example: Breaking a simple substitution cipher

This time I encrypted an English text. This is how the ciphertext looks like:

ZL VAGRERFGF NOBHG FRPHEVGL ERYNGRQ GBCVPF UNIR QEVSGRQ N YVGGYR OVG, ZBIVAT SEBZ CHER FBSGJNER NAQ ARGJBEXVAT FRPHEVGL GB PELCGBTENCUL NAQ CENPGVPNY NGGNPXF BA PELCGBTENCUVP VZCYRZRAGNGVBAF, YVXR FVQR PUNAARY NANYLFVF NGGNPXF.

VA GUVF EROBEA OYBT V JVYY GEL GB VAGEBQHPR GUR ERNQREF VAGB GURFR GBCVPF JVGUBHG TRGGVAT VAGB GBB PBZCYRK ZNGUF. GUR NVZ VF GB CEBIVQR NA HAQREFGNAQVAT BS PELCGBTENCUL JVGUBHG UNIVAT ERNQREF YBFG BA ZNGURZNGVPNY PBAPRCGF. LBH JVYY GRYY JURGURE V NPUVRIR GUVF TBNY BE ABG.

Looks pretty complicated, doesn't it? Let's see how to approach this example, assuming this is a simple substitution cipher. First of all, we're gonna count how many times appears each letter, and then divide it by the total number of letters. I've done it with this simple program I quickly coded, although it's possible to do it with Cryptool but I don't have it available right now.

Once it's done, we sort it by frequency. For instance, copy-pasting the output of the program into a spreadsheet in Google Docs and pressing order by the corresponding column. The top 3 letters are:

G     52    0.124402

R     38    0.090909

V     37    0.088517

So, we go to a frequency table for English (here) and see that E is the most frequent letter in this language. Now we subtract 'G'-'E'=7. If we apply this key using a Caesar's cipher, we just get garbage. However, if we take 'R' as 'E, then 'R'-'E'=13. Deciphering using Caesar's cipher, we get:

MY INTERESTS ABOUT SECURITY RELATED TOPICS HAVE DRIFTED A LITTLE BIT, MOVING FROM PURE SOFTWARE AND NETWORKING SECURITY TO CRYPTOGRAPHY AND PRACTICAL ATTACKS ON CRYPTOGRAPHIC IMPLEMENTATIONS, LIKE SIDE CHANNEL ANALYSIS ATTACKS.

IN THIS REBORN BLOG I WILL TRY TO INTRODUCE THE READERS INTO THESE TOPICS WITHOUT GETTING INTO TOO COMPLEX MATHS. THE AIM IS TO PROVIDE AN UNDERSTANDING OF CRYPTOGRAPHY WITHOUT HAVING READERS LOST ON MATHEMATICAL CONCEPTS. YOU WILL TELL WHETHER I ACHIEVE THIS GOAL OR NOT.

Much more readable 🙂 Don't you recognize it? Look at http://www.limited-entropy.com/en/about 🙂

We've decrypted the text, although not at our first try, but at our second. Another option would have been using 'G' as 'T', since T is the second most frequent letter in English. The result is exactly the same.

However, facing an unknown transformation, we would have been to play with other hints besides frequency analysis. For instance, we could use the fact that we expected to see CRYPTOGRAPHY in the text, and assign this word to the only word in the ciphertext that has the same letter in the third and the last position. Then, we would substitute all its letters in the ciphertext and would see if it makes any sense.

From there, we just need to continue on guessing letters... kind of a puzzle 🙂

That's it for today, I hope you're liking it :). Questions and comments are more than welcome!

Posted by Eloi Sanfèlix

Comments (5) Trackbacks (0)
  1. ¿Recuerdas que en la competición de la campus hubo un ATBASH? es super sencillo pero nos costó caer. En efecto, salía la ‘e’ como letra más frecuente y era clarísimamente el histograma invertido en las abscisas.

  2. Hola, soy nuevo leyendo este blog, y me interesa mucho el tema de criptografia.

    Encontre en la Wikipedia el Cifrado Cesar (http://es.wikipedia.org/wiki/Cifrado_C%C3%A9sar), y en el un codificador del cifrado cesar en linea, aunque lo probe con el ejemplo y funciona (para cifra y descifrar).
    y esta hecho en javascript.

    Saludos y gracias….

  3. Hola,

    @vierito5
    Si que me acuerdo sí. Es una de las razones por las que está ahí jejeje

    @Andrex
    Puede que en el ejemplo funcione para ambos lados, porque es justo 13 que está a mitad del alfabeto y 13 a izquierda es lo mismo que 13 a derechas. Al menos a mí no me iba para el ejemplo de ‘CRIPTOGRAFIA PARA TODOS’.

    Hay otros sitios donde puedes encontrar javascripts para este cifrado, por ejemplo en http://www.yellowpipe.com/yis/tools/encrypter/index.php.

    Este lo puedes probar para hacer fuerza bruta a un cifrado Cesar, pues 26 combinaciones se hacen en un momento en un PC jeje

    Saludos

  4. Hola.., vaya, menuda coincidencia, eso pasa por no probar con otro. Y otra que estuba despistado fue “Julius Caesar” == “Julio Cesar” y en mi pagina tambien el script.
    http://andrexweb.biz/sneak/index.php

    Saludos y gracias….

  5. I constantly visit your site! Thanks for all the perfect articles.


Leave a comment

No trackbacks yet.