Tuesday, August 25, 2015

What I learned from cracking 4000 Ashley Madison passwords

The Plan

When the Ashley Madison database first got dumped, there was an interesting contingent of researchers talking about how pointless it would be to crack the passwords, since Ashley Madison was using salted bcrypt with a cost of 12.  I thought it might be a fun experiment to run the hashes on a cracking rig of mine to see what I could actually get out of it.

The Rig

My cracking rig is your typical milk carton style setup, as seen on Silicon Valley.


It was originally purchased a couple years ago as a cryptocoin mining rig for about $1500 in bitcoins.  Pretty much just a stack of four AMD R9 290s running PIMP (http://getpimp.org).  PIMP is a Debian-based USB boot environment designed for plug-and-play cryptocoin mining.  I find it super convenient to use because it deals with all the GPU driver BS, and by default gives you the most optimized setup for whatever cards you're using.  To give it the magical cracking powers, I dropped the most recent oclHashcat in there as well.

The Procedure

So, the data showed up in the form of a mysqldump.

gunzip member_login.dump.gz
tr , '\n' < member_login.dump > tmp.txt # switching commas for newlines
grep "\$2a" tmp.txt > tmp2.txt # grepping out hashes
tr -d "\'" < tmp2.txt > am.txt # removing single quotes
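The four steps above can probably be collapsed into a single pass.  This is only a rough sketch, not what was actually run: it assumes every hash in the dump is a standard 60-character bcrypt string starting with $2a$, and that your grep supports -o (GNU and BSD grep both do).  The extract_hashes name is made up for illustration.

```shell
# extract_hashes FILE: print every bcrypt hash found in FILE, one per line.
# Assumed format: "$2a$", a two-digit cost, "$", then 53 characters of
# bcrypt's base64 alphabet (22 for the salt, 31 for the digest).
extract_hashes() {
    grep -oE '\$2a\$[0-9]{2}\$[./A-Za-z0-9]{53}' "$1"
}

# Hypothetical usage:
#   extract_hashes member_login.dump > am.txt
```

Matching the whole hash instead of splitting on commas means stray quotes and neighboring columns never make it into the output in the first place.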


Holy cow, this thing has 36 *million* hashes in it.  The leak was still pretty new at the time, so I hadn't realized how many accounts were actually in the dump, but just the file of hashes themselves was 2.1 gigs.  I moved the file over to the cracking rig and ..

oclHashcat v1.36 starting...
Counting lines in am.txt
ERROR: Insufficient memory available


Huh.  Well that sucks.  I use head to grab the first million lines, and the thing fires up just great.  This is my first time seeing the real benchmarks for the crack, and it looks something like this:

Speed.GPU.#1...: 39 H/s
Speed.GPU.#2...: 39 H/s
Speed.GPU.#3...: 39 H/s
Speed.GPU.#4...: 39 H/s
Speed.GPU.#*...: 156 H/s


Yes, that's right, 156 hashes per second.  To someone who's used to cracking MD5 passwords, this looks pretty disappointing, but it's bcrypt, so I'll take what I can get.  I start seeing how many hashes I can do at a time.  I double to 2 million hashes, then 4, and finally I see the Insufficient Memory error again when I hit 8 million hashes.  I drop it down to 6 million, and that seems to work just fine.  So my final list of hashes to crack is the first 6 million hashes from the database dump, saved as am2.txt.
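To put 156 H/s in perspective: every hash has its own salt, so a single candidate word has to be hashed once per target.  A quick back-of-the-envelope calculation, using only the numbers above (the seconds_per_candidate helper is just for illustration):

```shell
# seconds_per_candidate HASHES RATE: whole seconds needed to try ONE
# candidate password against every salted hash at RATE hashes per second.
seconds_per_candidate() {
    echo $(( $1 / $2 ))
}

seconds_per_candidate 6000000 156   # prints 38461, i.e. roughly 10.7 hours
```

So each word from the wordlist costs the better part of half a day to test against all 6 million hashes.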

My final command looks like this:

./oclHashcat32.bin -m3200 -a0 am2.txt rockyou.txt --force --weak-hash-threshold 0

This is just a super basic -a0 (straight wordlist) attack using the famous rockyou.txt wordlist.  I also set a script to take a snapshot of how many passwords I had cracked every 10 minutes.
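The snapshot script itself isn't shown here, so the following is only a guess at what one could look like.  It assumes the potfile gets one line per cracked hash; the log_crack_count name and the "timestamp count" log format are invented for this sketch, and the file names in the usage comment are borrowed from the data files listed at the end of this post.

```shell
# log_crack_count POTFILE LOGFILE: append "<unix time> <lines in POTFILE>"
# to LOGFILE. Assumes one potfile line per cracked hash.
log_crack_count() {
    count=$(wc -l < "$1" | tr -d ' ')
    printf '%s %s\n' "$(date +%s)" "$count" >> "$2"
}

# Hypothetical usage, left running next to the cracker (600 s = 10 minutes):
#   while true; do log_crack_count am.pot am-checkpoints.txt; sleep 600; done
```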

And now.. we wait..


The Data

So, after five days and three hours, I hit 4000 passwords, which I figured was a good time to stop.  I pulled the 10 minute snapshots together, and as it turns out, this is what the final graph of cracks over time looked like:


Now, of course what immediately jumped out at me was how insanely linear this is.  It comes to about 32.6 passwords cracked per hour.  I had expected the curve to shoot up and level off over time as passwords became rarer.  This could be because I was still in the "dumb password" phase, but it's hard to tell.  It may not look like it at first, but there are 741 data points in this graph.
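The rate is easy to sanity-check from the totals: 4007 cracks over 5 days and 3 hours (123 hours), with 741 snapshots taken 10 minutes apart.  A small sketch of that arithmetic (cracks_per_hour is a made-up helper name):

```shell
# cracks_per_hour CRACKED HOURS: average cracks per hour, one decimal place.
cracks_per_hour() {
    LC_ALL=C awk -v c="$1" -v h="$2" 'BEGIN { printf "%.1f\n", c / h }'
}

cracks_per_hour 4007 123      # prints 32.6
echo $(( 741 * 10 / 60 ))     # prints 123: hours covered by the snapshots
```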

Some interesting numbers: of the 4007 cracked passwords in the final list, only 1191 were unique.  Dropping the list of cracked passwords into http://www.textfixer.com/tools/online-word-counter.php, we get a nice list of the most popular passwords cracked so far.  Here's the top 20 for your amusement:

123456 202
password 105
12345 99
qwerty 32
12345678 31
ashley 28
baseball 27
abc123 27
696969 23
111111 21
football 20
fuckyou 20
madison 20
asshole 19
superman 19
fuckme 19
hockey 19
123456789 19
hunter 18
harley 18

Conclusion

So, maybe these passwords were all throwaways.  It may also be infeasible to crack any given bcrypt password, but given enough users, it doesn't matter if passwords are bcrypted and salted; a ton of passwords are eventually going to pop out.

This is the goodbye message I got when I stopped the crack.

Session.Name...: oclHashcat
Status.........: Aborted
Input.Mode.....: File (rockyou.txt)
Hash.Target....: File (am2.txt)
Hash.Type......: bcrypt, Blowfish(OpenBSD)
Time.Started...: Thu Aug 20 11:40:32 2015 (5 days, 3 hours)
Time.Estimated.: 0 secs
Speed.GPU.#1...: 39 H/s
Speed.GPU.#2...: 39 H/s
Speed.GPU.#3...: 39 H/s
Speed.GPU.#4...: 39 H/s
Speed.GPU.#*...: 156 H/s
Recovered......: 4007/6000000 (0.07%) Digests, 4007/6000000 (0.07%) Salts
Progress.......: 60396544/86002302412928 (0.00%)
Rejected.......: 0/60396544 (0.00%)
Restore.Point..: 0/14343296 (0.00%)

As you can see, the crack still had quite a ways to go when I aborted it.
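Just how far is "quite a ways"?  The Progress total in the log happens to equal the wordlist size times the number of still-uncracked salts, and dividing what's left by 156 H/s gives a rough finish time.  This is a sketch using only the figures from the log above, assuming the speed stays constant; years_remaining is a made-up helper name.

```shell
# Wordlist entries times uncracked salts reproduces the Progress total.
echo $(( 14343296 * (6000000 - 4007) ))   # prints 86002302412928

# years_remaining TOTAL DONE RATE: years left at a constant RATE (hashes/s).
years_remaining() {
    LC_ALL=C awk -v t="$1" -v d="$2" -v r="$3" \
        'BEGIN { printf "%.0f\n", (t - d) / r / 86400 / 365.25 }'
}

years_remaining 86002302412928 60396544 156   # on the order of 17,000 years
```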


All my data from this study can be found here:

am-checkpoints.txt : log of passwords cracked every 10 minutes
am-freq.txt : frequency count of cracked passwords
am-pass.txt : final list of cracked passwords
am-sorted.txt : list of passwords, sorted alphabetically
am.pot : oclHashcat potfile generated by the crack

thanks everyone!  you can follow me on the tweeters @deanpierce


Errata 

/u/rallias has pointed out that uniq -c will do a frequency count, and I'm dumb for using a website :-)
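For anyone wanting to reproduce the top 20 that way, here is a minimal sketch.  It assumes the input has one cracked password per line (as am-pass.txt is described above); top_passwords is just a name picked for this example.

```shell
# top_passwords FILE [N]: the N (default 20) most frequent lines in FILE,
# most common first, each prefixed with its count.
top_passwords() {
    sort "$1" | uniq -c | sort -rn | head -n "${2:-20}"
}

# Hypothetical usage:
#   top_passwords am-pass.txt
```

The first sort is needed because uniq only collapses adjacent duplicate lines.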