Monday, April 11, 2016

The Enslaved Oracle, and the Future of Superintelligence

The emergence of Superintelligent AI is coming, but what form will it take?  One potential avenue for this emergence that I feel is under-discussed is that of the "Enslaved Oracle".  In this scenario, the first few Artificial General Intelligence lifeforms are created in secret, and are used by their creators to gain a tactical advantage in their respective fields.

The primary appeal of this scenario is that it could be happening today.  It is very possible that researchers at Google, Goldman Sachs, or even the NSA have cracked the key barriers in general intelligence, and have successfully created and bound highly intelligent artificial lifeforms.  Presumably they would have "read-only" access to the internet, and would be given the sole task of predicting future events for the benefit of their respective organizations.

Of course, this *probably* isn't happening today, but more and more companies are utilizing machine learning and predictive modeling, and you could imagine that the most successful programs would generate the most profit, which would lead to an increase in investment in predictive modeling, creating a cascading effect towards Superintelligence.

In most Superintelligence breakout scenarios, the AI ends up hacking into and using the world's existing computing infrastructure to achieve ultimate enlightenment, but there is a key gap there between science fiction and reality.  As it turns out, specially designed neuromorphic processing hardware is many orders of magnitude more power efficient than modern computing platforms for the purpose of AI simulation.  This means that with the right architecture, by investing even a few million dollars in a neuromorphic computing platform, you could outperform the combined power of every computer on the planet.  This leads me to believe that early superintelligent AI are likely to be centralized, and will not easily migrate away from the cyber primordial soup that they were birthed in.

Democratizing Superintelligence

Is the Enslaved Oracle Scenario an unequivocally "bad thing" for the future of humanity?  I'm not too worried.  For some reason I have this faith that even an oligarchy of Superintelligent AI controlling humanity through corporate puppetry will still have a level of benevolence above and beyond what we are seeing from our existing human leadership.  That said, I am not a huge fan of technological disparity, so I've also thought through several scenarios where the emergence of such technologies occurs in a more populist way.

As an avid follower of all things crypto, two new technologies that I'm a huge fan of are the Ethereum computing platform, which offers decentralized and cryptographically verified "smart contracts", and Augur, built on Ethereum, which is a decentralized prediction market.

How do these technologies play into the future of superintelligence?  Well, I think that there's a very good chance that these technologies, or others like them, could be the key to monetizing the hobbyist neuromorphic processor market.  Much the same way that custom ASIC companies exploded selling customized ASICs for Bitcoin mining, it's very possible that in the near future, eager crypto miners will be running neuromorphic mining rigs in an attempt to win big on these emerging prediction markets.

Imagine asking a question, in the form of publishing a contract to a prediction market.  You could ask what the weather would be like next week, which real estate investment would be more profitable in 10 years, what protein structures would be best for fighting some new disease outbreak, and you would instantly have a world full of intelligent machinery working on your problems.  The great thing at this point is that even if large corporations had developed "Enslaved Oracles", it would be to their benefit to participate in this global "Super Oracle".  Hardware companies sell more hardware.  Software companies sell better predictive algorithms, and the world's neuromorphic computing infrastructure runs hot, guiding humanity through our next stages of existence.

Tuesday, August 25, 2015

What I learned from cracking 4000 Ashley Madison passwords

The Plan

When the Ashley Madison database first got dumped, there was an interesting contingent of researchers talking about how pointless it would be to crack the passwords, since Ashley Madison was using salted bcrypt with a cost of 12.  I thought it might be a fun experiment to run the hashes on a cracking rig of mine to see what I could actually get out of it.

The Rig

My cracking rig is your typical milk carton style setup, as seen on Silicon Valley.

It was originally purchased a couple years ago as a cryptocoin mining rig for about $1500 in bitcoins.  Pretty much just a stack of four ATI R9 290s running PIMP.  PIMP is a Debian-based USB boot environment designed for plug and play cryptocoin mining.  I find it super convenient to use because it deals with all the GPU driver BS, and by default gives you the most optimized setup for whatever cards you're using.  To give it the magical cracking powers, I dropped the most recent oclHashcat in there as well.

The Procedure

So, the data showed up in the form of a mysqldump.

gunzip member_login.dump.gz
tr , '\n' < member_login.dump > tmp.txt # switching commas for newlines
grep "\$2a" tmp.txt > tmp2.txt # grepping out hashes
tr -d "\'" < tmp2.txt > am.txt # removing single quotes
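
For anyone who would rather not chain tr and grep, here is a rough Python 3 equivalent of the commands above.  It is only a sketch: the regex assumes the standard 60-character bcrypt layout ($2a$, a two-digit cost, then 53 characters of salt and digest), and extract_hashes / dump_to_hashfile are names made up for this example, not part of any tool used in this post.

```python
import re

# Standard bcrypt layout: "$2a$" + two-digit cost + "$" + 53 chars of salt and digest.
BCRYPT_RE = re.compile(r"\$2[abxy]\$\d{2}\$[./A-Za-z0-9]{53}")


def extract_hashes(lines):
    """Yield every bcrypt-looking string found in an iterable of text lines."""
    for line in lines:
        for match in BCRYPT_RE.finditer(line):
            yield match.group(0)


def dump_to_hashfile(dump_path, out_path):
    """Stream the dump line by line and write one hash per line to out_path."""
    count = 0
    with open(dump_path, errors="replace") as src, open(out_path, "w") as dst:
        for found in extract_hashes(src):
            dst.write(found + "\n")
            count += 1
    return count
```

Something like dump_to_hashfile("member_login.dump", "am.txt") should then produce the same kind of one-hash-per-line file, without needing the intermediate tmp files.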

Holy cow, this thing has 36 *million* hashes in it.  The leak was still pretty new at the time, so I hadn't realized how many accounts were actually in the dump, but just the file of hashes themselves was 2.1 gigs.  I moved the file over to the cracking rig and ..

oclHashcat v1.36 starting...
Counting lines in am.txt
ERROR: Insufficient memory available

Huh.  Well that sucks.  I use head to grab the first million lines, and the thing fires up just great.  This is my first time seeing the real benchmarks for the crack, and it looks something like this:

Speed.GPU.#1...: 39 H/s
Speed.GPU.#2...: 39 H/s
Speed.GPU.#3...: 39 H/s
Speed.GPU.#4...: 39 H/s
Speed.GPU.#*...: 156 H/s

Yes, that's right, 156 hashes per second.  To someone who's used to cracking md5 passwords, this looks pretty disappointing, but it's bcrypt, so I'll take what I can get.  I start seeing how many hashes I can do at a time.  I double to 2 million hashes, then 4, and finally I see the Insufficient Memory error again when I hit 8 million hashes.  I drop it down to 6 million, and that seems to work just fine.  So my final list of hashes to crack is the first 6 million hashes from the database dump.

My final command looks like this:

./oclHashcat32.bin -m3200 -a0 am2.txt rockyou.txt --force --weak-hash-threshold 0

This is just a super basic -a0 attack using the famous rockyou.txt wordlist.  I also set a script to take a snapshot of how many passwords I had cracked every 10 minutes.

And now.. we wait..

The Data

So, after five days and three hours, I hit 4000 passwords, which I figured was a good time to stop.  I pulled the 10 minute snapshots together, and as it turns out, this is what the final graph of cracks over time looked like:

Now, of course what immediately jumped out at me was how insanely linear this is.  It comes to about 32.6 cracked passwords discovered per hour.  I had expected the curve to shoot up and level off over time as passwords became more rare.  This could be because I was still in the "dumb password" phase, but it's hard to tell.  It may not look like it at first, but there are 741 data points in this graph.

Some interesting numbers: of the 4007 cracked passwords in the final list, only 1191 of them were unique.  Dropping the list of cracked passwords into a frequency-counting website, we get a nice list of the most popular passwords cracked so far.  Here's the top 20 for your amusement:

123456 202
password 105
12345 99
qwerty 32
12345678 31
ashley 28
baseball 27
abc123 27
696969 23
111111 21
football 20
fuckyou 20
madison 20
asshole 19
superman 19
fuckme 19
hockey 19
123456789 19
hunter 18
harley 18

So, maybe these passwords were all throwaways.  It may also be infeasible to crack any given bcrypt password, but given enough users, it doesn't matter if passwords are bcrypted and salted.  A ton of passwords are eventually going to pop out.

This is the goodbye message I got when I stopped the crack.

Session.Name...: oclHashcat
Status.........: Aborted
Input.Mode.....: File (rockyou.txt)
Hash.Target....: File (am2.txt)
Hash.Type......: bcrypt, Blowfish(OpenBSD)
Time.Started...: Thu Aug 20 11:40:32 2015 (5 days, 3 hours)
Time.Estimated.: 0 secs
Speed.GPU.#1...: 39 H/s
Speed.GPU.#2...: 39 H/s
Speed.GPU.#3...: 39 H/s
Speed.GPU.#4...: 39 H/s
Speed.GPU.#*...: 156 H/s
Recovered......: 4007/6000000 (0.07%) Digests, 4007/6000000 (0.07%) Salts
Progress.......: 60396544/86002302412928 (0.00%)
Rejected.......: 0/60396544 (0.00%)
Restore.Point..: 0/14343296 (0.00%)

As you can see, the crack still had quite a ways to go when I aborted it.
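
To put "quite a ways to go" into numbers, here is a back-of-the-envelope sketch in Python 3 using only the figures from the status output above.  Because every hash has its own salt, each guess has to be hashed once per target, so the 60 million candidates tested work out to only about ten wordlist entries per hash.  This is my reading of the counters, not a statement about how oclHashcat schedules its work internally, and the function names are made up for the example.

```python
# Figures copied from the oclHashcat status output above.
HASH_RATE = 156                  # H/s across all four GPUs
SALTS = 6_000_000                # hashes loaded, each with its own salt
KEYSPACE = 86_002_302_412_928    # total candidates reported under Progress
TESTED = 60_396_544              # candidates tested when the run was aborted
CRACKED = 4007
HOURS_RUN = 5 * 24 + 3           # "5 days, 3 hours"

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def hours_per_word(salts=SALTS, rate=HASH_RATE):
    """Hours needed to try ONE wordlist entry against every salted hash."""
    return salts / rate / 3600


def words_tried(tested=TESTED, salts=SALTS):
    """Average number of wordlist entries tried against each hash."""
    return tested / salts


def years_to_finish(keyspace=KEYSPACE, rate=HASH_RATE):
    """Years to exhaust the whole wordlist against every hash at this rate."""
    return keyspace / rate / SECONDS_PER_YEAR


def cracks_per_hour(cracked=CRACKED, hours=HOURS_RUN):
    return cracked / hours


if __name__ == "__main__":
    print("hours per wordlist entry: %.1f" % hours_per_word())
    print("wordlist entries tried:   %.1f" % words_tried())
    print("years to finish:          %.0f" % years_to_finish())
    print("cracks per hour:          %.1f" % cracks_per_hour())
```

At roughly ten and a half hours per wordlist entry, the run never got past the very top of rockyou.txt, which may be part of why the graph stayed so linear.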

All my data from this study can be found here:

am-checkpoints.txt : log of passwords cracked every 10 minutes
am-freq.txt : frequency count of cracked passwords
am-pass.txt : final list of cracked passwords
am-sorted.txt : list of passwords, sorted alphabetically
am.pot : oclHashcat potfile generated by the crack

thanks everyone!  you can follow me on the tweeters @deanpierce


/u/rallias has pointed out that uniq -c will do a frequency count, and I'm dumb for using a website :-)
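
Note that uniq -c only counts adjacent duplicates, so the list has to be sorted first.  For completeness, here is a small Python 3 sketch that does the same frequency count with collections.Counter.  It assumes am-pass.txt holds one cracked password per line, and the function names are invented for this example.

```python
from collections import Counter


def password_frequencies(passwords):
    """Count occurrences, most common first (like sort | uniq -c | sort -rn)."""
    counts = Counter(p.rstrip("\n") for p in passwords)
    return counts.most_common()


def load_and_rank(path, top=20):
    """Read one password per line from path and return the top entries."""
    with open(path, errors="replace") as fh:
        return password_frequencies(fh)[:top]


if __name__ == "__main__":
    for password, count in load_and_rank("am-pass.txt"):
        print(password, count)
```

The length of the full password_frequencies result is also the number of unique passwords.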

Tuesday, December 17, 2013

Bitcoin Private Key Necromancy


~ The Longer Version ~

A few weeks ago, a friend came to me with a problem.  Way back in 2011, he had the great idea to reinstall Windows.  Without thinking too much about it, he installed the new version of Windows, and used the drive for a while.  It was only later that he realized that the drive actually contained a good quantity of bitcoins.  Luckily, he realized there was a chance that the actual data containing the keys may one day be recoverable, and immediately unplugged it and stored it away for safe keeping.

With the price approaching 1000 USD/BTC, he brought the drive to a local bitcoin meetup and asked around.  One guy ran various professional forensics tools against the drive with no luck, and at the end of the night, the drive ended up in my hands.

Discussions of forensic hygiene are out of scope for this particular blog entry, but needless to say, my first step was using dd to pull the raw data off the drive, giving me a 160 gig file on my local filesystem to work with.

Idea #1 BerkeleyDB recovery
So, of course, the original thought was "find and pull out the wallet.dat".  My tool of choice for this sort of thing is magicrescue.  Magicrescue is typically used to recover images and documents from large blobs of data, for example, damaged filesystems.  Unfortunately for me, there was no BerkeleyDB "recipe file", which is what magicrescue uses to reconstruct files.  With a bit of poking around, I figured out how to write my own custom recipe files, and I was on my way.  Using a hex editor, I checked out the first 16 bytes or so of a normal wallet.dat, and confirmed that it was the same across multiple wallet.dat files that I had lying around.  I ran the magicrescue scan, and came up with no hits.

I read up some more on the format of BerkeleyDB files, kept tweaking my recipe files to support more and more versions of BerkeleyDB, and nothing.

Idea #2 Find *Something*
At this point, I started digging around in the middle of my files for anything that might be somewhat unique.  The first few things I tried were coming up negative, and then I noticed the string name"1, which was immediately followed by a bitcoin address in the various wallet.dat files I had.  I built the recipe, ran the scan, and got a single hit.  I looked into the output file, and there it was, a bitcoin address.  I looked the address up in blockexplorer, and there it was.  An address with the exact number of coins my friend had guessed was on the drive, and no transactions since 2011.


My next thought was that I needed to carve the wallet.dat file out of this chunk of data I had found.  After a bit of futzing around, I noticed that almost directly above the address was a header for a .NET Assembly.  This meant one thing: fragmentation, which was bad news for me. 

Idea #3 Raw Key Extraction
Okay, it was time to finally figure out how to extract the keys directly.  I found various tools for printing out private keys, but everything was outputting this strange 400 byte format, which didn't seem right.  I read up a bit more about how private keys work in bitcoin, and read a ton of code and specs figuring out how they were supposed to be encoded, including the "Wallet Import Format".  That led me to a nifty webapp.

My big break came last night when I realized I could export one of my own empty private keys, and get the raw 32 byte hex from the website.  Once I knew that, I could dig around in the wallet.dat file.  I noticed that there were some interesting bytes that preceded the private key, and noticed that this preceding magic number was in front of all of the private keys in the wallet.  I quickly whipped up a magicrescue recipe, and before I knew it, I had 400 hits.  I wrote up some code to go through the files, translate the 32 byte data into base58 WIF keys, and threw them into a shellscript that ran "bitcoind importprivkey ...", and imported them into a local wallet that I had.  When that was done, I ran "bitcoind getbalance" and there they were :-D  I quickly moved the coins to a safer place, and let my friend know the good news.
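
The last two steps (find the bytes after a marker, then turn them into WIF) can be sketched in a few lines of Python 3.  This is an illustration rather than the code I actually ran: the marker bytes are not given in this post, so marker is a parameter you have to supply yourself, and to_wif assumes uncompressed mainnet keys (version byte 0x80, no compression suffix).  All function names are invented for the example.

```python
import hashlib

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(payload: bytes) -> str:
    """Append a 4-byte double-SHA256 checksum and base58-encode the result."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    data = payload + checksum
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58[rem] + out
    # Each leading zero byte is written as a literal '1'.
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + out


def to_wif(raw_key: bytes) -> str:
    """Uncompressed mainnet WIF: version byte 0x80 followed by the 32-byte key."""
    if len(raw_key) != 32:
        raise ValueError("expected a 32-byte private key")
    return base58check(b"\x80" + raw_key)


def find_candidate_keys(buf, marker: bytes):
    """Yield the 32 bytes that follow each occurrence of marker in buf."""
    pos = buf.find(marker)
    while pos != -1:
        start = pos + len(marker)
        key = bytes(buf[start:start + 32])
        if len(key) == 32:          # skip a marker too close to the end
            yield key
        pos = buf.find(marker, pos + 1)
```

Here buf can be a bytes object or an mmap.mmap of the dd image, so a 160 gig file never has to be read into memory.  Every hit is only a candidate; importing it and checking the balance is still the real test.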

The birth of KeyHunter
I figure not everyone wants to dink around with magicrescue, so I wrote up a tool called keyhunter to automatically rip through large chunks of data, and spit out the keys in base58 Wallet Import Format.  The code is here:

If it helps you find any of your lost and forgotten coins, I've set up a donation address here:


Good luck!

Tuesday, June 25, 2013

Locking down your Tor usage

Due to increased interest, I figure I should post about this in a more public area.

Problem statement : "I worry that unintended data is leaking from my machine while I'm using Tor."

The solution : Using iptables to block all data exiting the machine that is not coming from the Tor daemon.

The way that I do this is by creating a simple script in my home directory called "".  This file contains the following lines:

sudo iptables -F
sudo iptables -I OUTPUT -o wlan0 -m owner ! --uid-owner debian-tor -j REJECT

This assumes a few things.

  1. You are on a Linux box with iptables.
  2. The local tor server is running as the user "debian-tor".
  3. You are connected to the internet through the wlan0 interface.
  4. You don't already have complex iptables rules in place.
  5. You are using the standard tor daemon, without vidalia, privoxy, or the browser bundle.

This will work out of the box for anyone on Ubuntu or Debian who installed it via the supported PPAs.
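
Since assumptions 2 and 3 are the ones most likely to differ between machines, here is a small Python 3 sketch that builds the same two iptables commands with the interface and user as parameters.  The function names are made up for this example, and by default it only prints the commands instead of running them.

```python
import subprocess


def tor_lockdown_rules(interface="wlan0", tor_user="debian-tor"):
    """Return the two iptables commands from the script as argv lists."""
    return [
        # Flush existing rules (this is why assumption 4 matters).
        ["iptables", "-F"],
        # Reject anything leaving `interface` that is not owned by the tor user.
        ["iptables", "-I", "OUTPUT", "-o", interface,
         "-m", "owner", "!", "--uid-owner", tor_user, "-j", "REJECT"],
    ]


def apply_rules(rules, dry_run=True):
    """Print the commands, or run them through sudo when dry_run is False."""
    for argv in rules:
        if dry_run:
            print("sudo " + " ".join(argv))
        else:
            subprocess.run(["sudo"] + argv, check=True)


if __name__ == "__main__":
    apply_rules(tor_lockdown_rules())
```

On a wired box you would pass something like interface="eth0" (a hypothetical example; check your own interface name first).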

To get this working with the browser bundle, I first set a password for the debian-tor user, made sure the home directory was set to /var/lib/tor , and then installed the browser bundle there.  Then, when I want to run the browser bundle, I first run the ./ script, then run "su debian-tor".  At that point, I can connect to anything using tor, and no traffic from my admin user, or even from root, can exit the box.  Any scripts or tools you're using should be run as your regular user, and you are guaranteed that they will only be able to touch the internet through tor.

Wednesday, December 28, 2011

A simple mtgox auto trading bot in python

Just for kicks, I decided to write a tradebot the other day, and thought it might be interesting for others to see just how simple it is to get a basic bot going.  As a quick disclaimer, the bot that we end up with is the bot that I am running right now, but it will probably cause you to lose all of your money.  I am assuming that you have git installed, a working python installation, and a reasonably well funded mtgox account.

step 1 : download a good trading API

git clone

This code is dead simple to use, and makes writing your bot take no time at all, so you can focus more on trading strategies rather than worrying about your POST variables getting to mtgox alright.

step 2 : move to and fill in credentials

step 3 : test out a hello world app

Create a file like, and put in these 4 lines of code

from settings import *
balance = exchange.get_balance()
btc_balance = balance['btcs']
print "you have %s bitcoins in mtgox!" % btc_balance

You should then be able to run   python   If successful, you know that your connection with mtgox is working correctly, if not, there might be a problem with your (wrong username/password?)

step 4 : read some example code

Almost all of the code in the directory, other than the settings and the, is example code.  You can read that to get a feel for which methods are available to you, and how you can effectively use them.

step 5 : think of a good trading strategy

This is definitely the hard part, and you are likely going to be modifying your strategy regularly as the market fluctuates.  The strategy that I am using now is pretty simple.  If the market looks pretty volatile  and is very close to the daily high, sell some coins and buy them back when they are 1% cheaper.  If the price is close to the daily low, buy some coins, and sell them back when the market goes up by 1%.  This is a pretty conservative strategy, and mtgox is taking almost half of each win.  This also only works when

  1. the market is volatile, swinging by 4% or so per day
  2. the market is stable around a single point

If the market is trending up or down, this strategy will bet that the market is going to go back to where it was, and if it doesn't, you lose out more and more as the market drifts away from you.
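
To see how much of a 1% swing survives the exchange fee, here is a rough Python 3 sketch of one sell-high, buy-back-lower round trip.  The actual fee is not stated in this post, so fee is just a parameter, and charging it as a fraction of each trade's proceeds is an assumption about how the exchange works.  The function names are invented for the example.

```python
def round_trip_gain(spread=0.01, fee=0.0):
    """Fractional gain in coins after selling, then buying back `spread` lower.

    fee is the fraction the exchange keeps on each of the two trades.
    """
    usd = 1.0 * (1 - fee)              # sell 1 coin at a normalised price of 1
    coins_back = usd / (1 - spread)    # buy back at the lower price
    coins_back *= (1 - fee)            # fee on the second trade
    return coins_back - 1.0


def break_even_fee(spread=0.01):
    """Per-trade fee at which the round trip gains nothing."""
    return 1 - (1 - spread) ** 0.5


if __name__ == "__main__":
    for fee in (0.0, 0.002, 0.004, 0.006):
        print("fee %.3f -> gain %.4f" % (fee, round_trip_gain(0.01, fee)))
    print("break-even fee: %.4f" % break_even_fee(0.01))
```

With a 1% spread the break-even fee is about half a percent per trade, so there is not much room for error.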

step 6 : write some damn code

The general structure should look like this

import time
from settings import *

while True:
  # gather data
  # analyze data
  # act on data

And so, without further ado, here is the code that I am currently using to trade with on the strategy as described above.

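import time
from settings import *

POLL_SECONDS = 60  # assumed polling interval, tune to taste

while True:
    # gather data (cast to float in case the API hands back strings)
    balance = exchange.get_balance()
    btc_balance = float(balance['btcs'])
    usd_balance = float(balance['usds'])
    print "%s btc" % btc_balance
    print "%s usd" % usd_balance

    ticker = exchange.get_ticker()
    last = float(ticker['last'])
    high = float(ticker['high'])
    low  = float(ticker['low'])

    orders = exchange.get_orders()
    print str(len(orders))+" orders open"

    # analyze data
    width = high - low
    if width <= 0:
        # high == low would divide by zero below
        print "no daily range yet, waiting"
        time.sleep(POLL_SECONDS)
        continue

    girth = 1 - (low / high)               # daily range as a fraction of the high
    high_scrape = ( high - last ) / width  # 0 means last is at the daily high
    low_scrape  = ( last - low ) / width   # 0 means last is at the daily low

    print "width: "+str(width)
    print "girth: "+str(girth)
    print "[ %f %f ( %f ) %f %f ]" % (high,high_scrape,last,low_scrape,low)

    # act on data (only one branch fires per pass)
    if girth < 0.1:
        print "market seems pretty stable, waiting"
    elif len(orders) > 0:
        print "there is already a pending order"
    elif high_scrape < 0.1:
        print "high scrape detected!"
        # sell near the high, then bid to buy the same coins back ~1% lower
        exchange.sell_btc(amount=btc_balance, price=last)
        exchange.buy_btc(amount=btc_balance, price=last*0.991)
    elif low_scrape < 0.1:
        print "low scrape detected!"
        # usd / price = number of coins we can afford
        coins = usd_balance / last
        exchange.buy_btc(amount=coins, price=last)
        exchange.sell_btc(amount=coins, price=last*1.0099)

    time.sleep(POLL_SECONDS)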


For testing, I put in 100 coins, and have 50 trading at the high end, and 50 trading at the low end (kept as USD by default).  It has been working pretty well as the market was floating around 4, but today I have an order stuck back down at 4.05, so I am hoping that this upward trend comes back down for a bit so I can get that back :-)
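
If you want to poke at the decision logic without touching a live account, the scrape test from the loop above can be pulled out into a pure function.  This is a Python 3 sketch with a made-up name (scrape_signal).  The 0.1 thresholds are the ones used in the bot, and treating a "stable" market as a reason to skip trading is my reading of the intent.

```python
def scrape_signal(high, low, last, min_girth=0.1, edge=0.1):
    """Return 'sell', 'buy', or 'wait' from the day's high, low, and last price."""
    width = high - low
    if width <= 0:
        return "wait"                     # no range yet, nothing to measure against
    girth = 1 - (low / high)              # daily range as a fraction of the high
    if girth < min_girth:
        return "wait"                     # market too flat to bother
    high_scrape = (high - last) / width   # 0.0 means last is at the daily high
    low_scrape = (last - low) / width     # 0.0 means last is at the daily low
    if high_scrape < edge:
        return "sell"
    if low_scrape < edge:
        return "buy"
    return "wait"


if __name__ == "__main__":
    # Made-up prices, purely to exercise each branch.
    for high, low, last in [(5.0, 4.0, 4.95), (5.0, 4.0, 4.05), (5.0, 4.0, 4.5)]:
        print(high, low, last, scrape_signal(high, low, last))
```

Feeding it a list of historical high/low/last values is a cheap way to see how often the bot would have traded before trusting it with real coins.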

Now, you might notice that the only data that I am pulling from to make my decisions is mtgox high, low, and last.  Yes, this is pretty insane.  Any decent bot really should be analyzing a wide range of data sources, like maybe some historical data, market depth (which the mtgox api here gives really easy access to), blockchain data, and even data coming out of the blogotwittisphere could prove extremely useful for successful trading.

Good luck, and be careful not to shoot your foot off!