Cracking real world salted MD5 passwords in python with several dictionaries

Recently a friend (who will remain unnamed for obvious reasons) asked me to penetration test a website he created. I found a very simple exploit: I could upload an avatar, but the file was not checked to ensure it was an image, so I uploaded a PHP script I wrote and began exploring the server. I printed out all of the usernames, passwords and salts from the database to see how many of the 1,109 passwords could be easily cracked.

The passwords were stored as MD5 hashes with a random 6 character alphanumeric salt. To create the MD5 hash of a password, the salt was prefixed to the password and the combination was then hashed. Because the salt is stored alongside each hash and MD5 is fast to compute, a simple bruteforce/dictionary attack on the passwords is practical. I will start with how I created the wordlists, then give the results I obtained to keep your interest, and finally show my Python code.
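
To make the scheme concrete, here is a minimal sketch of the salt-then-hash check in Python 3. The function names salted_md5 and matches, and the sample salt, are made up for illustration; only the salt-prefix layout comes from the site described above.

```python
import hashlib

def salted_md5(salt, password):
    """Return the hex MD5 of the salt prefixed to the password."""
    return hashlib.md5((salt + password).encode("utf-8")).hexdigest()

def matches(salt, stored_hash, candidate):
    """True if the candidate word hashes to the stored value."""
    return salted_md5(salt, candidate) == stored_hash

# Hypothetical record: a 6 character salt and the hash the site would store
salt = "a1B2c3"
stored = salted_md5(salt, "123456")

print(matches(salt, stored, "password"))
print(matches(salt, stored, "123456"))
```

Because every user has a different salt, each candidate word has to be hashed once per user, so a precomputed table of plain MD5 hashes is no help, but trying a wordlist against each salt still works.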

Creating wordlists
I already had two reasonably sized dictionaries that I use for different things like wordcube. I used John the Ripper on my double sized dictionary to create lots of common permutations of words, such as a capital first letter or a number affixed to the end. To do this you run john with the following parameters, where dic.txt is the input dictionary and dic_plus_rules.txt is the output from john with all of the additions it has made.

john --wordlist=dic.txt --rules --stdout > dic_plus_rules.txt
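
For readers without john to hand, the two kinds of change mentioned above can be roughly imitated in a few lines of Python. This is only a sketch of the idea, not john's rule engine: the function name mangle and the choice of appending the digits 0 to 9 are assumptions made for this example, and john's own rules produce many more variants.

```python
def mangle(word):
    """Yield the word plus a few common variations of it."""
    bases = [word]
    capitalised = word.capitalize()
    if capitalised != word:
        bases.append(capitalised)
    for base in bases:
        yield base
        # Append a single digit, e.g. "dribble" -> "dribble1"
        for digit in "0123456789":
            yield base + digit

for candidate in mangle("dragon"):
    print(candidate)
```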

I also downloaded two wordlists from Openwall: one is a list of ~3,100 common passwords, and one labelled ALL has a large number of words (~4 million) in various languages. Because text is highly compressible, the files are distributed as small gzip files: ALL is 11.5MB, which unzips to 41.4MB, and password is 12kB, which unzips to 21.8kB. There are also more wordlists available for individual languages, but the ALL file includes these.
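
The wordlists do not have to be unpacked before use. As a sketch, Python's gzip module can stream words straight out of a compressed file. The file demo_words.gz below is a placeholder created by the example itself, and latin-1 is an assumed encoding, chosen because it accepts any byte value.

```python
import gzip
import os
import tempfile

def read_words(path):
    """Yield one word per line from a gzip-compressed wordlist."""
    # latin-1 maps every byte to a character, so odd bytes never raise errors
    with gzip.open(path, "rt", encoding="latin-1") as handle:
        for line in handle:
            word = line.rstrip("\r\n")
            if word:
                yield word

# Build a tiny compressed wordlist so the example runs on its own
demo_path = os.path.join(tempfile.mkdtemp(), "demo_words.gz")
with gzip.open(demo_path, "wt", encoding="latin-1") as out:
    out.write("password\n123456\nqwerty\n")

print(list(read_words(demo_path)))
```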

The size of all of the wordlists I used is shown below:

Dictionary                    Combinations
English                       42,987
Double-English                80,368
Double+john-rules             3,986,706
Openwall Common Passwords     3,158
Openwall ALL                  3,917,116


Results

Dictionary                    Cracked    Percentage    Time
English                       60         5.41%         80s
Double-English                65         5.86%         170s
Double+john-rules             116        10.46%        2.33hrs (8393s)
Openwall Common Passwords     112        10.10%        7s
Openwall ALL                  210        18.94%        2.45hrs (8829s)
Total Passwords Obtained      254        22.90%        ~5hrs
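
The timings are easier to compare as hashes per second. Since every word is hashed once for each of the 1,109 salts, a rough rate can be worked out from the two tables above. The snippet below just does that arithmetic: the figures are copied from the tables, and the variable names are only for this example.

```python
USERS = 1109  # number of salted hashes being attacked

# (words in list, seconds taken), copied from the tables above
runs = {
    "English": (42987, 80),
    "Double-English": (80368, 170),
    "Double+john-rules": (3986706, 8393),
    "Openwall Common Passwords": (3158, 7),
    "Openwall ALL": (3917116, 8829),
}

for name, (words, seconds) in runs.items():
    rate = words * USERS / seconds
    print("%-26s %10.0f hashes/s" % (name, rate))
```

Every list comes out at roughly half a million hashes per second, so the run time depends almost entirely on the size of the wordlist.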

Comical passwords

Here are some of the more amusingly bad passwords. The number in brackets shows the frequency of the password.

Crap passwords: 123456 (18), password (4), 1234567 (4), 123456789 (3), 12345678 (2), 12345 (2), abc123 (2), asdfgh (2), nintendo (2), 123123, abcd1234, abcdefg, qwerty
Self-describing passwords: catholic, cowboy, creator, doger, ginger, killer, maggot, player, princess, skater, smallcock, smooth, super, superman, superstar, tester, veggie, winner, wolverine
Some other passwords: bananas, cheese, cinnamon, hampster, DRAGON, dribble1, poopie, poopoo

Python Program

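# -*- coding: utf-8 -*-
# Written for Python 3
import hashlib, sys
from time import time

# Change to commandline switches when you have the time!
hash_file = "hash2.csv"
wordlist = "mass_rules.txt"

# Read the hash file entered
try:
	hashdocument = open(hash_file, "r")
except IOError:
	print("Invalid file.")
	sys.exit(1)

# Read the csv values separated by colons into an array.
# Assumed layout of each line: username:hash:salt
hashes = []
for line in hashdocument:
	inp = line.strip().split(":")
	if line.count(":") < 2:
		continue  # assumption: skip lines that lack one of the three fields
	hashes.append(inp[:3])
hashdocument.close()

# Read wordlist in (latin-1 accepts any byte, so mixed-language lists load)
try:
	wordlistfile = open(wordlist, "r", encoding="latin-1")
except IOError:
	print("Invalid file.")
	sys.exit(1)

tic = time()
cracked = 0
tested = 0
for line in wordlistfile:
	line = line.replace("\n", "")
	tested += 1
	for i in range(0, len(hashes)):
		# The salt is prefixed to the candidate password before hashing
		m = hashlib.md5()
		m.update((hashes[i][2] + line).encode("latin-1"))
		word_hash = m.hexdigest()
		# Only count a hash the first time it is solved
		if word_hash == hashes[i][1] and len(hashes[i]) == 3:
			toc = time()
			cracked += 1
			hashes[i].append(line)
			print(hashes[i][0], " : ", line, "\t(", toc - tic, "s)")

	# Show progress every 1000 passwords tested
	if tested % 1000 == 0:
		print("Cracked: ", cracked, " (", tested, ") ", line)
wordlistfile.close()

# Save the output of this program so we can use again
# with another program/dictionary adding the password
# to each line we have solved.
crackout = open("pycrackout.txt", "w", encoding="latin-1")
for i in hashes:
	s = ""
	for j in i:
		if s != "":
			s += ":"
		s += j
	crackout.write(s + "\n")
crackout.close()

print("Passwords found: ", cracked, "/", len(hashes))
print("Wordlist Words :", tested)
print("Hashes computed: ", len(hashes) * tested)
print("Total time taken: ", time() - tic, "s")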


To do:
  • Play with more dictionaries
  • Speed up code:
    • Add multi-threading: my experience with multi-threading in Python is that it doesn't work well for CPU intensive tasks. If you know otherwise please let me know.
    • Have a look at PyCUDA to see if I can use my graphics card to speed up the code significantly (another type of multi-threading really...) without having to change language like in my previous post on CUDA MD5 cracking
  • Remove each hash once found to stop pointless checking
  • Add command line switches to allow it to be used like a real program
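
As a rough idea of how the "remove hash once found" item might look, the sketch below keeps the outstanding accounts in a list and drops each one as soon as it is cracked, so later words are only hashed against what is left. The function names crack and make_entry, the sample accounts and the (username, hash, salt) tuple layout are assumptions for this example rather than the code used above.

```python
import hashlib

def crack(entries, words):
    """Return {username: password} for entries of (username, md5_hex, salt)."""
    remaining = list(entries)
    found = {}
    for word in words:
        still_unknown = []
        for username, md5_hex, salt in remaining:
            digest = hashlib.md5((salt + word).encode("utf-8")).hexdigest()
            if digest == md5_hex:
                found[username] = word
            else:
                still_unknown.append((username, md5_hex, salt))
        remaining = still_unknown
        if not remaining:
            break  # nothing left to crack
    return found

def make_entry(username, salt, password):
    """Helper for the demo: build a record with the salt prefixed."""
    digest = hashlib.md5((salt + password).encode("utf-8")).hexdigest()
    return (username, digest, salt)

# Hypothetical accounts, one with a dictionary word and one without
entries = [make_entry("alice", "x9Yz12", "qwerty"),
           make_entry("bob", "Qq33aa", "correct horse")]
print(crack(entries, ["123456", "qwerty", "abc123"]))
```

With 1,109 accounts and only around a fifth of them cracked, the saving from this alone would be modest, but it costs nothing and stops solved hashes from being counted twice.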

Wordcube feedback

This page was created to collect feedback from users of wordcube, which is available via the wordcube website or as an app for Android phones (available in the Market). Filling in these polls and leaving feedback will help improve wordcube for everyone.

Thanks for your feedback. Please post any bugs, suggestions, complaints or ideas below.


Android: WordCube – Daily puzzle game

Due to the success (and small amount of addiction) of my browser-based wordcube game (see here), I decided to make a WordCube application for android.


  • Anagram / Wordsearch based puzzle
  • Small file size (~100kb) and footprint
  • Updated daily
  • Share score with twitter integration (compete with friends)
  • Saves your last attempts so you can continue at a later time
  • This also means you can continue your last game offline
  • Several achievements can be unlocked (more to come, also looking for suggestions for achievements)

Screenshot of wordcube

Find as many words as possible using letters from the grid. The words must be 4 letters or more, contain the central letter and each letter may not be used more than once. There is at least one word that uses all of the letters in the cube.

The main interface is by tapping the letters in order to construct a word, but keyboards (and on screen keyboards) are also supported.

Another wordcube screenshot

Twitter Integration
Once you have found all the words that you can, you can post your score to Twitter and see how your friends did in comparison. To use this feature you need to have a Twitter client installed; I would recommend twidroid.

Twitter integration in wordcube

WordCube can be downloaded from the Market on your Android phone, either by searching for wordcube or by following one of the two Android links below. To download the WordCube app from this website, follow the Web link.

Android: WordCube Free
Android: WordCube Pro (only £1)
Web: WordCube Free

The pro version is available for £1, with the money going to support the developer and the development and maintenance of this application. The pro version includes all of the latest features and in the near future will support personal statistics to keep track of performance.

If you enjoyed this please leave feedback for me either here or on the Market. Comments, suggestions and constructive criticism are also welcome.


Bash: Script to convert .flv to mp3

Flash Video (.flv) is currently a very popular format for online videos, in particular on YouTube. This post explains how to use a simple script to extract the sound from a Flash video file and turn it into an MP3.

In order for the script to work you will need to download ffmpeg (to decode the video) and lame (to encode the MP3). This can be achieved in Ubuntu by opening a terminal and running the following, or alternatively you can use your package manager GUI to search for and download the packages for you.

sudo apt-get install ffmpeg lame

You then need to create a new file named “” and paste the following into it using your preferred text editor (which hopefully isn’t vi). Save the file and then change the file permissions so that it is executable (by running `chmod a+x` in the terminal or via the GUI in your file browser).

#!/bin/sh
# this script should convert files from FLV to WAV and then to MP3
echo " "
echo "  Welcome to FLV to MP3 converter!  version 0.1"
echo " "
infile_name="$@"
# exit if the user did not enter anything:
if [ -z "$infile_name" ]; then
    echo " "
    echo "You did not tell me the file name, so I will exit now."
    echo " "
    exit 1
fi
echo " "
# decode the audio track to a temporary WAV file, then encode it as MP3
ffmpeg -i "$infile_name" -acodec pcm_s16le -ac 2 -ab 128k -vn -y "${infile_name%.flv}.wav"
lame --preset cd "${infile_name%.flv}.wav" "${infile_name%.flv}.mp3"
# remove the temporary WAV file
rm "${infile_name%.flv}.wav"
echo " "
echo "OK. I'm done! Have fun!"
echo " "

You should now be able to convert a Flash video into an MP3 by running the following command (changing the filenames to fit your purpose):
sh videofilename.flv
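
The `${infile_name%.flv}` expansions in the script strip a trailing .flv so that the same base name can be reused for the .wav and .mp3 files. For anyone who would rather drive the two tools from Python, here is a hedged sketch of the same steps. The names build_commands and convert are invented for this example, and the ffmpeg and lame options are simply copied from the script above.

```python
import os
import subprocess

def build_commands(infile):
    """Return the WAV name plus the ffmpeg and lame command lines."""
    base = infile[:-4] if infile.endswith(".flv") else infile
    wav, mp3 = base + ".wav", base + ".mp3"
    ffmpeg_cmd = ["ffmpeg", "-i", infile, "-acodec", "pcm_s16le",
                  "-ac", "2", "-ab", "128k", "-vn", "-y", wav]
    lame_cmd = ["lame", "--preset", "cd", wav, mp3]
    return wav, ffmpeg_cmd, lame_cmd

def convert(infile):
    """Run both tools, then delete the intermediate WAV file."""
    wav, ffmpeg_cmd, lame_cmd = build_commands(infile)
    subprocess.check_call(ffmpeg_cmd)
    subprocess.check_call(lame_cmd)
    os.remove(wav)

# Only prints the commands; call convert("video.flv") to actually run them
for part in build_commands("video.flv")[1:]:
    print(" ".join(part))
```

Passing the arguments as lists means file names containing spaces need no extra quoting.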

Extra: Youtube
On Linux it might be worth noting that YouTube downloads the FLVs to your /tmp folder, so you can easily copy them or convert them to MP3s (ensure the video has completely finished loading).

Also there is an application called ‘youtube-dl’ which can be installed from the repositories

sudo apt-get install youtube-dl

and then run using


Of course it’s up to your moral guidance to decide what you can and can’t download.


Python: Wordwheel / WordCube solver

Often in newspapers there is a wordwheel or some variant, whereby you have to find as many words as possible that are more than 3 letters long, contain the centre letter and use each letter no more than once. I have created a webpage that generates a “WordCube” daily for people to peruse at their leisure. This post contains the code for, and an explanation of, solving WordCubes (and all other word<insert shape here>).

Example WordCube image for the 12th December 2009

Below is a function I wrote to check if an input is a valid anagram (or partial anagram, as it isn’t essential to use every letter). The function works by cycling over each letter of the word we are testing (word) and checking whether the letter is still available (checked against chkword). If it is, the letter is removed from the remaining available letters and we move on to the next letter, until we either run out of letters (returns True) or find a letter that is not available (returns False).

# Written for Python 3
def anagramchk(word,chkword):
	for letter in word:
		if letter in chkword:
			# use up one copy of this letter
			chkword=chkword.replace(letter, '', 1)
		else:
			return False
	return True

f=open('english', 'r')
word=input('Input letters (starting with mandatory letter) :')
minlen=4	# words must be more than 3 letters long
count=0
for l in f:
	l=l.strip()
	if len(l)<=len(word) and len(l)>=minlen and word[0] in l and anagramchk(l,word):
		if len(l)==len(word):
			print(l,'  <-- Long word')
		else:
			print(l)
		count+=1
f.close()
print(count)

This will output a list of the possible words, along with a total. The results can be seen for the WordCube in the example above here (To prevent spoiling it if you’d like to have a go at it yourself).

As always I’d be interested to see if anyone knows any faster methods or any other general improvements or comments.
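
One alternative worth trying, offered as a sketch rather than a measured improvement, is to compare letter counts with collections.Counter instead of removing letters one at a time. The function name fits and the sample letters below are made up for illustration.

```python
from collections import Counter

def fits(word, letters):
    """True if word can be spelt using each of the letters at most once."""
    # Subtracting Counters keeps only the letters that word needs more of
    # than letters can supply, so an empty result means the word fits.
    return not (Counter(word) - Counter(letters))

letters = "tsaeornig"   # hypothetical grid, mandatory letter first
for candidate in ["rating", "treason", "rattle", "organise"]:
    ok = letters[0] in candidate and fits(candidate, letters)
    print(candidate, ok)
```

The mandatory-letter test is kept separate so that fits can be reused for any set of letters.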

The dictionary file can be found here (not perfect):
