MD5 is a cryptographic hash function used to verify file integrity and, especially in web applications, to store passwords. However, its popularity is declining because collisions have been discovered, which makes it less secure than it was once assumed to be. Python’s md5 library can generate MD5 hashes, so it is very easy to write a program that launches a dictionary attack against a given hash. I use /usr/share/dict/linux.words as the wordlist, which is pretty nice with 483,523 words! Anyway, this is what I came up with for a quick and dirty dictionary attack:
#md5crack.py by Himanshu : http://cslife.wordpress.com
import md5

def main():
    targetHash = raw_input("Enter md5 hash to decipher: ")
    success = False
    dictionaryFile = "/usr/share/dict/linux.words"
    try:
        for word in open(dictionaryFile):
            if checkMd5(word.rstrip("\n"), targetHash):
                success = True
                break
    except IOError:
        print dictionaryFile, "not found!"
    if success:
        print "Hash found is", word
    else:
        print "Hash not found"

def checkMd5(getString, testHash):
    getHash = md5.new(getString).hexdigest()
    if getHash == testHash:
        return True
    else:
        return False

if __name__ == '__main__':
    main()
P.S.: Thanks to Ragzouken from #python on freenode for the stylistic suggestions.
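For anyone on Python 3, where the md5 module no longer exists, here is a rough sketch of the same idea using hashlib. The function names (md5_hex, dictionary_attack, crack_from_file) and the tiny demo list are assumptions made for illustration; they are not part of the script above.

```python
import hashlib


def md5_hex(word):
    """Return the hex MD5 digest of a text word (encoded as UTF-8)."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash, words):
    """Return the first word whose MD5 matches target_hash, or None."""
    target_hash = target_hash.strip().lower()
    for word in words:
        candidate = word.rstrip("\n")
        if md5_hex(candidate) == target_hash:
            return candidate
    return None


def crack_from_file(target_hash, path="/usr/share/dict/words"):
    """Run the attack against a wordlist file, one word per line.

    The default path is an assumption; adjust it for your system.
    """
    with open(path, encoding="utf-8", errors="replace") as wordlist:
        return dictionary_attack(target_hash, wordlist)


# Small in-memory demo so the sketch runs without a wordlist file.
demo_words = ["apple\n", "banana\n", "zucchinis\n"]
demo_hash = md5_hex("banana")
print(dictionary_attack(demo_hash, demo_words))  # prints: banana
```

Returning the word (or None) instead of setting a success flag keeps the loop and the reporting separate, which also makes the function easy to reuse with any iterable of words.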
How long does it take to check 500k words?
I can check 483,523 words within 2-3 seconds on my laptop, which has a 1.6 GHz Core 2 Duo processor.
It takes less than a second on my MacBook (1.83GHz C2D, 2GB RAM) to do this.
It has 234,936 words in its dictionary.
Same goes for my FreeBSD machine (235,882 words): 2.66GHz Celeron D, 512MB RAM.
Guess you shouldn’t use dictionary-passwords…..
You can do this in good ol’ php too. I wrote up a script just to get a comparison on time. Instead of words, I used random numbers (using an 8 digit number as the original passcode).
Took about 1 second per 100,000 numbers generated and tested. So a 500,000 word check would take about 5 seconds.
Good example of why your password should not be from a dictionary, and should contain both letters and numbers. Checking 100,000,000 numbers or words is feasible, although it might take an hour or two. Checking 218,340,105,584,896 candidates (62^8, the number of possible 8-character passwords made of upper- and lowercase letters and digits)… not so feasible.
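To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the rate of about 100,000 checks per second quoted above and a 62-character alphabet (26 lowercase, 26 uppercase, 10 digits); the variable names are illustrative only.

```python
ALPHABET_SIZE = 26 + 26 + 10   # lowercase + uppercase + digits
LENGTH = 8                     # password length in characters
RATE_PER_SECOND = 100_000      # assumed rate, taken from the PHP timing above

keyspace = ALPHABET_SIZE ** LENGTH
seconds = keyspace / RATE_PER_SECOND
years = seconds / (365 * 24 * 3600)

print(keyspace)        # 218340105584896
print(round(years))    # 69
```

At that assumed rate, exhausting the full keyspace takes on the order of 69 years on one machine, whereas 100,000,000 candidates work out to roughly 1,000 seconds. A faster implementation shifts both figures but not the gap between them.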
Cool stuff. Did you find any matches?
One can always apply the chosen-prefix collision method described at http://www.win.tue.nl/hashclash/SoftIntCodeSign/. From the page:
“As a proof of concept we applied our chosen-prefix collision finding method to the files HelloWorld.exe and GoodbyeWorld.exe. This came down to carefully constructing two blocks of 832 bytes, and appending them to the end of both files. These 2 times 832 bytes have been constructed such that the resulting files, renamed to HelloWorld-colliding.exe and GoodbyeWorld-colliding.exe, have exactly identical MD5 hash values.”
Also, there are rainbow tables somewhere for precomputed md5 hashes and also a tool to crack just that. I recommend you check that out if you are interested in it.
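As a much simpler cousin of that idea, here is a sketch of a plain precomputed lookup table: hash every word once, then answer any number of queries with a dictionary lookup. To be clear, this is not a rainbow table (those use hash chains and reduction functions to trade time for space); build_lookup and the three sample words are assumptions for illustration.

```python
import hashlib


def build_lookup(words):
    """Map hex MD5 digest -> word for every word in the iterable."""
    table = {}
    for word in words:
        word = word.strip()
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        # Keep the first word seen for a digest; duplicates are ignored.
        table.setdefault(digest, word)
    return table


lookup = build_lookup(["password", "letmein", "hello"])
print(lookup.get("5d41402abc4b2a76b9719d911017c592", "Not found"))  # prints: hello
```

The table costs memory proportional to the wordlist, but each lookup after the initial pass is a single dictionary access, which pays off when many hashes have to be checked against the same list.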
Thanks for the pingback man, I have been busy most of the week!
Nice one there. I will try to test it on my network sometime over the weekend. Is there a way to scale it up for larger dictionaries, and are there any performance differences compared with other tools?
A recipe for the Debian-heads out there:
--snip
% sudo apt-get install wamerican-huge
[...]
% wc -l /usr/share/dict/american-english-huge
57025 /usr/share/dict/american-english-huge
% tail -n 4 /usr/share/dict/american-english-huge # ensure worst-case time
zucchinis
[...]
% time echo -n zucchinis | md5sum | awk '{print $1}' | ~/prog/py/3rdparty/md5_brute_w_parser_fix.py # discard md5sum(1)'s "filename" field with awk(1)
Enter md5 hash to decipher: Hash found is zucchinis
[...]
~/prog/py/3rdparty/md5_brute_w_parser_fix.py 0.25s user 0.01s system 80% cpu 0.323 total
--snip
My timing results are probably within the noise. I couldn’t quickly find a larger wordlist.
Thanks for the informative post, Himanshu.
-Tyler
Where can I find that word list? And how do I create a hash function to attack?
Where did you find that word list?
What OS are you using? If you have Ubuntu installed, it is at:
/usr/share/dict/words
If you need more help, you can email me at: himanshuchhetri [AT] gmail [DOT] com
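As for getting a hash to attack in the first place, one option is to hash a word you already know and feed the result to the script. A minimal Python 3 sketch, where make_test_hash is a hypothetical helper name:

```python
import hashlib


def make_test_hash(word):
    """Return the hex MD5 digest of a known word, for use as a test target."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()


print(make_test_hash("hello"))    # prints: 5d41402abc4b2a76b9719d911017c592

# A trailing newline is part of the input, so it produces a different hash.
print(make_test_hash("hello\n"))
```

The second call shows why the wordlist lines need their newline stripped before hashing, and why a shell pipeline needs echo -n rather than plain echo.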
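import md5
words = [word.strip() for word in file('/usr/share/dict/words')]
hashes = [md5.new(word).hexdigest() for word in words]
hash = raw_input("Enter Hash:")
word = words[hashes.index(hash)] if hash in hashes else "Not Found."
print word
--
would do the same, at the cost of holding every word and every hash in memory. Simply swapping [] for () would not work here, because generator expressions support neither index() nor subscripting; a lower-memory version has to test each word as it is read.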
- ~l
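A lazy variant along those lines might look like the sketch below, written for Python 3 with hashlib. The name find_word and the sample list are assumptions for illustration; with a real wordlist you would pass the open file object as words.

```python
import hashlib


def find_word(target_hash, words):
    """Lazily scan words and return the first MD5 match, or "Not Found."."""
    stripped = (word.strip() for word in words)
    matches = (
        word for word in stripped
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target_hash
    )
    # next() pulls candidates one at a time and stops at the first match.
    return next(matches, "Not Found.")


sample = ["apple\n", "hello\n", "zucchinis\n"]
print(find_word("5d41402abc4b2a76b9719d911017c592", sample))  # prints: hello
```

Because both generator expressions are consumed on demand, only one word and one digest are alive at a time, and the scan stops as soon as a match turns up.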