开发者

What is an efficient way to write password cracking algorithm (python)

This problem might be relatively simple, but I'm given two text files. One text file contains all encrypted passwords e开发者_StackOverflow社区ncrypted via crypt.crypt in python. The other list contains over 400k+ normal dictionary words.

The assignment is that given 3 different functions which transform strings from their normal case to all different permutations of capitalizations, transforms a letter to a number (if it looks alike, e.g. G -> 6, B -> 8), and reverses a string. The thing is that given the 10 - 20 encrypted passwords in the password file, what is the most efficient way to get the fastest running solution in python to run those functions on dictionary word in the words file? It is given that all those words, when transformed in whatever way, will encrypt to a password in the password file.

Here is the function which checks if a given string, when encrypted, is the same as the encrypted password passed in:

def check_pass(plaintext,encrypted):
 crypted_pass = crypt.crypt(plaintext,encrypted)
 if crypted_pass == encrypted:
  return True
 else:
  return False

Thanks in advance.


Without knowing details about the underlying hash algorithm and possible weaknesses of the algorithm all you can do is to run a brute-force attack trying all possible transformations of the words in your password list.

The only way to speed up such a brute-force attack is to get more powerful hardware and to split the task and run the cracker in parallel.


On my slow laptop, crypt.crypt takes about 20 microseconds:

$ python -mtimeit -s'import crypt' 'crypt.crypt("foobar", "zappa")'
10000 loops, best of 3: 21.8 usec per loop

so, the brute force approach (really the only sensible one) is "kinda" feasible. By applying your transformation functions you'll get (ballpark estimate) about 100 transformed words per dictionary word (mostly from the capitalization changes), so, about 40 million transformed words out of your whole dictionary. At 20 microseconds each, that will take about 800 seconds, call it 15 minutes, for the effort of trying to crack one of the passwords that doesn't actually correspond to any of the variations; expected time about half that, to crack a password that does correspond.

So, if you have 10 passwords to crack, and they all do correspond to a transformed dictionary word, you should be done in an hour or two. Is that OK? Because there isn't much else you can do except distribute this embarassingly parallel problem over as many nodes and cores as you can grasp (oh, and, use a faster machine in the first place -- that might buy you perhaps a factor of two or thereabouts).

There is no deep optimization trick that you can add, so the general logic will be that of a triple-nested loop: one level loops over the encrypted passwords, one over the words in the dictionary, one over the variants of each dictionary word. There isn't much difference regarding how you nest things (except the loop on the variants must come within the loop on the words, for simplicity). I recommend encapsulating "give me all variants of this word" as a generator (for simplicity, not for speed) and otherwise minimizing the number of function calls (e.g. there is no reason to use that check_pass function since the inline code is just as clear, and will be microscopically faster).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜