Algorithm for generating huge wordlist
Alright, I know this is going to sound bad, like I'm going to use this for un-ethical things, but you have my word that I am not.
I am writing a paper for my Compute开发者_JS百科r and Information Security course and the topic I chose was hashing methods. One of the points that I go over in my paper is MD5 being only one-way and the only way to crack an MD5 hash is to continuously make strings and use an MD5 function, then compare it with the hash you want to crack.
I would like to build a really simple mock-up program to show alongside my paper (we do a presentation and this would be an awesome thing to have), so I wanted to work out an algorithm that makes a string with every possible character combination up to 8 characters. For example the output will be:
a, b, c, ..., aa, ab, ac, ... ba, bb, bc etc etc etc.
It need to include letters, numbers and symbols if possible.
I got partly through the algorithm for this, but unfortunately my programming skills are not up to the task. If anyone can provide a complete algorithm for this I'd be extremely thankful.
Again, if you think I'm a liar and I'm going to use this for hacking purposes you don't have to leave an answer.
Thank you. :)
In Python, itertools.product does almost all you require -- though it does it for just one "number of repeats", so you'll have to iterate from 1 to 8 (not hard;-). In essence:
import itertools
import string
# whatever you wish as alphabet (lower/upper, digits, punct, &c)
myalphabet = string.ascii_lowercase + string.ascii_digits
def prods(maxlen, alphabet=myalphabet):
for i in range(1, maxlen+1):
for s in itertools.product(alphabet, repeat=i):
yield ''.join(s)
Of course, for an alphabet of length N and K repetitions (8 in your case) this does produce N + N^2 + ... + N^K possibilities (2,901,713,047,668 possibilities for N=36 and K=8), but, what's a few trillion outputs among friends!-)
To implement this i would probably encode integers to base 36 (or more if you wanted symbols).
1 = 1 2 = 2 ... a = 10 b = 12 ..
and so on.
then you would have a number, like 38 and do some divisions, ie:
38/36 = 1 remaider 2 = 12 in base 36
then just run a for loop to your max number you want to encode, something very large and output your encoded numbers.
just for fun i wrote this for you: http://pastebin.antiyes.com/index.php?id=327
It is not true that "the only way to crack an MD5 hash" is to generate every possible string and look for collisions. In fact, if you have access to the original it is possible to modify it so that its MD5 matches that of another file you can create. This is described in a paper at infosec.edu.
Even if you cannot modify the original file, rainbow tables of MD5 checksums exist which can be used to generate collisions.
These facts make MD5 unsuitable for passwords or cryptography, and in fact the U.S. government has forbidden its continued use for secure applications.
If you already have access to the hashed version of the password, then MD5 is broken to begin with. That said, when it comes to breaking a hashed value, you'd likely be better off using Rainbow Tables, Dictionary Attacks, and Social Engineering over your brute force method. That said, since you asked for an algorithm to generate all the values, maybe the following will be beneficial (C#):
using System;
using System.Text;
namespace PossibiltyIterator
{
class Program
{
static readonly char[] Symbols = {
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q',
'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
'I', 'J', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
'1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '!', '@', '#', '$', '%', '^', '&',
'*', '(', ')', '-', '_', '+', '=', '/', '\\', '[', ']', '{', '}', ';', ':', '\'', '"',
',', '.', '<', '>', '?', '`', '~'
};
const int MaxLength = 8;
static void BuildWord(int currentLength, int desiredLength, char[] word)
{
if (currentLength == desiredLength)
{
Console.WriteLine(word);
}
else
{
for (int value = 0; value < Symbols.Length; ++value)
{
word[currentLength] = Symbols[value];
BuildWord(currentLength + 1, desiredLength, word);
}
}
}
static void Main(String[] args)
{
double totalValues = (Math.Pow(Symbols.Length, MaxLength + 1) - Symbols.Length)/(Symbols.Length - 1);
Console.WriteLine("Warning! You are about to print: {0} values", totalValues);
Console.WriteLine("Press any key to continue...");
Console.ReadKey(true /* intercept */);
for (int desiredLength = 1; desiredLength <= MaxLength; ++desiredLength)
{
BuildWord(0 /* currentLength */, desiredLength, new char[MaxLength]);
}
}
}
}
To be completely honest, this can be optimized further. Because it builds all the "words" of length 1, then does that work a second time in building the words of length 2. It would be smarter to build the words of length MaxLength, then truncate one letter to build a word of MaxLength-1.
Here is the optimized version... note that it does NOT return the words in the order originally requested.
using System;
using System.Text;
namespace PossibiltyIterator
{
class Program
{
static readonly char[] Symbols = {
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q',
'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
'I', 'J', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
'1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '!', '@', '#', '$', '%', '^', '&',
'*', '(', ')', '-', '_', '+', '=', '/', '\\', '[', ']', '{', '}', ';', ':', '\'', '"',
',', '.', '<', '>', '?', '`', '~'
};
const int MaxLength = 8;
static void BuildWord(int currentLength, int desiredLength, char[] word)
{
if (currentLength != desiredLength)
{
for (int value = 0; value < Symbols.Length; ++value)
{
word[currentLength] = Symbols[value];
BuildWord(currentLength + 1, desiredLength, word);
}
word[currentLength] = '\0';
}
Console.WriteLine(word);
}
static void Main(String[] args)
{
double totalValues = (Math.Pow(Symbols.Length, MaxLength + 1) - Symbols.Length)/(Symbols.Length - 1);
char[] word = new char[MaxLength];
Console.WriteLine("Warning! You are about to print: {0} values", totalValues);
Console.WriteLine("Press any key to continue...");
Console.ReadKey(true /* intercept */);
BuildWord(0 /* currentLength */, MaxLength, new char[MaxLength]);
}
}
}
To complete the post with a Java example which will print out the Base64 encoded MD5's of all possible character combinations using only 0-9 and a-z characters:
MessageDigest digest = MessageDigest.getInstance("MD5");
int i = 0;
while (true)
{
String raw = Integer.toString(i, Character.MAX_RADIX);
byte[] md5 = digest.digest(raw.getBytes());
String base64 = new BigInteger(1, md5).toString(16);
System.out.println(raw + " = " + base64);
i++;
}
精彩评论