Password Cracking in 2010 and Beyond
I have looked a bit into cryptography and related matters during the last couple of days and am pretty confused by now. I have a question about password strength and am hoping that someone can clear up my confusion by sharing how they think through the following questions. I am becoming obsessed about these things, but need to spend my time otherwise :-)
Let's assume we have an eight-digit password that consists of upper and lower-case alphabetic characters, numbers and common symbols. This means we have 96^8 ~= 7.2 quadrillion different possible passwords.
As I understand there are at least two approaches to breaking this password. One is to try a brute-force attack where we try to guess each possible combination of characters. How many passwords can modern processors (in 2010, Core i7 Extreme for eg) guess per second (how many instructions does a single password guess take and why)? My guess would be that it tak开发者_开发技巧es a modern processor in the order of years to break such a password.
Another approach would consist of obtaining a hash of my password as stored by operating systems and then search for collisions. Depending on the type of hash used, we might get the password a lot quicker than by the bruteforce attack. A number of questions about this:
- Is the assertion in the above sentence correct?
- How do I think about the time it takes to find collisions for MD4, MD5, etc. hashes?
And finally, regardless of the strength of file encryption using AES-128/256, the weak link is still my en/decryption password used. Even if breaking the ciphered text would take longer than the lifetime of the universe, a brute-force attack on my de/encryption password (guess password, then try to decrypt file, try next password...), might succeed a lot earlier than the end of the universe. Is that correct?
I would be very grateful, if people could have mercy on me and help me think through these probably simple questions, so that I can get back to work.
As I understand there are at least two approaches to breaking this password. One is to try a brute-force attack where we try to guess each possible combination of characters. How many passwords can modern processors (in 2010, Core i7 Extreme for eg) guess per second (how many instructions does a single password guess take and why)?
As you observe, this depends on the algorithm used. SHA1 is a common (though poor) choice, so let's consider that.
The best SHA1 implementations in software claim as little as 5.8 cycles per byte on 1024 byte blocks; let's be generous and assume that it's as efficient on a single 512 bit block; that would imply 371.2 cycles per block, or equivalently, per password guess. On your suggested processor, which Wikipedia claims does 147,600 MIPS, that's very optimistically about 400 million guesses per core per second, or a little under 2.3 billion per second for the whole processor. Note these are wildly optimistic, but should be in the ballpark, at least.
Another possibility is dedicated hardware: this claims to run on an FPGA, do 82 clock cycles per block, and run at 350mhz - which doesn't sound impressive at only 4.2 million guesses per second, until you consider that at only 14,500 gates per core, you can build a lot of these in the size of a Core i7.
Also bear in mind that a good password hashing scheme will hash the password repeatedly - hundreds, or even thousands of times - which inflates the amount of work you have to do by the same factor.
All of this is somewhat irrelevant, however, if you don't have access to the password hash - which you often wouldn't. In that situation, you're limited by the rate at which you can make guesses, and a well designed system will easily detect your brute-force attack, and cut you off, making the size of the password somewhat irrelevant.
Another approach would consist of obtaining a hash of my password as stored by operating systems and then search for collisions. Depending on the type of hash used, we might get the password a lot quicker than by the bruteforce attack. A number of questions about this:
Is the assertion in the above sentence correct?
Not exactly. You seem to already assume you have the password hash in the first question. A brute force attack is searching every possible password - they're not two distinct things.
How do I think about the time it takes to find collisions for MD4, MD5, etc. hashes?
There are currently no known practical preimage attacks for MD5 or SHA1. I'm not sure about MD4, but nobody in their right mind should be using it now!
And finally, regardless of the strength of file encryption using AES-128/256, the weak link is still my en/decryption password used. Even if breaking the ciphered text would take longer than the lifetime of the universe, a brute-force attack on my de/encryption password (guess password, then try to decrypt file, try next password...), might succeed a lot earlier than the end of the universe. Is that correct?
Correct, which is why good crypto systems don't encrypt messages directly with a password-generated key, but rather use other systems like public key crypto, requiring the attacker to first get your private key (which ought to be difficult in the first place), then attempt to crack the password on that.
I'm not sure what 'assertion' you mean? Of course, it 'might' be faster. However, in general, you do not have to worry about using the hash method as attack vector. Finding these collisions should theoretically be extremely hard (read: impractical), but if there ever was an algorithm to quickly determine a collision, then the corresponding hash-algorithm is plain and simply broken and all applications relying on it would have to exchange it for another one.
You don't. You should simply stick to using algorithms that are not known to be broken. For example, MD5 and SHA1 are considered broken. Although there doesn't seem to be a really efficient application available to actually find collisions, researchers believe it may be just around the corner. Currently, f.ex. the best known preimage attack on MD5 has a complexity of 2 to the power of something larger than 100. Considering that this is more than number of atoms in the universe, but is already considered insecure, you don't even have to bother about those algorithms considered secure.
No idea there.
And yes, in terms of brute-force attacks, I still consider the guess-try-guess-again approach the most practicable.
To get your head cleared a little bit consider this: For how long do you want to store information securely? Most of the time, one or two years is all that's needed really, in which case a normal AES-256 should hold up fine. If you need longer security, then you should go for the largest keys possible and hope for the best, but there will be no guarantee. And not even the infamous one-time-pad will safe you from that problem.
This is an update of my progress on this question so far.
The number of processor instructions a single password guess + decrypt cycle takes is, of course, dependent on the hardware and decryption algorithm. I am not sure how to think about this correctly, but my guess is that one such cycle might take around 1,000 instructions. Now, if we take an Intel Core i7 Extreme CPU that is capable of 150,000 MIPS, then this gives us 150,000,000 gueses per second that a PC you can easily buy today can carry out.
So for a password as described in the question (say, the password to a AES256 encrypted file), a brute-force attempt to break it, could succeed in less than 556 days.
If the assumption of 1,000 instructions for one guess-decrypt cycle is correct, then a single standard PC you can buy today can crack such a password for your AES256 encrypted files in less than two years. Phew, the riddle is solved.
My conclusion would be that an eight-digit alphanumeric + symbols character is still sufficient for most practical purposes. And in two years time, we will all have to learn to remember even longer passwords :-)
The bigger issue with passwords is that they are very rarely truly random. Many people will use whole or partial words in their passwords, which means there are attacks using dictionaries that greatly reduce the space over which possible passwords must be searched. Even when a password is generated 'randomly' by a person (not using mechanical assistance, such as a pseudo-random generator), we have unconscious biases that influence how we try to make something 'random', which can be exploited by attackers.
Another issue is that people will often use the same passwords for numerous different logins. Not all logins are cryptographically secure; in fact, there are still applications out there that transmit passwords in plaintext. They are also vulnerable to capture by keyloggers, which can often be transmitted by seemingly innocuous vectors, like USB thumb drives.
I can't answer the quantitative questions, but:
Your calculation of the possibilities for an 8-character password assumes that all of the combinations are possible, including ones that are very hard to remember or type. Usually, this type of password tends to be in some sort of encrypted container with other passwords. Passwords that people actually use tend to live in a smaller part of that range. Not to mention that 7.2E15 is not enough against a determined attacker.
Good OSs try to hide the actual passwords as best they can, and in addition they salt the passwords by inserting a few random characters into the password hash (and, of course, saving the salt). This means that making a "rainbow table" of hashes of likely passwords is a lot less practical.
However, what you need to consider is the threat, and there's four important components here.
Who are you defending against? There's a major difference between the adolescent kid down the street and the NSA. The script kiddie will try stuff that's passed around the internet on his home box, while the NSA will use experts and oodles of computing power. Everybody else is in between.
How directly are you targeted? If an intruder is trying to find an account on a system, your password doesn't have to be unbreakable, just a lot harder than the weakest account on the system.
How long do you hope to keep out intruders? Some information will be stale two years from now, other information might be important a lot longer. As a corollary, how determined is the expected attacker?
How bad would it be if your account was broken into? If you broke into my Facebook account, you'd be able to post things as allegedly coming from me, and other than that you'd get no information that isn't already easily accessible. If you broke into one of my online banking accounts, I'd be far more concerned.
精彩评论