Password cracking is the process of recovering passwords from data that has been stored in or transmitted by a computer system. A common approach is to repeatedly try guesses for the password. The purpose of password cracking might be to help a user recover a forgotten password (though installing an entirely new password is less of a security risk, but involves system administration privileges), to gain unauthorized access to a system, or as a preventive measure by system administrators to check for easily crackable passwords.
Passwords to access computer systems are usually stored, typically not in cleartext form, in a database so the system can perform password verification when users attempt to login. To preserve confidentiality of system passwords, the password verification data is typically generated by applying a one-way function to the password, possibly in combination with other data. For simplicity in this discussion, when the one-way function (which may be either an encryption function or cryptographic hash) does not incorporate a secret key, other than the password, we will refer to the one way function employed as a hash and its output as a hashed password.Even though functions that create hashed passwords may be cryptographically secure, possession of a hashed password provides a quick way to test guesses for the password by applying the function to each guess, and comparing the result to the verification data. The most commonly used hash functions can be computed rapidly and the attacker can test guesses repeatedly with different guesses until one succeeds,meaning the plaintext password has been recovered.
The term password cracking is typically limited to recovery of one or more plaintext passwords from hashed passwords, but there are also many other ways of obtaining passwords illicitly; without the hashed version of a password, the attacker can still attempt access to the computer system in question with guessed passwords. However well designed systems limit the number of failed access attempts and can alert administrators to trace the source of the attack if that quota is exceeded. With the hashed password, the attacker can work undetected, and if the attacker has obtained several hashed passwords, the chances for cracking at least one is quite high.
Otherwise it is possible to try to obtain the passwords through other different methods, such as social engineering, wiretapping, keystroke logging, login spoofing, dumpster diving, phishing, shoulder surfing, timing attack, acoustic cryptanalysis, using a Trojan Horse or virus, identity management system attacks (such as abuse of Self-service password reset) and compromising host security (see password for details). However, cracking usually designates a guessing attack.
Cracking may be combined with other techniques. For example, use of a hash-based challenge-response authentication method for password verification may provide a hashed password to an eavesdropper, who can then crack the password. A number of stronger cryptographic protocols exist that do not expose hashed-passwords during verification over a network, either by protecting them in transmission using a high-grade key, or by using a zero-knowledge password proof.
Principal attack methods
Weak encryption
If a system uses a reversible function to obscure stored passwords, exploiting that weakness can recover even 'well-chosen' passwords. One example is the LM hash that Microsoft Windows uses by default to store user passwords that are less than 15 characters in length. LM hash converts the password into all uppercase letters then breaks the password into two 7-character fields which are then hashed separately, allowing each half to be attacked separately. Hash functions like SHA-512, SHA-1, and MD5 are considered impossible to invert when used correctly.
Guessing
See also: Password strength and Password policy
Many passwords can be guessed either by humans or by sophisticated cracking programs armed with dictionaries and the user's personal information.
Not surprisingly, many users choose weak passwords, usually one related to themselves in some way. Repeated research over some 40 years has demonstrated that around 40% of user-chosen passwords are readily guessable by programs.[1] Examples of insecure choices include:
* blank (none)
* the words "password", "passcode", "admin" and their derivatives
* the user's name or login name
* the name of their significant other or another person
* their birthplace or date of birth, or a friend's, or a relative's
* a pet's name
* a dictionary word in any language
* a name of a celebrity they like
* automobile licence plate number
* a row of letters from a standard keyboard layout (e.g., the qwerty keyboard -- qwerty itself, asdf, or qwertyuiop)
* a simple modification of one of the preceding, such as suffixing a digit or reversing the order of the letters.
and so on.
In one survey of MySpace passwords which had been phished, 3.8 percent of passwords were a single word found in a dictionary, and another 12 percent were a word plus a final digit; two-thirds of the time that digit was 1.[1]
Some users neglect to change the default password that came with their account on the computer system. And some administrators neglect to change default account passwords provided by the operating system vendor or hardware supplier. A famous example is the use of FieldService as a user name with Guest as the password. If not changed at system configuration time, anyone familiar with such systems will have 'cracked' an important password; such service accounts often have higher access privileges than a normal user account. Lists of default passwords are available on the Internet.[2]
Personal data about individuals are now available from various sources, many on-line, and can often be obtained by someone using social engineering techniques, such as posing as an opinion surveyor or a security control checker. Attackers who know the user may have information as well. For example, if a user chooses the password "19YaleLaw78" because he graduated from Yale Law School in 1978, a disgruntled business partner might be able to guess the password.
Cracking programs exist which accept personal information about the user being attacked and generate common variations for passwords suggested by that information.[3][4]
Brute force attack
A last resort is to try every possible password, known as a brute force attack. In theory, a brute force attack will always be successful since the rules for acceptable passwords must be publicly known, but as the length of the password increases, so does the number of possible passwords. This method is unlikely to be practical unless the password is relatively small. But, how small is too small? This depends heavily on whether the prospective attacker has access to the hash of the password, in which case the attack is called an offline attack (it can be done without connection to the protected resource), or not, in which case it is called an online attack. Offline attack is generally a lot easier, because testing a password is reduced to a quickly calculated mathematical computation; i.e., calculating the hash of the password to be tried and comparing it to the hash of the real password. In an online attack the attacker has to actually try to authenticate himself with all the possible passwords, where arbitrary rules and delays can be imposed by the system and the attempts can be logged. A common current length recommendation for cases where the attacker will not have access to the hash is 8 or more randomly chosen characters combining letters, numbers, and special (punctuation, etc) characters. Systems which limit passwords to numeric characters only, or upper case only, or, generally, which exclude possible password character choices make such attacks easier. Using longer passwords in such cases (if possible on a particular system) can compensate for a limited allowable character set. And, of course, even with an adequate range of character choice, users who ignore that range (using only upper case alphabetic characters, or digits alone, for instance) make brute force attacks much easier against those password choices.
Generic brute-force search techniques can be used to speed up the computation. But the real threat may be likely to be from smart brute-force techniques that exploit knowledge about how people tend to choose passwords. NIST SP 800-63 (2) provides further discussion of password quality, and suggests, for example, that an 8 character user-chosen password may provide somewhere between 18 and 30 bits of entropy, depending on how it is chosen. This amount of entropy is far less than what is generally considered safe for an encryption key.
How small is too small for offline attacks thus depends partly on an attacker's ingenuity and resources (e.g., available time, computing power, etc.), the latter of which will increase as computers get faster. Most commonly used hashes can be implemented using specialized hardware, allowing faster attacks. Large numbers of computers can be harnessed in parallel, each trying a separate portion of the search space. Unused overnight and weekend time on office computers can also be used for this purpose.
The distinction between guessing, dictionary and brute force attacks is not strict. They are similar in that an attacker goes through a list of candidate passwords one by one; the list may be explicitly enumerated or implicitly defined, may or may not incorporate knowledge about the victim, and may or may not be linguistically derived. Each of the three approaches, particularly 'dictionary attack', is frequently used as an umbrella term to denote all the three attacks and the spectrum of attacks encompassed by them.
Precomputation
Further information: Rainbow table
In its most basic form, precomputation involves hashing each word in the dictionary (or any search space of candidate passwords) and storing the