Almost all web applications involve user authentication at some point, and many use the good old “password” as the primary approach to check if you really are who you claim to be.
This burdens developers with the important responsibility of storing and processing passwords securely. Even if the app you’re protecting is nothing important (say, a silly online game), a hacked server and stolen user passwords could have dire consequences: imagine if someone uses the same password for their online banking! (Yes, many people are that stupid.) So let’s look at the various approaches that can be taken here…
Plain, unencrypted passwords
No, no, no! Even if you’re not collecting any valuable private data. Re-read the last paragraph.
Passwords encrypted with a secret key
So you could encrypt the password database with DES, AES, or whatever encryption algorithm. If a hacker gets it, all they will get is a bunch of gibberish that’s worthless without the key. Right?
Well, this option is a bit better than the previous one, that’s for sure. It can limit the damage if the data leak is confined to the password database (such as with an SQL injection attack). But you’d still have to store the secret key somewhere, and your server has to be able to access it. So if an attacker can gain privileged access to the server, then they can get the secret key too… game over.
Passwords processed with a hash function
This is a bit better than the above, but still insecure. However, plain hashing is probably still the single most common way passwords are stored in non-enterprise applications. That’s probably because in many programming languages, it’s the easiest way for a programmer who is lazy (or pressed by looming deadlines) to add at least some security. For example, MySQL lets you easily apply the popular MD5 and SHA1 hash functions to a string.
Hashing is different from encryption. It’s more like fingerprinting. A cryptographic hash function performs a one-way mathematical transformation of your password. You get a fixed-length piece of data that you store in your database. There is no practical way to compute the original password back from it. However, it is possible to calculate the hashes of two text values and check whether they are the same. In this way, you can check if the user has entered the correct password when you don’t even know it! You just compare the hashes.
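To make the fingerprint idea concrete, here is a minimal sketch in Python (chosen only for illustration). The function names and the sample password are made up, and this is the plain, unsalted scheme described in this section, so treat it as a demonstration of the concept rather than something to deploy.

```python
import hashlib

def fingerprint(password: str) -> str:
    """Return the SHA-1 digest of a password as 40 hex characters."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

# At sign-up, only the digest is stored, never the password itself.
stored_hash = fingerprint("correct horse")

def check_login(attempt: str) -> bool:
    """Hash the attempt and compare it with the stored digest."""
    return fingerprint(attempt) == stored_hash

print(len(stored_hash))              # the digest length is fixed: 40 hex characters
print(check_login("correct horse"))  # True
print(check_login("wrong guess"))    # False
```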
What’s the problem here? Well, if an attacker gets the hashes, they can simply try to recover the passwords by “brute force” (trying every possible combination of letters, numbers, etc. until they hit a match or give up). Modern hardware is fast enough to make this feasible, and attackers can also come “pre-armed” with a ready-made table of millions of common passwords and their corresponding hashes. Additionally, anyone holding the hashes can see whether two users have the same password, or whether one person used the same password on two different websites. This gives important clues to the attackers.
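The “pre-armed” attack can be sketched in a few lines. The word list below is a tiny, hypothetical stand-in for the multi-million-entry lists mentioned above, and the helper names are invented for this example.

```python
import hashlib

def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

# Hypothetical word list; a real one would hold millions of entries.
common_passwords = ["123456", "password", "qwerty", "pizza", "letmein"]

# Built once, in advance, and reusable against any leaked unsalted SHA-1 hash.
precomputed = {sha1_hex(word): word for word in common_passwords}

def crack(leaked_hash: str):
    """Return the matching password, or None if it is not in the table."""
    return precomputed.get(leaked_hash)

leaked = sha1_hex("pizza")  # stands in for a hash taken from a stolen database
print(crack(leaked))        # pizza
```

The expensive part (hashing the whole word list) happens before the break-in, so each stolen hash costs the attacker only a dictionary lookup.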
Slow, salted hash
There are two tactics that, together, make hash values much harder to crack. The first one is to repeat the hash function: you calculate the hash, then the hash of the hash, and so on, hundreds of times. That makes the hashed value slower to calculate. When a user logs in, a slowdown of a tenth of a second is not going to be noticed. But when cracking a hash, you have to test billions and billions of possible values, so the same slowdown could be enough to keep the bad guys from getting the password.
The second tactic is the salt. In a typical implementation, a random, non-secret string is generated for each user and appended to the password before hashing it. The salt is then stored, unencrypted, together with the hash.
What’s the point of this? Well, the same password with a different salt generates a different hash. You can no longer tell if two users have the same password just by looking at the hashes, and an attacker can no longer show up with a table of common passwords and hashes prepared in advance.
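The repetition trick can be sketched as follows. The choice of SHA-256 and the round counts are assumptions picked for illustration, not values from the text, and the timings will differ from machine to machine.

```python
import hashlib
import time

def stretched_hash(password: str, rounds: int) -> str:
    """Hash the password, then hash the result, and so on, `rounds` times."""
    digest = password.encode("utf-8")
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()

# Compare the cost of a single guess with and without the repetition.
for rounds in (1, 100_000):
    start = time.perf_counter()
    stretched_hash("pizza", rounds)
    elapsed = time.perf_counter() - start
    print(f"{rounds:>7} rounds: {elapsed:.4f} s per guess")
```

A legitimate login pays the extra cost once, while an attacker pays it again for every single guess.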
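Both tactics can be combined in one short sketch. Note the substitution: instead of hand-rolling “append the salt and loop”, this uses PBKDF2 from Python’s standard library, a standardized construction that does the salting and the repetition together. The function names, the 16-byte salt, and the round count are assumptions made for this example.

```python
import hashlib
import hmac
import os

ROUNDS = 200_000  # illustrative work factor, an assumption rather than a recommendation

def hash_password(password: str) -> tuple:
    """Return (salt, digest); both are stored next to the user record."""
    salt = os.urandom(16)  # random, non-secret, different for every user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ROUNDS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare the results."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ROUNDS)
    return hmac.compare_digest(digest, expected)  # constant-time comparison

salt, digest = hash_password("pizza")
print(verify_password("pizza", salt, digest))  # True
print(verify_password("pasta", salt, digest))  # False
```

Because the salt is random, calling hash_password twice with the same password yields two different digests, which is exactly what defeats the precomputed tables.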
Here is a sample user table where the password is simply hashed with the SHA1 algorithm:
These four users obviously have the same password, and by looking up the hash in a “rainbow table” readily available online, you can find out in less than a minute that the password is “pizza”.
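A table of this kind can be reproduced with a short script. The usernames are hypothetical, and the script assumes all four users picked “pizza”.

```python
import hashlib

users = ["alice", "bob", "carol", "dave"]  # hypothetical usernames

def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

# Every user picked the same password, so every row gets the same hash.
table = [(name, sha1_hex("pizza")) for name in users]

for name, password_hash in table:
    print(f"{name:<6} {password_hash}")

distinct = {password_hash for _, password_hash in table}
print(len(distinct))  # 1
```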
So let’s “salt” this pizza. See what happens? Same password, much more difficult to steal:
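The salted version of the table can be generated the same way. As before, the usernames are hypothetical, and the 8-byte salt length is an arbitrary choice for display purposes. To keep the output comparable with the unsalted table, this sketch applies SHA-1 only once and leaves out the slowing-down step.

```python
import hashlib
import os

users = ["alice", "bob", "carol", "dave"]  # hypothetical usernames

def salted_sha1(password: str, salt: bytes) -> str:
    # The salt is appended to the password before hashing.
    return hashlib.sha1(password.encode("utf-8") + salt).hexdigest()

salted_table = []
for name in users:
    salt = os.urandom(8)  # a fresh random salt for each user
    salted_table.append((name, salt.hex(), salted_sha1("pizza", salt)))

for name, salt_hex, password_hash in salted_table:
    print(f"{name:<6} {salt_hex} {password_hash}")

# Logging in still works: re-hash the attempt with the salt stored in the row.
name, salt_hex, password_hash = salted_table[0]
print(salted_sha1("pizza", bytes.fromhex(salt_hex)) == password_hash)  # True
```

The salts and hashes change on every run, so no two rows match even though all four passwords are identical.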
This is nothing new
And you know what? All these considerations were described more than thirty years ago in a four-page paper dealing with password security on UNIX systems (R. Morris & K. Thompson 1979, “Password Security: A Case History”, Communications of the ACM, Volume 22, Issue 11). These are supposed to be standard security practices, but often they are not followed, even by large and knowledgeable providers like Adobe.
PHP implementation: long overdue
Since I primarily work with PHP, I am pleased to note that PHP 5.5.0, released in 2013, introduced a simplified way to work with safe password hashes that satisfy all the above conditions. The new password_hash and password_verify functions use a safe algorithm (bcrypt by default) and generate the salt for you. Their default options are secure enough – no extra keys to provide, no fancy settings to configure.
PHP developers (like me) have no excuse not to use these functions. (Well, to be honest, they have one – some web hosting providers still haven’t upgraded to PHP 5.5 – but that makes it all the more important to be careful about the hosting that you choose!)