Let's see what the best way is to store passwords in the database and how well-known platforms solve this problem.
Plaintext
When the problem of storing passwords first arose, the obvious idea was to write them as plain text in the corresponding database table. That would be fine if clients could never access the password table directly. Unfortunately, the well-known SQL injection still works in various web applications from time to time, not to mention other potential vulnerabilities. In security matters, it is customary to assume the worst and to prepare a plan of action and defense for exactly that case. So assume that the attacker has already found a hole in the web application, has downloaded the table with users' logins and passwords in one way or another, and can now do with them as he pleases. In general, his further actions may be as follows:
- Perform illegitimate actions on behalf of users with their credentials on the vulnerable resource: for example, if a bank card is attached to an account, the attacker can now use it;
- Use the obtained password on other resources: users often neglect security and use the same password for different services;
- Try to work out the rule by which a password was generated and then go to the second item: some users build passwords by a fixed rule, so the passwords differ between resources but follow the same pattern, which can be detected;
- Privilege escalation: if the administrator password is stored in this table, the attacker can gain full control over the server.
Encryption vs. Hashing
The idea no longer looks so good. What should we do? It would be nice to store passwords in a transformed form, so that even if they are extracted, attackers cannot recover them, or at least must spend too much time doing so. There are two options: encrypt the passwords or hash them. Software developers have settled on the second option, and it is clear why. Let's compare the two on several characteristics:
- Computational cost. Encryption takes more time, and whichever transformation we choose has to be performed on every password check. Speed of execution is one of the design requirements for hash functions.
- Length of the output. The result of encryption has variable length, while the result of hashing always has the same length, and storing data of uniform size in a database is very convenient. Also, the length of an encrypted password leaks some information about the length of the original password. Fixed length has a downside too, since it makes collisions possible, but we will discuss this below.
- Key management. Encryption needs a key, which must be stored somewhere no one can find it. Generating and managing keys is a separate story in any case: they must not be weak, they need to be changed regularly, and so on.
- Possibility of collision. With encryption, different inputs always produce different outputs. With hashing this is not always the case. A constant hash length means that the set of output values of the hash function is limited, which makes collisions possible. Say a user really tried and came up with a really cool long password containing special characters, digits, and upper- and lower-case letters. An attacker types the equally cool password "admin" into the password field. The server hashes it and compares the hashes. The hashes coincide. It's a shame.
Even so, hashing wins with a score of 3:1. But should we stop here? The answer is no.
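The fixed output length mentioned in the comparison above is easy to observe. The following is a minimal sketch using Python's standard `hashlib`; the helper name `sha256_hex` and the sample passwords are made up for illustration, and SHA-256 is only one possible choice of hash function.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of a password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


short_hash = sha256_hex("admin")
long_hash = sha256_hex("a-much-longer-passphrase-with-$ymbols-and-1234")

# Both digests are 64 hex characters (256 bits), whatever the input length,
# so the stored value reveals nothing about the length of the password.
print(len(short_hash), len(long_hash))

# No key is involved: the same input always gives the same digest,
# which is all the server needs in order to verify a login attempt.
print(sha256_hex("admin") == short_hash)
```

Note that a plain, fast hash like this only illustrates the properties being compared; on its own it is not a sufficient storage scheme.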
Attacks on Hashed Passwords
Suppose that an attacker has obtained our table of usernames and passwords. The passwords are now stored in hashed form, but this does not stop our attacker, who seriously intends to recover them. His possible actions:
- Dictionary attack: if the attacker has no luck with the administrators' master password, he will take a dictionary of popular passwords and try his luck with their hashes;
- Rainbow tables: he may not even have to compute anything or search through a dictionary himself. It can be enough to use rainbow tables, which are easy to find on the Internet. Rainbow tables contain hash values that someone has already computed, together with the corresponding inputs. Note that, because of collisions, the password a rainbow table suggests is not necessarily the one the user actually chose. Precomputed tables already exist for MD5, SHA1, SHA256, SHA512, their modifications, and some other algorithms, and there are online services that attempt to reverse a hash;
- Exhaustive search: if none of this works, he will have to use brute force and go through all possible passwords until the computed hashes finally match.
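The dictionary approach from the list above can be sketched in a few lines. This is an illustration only: the user names, passwords, and word list are invented, SHA-256 stands in for whatever unsalted hash the leaked table uses, and a real attack would use far larger dictionaries.

```python
import hashlib


def unsalted_hash(password: str) -> str:
    """Hash a password with no salt, as in the vulnerable scheme described above."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(leaked_hashes: dict, wordlist: list) -> dict:
    """Return {user: password} for every leaked hash found in the dictionary."""
    # The hash of each candidate is computed once and reused for all users,
    # which is exactly why unsalted hashes are cheap to attack in bulk.
    lookup = {unsalted_hash(word): word for word in wordlist}
    return {user: lookup[h] for user, h in leaked_hashes.items() if h in lookup}


# Hypothetical leaked table: two weak passwords and one strong one.
leaked = {
    "alice": unsalted_hash("qwerty"),
    "bob": unsalted_hash("Tr0ub4dor&3"),
    "carol": unsalted_hash("123456"),
}
wordlist = ["123456", "password", "qwerty", "admin"]

cracked = dictionary_attack(leaked, wordlist)
print(cracked)  # {'alice': 'qwerty', 'carol': '123456'}
```

The precomputed `lookup` dictionary plays the same role as a ready-made table of hashes: once built, it works against every user at no extra cost.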
In practice, an attacker usually has to fall back on brute force, and then success depends on how fast the hash function can be computed. In one published comparison, hash functions implemented in Java were run on 64-bit Windows 10 with one core of an Intel i7 at 2.60 GHz and 16 GB of RAM, each computing the hash of a 36-character string one million times. The following results were obtained:
MD5 — 627 ms
SHA-1 — 604 ms
SHA-256 — 739 ms
SHA-512 — 1056 ms
But today brute force can be parallelized and run many times faster on GPUs (as well as on APUs, DSPs, and FPGAs). However, besides choosing a slower algorithm with a longer output, there is something else you can do.
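A rough way to get a feel for the speed of the hash functions listed above is to time them yourself. The sketch below is not the benchmark quoted in the text: it uses Python's `hashlib` instead of Java, runs 100,000 rounds rather than one million, and leaves out MD5 because some hardened Python builds disable it. The function name `time_hash` is made up, and the numbers will differ from machine to machine.

```python
import hashlib
import time


def time_hash(algorithm: str, data: bytes, rounds: int) -> float:
    """Return the milliseconds needed to hash `data` `rounds` times."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(algorithm, data).digest()
    return (time.perf_counter() - start) * 1000.0


sample = b"x" * 36  # a 36-byte input, matching the length used in the text
timings = {
    name: time_hash(name, sample, 100_000)
    for name in ("sha1", "sha256", "sha512")
}

for name, ms in timings.items():
    print(f"{name}: {ms:.0f} ms for 100,000 hashes")
```

Whatever the exact figures, the point is the order of magnitude: a general-purpose hash takes well under a millisecond, so a single core can test a very large number of candidate passwords per second.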
Hashing the Hash
To prevent the intruder from using ready-made rainbow tables, the password can be hashed several times. That is, we compute the hash of the hash of the hash of the hash... and so on, n times. (It is important not to get carried away, because the server has to do the same work every time it verifies a user's password.) After that, the attacker will no longer find the password in an existing rainbow table so easily, and the time needed for brute force will increase noticeably. But nothing stops an attacker who knows the hashing algorithm from generating his own rainbow table from a password dictionary.
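A minimal sketch of repeated hashing follows. It assumes SHA-256 and a hypothetical helper named `iterated_hash`; the number of rounds is an arbitrary illustration rather than a recommendation.

```python
import hashlib


def iterated_hash(password: str, rounds: int) -> str:
    """Hash the password, then hash the result again, `rounds` times in total."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    digest = password.encode("utf-8")
    for _ in range(rounds):
        # Each round feeds the previous raw digest back into the hash function.
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


once = iterated_hash("correct horse", 1)
many = iterated_hash("correct horse", 1000)

# The results differ, so a table built for single-round SHA-256 is useless
# against the 1000-round value, and every guess now costs 1000 hash calls.
print(once != many)
```

The cost is symmetric: the server pays the same 1000 hash calls on every login, which is the trade-off the paragraph above warns about.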
Add Salt to Taste
To make the attacker's life significantly harder, passwords today are hashed with added salt.
Salt is an additional random string that is combined with the password and hashed together with it. A password cannot be recovered from such a hash with a ready-made rainbow table. Even knowing the salt and the resulting hash, the attacker is doomed to brute force, and precomputed tables are unlikely to help him.
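The effect of salt can be shown with a short sketch. The helper names (`hash_with_salt`, `new_record`, `verify`) are hypothetical, the 16-byte salt length is an assumption, and the salt is prepended here, although appending it would work just as well.

```python
import hashlib
import hmac
import secrets


def hash_with_salt(password: str, salt: bytes) -> str:
    """Hash the salt and the password together."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def new_record(password: str) -> tuple:
    """Create the (salt, hash) pair that would be stored for a new user."""
    salt = secrets.token_bytes(16)  # fresh random salt for this user
    return salt, hash_with_salt(password, salt)


def verify(password: str, salt: bytes, stored_hash: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_with_salt(password, salt), stored_hash)


salt_a, hash_a = new_record("qwerty")
salt_b, hash_b = new_record("qwerty")

print(hash_a != hash_b)                  # same password, different stored hashes
print(verify("qwerty", salt_a, hash_a))  # correct password is accepted
print(verify("qwerty1", salt_a, hash_a)) # wrong password is rejected
```

Because the salt is random, a table precomputed without it does not contain the stored hash, so the attacker has to start the computation again for each salt.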
Taxonomy of salting passwords:
- By the principle of salting:
  - A unique salt for each user: every user gets their own salt, so even if the salts become known to an attacker, he has to brute force each user's password separately. Besides, even if two users think alike and come up with identical passwords, the resulting hashes will still be different;
  - Global salt: one salt that is the same for everyone and is used for all hashes;
  - Both.
- By the salt storage method:
  - In the database: as a rule, individual salts are stored in the same database as the password hashes, often even in the same row;
  - In the code (read: in the config file): the global salt is usually stored not in the database but, for example, in the config file, so that the attacker has to spend time searching for it.
We will assume that the users' individual salts are stored in the database and the global salt is hidden in the config file. The attacker has gained access to the database and knows all the hashes and the corresponding salts, but the global salt is not stored in the database, so he does not know it. In total, if all these methods are combined, an attacker who wants the passwords in plain text, however determined, will face the following obstacles:
- He does not know the global salt, so it will have to be brute forced.
- He knows the users' salts, but he has no tables prepared for these salts, so he will have to brute force the passwords.
- The process will take even longer because each guess has to be hashed n times.
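The three obstacles above can be combined in one scheme. The sketch below is one possible way to do it, not the method of any particular platform: it mixes in the global salt with HMAC and then uses PBKDF2 from Python's standard library in place of a hand-written hashing loop. The constant `GLOBAL_SALT`, the round count, and the function names are all assumptions made for illustration; in a real system the global salt would be read from the config file.

```python
import hashlib
import hmac
import secrets

# Assumption: in a real deployment this value lives in a config file,
# outside the database. It is hard-coded here only to keep the sketch runnable.
GLOBAL_SALT = b"hypothetical-global-salt-from-config"
ROUNDS = 100_000  # illustrative iteration count (the "n" in the text)


def derive(password: str, user_salt: bytes) -> bytes:
    """Combine global salt, per-user salt, and repeated hashing."""
    # Step 1: mix in the global salt, which is not stored in the database.
    peppered = hmac.new(GLOBAL_SALT, password.encode("utf-8"), hashlib.sha256).digest()
    # Step 2: PBKDF2 applies the per-user salt and iterates the hash ROUNDS times.
    return hashlib.pbkdf2_hmac("sha256", peppered, user_salt, ROUNDS)


def register(password: str) -> tuple:
    """Return the (user_salt, hash) pair to store in the database."""
    user_salt = secrets.token_bytes(16)
    return user_salt, derive(password, user_salt)


def check(password: str, user_salt: bytes, stored: bytes) -> bool:
    """Verify a login attempt against the stored values in constant time."""
    return hmac.compare_digest(derive(password, user_salt), stored)


user_salt, stored = register("S3cure!passphrase")
print(check("S3cure!passphrase", user_salt, stored))  # accepted
print(check("admin", user_salt, stored))              # rejected
```

An attacker holding only the database sees `user_salt` and `stored`; without `GLOBAL_SALT` he cannot even begin testing guesses, and with it each guess still costs `ROUNDS` hash computations.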
How Different CMS Store Passwords in DB
WordPress
Prior to version 3.x, all passwords were simply hashed with MD5. Now the phpass library is used. By default, the salt is prepended to the password and the resulting string is hashed with MD5 2^8 times.
Joomla
Prior to version 1.0.12, it was simply MD5. Now the phpass library is used, by default bcrypt with salt and 2^10 repetitions.
Drupal
Prior to version 6, MD5 was used without salt. Now the phpass library is used. The default is salted SHA-512 with 2^16 repetitions.
Silverstripe
Salted Blowfish with 2^10 repetitions is used.
Umbraco
The system uses HMACSHA256 with salt. It also uses a second, global salt specified in the config file.