Home  >  Article  >  Backend Development  >  Understand the php Hash function and enhance password security_PHP tutorial

Understand the php Hash function and enhance password security_PHP tutorial

WBOY
WBOYOriginal
2016-07-21 15:31:57686browse

1. Disclaimer
Cryptozoology is a complex topic and I am not an expert in it. Many universities and research institutions have long-term research in this area. In this article, I hope to show you a way to securely store passwords for web applications in as simple and understandable a way as possible.
2. What does "Hash" do?
"Hash converts a piece of data (small data or large data) into a relatively short piece of data, such as a string or an integer."
This is accomplished by relying on a one-way hash function. The so-called one-way means that it is difficult (or actually impossible) to reverse it back. An example of a common hash function is md5(), which is popular in various computer languages ​​and systems.

Copy code The code is as follows:

$data = "Hello World";
$hash = md5($data ; Speaking of which, it can also be represented by a 128-bit (16-byte) integer. You can use md5() to process very long strings and data, but what you always get is a fixed-length hash value, which may also help you understand why this function is "one-way".

3. Use Hash function to store passwords

Typical user registration process:
The user fills in the registration form, which contains the password field; The program stores all the information filled in by the user to the database;
However, the password is encrypted by a hash function before being stored in the database;
The original password is no longer stored anywhere, or it is discarded.
User login process:
The user enters the user name and password;
The program encrypts the password through the same hash function registered;
The program finds the user from the database and reads the hashed password;
The program compares the username and password and authorizes the user if they match.
How to choose a suitable method to encrypt passwords, we will discuss this issue later in the article.

4. Question 1: Hash collision

Hash collision means that two different contents are hashed to obtain the same hash value. The possibility of a hash collision depends on the hash algorithm used.
How to generate? For example, some old-fashioned programs use crc32() to hash passwords. This algorithm produces a 32-bit integer as the hash result, which means there are only 2^32 (i.e. 4,294,967,296) possible output results.
Let’s hash a password:



Copy the code
The code is as follows: echo crc32('supersecretpassword'); // outputs: 323322056

Now let’s assume that a person steals the database and gets the hashed password. He may not be able to restore 323322056 to 'supersecretpassword', however he can find another password that can be hashed to the same value. This only requires a very simple program:


Copy code
The code is as follows: set_time_limit(0); $ i = 0;
while (true) {
if (crc32(base64_encode($i)) == 323322056) {
echo base64_encode($i);
exit;
}
$i++;
}


This program may take a while to run, but eventually it will return a string. We can use this string in place of 'supersecretpassword' and use it to successfully log in to the user account using that password.
For example, after running the above program on my computer for a few months, I got a string: 'MTIxMjY5MTAwNg=='. Let’s test it:


Copy code
The code is as follows: echo crc32('supersecretpassword'); // outputs: 323322056
echo crc32('MTIxMjY5MTAwNg==');
// outputs: 323322056


How to solve it?
Now a slightly more powerful home PC can run a hash function one billion times per second, so we need a hash function that can produce a wider range of results. For example, md5() is more suitable. It can generate a 128-bit hash value, which means there are 340,282,366,920,938,463,463,374,607,431,768,211,456 possible outputs. So people generally cannot do so many loops to find hash collisions. However, some people still find ways to do this, see examples for details.
sha1() is a better alternative as it produces hash values ​​up to 160 bits long.
5. Problem 2: Rainbow table
Even if we solve the collision problem, it is still not safe enough.
“The rainbow table is a table built by calculating hash values ​​of commonly used words and their combinations.”
This table may store millions or even billions of data. Storage is now very cheap, so very large rainbow tables can be built.
Now let’s assume that a person steals the database and gets millions of hashed passwords. A thief can easily look up these hashes in the rainbow table one by one and get the original password. Although not all hash values ​​can be found in the rainbow table, some can certainly be found.
How to solve it?
We can try to add some interference to the password, such as the following example:
Copy the code The code is as follows:

$password = "easypassword";
// this may be found in a rainbow table
// because the password contains 2 common words
echo sha1($password); // 6c94d3b42518febd4ad747801d50a8972022f956
/ / use bunch of random characters, and it can be longer than this
$salt = "f#@V)Hu^%Hgfds";
// this will NOT be found in any pre-built rainbow table
echo sha1($salt . $password); // cd56a16759623378628c0d9336af69b74d9d71a5

What we do here is just append a interference string before each password and hash it, as long as the appended string Complex enough, the hashed value will definitely not be found in the pre-built rainbow table. But it's still not safe enough now.
6. Question 3: Still a rainbow table
Note that the rainbow table may be created from scratch after stealing the merge string. The interference string may also be stolen together with the database, and then they can use this interference string to create a rainbow table from scratch. For example, the hash value of "easypassword" may exist in the ordinary rainbow table, but in the new rainbow table , the hash value of "f#@V)Hu^%Hgfdseasypassword" will also exist.
How to solve it?
We can use unique distraction strings for each user. One available solution is to use the user's id in the database:
Copy code The code is as follows:

$hash = sha1( $user_id . $password);

The premise of this method is that the user's id is a constant value (this is the case in general applications)
We can also randomize each user Generate a unique interference string, but we also need to store this string:
Copy code The code is as follows:

// generates a 22 character long random string
function unique_salt() {
return substr(sha1(mt_rand()),0,22);
}
$unique_salt = unique_salt();
$hash = sha1($unique_salt . $password);
// and save the $unique_salt with the user record
// ...

This method prevents us from being compromised by rainbow tables, because each password is interfered with a different string. An attacker would need to create as many rainbow tables as there are passwords, which is impractical.
7. Question 4: Hash speed
Most hash algorithms are designed with speed in mind, because they are generally used to calculate the hash value of large data or files to verify the integrity of the data. Correctness and completeness.
How to generate?
As mentioned before, a powerful PC can now perform billions of calculations per second, making it easy to use brute force to try every password. You might think that a password of 8+ characters would protect you from brute force cracking, but let’s see if that’s the case:
If a password can contain lowercase letters, uppercase letters, and numbers, that’s 62 (26+ 26+10) characters are optional;
An 8-digit password has 62^8 possible combinations, which is slightly more than 218 trillion.
Based on the speed of calculating 1 billion hash values ​​per second, it only takes 60 hours to solve.
For a 6-digit password, which is also a very common password, it only takes 1 minute to crack. Requiring a 9 to 10-digit password may be safer, but some users may find it troublesome.
How to solve it?
Use a slower hash function.
“Suppose you use an algorithm that can only run 1 million times per second under the same hardware conditions instead of an algorithm that can run 1 billion times per second, then the attacker may need to spend 1,000 times longer to do a brute force crack. 60 little ones will turn into 7 years! ”
You can implement this method yourself:
Copy the code The code is as follows:

function myhash($password, $unique_salt) {
$salt = "f#@V)Hu^%Hgfds";
$hash = sha1($unique_salt . $password);
/ / make it take 1000 times longer
for ($i = 0; $i < 1000; $i++) {
$hash = sha1($hash);
}
return $hash;
}

You can also use an algorithm that supports a "cost parameter", such as BLOWFISH. In php, you can use the crypt() function:
Copy code The code is as follows:

function myhash($password, $ unique_salt) {
// the salt for blowfish should be 22 characters long
return crypt($password, '$2a$10.$unique_salt');
}

this The second parameter of the function contains several values ​​separated by "$" symbols. The first value is "$2a", indicating that the BLOWFISH algorithm should be used. The second parameter "$10" is the cost parameter here, which is a logarithm with base 2, indicating the number of iterations of the calculation loop (10 => 2^10 = 1024), and the value can be from 04 to 31.
For example:
Copy code The code is as follows:

function myhash($password, $unique_salt) {
return crypt($password, '$2a$10.$unique_salt');
}
function unique_salt() {
return substr(sha1(mt_rand()),0,22);
}
$password = "verysecret";
echo myhash($password, unique_salt());
// result: $2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC

Result hash The values ​​include the $2a algorithm, the cost parameter $10, and a 22-bit interference string we used. The rest is the calculated hash value. Let’s run a test program:
Copy the code The code is as follows:

/ / assume this was pulled from the database
$hash = '$2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC';
// assume this is the password the user entered to log back in
$password = "verysecret";
if (check_password($hash, $password)) {
echo "Access Granted!";
} else {
echo "Access Denied!";
}
function check_password( $hash, $password) {
// first 29 characters include algorithm, cost and salt
// let's call it $full_salt
$full_salt = substr($hash, 0, 29);
// run the hash function on $password
$new_hash = crypt($password, $full_salt);
// returns true or false
return ($hash == $new_hash);
}

Run it, we will see "Access Granted!"
8. Integrate it together
Based on the above discussions, we wrote a tool class :
Copy code The code is as follows:

class PassHash {
// blowfish
private static $algo = '$2a';
// cost parameter
private static $cost = '$10';
// mainly for internal use
public static function unique_salt() {
return substr(sha1(mt_rand()),0,22);
}
// this will be used to generate a hash
public static function hash($password) {
return crypt($password,
self::$algo .
self::$cost .
'$'. self::unique_salt());
}
// this will be used to compare a password against a hash
public static function check_password($hash, $password) {
$full_salt = substr($hash, 0, 29);
$new_hash = crypt($password, $full_salt);
return ($hash == $new_hash);
}
}

以下是注册时的用法:
复制代码 代码如下:

// include the class
require ("PassHash.php");
// read all form input from $_POST
// ...
// do your regular form validation stuff
// ...
// hash the password
$pass_hash = PassHash::hash($_POST['password']);
// store all user info in the DB, excluding $_POST['password']
// store $pass_hash instead
// ...

以下是登录时的用法:
复制代码 代码如下:

// include the class
require ("PassHash.php");
// read all form input from $_POST
// ...
// fetch the user record based on $_POST['username'] or similar
// ...
// check the password the user tried to login with
if (PassHash::check_password($user['pass_hash'], $_POST['password']) {
// grant access
// ...
} else {
// deny access
// ...
}

9.加密是否可用
并不是所有系统都支持Blowfish加密算法,虽然它现在已经很普遍了,你可以用以下代码来检查你的系统是否支持:
复制代码 代码如下:

if (CRYPT_BLOWFISH == 1) {
echo "Yes";
} else {
echo "No";
}

不过对于php5.3,你就不必担心这点了,因为它内置了这个算法的实现。
结论
通过这种方法加密的密码对于绝大多数Web应用程序来说已经足够安全了。不过不要忘记你还是可以让用户使用安全强度更高的密码,比如要求最少位数,使用字母,数字和特殊字符混合密码等。

www.bkjia.comtruehttp://www.bkjia.com/PHPjc/322921.htmlTechArticle1.声明 密码学是一个复杂的话题,我也不是这方面的专家。许多高校和研究机构在这方面都有长期的研究。在这篇文章里,我希望尽量使用...
Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn