Security rules that cannot be violated in PHP development: filtering user input

WBOY
2016-07-21 15:30:28

As the most basic precaution, pay close attention to externally submitted data and make filtering it your first line of defense.
Rule 1: Never trust external data or input
The first thing you must realize about web application security is that external data should not be trusted. External data includes any data that is not entered directly by the programmer in the PHP code. Any data from any other source (such as GET variables, form POST, databases, configuration files, session variables, or cookies) cannot be trusted until steps are taken to ensure security.
For example, the following data elements can be considered safe because they are set in PHP.

<?php
$myUsername = 'tmyer';
$arrayUsers = array('tmyer', 'tom', 'tommy');
define("GREETING", 'hello there ' . $myUsername);
?>

However, the data elements below are all tainted.
Listing 2. Unsafe, tainted code

<?php
$myUsername = $_POST['username']; //tainted!
$arrayUsers = array($myUsername, 'tom', 'tommy'); //tainted!
define("GREETING", 'hello there ' . $myUsername); //tainted!
?>

Why is the first variable, $myUsername, tainted? Because it comes directly from the form POST. Users can enter any string into this input field, including malicious commands to delete files or run previously uploaded files. You might ask, "Can't you avoid this danger by using a client-side (JavaScript) form validation script that only accepts the letters A-Z?" Yes, that is always a beneficial step, but as we'll see later, anyone can download any form to their machine, modify it, and resubmit whatever they like.
The solution is simple: run sanitization code on $_POST['username']. If you don't, you risk tainting everything else that uses $myUsername (such as an array or a constant).
A simple way to sanitize user input is to use regular expressions to process it. In this example, only letters are expected to be accepted. It might also be a good idea to limit the string to a specific number of characters, or require all letters to be lowercase.
Listing 3. Making user input safe

<?php
$myUsername = cleanInput($_POST['username']); //clean!
$arrayUsers = array($myUsername, 'tom', 'tommy'); //clean!
define("GREETING", 'hello there ' . $myUsername); //clean!

//lowercase the input, drop everything except a-z, and cap it at 12 characters
function cleanInput($input){
    $clean = strtolower($input);
    $clean = preg_replace("/[^a-z]/", "", $clean);
    $clean = substr($clean, 0, 12);
    return $clean;
}
?>
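Sanitizing silently changes what the user typed. An alternative is to validate against a whitelist and refuse anything that does not fit. The following is a minimal sketch of that idea, assuming the same rules as Listing 3 (lowercase letters only, at most 12 characters); validateUsername() is a hypothetical helper name, not part of the original listing.

```php
<?php
// Hypothetical helper: returns the username if it is acceptable, or null otherwise.
function validateUsername(string $input, int $maxLength = 12): ?string
{
    // Whitelist: lowercase ASCII letters only, 1 to $maxLength characters.
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    $pattern = '/\A[a-z]{1,' . $maxLength . '}\z/';
    return preg_match($pattern, $input) === 1 ? $input : null;
}

// A clean value passes; anything else is refused instead of being rewritten.
var_dump(validateUsername('tmyer'));            // string(5) "tmyer"
var_dump(validateUsername("tom'; drop table")); // NULL
```

Whether to sanitize or to reject depends on the field: rejecting lets you show the user an error message instead of storing something they did not enter.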

Rule 2: Disable PHP settings that make security difficult to implement
Now that you know you can't trust user input, you should also know that you shouldn't trust the way PHP is configured on the machine. For example, make sure register_globals is disabled. If register_globals is enabled, it's possible to do careless things like use a $variable in place of a GET or POST value with the same name. With this setting disabled, PHP forces you to reference the correct variables in the correct namespace. To use a variable from a form POST, you must reference $_POST['variable']. This way you won't mistake this particular variable for a cookie, session, or GET variable.
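Note that register_globals was removed in PHP 5.4, so on current versions the superglobal arrays are the only way to reach request data. The habit of naming the source explicitly is still worth keeping. A small sketch, assuming a hypothetical helper called readField() that takes the source array as a parameter:

```php
<?php
// Hypothetical helper: read one string field from an explicit source array
// ($_POST, $_GET or $_COOKIE), never from a global variable of the same name.
function readField(array $source, string $key, string $default = ''): string
{
    // Reject missing keys and non-string values such as nested arrays (field[]=x).
    if (!isset($source[$key]) || !is_string($source[$key])) {
        return $default;
    }
    return $source[$key];
}

// In a request handler this would be called as readField($_POST, 'variable').
$fromPost = readField(['variable' => 'abc'], 'variable');
$missing  = readField([], 'variable');
```

The value returned is still untrusted; the helper only makes its origin obvious.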
Rule 3: If you can’t understand it, you can’t protect it
Some developers use strange syntax, or organize statements very tightly, resulting in short but ambiguous code. This approach may be efficient, but if you don't understand what the code is doing, you can't decide how to protect it.
For example, which of the following two pieces of code do you like?
Listing 4. Make the code easy to protect

<?php
//obfuscated code
$input = (isset($_POST['username']) ? $_POST['username'] : '');
//unobfuscated code
$input = '';
if (isset($_POST['username'])){
    $input = $_POST['username'];
}else{
    $input = '';
}
?>

In the second, clearer snippet, it's easy to see that $input is tainted and needs to be cleaned before it can be safely processed.
Rule 4: “Defense in depth” is the new magic weapon
This tutorial will use examples to illustrate how to protect online forms while taking the necessary measures in the PHP code that handles the form. Likewise, even if you use PHP regex to ensure that GET variables are entirely numeric, you can still take steps to ensure that SQL queries use escaped user input.
Defense in depth is not just a good idea, it ensures that you don’t get into serious trouble.
Now that the basic rules have been discussed, let’s look at the first threat: SQL injection attacks.
Prevent SQL injection attacks
In a SQL injection attack, the user adds information to a database query by manipulating a form or GET query string. For example, assume you have a simple login database. Each record in this database has a username field and a password field. Build a login form to allow users to log in.
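Listing 5. Simple login form

<html>
<head>
<title>Login</title>
</head>
<body>
<form action="verify.php" method="post">
<p><label for="user">Username</label>
<input type="text" name="user" id="user"/></p>
<p><label for="pw">Password</label>
<input type="password" name="pw" id="pw"/></p>
<p><input type="submit" value="login"/></p>
</form>
</body>
</html>
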
This form accepts user input for a username and password and submits the user input to a file called verify.php. In this file, PHP processes data from the login form as follows:
Listing 6. Unsafe PHP form handling code

<?php
session_start();
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];
//unsafe: user input is concatenated directly into the query
$sql = "select count(*) as ctr from users where username='".$username."' and password='".$pw."' limit 1";
$result = mysql_query($sql);
while ($data = mysql_fetch_object($result)){
    if ($data->ctr == 1){
        //they're okay to enter the application!
        $okay = 1;
    }
}
if ($okay){
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
}else{
    header("Location: login.php");
}
?>

This code looks fine, right? Code like this is used by hundreds (if not thousands) of PHP/MySQL sites around the world. What's wrong with it? Well, remember that user input cannot be trusted. No information from the user is escaped here, which leaves the application vulnerable. Specifically, any type of SQL injection attack is possible.
For example, if the user enters foo as the username and ' or '1'='1 as the password, the following string is built by PHP and then passed as a query to MySQL:

<?php
$sql = "select count(*) as ctr from users where username='foo' and password='' or '1'='1' limit 1";
?>

Because '1'='1' is always true, the WHERE clause of this query matches every row regardless of the password. With a single user in the table the count is 1, so PHP allows access. By injecting malicious SQL at the end of the password string, the attacker can impersonate a legitimate user.
The solution to this problem is to wrap any user input in PHP's built-in mysql_real_escape_string() function. This function escapes characters in a string so that special characters such as apostrophes reach MySQL as literal data rather than as SQL syntax. Listing 7 shows the code with escape processing.
Listing 7. Secure PHP form processing code

<?php
session_start();
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];
//every piece of user input is escaped before it reaches the query
$sql = "select count(*) as ctr from users where username='".mysql_real_escape_string($username)."' and password='".mysql_real_escape_string($pw)."' limit 1";
$result = mysql_query($sql);
while ($data = mysql_fetch_object($result)){
    if ($data->ctr == 1){
        //they're okay to enter the application!
        $okay = 1;
    }
}
if ($okay){
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
}else{
    header("Location: login.php");
}
?>

Wrapping user input in mysql_real_escape_string() keeps malicious SQL in that input from being injected. If a user attempts to pass a malformed password via SQL injection, the following query is passed to the database:
select count(*) as ctr from users where username='foo' and password='\' or \'1\'=\'1' limit 1
Nothing in the database matches such a password. One simple step closed a big hole in the web application. The lesson here is that you should always escape user input used in SQL queries.
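The mysql_* functions used in these listings were removed in PHP 7.0, so on current PHP versions the same protection is usually obtained with prepared statements, which keep user input out of the SQL text altogether. The following is a minimal sketch and not the original author's code. It assumes PDO with the pdo_sqlite driver (used only so the demonstration runs without a server), a hypothetical checkLogin() helper, and a hypothetical password_hash column holding hashes created with password_hash() instead of plain-text passwords.

```php
<?php
// Hypothetical helper: returns true when the username/password pair is valid.
// Assumes a `users` table with `username` and `password_hash` columns.
function checkLogin(PDO $db, string $username, string $password): bool
{
    // The ? placeholder is bound separately, so input never becomes SQL syntax.
    $stmt = $db->prepare('SELECT password_hash FROM users WHERE username = ? LIMIT 1');
    $stmt->execute([$username]);
    $hash = $stmt->fetchColumn();

    // fetchColumn() returns false when no row matched.
    return is_string($hash) && password_verify($password, $hash);
}

// Demonstration against a throwaway in-memory SQLite database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (username TEXT PRIMARY KEY, password_hash TEXT)');
$insert = $db->prepare('INSERT INTO users (username, password_hash) VALUES (?, ?)');
$insert->execute(['foo', password_hash('s3cret', PASSWORD_DEFAULT)]);

var_dump(checkLogin($db, 'foo', 's3cret'));      // bool(true)
var_dump(checkLogin($db, 'foo', "' or '1'='1")); // bool(false)
```

With MySQL, only the connection string passed to the PDO constructor would differ; the helper itself stays the same.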
However, there are still a few security holes to close.
Preventing user manipulation of GET variables
In the previous section, you prevented users from logging in with a malformed password. If you are smart, you will apply the technique you learned to ensure that all user input to SQL statements is escaped.
However, the user is now safely logged in. Having a valid password does not mean the user will play by the rules; there are many opportunities to cause damage. For example, the application may allow the user to view special content through links pointing to locations such as template.php?pid=33 or template.php?pid=321. The part of the URL after the question mark is called the query string. Because the query string is placed directly in the URL, it is also called the GET query string. In PHP, if register_globals is disabled, you can access this string with $_GET['pid']. In template.php, you might do something similar to Listing 8.
Listing 8. Example template.php

<?php
$pid = $_GET['pid'];
//we create an object of a fictional class Page
$obj = new Page;
$content = $obj->fetchPage($pid);
//and now we have a bunch of PHP that displays the page
?>

What's wrong here? First of all, the code implicitly trusts that the GET variable pid from the browser is safe. Most users are not sophisticated enough to construct semantic attacks, but if they notice pid=33 in the browser's URL location field, they can start causing trouble. If they put in another number, that's probably fine; but what happens if they put in something else, like a SQL command, the name of a file (like /etc/passwd), or some other shenanigans such as a value 3,000 characters long?
In this case, remember the basic rule: don't trust user input. The application developer knows that the page identifier (PID) accepted by template.php should be numeric, so PHP's is_numeric() function can be used to ensure that non-numeric PIDs are not accepted, as shown below:
Listing 9. Use is_numeric() to limit GET variables



<?php
$pid = $_GET['pid'];
if (is_numeric($pid)){
    //we create an object of a fictional class Page
    $obj = new Page;
    $content = $obj->fetchPage($pid);
    //and now we have a bunch of PHP that displays the page
}else{
    //didn't pass the is_numeric() test, do something else!
}
?>


This method seems to work, but the following inputs can all pass the is_numeric() check:
100 (valid)
100.1 (should not have decimal places)
+0123.45e6 (scientific notation - bad)
0xff33669f (hex - Danger! Danger! Accepted by is_numeric() before PHP 7.0)
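To see how these inputs behave on your own PHP version, a short sketch like the following can help. It compares is_numeric() with ctype_digit(), which accepts only strings made up entirely of decimal digits. Using ctype_digit() here is a suggestion rather than part of the original listing, and it assumes the ctype extension is enabled.

```php
<?php
// Compare is_numeric() with ctype_digit() on the inputs listed above.
$samples = ['100', '100.1', '+0123.45e6', '0xff33669f'];

foreach ($samples as $value) {
    printf(
        "%-12s is_numeric=%s ctype_digit=%s\n",
        $value,
        var_export(is_numeric($value), true),
        var_export(ctype_digit($value), true)
    );
}
```

Only the first sample should be reported as all digits by ctype_digit(); the is_numeric() column depends on the PHP version for the hex value.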
So what should security-conscious PHP developers do? Years of experience have shown that the best practice is to use a regular expression to ensure that the entire GET variable consists of digits, as shown below:

Listing 10. Using regular expressions to restrict GET variables



<?php
$pid = $_GET['pid'];
if (strlen($pid)){
    //preg_match() replaces ereg(), which was removed in PHP 7.0
    if (!preg_match('/^[0-9]+$/D', $pid)){
        //do something appropriate, like maybe logging them out or sending them back to home page
    }
}else{
    //empty $pid, so send them back to the home page
}
//we create an object of a fictional class Page, which is now
//moderately protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
//and now we have a bunch of PHP that displays the page
?>

All you need to do is use strlen() to check whether the length of the variable is non-zero; if so, use an all-digit regular expression to ensure that the data element is valid. If the PID contains letters, slashes, periods, or anything resembling hexadecimal, this routine catches it and blocks the page from user activity. If you look behind the scenes of the Page class, you'll see that a security-minded PHP developer has escaped the user input $pid, thus protecting the fetchPage() method, as shown below:
Listing 11. Escape the fetchPage() method

<?php
class Page{
    function fetchPage($pid){
        //desc is a reserved word in MySQL, so it is quoted with backticks
        $sql = "select pid,title,`desc`,kw,content,status from page where pid='".mysql_real_escape_string($pid)."'";
    }
}
?>

You may ask, "Since we have ensured that the PID is a number, why do we need to escape it?" Because you don't know how many different contexts and situations the fetchPage() method will be used in. Protection must be in place everywhere this method is called, and escaping inside the method is what defense in depth means.
What happens if the user tries to enter a very long value, such as up to 1000 characters, trying to launch a buffer overflow attack? The next section discusses this in more detail, but for now you can add another check to ensure that the input PID is of the correct length. You know that the maximum length of the database's pid field is 5 digits, so you can add the following check.
Listing 12. Using regular expressions and length checks to limit GET variables

<?php
$pid = $_GET['pid'];
if (strlen($pid)){
    //reject the value if it is not all digits OR if it is longer than 5 digits
    if (!preg_match('/^[0-9]+$/D', $pid) || strlen($pid) > 5){
        //do something appropriate, like maybe logging them out or sending them back to home page
    }
}else{
    //empty $pid, so send them back to the home page
}
//we create an object of a fictional class Page, which is now
//even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
//and now we have a bunch of PHP that displays the page
?>
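One way to fold the digit check and the length check into a single step, and to make sure processing really stops on bad input, is a small validation function. This is only a sketch: validatePid() is a hypothetical name, and the 5-digit limit follows the field width assumed in the text.

```php
<?php
// Hypothetical helper: returns the pid as an int, or null when it is not acceptable.
function validatePid($raw, int $maxDigits = 5): ?int
{
    // A parameter such as ?pid[]=1 arrives as an array, so insist on a string.
    if (!is_string($raw)) {
        return null;
    }
    // 1 to $maxDigits ASCII digits and nothing else; \z rejects a trailing newline.
    if (preg_match('/\A[0-9]{1,' . $maxDigits . '}\z/', $raw) !== 1) {
        return null;
    }
    return (int) $raw;
}

// In template.php this would be called as validatePid($_GET['pid'] ?? null).
var_dump(validatePid('33'));          // int(33)
var_dump(validatePid('123456'));      // NULL
var_dump(validatePid('/etc/passwd')); // NULL
```

The caller then only needs one test: if the result is null, redirect to the home page and do not call fetchPage() at all.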

Now nobody can cram a 5,000-digit value into your database application, at least not where GET strings are involved. Imagine the hackers gnashing their teeth when they are frustrated in their attempts to break into your application! And because error reporting is turned off, it's harder for hackers to conduct reconnaissance.
Buffer Overflow Attack
A buffer overflow attack attempts to overflow a memory allocation buffer in a PHP application (or, more precisely, in Apache or the underlying operating system). Keep in mind that you may be writing your web application in a high-level language like PHP, but ultimately you're calling C (in the case of Apache). Like most low-level languages, C has strict rules for memory allocation.
Buffer overflow attacks send a large amount of data to the buffer, causing part of the data to overflow into adjacent memory buffers, thereby destroying the buffer or rewriting the logic. This can cause a denial of service, corrupt data, or execute malicious code on the remote server.
The only way to prevent buffer overflow attacks is to check the length of all user input. For example, if you have a form element that asks for the user's name, add a maxlength attribute with a value of 40 on this field and check it using substr() on the backend. Listing 13 gives a brief example of the form and PHP code.
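Listing 13. Checking the length of user input

<?php
if (isset($_POST['submit']) && $_POST['submit'] == "go"){
    $name = substr($_POST['name'],0,40);
}
?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40"/></p>
<p><input type="submit" name="submit" value="go"/></p>
</form>
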
Why provide the maxlength attribute and also perform the substr() check on the backend? Because defense in depth is always good. The browser prevents users from entering very long strings that PHP or MySQL cannot safely handle (imagine someone trying to enter a name that is 1,000 characters long), while the backend PHP check ensures that no one is manipulating form data remotely or in the browser.
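Truncating with substr() quietly alters the value. If you would rather refuse overlong input on the server, a check along these lines is one option. It is a sketch only: acceptName() is a hypothetical helper, and the 40-byte limit mirrors the maxlength value used in the form.

```php
<?php
// Hypothetical helper: returns the trimmed name, or null if it is empty or too long.
function acceptName(string $raw, int $maxBytes = 40): ?string
{
    $name = trim($raw);
    // strlen() counts bytes, so multibyte (UTF-8) names reach the limit sooner.
    if ($name === '' || strlen($name) > $maxBytes) {
        return null;
    }
    return $name;
}

var_dump(acceptName('  Tom '));              // string(3) "Tom"
var_dump(acceptName(str_repeat('a', 1000))); // NULL
```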
As you can see, this approach is similar to using strlen() in the previous section to check the length of the GET variable pid. In that example, any input value longer than 5 digits is rejected, but the value could just as easily be truncated to an appropriate length, as shown below:
Listing 14. Changing the length of the input GET variable

<?php
$pid = $_GET['pid'];
if (strlen($pid)){
    if (!preg_match('/^[0-9]+$/D', $pid)){
        //if non numeric $pid, send them back to home page
    }
}else{
    //empty $pid, so send them back to the home page
}
//we have a numeric pid, but it may be too long, so let's check
if (strlen($pid)>5){
    $pid = substr($pid,0,5);
}
//we create an object of a fictional class Page, which is now
//even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
//and now we have a bunch of PHP that displays the page
?>

Note that buffer overflow attacks are not limited to long strings of numbers or letters. You may also see long hexadecimal strings (often looking like \xA3 or \xFF). Remember, the goal of any buffer overflow attack is to flood a specific buffer and place malicious code or instructions into the next buffer, thereby corrupting data or executing malicious code. The simplest way to deal with hex buffer overflow is to not allow input to exceed a certain length.
If you are dealing with a form text area that allows longer entries in the database, there is no way to easily limit the length of the data on the client side. After the data reaches PHP, you can use regular expressions to clear out any hex-like strings.
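Listing 15. Prevent hexadecimal strings

<?php
if (isset($_POST['submit']) && $_POST['submit'] == "go"){
    $name = substr($_POST['name'],0,40);
    //clean out any potential hexadecimal characters
    $name = cleanHex($name);
    //continue processing....
}
//remove a backslash followed by x or X and one to three hex digits
function cleanHex($input){
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', "", $input);
    return $clean;
}
?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40"/></p>
<p><input type="submit" name="submit" value="go"/></p>
</form>
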
You may find this series of operations a bit too strict. After all, hexadecimal strings have legitimate uses, such as printing characters in a foreign language. How you deploy the hex regex is up to you. A better strategy is to only remove hex strings if there are too many of them on a line, or if the string exceeds a certain number of characters (such as 128 or 255).
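The more lenient strategy described above could be sketched as follows. The function name cleanHexIfSuspicious() and both thresholds (more than 5 hex sequences, or more than 128 bytes) are illustrative assumptions, not values from the original article.

```php
<?php
// Hypothetical helper: strip \xNN-style sequences only when the input looks suspicious.
function cleanHexIfSuspicious(string $input, int $maxSequences = 5, int $maxLength = 128): string
{
    // A literal backslash, then x or X, then one to three hex digits.
    $pattern = '/\\\\[xX][0-9A-Fa-f]{1,3}/';

    $count = preg_match_all($pattern, $input);
    if ($count > $maxSequences || strlen($input) > $maxLength) {
        // Fall back to the untouched input if the regex engine reports an error.
        return preg_replace($pattern, '', $input) ?? $input;
    }
    return $input;
}

// A single sequence in a short string is left alone; a run of them is removed.
var_dump(cleanHexIfSuspicious('caf\xE9'));
var_dump(cleanHexIfSuspicious(str_repeat('\xFF', 6)));
```

Note the single-quoted strings: inside them `\x` stays a literal backslash followed by x, which is exactly the kind of text a form would deliver.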
Cross-site scripting attack
In a cross-site scripting (XSS) attack, there is often a malicious user entering information into a form (or through other user input methods), which inserts malicious client-side tags into the process or in the database. For example, let's say you have a simple guest book program on your site that allows visitors to leave their name, email address, and a brief message. A malicious user could take advantage of this opportunity to insert something other than a brief message, such as an image that would be inappropriate for other users or Javascript that would redirect the user to another site, or steal cookie information.
Fortunately, PHP provides the strip_tags() function, which strips HTML tags from a string. The strip_tags() function also lets you provide a list of allowed tags, such as <b> or <i>.
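A short sketch of strip_tags() with an allowed-tag list, using a made-up guest book message. It also shows htmlspecialchars(), an alternative not mentioned in the original text that keeps the input intact and escapes it at output time.

```php
<?php
$message = 'Hello <b>world</b><script>alert(1)</script>';

// Remove every tag except <b> and <i>. The text between removed tags stays.
$stripped = strip_tags($message, '<b><i>');

// Alternative: keep the text as entered and escape it when printing it into HTML.
$escaped = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n"; // Hello <b>world</b>alert(1)
echo $escaped, "\n";
```

Be aware that strip_tags() leaves attributes on allowed tags untouched, so an allowed <b> tag can still carry an event-handler attribute; escaping on output avoids that problem.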
In-browser data manipulation
There is a type of browser plug-in that allows users to tamper with header elements and form elements on the page. Using Tamper Data, a Mozilla plug-in, it's easy to manipulate simple forms with many hidden text fields to send instructions to PHP and MySQL.
Before the user clicks Submit on the form, he can start Tamper Data. When submitting the form, he will see a list of form data fields. Tamper Data allows the user to tamper with this data before the browser completes the form submission.
Let’s go back to the example we built earlier. String length has been checked, HTML tags cleaned, and hexadecimal characters removed. However, some hidden text fields are added as follows:
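Listing 17. Hidden variables

<?php
if (isset($_POST['submit']) && $_POST['submit'] == "go"){
    //strip_tags
    $name = strip_tags($_POST['name']);
    $name = substr($name,0,40);
    //clean out any potential hexadecimal characters
    $name = cleanHex($name);
    //continue processing....
}
function cleanHex($input){
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', "", $input);
    return $clean;
}
?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40"/></p>
<input type="hidden" name="table" value="users"/>
<input type="hidden" name="action" value="create"/>
<p><input type="submit" name="submit" value="go"/></p>
</form>
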
Note that one of the hidden variables exposes the table name: users. You'll also see an action field with a value of create. Anyone with basic SQL experience can tell that these commands probably control a SQL engine in the middleware. Someone who wants to wreak havoc can simply change the table name or provide another option, such as delete.
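A common defense is to stop trusting such fields: keep the table name on the server and accept only a fixed set of actions. The sketch below assumes a hypothetical resolveAction() helper and a hypothetical list of permitted actions; neither appears in the original example.

```php
<?php
// Hypothetical whitelist: the only actions this form handler will ever perform.
const ALLOWED_ACTIONS = ['create', 'update'];

// Returns the requested action if it is on the whitelist, or null otherwise.
// The table name is deliberately not read from the request at all.
function resolveAction(array $post): ?string
{
    $action = $post['action'] ?? null;
    // Strict comparison so that only the exact expected strings pass.
    if (!is_string($action) || !in_array($action, ALLOWED_ACTIONS, true)) {
        return null;
    }
    return $action;
}

// In a handler this would be called as resolveAction($_POST).
var_dump(resolveAction(['action' => 'create'])); // string(6) "create"
var_dump(resolveAction(['action' => 'delete'])); // NULL
```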
What problems remain? Remote form submission.
Remote form submission
The benefit of the Web is that it lets you share information and services. The downside is also that it shares information and services, because some people act without any scruples.
Take a form as an example. Anyone can visit a Web site and create a local copy of a form using File > Save As in the browser. They can then modify the action parameter to point to a fully qualified URL (not formHandler.php, but http://www.yoursite.com/formHandler.php, because the form lives on that site), make any modifications they want, and click Submit. The server will receive this form data as legitimate traffic.
First of all, you may consider checking $_SERVER['HTTP_REFERER'] to determine whether the request comes from your own server. This method can block most malicious users, but it cannot block the most sophisticated hackers. These people are smart enough to tamper with the referrer information in the header to make the remote copy of the form look like it was submitted from your server.
A better way to handle remote form submission is to generate a token based on a unique string or timestamp and put this token in the session variable and the form. After submitting the form, check if the two tokens match. If it doesn't match, you know someone is trying to send data from a remote copy of the form.
To create a random token, you can use PHP's built-in md5(), uniqid() and rand() functions, as shown below:
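Listing 18. Defense against remote form submission

<?php
session_start();
if (isset($_POST['submit']) && $_POST['submit'] == "go"){
    //check token: both must be present, otherwise two missing values would compare as equal
    if (isset($_POST['token'], $_SESSION['token']) && $_POST['token'] === $_SESSION['token']){
        //strip_tags
        $name = strip_tags($_POST['name']);
        $name = substr($name,0,40);
        //clean out any potential hexadecimal characters
        $name = cleanHex($name);
        //continue processing....
    }else{
        //stop all processing! remote form posting attempt!
    }
}
$token = md5(uniqid(rand(), true));
$_SESSION['token'] = $token;
function cleanHex($input){
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', "", $input);
    return $clean;
}
?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40"/></p>
<input type="hidden" name="token" value="<?php echo $token; ?>"/>
<p><input type="submit" name="submit" value="go"/></p>
</form>
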
This technique works because session data cannot be migrated between servers in PHP. Even if someone obtains your PHP source code, moves it to their own server, and submits information to your server, all your server will receive is an empty or malformed session token and the originally provided form token. They don't match and the remote form submission fails.
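On PHP 7 and later, the token can be drawn from the system's cryptographic random source and compared in constant time. The following is a sketch of that variation rather than the article's own method; issueToken() and tokenIsValid() are hypothetical helper names, and $session stands in for $_SESSION so the example can run outside a web request.

```php
<?php
// Hypothetical helper: create a fresh token and remember it in the session array.
function issueToken(array &$session): string
{
    // 32 bytes from the system CSPRNG, hex-encoded for use in a hidden field.
    $session['token'] = bin2hex(random_bytes(32));
    return $session['token'];
}

// Hypothetical helper: true only if a non-empty submitted token matches the stored one.
function tokenIsValid(array $session, $submitted): bool
{
    return isset($session['token'])
        && is_string($session['token'])
        && is_string($submitted)
        && $submitted !== ''
        // hash_equals() compares in constant time to avoid timing leaks.
        && hash_equals($session['token'], $submitted);
}

$session = [];
$token = issueToken($session);
var_dump(tokenIsValid($session, $token)); // bool(true)
var_dump(tokenIsValid($session, ''));     // bool(false)
```

In a real handler, $session would be $_SESSION after session_start(), and $submitted would be $_POST['token'] ?? null.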
