Home > Article > Backend Development > Use PHP programming to prevent XSS cross-site scripting attacks_PHP tutorial
Many domestic forums have cross-site scripting vulnerabilities, and there are many such examples abroad. Even Google has appeared, but it was corrected in early December. (Editor's note: Regarding cross-site scripting vulnerability attacks, readers can refer to "Detailed Explanation of XSS Cross-Site Scripting Attacks"). Cross-site attacks are easy to construct and are very subtle and difficult to detect (usually stealing information and immediately jumping back to the original page).
I won’t explain here how to attack (and don’t ask me), but mainly how to prevent it. First of all, cross-site scripting attacks are caused by the lack of strict filtering of user input, so we must intercept possible dangers before all data enters our website and database. For illegal HTML codes including single and double quotation marks, you can use htmlentities().
<?php$str = "A 'quote' is <b>bold</b>";// Outputs: A 'quote' is <b>bold</b>echo htmlentities($str);// Outputs: A 'quote' is <b>bold</b>echo htmlentities($str, ENT_QUOTES);?> |
This will invalidate illegal scripts.
But please note that the default encoding of htmlentities() is ISO-8859-1. If your illegal script is encoded in other formats, it may not be filtered out, but the browser can recognize and execute it. I will first find a few sites to test this issue before talking about it.
Here is a function to filter illegal scripts:
function RemoveXSS($val) { // remove all non-printable characters. CR(0a) and LF(0b) and TAB(9) are allowed // this prevents some character re-spacing such as <javascript> // note that you have to handle splits with n, r, and t later sincethey *are* allowed in some inputs $val = preg_replace('/([x00-x08][x0b-x0c][x0e-x20])/', '', $val); // straight replacements, the user should never need these since they're normal characters // this prevents like <IMG SRC=@avascript:a&_#X6Cert('XSS')> $search = 'abcdefghijklmnopqrstuvwxyz'; $search .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'; $search .= '1234567890!@#$%^&*()'; $search .= '~`";:?+/={}[]-_|'\'; for ($i = 0; $i < strlen($search); $i++) { // ;? matches the ;, which is optional // 0{0,7} matches any padded zeros, which are optional and go up to 8 chars // @ @ search for the hex values $val = preg_replace('/([x|X]0{0,8}'.dechex(ord($search[$i])).';?)/i',$search[$i], $val); // with a ; // @ @ 0{0,7} matches '0' zero to seven times $val = preg_replace('/({0,8}'.ord($search[$i]).';?)/', $search[$i], $val); // with a ; } // now the only remaining whitespace attacks are t, n, and r $ra1 = Array('javascript', 'vbscript', 'expression', 'applet', 'meta', 'xml', 'blink', 'link', 'style', 'script', 'embed', 'object', 'iframe', 'frame', 'frameset', 'ilayer', 'layer', 'bgsound', 'title', 'base'); $ra2 = Array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate','ondrag', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterchange', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup','onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout','onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart','onrowenter', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect','onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload'); $ra = array_merge($ra1, $ra2); $found = true; // keep replacing as long as the previous round replaced something while ($found == true) { $val_before = $val; for ($i = 0; $i < sizeof($ra); $i++) { $pattern = '/'; for ($j = 0; $j < strlen($ra[$i]); $j++) { if ($j > 0) { $pattern .= '('; $pattern .= '([x|X]0{0,8}([9][a][b]);?)?'; $pattern .= '|({0,8}([9][10][13]);?)?'; $pattern .= ')?'; } $pattern .= $ra[$i][$j]; } $pattern .= '/i'; $replacement = substr($ra[$i], 0, 2).'<x>'.substr($ra[$i], 2); // add in <> to nerf the tag $val = preg_replace($pattern, $replacement, $val); // filter out the hex tags if ($val_before == $val) { // no replacements were made, so exit the loop $found = false; } } } } |