The difference between PHP htmlentities and htmlspecialchars

WBOY
2016-07-21 15:50:56

The translations performed are:

'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
''' (single quote) becomes '&#039;' only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'
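The quote rules in this list depend on the flag passed as the second argument. The following is a minimal sketch of that behaviour, assuming a UTF-8 source file; the sample string and variable names are made up for illustration.

```php
<?php
// Sample input containing every character from the list above.
$s = "Tom & \"Jerry\" 'n' <b>";

// ENT_COMPAT: double quotes are converted, single quotes are left alone.
$compat = htmlspecialchars($s, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: both double and single quotes are converted.
$quotes = htmlspecialchars($s, ENT_QUOTES, 'UTF-8');

// ENT_NOQUOTES: neither kind of quote is converted.
$noquotes = htmlspecialchars($s, ENT_NOQUOTES, 'UTF-8');

echo $compat, "\n";   // Tom &amp; &quot;Jerry&quot; 'n' &lt;b&gt;
echo $quotes, "\n";   // Tom &amp; &quot;Jerry&quot; &#039;n&#039; &lt;b&gt;
echo $noquotes, "\n"; // Tom &amp; "Jerry" 'n' &lt;b&gt;
```

Passing the flag and the encoding explicitly avoids relying on defaults, which have changed between PHP versions.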

htmlspecialchars converts only the few characters listed above, while htmlentities converts every character that has a named HTML entity. If the character set is not specified correctly, htmlentities may also convert the bytes of Chinese text that it does not recognize.

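We can take a simple example for comparison:

$str = '<a href="test.html">测试页面</a>';
echo htmlentities($str);
// &lt;a href=&quot;test.html&quot;&gt;&sup2;&acirc;&Ecirc;&Ocirc;&Ograve;&sup3;&Atilde;&aelig;&lt;/a&gt;

$str = '<a href="test.html">测试页面</a>';
echo htmlspecialchars($str);
// &lt;a href=&quot;test.html&quot;&gt;测试页面&lt;/a&gt;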

The usual conclusion is that when the text contains Chinese, it is best to use htmlspecialchars; otherwise the output may be garbled.

Also refer to this custom function:

function my_excerpt( $html, $len ) {
    // $html should contain an HTML document.
    // This example removes HTML tags, JavaScript code
    // and whitespace characters. It also converts some common
    // HTML entities into the corresponding text.
    $search = array ("'<script[^>]*?>.*?</script>'si", // Remove JavaScript
        "'<[\/\!]*?[^<>]*?>'si", // Remove HTML tags
        "'([\r\n])[\s]+'", // Remove whitespace after a line break
        "'&(quot|#34);'i", // Replace HTML entities
        "'&(amp|#38);'i",
        "'&(lt|#60);'i",
        "'&(gt|#62);'i",
        "'&(nbsp|#160);'i",
        "'&(iexcl|#161);'i",
        "'&(cent|#162);'i",
        "'&(pound|#163);'i",
        "'&(copy|#169);'i");
    $replace = array ("",
        "",
        "\\1",
        "\"",
        "&",
        "<",
        ">",
        " ",
        chr(161), // chr() yields single bytes (ISO-8859-1), not UTF-8
        chr(162),
        chr(163),
        chr(169));
    $text = preg_replace ($search, $replace, $html);
    // The original listing handled numeric entities with the /e modifier
    // ("'&#(\d+);'e" and "chr(\\1)"). That modifier was removed in PHP 7,
    // so a callback is used here instead.
    $text = preg_replace_callback ("'&#(\d+);'", function ($m) {
        return chr((int) $m[1]);
    }, $text);
    $text = trim($text);
    // As in the original, a text shorter than $len yields an empty string.
    return mb_strlen($text) >= $len ? mb_substr($text, 0, $len) : '';
}
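For comparison, a similar excerpt helper can be sketched with PHP built-ins instead of a hand-written entity table. This is only an illustration under stated assumptions: excerpt_sketch is a hypothetical name, the input is assumed to be UTF-8, the mbstring extension is assumed to be loaded, and unlike my_excerpt it returns the whole text when the text is shorter than $len.

```php
<?php
// Hypothetical helper, not part of the original article.
function excerpt_sketch(string $html, int $len): string
{
    // Drop <script> blocks together with their contents.
    $text = preg_replace('#<script\b[^>]*>.*?</script>#si', '', $html);
    // Remove the remaining tags but keep their text content.
    $text = strip_tags($text);
    // Decode named and numeric entities (&amp;, &copy;, &#169;, ...) in one pass.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    // Collapse runs of whitespace into a single space.
    $text = trim(preg_replace('/\s+/u', ' ', $text));
    // Cut by characters, not bytes.
    return mb_substr($text, 0, $len, 'UTF-8');
}

$excerpt = excerpt_sketch('<p>Tom &amp; Jerry &copy; 2016</p><script>alert(1)</script>', 11);
echo $excerpt, "\n"; // Tom & Jerry
```

Because html_entity_decode knows the full entity table, the sketch does not need one pattern per entity.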

The htmlspecialchars() function is similar to the htmlentities() function: both convert characters into HTML entities. htmlspecialchars_decode() converts the entities produced by htmlspecialchars() back into the original characters.
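A round trip through both functions can be sketched as follows. The sample string and variable names are illustrative, and the same flag is passed to both calls so that quotes are treated consistently.

```php
<?php
$original = '<a href="test.html">Test & more</a>';

// Encode the characters that are special in HTML.
$encoded = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');

// Decode them again with the same quote flag.
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

echo $encoded, "\n"; // &lt;a href=&quot;test.html&quot;&gt;Test &amp; more&lt;/a&gt;
var_dump($decoded === $original); // bool(true)
```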


I have always known that the htmlentities and htmlspecialchars functions in PHP convert special characters in HTML into the corresponding character entities. I have also always known that the two functions differ, but since I never used them, I never studied the difference.

I needed them today. Too lazy to read the English in the PHP manual, I assumed someone must already have written about this in Chinese, so I Googled the keywords "htmlentities htmlspecialchars". The answers were all nearly identical; copying and pasting has become so routine that even a primary school student can do it. After comparing them, I found that each article consists of roughly two parts:

The first part is a reference to the PHP manual:

The PHP manual writes about htmlspecialchars:

The translations performed are:

'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
''' (single quote) becomes '&#039;' only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'

This part is fine, but the explanation in the second part is not quite correct:

htmlspecialchars only converts the few HTML codes above, while htmlentities converts all HTML codes, including the Chinese characters it cannot recognize.

We can take a simple example for comparison:

<?php
$str = '<a href="test.html">测试页面</a>';
echo htmlentities($str);
// &lt;a href=&quot;test.html&quot;&gt;&sup2;&acirc;&Ecirc;&Ocirc;&Ograve;&sup3;&Atilde;&aelig;&lt;/a&gt;

$str = '<a href="test.html">测试页面</a>';
echo htmlspecialchars($str);
// &lt;a href=&quot;test.html&quot;&gt;测试页面&lt;/a&gt;

?>

The conclusion is that when there is Chinese, it is best to use htmlspecialchars, otherwise the output may be garbled.

Does the htmlentities function take only one parameter? Of course not! htmlentities also has three optional parameters, namely $quote_style, $charset and $double_encode (current versions of the manual call the first two $flags and $encoding). The manual describes the $charset parameter as follows:

Defines character set used in conversion. The default character set is ISO-8859-1.

(That was the default before PHP 5.4; later versions default to UTF-8.)

Judging from the output of the program above, $str is encoded in GB2312. The hexadecimal bytes for the characters "测试页面" ("test page") are:

B2 E2 CA D4 D2 B3 C3 E6

However, they are interpreted as ISO-8859-1:

²âÊÔÒ³Ãæ

These characters correspond exactly to the HTML character entities:

&sup2;&acirc;&Ecirc;&Ocirc;&Ograve;&sup3;&Atilde;&aelig;

so of course htmlentities escapes them. But as long as the correct encoding is passed as a parameter, the so-called Chinese garbling problem does not occur at all:

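$str = '<a href="test.html">测试页面</a>';

echo htmlentities($str, ENT_COMPAT, 'gb2312');
// &lt;a href=&quot;test.html&quot;&gt;测试页面&lt;/a&gt;

Three men make a tiger: a falsehood repeated often enough gets passed on as truth.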
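The effect of the encoding parameter can be reproduced without a GB2312 source file. The sketch below assumes the script itself is saved as UTF-8; it feeds the GB2312 bytes from the hex dump above to htmlentities under the wrong charset, then escapes a UTF-8 string under the right one. Variable names are illustrative.

```php
<?php
// The GB2312 bytes of "测试页面", written out from the hex dump above.
$gb2312Bytes = "\xB2\xE2\xCA\xD4\xD2\xB3\xC3\xE6";

// Declared as ISO-8859-1, each byte is read as one Latin-1 character
// and replaced by that character's named entity.
$misread = htmlentities($gb2312Bytes, ENT_COMPAT, 'ISO-8859-1');

// A UTF-8 string declared as UTF-8: only the markup characters are escaped,
// because the Chinese characters have no named entities.
$utf8 = '<a href="test.html">测试页面</a>';
$correct = htmlentities($utf8, ENT_COMPAT, 'UTF-8');

echo $misread, "\n"; // &sup2;&acirc;&Ecirc;&Ocirc;&Ograve;&sup3;&Atilde;&aelig;
echo $correct, "\n"; // &lt;a href=&quot;test.html&quot;&gt;测试页面&lt;/a&gt;
```

The garbling comes from the mismatch between the declared and the actual encoding, not from htmlentities itself.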

Conclusion: The difference between htmlentities and htmlspecialchars is that htmlentities converts every character that has an HTML character entity, while htmlspecialchars converts only the few characters listed in the manual (that is, the basic characters that affect HTML parsing). Generally speaking, htmlspecialchars is sufficient for escaping the basic characters, and there is no need for htmlentities. If you do use htmlentities, take care to pass the correct encoding as the third parameter.
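The difference described in this conclusion shows up on any character that has a named entity but is not special to HTML parsing. A minimal sketch, assuming a UTF-8 source file and an illustrative sample string:

```php
<?php
$s = 'café © 2016 <b>';

// Only the characters that affect HTML parsing are converted.
$special = htmlspecialchars($s, ENT_COMPAT, 'UTF-8');

// Every character with a named entity is converted.
$entities = htmlentities($s, ENT_COMPAT, 'UTF-8');

echo $special, "\n";  // café © 2016 &lt;b&gt;
echo $entities, "\n"; // caf&eacute; &copy; 2016 &lt;b&gt;
```

When the page is served in the same encoding as the string, the htmlspecialchars output is already safe to embed, which is why it is usually enough.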
