PHP htmlentities() function: usage and examples
The PHP htmlentities() function converts characters into HTML entities. This article covers its basic usage, parameters, and examples.
Definition and Usage
The htmlentities() function converts every character that has a named HTML entity equivalent into that entity.
Tip: To convert HTML entities back to characters, use the html_entity_decode() function.
Tip: Use the get_html_translation_table() function to return the translation table used by htmlentities().
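The two tips above can be illustrated with a short sketch. The sample string and variable names are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical sample input; the source file is assumed to be UTF-8 encoded.
$original = '<b>Crème brûlée</b>';

// Encode, then decode again with the same flags and character set.
$encoded = htmlentities($original, ENT_QUOTES, 'UTF-8');
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n";                 // &lt;b&gt;Cr&egrave;me br&ucirc;l&eacute;e&lt;/b&gt;
var_dump($decoded === $original);    // bool(true)

// Inspect the translation table that htmlentities() uses for these settings.
$table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');
echo $table['<'], "\n";              // &lt;
```

Decoding with the same flags and character set that were used for encoding is what makes the round trip return the original string.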
Syntax
htmlentities(string,flags,character-set,double_encode)
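A minimal call might look like the following sketch. The input string is a made-up example, and the flags and character set are passed explicitly so the result does not depend on the PHP version's defaults.

```php
<?php
// Hypothetical input containing markup, an ampersand, and an accented letter.
// The source file is assumed to be saved as UTF-8.
$input = '<a href="test.php?a=1&b=2">Café</a>';

// Passing flags and character set explicitly keeps behavior predictable.
$safe = htmlentities($input, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// &lt;a href=&quot;test.php?a=1&amp;b=2&quot;&gt;Caf&eacute;&lt;/a&gt;
```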
Parameters
string: Required. Specifies the string to convert.
flags: Optional. Specifies how to handle quotes and invalid encodings, and which document type to use.
Available quote types:
- ENT_COMPAT - Encodes double quotes only. This was the default before PHP 8.1.
- ENT_QUOTES - Encodes both double and single quotes.
- ENT_NOQUOTES - Does not encode any quotes.
Invalid encoding:
- ENT_IGNORE - Silently discards invalid sequences instead of making the function return an empty string. Avoid this flag, as it may have security implications.
- ENT_SUBSTITUTE - Replaces invalid sequences with the Unicode replacement character, U+FFFD (in UTF-8) or &#xFFFD; (in other character sets), instead of returning an empty string.
- ENT_DISALLOWED - Replaces code points that are invalid for the given document type with the Unicode replacement character, U+FFFD (in UTF-8) or &#xFFFD; (in other character sets).
Additional flags specifying the document type to use:
- ENT_HTML401 - Default. Handles the code as HTML 4.01.
- ENT_HTML5 - Handles the code as HTML 5.
- ENT_XML1 - Handles the code as XML 1.
- ENT_XHTML - Handles the code as XHTML.
character-set: Optional. A string specifying the character set to use. Allowed values:
- UTF-8 - Default. ASCII-compatible multi-byte 8-bit Unicode
- ISO-8859-1 - Western European
- ISO-8859-15 - Western European (adds the Euro sign and the French and Finnish letters missing from ISO-8859-1)
- cp866 - DOS-specific Cyrillic character set
- cp1251 - Windows-specific Cyrillic character set
- cp1252 - Windows-specific Western European character set
- KOI8-R - Russian
- BIG5 - Traditional Chinese, mainly used in Taiwan
- GB2312 - Simplified Chinese, national standard character set
- BIG5-HKSCS - Big5 with Hong Kong extensions
- Shift_JIS - Japanese
- EUC-JP - Japanese
- MacRoman - Character set used by Mac OS
Note: In versions prior to PHP 5.4, an unrecognized character set is ignored and replaced by ISO-8859-1. As of PHP 5.4, an unrecognized character set is ignored and replaced by UTF-8.
double_encode: Optional. A boolean value that specifies whether to encode existing HTML entities.
- TRUE - Default. Everything is converted, including existing entities (for example, &amp; becomes &amp;amp;).
- FALSE - Existing HTML entities are not encoded again.
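To make the quote flags and the double_encode parameter concrete, here is a small sketch. The sample text and variable names are illustrative only. The expected results in the comments assume the default HTML 4.01 document type, where a single quote is encoded as &#039;.

```php
<?php
// Hypothetical sample: a single quote, double quotes, an existing entity, and a tag.
$text = "It's \"quoted\" &amp; <tagged>";

// Quote handling: only the flags argument differs between these calls.
$compat   = htmlentities($text, ENT_COMPAT, 'UTF-8');
$quotes   = htmlentities($text, ENT_QUOTES, 'UTF-8');
$noquotes = htmlentities($text, ENT_NOQUOTES, 'UTF-8');

echo $compat, "\n";    // It's &quot;quoted&quot; &amp;amp; &lt;tagged&gt;
echo $quotes, "\n";    // It&#039;s &quot;quoted&quot; &amp;amp; &lt;tagged&gt;
echo $noquotes, "\n";  // It's "quoted" &amp;amp; &lt;tagged&gt;

// double_encode = false leaves the existing &amp; entity untouched.
$noDouble = htmlentities($text, ENT_QUOTES, 'UTF-8', false);
echo $noDouble, "\n";  // It&#039;s &quot;quoted&quot; &amp; &lt;tagged&gt;
```

Setting double_encode to false can be useful when the input may already be partly encoded, at the cost of no longer being able to display a literal "&amp;" as typed.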
Technical details
Return value: The converted string. If string contains an invalid encoding, an empty string is returned unless the ENT_IGNORE or ENT_SUBSTITUTE flag is set.
PHP version: 4+
Changelog:
- PHP 8.1: The default flags changed from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.
- PHP 5.6: The default value of the character-set parameter changed to the value of the default_charset configuration option.
- PHP 5.4: The default value of the character-set parameter changed to UTF-8. Added ENT_SUBSTITUTE, ENT_DISALLOWED, ENT_HTML401, ENT_HTML5, ENT_XML1 and ENT_XHTML.
- PHP 5.3: Added ENT_IGNORE.
- PHP 5.2.3: Added the double_encode parameter.
- PHP 4.1: Added the character-set parameter.
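The return-value behavior for invalid encodings can be sketched as follows. The byte string is a made-up example: "\xE9" is the letter é in ISO-8859-1 but is not a valid UTF-8 sequence on its own. Flags are passed explicitly so the result does not depend on the version's default flags.

```php
<?php
// Hypothetical input: "caf" followed by the ISO-8859-1 byte for é,
// which is an invalid sequence when interpreted as UTF-8.
$bytes = "caf\xE9";

// Without ENT_IGNORE or ENT_SUBSTITUTE, invalid input yields an empty string.
$strict = htmlentities($bytes, ENT_COMPAT, 'UTF-8');
var_dump($strict === '');            // bool(true)

// With ENT_SUBSTITUTE, the invalid byte becomes U+FFFD instead.
$substituted = htmlentities($bytes, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');
var_dump($substituted !== '');       // bool(true)

// Declaring the character set the bytes are really in avoids the problem.
$asLatin1 = htmlentities($bytes, ENT_COMPAT, 'ISO-8859-1');
echo $asLatin1, "\n";                // caf&eacute;
```

An unexpectedly empty result from htmlentities() is therefore often a sign that the character-set argument does not match the actual encoding of the input.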