
HTML-ENTITIES encoding

WBOY
2016-08-08 09:28:21

While crawling web pages with fabpot/goutte (https://github.com/FriendsOfPHP/Goutte), I noticed that whatever encoding the target page uses (gb2312, for example), the text in the final result is always Unicode.
Digging into it, I found that Symfony's crawler converts the page content to HTML-ENTITIES before parsing it:

mb_convert_encoding($content, 'HTML-ENTITIES', $charset);
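To see what that call does, here is a minimal sketch, assuming the mbstring extension is loaded. The sample bytes and variable values are made up for illustration and are not taken from Goutte or Symfony. 'EUC-CN' is the mbstring name for the GB2312 byte encoding. Note that PHP 8.2 and later emit a deprecation notice for the 'HTML-ENTITIES' target, although the conversion still works.

```php
<?php
// Hypothetical input: the two characters U+4E2D U+6587 ("Chinese")
// stored as GB2312 (EUC-CN) bytes, as a crawled gb2312 page would deliver them.
$charset = 'EUC-CN';
$content = "\xD6\xD0\xCE\xC4";

// Same call as in the article: every non-ASCII character becomes a
// numeric character reference, so the result is plain ASCII.
$entities = mb_convert_encoding($content, 'HTML-ENTITIES', $charset);

echo $entities, "\n"; // &#20013;&#25991;
```

The numbers 20013 and 25991 are the decimal Unicode code points of the two characters, which is why the original page encoding no longer matters once the parser has read the references.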

Wikipedia then filled in the basics for me: the numeric character references produced by the HTML-ENTITIES conversion are defined in terms of Unicode code points (http://en.wikipedia.org/wiki/Character_encodings_in_HTML).

Quote:

"A numeric character reference in HTML refers to a character by its Universal Character Set/Unicode code point."
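The relationship in the quote can be checked in the other direction with a short sketch, again assuming mbstring is available (mb_ord requires PHP 7.2 or later). The sample reference and variable names are only illustrative.

```php
<?php
// A decimal numeric character reference for U+4E2D.
$reference = '&#20013;';

// Decode the reference into an actual UTF-8 character.
$char = html_entity_decode($reference, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// The number inside the reference is the character's Unicode code point.
$codePoint = mb_ord($char, 'UTF-8');

var_dump($char === "\u{4E2D}"); // bool(true)
var_dump($codePoint);           // int(20013)
```

Whatever byte encoding the source page used, decoding its references always yields the same Unicode characters.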


Recorded here for future reference.
