How Can I Decode Unicode Escape Sequences to UTF-8 in PHP?

Susan Sarandon
2024-12-29 03:49:15

Decoding Unicode Escape Sequences to UTF-8 Characters in PHP

Question: Is there a built-in function in PHP that can decode Unicode escape sequences like "\u00ed" into the corresponding UTF-8 character, such as "í"?

Answer: While PHP does not provide a direct function for this task, you can use a combination of regular expressions and character encoding functions to achieve the desired result:

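// '\\\\u' in a single-quoted PHP string becomes the regex \\u, which matches a literal backslash followed by "u".
$str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
    // pack('H*', ...) turns the four hex digits into two raw big-endian bytes.
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}, $str);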

The regular expression matches each \uXXXX escape sequence. For every match, pack('H*', ...) converts the four hex digits into two raw bytes, and mb_convert_encoding() converts those bytes from UCS-2BE to UTF-8. The code requires the mbstring extension.
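
As a minimal sketch of how the snippet above might be used, the following wraps it in a function and runs it on a sample string. The function name decode_unicode_escapes and the sample input are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper name; wraps the preg_replace_callback approach shown above.
function decode_unicode_escapes(string $str): string
{
    $result = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function (array $match): string {
        // Four hex digits -> two raw bytes -> UTF-8.
        return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
    }, $str);

    // preg_replace_callback() returns null on error; fall back to the input in that case.
    return $result ?? $str;
}

// Single quotes keep \u00ed as literal text rather than letting PHP interpret it.
$input = 'Mart\u00edn';
echo decode_unicode_escapes($input), PHP_EOL; // prints: Martín
```

The input is written in single quotes on purpose: in a double-quoted PHP string, "\u{00ed}" is PHP's own escape syntax and would already be decoded before the function sees it.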

If the escape sequences represent UTF-16 code units instead:

// Same pattern as before; only the source encoding passed to mb_convert_encoding() changes.
$str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UTF-16BE');
}, $str);

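This variant treats each escape as a UTF-16 code unit, the convention used by JSON and by languages such as JavaScript and Java. Note that the callback converts each four-digit escape on its own, so a character outside the Basic Multilingual Plane, which is written as a surrogate pair of two consecutive escapes, is decoded correctly only if both halves are passed to mb_convert_encoding() together.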
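
One possible way to handle surrogate pairs is to match a whole run of consecutive escapes and convert the run in a single call. The sketch below is an illustration under that assumption, not the original answer's code; the function name decode_utf16_escapes is hypothetical.

```php
<?php
// Hypothetical helper name; decodes runs of \uXXXX escapes as UTF-16BE.
function decode_utf16_escapes(string $str): string
{
    // (?:...)+ matches one or more consecutive escapes, so both halves of a
    // surrogate pair end up in the same match.
    $result = preg_replace_callback('/(?:\\\\u[0-9a-fA-F]{4})+/', function (array $match): string {
        // Drop every "\u" prefix, leaving only the hex digits of the run.
        $hex = str_replace('\u', '', $match[0]);
        // Hex digits -> raw UTF-16BE bytes -> UTF-8.
        return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
    }, $str);

    // preg_replace_callback() returns null on error; fall back to the input in that case.
    return $result ?? $str;
}

// U+1F600 written as the surrogate pair D83D DE00.
echo decode_utf16_escapes('\ud83d\ude00'), PHP_EOL;
```

Escapes within the Basic Multilingual Plane, such as \u00e9, pass through the same function unchanged in behavior, since a run of length one is converted just like a single escape.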
