Character De-Accentuation in PHP
In PHP, there are several ways to strip the accent from a character and keep its base letter. One approach uses the Normalizer class, which provides native Unicode normalization. However, Normalizer is part of the intl extension, so it may be unavailable on older PHP versions or on hosting platforms where that extension is not enabled.
An alternative combines htmlentities with a regular-expression substitution. The following function, named Unaccent, removes common accent marks from a string:
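As a rough illustration of the Normalizer route, here is a minimal sketch. It assumes the intl extension is loaded, and the helper name removeAccentsWithNormalizer is hypothetical (it does not come from the article). The idea is to decompose each character into a base letter plus combining marks, then delete the marks.

```php
<?php
// Hypothetical helper; requires the intl extension, which provides Normalizer.
function removeAccentsWithNormalizer(string $text): string
{
    if (!class_exists('Normalizer')) {
        throw new RuntimeException('The intl extension (Normalizer) is not available.');
    }

    // FORM_D (canonical decomposition) splits "é" into "e" + U+0301.
    $decomposed = Normalizer::normalize($text, Normalizer::FORM_D);
    if ($decomposed === false) {
        return $text; // normalization failed (e.g. invalid UTF-8); leave input untouched
    }

    // \p{Mn} matches nonspacing marks, i.e. the combining accents.
    return preg_replace('/\p{Mn}+/u', '', $decomposed) ?? $decomposed;
}

if (class_exists('Normalizer')) {
    echo removeAccentsWithNormalizer('Crème brûlée'), "\n"; // Creme brulee
}
```

Letters that have no canonical decomposition, such as "ø" or "æ", pass through this sketch unchanged.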
<code class="php">function Unaccent($string) { return preg_replace('~&([a-z]{1,2})(acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1', htmlentities($string, ENT_QUOTES, 'UTF-8')); }</code>
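This function first converts accented characters into their HTML entities using htmlentities, so "é" becomes "&amp;eacute;". It then applies a regular expression that matches each such entity: the first group captures the base letter (or two letters, for ligatures such as "ae"), and the second group matches the accent name (acute, grave, uml, and so on). The $1 placeholder in the replacement substitutes the whole entity with the captured base letter(s) only.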
For instance, using this function on "ã" and "é" would yield "a" and "e", respectively.
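To make the two steps visible, the following sketch prints the intermediate entity form and then the final result. The function name unaccentViaEntities is hypothetical; it uses the same pattern as above with '$1' as the replacement and is only meant as an illustration.

```php
<?php
// Hypothetical name; same regex as the article's function, with '$1' as the replacement.
function unaccentViaEntities(string $text): string
{
    // Step 1: "é" becomes "&eacute;", "ã" becomes "&atilde;".
    $entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // Step 2: replace each accent entity with the letter(s) captured in group 1.
    return preg_replace(
        '~&([a-z]{1,2})(acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i',
        '$1',
        $entities
    );
}

echo htmlentities('é', ENT_QUOTES, 'UTF-8'), "\n"; // &eacute;
echo unaccentViaEntities('ã é'), "\n";             // a e
echo unaccentViaEntities('Crème brûlée'), "\n";    // Creme brulee
```

One thing to keep in mind: because htmlentities also encodes characters such as "<" and "&", those remain as entities ("&lt;", "&amp;") in the returned string.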