How to Efficiently Remove Accents from Strings in JavaScript?

Linda Hamilton
2024-12-14 22:38:15

Remove Accents/Diacritics in a String in JavaScript

Removing accents (diacritics) from strings is a common step in text processing and data analysis, for example before searching or comparing text. A hand-written function such as accentsTidy, which strips accents with a series of regular expressions, may not be efficient or reliable, especially in older browsers like IE6.

ES2015/ES6 Solution

A more modern and efficient solution is the ES2015/ES6 String.prototype.normalize() method, which converts a string to a Unicode normalization form. The "NFD" form decomposes each precomposed character into its base character followed by its combining marks, so the diacritics can then be removed with a single replace. Here's an example:

const str = "Crème Brûlée";
str.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
// "Creme Brulee"

The regular expression matches the Unicode range U+0300 → U+036F, the Combining Diacritical Marks block. Other normalization forms such as "NFKD" also decompose compatibility characters, so a ligature like \uFB01 (ﬁ) becomes the two letters "fi", whereas "NFD" leaves it unchanged.
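
The one-liner above can be wrapped in a small helper. The sketch below is illustrative only: the name removeAccents and its optional form parameter are assumptions made for this example, not part of any standard API.

```javascript
// Hypothetical helper (name and signature are illustrative).
// Decomposes the string, then drops combining marks in U+0300 to U+036F.
function removeAccents(text, form = "NFD") {
  return text.normalize(form).replace(/[\u0300-\u036f]/g, "");
}

console.log(removeAccents("Crème Brûlée"));     // "Creme Brulee"
console.log("\u00E9".normalize("NFD").length);  // 2: "e" followed by U+0301
console.log(removeAccents("\uFB01n"));          // "ﬁn" (NFD keeps the ligature)
console.log(removeAccents("\uFB01n", "NFKD"));  // "fin" (NFKD splits it into "f" + "i")
console.log(removeAccents("Søren"));            // "Søren" ("ø" has no decomposition)
```

Letters such as "ø" or "ł" are distinct characters rather than a base letter plus a combining mark, so this technique leaves them unchanged.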

Using Unicode Property Escapes

ES2018 introduced Unicode property escapes, providing a more concise way to remove diacritics:

str.normalize("NFD").replace(/\p{Diacritic}/gu, "");
// "Creme Brulee"

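This escape matches every character that has the Unicode "Diacritic" property. That set is broader than the U+0300 → U+036F range: it also includes standalone characters such as ^ and `, so those are removed as well.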
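
One place this is handy is accent-insensitive search. The following is a minimal sketch that needs a runtime with ES2018 property escapes; stripDiacritics and includesIgnoringAccents are hypothetical helper names, and lowercasing with toLowerCase() is an added assumption about what should count as a match.

```javascript
// Hypothetical helper: decompose, then remove characters with the Diacritic property.
function stripDiacritics(text) {
  return text.normalize("NFD").replace(/\p{Diacritic}/gu, "");
}

// Hypothetical helper: substring test that ignores accents and letter case.
function includesIgnoringAccents(haystack, needle) {
  const fold = (s) => stripDiacritics(s).toLowerCase();
  return fold(haystack).includes(fold(needle));
}

console.log(stripDiacritics("Crème Brûlée"));                     // "Creme Brulee"
console.log(includesIgnoringAccents("Crème Brûlée", "brulee"));   // true
console.log(includesIgnoringAccents("Crème Brûlée", "BRÛLÉE"));   // true
console.log(includesIgnoringAccents("Crème Brûlée", "tiramisu")); // false
```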

Alternatively: Sorting

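If the goal is to sort strings with accents, there is no need to strip anything: the Intl.Collator object can be used instead. It compares strings using locale-aware rules in which an accent is only a secondary difference, so accented strings sort next to their unaccented counterparts rather than after all unaccented ones. Here's an example: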

const c = new Intl.Collator();
["creme brulee", "crème brûlée", "crame brulai", "crome brouillé",
"creme brulay", "creme brulfé", "creme bruléa"].sort(c.compare);
// ['crame brulai', 'creme brulay', 'creme bruléa', 'creme brulee', 'crème brûlée', 'creme brulfé', 'crome brouillé']
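
When the goal is comparison rather than sorting, the sensitivity option of Intl.Collator may be enough on its own. The sketch below assumes the "en" locale and sensitivity "base", under which strings that differ only in accents or letter case compare as equal; the variable names and sample data are illustrative.

```javascript
// Assumed settings: "en" locale, base sensitivity (accents and case are ignored).
const baseCollator = new Intl.Collator("en", { sensitivity: "base" });

console.log(baseCollator.compare("crème brûlée", "creme brulee")); // 0
console.log(baseCollator.compare("creme", "crome") !== 0);         // true

// Illustrative data: keep only the entries equal to the query under these rules.
const menu = ["Crème Brûlée", "Tarte Tatin", "Creme brulee"];
const matches = menu.filter(
  (item) => baseCollator.compare(item, "creme brulee") === 0
);
console.log(matches); // ["Crème Brûlée", "Creme brulee"]
```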
