How Can I Accurately Validate Persian Characters Using Regex?

Susan Sarandon
2025-01-04 04:36:40

Validation of Persian Characters using Regex

Issue:

When validating Persian characters with a regex, the pattern ^[\u0600-\u06FF]+$ may fail to match specific characters, such as گ, چ, پ, and ژ.

Answer:

To accurately validate Persian characters, consider using the following character sets:

Letters:

  • ^[آابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی]+$
  • or the equivalent Unicode codepoints:
^[\u0622\u0627\u0628\u067E\u062A-\u062C\u0686\u062D-\u0632\u0698\u0633-\u063A\u0641\u0642\u06A9\u06AF\u0644-\u0648\u06CC]+$
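The letter class above can be tried out with a short Python sketch. The function name `is_persian_letters` is an illustrative assumption, not part of the original answer. The sketch uses `fullmatch` in place of the `^...$` anchors, because in Python `$` also matches just before a trailing newline.

```python
import re

# Persian letters as Unicode escapes, copied from the pattern above.
PERSIAN_LETTERS = (
    r"\u0622\u0627\u0628\u067E\u062A-\u062C\u0686\u062D-\u0632\u0698"
    r"\u0633-\u063A\u0641\u0642\u06A9\u06AF\u0644-\u0648\u06CC"
)
LETTERS_ONLY = re.compile(f"[{PERSIAN_LETTERS}]+")


def is_persian_letters(text: str) -> bool:
    """Return True if text is non-empty and contains only the listed letters."""
    # fullmatch requires the whole string to match, like ^...$ without
    # the trailing-newline exception of `$`.
    return LETTERS_ONLY.fullmatch(text) is not None


print(is_persian_letters("سلام"))    # True
print(is_persian_letters("گچپژ"))    # True: the four Persian-specific letters
print(is_persian_letters("\u0643"))  # False: Arabic kaf, not Persian ک (U+06A9)
print(is_persian_letters("hello"))   # False
```

Note that spaces and punctuation are rejected too, since the class lists letters only.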

Numbers:

  • ^[۰۱۲۳۴۵۶۷۸۹]+$
  • or the equivalent Unicode codepoints:
^[\u06F0-\u06F9]+$
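As a rough illustration, the digit class can be checked in Python as follows. The name `is_persian_number` is hypothetical. The last line is included for contrast: on a `str` pattern, Python's `\d` matches any Unicode decimal digit, so it is broader than the explicit Persian range.

```python
import re

# Only the Persian (Extended Arabic-Indic) digits U+06F0..U+06F9.
PERSIAN_DIGITS = re.compile(r"[\u06F0-\u06F9]+")


def is_persian_number(text: str) -> bool:
    """Return True if text is a non-empty run of Persian digits."""
    return PERSIAN_DIGITS.fullmatch(text) is not None


print(is_persian_number("\u06F1\u06F2\u06F3"))  # True: Persian ۱۲۳
print(is_persian_number("\u0661\u0662\u0663"))  # False: Arabic-Indic digits
print(is_persian_number("123"))                 # False: ASCII digits
# \d accepts the Arabic-Indic digits that the explicit class rejects.
print(re.fullmatch(r"\d+", "\u0661\u0662\u0663") is not None)  # True
```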

Vowel marks and other diacritics:

  • [ ‬ٌ ‬ًّ ‬َ ‬ِ ‬ُ ‬ْ ‬]
  • or the equivalent Unicode codepoints:
[\u202C\u064B\u064C\u064E-\u0652]

Combine these character sets as needed to match different aspects of Persian input. For example, for letters only:

^[آابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی]+$
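One possible way to combine the sets is to concatenate the class bodies into a single bracket expression. The following Python sketch assumes that approach; the helper `build_validator` and the constant names are illustrative only.

```python
import re

# Class bodies (without brackets), copied from the patterns in this article.
LETTERS = (
    r"\u0622\u0627\u0628\u067E\u062A-\u062C\u0686\u062D-\u0632\u0698"
    r"\u0633-\u063A\u0641\u0642\u06A9\u06AF\u0644-\u0648\u06CC"
)
DIGITS = r"\u06F0-\u06F9"
# U+202C is a bidi formatting control that the source pattern includes.
DIACRITICS = r"\u202C\u064B\u064C\u064E-\u0652"


def build_validator(*class_bodies: str) -> re.Pattern:
    """Join class bodies into one character class repeated one or more times."""
    return re.compile("[" + "".join(class_bodies) + "]+")


LETTERS_ONLY = build_validator(LETTERS)
LETTERS_AND_DIGITS = build_validator(LETTERS, DIGITS)
LETTERS_AND_DIACRITICS = build_validator(LETTERS, DIACRITICS)

word_with_kasra = "\u06A9\u062A\u0627\u0628\u0650"  # "کتاب" plus a kasra mark
print(LETTERS_ONLY.fullmatch(word_with_kasra) is not None)            # False
print(LETTERS_AND_DIACRITICS.fullmatch(word_with_kasra) is not None)  # True
print(LETTERS_AND_DIGITS.fullmatch("سال\u06F1\u06F4") is not None)    # True
```

None of these classes includes whitespace, so multi-word input would need a space added to the class explicitly.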

Why Previous Patterns Failed:

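While ^[\u0600-\u06FF]+$ may seem comprehensive, it covers the entire Arabic Unicode block, which includes many characters not used in Persian, such as Arabic-Indic numerals, diacritics, and Arabic-only letters. Similarly, [آ-ی] spans every code point between آ and ی, so it also accepts characters that are not specific to Persian.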
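The difference can be seen by testing a few individual code points against each pattern. This Python sketch assumes that [آ-ی] runs from U+0622 to U+06CC (Persian yeh); the helper `accepts` and the sample labels are illustrative.

```python
import re

BROAD_BLOCK = re.compile(r"[\u0600-\u06FF]+")  # the whole Arabic block
BROAD_RANGE = re.compile(r"[\u0622-\u06CC]+")  # assumed equivalent of [آ-ی]
NARROW = re.compile(
    r"[\u0622\u0627\u0628\u067E\u062A-\u062C\u0686\u062D-\u0632\u0698"
    r"\u0633-\u063A\u0641\u0642\u06A9\u06AF\u0644-\u0648\u06CC]+"
)


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


samples = {
    "Arabic kaf U+0643": "\u0643",
    "Arabic yeh U+064A": "\u064A",
    "teh marbuta U+0629": "\u0629",
    "Arabic-Indic digit four U+0664": "\u0664",
    "Arabic comma U+060C": "\u060C",
    "Persian gaf U+06AF": "\u06AF",
}
for label, ch in samples.items():
    # Columns: whole block, [آ-ی]-style range, explicit Persian letter set.
    print(label, accepts(BROAD_BLOCK, ch), accepts(BROAD_RANGE, ch), accepts(NARROW, ch))
```

Only the Persian gaf passes all three patterns. The explicit letter set rejects every other sample, while the two broad patterns accept most of them.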
