javascript - PHP regular expression to remove Unicode control characters

WBOY
2016-10-10 11:55:54

While writing the username validation for a website, I asked a friend who does penetration testing to try it out. He managed to submit a control character as input (not critical, but it does have an effect). The regular expressions I found on SF do not seem to work. Also, how do I use Unicode in PHP's Perl-compatible regular expressions?
Test as follows:

Unable to match (note: \u202e is the RLO control character, U+202E RIGHT-TO-LEFT OVERRIDE)
Prohibited-word test: Gong Lun Fa
The character sequence is RLO followed by Gong Lun Fa, so the text is displayed in reverse order.
It seems control characters can be exploited in quite a few ways?
Tieba already blocks control characters. My ability is limited, though, and I have not found a suitable regex for filtering them in JS.
So I came to SF for help.
PS: This pattern does not work for Chinese usernames: /^[\x4e00-\x9aff\w]{4,12}$/

Reply content:

After flipping through the PHP manual, I found a pattern that matches Chinese characters plus word characters (\w: letters, digits, and underscore):
/[\w\x{4e00}-\x{9aff}]{4,12}/u

Chinese and Japanese should both be fine as long as UTF-8 mode is turned on.
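As a rough illustration of how the pattern above might be used for username validation in PHP, here is a minimal sketch with the backslash escapes written out. The function name is_valid_username is hypothetical, and the ^ and \z anchors are an addition that is not in the posted pattern: without anchors, preg_match only needs some 4 to 12 character run to match, so a longer name containing RLO could still pass.

```php
<?php
// Hypothetical helper; the name is not from the original post.
function is_valid_username(string $name): bool
{
    // ^ and \z (added here as an assumption) force the whole string to
    // consist of 4 to 12 characters from the class.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_match('/^[\w\x{4e00}-\x{9aff}]{4,12}\z/u', $name) === 1;
}

$rlo = "\u{202E}"; // RIGHT-TO-LEFT OVERRIDE; the "\u{...}" string escape needs PHP 7.0+

var_dump(is_valid_username('用户名_01'));        // bool(true)
var_dump(is_valid_username($rlo . '用户名_01')); // bool(false): RLO is not in the class
```

This file has to be saved as UTF-8 for the Chinese literals to be read correctly.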

In UTF-8 mode, "\x{...}" is allowed, where the contents of the braces are a string of hexadecimal digits. It is interpreted as the UTF-8 character whose code point is the given hexadecimal number.
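To make the \x{...} escape concrete, the following sketch detects and removes the RLO character from the question by its code point. The sample string and variable names are made up for illustration.

```php
<?php
// Made-up sample input containing U+202E (RLO) in the middle.
$input = "abc\u{202E}def"; // "\u{...}" string escape needs PHP 7.0+

// \x{202e} in the pattern names code point U+202E; the u modifier is
// what allows code points above 0xFF here.
$found   = preg_match('/\x{202e}/u', $input);
$cleaned = preg_replace('/\x{202e}/u', '', $input);

var_dump($found);   // int(1)
var_dump($cleaned); // string(6) "abcdef"
```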

u (PCRE_UTF8)
This modifier turns on additional PCRE functionality that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. This modifier is available from PHP 4.1.0 on Unix and from PHP 4.2.3 on Win32. Since PHP 4.3.5, the UTF-8 validity of the pattern and the subject is checked.
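With the u modifier on, Unicode property classes can also be used, which is one possible way to remove control characters in general instead of listing them one by one. The sketch below is only an assumption about what such a filter could look like; the helper name strip_invisible is hypothetical, and stripping every format character also removes joiners that some scripts and emoji sequences legitimately need.

```php
<?php
// Hypothetical helper, not from the original thread.
function strip_invisible(string $text): string
{
    // \p{Cc} = control characters, \p{Cf} = format characters
    // (the latter includes U+202E RLO and zero-width characters).
    $clean = preg_replace('/[\p{Cc}\p{Cf}]/u', '', $text);

    // With the u modifier, preg_replace returns null when the subject is
    // not valid UTF-8; treat that case as an empty result here.
    return $clean ?? '';
}

var_dump(strip_invisible("ab\u{202E}cd\u{200B}\x07")); // string(4) "abcd"
```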
The question is closed; I don't understand why no expert gave an answer.