
PHP/JS Chinese character regular expression summary

WBOY
2016-07-13 16:56:59

To match Chinese characters in PHP and JS, one regular expression per language is enough: /^[\x{4e00}-\x{9fa5}]+$/u in PHP and /^[\u4e00-\u9fa5]+$/ in JS. To match double-byte characters (including Chinese characters), use [^\x00-\xff]. The details are as follows.

js version

Regular expression matching Chinese characters: [\u4e00-\u9fa5]

Match double-byte characters (including Chinese characters): [^\x00-\xff]

The code is as follows:


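var str = '汉字'; // sample input, added so the snippet runs on its own
var reg = /^[\u4e00-\u9fa5]+$/;

if (reg.test(str))
{
   alert('The string is all Chinese characters');
}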
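The range \u4e00-\u9fa5 covers only part of the Han characters in Unicode, so rarer characters outside it are rejected. The sketch below compares the range-based check with a script-based one. The function names are hypothetical, and the second check assumes an engine that supports Unicode property escapes (ES2018 or later).

```javascript
// Hypothetical helper names; not from the original article.

// Range-based check: U+4E00..U+9FA5, the range used in this article.
function isAllChineseBasic(str) {
  return /^[\u4e00-\u9fa5]+$/.test(str);
}

// Script-based check: any character whose Unicode script is Han,
// including the extension blocks outside the Basic Multilingual Plane.
// Assumes ES2018+ (Unicode property escapes with the u flag).
function isAllHan(str) {
  return /^\p{Script=Han}+$/u.test(str);
}

console.log(isAllChineseBasic('汉字'));       // true
console.log(isAllChineseBasic('汉字abc'));    // false
console.log(isAllChineseBasic('\u{20000}'));  // false (CJK Extension B)
console.log(isAllHan('\u{20000}'));           // true
```

Which check fits depends on whether the input may contain rare characters such as those found in some personal names.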

Calculate the length of a string (a double-byte character counts as 2, an ASCII character counts as 1):

String.prototype.len=function(){return this.replace(/[^\x00-\xff]/g,"aa").length;}
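The same length calculation can be written as a standalone function, which avoids extending String.prototype. This is a minimal sketch; the name displayLength is an assumption and does not come from the article.

```javascript
// Hypothetical helper: every UTF-16 code unit outside \x00-\xff counts as 2,
// everything else (ASCII and Latin-1) counts as 1.
function displayLength(str) {
  return str.replace(/[^\x00-\xff]/g, 'aa').length;
}

console.log(displayLength('abc'));     // 3
console.log(displayLength('汉字'));    // 4
console.log(displayLength('a汉b字'));  // 6
```

Note that a character outside the Basic Multilingual Plane is stored as two code units, so this approach counts it as 4 rather than 2.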

php version

PHP regular expression matching Chinese characters: /^[\x{4e00}-\x{9fa5}]+$/u

The code is as follows:

<?php
$action = isset($_GET['action']) ? trim($_GET['action']) : '';
if ($action == "sub")
{
    $str = isset($_POST['dir']) ? $_POST['dir'] : '';
    // Escape the input before echoing it back into the page
    $safe = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
    // GB2312: Chinese characters, letters, digits and underscore
    //if(!preg_match("/^[".chr(0xa1)."-".chr(0xff)."A-Za-z0-9_]+$/",$str))
    // UTF-8: Chinese characters, letters, digits and underscore
    if (!preg_match('/^[\x{4e00}-\x{9fa5}A-Za-z0-9_]+$/u', $str))
    {
        echo "The [".$safe."] you entered contains illegal characters";
    }
    else
    {
        echo "The [".$safe."] you entered is completely legal, passed!";
    }
}
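On the JS side, the same character class can also report which characters caused the rejection, which helps when building the error message. A minimal sketch, assuming a hypothetical helper named findInvalidChars:

```javascript
// Hypothetical helper: returns the characters that are NOT Chinese
// characters (U+4E00..U+9FA5), ASCII letters, digits or underscore.
function findInvalidChars(input) {
  return input.match(/[^\u4e00-\u9fa5A-Za-z0-9_]/g) || [];
}

// Valid when non-empty and no offending character was found.
function isValidName(input) {
  return input.length > 0 && findInvalidChars(input).length === 0;
}

console.log(isValidName('目录_01'));        // true
console.log(findInvalidChars('目录 01!'));  // [ ' ', '!' ]
```

A client-side check like this is only a convenience; the PHP check on the server is still needed.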
Of course, if you want to require that the string consists entirely of Chinese characters, the GB2312 (GBK) matching is:

The code is as follows:

<?php
// Works only when $str is GB2312/GBK-encoded (for example, the file is saved in that encoding)
$str = "小小子";
if (preg_match("/^[".chr(0xa1)."-".chr(0xff)."]+$/", $str)) {
    print($str." is indeed all Chinese characters");
} else {
    print($str." is not all Chinese characters");
}

The UTF-8 regular expression:

<?php
$str = "汉字";
if (preg_match('/^[\x{4e00}-\x{9fa5}]+$/u', $str)) {
    print("This string is all Chinese");
} else {
    print("This string is not all Chinese");
}


In fact, once you know where the high-byte and low-byte ranges of each encoding start and end, you can write the regular expression directly in hexadecimal. There is nothing difficult about it. But please note that in PHP, \x is used to write hexadecimal values.
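To see where the hexadecimal bounds come from, it can help to print the code point and the UTF-8 bytes of the first and last characters of the range. The sketch below assumes a runtime with a global TextEncoder (modern browsers and recent Node.js); describeChar is a hypothetical name. TextEncoder only produces UTF-8, so the GB2312 byte values are not shown here.

```javascript
// Hypothetical helper: returns the code point and the UTF-8 bytes
// of the first character of ch, both as lowercase hex strings.
function describeChar(ch) {
  const codePoint = ch.codePointAt(0).toString(16);
  const utf8 = Array.from(new TextEncoder().encode(ch), function (b) {
    return b.toString(16).padStart(2, '0');
  });
  return { codePoint: codePoint, utf8: utf8 };
}

console.log(describeChar('\u4e00')); // { codePoint: '4e00', utf8: [ 'e4', 'b8', '80' ] }
console.log(describeChar('\u9fa5')); // { codePoint: '9fa5', utf8: [ 'e9', 'be', 'a5' ] }
```

The code points are what \x{4e00} and \x{9fa5} refer to in a PHP pattern with the u modifier; the byte values are what a byte-oriented pattern without u would see.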

Examples of gbk, gb2312:

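The code is as follows:

<?php
$action = isset($_GET['action']) ? trim($_GET['action']) : '';
if ($action == "sub")
{
    $str = isset($_POST['dir']) ? $_POST['dir'] : '';
    // Escape the input before echoing it back into the page
    $safe = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');
    // GB2312: Chinese characters, letters, digits and underscore
    //if(!preg_match("/^[".chr(0xa1)."-".chr(0xff)."A-Za-z0-9_]+$/",$str))
    // UTF-8: Chinese characters, letters, digits and underscore
    if (!preg_match('/^[\x{4e00}-\x{9fa5}A-Za-z0-9_]+$/u', $str))
    {
        echo "The [".$safe."] you entered contains illegal characters";
    }
    else
    {
        echo "The [".$safe."] you entered is completely legal, passed!";
    }
}
?>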

 


+$/u means: + repeats the preceding element one or more times; $ anchors the end of the match; / is the delimiter; the trailing u tells PHP to treat the pattern and the subject string as UTF-8. The uppercase U modifier is different: it makes quantifiers ungreedy by default.
To match 2 to 4 Chinese characters, use {2,4}:
/^[\x{4e00}-\x{9fa5}]{2,4}$/u
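The {2,4} quantifier behaves the same way in JS. A minimal sketch, assuming a hypothetical use case of validating a Chinese name of 2 to 4 characters:

```javascript
// Hypothetical helper: true when the input is 2 to 4 characters long
// and every character lies in U+4E00..U+9FA5.
function isChineseName(input) {
  return /^[\u4e00-\u9fa5]{2,4}$/.test(input);
}

console.log(isChineseName('张三'));       // true  (2 characters)
console.log(isChineseName('司马相如'));   // true  (4 characters)
console.log(isChineseName('张'));         // false (too short)
console.log(isChineseName('欧阳小小明')); // false (5 characters)
```

There must be no space inside the braces: {2, 4} is not read as a quantifier.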