Implementation code for PHP to determine whether a string is pure English, pure Chinese characters, or a mixture of Chinese and English
How can PHP code determine what kinds of characters a string is made of? For example, how do you tell whether it is pure English, pure digits, or a mixture of Chinese characters and English? This article walks through the analysis and several examples.
Instructions: Besides matching with a regular expression, or splitting the string and checking whether each byte value is less than 128, PHP offers another way to tell whether a string is Chinese or English: compare the results of the mb_strlen and strlen functions. strlen counts bytes, while mb_strlen counts characters in the encoding passed to it, so comparing the two return values shows whether the string contains multibyte characters.

If the two values are equal, the string is pure English, pure digits, or a mixture of English and digits.
If the values differ and the strlen value is evenly divisible by the mb_strlen value, the string is pure Chinese.
If the values differ and the strlen value is not evenly divisible by the mb_strlen value, the string mixes Chinese with English or digits.

The divisibility rule is exact for a double-byte encoding such as GB2312, which the example below uses. In UTF-8, where a Chinese character takes 3 bytes, divisibility alone is not enough: "abc你好吗" is 12 bytes and 6 characters, so the division is exact even though the string is mixed. The stricter condition is that the strlen value equals the mb_strlen value multiplied by the number of bytes per Chinese character.

Example:

<?php
// Save this file as GB2312/GBK so that each Chinese character is 2 bytes.
$strarray[1] = "hello";
$strarray[2] = "123456";
$strarray[3] = "123hello";
$strarray[4] = "你好";
$strarray[5] = "123你好";
$strarray[6] = "hello你好";
$strarray[7] = "123hello你好";

foreach ($strarray as $key => $value) {
    $x = mb_strlen($value, 'gb2312'); // number of characters
    $y = strlen($value);              // number of bytes
    echo $strarray[$key].' <span style="color: #ff0000;">'.$x.'</span> <span style="color:#ff0000;">'.$y.'</span>';
}
?>

Output result (string, mb_strlen, strlen):

hello 5 5
123456 6 6
123hello 8 8
你好 2 4
123你好 5 7
hello你好 7 9
123hello你好 10 12

PHP has no built-in function that tells you whether a string is pure English, pure Chinese, or a mixture of the two, so you have to write one yourself. To do that, you need to know how many bytes a Chinese character occupies in the character set in use. The character sets most commonly used in China today are UTF-8 and GBK.
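To see how the same text differs in byte length between these two character sets, a string can be converted and measured. The following is a minimal sketch, assuming the mbstring extension is loaded and the script itself is saved as UTF-8; `byte_lengths` is a hypothetical helper name used only for illustration, not part of the original article.

```php
<?php
// Hypothetical helper: report the character count and the byte length
// of a string in UTF-8 and in GBK.
// Assumes $utf8 is valid UTF-8 and the mbstring extension is available.
function byte_lengths(string $utf8): array
{
    // Re-encode the UTF-8 input as GBK so strlen() can count its GBK bytes.
    $gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

    return [
        'chars' => mb_strlen($utf8, 'UTF-8'), // number of characters
        'utf8'  => strlen($utf8),             // bytes when encoded as UTF-8
        'gbk'   => strlen($gbk),              // bytes when encoded as GBK
    ];
}

foreach (['hello', '你好', 'hello你好'] as $s) {
    $n = byte_lengths($s);
    echo "{$s}: {$n['chars']} chars, {$n['utf8']} UTF-8 bytes, {$n['gbk']} GBK bytes\n";
}
// hello: 5 chars, 5 UTF-8 bytes, 5 GBK bytes
// 你好: 2 chars, 6 UTF-8 bytes, 4 GBK bytes
// hello你好: 7 chars, 11 UTF-8 bytes, 9 GBK bytes
```

ASCII characters take one byte in both encodings, so only the Chinese characters change the totals.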
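In UTF-8, each Chinese character occupies 3 bytes; in GBK, each occupies 2 bytes. Because of this difference between Chinese and English characters, you can compute two lengths with the mb_strlen and strlen functions and then apply a rule to them to determine the type of the string: equal lengths mean pure English; a byte length of exactly 3 times (UTF-8) or 2 times (GBK) the character count means pure Chinese; anything else is a mixture.

1. Example of UTF-8

<?php
/**
 * Determine whether a string is pure Chinese, pure English, or mixed Chinese and English
 * site bbs.it-home.org
 */
echo '<meta charset="utf-8" />';

function utf8_str($str){
    $mb = mb_strlen($str,'utf-8'); // number of characters
    $st = strlen($str);            // number of bytes
    if($st==$mb)
        return 'pure English';
    // Every character must be 3 bytes wide. Testing only
    // $st%$mb==0 && $st%3==0 would misclassify mixed strings such as
    // 'abc你好吗' (12 bytes, 6 characters) as pure Chinese.
    if($st==$mb*3)
        return 'pure Chinese';
    return 'mixed Chinese and English';
}

$str = '博客';
echo 'String: <span style="color:red">'.$str.'</span> is <span style="color:red">'.utf8_str($str).'</span>';
?>

2. Example of GBK method

<?php
/**
 * Determine whether a string is pure Chinese, pure English, or mixed Chinese and English
 * site bbs.it-home.org
 */
function gbk_str($str){
    $mb = mb_strlen($str,'gbk'); // number of characters
    $st = strlen($str);          // number of bytes
    if($st==$mb)
        return 'pure English';
    // Every character is 2 bytes wide.
    if($st==$mb*2)
        return 'pure Chinese';
    return 'mixed Chinese and English';
}
?>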
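The regular-expression approach mentioned at the start of the article can be sketched as well. This is a minimal sketch rather than the method the article develops: `classify_by_regex` is a hypothetical helper name, the input is assumed to be UTF-8, PCRE is assumed to be built with Unicode property support, and "English" is taken to mean ASCII letters and digits only.

```php
<?php
// Hypothetical helper: classify a UTF-8 string with regular expressions
// instead of comparing byte and character lengths.
function classify_by_regex(string $str): string
{
    if ($str === '') {
        return 'empty';
    }
    // Only ASCII letters and digits. \A and \z anchor to the true start
    // and end of the string, so a trailing newline is not ignored.
    if (preg_match('/\A[A-Za-z0-9]+\z/', $str) === 1) {
        return 'pure English';
    }
    // Only Han ideographs. The /u modifier makes PCRE treat the pattern
    // and the subject as UTF-8; on invalid UTF-8 preg_match() returns
    // false and the string falls through to 'mixed'.
    if (preg_match('/\A\p{Han}+\z/u', $str) === 1) {
        return 'pure Chinese';
    }
    // Everything else, including punctuation and other scripts.
    return 'mixed';
}

echo classify_by_regex('hello123'), "\n";  // pure English
echo classify_by_regex('你好'), "\n";       // pure Chinese
echo classify_by_regex('abc你好吗'), "\n";  // mixed
```

Unlike the length comparison, this check looks at which characters are present rather than how many bytes they take, so it does not depend on every Chinese character having the same byte width.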