PHP: extracting Chinese characters, letters, and numbers from a string
I read users' nicknames through the WeChat interface, but many of the names contain special characters, as shown in the picture above. These special characters can't be stored in the MySQL database, so they are meaningless to me. I therefore want to filter each nickname and keep only the Chinese characters, letters, and numbers. How do I write this in PHP?
"These special characters can't be stored in the MySQL database, so they are meaningless"
---- They are meaningful: they are icons. Try using VARBINARY as the column type.
In fact, it is the emoji that MySQL cannot store: most emoji are 4-byte UTF-8 sequences, which MySQL's 3-byte utf8 character set cannot hold.
If emoji like these are written unprocessed to MySQL versions below 5.5, an error is reported.
You can try changing the database character set to utf8mb4 (available since MySQL 5.5.3).
There is also a list of emoji Unicode ranges on GitHub. Use those ranges as a reference and match against them when filtering.
First of all, if these characters mean nothing to you, simply don't save them.
Extracting only part of a nickname is pointless.
To store nicknames in full, MySQL already supports this: just switch the character set. utf8mb4 is a superset of utf8 and is backward compatible, so changing it is the cleanest solution.
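As a rough illustration of the utf8mb4 route, the sketch below builds a PDO connection string that requests utf8mb4 and the `ALTER TABLE` statement that converts an existing table. The host, database name, `users` table, and both helper function names are hypothetical placeholders, and the `utf8mb4_unicode_ci` collation is only one reasonable choice.

```php
<?php
// Hypothetical helpers; the host, database, and table names are placeholders.

/** Build a PDO DSN so that the connection itself uses utf8mb4. */
function buildUtf8mb4Dsn(string $host, string $dbName): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbName);
}

/** Build the statement that converts an existing table to utf8mb4. */
function buildConvertTableSql(string $table): string
{
    // Backticks quote the identifier; a backtick inside the name is doubled.
    $quoted = '`' . str_replace('`', '``', $table) . '`';

    return "ALTER TABLE {$quoted} CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci";
}

// Usage needs a reachable MySQL 5.5.3+ server, so it is left commented out:
// $pdo = new PDO(buildUtf8mb4Dsn('localhost', 'app'), $user, $password);
// $pdo->exec(buildConvertTableSql('users'));
```

Both the table and the connection need utf8mb4. If only the table is converted, emoji can still be damaged on the way in.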
The second option is to transcode at the code level: encode the nickname before saving it, then decode it after reading it back so it can be displayed again.
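One possible way to do this transcoding, assuming PHP 7.3 or later, is to lean on `json_encode`, which by default escapes every non-ASCII character as `\uXXXX`, so the stored value is plain ASCII. The function names `encodeNickname` and `decodeNickname` are made up for this sketch, and `base64_encode` / `base64_decode` would work along the same lines.

```php
<?php
/** Encode a nickname to plain ASCII so a 3-byte utf8 column can hold it. */
function encodeNickname(string $nickname): string
{
    // By default json_encode escapes non-ASCII characters as \uXXXX;
    // characters outside the BMP (most emoji) become surrogate pairs.
    return json_encode($nickname, JSON_THROW_ON_ERROR);
}

/** Restore the original nickname from its stored form. */
function decodeNickname(string $stored): string
{
    return json_decode($stored, false, 512, JSON_THROW_ON_ERROR);
}

// Round trip: what goes into the database comes back unchanged.
$original = "小明\u{1F600}abc123";
$stored   = encodeNickname($original);
$restored = decodeNickname($stored);
```

The encoded form is longer than the original and cannot be searched as text in SQL, so the column has to be sized with that in mind.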
The last option is the one below. Strictly speaking, only the emoji cannot be saved, so filtering out the emoji is enough.
<code>// Static method; it must live inside a class, which this excerpt omits.
public static function emoji($text)
{
    // Match Emoticons
    $regexEmoticons = '/[\x{1F600}-\x{1F64F}]/u';
    $clean_text = preg_replace($regexEmoticons, '', $text);

    // Match Miscellaneous Symbols and Pictographs
    $regexSymbols = '/[\x{1F300}-\x{1F5FF}]/u';
    $clean_text = preg_replace($regexSymbols, '', $clean_text);

    // Match Transport And Map Symbols
    $regexTransport = '/[\x{1F680}-\x{1F6FF}]/u';
    $clean_text = preg_replace($regexTransport, '', $clean_text);

    // Match Miscellaneous Symbols
    $regexMisc = '/[\x{2600}-\x{26FF}]/u';
    $clean_text = preg_replace($regexMisc, '', $clean_text);

    // Match Dingbats
    $regexDingbats = '/[\x{2700}-\x{27BF}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    // Match part of the Miscellaneous Technical block (e.g. watch, hourglass, media buttons)
    $regexTechnical = '/[\x{231a}-\x{23ab}\x{23e9}-\x{23ec}\x{23f0}-\x{23f3}]/u';
    $clean_text = preg_replace($regexTechnical, '', $clean_text);

    return $clean_text;
}</code>
Source is here
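The function above removes known emoji ranges, while the original question asks for the opposite: keep only Chinese characters, letters, and digits. A minimal whitelist sketch is shown here. It assumes that "Chinese characters" means the Unicode Han script and that "letters" and "numbers" mean ASCII only, and the function name `keepChineseLettersDigits` is made up for illustration.

```php
<?php
/** Keep only Han characters, ASCII letters, and ASCII digits. */
function keepChineseLettersDigits(string $text): string
{
    // \p{Han} matches characters of the Han script; the /u modifier makes
    // PCRE treat both the pattern and the subject as UTF-8.
    $clean = preg_replace('/[^\p{Han}A-Za-z0-9]+/u', '', $text);

    // preg_replace returns null on failure, e.g. for invalid UTF-8 input.
    return $clean ?? '';
}

$nickname = keepChineseLettersDigits("★小明\u{1F600}_abc-123~");
```

A whitelist like this also drops punctuation, spaces, and any emoji that the range-based filter does not list, so some nicknames may come back empty and need a fallback value.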