unicode emoji是4个字节的,存不进MySQL里,找到一个转义的库http://code.iamcal.com/php/emoji/,但是转为Unicode之后,还是4个字节,一样存不进,应该说根本没转。转为其他格式的emoji又怕以后新增了表情不好做,你们在不改数据库编码的前提下,是怎么弄的?
方法1:base_encode64
这种方法是可以,但是旧数据没有经过encode
操作,取数据的时候如果统一进行decode的话,旧数据会丢失的。
方法2:urlencode
这个似乎可以,对没有经过encode的数据进行decode也不会有影响,而且多次decode似乎也不会有影响。你们说这个方法有缺陷吗?
=======================
一个发现,微信获取用户基本信息的时候,笑哭那个表情print_r出的是\ud83d\ude02
,而我存储的时候,报错说这个 \xF0\x9F\x98\x82 值不能存储,请问这是怎么回事,自动转码了,转成的这是什么?是微信转码过了吗?
=======================
方法3:最后采用了下面采纳的那个方法,因为我觉得它有下面几个优点:
1、那个方法只转换表情,不会转换中文,所以数据还是直接可读的
数据库中存储起来是这样的, 后面的\ud83d\udca5可以随意复制粘贴,而显示出来是这样的,
2、不会把表情转换为其它标准,只有一个简单的,固定的转换算法,也就是说不需要一个表情库来对照着转换,所以以后其它人要使用这个数据的时候,也很容易知道每个表情是对应的哪个。就算苹果大爷又增加了表情,也不需要做什么额外的修改。
3、可以无限decode输出的都是正确的内容。因为有的时候可能需要在一次请求中的两个地方做decode,其它decode多次会把正确的数据改成其它数据,这个不会。
缺点:
1、看了下面的代码就知道,这个是强制修改字符编码中,指定区间内的编码,也就说有可能误杀,也有可能有超出这个区间的emoji没杀到。不过仅仅是在字符前加反斜杠,即使误杀了,发现之后也很容易改回来。
数据库中发现有这样的 ,是漏杀了,但是不知道为什么,这个可以直接存数据库。
<code> /** 把用户输入的文本转义(主要针对特殊符号和emoji表情) */ function userTextEncode($str){ if(!is_string($str))return $str; if(!$str || $str=='undefined')return ''; $text = json_encode($str); //暴露出unicode $text = preg_replace_callback("/(\\\u[ed][0-9a-f]{3})/i",function($str){ return addslashes($str[0]); },$text); //将emoji的unicode留下,其他不动,这里的正则比原答案增加了d,因为我发现我很多emoji实际上是\ud开头的,反而暂时没发现有\ue开头。 return json_decode($text); } /** 解码上面的转义 */ function userTextDecode($str){ $text = json_encode($str); //暴露出unicode $text = preg_replace_callback('/\\\\\\\\/i',function($str){ return '\\'; },$text); //将两条斜杠变成一条,其他不动 return json_decode($text); }</code>
回复内容:
unicode emoji是4个字节的,存不进MySQL里,找到一个转义的库http://code.iamcal.com/php/emoji/,但是转为Unicode之后,还是4个字节,一样存不进,应该说根本没转。转为其他格式的emoji又怕以后新增了表情不好做,你们在不改数据库编码的前提下,是怎么弄的?
方法1:base_encode64
这种方法是可以,但是旧数据没有经过encode
操作,取数据的时候如果统一进行decode的话,旧数据会丢失的。
方法2:urlencode
这个似乎可以,对没有经过encode的数据进行decode也不会有影响,而且多次decode似乎也不会有影响。你们说这个方法有缺陷吗?
=======================
一个发现,微信获取用户基本信息的时候,笑哭那个表情print_r出的是\ud83d\ude02
,而我存储的时候,报错说这个 \xF0\x9F\x98\x82 值不能存储,请问这是怎么回事,自动转码了,转成的这是什么?是微信转码过了吗?
=======================
方法3:最后采用了下面采纳的那个方法,因为我觉得它有下面几个优点:
1、那个方法只转换表情,不会转换中文,所以数据还是直接可读的
数据库中存储起来是这样的, 后面的\ud83d\udca5可以随意复制粘贴,而显示出来是这样的,
2、不会把表情转换为其它标准,只有一个简单的,固定的转换算法,也就是说不需要一个表情库来对照着转换,所以以后其它人要使用这个数据的时候,也很容易知道每个表情是对应的哪个。就算苹果大爷又增加了表情,也不需要做什么额外的修改。
3、可以无限decode输出的都是正确的内容。因为有的时候可能需要在一次请求中的两个地方做decode,其它decode多次会把正确的数据改成其它数据,这个不会。
缺点:
1、看了下面的代码就知道,这个是强制修改字符编码中,指定区间内的编码,也就说有可能误杀,也有可能有超出这个区间的emoji没杀到。不过仅仅是在字符前加反斜杠,即使误杀了,发现之后也很容易改回来。
数据库中发现有这样的 ,是漏杀了,但是不知道为什么,这个可以直接存数据库。
<code> /** 把用户输入的文本转义(主要针对特殊符号和emoji表情) */ function userTextEncode($str){ if(!is_string($str))return $str; if(!$str || $str=='undefined')return ''; $text = json_encode($str); //暴露出unicode $text = preg_replace_callback("/(\\\u[ed][0-9a-f]{3})/i",function($str){ return addslashes($str[0]); },$text); //将emoji的unicode留下,其他不动,这里的正则比原答案增加了d,因为我发现我很多emoji实际上是\ud开头的,反而暂时没发现有\ue开头。 return json_decode($text); } /** 解码上面的转义 */ function userTextDecode($str){ $text = json_encode($str); //暴露出unicode $text = preg_replace_callback('/\\\\\\\\/i',function($str){ return '\\'; },$text); //将两条斜杠变成一条,其他不动 return json_decode($text); }</code>
我是这么玩儿的
<code><?php //处理名字的emoji符号 $tmpStr = json_encode($text); //暴露出unicode $tmpStr = preg_replace("#(\\\ue[0-9a-f]{3})#ie","addslashes('\\1')",$tmpStr); //将emoji的unicode留下,其他不动 $text = json_decode($tmpStr); return $text;</code></code>
给一个标准的解决方案:
mysql的版本必须为v5.5.3或更高
把数据库的编码改成
utf8mb4 -- UTF-8 Unicode
然后需要存储emoji表情的字段选择
utf8mb4_general_ci
数据库连接也需要改为
utf8mb4
设置完成后,应该可以看到如下类似字符集设置结果。那么可以直接的存入数据库,无需做任何额外的事情了。
<code>mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'; +--------------------------+--------------------+ | Variable_name | Value | +--------------------------+--------------------+ | character_set_client | utf8mb4 | | character_set_connection | utf8mb4 | | character_set_database | utf8mb4 | | character_set_filesystem | binary | | character_set_results | utf8mb4 | | character_set_server | utf8mb4 | | character_set_system | utf8 | | collation_connection | utf8mb4_unicode_ci | | collation_database | utf8mb4_unicode_ci | | collation_server | utf8mb4_unicode_ci | +--------------------------+--------------------+ rows in set (0.00 sec) </code>
我在做微信公众平台开发时遇到过这个问题,微信用户的昵称可以包含表情(坑爹- -!)。于是我就将整个昵称转换成HEX字符串存在MySQL中,目前用户1W+,系统稳定,题主可以参考一下此方案。
MySQL支持hex() and unhex()函数。Java可以使用org.apache.commons.codec.binary.Hex工具类。其他语言也有相应的方法。
试试微博或qq里面的那种方式?用简单的编码来映射,比如微笑可以用 [wx]
或 /wx
。不过表情多了之后4个字符不怎么够用。。。
urldecode
我看了一下 decode 的源码,应该是不会出现问题。
只要没有 %
解码后肯定还是原来的字符(串),有 %
会出现两种情况,一种是解码成功,这个时候肯定就不是原来的字符串了,一种是解码失败,抛出异常(其实这个异常可以作为是否 encode的标准)。
解码还算是比较严格吧,作为用户名的情况下 出现 %
还解码成功的概率比较小吧,对于这部分你可以手动改数据库,应该不会有很多。
你试试这个函数,之前弄微信自定义菜单的时候,也接触过Emoji表情,当时看到用的这个函数把Emoji表情的编码给转换了。
<code>function utf8_bytes($cp) { if ($cp > 0x10000){ # 4 bytes return chr(0xF0 | (($cp & 0x1C0000) >> 18)). chr(0x80 | (($cp & 0x3F000) >> 12)). chr(0x80 | (($cp & 0xFC0) >> 6)). chr(0x80 | ($cp & 0x3F)); }else if ($cp > 0x800){ # 3 bytes return chr(0xE0 | (($cp & 0xF000) >> 12)). chr(0x80 | (($cp & 0xFC0) >> 6)). chr(0x80 | ($cp & 0x3F)); }else if ($cp > 0x80){ # 2 bytes return chr(0xC0 | (($cp & 0x7C0) >> 6)). chr(0x80 | ($cp & 0x3F)); }else{ # 1 byte return chr($cp); } }</code>
使用BOLO类型
将数据库编码改为 utf8mb4
我这个刚解决的这个问题(后端是java实现的,数据库Mysql),供参考。
1、修改存储emoji字段编码,例如放在username字段中:
<code> ALTER TABLE user CHANGE username username VARCHAR(128) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci default null;</code>
2、java在执行数据库插入、更新操作前,要先执行 sql语句"set names utf8mb4" 语句。
https://github.com/iamcal/php-emoji
我是用这个处理的~
http://www.emoji-cheat-sheet.com/
有一种编码叫 utfmb4,支持 4 位长度的 utf8 编码
喏,你的 MySQL 版本必须为 5.5 以上的
不用转,直接数据库转成utf8mb4, 我以前就是这么干的
不用更改整个数据库的把。。。create xxx() charset=utf8mb4 单表 utf8mb4就行了把
因为我的项目中需要对字数有限制的需求,涉及到逐字计数,在这基础上我增加了emoji的功能完美实现。你需要代码的话请再告诉我,我提供给你。

PHP is a server-side scripting language used for dynamic web development and server-side applications. 1.PHP is an interpreted language that does not require compilation and is suitable for rapid development. 2. PHP code is embedded in HTML, making it easy to develop web pages. 3. PHP processes server-side logic, generates HTML output, and supports user interaction and data processing. 4. PHP can interact with the database, process form submission, and execute server-side tasks.

PHP has shaped the network over the past few decades and will continue to play an important role in web development. 1) PHP originated in 1994 and has become the first choice for developers due to its ease of use and seamless integration with MySQL. 2) Its core functions include generating dynamic content and integrating with the database, allowing the website to be updated in real time and displayed in personalized manner. 3) The wide application and ecosystem of PHP have driven its long-term impact, but it also faces version updates and security challenges. 4) Performance improvements in recent years, such as the release of PHP7, enable it to compete with modern languages. 5) In the future, PHP needs to deal with new challenges such as containerization and microservices, but its flexibility and active community make it adaptable.

The core benefits of PHP include ease of learning, strong web development support, rich libraries and frameworks, high performance and scalability, cross-platform compatibility, and cost-effectiveness. 1) Easy to learn and use, suitable for beginners; 2) Good integration with web servers and supports multiple databases; 3) Have powerful frameworks such as Laravel; 4) High performance can be achieved through optimization; 5) Support multiple operating systems; 6) Open source to reduce development costs.

PHP is not dead. 1) The PHP community actively solves performance and security issues, and PHP7.x improves performance. 2) PHP is suitable for modern web development and is widely used in large websites. 3) PHP is easy to learn and the server performs well, but the type system is not as strict as static languages. 4) PHP is still important in the fields of content management and e-commerce, and the ecosystem continues to evolve. 5) Optimize performance through OPcache and APC, and use OOP and design patterns to improve code quality.

PHP and Python have their own advantages and disadvantages, and the choice depends on the project requirements. 1) PHP is suitable for web development, easy to learn, rich community resources, but the syntax is not modern enough, and performance and security need to be paid attention to. 2) Python is suitable for data science and machine learning, with concise syntax and easy to learn, but there are bottlenecks in execution speed and memory management.

PHP is used to build dynamic websites, and its core functions include: 1. Generate dynamic content and generate web pages in real time by connecting with the database; 2. Process user interaction and form submissions, verify inputs and respond to operations; 3. Manage sessions and user authentication to provide a personalized experience; 4. Optimize performance and follow best practices to improve website efficiency and security.

PHP uses MySQLi and PDO extensions to interact in database operations and server-side logic processing, and processes server-side logic through functions such as session management. 1) Use MySQLi or PDO to connect to the database and execute SQL queries. 2) Handle HTTP requests and user status through session management and other functions. 3) Use transactions to ensure the atomicity of database operations. 4) Prevent SQL injection, use exception handling and closing connections for debugging. 5) Optimize performance through indexing and cache, write highly readable code and perform error handling.

Using preprocessing statements and PDO in PHP can effectively prevent SQL injection attacks. 1) Use PDO to connect to the database and set the error mode. 2) Create preprocessing statements through the prepare method and pass data using placeholders and execute methods. 3) Process query results and ensure the security and performance of the code.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

ZendStudio 13.5.1 Mac
Powerful PHP integrated development environment

PhpStorm Mac version
The latest (2018.2.1) professional PHP integrated development tool

SecLists
SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

DVWA
Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

VSCode Windows 64-bit Download
A free and powerful IDE editor launched by Microsoft