P粉854119263 2023-08-28 11:11:30
Don't forget the META tag either (like this, or its HTML4 or XHTML equivalent):
    <meta charset="utf-8">
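This seems trivial, but IE7 has given me problems with it before.
I was doing everything right: the database, the database connection, and the Content-Type HTTP header were all set to UTF-8, and it worked fine in every other browser, but Internet Explorer still insisted on using the "Western European" encoding.
It turned out the page was missing the META tag. Adding it solved the problem.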
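Edit:
The W3C actually has a rather large section dedicated to I18N. It includes a number of articles related to this issue, covering the HTTP, (X)HTML, and CSS side of things.
They recommend using both the HTTP header and the HTML meta tag (or the XML declaration when XHTML is served as XML).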
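As a minimal sketch of that recommendation, the snippet below sends the HTTP header and emits the matching meta tag from the same place. The helper name utf8_page_head() and the page skeleton are hypothetical and only serve as illustration.

```php
<?php
// Hypothetical helper: sends the HTTP header (if still possible) and
// returns the matching HTML5 meta tag for the document head.
function utf8_page_head(): string
{
    // header() must run before any output, so guard with headers_sent().
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=utf-8');
    }
    return '<meta charset="utf-8">';
}

$meta = utf8_page_head();
echo "<!DOCTYPE html>\n<html>\n<head>\n{$meta}\n<title>Example</title>\n</head>\n<body>...</body>\n</html>\n";
```

Keeping both declarations in one helper makes it harder for the header and the tag to drift apart.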
P粉763662390 2023-08-28 09:05:50
Data Storage:
Specify the utf8mb4 character set on all tables and text columns in your database. This makes MySQL physically store and retrieve values encoded natively in UTF-8. Note that MySQL will implicitly use utf8mb4 encoding if a utf8mb4_* collation is specified (without any explicit character set).
In older versions of MySQL (< 5.5.3), you'll unfortunately be forced to use simply utf8, which only supports a subset of Unicode characters (those in the Basic Multilingual Plane, encoded in at most three bytes). I wish I were kidding.
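A rough sketch of what that looks like in practice, kept as PHP strings so it can be run through any driver. The table names (articles, comments), the column definitions, and the utf8mb4_unicode_ci collation are assumptions chosen for illustration, not part of the advice above.

```php
<?php
// Hypothetical table and column names, for illustration only.
$createTable = <<<SQL
CREATE TABLE articles (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    body TEXT NOT NULL
) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
SQL;

// Converting an existing (also hypothetical) table and its text columns.
$convertTable = 'ALTER TABLE comments CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci';

// With an open PDO connection in $pdo, these would be run as:
// $pdo->exec($createTable);
// $pdo->exec($convertTable);
echo $createTable, "\n", $convertTable, "\n";
```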
Data Access:
In your application code (e.g. PHP), in whatever DB access method you use, you'll need to set the connection charset to utf8mb4. This way, MySQL does no conversion from its native UTF-8 when it hands data off to your application and vice versa.
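Some drivers provide their own mechanism for configuring the connection character set, which both updates the driver's internal state and informs MySQL of the encoding to use on the connection. This is usually the preferred approach. In PHP: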
If you're using the PDO abstraction layer with PHP ≥ 5.3.6, you can specify charset in the DSN:
    $dbh = new PDO('mysql:charset=utf8mb4');
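If you're using mysqli, you can call set_charset():
    $mysqli->set_charset('utf8mb4');       // object-oriented style
    mysqli_set_charset($link, 'utf8mb4');  // procedural style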
If you're stuck with plain mysql but happen to be running PHP ≥ 5.2.3, you can call mysql_set_charset.
If the driver does not provide its own mechanism for setting the connection character set, you may have to issue a query to tell MySQL how your application expects data on the connection to be encoded: SET NAMES 'utf8mb4'.
The same consideration regarding utf8mb4/utf8 applies as above.
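The connection advice above can be sketched as a small PDO wrapper. The function names build_utf8mb4_dsn() and connect_utf8mb4(), and the host, database, and credential values, are hypothetical; only the charset=utf8mb4 DSN option and the SET NAMES fallback come from the text.

```php
<?php
// Builds a MySQL DSN that asks the driver for utf8mb4.
// Host and database names are placeholders supplied by the caller.
function build_utf8mb4_dsn(string $host, string $dbname): string
{
    return "mysql:host={$host};dbname={$dbname};charset=utf8mb4";
}

// Opens a PDO connection; credentials are assumed to come from the caller.
function connect_utf8mb4(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(build_utf8mb4_dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION, // throw on errors
    ]);
}

// Fallback for drivers without their own charset mechanism:
// $pdo->exec("SET NAMES 'utf8mb4'");

echo build_utf8mb4_dsn('localhost', 'app'), "\n";
```

Building the DSN in one place means the charset cannot be forgotten on a second connection elsewhere in the application.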
Output:
Tell the browser the data is UTF-8 by sending the HTTP header Content-Type: text/html; charset=utf-8. You can achieve that either by setting default_charset in php.ini (preferred), or manually using the header() function.
When encoding output with json_encode(), add JSON_UNESCAPED_UNICODE as a second parameter.
Input:
Verify that every received string is valid UTF-8 before you store it or use it anywhere. PHP's mb_check_encoding() does the trick, but you have to use it religiously. There's really no way around this, as malicious clients can submit data in whatever encoding they want, and I haven't found a trick to get PHP to do this for you reliably.
Other Code Considerations:
Obviously, all the files you serve (PHP, HTML, JavaScript, etc.) should be encoded in valid UTF-8.
You need to make sure that every time you process a UTF-8 string, you do so safely. This is, unfortunately, the hard part. You'll probably want to make extensive use of PHP's mbstring extension.
PHP's built-in string operations are not by default UTF-8 safe. There are some things you can safely do with normal PHP string operations (like concatenation), but for most things you should use the equivalent mbstring function.
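To make the difference concrete, here is a short sketch comparing byte-oriented functions with their mbstring counterparts. It assumes the mbstring extension is loaded; the sample string is written with explicit byte escapes so the result does not depend on how the file itself is saved.

```php
<?php
$s = "h\xC3\xA9llo"; // "héllo": 'é' takes two bytes in UTF-8

var_dump(strlen($s));             // int(6): counts bytes
var_dump(mb_strlen($s, 'UTF-8')); // int(5): counts characters

// Byte-based substr() can cut a multibyte character in half.
$broken = substr($s, 0, 2);             // "h" plus the first byte of 'é'
$safe   = mb_substr($s, 0, 2, 'UTF-8'); // "hé"

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
var_dump($safe === "h\xC3\xA9");               // bool(true)

// Concatenating two valid UTF-8 strings stays valid.
var_dump(mb_check_encoding($s . " w\xC3\xB6rld", 'UTF-8')); // bool(true)
```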
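To know what you're doing (read: not mess it up), you really need to understand UTF-8 and how it works at the lowest possible level. Check out any of the links from utf8.com for some good resources on everything you need to know.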
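Returning to the Output and Input points, the sketch below shows the byte-level difference between valid and invalid input and the effect of JSON_UNESCAPED_UNICODE. The helper require_utf8() is hypothetical, and rejecting bad input with null is just one possible policy.

```php
<?php
// Hypothetical helper: returns the string if it is valid UTF-8, else null.
function require_utf8(string $input): ?string
{
    return mb_check_encoding($input, 'UTF-8') ? $input : null;
}

$good = "Gr\xC3\xBC\xC3\x9Fe"; // "Grüße" as UTF-8 bytes
$bad  = "Gr\xFC\xDFe";         // same text in ISO-8859-1, invalid as UTF-8

var_dump(require_utf8($good) !== null); // bool(true)
var_dump(require_utf8($bad));           // NULL

// Default: non-ASCII characters are escaped as \uXXXX sequences.
echo json_encode(['msg' => $good]), "\n";                         // {"msg":"Gr\u00fc\u00dfe"}
// With the flag: characters are emitted as raw UTF-8.
echo json_encode(['msg' => $good], JSON_UNESCAPED_UNICODE), "\n"; // {"msg":"Grüße"}
```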