search
HomeBackend DevelopmentPHP TutorialChinese coding encountered in PHP website development_PHP tutorial

Chinese coding encountered in PHP website development_PHP tutorial

Jul 13, 2016 pm 05:36 PM
phpChineselead toofprogrammingcodingwebsite developmentthismeetquestion

PHP程序设计中中文编码问题曾经困扰很多人,导致这个问题的原因其实很简单,每个国家(或区域)都规定了计算机信息交换用的字符编码集,如美国的扩展 ASCII 码, 中国的 GB2312-80,日本的 JIS 等。作为该国家/区域内信息处理的基础,字符编码集起着统一编码的重要作用。字符编码集按长度分为 SBCS(单字节字符集),DBCS(双字节字符集)两大类。早期的软件(尤其是操作系统),为了解决本地字符信息的计算机处理,出现了各种本地化版本(L10N),为了区分,引进了 LANG, Codepage 等概念。但是由于各个本地字符集代码范围重叠,相互间信息交换困难;软件各个本地化版本独立维护成本较高。因此有必要将本地化工作中的共性抽取出来,作一致处理,将特别的本地化处理内容降低到最少。这也就是所谓的国际化(118N)。各种语言信息被进一步规范为 Locale 信息。处理的底层字符集变成了几乎包含了所有字形的 Unicode。

现在大部分具有国际化特征的软件核心字符处理都是以 Unicode 为基础的,在软件运行时根据当时的ocale/Lang/Codepage 设置确定相应的本地字符编码设置,并依此处理本地字符。在处理过程中需要实现 Unicode 和本地字符集的相互转换,甚或以 Unicode 为中间的两个不同本地字符集的相互转换。这种方式在网络环境下被进一步延伸,任何网络两端的字符信息也需要根据字符集的设置转换成可接受的内容。

数据库中的字符集编码问题

流行的关系数据库系统都支持数据库字符集编码,也就是说在创建数据库时可以指定它自己的字符集设置,数据库的数据以指定的编码形式存储。当应用程序访问数据时,在入口和出口处都会有字符集编码的转换。对于中文数据,数据库字符编码的设置应当保证数据的完整性。GB2312、GBK、UTF-8 等都是可选的数据库字符集编码;当然我们也可以选择 ISO8859-1 (8-bit),只是我们得在应

用程序写数据之前先将 16Bit 的一个汉字或 Unicode 拆分成两个 8-bit 的字符,读数据之后也需要将两个字节合并起来,同时还要判别其中的 SBCS 字符,因此我们并不推荐采用 ISO8859-1 作为数据库字符集编码。这样不但没有充分利用数据库自身的字符集编码支持,而且同时也增加了编程的复杂度。编程时,可以先用数据库管理系统提供的管理功能检查其中的中文数据是否正确。

PHP 程序在查询数据库之前,首先执行 mysql_query("SET NAMES xxxx"); 其中 xxxx 是你网页的编码(charset=xxxx),如果网页中 charset=utf8,则 xxxx=utf8,如果网页中 charset=gb2312,则xxxx=gb2312,几乎所有 WEB 程序,都有一段连接数据库的公共代码,放在一个文件里,在这文件里,加入 mysql_query("SET NAMES xxxx") 就可以了。

SET NAMES 显示客户端发送的 SQL 语句中使用什么字符集。因此,SET NAMES utf-8 语句告诉服务器“将来从这个客户端传来的信息采用字符集 utf-8”。它还为服务器发送回客户端的结果指定了字符集(例如,如果你使用一个 SELECT 语句,它表示列值使用了什么字符集)。

定位问题时常用的技巧

定位中文编码问题通常采用最笨的也是最有效的办法―在你认为有嫌疑的程序处理后打印字符串的内码。通过打印字符串的内码,你可以发现什么时候中文字符被转换成 Unicode,什么时候Unicode 被转回中文内码,什么时候一个中文字成了两个 Unicode 字符,什么时候中文字符串被转成了一串问号,什么时候中文字符串的高位被截掉了……

取用合适的样本字符串也有助于区分问题的类型。如:"aa啊 aa?@aa" 等中英相间,GB、GBK特征字符均有的字符串。一般来说,英文字符无论怎么转换或处理,都不会失真(如果遇到了,可以尝试着增加连续的英文字母长度)。

解决各种应用的乱码问题

1) 使用 标签设置页面编码

这个标签的作用是声明客户端的浏览器用什么字符集编码显示该页面,xxx 可以为 GB2312、GBK、UTF-8(和 MySQL 不同,MySQL 是 UTF8)等等。因此,大部分页面可以采用这种方式来告诉浏览器显示这个页面的时候采用什么编码,这样才不会造成编码错误而产生乱码。但是有的时候我们会发现有了这句还是不行,不管 xxx 是哪一种,浏览器采用的始终都是一种编码,这个情况我后面会谈到。

请注意, 是属于 HTML 信息的,仅仅是一个声明,仅表明服务器已经把 HTML 信息传到了浏览器。

2) header("content-type:text/html; charset=xxx");

The function header() is to send the information in the brackets to the http header. If the content in the brackets is as mentioned in the article, the function is basically the same as the label. If you compare the first one, you will find that the characters are similar. But the difference is that if there is this function, the browser will always use the xxx encoding you requested and will never be disobedient, so this function is very useful. Why is this? Then we have to talk about the difference between http header and HTML information:

The http header is a string sent by the server before sending HTML information to the browser using the http protocol. The tag belongs to HTML information, so the content sent by header() reaches the browser first. The popular point is that header() has a higher priority (I don’t know if I can say this). If a php page has both header("content-type:text/html;charset=xxx") and header("content-type:text/html;charset=xxx"), the browser will only recognize the former http header and not the meta. Of course, this function can only be used within php pages.

There is also a question left, why does the former definitely work, but the latter sometimes does not work? This is the reason why we want to talk about Apache next.

3) AddDefaultCharset

In the conf folder in the Apache root directory, there is the entire Apache configuration document httpd.conf.

Open httpd.conf with a text editor. Line 708 (different versions may be different) contains AddDefaultCharset xxx, where xxx is the encoding name. The meaning of this line of code: Set the character set in the http header of the web page file in the entire server to your default xxx character set. Having this line is equivalent to adding a line of header("content-type:text/html; charset=xxx") to each file. Now you can understand why the browser always uses gb2312 even though it is set to utf-8.

If there is header("content-type:text/html; charset=xxx") in the web page, the default character set will be changed to the character set you set, so this function will always be useful. If you add a "#" in front of AddDefaultCharset xxx, comment out this sentence, and the page does not contain header("content-type..."), then it is the meta tag's turn to take effect.

The above is listed below in order of priority:

.. header("content-type:text/html; charset=xxx")

.. AddDefaultCharset xxx

..

If you are a web programmer, it is recommended to add a header ("content-type: text/html; charset=xxx") to each of your pages, so as to ensure that it can be displayed correctly on any server. Portability is also relatively strong.

4) default_charset configuration in php.ini:

default_charset = "gb2312" in php.ini defines the default language character set of php. It is generally recommended to comment out this line and let the browser automatically select the language based on the charset in the web page header instead of making a mandatory requirement, so that web services in multiple languages ​​can be provided on the same server.

Conclusion

In fact, Chinese coding in PHP development is not as complicated as imagined. Although there are no fixed rules for locating and solving problems, and various operating environments are also different, the principles behind it are the same. Understanding the knowledge of character sets is the basis for solving character problems. However, with the changes in the Chinese character set, not only PHP programming, but also problems in Chinese information processing will still exist for some time.

www.bkjia.comtruehttp: //www.bkjia.com/PHPjc/508265.htmlTechArticleThe problem of Chinese encoding in PHP programming has troubled many people. The reason for this problem is actually very simple. Every country (or region) all specify the character encoding set for computer information exchange...
Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
The Continued Use of PHP: Reasons for Its EnduranceThe Continued Use of PHP: Reasons for Its EnduranceApr 19, 2025 am 12:23 AM

What’s still popular is the ease of use, flexibility and a strong ecosystem. 1) Ease of use and simple syntax make it the first choice for beginners. 2) Closely integrated with web development, excellent interaction with HTTP requests and database. 3) The huge ecosystem provides a wealth of tools and libraries. 4) Active community and open source nature adapts them to new needs and technology trends.

PHP and Python: Exploring Their Similarities and DifferencesPHP and Python: Exploring Their Similarities and DifferencesApr 19, 2025 am 12:21 AM

PHP and Python are both high-level programming languages ​​that are widely used in web development, data processing and automation tasks. 1.PHP is often used to build dynamic websites and content management systems, while Python is often used to build web frameworks and data science. 2.PHP uses echo to output content, Python uses print. 3. Both support object-oriented programming, but the syntax and keywords are different. 4. PHP supports weak type conversion, while Python is more stringent. 5. PHP performance optimization includes using OPcache and asynchronous programming, while Python uses cProfile and asynchronous programming.

PHP and Python: Different Paradigms ExplainedPHP and Python: Different Paradigms ExplainedApr 18, 2025 am 12:26 AM

PHP is mainly procedural programming, but also supports object-oriented programming (OOP); Python supports a variety of paradigms, including OOP, functional and procedural programming. PHP is suitable for web development, and Python is suitable for a variety of applications such as data analysis and machine learning.

PHP and Python: A Deep Dive into Their HistoryPHP and Python: A Deep Dive into Their HistoryApr 18, 2025 am 12:25 AM

PHP originated in 1994 and was developed by RasmusLerdorf. It was originally used to track website visitors and gradually evolved into a server-side scripting language and was widely used in web development. Python was developed by Guidovan Rossum in the late 1980s and was first released in 1991. It emphasizes code readability and simplicity, and is suitable for scientific computing, data analysis and other fields.

Choosing Between PHP and Python: A GuideChoosing Between PHP and Python: A GuideApr 18, 2025 am 12:24 AM

PHP is suitable for web development and rapid prototyping, and Python is suitable for data science and machine learning. 1.PHP is used for dynamic web development, with simple syntax and suitable for rapid development. 2. Python has concise syntax, is suitable for multiple fields, and has a strong library ecosystem.

PHP and Frameworks: Modernizing the LanguagePHP and Frameworks: Modernizing the LanguageApr 18, 2025 am 12:14 AM

PHP remains important in the modernization process because it supports a large number of websites and applications and adapts to development needs through frameworks. 1.PHP7 improves performance and introduces new features. 2. Modern frameworks such as Laravel, Symfony and CodeIgniter simplify development and improve code quality. 3. Performance optimization and best practices further improve application efficiency.

PHP's Impact: Web Development and BeyondPHP's Impact: Web Development and BeyondApr 18, 2025 am 12:10 AM

PHPhassignificantlyimpactedwebdevelopmentandextendsbeyondit.1)ItpowersmajorplatformslikeWordPressandexcelsindatabaseinteractions.2)PHP'sadaptabilityallowsittoscaleforlargeapplicationsusingframeworkslikeLaravel.3)Beyondweb,PHPisusedincommand-linescrip

How does PHP type hinting work, including scalar types, return types, union types, and nullable types?How does PHP type hinting work, including scalar types, return types, union types, and nullable types?Apr 17, 2025 am 12:25 AM

PHP type prompts to improve code quality and readability. 1) Scalar type tips: Since PHP7.0, basic data types are allowed to be specified in function parameters, such as int, float, etc. 2) Return type prompt: Ensure the consistency of the function return value type. 3) Union type prompt: Since PHP8.0, multiple types are allowed to be specified in function parameters or return values. 4) Nullable type prompt: Allows to include null values ​​and handle functions that may return null values.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Atom editor mac version download

Atom editor mac version download

The most popular open source editor

SublimeText3 Linux new version

SublimeText3 Linux new version

SublimeText3 Linux latest version

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

SublimeText3 English version

SublimeText3 English version

Recommended: Win version, supports code prompts!

SAP NetWeaver Server Adapter for Eclipse

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.