Detailed explanation of strings, encoding, and UTF-8 in PHP

I have read a lot of articles on encoding recently, so I am writing two blog posts about PHP, strings, encoding, and UTF-8. This post is the first half and has four major parts: "Definition and use of strings", "String conversion", "The nature of PHP strings", and "Multibyte strings". The first half is relatively basic; the next article, "Best Practices of PHP and UTF-8", may have more substance.

Definition and use of strings

There are four ways to define a string in PHP:

Single quoted string

Single-quoted strings are similar to raw strings in Python: they do not parse variables, and they do not interpret escape sequences other than \' and \\. For example, in $str='hello\nworld' the \n stays as two literal characters and does not become a newline.
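To make the difference concrete, here is a minimal sketch comparing the same literal in single and double quotes. The variable names are only for illustration.

```php
<?php
// Single quotes: only \' and \\ are treated as escapes.
$single = 'hello\nworld';
// Double quotes: \n becomes one newline byte.
$double = "hello\nworld";

echo strlen($single), "\n"; // 12: the backslash and the "n" are two separate bytes
echo strlen($double), "\n"; // 11: "\n" is a single byte

$name = 'PHP';
echo 'Hi $name', "\n";              // prints: Hi $name (no variable parsing)
echo 'It\'s a backslash: \\', "\n"; // prints: It's a backslash: \
```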

Double quoted string

Double-quoted strings parse variables and interpret escape sequences, which single-quoted strings do not.

I find the octal and hexadecimal escapes especially interesting, so here they are for reference:

\[0-7]{1,3} # octal notation
\x[0-9A-Fa-f]{1,2} # hexadecimal notation
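As a small sketch of these two escapes, the snippet below builds the same ASCII text in both notations and then spells out the three UTF-8 bytes of one Chinese character by hand. The choice of character is just an example.

```php
<?php
$octal = "\101\102\103"; // octal 101, 102, 103 are the bytes of "ABC"
$hex   = "\x41\x42\x43"; // hex 41, 42, 43 are the same three bytes
$utf8  = "\xE4\xB8\xAD"; // the three UTF-8 bytes of the character "中"

var_dump($octal === 'ABC'); // bool(true)
var_dump($hex === $octal);  // bool(true)
echo bin2hex($utf8), "\n";  // e4b8ad
echo 'A\x41', "\n";         // single quotes: prints A\x41 unchanged
```

Because the escapes describe bytes, not characters, the result does not depend on how the PHP file itself is saved.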

heredoc

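This syntax is similar to a triple-quoted string in Python and can define a string that spans multiple lines. Like a double-quoted string, it parses variables and escape sequences. Its syntax is strict: the closing identifier must stand on its own line, and before PHP 7.3 it also had to start in the first column, so take care when using it.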

$str=<<<EOD
hello\n
world
EOD;

Nowdoc

Nowdoc is to heredoc what a single-quoted string is to a double-quoted one: it does not parse variables or escape sequences. It is well suited to defining a large block of text that should be taken literally.
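A short side-by-side sketch of heredoc and nowdoc, using the same body text in both so the difference in parsing is visible. The identifier EOD and the variable names are arbitrary.

```php
<?php
$name = 'world';

// Heredoc: parsed like a double-quoted string.
$heredoc = <<<EOD
hello\t$name
EOD;

// Nowdoc: the identifier is single-quoted, and the body is taken literally.
$nowdoc = <<<'EOD'
hello\t$name
EOD;

var_dump($heredoc === "hello\tworld"); // bool(true)
var_dump($nowdoc === 'hello\t$name');  // bool(true)
```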

Variable parsing

The most powerful feature of PHP strings is variable parsing: variables inside a string are resolved from the surrounding context at runtime (PHP is an interpreted language), which enables many useful tricks.

The simple syntax lets a string contain variables, array elements, and object properties directly. The complex syntax wraps an expression in {} and handles the cases the simple syntax cannot.

Let's look at the power of variable parsing through an example:

class beers {
    const softdrink = 'softdrink';
    public static $ale = 'ale';
    public $data = array(1,3,"k"=>4);
}

$softdrink = "softdrink";
$ale = "ale";
$arr = array("arr1","arr2","arr3"=>"arr4","arr4"=>array(1,2));
$arr4 = "arr4";
$obj = new beers;
echo "line1:{$arr[1]}\n";             // line1:arr2
echo "line2:{$arr['arr4'][0]}\n";     // line2:1
echo "line3:{$obj->data[1]}\n";       // line3:3
echo "line4:{${$arr['arr3']}}\n";     // line4:arr4 (the variable named by $arr['arr3'], i.e. $arr4)
echo "line5:{${$arr['arr3']}[1]}\n";  // line5:r (second byte of $arr4)
echo "line6:{${beers::softdrink}}\n"; // line6:softdrink (the variable named by the constant)
echo "line7:{${beers::$ale}}\n";      // line7:ale (the variable named by the static property)
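For contrast with the complex syntax, here is a minimal sketch of the simple syntax and of one case where braces are needed to mark where the variable ends. The array and object are made up for the example.

```php
<?php
$fruit = array('name' => 'apple', 3 => 'pear');
$obj = new stdClass;
$obj->price = 5;

// Simple syntax: variable, array element (unquoted key), object property.
$simple = "simple: $fruit[name] $fruit[3] $obj->price";

// Complex syntax: braces delimit the expression, so text can follow it directly.
$complex = "complex: {$fruit['name']}s cost {$obj->price} each";

echo $simple, "\n";  // simple: apple pear 5
echo $complex, "\n"; // complex: apples cost 5 each
```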

String conversion

One area where PHP is simpler than Python is implicit type conversion, which simplifies many operations. String conversion is used here to illustrate it.

Explicit type conversion

A (string) cast converts a value to a string:

$var = 10;
$dvar = (string) $var;
echo $dvar . "_" . gettype($dvar); // prints 10_string

The strval() function returns the string value of a variable:

$var = 10.2;
$dvar = strval($var);
echo gettype($var) . "_" . $dvar . "_" . gettype($dvar); // prints double_10.2_string

The settype() function changes the type of the variable in place:

$str = "10hello";
settype($str, "integer");
echo $str; // prints 10

Explicit conversion of other types to strings follows fixed rules. For example, the Boolean TRUE becomes the string "1", while FALSE and NULL become the empty string. It is worth learning these rules.
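A few of these rules can be checked directly. This is only a sketch covering booleans, NULL, and numbers, not the full rule set.

```php
<?php
$fromTrue  = (string) true;  // "1"
$fromFalse = (string) false; // "" (empty string)
$fromNull  = (string) null;  // "" (empty string)
$fromInt   = (string) 42;    // "42"
$fromFloat = strval(10.0);   // "10" (no trailing ".0")

var_dump($fromTrue, $fromFalse, $fromNull, $fromInt, $fromFloat);
```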

Automatic type conversion

The conversions above are explicit. More important to watch for is automatic type conversion: in an expression that requires a string, a value is converted to a string automatically, and in an arithmetic expression a numeric string is converted to a number. See the example:

$bool = true;
$str = 10 + "5"; // the numeric string becomes an int; a non-numeric string such as "hello" throws a TypeError in PHP 8
echo $bool . "_" . $str; // prints 1_15
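The sketch below separates the two directions of automatic conversion: values turned into strings by concatenation, and numeric strings turned into numbers by arithmetic. The last part shows the version-dependent behavior for a non-numeric string; the variable names are illustrative.

```php
<?php
// Value to string: the . operator needs strings on both sides.
$label = 'items: ' . 3;   // "items: 3"
$flag  = 'on=' . true;    // "on=1"

// String to number: arithmetic needs numbers on both sides.
$sum   = '5' + 10;        // int 15
$ratio = '1.5' + 1;       // float 2.5

// A non-numeric string in arithmetic: PHP 7 warns and treats it as 0,
// PHP 8 throws a TypeError instead.
$word = 'hello';
try {
    $bad = 10 + $word;
} catch (TypeError $e) {
    $bad = null;
}

var_dump($label, $flag, $sum, $ratio);
```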

The nature of PHP strings

Quoting the explanation from the PHP documentation:

A string in PHP is implemented as an array of bytes plus an integer giving the length of the buffer. It carries no information about how those bytes map to characters; that is left to the programmer. There are no restrictions on the values a string may contain; in particular, bytes with value 0 may appear anywhere in the string.

PHP does not specify the encoding of a string; that is up to the programmer. String literals are encoded however the PHP source file is encoded. For example, if the file is saved as GBK, the string literals in it are GBK bytes.

A note on binary safety: a byte with value 0 (NUL) may sit anywhere in a string, but some PHP functions that are not binary safe hand the string to underlying C functions, which ignore everything after the NUL.
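A quick sketch of what "array of bytes plus a length" means in practice: a NUL byte in the middle of a string is counted and found like any other byte by binary-safe functions such as strlen() and strpos().

```php
<?php
$bytes = "ab\0cd"; // five bytes, the third one is NUL

$length = strlen($bytes);       // 5: the NUL does not end the string
$nulPos = strpos($bytes, "\0"); // 2: the NUL can be searched for
$hex    = bin2hex($bytes);      // "6162006364"

var_dump($length, $nulPos, $hex);
```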

As long as the PHP file encoding is ASCII-compatible, string operations work well. However, string functions still operate on raw bytes regardless of the file encoding, so keep the following in mind:

  • Some functions assume the string is in a single-byte encoding but do not need to interpret the bytes as specific characters, for example substr().

  • Many functions take the encoding as an explicit parameter and otherwise fall back to a default from php.ini, for example htmlentities().

  • Some functions depend on the current locale, and these can only operate on single-byte encodings.
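The first two points can be sketched as follows. The UTF-8 text is written with byte escapes so the result does not depend on how this file is saved, and htmlspecialchars() stands in here for the family of functions that accept an encoding argument.

```php
<?php
$text = "\xE4\xB8\xAD\xE5\x9B\xBD"; // "中国" as UTF-8 bytes

// substr() counts bytes, so it can cut a multibyte character in half.
$firstByte = substr($text, 0, 1); // "\xE4", an incomplete UTF-8 sequence
$firstChar = substr($text, 0, 3); // "\xE4\xB8\xAD", the whole character "中"

// Passing the encoding explicitly avoids depending on the php.ini default.
$escaped = htmlspecialchars('<' . $firstChar . '>', ENT_QUOTES, 'UTF-8');

echo bin2hex($firstByte), "\n"; // e4
echo bin2hex($firstChar), "\n"; // e4b8ad
echo $escaped, "\n";            // &lt;中&gt;
```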

In general, although PHP has no native Unicode support, UTF-8 encoded strings work, and in most cases there are no problems. The following situations, however, need extra care:

  • Converting strings that are not UTF-8 encoded

  • A web page is UTF-8 encoded, but a user submits the form in GBK (ignoring the meta tag)

  • In a UTF-8 encoded PHP file, strlen("中国") returns 6 (the number of bytes) instead of the number of characters (2)

So how can these problems be solved? PHP provides the mbstring extension!

Multibyte strings

The mbstring extension is not enabled by default; pass --enable-mbstring when compiling PHP.
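Since the extension may be missing, a script can check for it before relying on it. The sketch below does that and then contrasts byte counts with character counts for a two-character UTF-8 string.

```php
<?php
$text = "\xE4\xB8\xAD\xE5\x9B\xBD"; // "中国" as UTF-8 bytes

echo strlen($text), "\n"; // 6: strlen() counts bytes

if (extension_loaded('mbstring')) {
    echo mb_strlen($text, 'UTF-8'), "\n";                // 2: characters
    echo bin2hex(mb_substr($text, 0, 1, 'UTF-8')), "\n"; // e4b8ad: one whole character
} else {
    echo "mbstring is not available in this build\n";
}
```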

Let's first look at the mbstring directives in php.ini. It took me a long time to understand them.

  • mbstring.language: as I understand it, this can simply be treated as UTF-8.

  • mbstring.internal_encoding: this has nothing to do with the encoding of the PHP file. Most mbstring functions need to know the encoding of the string they process; if it is not passed explicitly, the value of this directive is used. In newer PHP versions this directive is replaced by default_charset.

  • mbstring.http_input: the default encoding of HTTP input (excluding GET parameters), generally the same as the encoding of the HTML page. This directive is likewise replaced by default_charset.

  • mbstring.http_output: this one puzzled me. What is HTTP output? Isn't PHP's output simply the page? Why is a separate concept needed?

  • mbstring.encoding_translation: this one deserves attention. It is off by default. When it is on, PHP automatically converts POST variables and uploaded file names to the encoding given by mbstring.internal_encoding. I have not tested this, though; uploading a file with a Chinese name would be one way to try it. I recommend leaving it off and letting the programmer handle these issues.
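Rather than depending on php.ini, the internal encoding can also be inspected and set at runtime. This is a minimal sketch, assuming mbstring is loaded; after the call, mbstring functions that are not given an encoding argument use the value set here.

```php
<?php
// The general default that newer PHP versions use in place of the mbstring directives.
echo ini_get('default_charset'), "\n";

$text = "\xE4\xB8\xAD\xE5\x9B\xBD"; // "中国" as UTF-8 bytes

if (extension_loaded('mbstring')) {
    mb_internal_encoding('UTF-8');     // set the internal encoding for this request
    echo mb_internal_encoding(), "\n"; // UTF-8
    echo mb_strlen($text), "\n";       // 2: no encoding argument, internal encoding is used
}
```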

Now let's look at some of the functions the mbstring extension provides:

  • mb_http_input(): detects the HTTP input character encoding. I found it necessary when handling uploaded file names.

  • mb_convert_encoding(): a commonly used function; pay attention to the third parameter, the source encoding.

  • mb_detect_order(): sets or gets the character encoding detection order.

  • mb_list_encodings(): returns the list of encodings supported by the system.
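The functions above can be tried out with a round trip between UTF-8 and GBK. This sketch assumes mbstring is loaded and that the build accepts the encoding name GBK; passing the third argument explicitly means the source encoding is never guessed.

```php
<?php
$utf8 = "\xE4\xB8\xAD\xE5\x9B\xBD"; // "中国" as UTF-8 bytes

if (extension_loaded('mbstring')) {
    // Arguments: string, target encoding, source encoding.
    $gbk  = mb_convert_encoding($utf8, 'GBK', 'UTF-8');
    $back = mb_convert_encoding($gbk, 'UTF-8', 'GBK');

    echo strlen($gbk), "\n";    // 4: two bytes per character in GBK
    var_dump($back === $utf8);  // bool(true): the round trip is lossless here

    print_r(mb_detect_order()); // the current detection order
    var_dump(in_array('UTF-8', mb_list_encodings(), true)); // bool(true)
}
```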

Important note: a PHP file must be saved in an encoding that PHP supports, and that encoding must be ASCII-compatible.

Do not use BIG-5 as the PHP file encoding, especially when non-ASCII strings appear as identifiers or literals, because the second byte of some BIG-5 characters is 0x5C, the ASCII backslash. If the PHP file really is BIG-5 encoded, try to convert input and output content to UTF-8.
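One way to see why BIG-5 is troublesome: in some of its two-byte characters the second byte has the same value as the ASCII backslash. The sketch below uses the bytes A5 5C, which I am assuming (from the commonly cited BIG-5 table) encode the character "功"; byte-oriented functions cannot tell that this 0x5C is part of a character.

```php
<?php
$big5Char = "\xA5\x5C"; // assumed BIG-5 bytes of one character; the second byte is 0x5C

// A byte-oriented search finds a "backslash" inside the character.
$pos = strpos($big5Char, '\\'); // 1

// addslashes() escapes that byte and so corrupts the character.
$escaped = addslashes($big5Char); // three bytes: A5 5C 5C

var_dump($pos);
echo bin2hex($escaped), "\n"; // a55c5c
```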

Zend Multibyte

Finally, a word on Zend Multibyte, which I do not understand very deeply. First, do not confuse it with the mbstring extension. Zend Multibyte mode is off by default and is turned on with the zend.multibyte directive. The encoding the PHP parser should use is then specified with a declare(encoding=...) statement.

What is this directive for? As mentioned above, PHP files must be in an ASCII-compatible encoding, so what about encodings such as BIG-5 that are not fully compatible? This is where the directive comes in: the PHP parser reads the mbstring.script_encoding setting (zend.script_encoding since PHP 5.4) and uses that encoding to parse PHP files.
