I have read a lot of articles on character encoding recently, so I am writing two blog posts about PHP, strings, encoding, and UTF-8. This post is the first of the two. It covers four topics: "Definition and Use of Strings", "String Conversion", "The Essence of PHP Strings", and "Multi-byte Strings". The first half of this post is relatively basic.
The definition and use of strings
There are four ways to define a string in PHP:
Single-quoted strings
A single-quoted string is similar to a raw string in Python: it does not parse variables and does not process escape sequences, apart from \' and \\. For example, in $str = 'hello\nworld', the \n is not a newline; it stays as a backslash followed by the letter n.
Double-quoted strings
A double-quoted string parses variables and processes escape sequences, neither of which a single-quoted string does.
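The difference between the two quoting styles can be seen in a minimal sketch like the one below. The variable names are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical variable names, used only for illustration.
$name = 'world';

$single = 'hello\n$name'; // kept literally: backslash, n, dollar sign and all
$double = "hello\n$name"; // \n becomes a newline, $name is replaced by its value

echo strlen($single), "\n"; // 12 bytes
echo strlen($double), "\n"; // 11 bytes
```

The single-quoted version is one byte longer because \n stays as two characters, while $name happens to have the same length as its value.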
I find the octal and hexadecimal escapes especially interesting, so I will add them here:
\[0-7]{1,3} # octal notation
\x[0-9A-Fa-f]{1,2} # hexadecimal notation
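As a small sketch of these two escapes, the following compares each notation with the plain ASCII letter it encodes. The variable names are made up for the example.

```php
<?php
$octal = "\101"; // octal 101 is decimal 65, the letter A
$hex   = "\x41"; // hexadecimal 41 is also decimal 65

var_dump($octal === 'A'); // bool(true)
var_dump($hex === 'A');   // bool(true)
var_dump('\x41' === 'A'); // bool(false): single quotes keep the four characters as-is
```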
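Heredoc
This syntax is similar to a triple-quoted string in Python and can define a string that spans multiple lines. Like a double-quoted string, it parses variables and escape sequences. Its syntax is strict (the opening identifier must be followed directly by a newline, and the closing identifier must sit on its own line), so take care when using it.
$str = <<<EOD
hello\n world
EOD;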
Nowdoc
Nowdoc is to heredoc what a single-quoted string is to a double-quoted one: it does not parse variables or escape sequences. It suits large blocks of text that should be kept as-is, with no need to escape special characters.
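To make the contrast concrete, here is a minimal sketch that puts the same body through heredoc and nowdoc. The identifier EOD and the variable names are arbitrary choices for the example.

```php
<?php
$name = 'world';

// Heredoc: behaves like a double-quoted string.
$heredoc = <<<EOD
hello $name
EOD;

// Nowdoc: the identifier is wrapped in single quotes, nothing is parsed.
$nowdoc = <<<'EOD'
hello $name
EOD;

echo $heredoc, "\n"; // hello world
echo $nowdoc, "\n";  // hello $name
```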
Variable parsing
The most powerful part of PHP strings is variable parsing. Variables are resolved from the surrounding context at runtime (PHP is an interpreted language), which makes many neat tricks possible.
Simple syntax lets a string contain variables, array elements, and object properties directly. Complex syntax wraps an expression in {} so that it can be evaluated inside the string.
Here is an example that shows the power of variable parsing:
class beers {
    const softdrink = 'softdrink';
    public static $ale = 'ale';
    public $data = array(1, 3, "k" => 4);
}
$softdrink = "softdrink";
$ale = "ale";
$arr = array("arr1", "arr2", "arr3" => "arr4", "arr4" => array(1, 2));
$arr4 = "arr4";
$obj = new beers;
echo "line1:{$arr[1]}\n";               // arr2
echo "line2:{$arr['arr4'][0]}\n";       // 1
echo "line3:{$obj->data[1]}\n";         // 3
echo "line4:{${$arr['arr3']}}\n";       // value of $arr4: arr4
echo "line5:{${$arr['arr3']}[1]}\n";    // second byte of $arr4: r
echo "line6:{${beers::softdrink}}\n";   // value of $softdrink: softdrink
echo "line7:{${beers::$ale}}\n";        // value of $ale: ale
String conversion
Another reason PHP is simpler to use than Python is its implicit type conversion, which simplifies many operations. String conversion is used here to explain it.
Casting to string
$var = 10;
$dvar = (string)$var;
echo $dvar . "_" . gettype($dvar); // 10_string
The strval() function returns the string value of a variable:
$var = 10.2;
$dvar = strval($var);
echo gettype($var) . "_" . $dvar . "_" . gettype($dvar); // double_10.2_string
The settype() function sets the type of a variable:
$str = "10hello";
settype($str, "integer");
echo $str; // 10
During an explicit conversion, PHP follows fixed rules when converting values of other types to strings. For example, the Boolean value TRUE becomes the string "1". It is best to learn these rules.
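A few of these rules can be checked directly. The sketch below only covers scalar values and null; it is not a complete list of the conversion rules.

```php
<?php
var_dump((string) true);  // string(1) "1"
var_dump((string) false); // string(0) ""
var_dump((string) null);  // string(0) ""
var_dump((string) 10);    // string(2) "10"
var_dump((string) 1.5);   // string(3) "1.5"
```

Note that FALSE and NULL both become the empty string, so the original value cannot be recovered from the result.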
Automatic type conversion
The conversions above are all explicit. Automatic type conversion deserves more attention: when an expression requires a certain type, such as a string, the value is converted to it automatically. See the example:
$bool = true;
$str = 10 + "hello"; // PHP 7 and earlier: 10 (PHP 7 adds a warning); PHP 8 throws a TypeError
echo $bool . "_" . $str; // 1_10 on PHP 7 and earlier
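For comparison, here is a minimal sketch of automatic conversion that behaves the same way across PHP versions, because it sticks to a fully numeric string and to concatenation. The variable names are hypothetical.

```php
<?php
$count = 10;

$sum   = "10" + 5;          // numeric string is converted to a number: int(15)
$label = $count . " items"; // integer is converted to a string: "10 items"
$on    = "flag=" . true;    // true becomes "1": "flag=1"
$off   = "flag=" . false;   // false becomes "": "flag="

var_dump($sum, $label, $on, $off);
```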
The essence of PHP strings
To quote the PHP documentation:
A string in PHP is implemented as an array of bytes plus an integer giving the length of the buffer. It carries no information about how those bytes translate to characters; that is left to the programmer. There are no restrictions on the values a string may contain; in particular, bytes with the value 0 may appear anywhere in the string.
PHP does not specify an encoding for strings; how a string is encoded is up to the programmer. A string literal is encoded in whatever encoding the PHP file uses. For example, if your file is saved as GBK, the string literals in your code will be GBK.
A note on binary safety: a byte with the value 0 (NUL) can sit at any position in a string. However, some PHP functions are not binary-safe because they call C functions underneath, and those ignore everything after the NUL byte.
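The "array of bytes plus a length" model can be observed with a string that contains a NUL byte. This is only a small sketch using strlen() and bin2hex(), both of which are binary-safe.

```php
<?php
$bytes = "ab\0cd"; // a NUL byte in the middle of the string

echo strlen($bytes), "\n";  // 5: the NUL counts like any other byte
echo bin2hex($bytes), "\n"; // 6162006364
```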
As long as the PHP file's encoding is ASCII-compatible, string operations work well. However, string functions still operate on raw bytes regardless of the file encoding, so keep the following in mind:
Some functions assume that a string is in a single-byte encoding, but do not need to interpret the bytes as specific characters. The substr() function is an example.
Many functions take an explicit encoding parameter; if it is omitted, the default comes from php.ini. The htmlentities() function is an example.
Some functions depend on the current locale, and these can only operate on single-byte encodings.
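The byte-oriented behavior of substr() can be illustrated with a short sketch. The string is written with hexadecimal escapes so that the example does not depend on the encoding of the file it is saved in; the bytes are the UTF-8 form of the two Chinese characters 中国.

```php
<?php
// 中国 as UTF-8 bytes, three bytes per character.
$utf8 = "\xE4\xB8\xAD\xE5\x9B\xBD";

echo strlen($utf8), "\n";                // 6: bytes, not characters
echo bin2hex(substr($utf8, 0, 2)), "\n"; // e4b8: a broken fragment of the first character
echo bin2hex(substr($utf8, 0, 3)), "\n"; // e4b8ad: exactly the first character
```

Cutting at a byte offset that falls inside a character produces an invalid UTF-8 sequence, which is why offsets have to be chosen with the encoding in mind.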
In short, although PHP has no internal support for Unicode characters, it does support UTF-8 encoding. Most of the time this causes no problems, but the following situations may not be handled correctly:
Converting strings that are not UTF-8 encoded.
A web page is UTF-8 encoded, but a user sometimes submits a form in GBK (ignoring the meta tag).
In a UTF-8 encoded PHP file, strlen("中国") returns 6 instead of the actual number of characters (2).
So how can these problems be solved? PHP provides the mbstring extension!
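As a first taste, here is a minimal sketch of how the character-count problem looks with mbstring, assuming the extension is loaded. The encoding is passed explicitly so the result does not depend on php.ini.

```php
<?php
// 中国 as UTF-8 bytes, written with escapes to stay independent of the file encoding.
$utf8 = "\xE4\xB8\xAD\xE5\x9B\xBD";

echo strlen($utf8), "\n";             // 6 bytes
echo mb_strlen($utf8, 'UTF-8'), "\n"; // 2 characters

// mb_substr() counts in characters, so the first character comes back whole.
echo bin2hex(mb_substr($utf8, 0, 1, 'UTF-8')), "\n"; // e4b8ad
```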
Multi-byte strings
The mbstring extension is not enabled by default. You need to pass --enable-mbstring when building PHP.
Let's first look at the mbstring directives in php.ini. It took me a long time to understand them.
mbstring.language: As I understand it, this parameter can be treated as UTF-8.
mbstring.internal_encoding: This encoding has nothing to do with the encoding of the PHP file. Most mbstring functions need to know the encoding of the string being processed; if you do not pass it explicitly, the value of this parameter is used. In newer PHP versions this parameter is superseded by default_charset.
mbstring.http_input: This parameter sets the default encoding of HTTP input (excluding GET parameters). It is generally the same as the encoding of the HTML page. It, too, is superseded by default_charset.
mbstring.http_output: This parameter confused me. What is HTTP output? Isn't PHP's output just a page? Why does such a concept exist?
mbstring.encoding_translation: This parameter deserves attention. It is off by default. When it is on, PHP automatically converts the encoding of POST variables and uploaded file names to the value of mbstring.internal_encoding. I have not tested this; you could try it by uploading a file with a Chinese name. I recommend leaving it off and letting the programmer handle these issues.
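The effect of the internal encoding can be seen at runtime with mb_internal_encoding(), which reads or sets the same value the directive above configures. This is a sketch that assumes mbstring is loaded; it sets the value in the script rather than in php.ini.

```php
<?php
// Set the internal encoding for this script; returns true on success.
mb_internal_encoding('UTF-8');

echo mb_internal_encoding(), "\n"; // UTF-8

// With no encoding argument, mb_strlen() falls back to the internal encoding.
echo mb_strlen("\xE4\xB8\xAD\xE5\x9B\xBD"), "\n"; // 2
```

Passing the encoding explicitly to each call is the more defensive choice, since it keeps the code independent of the server configuration.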
Next, let's look at some functions of the mbstring extension:
mb_http_input(): Detects the HTTP input character encoding. I think it is needed when handling the names of uploaded files.
mb_convert_encoding(): A commonly used function. Pay attention to the third parameter, which is the encoding to convert from.
mb_detect_order(): Sets or gets the detection order of character encodings.
mb_list_encodings(): Returns the list of supported encodings.
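The GBK form-submission case mentioned earlier could be handled along these lines. This is a sketch under the assumption that the input is already known to be GBK; the bytes stand for the characters 中国, and the variable names are hypothetical.

```php
<?php
// 中国 as GBK bytes, two bytes per character.
$gbk = "\xD6\xD0\xB9\xFA";

// Arguments: the string, the target encoding, then the source encoding.
$utf8 = mb_convert_encoding($gbk, 'UTF-8', 'GBK');

echo bin2hex($utf8), "\n";                   // e4b8ade59bbd
var_dump(mb_check_encoding($utf8, 'UTF-8')); // bool(true)
var_dump(mb_check_encoding($gbk, 'UTF-8'));  // bool(false): the GBK bytes are not valid UTF-8
```

If the third parameter is left out, the source encoding falls back to the internal encoding, which is rarely what is wanted when the input comes from outside.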
Important note: a PHP file must be saved in an encoding that PHP supports, and that encoding must be ASCII-compatible.
Do not use BIG-5 as the PHP file encoding, especially when such strings appear as identifiers or literals. If the PHP file is in fact BIG-5 encoded, try to convert input and output to UTF-8.
Zend Multibyte
Finally, a word on Zend Multibyte, which I do not understand very deeply. First of all, do not confuse it with the mbstring extension. Zend Multibyte mode is off by default and is turned on with the zend.multibyte directive. The encoding the PHP parser should use for a file is then specified with declare().
What is this directive for? As mentioned above, a PHP file's encoding needs to be ASCII-compatible, so what about encodings that are not, such as BIG-5? This directive handles them: the PHP parser reads the mbstring.script_encoding setting (zend.script_encoding since PHP 5.4) and uses that encoding to parse PHP files.
