php multibyte string

Nov 21, 2016 pm 05:58 PM

Introduction

Although many languages can map every character they need one-to-one onto an 8-bit value, several languages require so many characters for written communication that they cannot fit within a single byte. (A byte consists of 8 bits, and each bit holds one of two values, 1 or 0, so a byte can represent only 256 distinct values, that is, 2 to the power of 8.) Multibyte character encoding schemes were developed to express more than 256 characters within conventional byte-based encoding systems.

When you manipulate (trim, split, splice, etc.) strings in a multibyte encoding, you need to use specialized functions, because two or more consecutive bytes may represent a single character. Otherwise, a function that is not multibyte-aware may fail to detect where a multibyte character begins and may produce a garbled string that has lost most of its original meaning.
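The difference between byte-oriented and multibyte-aware functions can be seen in a minimal sketch. It assumes the mbstring extension is loaded and the script file is saved as UTF-8; the sample string and variable names are only illustrative.

```php
<?php
// Three Japanese characters; each one occupies 3 bytes in UTF-8.
$text = "日本語";

// strlen() counts bytes, mb_strlen() counts characters.
$byteLength = strlen($text);
$charLength = mb_strlen($text, 'UTF-8');

// substr() cuts by byte offset, so 4 bytes ends inside the second character.
$brokenPrefix = substr($text, 0, 4);
// mb_substr() cuts by character offset and keeps characters whole.
$cleanPrefix = mb_substr($text, 0, 2, 'UTF-8');

// mb_check_encoding() reports whether a string is valid in the given encoding.
$brokenIsValid = mb_check_encoding($brokenPrefix, 'UTF-8');
$cleanIsValid = mb_check_encoding($cleanPrefix, 'UTF-8');

printf("bytes: %d, characters: %d\n", $byteLength, $charLength);
printf("substr result valid UTF-8: %s\n", $brokenIsValid ? 'yes' : 'no');
printf("mb_substr result valid UTF-8: %s\n", $cleanIsValid ? 'yes' : 'no');
```

Here the byte-based cut leaves a truncated character at the end of the string, which is the kind of garbling described above.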

mbstring provides multibyte-specific string functions that help you deal with multibyte encodings in PHP. In addition, mbstring can convert between the possible pairs of character encodings. For convenience, mbstring is designed to handle Unicode-based encodings such as UTF-8 and UCS-2, as well as many single-byte encodings.
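As an illustration of encoding conversion, the following sketch converts a UTF-8 string to a single-byte encoding and back with mb_convert_encoding(). It assumes the script file is saved as UTF-8; the choice of ISO-8859-1 and UCS-2 as targets is only an example.

```php
<?php
// "é" takes 2 bytes in UTF-8 but only 1 byte in ISO-8859-1.
$utf8 = "café";

// Arguments: string, target encoding, source encoding.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
$ucs2 = mb_convert_encoding($utf8, 'UCS-2', 'UTF-8');

// Converting back should reproduce the original bytes.
$fromLatin1 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$fromUcs2 = mb_convert_encoding($ucs2, 'UTF-8', 'UCS-2');

printf("UTF-8 bytes: %d\n", strlen($utf8));
printf("ISO-8859-1 bytes: %d\n", strlen($latin1));
printf("round trip via ISO-8859-1 intact: %s\n", $fromLatin1 === $utf8 ? 'yes' : 'no');
printf("round trip via UCS-2 intact: %s\n", $fromUcs2 === $utf8 ? 'yes' : 'no');
```

The character count stays the same across encodings; only the byte representation changes.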

mbstring is a non-default extension, which means it is not enabled by default. You must explicitly enable the module with a configure option (--enable-mbstring).
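Because the extension may be absent, a script can check for it before calling any mb_* function. The helper name mbstringStatus() below is hypothetical and exists only for this sketch.

```php
<?php
// Hypothetical helper: reports whether mbstring is available at runtime.
function mbstringStatus(): string
{
    if (!extension_loaded('mbstring')) {
        return 'mbstring is not loaded';
    }
    // mb_internal_encoding() with no argument returns the current setting.
    return 'mbstring is loaded; internal encoding: ' . mb_internal_encoding();
}

echo mbstringStatus(), PHP_EOL;
```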

HTTP Input and Output

HTTP input/output character encoding conversion may also convert binary data. If binary data is used for HTTP input/output, users are expected to control the character encoding conversion themselves.

Since PHP 4.3.3, if the enctype attribute of an HTML form is set to multipart/form-data and mbstring.encoding_translation is set to On in php.ini, POST variables and the names of uploaded files are also converted to the internal character encoding. However, the conversion is not applied to the query keys.

HTTP input: There is no way to control HTTP input character conversion from within a PHP script. To disable HTTP input character conversion, it must be configured in php.ini.

Example #1 Disabling HTTP input conversion in php.ini

;; Disable HTTP input conversion
mbstring.http_input = pass
;; Disable HTTP input conversion (PHP 4.3.0 or higher)
mbstring.encoding_translation = Off

When PHP runs as an Apache module, these settings can also be overridden per virtual host in httpd.conf or per directory with .htaccess.

HTTP output: There are several ways to enable HTTP output character encoding conversion. One is to use php.ini; another is to use ob_start() with mb_output_handler() as the ob_start() callback function.

Example #2 php.ini setting example

;; Enable output character encoding conversion for all PHP pages
;; Enable output buffering
output_buffering    = On
;; Set mb_output_handler to perform the output conversion
output_handler      = mb_output_handler

Example #3 Script example

<?php
    // Enable output character encoding conversion only for this page
    // Set the HTTP output character encoding to SJIS
    mb_http_output('SJIS');
    // Start buffering and specify "mb_output_handler" as the callback function
    ob_start('mb_output_handler');
?>

Multibyte string functions

mb_check_encoding — Check whether the string is valid in the specified encoding

mb_convert_case — Perform case folding on a string

mb_convert_encoding — Convert character encodings

mb_convert_kana — Convert one "kana" form to another ("zen-kaku", "han-kaku" and more)

mb_convert_variables — Convert character encodings of one or more variables

mb_decode_mimeheader — Decode the string in the MIME header field

mb_decode_numericentity — Decode HTML numeric string references to characters

mb_detect_encoding — Detect the character encoding of a string

mb_detect_order — Set/get the detection order of character encoding

mb_encode_mimeheader — Encode a string for a MIME header

mb_encode_numericentity — Encode character to HTML numeric string reference

mb_encoding_aliases — Get aliases of a known encoding type

mb_ereg_match — Regular expression match for multibyte string

mb_ereg_replace_callback — Perform a regular expression search and replace with multibyte support using a callback

mb_ereg_replace — Replace regular expression with multibyte support

mb_ereg_search_getpos — Returns start point for next regular expression match

mb_ereg_search_getregs — Retrieve the result from the last multibyte regular expression match

mb_ereg_search_init — Setup string and regular expression for a multibyte regular expression match

mb_ereg_search_pos — Returns position and length of a matched part of the multibyte regular expression for a predefined multibyte string

mb_ereg_search_regs — Returns the matched part of a multibyte regular expression

mb_ereg_search_setpos — Set start point of next regular expression match

mb_ereg_search — Multibyte regular expression match for predefined multibyte string

mb_ereg — Regular expression match with multibyte support

mb_eregi_replace — Replace regular expression with multibyte support ignoring case

mb_eregi — Regular expression match ignoring case with multibyte support

mb_get_info — Get the internal settings of mbstring

mb_http_input — Detect HTTP input character encoding

mb_http_output — Set/get HTTP output character encoding

mb_internal_encoding — Set/get the internal character encoding

mb_language — Set/get the current language

mb_list_encodings — Return an array of all supported encodings

mb_output_handler — Callback function for converting character encoding in the output buffer

mb_parse_str — Parse GET/POST/COOKIE data and set global variables

mb_preferred_mime_name — Get the MIME charset string

mb_regex_encoding — Set/Get character encoding for multibyte regex

mb_regex_set_options — Set/Get the default options for mbregex functions

mb_send_mail — Send encoded mail

mb_split — Split a multi-byte string using regular expressions

mb_strcut — Get part of a string

mb_strimwidth — Get a string truncated by a specified width

mb_stripos — Find the position of the first occurrence of a string within another string, case-insensitively

mb_stristr — Find the first occurrence of a string in another string, case-insensitively

mb_strlen — Get the length of a string

mb_strpos — Find the position of the first occurrence of a string in another string

mb_strrchr — Find the last occurrence of the specified character in another string

mb_strrichr — Find the last occurrence of the specified character in another string in a case-insensitive manner

mb_strripos — Find the position of the last occurrence of a string in another string, case-insensitively

mb_strrpos — Find the last occurrence of a string in a string

mb_strstr — Find the first occurrence of a string in another string

mb_strtolower — Make the string lowercase

mb_strtoupper — Make the string uppercase

mb_strwidth — Return the width of the string

mb_substitute_character — Set/get the substitution character

mb_substr_count — Count the number of substring occurrences

mb_substr — Get part of a string
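A few of the functions listed above can be tried together in a short sketch. It assumes the mbstring extension is loaded and the script file is saved as UTF-8; the sample strings and variable names are illustrative, and every call passes the encoding explicitly rather than relying on the internal encoding.

```php
<?php
$title = "PHPで日本語"; // 3 ASCII letters followed by 4 Japanese characters

// Length in characters and display width (full-width characters count as 2).
$length = mb_strlen($title, 'UTF-8');
$width = mb_strwidth($title, 'UTF-8');

// Character offset (not byte offset) of the first occurrence.
$position = mb_strpos($title, "日本語", 0, 'UTF-8');

// Case conversion that also handles non-ASCII letters.
$upper = mb_strtoupper("émile", 'UTF-8');

// Truncate to a display width of at most 8, including the "..." marker.
$short = mb_strimwidth($title, 0, 8, "...", 'UTF-8');

printf("length: %d, width: %d, position: %d\n", $length, $width, $position);
printf("upper: %s\n", $upper);
printf("short: %s (width %d)\n", $short, mb_strwidth($short, 'UTF-8'));
```

Passing the encoding on each call keeps the sketch independent of the mbstring.internal_encoding setting; calling mb_internal_encoding('UTF-8') once at the start would be an alternative.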

