Multibyte characters can be tricky in programming.
Warning
mbstring is not enabled by default. Make sure you read its installation instructions first.
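A quick way to see whether the extension is loaded is a runtime check. This is only a minimal sketch; the variable name and the messages are illustrative.

```php
<?php
// extension_loaded() reports whether a given extension is active in this PHP build.
$hasMbstring = extension_loaded('mbstring');

if ($hasMbstring) {
    echo 'mbstring is available' . PHP_EOL;
} else {
    // How to enable it depends on your platform (php.ini, distribution package, build flag).
    echo 'mbstring is missing' . PHP_EOL;
}
```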
Why bother with multibyte strings?
A document can contain multibyte strings. While PHP has plenty of useful string helpers, most of them work on bytes and are simply not meant for multibyte strings.
Using them on such strings will likely cause nasty bugs and other unexpected errors, especially when you count characters.
That's why you should use the Multibyte String Functions in PHP instead.
Besides, new multibyte string functions, such as mb_trim, mb_ltrim, and mb_rtrim will be available in 8.4 (the next release of PHP at the time of writing).
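If you want to try this before upgrading, one option is a small wrapper with a fallback. The sketch below is an assumption-laden example: trim_multibyte is a hypothetical helper name, and the regular expression fallback is one possible approximation of mb_trim, not its official definition.

```php
<?php
// Hypothetical helper: use mb_trim() when it exists (PHP 8.4+),
// otherwise fall back to a Unicode-aware regular expression.
function trim_multibyte(string $text): string
{
    if (function_exists('mb_trim')) {
        return mb_trim($text);
    }
    // \p{Z} covers Unicode separators such as U+3000 (ideographic space).
    return preg_replace('/^[\s\p{Z}]+|[\s\p{Z}]+$/u', '', $text) ?? $text;
}

$padded = "\u{3000}チャーミング\u{3000}"; // padded with ideographic spaces

echo '[' . trim($padded) . ']' . PHP_EOL;           // trim() leaves the ideographic spaces in place
echo '[' . trim_multibyte($padded) . ']' . PHP_EOL; // the multibyte-aware version removes them
```

The byte-oriented trim() only strips a handful of single-byte whitespace characters, which is why it misses the ideographic space.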
Why do some characters require multiple bytes?
English text fits in the ASCII character set, so letters like r or s only require one byte.
In contrast, some languages use characters that need more than one byte. For example, in UTF-8 most Han characters take 3 bytes, and a single character can take up to 4.
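To make the byte counts concrete, here is a minimal sketch that measures a few UTF-8 characters. The sample characters are illustrative choices, written as Unicode escapes so the result does not depend on how the file is saved.

```php
<?php
// One character from each UTF-8 width class: r, é, 漢 and an emoji.
$samples = ['r', "\u{E9}", "\u{6F22}", "\u{1F600}"];

// strlen() counts bytes, so it reveals how many bytes each character needs.
$byteLengths = array_map('strlen', $samples);

foreach ($samples as $index => $char) {
    printf(
        "%s: %d byte(s), %d character(s)\n",
        $char,
        $byteLengths[$index],
        mb_strlen($char, 'UTF-8') // always 1 here, since each sample is a single character
    );
}
// Byte lengths are 1, 2, 3 and 4 respectively.
```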
A few examples
Count chars
$strings = [
    "?????",
    "チャーミング",
    "González",
];

foreach ($strings as $string) {
    echo 'strlen:' . strlen($string) . ' vs. mb_strlen:' . mb_strlen($string) . PHP_EOL;
}
Find position
echo strpos("チャーミング", "ャ"); // gives 3
echo mb_strpos("チャーミング", "ャ"); // gives 1 because 1st position is 0
Cut string
echo substr("チャーミング", 3) . PHP_EOL; // ャーミング
echo mb_substr("チャーミング", 3); // ミング
Impact on performance
You might read that mbstring functions can have a significant impact on performance.
You may even reproduce it with the following script:
$cnt = 100000;
$strs = [
    'empty' => '',
    'short' => 'zluty kun',
    'short_with_uc' => 'zluty Kun',
    'long' => str_repeat('this is about 10000 chars long string', 270),
    'long_with_uc' => str_repeat('this is about 10000 chars long String', 270),
    'short_utf8' => 'žlutý kůň',
    'short_utf8_with_uc' => 'Žlutý kŮň',
];
foreach ($strs as $k => $str) {
    $a1 = microtime(true);
    for($i=0; $i < …

Source: PHP bugs

mb_* functions are slower, but it's always a tradeoff, and only the context should determine whether you should use these helpers or write your own.

For example, if you replace $cnt = 100000; with $cnt = 100; in the above script, mb_* helpers are still significantly slower, but the final impact might be fine in your case (e.g., 0.008 ms vs. 0.004 ms).

Wrap up

You must take multibyte characters into account, especially in a multilingual context, and PHP has built-in helpers for that.
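Since the benchmark script is incomplete here, the following is a rough sketch of how such a comparison might look. It assumes the script times strtolower() against mb_strtolower(), which is only a guess based on the "_with_uc" keys. The time_calls helper is hypothetical, and $cnt is lowered to keep the run short.

```php
<?php
// Hypothetical helper: run $fn($input) $iterations times and return the elapsed milliseconds.
function time_calls(callable $fn, string $input, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return (microtime(true) - $start) * 1000;
}

$cnt = 1000; // smaller than in the article, purely to keep the run short
$strs = [
    'short'      => 'zluty kun',
    'short_utf8' => 'žlutý kůň',
];

foreach ($strs as $label => $str) {
    printf(
        "%-12s strtolower: %.3f ms, mb_strtolower: %.3f ms\n",
        $label,
        time_calls('strtolower', $str, $cnt),
        time_calls('mb_strtolower', $str, $cnt)
    );
}
```

Timings vary widely between machines and PHP versions, so treat the numbers as relative rather than absolute.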
