search

PHP: Going multibytes

Multibyte characters can be tricky in programming.

Warning

mbstring is not enable by default. Ensure you read that part before.

Why bother with multibyte strings?

A document can contain multibyte strings. While PHP has plenty of useful helpers for strings, these helpers are simply not meant for multibyte strings.

It will likely cause nasty bugs and other unexpected errors, especially when you count chars.

That's why you'd rather use Multibyte String Functions in PHP instead.

Besides, new multibyte string functions, such as mb_trim, mb_ltrim, and mb_rtrim will be available in 8.4 (the next release of PHP at the time of writing).

Why do some characters require multiple bytes?

English uses the ASCII character set, so letters like r or s only require one byte.

In contrast, some languages use characters that need more than one byte, for example, Han characters (it can be up to 6 bytes!).

A few examples

Count chars

$strings = [
    "?????",
    "チャーミング",
    "González",
];

foreach ($strings as $string) {
    echo 'strlen:' . strlen($string) . ' vs. mb_strlen:' . mb_strlen($string) . PHP_EOL;
}

Find position

echo strpos("チャーミング", "ャ"); // gives 3
echo mb_strpos("チャーミング", "ャ"); // gives 1 because 1st position is 0

Cut string

echo substr("チャーミング", 3) . PHP_EOL;// ャーミング
echo mb_substr("チャーミング", 3);// ミング

Impact on performance

You might read that mbstring functions can have a significant impact.

You may even reproduce it with the following script:

$cnt = 100000;

$strs = [
    'empty' => '',
    'short' => 'zluty kun',
    'short_with_uc' => 'zluty Kun',
    'long' => str_repeat('this is about 10000 chars long string', 270),
    'long_with_uc' => str_repeat('this is about 10000 chars long String', 270),
    'short_utf8' => 'žlutý kůň',
    'short_utf8_with_uc' => 'Žlutý kŮň',
];

foreach ($strs as $k => $str) {
    $a1 = microtime(true);
    for($i=0; $i 



<p>Source: PHP bugs</p>

<p>mb_*  functions are slower, but it's always a tradeoff, and only the context should determine whether you should use these helpers or make your own.</p>

<p>For example, if you replace $cnt = 100000; by $cnt = 100; in the above script, mb_* helpers are still significantly slower, but the final impact might be fine in your case (e.g., 0.008 ms vs. 0.004 ms).</p>

<h2>
  
  
  Wrap up
</h2>

<p>You must take multibytes into account, especially in a multingual context, and PHP has built-in helpers for that.</p>


          

            
  

            
        

The above is the detailed content of PHP: Going multibytes. For more information, please follow other related articles on the PHP Chinese website!

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Beyond the Hype: Assessing PHP's Role TodayBeyond the Hype: Assessing PHP's Role TodayApr 12, 2025 am 12:17 AM

PHP remains a powerful and widely used tool in modern programming, especially in the field of web development. 1) PHP is easy to use and seamlessly integrated with databases, and is the first choice for many developers. 2) It supports dynamic content generation and object-oriented programming, suitable for quickly creating and maintaining websites. 3) PHP's performance can be improved by caching and optimizing database queries, and its extensive community and rich ecosystem make it still important in today's technology stack.

What are Weak References in PHP and when are they useful?What are Weak References in PHP and when are they useful?Apr 12, 2025 am 12:13 AM

In PHP, weak references are implemented through the WeakReference class and will not prevent the garbage collector from reclaiming objects. Weak references are suitable for scenarios such as caching systems and event listeners. It should be noted that it cannot guarantee the survival of objects and that garbage collection may be delayed.

Explain the __invoke magic method in PHP.Explain the __invoke magic method in PHP.Apr 12, 2025 am 12:07 AM

The \_\_invoke method allows objects to be called like functions. 1. Define the \_\_invoke method so that the object can be called. 2. When using the $obj(...) syntax, PHP will execute the \_\_invoke method. 3. Suitable for scenarios such as logging and calculator, improving code flexibility and readability.

Explain Fibers in PHP 8.1 for concurrency.Explain Fibers in PHP 8.1 for concurrency.Apr 12, 2025 am 12:05 AM

Fibers was introduced in PHP8.1, improving concurrent processing capabilities. 1) Fibers is a lightweight concurrency model similar to coroutines. 2) They allow developers to manually control the execution flow of tasks and are suitable for handling I/O-intensive tasks. 3) Using Fibers can write more efficient and responsive code.

The PHP Community: Resources, Support, and DevelopmentThe PHP Community: Resources, Support, and DevelopmentApr 12, 2025 am 12:04 AM

The PHP community provides rich resources and support to help developers grow. 1) Resources include official documentation, tutorials, blogs and open source projects such as Laravel and Symfony. 2) Support can be obtained through StackOverflow, Reddit and Slack channels. 3) Development trends can be learned by following RFC. 4) Integration into the community can be achieved through active participation, contribution to code and learning sharing.

PHP vs. Python: Understanding the DifferencesPHP vs. Python: Understanding the DifferencesApr 11, 2025 am 12:15 AM

PHP and Python each have their own advantages, and the choice should be based on project requirements. 1.PHP is suitable for web development, with simple syntax and high execution efficiency. 2. Python is suitable for data science and machine learning, with concise syntax and rich libraries.

PHP: Is It Dying or Simply Adapting?PHP: Is It Dying or Simply Adapting?Apr 11, 2025 am 12:13 AM

PHP is not dying, but constantly adapting and evolving. 1) PHP has undergone multiple version iterations since 1994 to adapt to new technology trends. 2) It is currently widely used in e-commerce, content management systems and other fields. 3) PHP8 introduces JIT compiler and other functions to improve performance and modernization. 4) Use OPcache and follow PSR-12 standards to optimize performance and code quality.

The Future of PHP: Adaptations and InnovationsThe Future of PHP: Adaptations and InnovationsApr 11, 2025 am 12:01 AM

The future of PHP will be achieved by adapting to new technology trends and introducing innovative features: 1) Adapting to cloud computing, containerization and microservice architectures, supporting Docker and Kubernetes; 2) introducing JIT compilers and enumeration types to improve performance and data processing efficiency; 3) Continuously optimize performance and promote best practices.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
3 weeks agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
3 weeks agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
3 weeks agoBy尊渡假赌尊渡假赌尊渡假赌
WWE 2K25: How To Unlock Everything In MyRise
3 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

Hot Tools

ZendStudio 13.5.1 Mac

ZendStudio 13.5.1 Mac

Powerful PHP integrated development environment

Atom editor mac version download

Atom editor mac version download

The most popular open source editor

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

MinGW - Minimalist GNU for Windows

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

MantisBT

MantisBT

Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.