Function to extract substrings according to UTF-8 encoding rules (UTF-8 version of substr)

Jul 25, 2016, 09:03 AM

<?php
/*
 * Function: works like substr(), but never cuts a multi-byte UTF-8 character
 *           in half, so the result contains no garbled characters
 * Parameters: $str    - UTF-8 encoded string
 *             $start  - byte offset (negative values count from the end)
 *             $length - byte length (null = up to the end of the string,
 *                       negative = stop that many bytes before the end)
 * Return: the substring, widened at both ends to whole-character boundaries
 */
function utf8_substr( $str , $start , $length = null ){
    $strlen = strlen( $str );
    // Convert start and length to absolute byte offsets, as substr() does.
    if ( $start < 0 ){
        $start = max( 0 , $strlen + $start );
    }
    if ( $start >= $strlen ){
        return '';
    }
    if ( $length === null ){
        $end = $strlen;
    } elseif ( $length < 0 ){
        $end = max( $start , $strlen + $length );
    } else {
        $end = min( $strlen , $start + $length );
    }
    // First cut normally.
    $res = substr( $str , $start , $end - $start );
    /* Then check whether the characters at both ends are complete. */
    // Take up to 6 bytes after the cut ...
    $next_segm = substr( $str , $end , 6 );
    // ... and up to 6 bytes before it.
    $prev_start = $start - 6 > 0 ? $start - 6 : 0;
    $prev_segm = substr( $str , $prev_start , $start - $prev_start );
    // If the bytes after the cut begin with continuation bytes (0x80-0xBF),
    // the last character was split: append the rest of it.
    if ( preg_match( '@^[\x80-\xBF]{1,5}@' , $next_segm , $bytes ) ){
        $res .= $bytes[0];
    }
    // If the result begins with a continuation byte, the first character was
    // split: take its leading bytes from the back of $prev_segm and prepend them.
    $ord0 = $res === '' ? 0 : ord( $res[0] );
    if ( 128 <= $ord0 && 191 >= $ord0 ){
        if ( preg_match( '@[\xC0-\xFD][\x80-\xBF]{0,5}$@' , $prev_segm , $bytes ) ){
            $res = $bytes[0] . $res;
        }
    }
    return $res;
}
?>
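The two regular expressions in the function depend on how UTF-8 marks its bytes: continuation bytes fall in the range 0x80-0xBF, and the first byte of a multi-byte character is 0xC0 or higher. The following is a minimal sketch of that classification, not part of the original function. The helper name utf8_byte_kind is hypothetical, and treating every byte from 0xC0 upward as a lead byte is a simplifying assumption. The last lines show how a plain substr() can leave a string that is no longer valid UTF-8.

```php
<?php
// Hypothetical helper (not from the original article): classify one byte
// of a UTF-8 string by its high bits.
function utf8_byte_kind( $byte ){
    $ord = ord( $byte );
    if ( $ord < 0x80 ){
        return 'ascii';        // 0xxxxxxx: single-byte character
    }
    if ( $ord <= 0xBF ){
        return 'continuation'; // 10xxxxxx: second or later byte of a character
    }
    return 'lead';             // 11xxxxxx: first byte of a multi-byte character
}

// "\xE4\xB8\xAD" is the three-byte UTF-8 encoding of U+4E2D.
$str = "a\xE4\xB8\xADb";
$kinds = array_map( 'utf8_byte_kind' , str_split( $str ) );
echo implode( ' ' , $kinds ) , "\n";

// A plain substr() that stops after two bytes cuts the character in half.
$cut = substr( $str , 0 , 2 );
var_dump( preg_match( '//u' , $cut ) === 1 ); // false: $cut is not valid UTF-8
```

A cut is safe only when the byte just after it is not a continuation byte and the first byte kept is not one either, which is what utf8_substr checks at each end.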

Test:

<?php
$str = 'dfjdjf test 13f test 65&2 datafｄdｊ（1 on mfe&...on';
var_dump( utf8_substr( $str , 22 , 12 ) ); echo "\n";
var_dump( utf8_substr( $str , 22 , -6 ) ); echo "\n";
var_dump( utf8_substr( $str , 9 , 12 ) ); echo "\n";
var_dump( utf8_substr( $str , 19 , 12 ) ); echo "\n";
var_dump( utf8_substr( $str , 28 , -6 ) ); echo "\n";
?>

Display results (extracted without garbled characters):
string(12) "According to fdj"
string(26) "According to fdj (1 is mfe&..."
string(13) "13f try 65&2 number"
string(12) "data fd"
string(20) "dｊ（1justmfe&..."
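For a check that is easy to reproduce, the same boundary-widening idea can be tried on a string written with explicit byte escapes. This is a sketch under stated assumptions, not the function above: is_continuation and cut_whole_chars are hypothetical names, the range is given as absolute byte offsets rather than start and length, and loops replace the regular expressions.

```php
<?php
// Hypothetical helper: true if the byte at offset $i is a UTF-8
// continuation byte (bit pattern 10xxxxxx).
function is_continuation( $str , $i ){
    return ( ord( $str[$i] ) & 0xC0 ) === 0x80;
}

// Hypothetical sketch: cut the byte range [$start, $end) and widen it so
// that no character is split at either end.
function cut_whole_chars( $str , $start , $end ){
    $len = strlen( $str );
    // Move the start back to the lead byte of the character it falls in.
    while ( $start > 0 && $start < $len && is_continuation( $str , $start ) ){
        $start--;
    }
    // Move the end forward past the remaining bytes of a split character.
    while ( $end < $len && is_continuation( $str , $end ) ){
        $end++;
    }
    return substr( $str , $start , $end - $start );
}

// "\xE6\xB5\x8B\xE8\xAF\x95" encodes U+6D4B U+8BD5, three bytes each.
$str = "ab\xE6\xB5\x8B\xE8\xAF\x95cd";

$plain = substr( $str , 3 , 4 );          // starts and ends inside a character
$whole = cut_whole_chars( $str , 3 , 7 ); // same byte range, widened

var_dump( preg_match( '//u' , $plain ) === 1 ); // false: invalid UTF-8
var_dump( preg_match( '//u' , $whole ) === 1 ); // true: valid UTF-8
var_dump( $whole === "\xE6\xB5\x8B\xE8\xAF\x95" ); // true: both characters kept whole
```

Widening, rather than trimming, means the result can be a few bytes longer than requested, which matches the test results above where a request for 12 bytes returned 13.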
