
A PHP function for truncating Chinese strings, with support for multiple encodings

PHP中文网 (Original)
2016-07-29 09:12:47, 1227 views

This function combines several methods to cut Chinese strings by character, so that a multibyte character is never split in the middle. Without the mbstring or iconv extension installed, it supports UTF-8, GBK, GB2312, and BIG5. With either extension installed, more encodings are supported; see the function's doc comment for details.
It uses one of three methods:
1. mb_substr(), which requires the mbstring extension
2. iconv_substr(), which requires the iconv extension
3. Regular expression matching, which is available by default
The methods are tried in order from top to bottom. If one is unavailable, the function automatically falls back to the next.
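
The priority order above can be sketched as a small check that reports which method a given PHP installation would end up using. The helper name `substr_backend` is hypothetical and not part of the original code; it only mirrors the `function_exists()` tests described in the list.

```php
<?php
// Hypothetical helper (not in the original article): reports which of the
// three methods would be selected on this PHP installation.
function substr_backend(): string
{
    if (function_exists('mb_substr')) {
        return 'mbstring';   // method 1: mbstring extension is loaded
    }
    if (function_exists('iconv_substr')) {
        return 'iconv';      // method 2: iconv extension is loaded
    }
    return 'regex';          // method 3: PCRE, available by default
}

echo substr_backend(), PHP_EOL;
```

On most installations this is likely to print `mbstring` or `iconv`, which means the regular expression path is only reached on fairly minimal builds.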

This code is an improved version of the "String Interception, Support Common Encoding" code released by Midnight. The changes are:

1. Fixed: the original code did not return the results of mb_substr and iconv_substr, so those calls had no effect.
2. Improved the handling of the truncation suffix: the suffix can now be customized and defaults to an empty string.
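
Note that the function that follows appends the suffix unconditionally, even when nothing was cut off. If an ellipsis should appear only on strings that were actually shortened, one possible variation looks like the sketch below. It is not the author's method: the name `truncate_with_suffix` is hypothetical, and it assumes UTF-8 input and uses `preg_split()` with the `u` modifier instead of the three-method chain.

```php
<?php
// Hypothetical variation (not in the original article): append $suffix
// only when characters were really removed. Assumes UTF-8 input.
function truncate_with_suffix(string $str, int $length, string $suffix = '...'): string
{
    // Split into individual UTF-8 characters; returns false on invalid UTF-8.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false || count($chars) <= $length) {
        return $str; // nothing to cut (or input not valid UTF-8): no suffix
    }
    return implode('', array_slice($chars, 0, $length)) . $suffix;
}

echo truncate_with_suffix('abcdef', 3), PHP_EOL; // abc...
echo truncate_with_suffix('abc', 5), PHP_EOL;    // abc
```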

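<?php
/**
 * Extract a substring by character count; supports Chinese and other multibyte encodings.
 *
 * @param string $str     The string to cut
 * @param int    $start   Start position, in characters
 * @param int    $length  Number of characters to keep
 * @param string $charset Encoding name; the regex fallback knows utf-8, gb2312, gbk and big5
 * @param string $suffix  Text appended to the result (default: empty)
 * @return string
 */
function substr_ext($str, $start, $length, $charset = "utf-8", $suffix = "")
{
    // $start has no default: an optional parameter before the required
    // $length could never be omitted and is deprecated in PHP 8.
    if (function_exists('mb_substr')) {
        return mb_substr($str, $start, $length, $charset) . $suffix;
    }
    if (function_exists('iconv_substr')) {
        return iconv_substr($str, $start, $length, $charset) . $suffix;
    }
    // Fallback: each pattern matches exactly one character in the given encoding.
    $re['utf-8']  = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xf4][\x80-\xbf]{3}/';
    $re['gb2312'] = '/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/';
    $re['gbk']    = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';
    // The second alternative was missing its opening bracket in the original.
    $re['big5']   = '/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/';
    // Normalize case so that "UTF-8" and "utf-8" select the same pattern.
    preg_match_all($re[strtolower($charset)], $str, $match);
    $slice = implode('', array_slice($match[0], $start, $length));
    return $slice . $suffix;
}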
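
Because mbstring or iconv is usually present, the regular expression branch is rarely exercised. The sketch below isolates that branch for UTF-8 so it can be tried on its own. The helper name `regex_substr_utf8` is hypothetical, the pattern accepts lead bytes `\xf0-\xf4` for four-byte sequences, and the example assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper (not in the original article): method 3 in isolation,
// restricted to UTF-8. Assumes this file itself is saved as UTF-8.
function regex_substr_utf8(string $str, int $start, int $length): string
{
    // One alternative per UTF-8 sequence length: 1, 2, 3 and 4 bytes.
    $pattern = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]'
             . '|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xf4][\x80-\xbf]{3}/';
    preg_match_all($pattern, $str, $match);
    // $match[0] holds one array element per character, so array_slice()
    // counts characters rather than bytes.
    return implode('', array_slice($match[0], $start, $length));
}

$text = 'PHP中文字符串';
echo regex_substr_utf8($text, 0, 5), PHP_EOL; // PHP中文
echo regex_substr_utf8($text, 3, 2), PHP_EOL; // 中文
```

A byte-based call such as `substr($text, 0, 5)` would instead stop two bytes into the three-byte character 中 and return an invalid UTF-8 string, which is the problem the character-level matching avoids.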

