
Use regular expressions to extract a fixed-length string from a source string, starting at a specified position

WBOY
2016-07-13 16:58:23

[Code] Use regular expressions to extract a fixed-length string (including Chinese) from the source string, starting at the specified position [Fourth Edition]
[Code] Use regular expressions to extract a string of a certain length from the source string, starting at the specified position [Fourth Edition]
[Code] Use regular expressions to extract a string of a certain length from the source string, starting at the specified position [Fourth revision]
[Code] Use regular expressions to extract a string of a certain byte length from the source string, starting at the head of the string
[Code] Use regular expressions to extract a string of a certain length from the source string, starting at the specified position

(BTW: Chinese encodings are complex and somewhat irregular. The high byte is 0xa1-0xfe (0xff is excluded because 0xff, i.e. 255, plays an important role in the telnet protocol) and the low byte is 0x40-0xfe; GBK, which adds a mapping to Unicode, extends the high-byte range to 0x81-0xfe.)


How to tell whether the last byte cuts a Chinese character in half:
If the cut falls in the middle of a Chinese character, the last byte kept is that character's high-order byte, and its value is at least 0x81.
The high-order byte of a Chinese character always lies in 0x81-0xfe, whereas the low-order byte can also fall in the ASCII range, so only the high-order byte can be used for this test.
A complete Chinese character: [0x81-0xfe][0x40-0xfe]
The regular expression therefore extracts Chinese characters and non-Chinese characters in sequence, trying the Chinese (two-byte) alternative first.
If the string ends halfway through a Chinese character, the final byte has nothing to pair with, so it is matched as a "non-Chinese" single byte even though it is really the high-order byte of a Chinese character.
Checking whether that final byte lies in [0x81-0xfe] therefore tells you whether the cut was wrong.
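
The check described above can be sketched as follows. This is only an illustration under assumptions: the helper name `ends_with_half_char` is hypothetical and does not appear in the original code, the input is assumed to be a GBK byte string, and the pattern is run without the `u` modifier so that PCRE works on raw bytes.

```php
<?php
// Hypothetical helper (not part of the original article): reports whether a
// byte string ends on the high-order byte of a split double-byte character.
function ends_with_half_char(string $bytes): bool
{
    // Try a complete character [0x81-0xfe][0x40-0xfe] first, then any single byte.
    preg_match_all('/[\x81-\xFE][\x40-\xFE]|./s', $bytes, $m);
    if (empty($m[0])) {
        return false; // empty input: nothing was cut
    }
    $last = end($m[0]);
    // A lone trailing byte in 0x81-0xfe is half of a Chinese character.
    return strlen($last) === 1 && preg_match('/^[\x81-\xFE]\z/', $last) === 1;
}

// 0xD6 0xD0 is assumed sample data: one Chinese character encoded as GBK.
var_dump(ends_with_half_char("a\xD6\xD0")); // bool(false)
var_dump(ends_with_half_char("a\xD6"));     // bool(true)
```

The pattern uses `\z` rather than `$` so that a trailing newline can never hide an extra byte from the final test.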


<?php
//---------------------------------------------------------------
// File name: preg_substr.php
// Description: Use regular expressions to extract a string of a given byte length
//              from the source string, starting at the specified position
//---------------------------------------------------------------

/// Function description
/// Function name: preg_substr
/// Function version: Fourth revision
/// Function: Use regular expressions to extract a string of a given byte length
///           from the source string, starting at the specified position,
///           without cutting a Chinese character in half
/// Function parameters:
/// $strSource: source string
/// $intStart: starting position in bytes, default is 0, which means starting from the beginning
/// $intLen: length to extract in bytes, default is 32

function preg_substr($strSource, $intStart=0, $intLen=32)
{
    is_int($intLen) or die("len isn't an integer");
    is_int($intStart) or die("start isn't an integer");

    // Group 1 is the skipped prefix ($intStart bytes), group 2 the candidate result (up to $intLen bytes).
    // The match fails, and "" is returned, when the source is shorter than $intStart bytes.
    $strResult = "";
    if ($intStart >= 0 && $intLen > 0 && preg_match('/^(.{'.$intStart.'})(.{0,'.$intLen.'})/s', $strSource, $regs)) {
        // If the prefix ends on a lone high-order byte, the start falls inside a character: back up one byte.
        preg_match_all('/([\x81-\xFE].|.)/s', $regs[1], $regs1, PREG_PATTERN_ORDER);
        if ($regs1[1] && preg_match('/^[\x81-\xFE]\z/', end($regs1[1]))) {
            $intStart--;
        }

        // If the candidate result ends on a lone high-order byte, shorten it by one byte.
        preg_match('/^(.{'.$intStart.'})(.{0,'.$intLen.'})/s', $strSource, $regs);
        preg_match_all('/([\x81-\xFE].|.)/s', $regs[2], $regs1, PREG_PATTERN_ORDER);
        if ($regs1[1] && preg_match('/^[\x81-\xFE]\z/', end($regs1[1]))) {
            $intLen--;
        }

        preg_match('/^(.{'.$intStart.'})(.{0,'.$intLen.'})/s', $strSource, $regs);
        $strResult = $regs[2];
    }
    return $strResult;
}

function preg_substr2($strSource, $intStart=0, $intLen=32)
{
    is_int($intLen) or die("len isn't an integer");
    is_int($intStart) or die("start isn't an integer");

    $strResult = ""; // returned as-is when the arguments are out of range
    if ($intStart >= 0 && $intLen >= 0) {
        // If the skipped prefix ends on a lone high-order byte, back up one byte.
        $strResult = substr($strSource, 0, $intStart);
        preg_match_all('/([\x81-\xFE].|.)/s', $strResult, $regs, PREG_PATTERN_ORDER);
        if ($regs[1] && preg_match('/^[\x81-\xFE]\z/', end($regs[1]))) {
            $intStart--;
        }

        // If the result ends on a lone high-order byte, shorten it by one byte.
        $strResult = (string) substr($strSource, $intStart, $intLen);
        preg_match_all('/([\x81-\xFE].|.)/s', $strResult, $regs, PREG_PATTERN_ORDER);
        if ($regs[1] && preg_match('/^[\x81-\xFE]\z/', end($regs[1]))) {
            $strResult = substr($strSource, $intStart, --$intLen);
        }
    }
    return $strResult;
}

$strHTML = << ab
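
To see the behavior on concrete bytes, the approach of the two functions above can be condensed into one small helper. This is a sketch under assumptions rather than the author's code: the name `gbk_safe_substr` is hypothetical, the sample string is assumed GBK data (two Chinese characters followed by `abc`), and positions and lengths are counted in bytes, as in the original functions.

```php
<?php
// Hypothetical helper, not the article's function: a byte-oriented substring
// that backs up the start and trims the end so no double-byte character is split.
function gbk_safe_substr(string $src, int $start = 0, int $len = 32): string
{
    if ($start < 0 || $len <= 0) {
        return '';
    }
    // True when $bytes ends on a lone high-order byte (0x81-0xfe).
    $endsOnLeadByte = function (string $bytes): bool {
        preg_match_all('/[\x81-\xFE].|./s', $bytes, $m);
        $last = $m[0] ? end($m[0]) : '';
        return strlen($last) === 1 && ord($last) >= 0x81 && ord($last) <= 0xFE;
    };
    if ($endsOnLeadByte(substr($src, 0, $start))) {
        $start--; // the start fell inside a character: include its first byte
    }
    $result = (string) substr($src, $start, $len);
    if ($endsOnLeadByte($result)) {
        $result = substr($result, 0, -1); // drop the dangling high-order byte
    }
    return $result;
}

$gbk = "\xD6\xD0\xCE\xC4abc"; // assumed sample: D6D0 and CEC4 are two GBK characters

echo bin2hex(gbk_safe_substr($gbk, 0, 3)), "\n"; // d6d0
echo bin2hex(gbk_safe_substr($gbk, 1, 4)), "\n"; // d6d0cec4
echo bin2hex(gbk_safe_substr($gbk, 4, 2)), "\n"; // 6162
```

The output is shown as hex because the result is a raw GBK byte string, which a UTF-8 terminal would not display correctly.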
