[Code] Use regular expressions to extract a fixed-length substring (including Chinese characters) from a source string, starting at a specified position [Fourth revision]
(BTW: Chinese encoding is complex and somewhat irregular. The high byte is 0xa1-0xfe (0xff is excluded because 0xff, which is 255, plays an important role in the telnet protocol), and the low byte is 0x40-0xfe; GBK extends the high-byte range to 0x81-0xfe, with a mapping to Unicode.)
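The ranges quoted above can be expressed as a simple byte check. The following is a minimal sketch, not part of the original file: the helper name `is_gbk_pair_sketch` is hypothetical, and it tests only the ranges stated in this article (high byte 0x81-0xfe, low byte 0x40-0xfe).

```php
<?php
// Hypothetical helper (not from the original file): checks only the
// byte ranges quoted in the text, nothing more.
function is_gbk_pair_sketch(string $pair): bool
{
    if (strlen($pair) !== 2) {
        return false;
    }
    $hi = ord($pair[0]); // high-order (first) byte
    $lo = ord($pair[1]); // low-order (second) byte
    return $hi >= 0x81 && $hi <= 0xFE && $lo >= 0x40 && $lo <= 0xFE;
}

var_dump(is_gbk_pair_sketch("\xD6\xD0")); // bool(true): both bytes in range
var_dump(is_gbk_pair_sketch("ab"));       // bool(false): 0x61 is below 0x81
```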
How to tell whether the cut ended in the middle of a Chinese character:
If the cut splits a Chinese character, the last byte is that character's high-order byte, so its value is at least 0x81.
This works because the high-order byte of a Chinese character is always in 0x81-0xfe, whereas the low-order byte is not restricted in the same way.
A complete Chinese character: [0x81-0xfe][0x40-0xfe]
The regular expression therefore consumes the string as a sequence of Chinese and non-Chinese characters, trying the two-byte Chinese alternative first.
If the cut splits a Chinese character, the final byte has no following byte to pair with, so it is matched as a single non-Chinese character even though it is really the high-order byte of a Chinese character.
Checking whether that final byte lies in [0x81-0xfe] therefore tells you whether the cut was wrong.
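The detection described above can be sketched in a few lines. This is an illustrative example under stated assumptions, not the author's code: the helper name `ends_with_half_char_sketch` and the sample bytes are hypothetical, and the pattern runs in byte mode (no `/u` modifier).

```php
<?php
// Hypothetical helper: true when $bytes ends on a lone high-order byte.
function ends_with_half_char_sketch(string $bytes): bool
{
    // The two-byte alternative comes first, so pairs are consumed
    // before single bytes ("Chinese characters taking priority").
    preg_match_all('/[\x81-\xFE].|./s', $bytes, $m);
    if (!$m[0]) {
        return false; // empty input: nothing was cut
    }
    $last = end($m[0]);
    // A one-byte final token in 0x81-0xFE is a high byte without its partner.
    return strlen($last) === 1 && ord($last) >= 0x81 && ord($last) <= 0xFE;
}

$gbk = "a\xD6\xD0b"; // "a", one two-byte character, "b"
var_dump(ends_with_half_char_sketch(substr($gbk, 0, 2))); // bool(true): cut after the high byte
var_dump(ends_with_half_char_sketch(substr($gbk, 0, 3))); // bool(false): cut after the full pair
```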
//---------------------------------------------------------------
// File name: preg_substr.php
// Description: Use regular expressions to extract a substring of a given byte length from the source string, starting at a specified position
//---------------------------------------------------------------
/// Function description
/// Function name: preg_substr
/// Function version: Fourth revision
/// Function: Use regular expressions to extract a substring of a given byte length from the source string, starting at a specified position
/// Function parameters:
/// $strSource: source string
/// $intStart: starting position in bytes; the default is 0, which means the beginning of the string
/// $intLen: length to extract in bytes; the default is 32
function preg_substr($strSource, $intStart=0, $intLen=32)
{
    is_int($intLen) or die("len isn't an integer");
    is_int($intStart) or die("start isn't an integer");
    $strResult = "";
    // Without the /u modifier, "." matches one byte, so both quantifiers count bytes.
    if ($intStart >= 0 && $intLen > 0 && preg_match('/^(.{'.$intStart.'})(.{0,'.$intLen.'})/s', $strSource, $regs)) {
        // Tokenize the skipped prefix, trying the two-byte alternative first.
        preg_match_all('/([\x81-\xFE].|.)/s', $regs[1], $regs1, PREG_PATTERN_ORDER);
        // A final one-byte token in 0x81-0xFE is a lone high byte: start one byte earlier.
        if ($regs1[1] && preg_match('/^[\x81-\xFE]\z/', end($regs1[1]))) {
            $intStart--;
            preg_match('/^(.{'.$intStart.'})(.{0,'.$intLen.'})/s', $strSource, $regs);
        }
        // Apply the same test to the extracted part: drop a trailing lone high byte.
        preg_match_all('/([\x81-\xFE].|.)/s', $regs[2], $regs1, PREG_PATTERN_ORDER);
        if ($regs1[1] && preg_match('/^[\x81-\xFE]\z/', end($regs1[1]))) {
            $intLen--;
            preg_match('/^(.{'.$intStart.'})(.{0,'.$intLen.'})/s', $strSource, $regs);
        }
        $strResult = $regs[2];
    }
    return $strResult;
}
function preg_substr2($strSource, $intStart=0, $intLen=32)
{
    is_int($intLen) or die("len isn't an integer");
    is_int($intStart) or die("start isn't an integer");
    $strResult = ""; // returned unchanged when the arguments are out of range
    if ($intStart >= 0 && $intLen >= 0)
    {
        // If the skipped prefix ends on a lone high byte, start one byte earlier.
        $strResult = substr($strSource, 0, $intStart);
        preg_match_all('/([\x81-\xFE].|.)/s', $strResult, $regs, PREG_PATTERN_ORDER);
        if ($regs[1] && preg_match('/^[\x81-\xFE]\z/', end($regs[1]))) {
            $intStart--;
        }
        // If the extracted part ends on a lone high byte, drop that byte.
        $strResult = substr($strSource, $intStart, $intLen);
        preg_match_all('/([\x81-\xFE].|.)/s', $strResult, $regs, PREG_PATTERN_ORDER);
        if ($regs[1] && preg_match('/^[\x81-\xFE]\z/', end($regs[1]))) {
            $strResult = substr($strSource, $intStart, --$intLen);
        }
    }
    return $strResult;
}
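To see the two boundary adjustments in action, here is a condensed, self-contained sketch of the same idea used by preg_substr2. The names `has_dangling_lead_sketch` and `cut_bytes_sketch` and the sample bytes are hypothetical and exist only for this illustration; results are printed as hex so they do not depend on the terminal's encoding.

```php
<?php
// Hypothetical helper: true when $bytes ends on a lone high byte (0x81-0xFE).
function has_dangling_lead_sketch(string $bytes): bool
{
    preg_match_all('/[\x81-\xFE].|./s', $bytes, $m);
    $last = $m[0] ? end($m[0]) : '';
    return strlen($last) === 1 && ord($last) >= 0x81 && ord($last) <= 0xFE;
}

// Condensed sketch of the two adjustments: back up the start, then trim the end.
function cut_bytes_sketch(string $src, int $start = 0, int $len = 32): string
{
    if ($start < 0 || $len < 0) {
        return '';
    }
    if (has_dangling_lead_sketch(substr($src, 0, $start))) {
        $start--; // start fell inside a character: back up to its high byte
    }
    if (has_dangling_lead_sketch(substr($src, $start, $len))) {
        $len--;   // end fell inside a character: drop the lone high byte
    }
    return substr($src, $start, $len);
}

$src = "ab\xD6\xD0\xCE\xC4cd"; // "ab", two two-byte characters, "cd"
echo bin2hex(cut_bytes_sketch($src, 0, 3)), "\n"; // 6162: the half character is dropped
echo bin2hex(cut_bytes_sketch($src, 3, 2)), "\n"; // d6d0: start moved back by one byte
echo bin2hex(cut_bytes_sketch($src, 2, 3)), "\n"; // d6d0: trailing high byte trimmed
```

Note that when the start is moved back, the result can begin one byte earlier than requested; that is the trade-off this approach makes to avoid returning half a character.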
$strHTML = <<
ab