How to use PHP regular expressions
PHP regular expressions are mainly used to split, match, search, and replace strings by pattern. For simple tasks a regular expression can be slower than a plain string function, so deciding when and how to use one takes some thought.
My introduction to PHP regular expressions came from an article on the Internet that explains them step by step, from simple to advanced. I think it is good introductory material, but how much you learn still depends on you. I kept forgetting things as I used them, so I read the article four or five times, and some of the harder points took a long time to digest. Keep reading anyway: once you have worked through it, your ability to apply regular expressions will improve noticeably.
Definition of PHP regular expression:
A regular expression is a syntax rule that describes how characters are arranged and matched. It is mainly used to split, match, search, and replace strings by pattern.
Regular expression functions in PHP:
PHP has had two sets of regular expression functions with similar capabilities:
One set is provided by the PCRE (Perl Compatible Regular Expressions) library. Its functions are named with the prefix "preg_".
The other set is provided by the POSIX (Portable Operating System Interface) extended regular expression extension. Its functions are named with the prefix "ereg" (ereg(), eregi(), ereg_replace(), and so on). The POSIX functions were deprecated in PHP 5.3 and removed in PHP 7.0.
Since the POSIX functions are leaving the stage of history, and PCRE syntax is close to Perl's, which makes it easier to move between Perl and PHP, we focus on PCRE here.
PCRE regular expression
PCRE stands for Perl Compatible Regular Expressions.
In PCRE, the pattern (the regular expression itself) is usually enclosed between two forward slashes "/", such as "/apple/".
Several important concepts in regular expressions are metacharacters, escapes, pattern units (and repetition), negation, backreferences, and assertions. The article [1] explains these concepts in a way that is easy to follow.
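To make the delimiter idea concrete, here is a minimal sketch. The sample strings are made up for illustration. It also shows that PHP accepts other delimiter characters such as "#", which is handy when the pattern itself contains slashes.

```php
<?php
// Hypothetical sample text, chosen only for illustration.
$text = 'An apple a day';

// The slashes delimit the pattern; they are not part of what is matched.
$found = preg_match('/apple/', $text);   // preg_match() returns 1 on a match

// A different delimiter avoids having to escape "/" inside the pattern.
$hasPath = preg_match('#/usr/bin#', '/usr/bin/php');

var_dump($found, $hasPath);
```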
Commonly used metacharacters:
Metacharacter    Description
\A      Matches at the beginning of the string
\Z      Matches at the end of the string (or just before a final newline)
\b      Matches a word boundary: /\bis/ matches "is" at the start of a word, /is\b/ matches "is" at the end of a word, and /\bis\b/ matches "is" as a whole word
\B      Matches a position that is not a word boundary: /\Bis/ matches the "is" in the word "This"
\d      Matches a digit; equivalent to [0-9]
\D      Matches any character except a digit; equivalent to [^0-9]
\w      Matches an English letter, digit, or underscore; equivalent to [0-9a-zA-Z_]
\W      Matches any character except English letters, digits, and underscores; equivalent to [^0-9a-zA-Z_]
\s      Matches a whitespace character; equivalent to [ \f\n\r\t\v]
\S      Matches any character except whitespace; equivalent to [^ \f\n\r\t\v]
\f      Matches a form feed; equivalent to \x0c or \cL
\n      Matches a newline; equivalent to \x0a or \cJ
\r      Matches a carriage return; equivalent to \x0d or \cM
\t      Matches a tab; equivalent to \x09 or \cI
\v      Matches a vertical tab (\x0b or \cK); in PCRE it also matches the other vertical whitespace characters
\NNN    Matches the character with the given octal code
\xNN    Matches the character with the given hexadecimal code
\cC     Matches a control character (for example, \cJ is Ctrl-J)
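A few of these metacharacters in action. This is only a sketch, and the sample string is an assumption made for the demo. Note that the patterns are written in single quotes so that PHP passes the backslashes through to PCRE unchanged.

```php
<?php
// Hypothetical sample string, used only for illustration.
$text = 'This is PHP 5.3';

// \b anchors "is" at word boundaries, so the "is" inside "This" is skipped.
preg_match_all('/\bis\b/', $text, $whole);

// \B requires a non-boundary before "is", so only the "is" in "This" matches.
preg_match_all('/\Bis/', $text, $inner, PREG_OFFSET_CAPTURE);

// \d+ collects runs of digits; \s+ splits on any run of whitespace.
preg_match_all('/\d+/', $text, $digits);
$words = preg_split('/\s+/', $text);

print_r($whole[0]);   // the standalone word "is"
print_r($inner[0]);   // "is" at offset 2, inside "This"
print_r($digits[0]);
print_r($words);
```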
Pattern modifiers:
Pattern modifiers are most often used to ignore case or to match across multiple lines. Knowing them solves many of the problems we run into.
i - Ignores case, so the pattern matches both uppercase and lowercase letters
m - Treats the string as multiple lines, so "^" and "$" match at the start and end of each line
s - Treats the string as a single line: newlines count as ordinary characters, so "." matches any character
x - Ignores whitespace in the pattern
U - Makes quantifiers ungreedy, so they match the shortest possible string
e - Evaluates the replacement string as PHP code (preg_replace() only; deprecated in PHP 5.5 and removed in PHP 7.0)
Format: modifiers follow the closing delimiter. For example, /apple/i matches "apple", "Apple", and so on, ignoring case.
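The modifiers are easier to remember after seeing each one change a result. The following is a small sketch with made-up sample text. The e modifier is left out because it no longer exists in current PHP versions.

```php
<?php
// Hypothetical two-line sample text.
$text = "Apple pie\napple tart";

// i: case-insensitive, so both "Apple" and "apple" are found.
$count = preg_match_all('/apple/i', $text, $m);

// m: ^ matches at the start of every line, not only at the start of the string.
$lineStarts = preg_match_all('/^apple/im', $text);

// s: "." also matches the newline, so the match can span both lines.
$spans = preg_match('/pie.apple/s', $text);

// U: quantifiers become ungreedy, so ".+" stops at the first closing quote.
preg_match('/"(.+)"/U', 'say "a" and "b"', $lazy);

// x: whitespace inside the pattern is ignored.
$spaced = preg_match('/ a p p l e /x', 'apple');

var_dump($count, $lineStarts, $spans, $lazy[1], $spaced);
```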
PCRE pattern units:
A pattern unit is a part of the pattern enclosed in parentheses. "\1" refers back to the text matched by the first pattern unit.
/^\d{2}([\W])\d{2}\1\d{4}$/ matches strings such as "12-31-2006", "09/27/1996", and "86 01 4321", but it does not match "12/34-5678". This is because the character matched by the unit "([\W])", here "/", has been stored, so when "\1" refers to it at the next separator position, it must match that same character "/".
Use the non-storing pattern unit "(?:)" when there is no need to store the matched text.
For example, /(?:a|b|c)(D|E|F)\1g/ matches "aEEg". Some regular expressions need non-storing pattern units, because otherwise the numbering of later references changes. The example above can also be written as /(a|b|c)(D|E|F)\2g/.
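The date pattern and the two equivalent "aEEg" patterns can be checked with a short script. This is a sketch that reuses the strings from the text; the variable names are my own.

```php
<?php
// \1 must repeat whatever separator the first group captured.
$pattern = '/^\d{2}([\W])\d{2}\1\d{4}$/';
$samples = ['12-31-2006', '09/27/1996', '86 01 4321', '12/34-5678'];

$results = [];
foreach ($samples as $sample) {
    // preg_match() returns 1 for a match and 0 for no match.
    $results[$sample] = preg_match($pattern, $sample);
}
print_r($results);   // only the mixed-separator string fails

// (?:...) does not store its match, so (D|E|F) is group 1.
$nonCapturing = preg_match('/(?:a|b|c)(D|E|F)\1g/', 'aEEg', $m);

// With a capturing first group, the same reference becomes \2.
$capturing = preg_match('/(a|b|c)(D|E|F)\2g/', 'aEEg');
```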
PCRE regular expression function:
preg_match() and preg_match_all()
preg_quote()
preg_split()
preg_grep()
preg_replace()
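As a quick orientation, here is a sketch that calls several of the functions listed above once each. All input values are invented for the demo.

```php
<?php
// preg_split(): split on commas or semicolons, ignoring surrounding spaces.
$parts = preg_split('/\s*[,;]\s*/', 'red, green;blue');

// preg_grep(): keep only the array entries that match (original keys are preserved).
$phpFiles = preg_grep('/\.php$/', ['a.php', 'b.txt', 'c.php']);

// preg_quote(): escape special characters so the text is matched literally.
// The second argument also escapes the delimiter we plan to use.
$quoted = preg_quote('$5.00', '/');
$hasPrice = preg_match('/' . $quoted . '/', 'price (USD) $5.00');

// preg_replace(): collapse each run of whitespace into a single dash.
$dashed = preg_replace('/\s+/', '-', 'a  b   c');

print_r($parts);
print_r($phpFiles);
var_dump($quoted, $hasPrice, $dashed);
```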
The PHP manual describes each function in detail. Here are some regular expressions I have accumulated:
Match action attribute
$str = '';    // the HTML source to search
$match = '';
// Capture the value of each action attribute that does not start with "http:".
preg_match_all('/\s+action="(?!http:)(.*?)"\s/', $str, $match);
print_r($match);
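Because $str is empty in the snippet above, it prints no matches. The sketch below runs the same pattern against hypothetical form markup (the two form tags are my own assumption, not the original sample) to show what ends up in $match.

```php
<?php
// Hypothetical markup: one relative action and one absolute action.
$str = '<form method="post" action="search.php" name="f">'
     . '<form method="post" action="http://example.com/x" name="g">';

// (?!http:) rejects values that start with "http:", so only the
// relative action is captured in group 1.
preg_match_all('/\s+action="(?!http:)(.*?)"\s/', $str, $match);

print_r($match[1]);
```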
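Use callback functions in regular expressions
/**
 * Replace part of a string through a callback function.
 * The /e modifier was removed in PHP 7.0, so preg_replace_callback() is used here instead.
 */
function callback_replace() {
    $url = 'http://esfang.house.sina.com.cn';
    $str = '';    // the HTML source to process
    $str = preg_replace_callback(
        '/(?<=\saction=")(?!http:)(.*?)(?="\s)/',
        function ($m) use ($url) { return search($url, $m[1]); },
        $str
    );
    echo $str;
}
function search($url, $match) {
    // Prefix the relative address with the site URL.
    return $url . '/' . $match;
}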
Regular matching with assertions
$match = '';
$str = 'xxxxxx.com.cn bold font
paragraph text
';
// The lookbehind captures the tag name; the lookahead requires the matching closing tag.
preg_match_all('/(?<=<(\w{1})>).*(?=<\/\1>)/', $str, $match);
echo "Content of HTML tags without attributes:";
print_r($match);
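The sample string above has lost its HTML tags, so here is a sketch with hypothetical markup (the b and p tags are my assumption) that shows how a lookbehind and a lookahead combine to return only the text between an opening tag and its closing tag.

```php
<?php
// Hypothetical markup with two single-letter tags that carry no attributes.
$str = "<b>bold font</b>\n<p>paragraph text</p>\n";

// (?<=<(\w{1})>) asserts that an opening tag precedes the match and stores its name.
// (?=<\/\1>) asserts that the closing tag with the same name follows.
// Neither assertion consumes characters, so only the inner text is returned.
preg_match_all('/(?<=<(\w{1})>).*(?=<\/\1>)/', $str, $match);

print_r($match[0]);   // the text inside each tag
print_r($match[1]);   // the tag names stored by the lookbehind
```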
Replace the address in the HTML source code
// The /e modifier was removed in PHP 7.0, so preg_replace_callback() is used here instead.
$form_html = preg_replace_callback(
    '/(?<=\saction="|\ssrc="|\shref=")(?!http:|javascript)(.*?)(?="\s)/',
    function ($m) use ($url) { return add_url($url, $m[1]); },
    $form_html
);
Finally, although regular expressions are powerful, in terms of both run time and writing time they are sometimes no more direct than explode(). For urgent or undemanding tasks, a simple and crude method may be the better choice.
As for execution efficiency between the preg and ereg families, I have read an article saying that preg is faster. I have not used ereg much, and since it is leaving the stage of history anyway, I prefer the PCRE way.
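As a small illustration of that trade-off, the sketch below splits the same made-up string both ways. When the separator is a fixed string, explode() gives the same result with less machinery; the regular expression only earns its keep once the separator varies.

```php
<?php
// Fixed separator: explode() and preg_split() give the same result.
$line = 'a,b,c';
$byExplode = explode(',', $line);
$byRegex = preg_split('/,/', $line);

// Irregular separator (optional spaces around the comma): here a
// regular expression is the more direct tool.
$messy = preg_split('/\s*,\s*/', 'a , b,c');

print_r($byExplode);
print_r($byRegex);
print_r($messy);
```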