Several tips for using PHP regular expressions
PHP regular expressions are mainly used to split, match, search and replace patterns in strings. In simple cases a regular expression may be less efficient than a plain string function, so using them well means weighing the trade-offs.
My introduction to PHP regular expressions came from an article on the Internet that explains their use from the basics to the advanced topics. I think it is good introductory material, but it still takes a long time to learn. I kept forgetting things as I used them, so I read the article four or five times. Some of the harder points take a long time to digest, but if you stick with it, you will find that your ability to apply regular expressions improves significantly.
Definition of PHP regular expression:
A regular expression is a grammar rule that describes how characters are arranged and matched. It is mainly used to split, match, search and replace strings.
Regular expression functions in PHP:
There are two sets of regular expression functions in PHP, and they do similar jobs:
One set is provided by the PCRE (Perl Compatible Regular Expressions) library. Its functions are named with the prefix "preg_";
The other set is provided by the POSIX (Portable Operating System Interface of Unix) extension. Its functions are named with the prefix "ereg" (ereg(), eregi(), ereg_replace() and so on). The POSIX regular expression functions have been deprecated since PHP 5.3 and were removed in PHP 7.0.
Since the POSIX functions are leaving the stage, and PCRE syntax is close to Perl's, which makes it easier to switch between Perl and PHP, this article focuses on PCRE regular expressions.
PCRE regular expression
PCRE stands for Perl Compatible Regular Expressions.
In PCRE, the pattern (the regular expression) is usually enclosed between two forward slashes "/", which act as delimiters, such as "/apple/".
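A minimal sketch of the delimiter convention, assuming PHP 5.3 or later. The subject strings are made up for illustration. It also shows that a character other than "/" can act as the delimiter, which is handy when the pattern itself contains slashes.

```php
<?php
// The pattern sits between two delimiter characters; "/" is the usual choice.
$found = preg_match('/apple/', 'pineapple pie');

// Another punctuation character such as "#" can be the delimiter instead,
// so the slashes in a path need no escaping.
$hasPath = preg_match('#/usr/local#', '/usr/local/bin/php');

var_dump($found, $hasPath);
```

preg_match() returns 1 when the pattern matches and 0 when it does not, so both calls here return 1.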
Several important concepts in regular expressions are metacharacters, escaping, pattern units, repetition, negation, references and assertions. These concepts are explained clearly in article [1].
Commonly used metacharacters:
Metacharacter    Description
\A      Matches at the beginning of the string
\Z      Matches at the end of the string (or just before a final newline)
\b      Matches a word boundary: /\bis/ matches words that begin with "is", /is\b/ matches words that end with "is", /\bis\b/ matches the whole word "is"
\B      Matches any position that is not a word boundary: /\Bis/ matches the "is" in the word "This"
\d      Matches a digit; equivalent to [0-9]
\D      Matches any character except a digit; equivalent to [^0-9]
\w      Matches an English letter, digit or underscore; equivalent to [0-9a-zA-Z_]
\W      Matches any character except English letters, digits and underscores; equivalent to [^0-9a-zA-Z_]
\s      Matches a whitespace character; equivalent to [ \f\n\r\t\v]
\S      Matches any character except whitespace; equivalent to [^ \f\n\r\t\v]
\f      Matches a form feed; equivalent to \x0c or \cL
\n      Matches a newline; equivalent to \x0a or \cJ
\r      Matches a carriage return; equivalent to \x0d or \cM
\t      Matches a tab; equivalent to \x09 or \cI
\v      Matches a vertical tab; equivalent to \x0b or \cK
\0NN    Matches the character with octal code NN
\xNN    Matches the character with hexadecimal code NN
\cC     Matches a control character
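Here is a small sketch that exercises a few of the metacharacters from the table, assuming PHP 5.4 or later (so that preg_match_all() can be called without a third argument). The subject string is invented for the example.

```php
<?php
// Made-up subject string; the double quotes let PHP turn \t into a real tab.
$text = "This is island 42\tand more";

// \b is a word boundary, so only the standalone word "is" matches.
$wholeWord = preg_match_all('/\bis\b/', $text);

// \B is a non-boundary, so this finds the "is" inside "This".
$insideWord = preg_match_all('/\Bis/', $text);

// \d matches a digit; \d+ picks out the whole number.
preg_match('/\d+/', $text, $number);

// \s matches whitespace, including the tab.
$words = preg_split('/\s+/', $text);

// \A and \Z anchor to the start and end of the whole string.
$anchored = preg_match('/\AThis.*more\Z/', $text);

// \x09 is the tab character written as a hexadecimal code.
$hasTab = preg_match('/\x09/', $text);

print_r(array($wholeWord, $insideWord, $number[0], $words, $anchored, $hasTab));
```

Note the single quotes around each pattern: they pass the backslashes through to PCRE untouched, which saves doubling them.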
Pattern modifiers:
Pattern modifiers are most often used to ignore case and to match across multiple lines. Mastering them solves many of the problems we run into.
i - Matches both uppercase and lowercase letters
m - Treats the string as multiple lines, so "^" and "$" match at the start and end of every line
s - Treats the string as a single line: newlines count as ordinary characters, so "." matches any character
x - Whitespace in the pattern is ignored
U - Ungreedy: quantifiers match the shortest possible string
e - Evaluates the replacement string as a PHP expression (preg_replace() only; deprecated in PHP 5.5 and removed in PHP 7.0)
Format: /apple/i matches "apple", "Apple" and so on, ignoring case.
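The effect of each modifier is easier to see side by side. The following is only a sketch with invented sample strings; the e modifier is left out because it no longer exists in PHP 7.0 and later.

```php
<?php
// i: ignore case.
$ci = preg_match('/apple/i', 'An APPLE a day');

// m: "^" and "$" match at every line, not just around the whole string.
$lines = "first\nsecond\nthird";
preg_match_all('/^\w+$/m', $lines, $perLine);

// s: "." also matches a newline.
$withoutS = preg_match('/first.second/', $lines);
$withS    = preg_match('/first.second/s', $lines);

// x: whitespace inside the pattern is ignored, so it can be spaced out.
preg_match('/ (\d{4}) - (\d{2}) /x', '2006-12', $ym);

// U: quantifiers become ungreedy and stop at the first possible end.
preg_match('/<.+>/',  '<b>bold</b>', $greedy);
preg_match('/<.+>/U', '<b>bold</b>', $lazy);

print_r(array($ci, $perLine[0], $withoutS, $withS, $ym, $greedy[0], $lazy[0]));
```

Modifiers can be combined after the closing delimiter, for example /pattern/im.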
Pattern units in PCRE:
A pattern unit is a parenthesized subpattern whose match is stored; \1 refers to the text matched by the first unit.
/^\d{2}([\W])\d{2}\1\d{4}$/ matches strings such as "12-31-2006", "09/27/1996" and "86 01 4321". It does not match "12/34-5678", because the result "/" of the pattern "[\W]" has already been stored. When "\1" is referenced at the next position, it must also match the character "/".
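To check the behaviour described above, here is a short sketch that runs the date pattern against the four sample strings. It assumes PHP 5.4 or later for the short array syntax, and the variable names are mine.

```php
<?php
// ([\W]) stores the first separator; \1 demands the same character again.
$pattern = '/^\d{2}([\W])\d{2}\1\d{4}$/';

$samples = ['12-31-2006', '09/27/1996', '86 01 4321', '12/34-5678'];

$results = [];
foreach ($samples as $sample) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$sample] = (preg_match($pattern, $sample) === 1);
}

var_dump($results);
```

Only the last string fails, because its two separators differ.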
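When the match result does not need to be stored, use the non-capturing pattern unit "(?:)".
For example, /(?:a|b|c)(D|E|F)\1g/ matches "aEEg". In some regular expressions a non-capturing unit is necessary; otherwise the numbering of the references that follow has to change. The example above could also be written as /(a|b|c)(D|E|F)\2g/.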
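The difference in numbering shows up in the array that preg_match() fills in. A minimal sketch, using the same two patterns and the subject "aEEg":

```php
<?php
// Non-capturing first unit: (D|E|F) is unit 1, so the reference is \1.
preg_match('/(?:a|b|c)(D|E|F)\1g/', 'aEEg', $nonCapturing);

// Capturing first unit: (D|E|F) becomes unit 2, so the reference is \2.
preg_match('/(a|b|c)(D|E|F)\2g/', 'aEEg', $capturing);

print_r($nonCapturing);
print_r($capturing);
```

Both patterns match the same text, but the second one stores an extra entry ("a") that is never used.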
PCRE regular expression functions:
preg_match() and preg_match_all()
preg_quote()
preg_split()
preg_grep()
preg_replace()
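A quick tour of the functions listed above, as a rough sketch with invented inputs. It is not a substitute for the manual, and it assumes PHP 5.4 or later.

```php
<?php
// preg_match_all() returns the number of matches and fills $numbers.
$count = preg_match_all('/\d+/', 'a1 b22 c333', $numbers);

// preg_quote() escapes characters that are special in a pattern,
// so "1.5" is matched literally instead of "1", any character, "5".
$needle  = preg_quote('1.5', '/');
$literal = preg_match('/' . $needle . '/', 'v1x5');

// preg_split() splits on a pattern instead of a fixed string.
$parts = preg_split('/[\s,]+/', 'a, b  c,d');

// preg_grep() keeps the array entries that match (keys are preserved).
$digitsOnly = array_values(preg_grep('/^\d+$/', array('10', 'x1', '42')));

// preg_replace() replaces every match.
$squeezed = preg_replace('/\s+/', ' ', "a   b\t c");

print_r(array($count, $numbers[0], $needle, $literal, $parts, $digitsOnly, $squeezed));
```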
The specific usage of each function can be found in the PHP manual. Here are some regular expressions I have collected in day-to-day work:
Matching the action attribute
    // Sample input reconstructed: the original string was damaged in extraction.
    $str = '<form action="test.php" method="post"></form> <form action="http://www.bac.com" method="post"></form>';
    $match = array();
    // Capture action values that do not start with "http:".
    preg_match_all('/\s+action=\"(?!http:)(.*?)\"\s/', $str, $match);
    print_r($match);
Using a callback function in a regular expression
    /**
     * replace some string by callback function
     *
     */
    function callback_replace() {
        $url = 'http://esfang.house.sina.com.cn';
        // Sample input reconstructed: the original string was damaged in extraction.
        $str = '<form action="test.php" method="post"></form> <form action="http://www.bac.com" method="post"></form>';
        // The original used the /e modifier, which was removed in PHP 7.0;
        // preg_replace_callback() does the same job.
        $str = preg_replace_callback(
            '/(?<=\saction=\")(?!http:)(.*?)(?=\"\s)/',
            function ($m) use ($url) { return search($url, $m[1]); },
            $str
        );
        echo $str;
    }

    function search($url, $match) {
        return $url . '/' . $match;
    }
Regular expression matching with assertions
    $match = array();
    $str = 'xxxxxx.com.cn <B>bold font</B> <p>paragraph text</p>';
    // The lookbehind captures the tag name; the lookahead requires the matching closing tag.
    preg_match_all('/(?<=<(\w{1})>).*(?=<\/\1>)/', $str, $match);
    echo "<br>Content of HTML tags without attributes:";
    print_r($match);
Replacing addresses in HTML source
    // The original used the /e modifier, which was removed in PHP 7.0;
    // preg_replace_callback() does the same job.
    // add_url() is the author's helper and is not shown in the original.
    $form_html = preg_replace_callback(
        '/(?<=\saction=\"|\ssrc=\"|\shref=\")(?!http:|javascript)(.*?)(?=\"\s)/',
        function ($m) use ($url) { return add_url($url, $m[1]); },
        $form_html
    );
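Finally, although regular expressions are powerful, in terms of efficiency and writing time they are sometimes less direct than explode. For urgent or undemanding tasks, the simple, blunt approach may be the better one.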
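To make that trade-off concrete, here is a small sketch with invented input strings. With a fixed separator, explode() and preg_split() give the same result and explode() is the simpler call; the regular expression only pays off once the separator varies.

```php
<?php
// Fixed separator: explode() is enough.
$csv        = 'red,green,blue';
$viaExplode = explode(',', $csv);
$viaRegex   = preg_split('/,/', $csv);

// Varying separator (comma or semicolon, optional spaces around it):
// here a pattern does what a single explode() call cannot.
$messy = 'red ,green;  blue';
$parts = preg_split('/\s*[,;]\s*/', $messy);

print_r(array($viaExplode, $viaRegex, $parts));
```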
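As for the relative speed of the preg and ereg families, I have read articles saying that preg is somewhat faster. I rarely use ereg, it is on its way out, and I personally prefer the PCRE style, so I have not compared them myself. Readers who know the details are welcome to share their views. Thanks.
This article comes from Cocowool's cnblogs (博客园) post "PHP中正则的使用" (Using regular expressions in PHP).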