PHP regular expression introductory tutorial [translated]
$regex = '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
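After preg_match succeeds, $matches[0] contains the text matched by the entire pattern, and $matches[1], $matches[2], and so on contain the text captured by each parenthesized group.
The same pattern with "#" as the delimiter is shown below. With this delimiter, "/" no longer needs to be escaped.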
$regex = '#^http://([\w.]+)/([\w]+)/([\w]+)\.html$#i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
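To make the contents of $matches concrete, here is a minimal sketch that runs both delimiter styles against the sample URL and compares the captures. The variable names $withSlash, $withHash, $a and $b are introduced here for illustration only.

```php
<?php
// The same pattern written with two different delimiters.
$withSlash = '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i';
$withHash  = '#^http://([\w.]+)/([\w]+)/([\w]+)\.html$#i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';

preg_match($withSlash, $str, $a);
preg_match($withHash, $str, $b);

// Index 0 is the whole match; indexes 1 to 3 are the three groups.
echo $a[1], "\n"; // www.youku.com
echo $a[2], "\n"; // show_page
echo $a[3], "\n"; // id_ABCDEFG

// The delimiter is only syntax: both patterns capture the same data.
var_dump($a === $b); // bool(true)
```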
¤ Modifier: used to change the behavior of regular expressions.
The trailing "i" in '/^http:\/\/([\w.]+)\/([\w]+)\/([\w]+)\.html$/i' is a modifier. It makes the match case-insensitive. Another modifier we often use is "x", which makes the engine ignore whitespace inside the pattern.
Example code:
$regex = '/HELLO/';
$str = 'hello word';
$matches = array();
// Without "i" the match is case-sensitive, so this branch is not taken.
if (preg_match($regex, $str, $matches)) {
    echo 'No i:Valid Successful!', "\n";
}
// With "i" appended, HELLO matches hello.
if (preg_match($regex . 'i', $str, $matches)) {
    echo 'YES i:Valid Successful!', "\n";
}
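The "x" modifier mentioned above can be sketched the same way. The year-month pattern and the variable names below are assumptions made for this illustration, not part of the original tutorial.

```php
<?php
// With "x", whitespace in the pattern is ignored, so it can be spaced out.
$compact  = '/^(\d{4})-(\d{2})$/';
$extended = '/^ (\d{4}) - (\d{2}) $/x';
$str = '2024-05';

var_dump(preg_match($compact, $str, $m1));  // int(1)
var_dump(preg_match($extended, $str, $m2)); // int(1)
var_dump($m1 === $m2);                      // bool(true)

// Without "x" the spaces are literal, so the spaced-out pattern fails.
var_dump(preg_match('/^ (\d{4}) - (\d{2}) $/', $str)); // int(0)
```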
¤ Character class: the part enclosed in square brackets, such as [\w], is a character class.
¤ Quantifier: as in [\w]{3,5}, [\w]* or [\w]+. The symbol that follows [\w] is a quantifier. The meaning of each one is given below.
{3,5} means 3 to 5 characters, {3,} means 3 or more characters, and {3} means exactly 3 characters. For "up to 5 characters" write {0,5}, because older PCRE versions treat {,5} as literal text.
* means 0 or more.
+ means 1 or more.
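The quantifiers above can be compared side by side on one string. This is a small sketch; firstMatch() is a hypothetical helper written for this example, not a PHP built-in.

```php
<?php
// Hypothetical helper for illustration: returns the first match or null.
function firstMatch(string $regex, string $str): ?string
{
    return preg_match($regex, $str, $m) ? $m[0] : null;
}

$str = 'aaaaaaa'; // seven "a" characters

echo firstMatch('/a{3,5}/', $str), "\n"; // aaaaa   (at most 5)
echo firstMatch('/a{3,}/', $str), "\n";  // aaaaaaa (3 or more)
echo firstMatch('/a{3}/', $str), "\n";   // aaa     (exactly 3)
echo firstMatch('/a+/', $str), "\n";     // aaaaaaa (1 or more)

// "*" may match zero times, so an empty match still counts as success.
var_dump(firstMatch('/b*/', $str)); // string(0) ""
// "+" needs at least one occurrence, so this one fails.
var_dump(firstMatch('/b+/', $str)); // NULL
```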
¤ Caret ^:
> Placed at the start of a character class (such as [^\w]) it negates the class, so the class matches any character that is not listed ("inverse selection").
> Placed at the start of an expression it anchors the match to the beginning of the string (/^n/i means "starts with n").
Note: "\" is usually called the escape character. It is used to escape special symbols such as "." and "/".
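The two roles of the caret and the effect of the escape character can be checked with a few one-liners. The sample strings are made up for this sketch.

```php
<?php
// "^" outside brackets anchors the match to the start of the string.
var_dump(preg_match('/^n/i', 'Node'));  // int(1)
var_dump(preg_match('/^n/i', 'anode')); // int(0)

// "^" as the first character inside brackets negates the class.
echo preg_replace('/[^\w]/', '', 'a-b c!'), "\n"; // abc

// "\" makes a special character literal: "\." matches only a dot.
var_dump(preg_match('/^a.c$/', 'abc'));  // int(1): "." matches any character
var_dump(preg_match('/^a\.c$/', 'abc')); // int(0)
var_dump(preg_match('/^a\.c$/', 'a.c')); // int(1)
```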
Zero-width assertions (lookarounds): assert that certain characters are, or are not, present at a given position in the string. There are two types of lookaround: lookahead (?=) and lookbehind (?<=).
> Format: positive lookahead is (?=) and its negated form is (?!). Positive lookbehind is (?<=) and its negated form is (?<!). The characters to test for follow the "=" or "!".
$regex = '/(?<=c)d(?=e)/'; /* d is immediately preceded by c and immediately followed by e */
$str = 'abcdefgk';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
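Negated form:
$regex = '/(?<!c)d(?!e)/'; /* d is not immediately preceded by c and not immediately followed by e */
$str = 'abcdefgk';
$matches = array();
// The only d in this string sits between c and e, so nothing is dumped.
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";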
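To see the positive and negated forms select different characters in one string, the sketch below uses preg_match_all with PREG_OFFSET_CAPTURE and prints the offsets of the matching "d" characters. The test string is an assumption chosen so that each form matches exactly once.

```php
<?php
// Offsets of the d characters: 1 (cde), 5 (cdx), 9 (xde), 13 (xdx).
$str = 'cde cdx xde xdx';

// Positive form: "d" with "c" right before it and "e" right after it.
preg_match_all('/(?<=c)d(?=e)/', $str, $pos, PREG_OFFSET_CAPTURE);

// Negated form: "d" with no "c" before it and no "e" after it.
preg_match_all('/(?<!c)d(?!e)/', $str, $neg, PREG_OFFSET_CAPTURE);

// Each entry of $pos[0] is [matched text, offset]; keep the offsets only.
echo json_encode(array_column($pos[0], 1)), "\n"; // [1]
echo json_encode(array_column($neg[0], 1)), "\n"; // [13]
```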
> Character width: zero. The following code verifies that an assertion consumes no characters.
$regex = '/HE(?=L)LO/i';
$str = 'HELLO';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
This prints no result, because the pattern does not match.
$regex = '/HE(?=L)LLO/i';
$str = 'HELLO';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
This one prints the result.
Explanation: (?=L) asserts that HE is followed by an L. However, (?=L) itself does not consume a character, so the pattern still has to match both L characters after it. This must be distinguished from (L), which does consume one character.
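The difference between (?=L) and (L) also shows up in what lands in $matches. A minimal sketch, with $look and $group as illustrative variable names:

```php
<?php
$str = 'HELLO';

// The lookahead checks for "L" but does not include it in the match.
preg_match('/HE(?=L)/', $str, $look);

// A plain group consumes the "L" and captures it.
preg_match('/HE(L)/', $str, $group);

echo $look[0], "\n";     // HE
echo count($look), "\n"; // 1: the assertion created no capture group
echo $group[0], "\n";    // HEL
echo $group[1], "\n";    // L
```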
Capturing data: a group that does not specify a type is captured for later use.
> "Specifying a type" refers to the assertions described above. Therefore, only parentheses that do not begin with a question mark capture data.
> A reference to a captured group from within the same expression is called a backreference.
> Calling format: a backslash followed by the group number (such as \1).
$regex = '/^(Chuanshanjia)[\w\s!]+\1$/';
$str = 'Chuanshanjia thank Chuanshanjia';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
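A common use of backreferences is finding a word that is repeated by mistake. The following sketch assumes a made-up sentence; \1 must match exactly the text that group 1 captured.

```php
<?php
// \b marks a word boundary, (\w+) captures a word, \1 repeats that word.
$regex = '/\b(\w+)\s+\1\b/i';

preg_match($regex, 'this is is a test', $m);
echo $m[0], "\n"; // is is
echo $m[1], "\n"; // is

// No word is doubled here, so the pattern does not match.
var_dump(preg_match($regex, 'this is a test')); // int(0)
```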
> Avoiding capture. Format: (?:pattern). Advantage: it keeps the number of numbered backreferences to a minimum, which makes the code clearer.
> Named capturing group. Format: (?P<name>pattern). Calling format: (?P=name).
$regex = '/(?P<author>chuanshanjia)[\s]Is[\s](?P=author)/i';
$str = 'author:chuanshanjia Is chuanshanjia';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Run result: $matches contains the full match "chuanshanjia Is chuanshanjia" at index 0, and the captured "chuanshanjia" under both the key 'author' and index 1.
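Non-capturing and named groups can be combined. The sketch below reuses the sample URL from the start of the tutorial; the group names host and id are assumptions chosen for this example.

```php
<?php
// (?:...) groups without capturing; (?P<name>...) captures under a name.
$regex = '#^(?:https?)://(?P<host>[\w.]+)/(?:\w+)/(?P<id>\w+)\.html$#i';
$str = 'http://www.youku.com/show_page/id_ABCDEFG.html';

preg_match($regex, $str, $m);

echo $m['host'], "\n"; // www.youku.com
echo $m['id'], "\n";   // id_ABCDEFG

// Named groups are also numbered, so index 1 is host and index 2 is id.
var_dump($m[1] === $m['host']); // bool(true)

// The two non-capturing groups got no number, so there is no index 3.
var_dump(isset($m[3])); // bool(false)
```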
Lazy matching (remember: two operations are performed; see the principle below)
Format: quantifier?
Principle: when "?" follows a quantifier, the quantifier first takes the smallest amount it allows, and takes more only if the rest of the pattern cannot match otherwise. For example, "*" takes 0, "+" takes 1, and {3,5} takes 3.
Look at the following code samples:
Code 1, greedy "*"
<?php
$regex = '/heL*/i';
$str = 'heLLLLLLLLLLLLLLLL';
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Result 1: $matches[0] is the whole string, because the greedy "*" takes every L.
Code 2, lazy "*?"
<?php
$regex = '/heL*?/i';
$str = 'heLLLLLLLLLLLLLLLL';
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Result 2: $matches[0] is "he", because the lazy "*?" takes zero L characters.
Code 3, using "+"
<?php
$regex = '/heL+?/i';
$str = 'heLLLLLLLLLLLLLLLL';
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Result 3: $matches[0] is "heL", because the lazy "+?" takes exactly one L.
Code 4, using {3,10}
<?php
$regex = '/heL{3,10}?/i';
$str = 'heLLLLLLLLLLLLLLLL';
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
Result 4: $matches[0] is "heLLL", because the lazy {3,10}? takes the minimum of three L characters.
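In the four samples above nothing follows the quantifier, so the lazy version always stops at its minimum. When more pattern follows, the lazy quantifier expands just far enough for the rest to match. The HTML-like string below is an assumption made for this sketch.

```php
<?php
$str = '<b>one</b> and <b>two</b>';

// Greedy: ".*" takes everything, then backs off to the last "</b>".
preg_match('/<b>.*<\/b>/', $str, $greedy);

// Lazy: ".*?" starts at zero characters and grows until "</b>" can match.
preg_match('/<b>.*?<\/b>/', $str, $lazy);

echo $greedy[0], "\n"; // <b>one</b> and <b>two</b>
echo $lazy[0], "\n";   // <b>one</b>
```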
Comments in regular expressions
Format: (?# comment text)
Purpose: mainly used to annotate complex expressions.
Example code: a regular expression that parses the parameters for connecting to a MySQL database.
$regex = '/
    ^host=(?<!\.)([\d.]+)(?!\.)       (?#host address)
    \|
    ([\w!@#$%^&*()_+\-]+)             (?#user name)
    \|
    ([\w!@#$%^&*()_+\-]+)             (?#password)
    (?!\|)$/ix';
$str = 'host=192.168.10.221|root|123456';
$matches = array();
if (preg_match($regex, $str, $matches)) {
    var_dump($matches);
}
echo "\n";
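An inline (?# ...) comment does not change what a pattern matches, and it also works without the "x" modifier. A small sketch, using a made-up phone-number pattern:

```php
<?php
$plain     = '/^(\d{3})-(\d{4})$/';
$commented = '/^(\d{3})(?#area code)-(\d{4})(?#number)$/';
$str = '555-1234';

preg_match($plain, $str, $m1);
preg_match($commented, $str, $m2);

echo $m2[1], "\n"; // 555
echo $m2[2], "\n"; // 1234

// The comments are skipped by the engine, so both results are identical.
var_dump($m1 === $m2); // bool(true)
```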
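Special characters
Special character | Meaning |
* | 0 or more times |
+ | 1 or more times; can also be written as {1,} |
? | 0 or 1 time |
. | any single character except a newline |
\w | [a-zA-Z0-9_] |
\s | whitespace (space, tab, newline, carriage return) [ \t\n\r] |
\d | [0-9] |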
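The entries in the table can be tried out directly. The input strings in this sketch are assumptions chosen to exercise one special character each.

```php
<?php
// \s+ splits on any run of whitespace (here a space and a tab).
echo json_encode(preg_split('/\s+/', "id_42 ok\t7")), "\n"; // ["id_42","ok","7"]

// \d matches a single digit.
echo preg_replace('/\d/', '#', 'id_42'), "\n"; // id_##

// \w covers letters, digits and the underscore.
var_dump(preg_match('/^\w+$/', 'id_42')); // int(1)

// "?" makes the preceding character optional.
var_dump(preg_match('/^colou?r$/', 'color')); // int(1)

// "." does not match a newline by default.
var_dump(preg_match('/^a.c$/', "a\nc")); // int(0)
```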