PHP regular expressions

WBOY
2016-08-08 09:20:07

What is a regular expression

A regular expression is a pattern that describes a set of strings. It combines ordinary characters with a few special characters into a single string, called the pattern, which is then matched against text.

$p = '/apple/';
$str = "apple banana";
if (preg_match($p, $str)) {
    echo 'matched';
}

The string '/apple/' is a regular expression; it tests whether the substring apple occurs in the source string.

PHP performs regular expression matching with the PCRE library functions. preg_match, used in the example above, performs a single match and is typically used to test whether a pattern occurs in a string.
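As a minimal sketch of how the result can be checked: preg_match returns 1 when the pattern matches, 0 when it does not, and false on error, so a strict comparison with 1 keeps the three cases apart. The helper name containsApple is hypothetical and only serves this illustration.

```php
<?php
// Hypothetical helper: reports whether "apple" occurs in the input.
function containsApple(string $str): bool
{
    // preg_match returns 1 (match), 0 (no match), or false (error).
    return preg_match('/apple/', $str) === 1;
}

var_dump(containsApple('apple banana')); // bool(true)
var_dump(containsApple('cherry'));       // bool(false)
```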

Basic syntax of regular expressions

With the PCRE functions, a pattern is made up of delimiters and metacharacters. The delimiter can be any character that is not alphanumeric, a backslash, or whitespace. Commonly used delimiters are the forward slash (/), the hash sign (#) and the tilde (~), for example:

/foo bar/
#^[^0-9]$#
~php~

If the pattern itself contains the delimiter character, that character must be escaped with a backslash (\):

/http:\/\//

If the pattern contains many delimiter characters, it is better to pick a different delimiter, or to escape them with preg_quote:

$p = 'http://';
$p = '/' . preg_quote($p, '/') . '/';
echo $p; // Result: /http\:\/\//
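The two options can be compared side by side. The sketch below assumes a sample URL and host name chosen for illustration; it shows that switching to # removes the need to escape slashes, and that preg_quote also escapes other metacharacters such as the dot.

```php
<?php
// Using # as the delimiter means the slashes need no escaping.
$withHash = '#http://#';
// The same pattern with / as the delimiter needs every slash escaped.
$withSlash = '/http:\/\//';

$url = 'http://www.imooc.com/';
var_dump(preg_match($withHash, $url));  // int(1)
var_dump(preg_match($withSlash, $url)); // int(1)

// preg_quote also escapes regex metacharacters in literal text,
// such as the dots in a host name.
$host = 'www.imooc.com';
$p = '#^http://' . preg_quote($host, '#') . '/#';
var_dump(preg_match($p, $url));                    // int(1)
var_dump(preg_match($p, 'http://wwwXimooc.com/')); // int(0)
```

Without preg_quote, the unescaped dots would match any character, so the last test would succeed by accident.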

Pattern modifiers can follow the closing delimiter. They include i, m, s, x and others. For example, the i modifier makes matching case-insensitive:

$str = "Http://www.imooc.com/";
if (preg_match('/http/i', $str)) {
    echo 'matched';
}
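The m and s modifiers can be illustrated the same way. This is a small sketch with a made-up two-line string: m changes where ^ and $ may match, and s lets the dot match a newline.

```php
<?php
$text = "first line\nsecond line";

// Without m, ^ only matches at the very start of the subject.
var_dump(preg_match('/^second/', $text));  // int(0)
// With m, ^ also matches right after each newline.
var_dump(preg_match('/^second/m', $text)); // int(1)

// Without s, the dot does not match a newline.
var_dump(preg_match('/first.+second/', $text));  // int(0)
// With s, the dot matches newlines too.
var_dump(preg_match('/first.+second/s', $text)); // int(1)
```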

Metacharacters and escaping

Characters with special meanings in regular expressions are called metacharacters. Commonly used metacharacters are:

\ general escape character
^ asserts the start of the subject (or the start of a line in multiline mode)
$ asserts the end of the subject (or the end of a line in multiline mode)
. matches any character except a newline (by default)
[ starts a character class definition
] ends a character class definition
| starts an alternative branch
( starts a subgroup
) ends a subgroup
? as a quantifier, matches 0 or 1 times; placed after another quantifier, it makes that quantifier lazy instead of greedy
* quantifier, matches 0 or more times
+ quantifier, matches 1 or more times
{ starts a custom quantifier
} ends a custom quantifier

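// \s matches any whitespace character (space, tab, newline). [^\s] matches any non-whitespace character, and [^\s]+ matches one or more of them.
// The pattern requires 我 (I) at the start, then non-whitespace characters, ending in 苹果 (apple) or 香蕉 (banana).
$p = '/^我[^\s]+(苹果|香蕉)$/';
$str = "我喜欢吃苹果"; // "I like eating apples"
if (preg_match($p, $str)) {
    echo 'matched';
}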
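Several of the metacharacters listed above can be combined in one ASCII-only pattern. The sketch below uses invented sample strings; it is one possible illustration of anchors, alternation, grouping, and quantifiers working together.

```php
<?php
// ^ and $ anchor the whole subject; ( | ) offers two branches;
// s? makes the plural optional; \d{1,2} allows one or two digits.
$p = '/^\d{1,2} (apple|banana)s?$/';

var_dump(preg_match($p, '3 apples'));        // int(1)
var_dump(preg_match($p, '12 banana'));       // int(1)
var_dump(preg_match($p, '123 apples'));      // int(0): three digits
var_dump(preg_match($p, '3 apples please')); // int(0): text after the fruit
```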

Metacharacters fall into two groups: those recognized anywhere in the pattern outside square brackets, and those recognized inside square brackets. The ones recognized inside square brackets are:

\ escape character
^ negates the character class, but only when it is the first character inside the brackets
- marks a character range

Outside square brackets, ^ asserts the start of the subject; inside them, it negates the character class. The hyphen (-) inside square brackets marks a character range; for example, 0-9 means all digits from 0 to 9.

// \w matches a letter, a digit, or an underscore.
$p = '/[\w\.\-]+@[a-z0-9\-]+\.(com|cn)/';
$str = "My email is Spark.eric@imooc.com";
preg_match($p, $str, $match);
echo $match[0]; // Result: Spark.eric@imooc.com
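The special roles of ^ and - inside square brackets depend on their position. The following sketch, with sample strings chosen for illustration, shows a range, a negated class, and the positions where the two characters are taken literally.

```php
<?php
// [0-9] is a range; [^0-9] is its negation.
var_dump(preg_match('/^[0-9]+$/', '2016'));    // int(1): digits only
var_dump(preg_match('/^[0-9]+$/', '20a16'));   // int(0)
var_dump(preg_match('/[^0-9]/', '20a16', $m)); // int(1)
echo $m[0], "\n";                              // a

// A hyphen placed first or last in the class is a literal hyphen.
var_dump(preg_match('/^[a-z-]+$/', 'well-known')); // int(1)
// A ^ that is not the first character is a literal caret.
var_dump(preg_match('/^[a^]+$/', 'a^a'));          // int(1)
```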

Greedy mode and lazy mode

By default, each element of a pattern matches a single character. A quantifier such as + or * is greedy: it matches as many characters as possible. Adding a question mark (?) after the quantifier makes it lazy: it matches as few characters as possible.

Greedy mode: when the quantifier could either match or stop, it prefers to match.

// \d matches a digit
$p = '/\d+\-\d+/';
$str = "My phone number is 010-12345678";
preg_match($p, $str, $match);
echo $match[0]; // Result: 010-12345678

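Lazy mode: when the quantifier could either match or stop, it prefers to stop.

// The first \d+? still has to extend to 010 to reach the hyphen; the second \d+? stops after one digit.
$p = '/\d+?\-\d+?/';
$str = "My phone number is 010-12345678";
preg_match($p, $str, $match);
echo $match[0]; // Result: 010-1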

When the exact number of characters to match is known, use {} to specify the count:

$p = '/\d{3}\-\d{8}/';
$str = "My phone number is 010-12345678";
preg_match($p, $str, $match);
echo $match[0]; // Result: 010-12345678
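The difference between greedy and lazy quantifiers is easiest to see when the closing character occurs more than once. A minimal sketch, using an invented string with two quoted words:

```php
<?php
$str = '"one" and "two"';

// Greedy: .+ runs to the last quote it can reach.
preg_match('/".+"/', $str, $greedy);
echo $greedy[0], "\n"; // "one" and "two"

// Lazy: .+? stops at the first quote that completes the match.
preg_match('/".+?"/', $str, $lazy);
echo $lazy[0], "\n";   // "one"
```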

Use regular expressions for matching

Regular expressions offer a more flexible alternative to the string functions, and they are used for the same tasks: testing whether a substring exists, replacing strings, splitting strings, and extracting substrings that fit a pattern.

PHP performs these operations with the PCRE library functions: you define a pattern, then call the appropriate function to get the result.
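The four tasks map onto preg_match, preg_replace, and preg_split. The sketch below runs each of them against one made-up input string; the separator pattern is an assumption chosen for the example.

```php
<?php
$str = 'apple, banana ;cherry';

// Test whether a substring pattern exists.
var_dump(preg_match('/banana/', $str)); // int(1)

// Replace: normalize every separator to a single comma.
$normalized = preg_replace('/\s*[,;]\s*/', ',', $str);
echo $normalized, "\n"; // apple,banana,cherry

// Split on the same separators.
$parts = preg_split('/\s*[,;]\s*/', $str);
print_r($parts); // Array ( [0] => apple [1] => banana [2] => cherry )

// Extract the first word.
preg_match('/^\w+/', $str, $m);
echo $m[0], "\n"; // apple
```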

preg_match performs a single match. It can be used simply to test whether a pattern matches, or to retrieve the matched text. It returns the number of matches, which is 0 or 1, because it stops searching after the first match.

$subject = "abcdef";
$pattern = '/def/';
preg_match($pattern, $subject, $matches);
print_r($matches); // Result: Array ( [0] => def )

The code above only tests whether the literal def occurs. The real strength of regular expressions is pattern matching, so patterns like the following are more common:

$subject = "abcdef";
$pattern = '/a(.*?)d/';
preg_match($pattern, $subject, $matches);
print_r($matches); // Result: Array ( [0] => abcd [1] => bc )

Matching against a pattern in this way extracts more useful data.
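Subgroups make it possible to pull several fields out of a string in one call. The sketch below assumes a sample subject containing a date and uses array destructuring, which needs PHP 7.1 or later.

```php
<?php
$subject = 'Published 2016-08-08 09:20:07';
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';

if (preg_match($pattern, $subject, $matches) === 1) {
    // $matches[0] is the whole match; 1..3 are the subgroups in order.
    [$whole, $year, $month, $day] = $matches;
    echo "$whole => year $year, month $month, day $day\n";
}
// 2016-08-08 => year 2016, month 08, day 08
```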

Finding all matches

preg_match stops after the first match, but often all matches are needed. preg_match_all collects every match into an array.

$p = "|<[^>]+>(.*?)</[^>]+>|i";
$str = "<b>example: </b><div align=left>this is a test</div>";
preg_match_all($p, $str, $matches);
print_r($matches);

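preg_match_all can extract the data from a table:

$p = "/<tr><td>(.*?)<\/td>\s*<td>(.*?)<\/td>\s*<\/tr>/i";
$str = "<table>
<tr><td>Eric</td><td>25</td></tr>
<tr><td>John</td><td>26</td></tr>
</table>";
preg_match_all($p, $str, $matches);
print_r($matches);

In the result, $matches[0] holds all matches of the full pattern, $matches[1] holds all matches of the first subgroup, and so on.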
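The ordering just described is the default, PREG_PATTERN_ORDER. preg_match_all also accepts the PREG_SET_ORDER flag, which groups the results by match instead. A short sketch on an invented name=age string compares the two:

```php
<?php
$str = 'Eric=25; John=26';
$p = '/(\w+)=(\d+)/';

// Default (PREG_PATTERN_ORDER): one array per group.
preg_match_all($p, $str, $byGroup);
print_r($byGroup[1]); // Array ( [0] => Eric [1] => John )
print_r($byGroup[2]); // Array ( [0] => 25 [1] => 26 )

// PREG_SET_ORDER: one array per match, handy in a foreach.
$count = preg_match_all($p, $str, $byMatch, PREG_SET_ORDER);
echo $count, "\n"; // 2
foreach ($byMatch as $m) {
    echo $m[1], ' is ', $m[2], "\n";
}
// Eric is 25
// John is 26
```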

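Search and replace with regular expressions

Regular expression search and replace is useful for tasks such as reformatting a string or reordering the matched parts of a string.

For example, a date string can be reformatted easily:

$string = 'April 15, 2014';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '$3, ${1} $2';
echo preg_replace($pattern, $replacement, $string); // Result: 2014, April 15

Here ${1} is equivalent to $1 and refers to the text matched by the first subgroup; $2 refers to the second, and so on.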
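The braces matter when a group reference is followed directly by a digit. The sketch below uses a made-up weight string: '$10' would be read as a reference to group 10, while '${1}0' means group 1 followed by a literal 0.

```php
<?php
$pattern = '/(\d)kg/';
$subject = 'weight: 5kg';

// ${1}0 is group 1 followed by a literal "0".
$braced = preg_replace($pattern, '${1}0 kg', $subject);
echo $braced, "\n"; // weight: 50 kg

// $10 refers to group 10, which this pattern does not define,
// so the digit captured by group 1 is lost.
$unbraced = preg_replace($pattern, '$10 kg', $subject);
var_dump($unbraced === $braced); // bool(false)
```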

With more complex patterns, the content of the target string can be replaced more precisely.

$patterns = array('/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/', '/^\s*{(\w+)}\s*=/');
$replace = array('\3/\4/\1\2', '$\1 ='); // \3 is equivalent to $3, \4 to $4, and so on
echo preg_replace($patterns, $replace, '{startDate} = 1999-5-27'); // Result: $startDate = 5/27/1999
// In detail: (19|20) matches either 19 or 20, (\d{2}) matches two digits, and each (\d{1,2}) matches one or two digits. ^\s*{(\w+)}\s*= matches optional leading whitespace, a word wrapped in {}, optional whitespace, and then an = sign.

Regular expression replacement can also remove extra whitespace:

$str = 'one     two';
$str = preg_replace('/\s+/', ' ', $str);
echo $str; // Result: 'one two'
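The same replacement handles tabs and newlines, since \s covers all whitespace. As a small sketch on an invented string, combining it with trim also removes the whitespace at both ends:

```php
<?php
$str = "  one \t two\n three  ";
// Collapse each run of whitespace to one space, then trim the ends.
$clean = trim(preg_replace('/\s+/', ' ', $str));
var_dump($clean); // string(13) "one two three"
```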

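Common use cases for regular expressions

Regular expressions are commonly used for form validation. Some fields have format requirements: a username usually must consist of letters, digits, or underscores, and email addresses and phone numbers have their own rules. Regular expressions are well suited to validating such fields.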
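A minimal validation sketch might look like the following. The function names and the exact rules (a length of 3 to 16 for usernames, the 010-12345678 phone format, the .com/.cn email pattern reused from earlier in the article) are assumptions for illustration, not fixed requirements. The D modifier makes $ match only at the very end of the subject, so a trailing newline is rejected.

```php
<?php
// Hypothetical rule: 3 to 16 letters, digits, or underscores.
function isValidUsername(string $s): bool
{
    return preg_match('/^\w{3,16}$/D', $s) === 1;
}

// Hypothetical rule: 3-digit area code, hyphen, 8 digits.
function isValidPhone(string $s): bool
{
    return preg_match('/^\d{3}-\d{8}$/D', $s) === 1;
}

// Hypothetical rule: the article's email pattern, anchored at both ends.
function isValidEmail(string $s): bool
{
    return preg_match('/^[\w.\-]+@[a-z0-9\-]+\.(com|cn)$/iD', $s) === 1;
}

var_dump(isValidUsername('spark_eric'));        // bool(true)
var_dump(isValidUsername('ab'));                // bool(false): too short
var_dump(isValidPhone('010-12345678'));         // bool(true)
var_dump(isValidEmail('Spark.eric@imooc.com')); // bool(true)
var_dump(isValidEmail('not an email'));         // bool(false)
```

Anchoring with ^ and $ is what turns a search pattern into a validation rule: without the anchors, a string that merely contains a valid value would pass.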

Copyright notice: This article is the author's original work and may not be reproduced without the author's permission.
