


This article explains how preg_replace() in PHP replaces every substring that matches a regular expression.
Regular-expression replacement in PHP differs from JavaScript: by default, preg_replace() replaces all matches, not just the first one.
The data a program has to process is not always structured in advance with a database in mind, so it cannot always be stored and queried through a database schema.
Examples include parsing templates in a template engine and filtering spam or sensitive words.
In these cases we usually fall back on regular expressions: preg_match() to match and preg_replace() to replace, according to our own rules.
Most everyday applications, however, are little more than database CRUD, so there are few opportunities to work with regular expressions.
The two scenarios map onto the two operations: matching for statistics and analysis, replacing for processing.
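As a rough illustration of the two scenarios, the sketch below counts occurrences with preg_match_all() and then masks them with preg_replace(). The comment text and the word list are hypothetical and not taken from the article.

```php
<?php
// Hypothetical comment text and word list, used only for illustration.
$comment = 'Buy cheap pills now! Cheap pills, best price.';
$pattern = '/\b(cheap|pills)\b/i';

// Statistics: count how many times the listed words occur.
$hits = preg_match_all($pattern, $comment, $matches);
echo $hits, PHP_EOL; // 4

// Processing: mask every occurrence.
$clean = preg_replace($pattern, '***', $comment);
echo $clean, PHP_EOL; // Buy *** *** now! *** ***, best price.
```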
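preg_replace(pattern, replacement, subject, limit [maximum number of replacements; default -1, meaning unlimited], count [receives the number of replacements performed])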
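A minimal sketch of the last two parameters, assuming a made-up sample sentence: the fourth argument caps the number of replacements, and the fifth is filled in with how many were actually made.

```php
<?php
// Hypothetical sample text, not from the article.
$text = 'one fish, two fish, red fish, blue fish';

// Default limit (-1): every match is replaced; $count receives the total.
$all = preg_replace('/fish/', 'cat', $text, -1, $count);
echo $all, PHP_EOL;    // one cat, two cat, red cat, blue cat
echo $count, PHP_EOL;  // 4

// limit = 2: only the first two matches are replaced.
$firstTwo = preg_replace('/fish/', 'cat', $text, 2, $limitedCount);
echo $firstTwo, PHP_EOL;      // one cat, two cat, red fish, blue fish
echo $limitedCount, PHP_EOL;  // 2
```

Passing a limit of 1 gives roughly the behaviour of a JavaScript replace without the "g" flag.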
The regular expressions in most languages are similar, but there are also subtle differences.
PHP Regular Expression
Character | Description |
---|---|
\ | Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline character. The sequence "\\" matches "\" and "\(" matches "(". |
^ | Matches the beginning of the input string. In multiline mode (the m modifier in PHP), ^ also matches immediately after a newline. |
$ | Matches the end of the input string. In multiline mode (the m modifier in PHP), $ also matches immediately before a newline. |
* | Matches the preceding subexpression zero or more times. For example, "zo*" matches "z" and "zoo". * is equivalent to {0,}. |
+ | Matches the preceding subexpression one or more times. For example, "zo+" matches "zo" and "zoo", but not "z". + is equivalent to {1,}. |
? | Matches the preceding subexpression zero or one time. For example, "do(es)?" matches "do" and "does". ? is equivalent to {0,1}. |
{n} | n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but matches the two o's in "food". |
{n,} | n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob", but matches all the o's in "foooood". "o{1,}" is equivalent to "o+" and "o{0,}" is equivalent to "o*". |
{n,m} | m and n are non-negative integers, where n <= m. Matches at least n and at most m times. |
? | When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), matching becomes non-greedy. Non-greedy matching consumes as few characters as possible, while the default greedy matching consumes as many as possible. For example, in the string "oooo", "o+?" matches a single "o", while "o+" matches all the o's. |
. | Matches any single character except "\n". To match any character including "\n", use a pattern such as "[\s\S]" or add the s modifier. |
(pattern) | Matches pattern and captures the match. In PHP the captured text is available in the $matches array filled by preg_match(), and as $1, $2, ... in a preg_replace() replacement. To match literal parentheses, use "\(" or "\)". |
(?:pattern) | Matches pattern but does not capture the match, so it is not stored for later use. This is useful for combining parts of a pattern with the alternation character "|". For example, "industr(?:y|ies)" is more compact than "industry|industries". |
(?=pattern) | Positive lookahead. Matches at a position where the text that follows matches pattern. The lookahead itself is not captured. For example, "Windows(?=95|98|NT|2000)" matches "Windows" in "Windows2000", but not "Windows" in "Windows3.1". A lookahead does not consume characters: the search for the next match starts right after the last match, not after the characters the lookahead examined. |
(?!pattern) | Negative lookahead. Matches at a position where the text that follows does not match pattern. The lookahead itself is not captured. For example, "Windows(?!95|98|NT|2000)" matches "Windows" in "Windows3.1", but not "Windows" in "Windows2000". |
(?<!pattern) | Negative lookbehind. Like negative lookahead, but looking in the opposite direction. For example, "(?<!95|98|NT|2000)Windows" matches "Windows" in "3.1Windows", but not "Windows" in "2000Windows". |
x|y | Matches x or y. For example, "z|food" matches "z" or "food". "(z|f)ood" matches "zood" or "food". |
[xyz] | Character set. Matches any one of the characters listed. For example, "[abc]" matches the "a" in "plain". |
[^xyz] | Negated character set. Matches any character not listed. For example, "[^abc]" matches "p", "l", "i" and "n" in "plain". |
[a-z] | Character range. Matches any character within the specified range. For example, "[a-z]" matches any lowercase letter from "a" through "z". Note: a hyphen denotes a range only when it appears inside the brackets between two characters; at the beginning of the set it stands for the hyphen itself. |
[^a-z] | Negated character range. Matches any character not within the specified range. For example, "[^a-z]" matches any character that is not in the range "a" through "z". |
\b | Matches a word boundary, that is, the position between a word character and a non-word character. For example, "er\b" matches the "er" in "never" but not the "er" in "verb". |
\B | Matches a non-word boundary. "er\B" matches the "er" in "verb", but not the "er" in "never". |
\cx | Matches the control character specified by x. For example, \cM matches Control-M, the carriage return character. x should be one of A-Z or a-z. |
\d | Matches a digit. Equivalent to [0-9]. |
\D | Matches a non-digit character. Equivalent to [^0-9]. |
\f | Matches a form feed character. Equivalent to \x0c and \cL. |
\n | Matches a newline character. Equivalent to \x0a and \cJ. |
\r | Matches a carriage return character. Equivalent to \x0d and \cM. |
\s | Matches any whitespace character, including spaces, tabs, form feeds, etc. Equivalent to [ \f\n\r\t\x0b]. |
\S | Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\x0b]. |
\t | Matches a tab character. Equivalent to \x09 and \cI. |
\v | Matches a vertical tab character (\x0b, \cK). In PCRE, the engine behind preg_replace(), \v matches any vertical whitespace character, of which the vertical tab is one. |
\w | Matches any word character, including the underscore. Equivalent to "[A-Za-z0-9_]". |
\W | Matches any non-word character. Equivalent to "[^A-Za-z0-9_]". |
\xn | Matches n, where n is a hexadecimal escape value of up to two digits. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". This allows ASCII codes to be used in regular expressions. |
\num | Matches num, where num is a positive integer: a backreference to a captured match. For example, "(.)\1" matches two consecutive identical characters. |
\n | Identifies an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value. |
\nm | Identifies an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither condition holds and n and m are both octal digits (0-7), \nm matches the octal escape value nm. |
\nml | If n, m and l are all octal digits (0-7), matches the octal escape value nml. |
\un | In JavaScript, matches the Unicode character given by four hexadecimal digits; for example, \u00A9 matches the copyright symbol (©). PCRE does not support \u: in PHP, write \x{00A9} together with the u modifier instead. |
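A few rows of the table can be tried directly with preg_replace(). The sketch below reuses the table's own examples (greedy versus non-greedy, lookahead, non-capturing group); the replacement strings and the sample subjects around them are made up for illustration.

```php
<?php
// Greedy vs. non-greedy: "o+" takes all four o's at once, "o+?" takes one at a time.
$greedy = preg_replace('/o+/', 'X', 'oooo');
$lazy   = preg_replace('/o+?/', 'X', 'oooo');
echo $greedy, PHP_EOL; // X
echo $lazy, PHP_EOL;   // XXXX

// Lookahead: shorten "Windows" only when it is followed by 95, 98, NT or 2000.
$lookahead = preg_replace('/Windows(?=95|98|NT|2000)/', 'Win', 'Windows2000 Windows3.1');
echo $lookahead, PHP_EOL; // Win2000 Windows3.1

// Non-capturing group: the alternation stays local and no $1 is created.
$grouped = preg_replace('/industr(?:y|ies)/', 'sector', 'industry and industries');
echo $grouped, PHP_EOL; // sector and sector
```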
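The table above is a fairly complete reference. Every regex character in it has a special meaning and no longer stands for itself. For example, "+" in a regular expression does not mean a plus sign; it means "match one or more times". To make "+" stand for a literal plus sign, escape it with a preceding "\", that is, write "\+".
To match the text 1+1=2, the regular expression is 1\+1=2. The unescaped expression 1+1=2 instead means "one or more 1s, followed by 1=2", so it matches 11=2, 111=2, 1111=2, and so on.
In other words, every regex metacharacter has a specific meaning, and to use it literally you must put "\" in front of it. Escaping a punctuation character that is not a metacharacter is harmless.
The expression for 1+1=2 can therefore also be written as 1\+1\=2.
Escaping every character is not recommended, and letters and digits must not be escaped at all: "\1" and "\2" are backreferences, and "\d", "\w" and similar sequences are character classes.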
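When the literal text comes from a variable, PHP's built-in preg_quote() does the escaping. The sketch below assumes "/" as the delimiter and uses a made-up subject string.

```php
<?php
// A literal string that happens to contain regex metacharacters.
$literal = '1+1=2';

// preg_quote() escapes the metacharacters; the second argument also escapes the delimiter.
$pattern = '/' . preg_quote($literal, '/') . '/';
echo $pattern, PHP_EOL; // /1\+1\=2/

$result = preg_replace($pattern, '[sum]', '1+1=2, but 11=2 is wrong');
echo $result, PHP_EOL; // [sum], but 11=2 is wrong
```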
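A regular expression must be enclosed in delimiters. In JavaScript the delimiter is "/". In PHP, "/" is the most common delimiter, "#" can be used as well, and the whole pattern, delimiters included, must be wrapped in quotes as a string.
If the regular expression itself contains the delimiter character, that character must be escaped.
PHP regular expression delimiters
JavaScript and Perl write regular expressions between "/" delimiters. PHP accepts other delimiters too, such as "#" (any character that is not alphanumeric, a backslash or whitespace is allowed). If the pattern contains many "/" characters, each one has to be escaped when "/" is the delimiter, whereas with "#" no escaping is needed, which is more concise.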
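<?php
$weigeti = 'The W3CSchool online tutorial is at http://e.jb51.net/ . Can you replace this URL with the correct one?';
// Goal: replace http://e.jb51.net/ with http://e.jb51.net/w3c/
// "." is a regex metacharacter and must be escaped. "/" is the delimiter here,
// so every "/" inside the pattern must be escaped too. ":" and "-" are ordinary
// characters outside a character class, so escaping them is optional.
echo preg_replace('/http:\/\/e\.jb51\.net\//', 'http://e.jb51.net/w3c/', $weigeti), "\n";
// With "#" as the delimiter, "/" has no special role and needs no escaping.
echo preg_replace('#http://e\.jb51\.net/#', 'http://e.jb51.net/w3c/', $weigeti), "\n";
// Both lines print:
// The W3CSchool online tutorial is at http://e.jb51.net/w3c/ . Can you replace this URL with the correct one?
?>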
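As the two replacements above show, when a pattern contains many "/" characters, either "/" or "#" works as the delimiter, but "#" makes the code cleaner. Even so, E维科技 recommends sticking with "/", because regex literals in JavaScript and some other languages can only use "/", and the habit then carries over between languages.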
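PHP regular expression modifiers
Modifiers are placed after the closing delimiter of the pattern, before the closing quote of the string.
Modifier | Meaning |
---|---|
i | Case-insensitive matching. |
m | Multiline: ^ and $ match at the start and end of each line. If the subject contains no newline ("\n"), this behaves like an ordinary pattern. |
s | Lets "." match a newline ("\n") as well. Without it, "." does not match "\n". |
x | Ignores unescaped whitespace in the pattern. |
e | Runs the replacement through eval() so that a function can be applied to the matched text. Deprecated in PHP 5.5 and removed in PHP 7.0; use preg_replace_callback() instead. |
A | Anchors the pattern so that it matches only at the start of the subject string. |
D | Makes $ match only at the very end of the subject. Without D, $ also matches just before a final newline. D is ignored if m is set. |
S | Spends extra time analysing the pattern, which can help non-anchored patterns. |
U | Inverts greediness: quantifiers are non-greedy by default and become greedy again when followed by "?". |
X | Enables extra PCRE features that are incompatible with Perl. |
u | Treats the pattern and the subject as UTF-8. It is needed when multibyte characters must be handled as whole characters, and the subject must be valid UTF-8 or the call fails. E维科技 reports having run into a bug with this modifier but gives no details. |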
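Two of the modifiers are easy to try in isolation. The sketch below is an illustration with made-up subject strings: x allows a pattern to be laid out over several lines with comments, and i ignores case.

```php
<?php
// x: whitespace in the pattern is ignored and "#" starts a comment.
$datePattern = '/
    \d{4}   # year
    -
    \d{2}   # month
    -
    \d{2}   # day
/x';
$masked = preg_replace($datePattern, '[date]', 'Released on 2018-06-15');
echo $masked, PHP_EOL; // Released on [date]

// i: "php" matches regardless of case.
$upper = preg_replace('/php/i', 'PHP', 'Php and pHp');
echo $upper, PHP_EOL; // PHP and PHP
```

Modifiers can be combined by writing them one after another, for example "/.../ix".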
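If you know JavaScript regular expressions, you will be familiar with the "g" modifier, which makes a replacement apply to every match. PHP's preg_replace() already replaces every match by default, so PHP has no "g" modifier.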
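PHP regular expressions: Chinese text and case-insensitivity
preg_replace() is case-sensitive by default, and unless told otherwise it processes the pattern and the subject byte by byte rather than character by character. Add the i modifier to ignore case, and the u modifier to treat Chinese and other multibyte UTF-8 text as whole characters.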
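<?php
// 网址 means "URL"; the sample text names the site PHP中文网 and its tutorial URL.
$weigeti = 'PHP中文网 在线教程网址:http://www.php.cn/';
// Case differs ("php" vs. "PHP"), so nothing is replaced.
echo preg_replace('/php中文网/', 'php', $weigeti), "\n";
// Prints: PHP中文网 在线教程网址:http://www.php.cn/
// With i, case is ignored and the replacement happens.
echo preg_replace('/php中文网/i', 'php', $weigeti), "\n";
// Prints: php 在线教程网址:http://www.php.cn/
// With u, the pattern and the subject are treated as UTF-8.
echo preg_replace('/网址/u', '', $weigeti), "\n";
// Prints: PHP中文网 在线教程:http://www.php.cn/
?>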
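In PHP, both case and multibyte text such as Chinese need attention. JavaScript regular expressions are also case-sensitive and likewise use the i modifier to ignore case, but JavaScript does not need to be told that the text contains Chinese or other special characters: it matches them directly.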
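The effect of the u modifier is easiest to see by counting what "." matches. This is a minimal sketch that assumes the source file is saved as UTF-8, where each of the two sample characters occupies three bytes.

```php
<?php
$text = '中文'; // two characters, six bytes in UTF-8

// Without u, "." consumes one byte at a time.
$byteCount = preg_match_all('/./', $text);
echo $byteCount, PHP_EOL; // 6

// With u, "." consumes one whole character.
$charCount = preg_match_all('/./u', $text);
echo $charCount, PHP_EOL; // 2

// Replacing each character with "*" gives the expected length only with u.
$stars = preg_replace('/./u', '*', $text);
echo $stars, PHP_EOL; // **
```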
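PHP regular expressions and newlines: an example
When a subject contains newline characters, PHP treats each newline as just another character in the middle of the string. The wildcard "." does not match "\n", however, so there are several points to watch when matching strings that contain newlines.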
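<?php
$weigeti = "PHP.cn\nIS\nLOVING\nYOU";
// Goal: reduce $weigeti to "PHP.cn".

// No modifier: the pattern needs an uppercase letter at the start and at the end
// of the whole string, but "." cannot cross "\n", so nothing matches.
echo preg_replace('/^[A-Z].*[A-Z]$/', '', $weigeti), "\n";
// Prints the string unchanged: PHP.cn, IS, LOVING, YOU on four lines.

// With s, "." also matches "\n". The string starts with "P" and ends with "U",
// so the whole string matches and is removed.
echo preg_replace('/^[A-Z].*[A-Z]$/s', '', $weigeti), "\n";
// Prints an empty string.

// With m, ^ and $ apply to each line separately, roughly equivalent to:
/*
$p = '/^[A-Z].*[A-Z]$/';
$a = preg_replace($p, '', 'PHP.cn');  // unchanged: the line ends in lowercase "n"
$b = preg_replace($p, '', 'IS');      // ''
$c = preg_replace($p, '', 'LOVING');  // ''
$d = preg_replace($p, '', 'YOU');     // ''
$preg_m === implode("\n", [$a, $b, $c, $d]);
*/
echo preg_replace('/^[A-Z].*[A-Z]$/m', '', $weigeti), "\n";
// Prints "PHP.cn" followed by the three emptied lines.
?>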
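When you fetch a website's content with PHP and run batch regex replacements on it, the fetched content will almost inevitably contain newlines, so take them into account when writing replacement patterns.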
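Two newline details tend to matter with fetched content: "$" also matches just before a trailing newline, and line endings may be a mix of "\r\n", "\r" and "\n". The sketch below uses made-up sample strings and relies on the D modifier and the \R escape from PCRE.

```php
<?php
$line = "abc\n"; // e.g. one line of fetched content, trailing newline included

// "$" also matches just before a final newline, so this succeeds.
$loose = preg_match('/^abc$/', $line);
echo $loose, PHP_EOL; // 1

// With D, "$" matches only at the very end of the subject, so this fails.
$strict = preg_match('/^abc$/D', $line);
echo $strict, PHP_EOL; // 0

// \R matches "\r\n", "\n" or "\r", so line endings can be normalised first.
$normalised = preg_replace('/\R/', "\n", "a\r\nb\rc\n");
var_dump($normalised === "a\nb\nc\n"); // bool(true)
```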
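Running a function on the matched text
PHP regex replacement historically supported an e modifier, which passed the replacement through eval() so that a function could be run on the matched content. The e modifier was deprecated in PHP 5.5 and removed in PHP 7.0; preg_replace_callback() is its replacement.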
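<?php
// The URL is written in uppercase here so that the conversion is visible.
$weigeti = 'W3CSchool 在线教程网址:http://WWW.JB51.NET/ ,你Jbzj!了吗?';
// Goal: convert the URL above to lowercase.
// Legacy form (PHP < 7.0 only), using the e modifier:
//   preg_replace('/(http\:[\/\w\.\-]+\/)/e', 'strtolower("$1")', $weigeti);
// Current form: preg_replace_callback() passes the matches to a function.
echo preg_replace_callback(
    '/(http\:[\/\w\.\-]+\/)/',
    function ($m) { return strtolower($m[1]); },
    $weigeti
);
// Prints: W3CSchool 在线教程网址:http://www.jb51.net/ ,你Jbzj!了吗?
?>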
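With the e modifier, strtolower("$1") is still executed through eval() even though it appears inside a quoted string. Evaluating text as code in this way is a security risk, which is one reason the modifier was removed in PHP 7.0.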
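Backreferences in the replacement string
If you know JavaScript, you will be familiar with the backreferences $1, $2, $3 and so on. PHP accepts the same notation in the replacement string, and also accepts \1 (written '\1' or '\\1' in a single-quoted PHP string).
The idea of a backreference is this: the pattern matches one large fragment, parentheses inside the pattern divide it into smaller captured groups, and each group can then be referred to by its number, counted in the order of the opening parentheses.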
<?php
// 网址 means "URL"; the pattern below relies on this literal text in the subject.
$weigeti = 'W3CSchool 在线教程网址:http://www.jb51.net/ ,你Jbzj!了吗?';
echo preg_replace('/.+(http\:[\w\-\/\.]+\/)[^\w\-\!]+([\w\-\!]+).+/', '$1', $weigeti), "\n";
echo preg_replace('/.+(http\:[\w\-\/\.]+\/)[^\w\-\!]+([\w\-\!]+).+/', '\1', $weigeti), "\n";
echo preg_replace('/.+(http\:[\w\-\/\.]+\/)[^\w\-\!]+([\w\-\!]+).+/', '\\1', $weigeti), "\n";
// All three print: http://www.jb51.net/

echo preg_replace('/^(.+)网址:(http\:[\w\-\/\.]+\/)[^\w\-\!]+([\w\-\!]+).+$/', 'Section: $1<br>URL: $2<br>Brand: $3', $weigeti), "\n";
// Prints: Section: W3CSchool 在线教程<br>URL: http://www.jb51.net/<br>Brand: Jbzj!

// Nested parentheses: groups are numbered by their opening parenthesis, outer group first.
echo preg_replace('/^((.+)网址:(http\:[\w\-\/\.]+\/)[^\w\-\!]+([\w\-\!]+).+)$/', 'Original: $1<br>Section: $2<br>URL: $3<br>Brand: $4', $weigeti), "\n";
// Prints: Original: W3CSchool 在线教程网址:http://www.jb51.net/ ,你Jbzj!了吗?<br>Section: W3CSchool 在线教程<br>URL: http://www.jb51.net/<br>Brand: Jbzj!
?>
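A few further details of replacement references can be sketched with short, made-up inputs: ${1} separates a group number from a digit that follows it, groups can be reordered freely, and $0 stands for the whole match.

```php
<?php
// "$10" would mean group 10; "${1}0" means group 1 followed by a literal 0.
$scaled = preg_replace('/(\d+)px/', '${1}0px', 'width: 20px');
echo $scaled, PHP_EOL; // width: 200px

// Reordering captured groups: ISO date to day/month/year.
$date = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', '2018-06-15');
echo $date, PHP_EOL; // 15/06/2018

// $0 refers to the whole match.
$wrapped = preg_replace('/\d+/', '[$0]', 'a1b22');
echo $wrapped, PHP_EOL; // a[1]b[22]
```

The replacement strings are single-quoted so that PHP itself does not try to interpolate $1 or ${1} as variables.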