A brief discussion on regular expressions

WBOY
2016-07-13 10:21:34

1. What is a regular expression?

Simply put, a regular expression is a language for describing string-matching patterns.

A regular expression describes a pattern that can be used to check whether a string contains a certain substring, and to extract or replace the substrings that match.

2. Applications of regular expressions

Regular expressions are very practical in day-to-day development and can quickly solve complex string-processing problems. Their applications fall roughly into three types:

The first type: data verification

For example, to verify whether a string is a valid email address, telephone number, IP address, and so on, a regular expression is very convenient.

The second type: content search

For example, to grab a picture from a web page you must first find its <img> tag, and a regular expression can match it accurately.

The third type: content replacement

For example, to hide the middle four digits of a mobile phone number so that it reads 123****4567, a regular expression is very convenient.
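
The three types of application above can be sketched in JavaScript as follows. This is only an illustrative sketch: the helper names (isIPv4, findImageSources, maskPhone) are hypothetical, the IPv4 check is deliberately simple, and the phone number is assumed to have exactly 11 digits.

```javascript
// Hypothetical helpers, one per type of application described above.

// 1. Data verification: four dot-separated numbers, each in the range 0-255.
const octet = '(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)';
const ipv4Pattern = new RegExp(`^${octet}(?:\\.${octet}){3}$`);
function isIPv4(text) {
  return ipv4Pattern.test(text);
}

// 2. Content search: collect the src attribute of every <img> tag.
function findImageSources(html) {
  const imgPattern = /<img\b[^>]*\bsrc=["']([^"']+)["'][^>]*>/gi;
  return Array.from(html.matchAll(imgPattern), (m) => m[1]);
}

// 3. Content replacement: hide the middle four digits of an 11-digit number.
function maskPhone(phone) {
  return phone.replace(/^(\d{3})\d{4}(\d{4})$/, '$1****$2');
}

console.log(isIPv4('192.168.0.1'));
console.log(findImageSources('<p><img src="a.png" alt="x"></p>'));
console.log(maskPhone('12300004567'));
```

For real HTML, a proper parser is usually more robust than a regular expression; the search helper is only meant to show the idea.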

3. The building blocks of regular expressions

The main elements of regular expressions are briefly introduced below:

1. Several important concepts of regular expressions

  • Subexpression: the part of a regular expression enclosed in "()" is called a "subexpression"
  • Capture: the text matched by a subexpression is stored by the system in a buffer; this process is called "capture"
  • Back reference: "\n", where n is a number (for example \1 or \2), refers to the content of the n-th buffer captured earlier in the pattern; this is called a "back reference"
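
A small JavaScript sketch of these three concepts, using an assumed sample sentence: the subexpression (\w+) captures a word, and the back reference \1 requires the same word to appear again.

```javascript
// (\w+) is a subexpression; what it matches is captured in buffer 1.
// \1 is a back reference to that captured text.
const repeatedWord = /\b(\w+) \1\b/;

const sentence = 'this is is a test';
const found = repeatedWord.exec(sentence);

console.log(found[0]); // the whole match
console.log(found[1]); // the text captured by subexpression 1

// In a replacement string, $1 refers to the same captured text.
const deduplicated = sentence.replace(/\b(\w+) \1\b/g, '$1');
console.log(deduplicated);
```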

2. Quantifiers

  • X+ means: 1 or more
  • X* means: 0 or more
  • X? means: 0 or 1
  • X{n} means: exactly n
  • X{n,} means: at least n
  • X{n,m} means: from n to m

Quantifiers are greedy by default and match as many characters as possible. Adding a ? after a quantifier (for example X+? or X{n,m}?) makes it non-greedy, so that it matches as few characters as possible.

Note: X represents the character to be matched
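
The difference between greedy and non-greedy matching is easiest to see on a concrete string. The following JavaScript sketch uses assumed sample inputs:

```javascript
const markup = '<b>one</b><b>two</b>';

// Greedy: .+ runs as far as possible, so the match spans both elements.
const greedyMatch = markup.match(/<b>.+<\/b>/)[0];

// Non-greedy: .+? stops at the first closing tag it can reach.
const lazyMatch = markup.match(/<b>.+?<\/b>/)[0];

// {n,m}: accept "a", then one or two "b", then "c".
const candidates = ['ac', 'abc', 'abbc', 'abbbc'];
const accepted = candidates.filter((s) => /^ab{1,2}c$/.test(s));

console.log(greedyMatch);
console.log(lazyMatch);
console.log(accepted);
```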

3. Character classes

  • \d means: match a digit, [0-9]
  • \D means: match a non-digit character, [^0-9]
  • \w means: match a word character, including the underscore, [0-9a-zA-Z_]
  • \W means: match any non-word character, [^0-9a-zA-Z_]
  • \s means: match any whitespace character, such as a space, tab, carriage return, or newline
  • \S means: match any non-whitespace character
  • . means: match any single character except a newline

In addition, there are the following bracket expressions:

Range of characters: [a-z], [A-Z], [0-9], [0-9a-z], [0-9a-zA-Z]
Any one of the listed characters: [abcd], [1234]
Any character except those listed: [^a-z], [^0-9], [^abcd]
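
A short JavaScript sketch of the character classes above, run against an assumed sample string:

```javascript
const sample = 'Order #42: 3 items, total_cost = 19.99 USD';

// \d+ collects runs of digits; the dot in 19.99 is not a digit.
const digitRuns = sample.match(/\d+/g);

// [A-Z]+ collects runs of uppercase letters.
const upperRuns = sample.match(/[A-Z]+/g);

// \w includes the underscore, so total_cost stays in one piece.
const hasSnakeCaseWord = /\btotal_cost\b/.test(sample) && sample.match(/\w+/g).includes('total_cost');

// [^aeiou] is a negated class: delete everything that is not a vowel.
const vowelsOnly = 'banana'.replace(/[^aeiou]/g, '');

// . matches any single character except a newline.
const dotMatchesDash = /a.c/.test('a-c');
const dotMatchesNewline = /a.c/.test('a\nc');

console.log(digitRuns, upperRuns, hasSnakeCaseWord, vowelsOnly, dotMatchesDash, dotMatchesNewline);
```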

4. Anchors

  • ^ means: start of the string
  • $ means: end of the string
  • \b means: word boundary
  • \B means: non-word boundary
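
The anchors match positions rather than characters. A minimal JavaScript sketch, with assumed sample strings:

```javascript
// ^ and $ pin the pattern to the start and end of the string.
const exactCat = /^cat$/;
const isExactly = exactCat.test('cat');
const isInside = exactCat.test('concat');

const phrase = 'cat concat cats';

// \b requires a word boundary on both sides, so only the standalone word matches.
const wholeWords = phrase.match(/\bcat\b/g);

// \B requires that there is NO boundary before "cat", so only "concat" is affected.
const insideWords = phrase.replace(/\Bcat/g, 'CAT');

console.log(isExactly, isInside, wholeWords, insideWords);
```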

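5. Escape character

  • \ is used to match characters that otherwise have a special meaning; for example, \. matches a literal dot

6. Alternation

  • | matches any one of several alternatives; for example, a|b matches a or b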
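
A brief JavaScript sketch of escaping and alternation, using assumed sample strings. It also shows a common pitfall: | has low precedence, so anchors apply to only one alternative unless the alternatives are grouped.

```javascript
// Unescaped, the dot matches any character; escaped, only a literal dot.
const unescaped = /^3.14$/.test('3x14');
const escaped = /^3\.14$/.test('3x14');
const escapedLiteral = /^3\.14$/.test('3.14');

// Alternation inside a group: accept any of three file extensions.
const imageExtension = /\.(jpg|png|gif)$/i;
const isImage = imageExtension.test('photo.PNG');
const isText = imageExtension.test('notes.txt');

// Without a group this reads as (^cat) or (dog$).
const ungrouped = /^cat|dog$/.test('catalog');
const grouped = /^(cat|dog)$/.test('catalog');

console.log(unescaped, escaped, escapedLiteral, isImage, isText, ungrouped, grouped);
```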

7. Special usage

  • (?=...): positive lookahead: matches only at a position that is followed by the specified content
  • (?!...): negative lookahead: matches only at a position that is not followed by the specified content
  • (?:...): non-capturing group: groups the content without placing what it matches in a buffer
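
The following JavaScript sketch illustrates the three constructs with assumed sample strings. Note that a lookahead only checks what follows; the checked text is not part of the match.

```javascript
// Positive lookahead: "Windows " only when followed by one of the listed versions.
const followedByVersion = /Windows (?=95|98|NT|2000)/;
const positiveHit = followedByVersion.exec('Windows 2000');
const positiveMiss = followedByVersion.exec('Windows 3.1');

// Negative lookahead: "Windows " only when NOT followed by one of them.
const notFollowedByVersion = /Windows (?!95|98|NT|2000)/;
const negativeHit = notFollowedByVersion.test('Windows 3.1');
const negativeMiss = notFollowedByVersion.test('Windows 2000');

// Non-capturing group: (?:Mr|Ms) groups the alternatives without using a buffer,
// so the surname is capture 1 rather than capture 2.
const withNonCapturing = /(?:Mr|Ms)\. (\w+)/.exec('Ms. Lovelace');
const withCapturing = /(Mr|Ms)\. (\w+)/.exec('Ms. Lovelace');

console.log(positiveHit[0], positiveMiss, negativeHit, negativeMiss);
console.log(withNonCapturing.length, withCapturing.length);
```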

4. How to use regular expressions in JavaScript

There are two ways to use regular expressions in JavaScript:

The first method: use the RegExp class

The methods provided are:

  • test(str): checks whether str contains a match for the pattern; returns true or false
  • exec(str): returns the match found in str as an array, or null if there is no match

If the regular expression contains subexpressions, the array returned by exec holds the whole match in result[0], the match of subexpression 1 in result[1], and so on.
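
A minimal sketch of test and exec, assuming a date pattern and sample text chosen only for illustration:

```javascript
// Built with the RegExp constructor, so each backslash is doubled in the string.
const datePattern = new RegExp('(\\d{4})-(\\d{2})-(\\d{2})');

const line = 'Published 2016-07-13';

// test only answers yes or no.
const containsDate = datePattern.test(line);

// exec returns an array: result[0] is the whole match, result[1..3] the subexpressions.
const result = datePattern.exec(line);
const [whole, year, month, day] = result;

// exec returns null when nothing matches.
const noResult = datePattern.exec('no date here');

console.log(containsDate, whole, year, month, day, result.index, noResult);
```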

The second method: use the String class

The methods provided are:

  • search: returns the position of the first match of the pattern, or -1 if there is none
  • match: returns the text matched by the pattern as an array, or null if there is no match
  • replace: replaces the text matched by the pattern
  • split: splits the string using matches of the pattern as delimiters and returns an array
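
The four String methods can be tried on a single assumed sample string:

```javascript
const fruitList = 'apple, banana;cherry  date';

// search: index of the first match, or -1.
const bananaAt = fruitList.search(/banana/);
const kiwiAt = fruitList.search(/kiwi/);

// match: with the g flag, an array of every match; null when nothing matches.
const names = fruitList.match(/[a-z]+/g);
const numbers = fruitList.match(/\d+/);

// replace: normalise every separator to " | ".
const normalised = fruitList.replace(/[,;]\s*|\s+/g, ' | ');

// split: cut on any run of commas, semicolons, or whitespace.
const parts = fruitList.split(/[,;\s]+/);

console.log(bananaAt, kiwiAt, names, numbers, normalised, parts);
```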

5. How to use regular expressions in PHP

PHP provides two families of regular expression functions:

The first: Perl-compatible regular expression (PCRE) functions

The functions provided are:

  • preg_grep -- return the array elements that match the pattern
  • preg_match_all -- perform a global regular expression match
  • preg_match -- perform a regular expression match
  • preg_quote -- escape regular expression characters
  • preg_replace_callback -- perform a regular expression search and replace using a callback function
  • preg_replace -- perform a regular expression search and replace
  • preg_split -- split a string using a regular expression

The second: POSIX regular expression functions

The functions provided are:

  • ereg_replace -- replace using a regular expression
  • ereg -- perform a regular expression match
  • eregi_replace -- replace using a case-insensitive regular expression
  • eregi -- perform a case-insensitive regular expression match
  • split -- split a string into an array using a regular expression
  • spliti -- split a string into an array using a case-insensitive regular expression
  • sql_regcase -- generate a regular expression for case-insensitive matching

Note: the POSIX functions were deprecated in PHP 5.3 and removed in PHP 7, so new code should use the preg_ functions.

6. Summary

A regular expression is a tool for getting a job done. This tool is:

1. Powerful

Different combinations of the qualifiers in a regular expression achieve different results. A complex task sometimes requires a long regular expression, and writing one that matches accurately is a test of a programmer's skill.

2. Simple and convenient

An ordinary string search can only find one specific string, whereas a regular expression can perform a fuzzy search. This is faster and more convenient, and all it requires is a single pattern string.

3. Supported by almost all languages

Mainstream languages such as Java, PHP, JavaScript, C#, and C++ all support regular expressions.

4. Easy to learn, hard to master

The basics of regular expressions are quick to learn, but writing efficient and accurate regular expressions in real development takes a long period of practice and accumulated experience.

Regular expression

Regular expressions are often used in JavaScript to validate mobile phone numbers, email addresses, and the like, achieving powerful results by simple means.

Symbol explanation

Character  Description
\ Marks the next character as a special character, a literal character, a back reference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a newline character. The sequence '\\' matches "\" and '\(' matches "(".
^ Matches the beginning of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$ Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
* Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+ Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
? Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}.
{n} n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,} n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+', and 'o{0,}' is equivalent to 'o*'.
{n,m} m and n are non-negative integers, where n <= m. Matches at least n and at most m times.
? When this character immediately follows any of the other qualifiers (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. The non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much as possible. For example, for the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
. Matches any single character except "\n". To match any character including '\n', use a pattern like '[\s\S]'.
x|y Matches x or y. For example, 'z...




What is a regular expression? Give an example

Regular expressions are now widely used in many kinds of software, including operating systems such as *nix (Linux, Unix, etc.) and HP, development environments such as PHP, C#, and Java, and a great many applications.

Regular expressions achieve powerful results in a simple way. The price of that compactness is that the syntax is terse and takes some effort to learn. Once past the basics, though, regular expressions are fairly simple and effective to use with a reference at hand.

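Example: ^.+@.+\\..+$ (written as it would appear inside a string literal, where the backslash is doubled; the pattern itself is ^.+@.+\..+$, a loose match for an email-like string)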
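
As a rough sketch, the example pattern can be tried in JavaScript, where a regex literal needs only a single backslash before the dot. The sample inputs are assumptions, and the last one shows how loose the pattern is:

```javascript
// The example pattern as a regex literal: something, "@", something, ".", something.
const looseEmail = /^.+@.+\..+$/;

const acceptsAddress = looseEmail.test('user@example.com');
const rejectsNoDot = looseEmail.test('user@example');
const rejectsNoAt = looseEmail.test('not an email');

// .+ accepts any characters, including spaces, so this also passes.
const acceptsOddInput = looseEmail.test('a b@c d.e');

console.log(acceptsAddress, rejectsNoDot, rejectsNoAt, acceptsOddInput);
```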

2. History of regular expressions

The "ancestors" of regular expressions can be traced all the way back to early research into how the human nervous system works. Two neurophysiologists, Warren McCulloch and Walter Pitts, developed a mathematical way to describe these neural networks.

In 1956, the mathematician Stephen Kleene, building on the early work of McCulloch and Pitts, published a paper titled "Representation of Events in Nerve Nets and Finite Automata", which introduced the concept of regular expressions. Regular expressions are expressions that describe what he called "the algebra of regular sets", hence the term "regular expression".

Subsequently, this work was found to be applicable to early research on computational search algorithms by Ken Thompson, the principal inventor of Unix. The first practical application of regular expressions was the qed editor in Unix.

The rest, as they say, is history. From that time until now, regular expressions have been an important part of text-based editors and search tools.

3. Regular expression definition

A regular expression describes a string-matching pattern. The pattern can be used to check whether a string contains a certain substring, to replace the matching substring, or to extract from a string the substrings that meet a certain condition.

When listing directories, the *.txt in dir *.txt or ls *.txt is not a regular expression, because the meaning of * there is different from the meaning of * in a regular expression.

A regular expression is a text pattern composed of ordinary characters (such as the characters a through z) and special characters called metacharacters. It acts as a template that is matched against the string being searched.

3.1 Ordinary characters

Ordinary characters consist of all printing and non-printing characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some symbols.

3.2 Non-printing characters

Character  Meaning
\cx Matches the control character specified by x. For example, \cM matches a Control-M or carriage return character. The value of x must be one of A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\f Matches a form feed. Equivalent to \x0c and \cL.
\n Matches a newline character. Equivalent to \x0a and \cJ.
\r Matches a carriage return character. Equivalent to \x0d and \cM.
\s Matches any whitespace character, including spaces, tabs, form feeds, etc. Equivalent to [ \f\n\r\t\v].
\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t Matches a tab character. Equivalent to \x09 and \cI.
\v Matches a vertical tab character. Equivalent to \x0b and \cK.
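
The equivalences in this table can be checked with a short JavaScript sketch. The variable names are illustrative only, and the check assumes JavaScript's own regular expression syntax:

```javascript
// Each row: the character, its \xNN form, and its \cX form from the table above.
const equivalents = [
  ['\f', /\x0c/, /\cL/], // form feed
  ['\n', /\x0a/, /\cJ/], // newline
  ['\r', /\x0d/, /\cM/], // carriage return
  ['\t', /\x09/, /\cI/], // tab
  ['\v', /\x0b/, /\cK/], // vertical tab
];

// True only if every escape form matches its character.
const allEquivalent = equivalents.every(
  ([ch, hexForm, controlForm]) => hexForm.test(ch) && controlForm.test(ch)
);

// \s covers all of the whitespace characters listed; \S matches none of them.
const collapsed = 'a \t\n\r\f\vb'.replace(/\s+/g, '_');
const anyNonSpace = /\S/.test(' \t\n\r\f\v');

console.log(allEquivalent, collapsed, anyNonSpace);
```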

3.3 Special characters

The so-called special characters are characters with special meanings, such as the * in "*.txt" mentioned above, simple...
