How Regular Expression Parsing Works

WBOY
2016-07-20 10:57:17

A regular expression is a single string that describes, or matches, a set of strings conforming to a given syntax rule. Many text editors and other tools use regular expressions to find and/or replace text that matches a pattern.

Carelessly written regular expressions are a common cause of performance bottlenecks, but there is usually a lot of room to make them more efficient. Two regular expressions that match the same text are not necessarily equally fast.

Many factors affect the efficiency of a regular expression. First, the text it is applied to matters a great deal: an engine typically spends more time on partial matches, where it gets part of the way through the pattern before failing, than on obvious non-matches. Each browser's regular expression engine also has its own internal optimizations.

In order to use regular expressions effectively, it is important to understand how they work. The following are the basic steps for regular expression processing:

Step 1: Compile

When you create a regular expression object (with a regular expression literal or the RegExp constructor), the browser checks the pattern for errors and then converts it into a native code routine that performs the matching. Assigning the regular expression to a variable lets you avoid repeating this step.
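As a minimal sketch of the advice above, the following compares reusing one stored regular expression with constructing a new one on every loop pass. The names DIGITS, countNumericLines and countNumericLinesSlow are hypothetical, chosen only for this illustration. Some engines may cache compiled patterns internally, so the actual saving varies.

```javascript
// Compiled once, when this line is evaluated, then reused everywhere below.
// No g flag, so test() keeps no state between calls.
const DIGITS = /\d+/;

// Hypothetical helper: counts the lines that contain at least one digit.
function countNumericLines(lines) {
  let count = 0;
  for (const line of lines) {
    // Reuses the stored DIGITS object on every iteration.
    if (DIGITS.test(line)) count++;
  }
  return count;
}

// Same result, but a new RegExp object is constructed on each pass.
function countNumericLinesSlow(lines) {
  let count = 0;
  for (const line of lines) {
    if (new RegExp("\\d+").test(line)) count++;
  }
  return count;
}

console.log(countNumericLines(["a1", "b", "42"]));     // 2
console.log(countNumericLinesSlow(["a1", "b", "42"])); // 2
```

Both functions return the same count. The difference is only in how often the pattern has to be checked and prepared.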

Step 2: Set the starting position

When a regular expression is used, the engine first determines the position in the target string where the search starts. This is either the start of the string or the position given by the regular expression's lastIndex property. When the engine returns here from step 4 (because a match attempt failed), the position is one character after the starting position of the last attempt.
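The role of lastIndex can be seen in a short sketch. It assumes a regular expression with the g flag, because exec() reads and updates lastIndex only for global (or sticky) expressions. The sample string and variable names are made up for the illustration.

```javascript
// Character positions: c0 a1 t2 (space)3 b4 a5 t6 (space)7 r8 a9 t10
const text = "cat bat rat";
const re = /[a-z]at/g; // the g flag makes exec() start at lastIndex

const first = re.exec(text);
console.log(first[0], first.index, re.lastIndex); // cat 0 3

// Move the starting position by hand, into the middle of "bat".
re.lastIndex = 5;
const next = re.exec(text);
// Attempts at positions 5, 6 and 7 fail; the attempt at position 8 succeeds.
console.log(next[0], next.index, re.lastIndex); // rat 8 11
```

After the starting position is moved to 5, "bat" can no longer match, so the engine advances one character at a time until it reaches "rat".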

Browsers optimize their regular expression engines by deciding early, at this stage, that some work can be skipped. For example, if a regular expression begins with ^, IE and Chrome can usually determine that no match is possible anywhere but at the start of the string and avoid pointlessly searching later positions. Another example is matching a string whose third letter is x: a smart implementation can find x first and then move the starting position back two characters.
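The "third letter is x" idea can be imitated by hand. This is only an illustration of the principle, not how any particular engine implements it, and naiveStart and smartStart are hypothetical names. Both functions return the first starting position whose third character is x, which is also what a search for /..x/ finds in text without line breaks.

```javascript
// Naive approach: test every starting position in turn.
function naiveStart(text) {
  for (let start = 0; start + 2 < text.length; start++) {
    if (text[start + 2] === "x") return start;
  }
  return -1;
}

// Shortcut: locate the x first (it cannot sit before index 2),
// then move the starting position back two characters.
function smartStart(text) {
  const xIndex = text.indexOf("x", 2);
  return xIndex === -1 ? -1 : xIndex - 2;
}

console.log(naiveStart("a box"));      // 2
console.log(smartStart("a box"));      // 2
console.log("a box".search(/..x/));    // 2
```

The shortcut reaches the same answer with a single scan for x instead of a full match attempt at every position.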

Step 3: Match each regular expression token

Once the regular expression knows the starting position, it steps through the target text and the regular expression pattern one unit at a time. When a particular token fails to match, the regular expression backtracks to an earlier point in the match attempt and follows the other possible paths through the pattern.
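Backtracking can be made visible with a toy matcher. The sketch below is a simplified model, not a real engine: it handles only sequences of literal alternatives, and the names PARTS and matchFrom are hypothetical. It models the pattern /(a|ab)c/ and records each step in a trace.

```javascript
// The pattern /(a|ab)c/ written as a list of parts;
// each part lists its literal alternatives in the order they are tried.
const PARTS = [["a", "ab"], ["c"]];

// Tries to match PARTS[partIndex..] at position pos of text.
// Returns the end index of the match, or -1 when every path has failed.
function matchFrom(text, pos, partIndex, trace) {
  if (partIndex === PARTS.length) return pos; // all parts consumed: success
  for (const alt of PARTS[partIndex]) {
    if (text.startsWith(alt, pos)) {
      trace.push(`pos ${pos}: "${alt}" ok`);
      const end = matchFrom(text, pos + alt.length, partIndex + 1, trace);
      if (end !== -1) return end;
      // The rest of the pattern failed: return here and try the next path.
      trace.push(`backtrack to pos ${pos}`);
    }
  }
  return -1;
}

const trace = [];
const end = matchFrom("abc", 0, 0, trace);
console.log(end);   // 3
console.log(trace); // "a" ok, backtrack to pos 0, "ab" ok, "c" ok
```

The first alternative, "a", matches but leaves "b" where "c" is required, so the matcher returns to position 0 and succeeds with "ab" instead.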

Step 4: Match success or failure

If a complete match is found at the current position in the string, the regular expression declares success. If every possible path through the pattern has been tried without a match, the engine goes back to step 2 and tries again at the next character in the string. Only after this cycle has been run for every character in the string (and for the position after the last character) without a match does the regular expression declare overall failure.

Keeping this process in mind will help you make informed decisions about the kinds of problems that affect regular expression performance.
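The retry loop between step 4 and step 2 can be imitated with the sticky flag (y), which permits a match only exactly at lastIndex. This is a hand-written sketch, not the engine's internal code. findFirst is a hypothetical helper, and the sketch assumes an environment that supports the y flag (ES2015 or later).

```javascript
// Hypothetical helper that performs the "try each starting position" loop by hand.
function findFirst(source, text) {
  // With the y flag, exec() succeeds only if the match begins at lastIndex.
  const sticky = new RegExp(source, "y");
  let attempts = 0;
  // start === text.length covers the position after the last character.
  for (let start = 0; start <= text.length; start++) {
    attempts++;
    sticky.lastIndex = start;
    const m = sticky.exec(text);
    if (m !== null) return { index: start, match: m[0], attempts };
  }
  // Every starting position has been tried: overall failure.
  return { index: -1, match: null, attempts };
}

const found = findFirst("b+", "aabbb");
console.log(found.index, found.match, found.attempts); // 2 bbb 3

const missing = findFirst("z", "abc");
console.log(missing.index, missing.attempts); // -1 4

// A pattern can also succeed at the position after the last character.
console.log(findFirst("$", "abc").index); // 3
```

For "abc" the failing search makes four attempts: one for each of the three characters and one for the position after the last character.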

Original address: http://www.yiiyaa.net/1231

