How Regular Expression Parsing Works
A regular expression is a single string that describes or matches a set of strings conforming to a certain syntax rule. Many text editors and other tools use regular expressions to find and/or replace text that matches a pattern.
Carelessly written regular expressions are a common cause of performance bottlenecks, but there is usually plenty of room to make them more efficient. Just because two regular expressions match the same text doesn't mean they are equally fast.
Many factors affect the efficiency of a regular expression. First, the text it is applied to makes a big difference: the engine spends more time on text that partially matches than on text that obviously does not match. Each browser's regular expression engine also has different internal optimizations.
To use regular expressions effectively, it is important to understand how they work. The basic steps of regular expression processing are as follows:
Step 1: Compile
When you create a regular expression object (using a regular expression literal or the RegExp constructor), the browser checks your pattern for errors and then converts it into a native code routine that performs the matching work. If you assign the regular expression to a variable and reuse it, you avoid repeating this step.
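As a minimal sketch of the advice above, the pattern below is created once and stored in a variable, so every call reuses the same RegExp object. The names DIGITS, hasDigits, countLinesWithDigits, and the sample data are hypothetical, chosen only for illustration. Modern engines may cache compiled patterns on their own, so the actual gain is an assumption to measure rather than a guarantee.

```javascript
// Hypothetical example: the pattern is created once, outside the function,
// so the compile step is not repeated on every call.
const DIGITS = /\d+/;

function hasDigits(text) {
  // Reuses the already-compiled pattern held in DIGITS.
  return DIGITS.test(text);
}

function countLinesWithDigits(lines) {
  let count = 0;
  for (const line of lines) {
    if (hasDigits(line)) count++;
  }
  return count;
}

const sampleLines = ["order 42", "no numbers here", "room 7b"];
const linesWithDigits = countLinesWithDigits(sampleLines);
console.log(linesWithDigits); // prints 2
```

The pattern here has no g or y flag, so test() does not carry state between calls through lastIndex.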
Step 2: Set the starting position
When a regular expression is put to use, the engine first determines the position in the target string where the search should start. Initially this is the start of the string, or the position given by the regular expression's lastIndex property. When the engine returns here from step 4 (because a match attempt failed), the position is one character after the last attempted starting position.
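The role of lastIndex can be seen with a small example. This is only a sketch with a made-up string and pattern; it relies on the standard behavior that exec() honors lastIndex when the regular expression has the g flag.

```javascript
const text = "cat bat rat";
const re = /[a-z]at/g; // the g flag makes exec() start at re.lastIndex

const found = [];
re.lastIndex = 4; // start the search at index 4, skipping "cat"

let m;
while ((m = re.exec(text)) !== null) {
  // After each success, lastIndex points just past the match.
  found.push({ match: m[0], index: m.index, next: re.lastIndex });
}

console.log(found);
// [ { match: 'bat', index: 4, next: 7 }, { match: 'rat', index: 8, next: 11 } ]
console.log(re.lastIndex); // 0, reset after the final failed attempt
```

Without the g (or y) flag, exec() ignores lastIndex and always starts at the beginning of the string.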
Browsers optimize their regular expression engines by deciding early, at this stage, that certain work can be skipped. For example, if a regular expression begins with ^, IE and Chrome can usually determine that no match is possible after the start of the string and avoid pointlessly searching later positions. Another example is matching a string whose third character is x: a clever implementation can search for x first and then move the starting position back two characters.
Step 3: Match each token of the regular expression
Once the engine knows the starting position, it steps through the text and the regular expression pattern one piece at a time. When a particular token fails to match, the engine backtracks to an earlier decision point in the match attempt and then tries other possible paths through the regular expression.
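Backtracking is easier to see in a toy matcher than in a real engine. The sketch below is not how any browser implements regular expressions; it is a hypothetical, deliberately tiny matcher (matchHere and the tried array are invented names) that supports only literal characters, "." and "*", with no escaping or validation. It records each position at which it tries the rest of the pattern after a greedy "*".

```javascript
// Toy matcher: tries to match `pattern` against `text` starting exactly at `pos`.
// Returns the end position of the match, or -1 on failure.
// `tried` collects the positions at which the remainder of the pattern
// was attempted after a "*" item.
function matchHere(pattern, text, pos, tried) {
  if (pattern.length === 0) return pos; // pattern exhausted: success

  const item = pattern[0];
  const matchesAt = (i) =>
    i < text.length && (item === "." || text[i] === item);

  if (pattern[1] === "*") {
    // Greedy: first consume as many characters as the item allows...
    let end = pos;
    while (matchesAt(end)) end++;
    // ...then backtrack, giving characters back one at a time
    // until the rest of the pattern matches.
    for (let i = end; i >= pos; i--) {
      tried.push(i);
      const result = matchHere(pattern.slice(2), text, i, tried);
      if (result !== -1) return result;
    }
    return -1; // every path through this item failed
  }

  if (matchesAt(pos)) {
    return matchHere(pattern.slice(1), text, pos + 1, tried);
  }
  return -1;
}

const tried = [];
const sample = "aXbYbZ";
const end = matchHere("a.*b", sample, 0, tried);
console.log(end, sample.slice(0, end)); // 5 aXbYb
console.log(tried); // [ 6, 5, 4 ]
```

In this run ".*" first swallows everything up to the end of the string, the trailing "b" fails at positions 6 and 5, and the match succeeds once the matcher has backed up to position 4.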
Step 4: Match success or failure
If a complete match is found at the current position in the string, the regular expression declares success. If all possible paths through the regular expression have been tried without a match, the engine goes back to step 2 and tries again from the next character in the string. Only after every character position in the string (including the position after the last character) has gone through this process without a match does the regular expression declare overall failure.
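The loop between step 2 and step 4 can be imitated from the outside with the sticky (y) flag, which makes exec() attempt a match only at lastIndex. This is a sketch for illustration, not a description of engine internals; findFirst and the attempts array are hypothetical names, and real engines may skip positions through the optimizations mentioned earlier.

```javascript
// Tries the pattern at each starting position in turn, the way the
// steps above describe, and records every position attempted.
function findFirst(source, text) {
  const sticky = new RegExp(source, "y"); // y: match only at lastIndex
  const attempts = [];
  // Note the <=: the position after the last character is tried too.
  for (let start = 0; start <= text.length; start++) {
    sticky.lastIndex = start;
    attempts.push(start);
    const m = sticky.exec(text);
    if (m !== null) {
      return { match: m[0], index: start, attempts };
    }
  }
  return { match: null, index: -1, attempts }; // overall failure
}

const hit = findFirst("h(ello|appy) hippo", "hello there, happy hippo");
console.log(hit.match, hit.index, hit.attempts.length); // happy hippo 13 14

const miss = findFirst("x", "abc");
console.log(miss.match, miss.attempts); // null [ 0, 1, 2, 3 ]
```

In the first call the attempt at position 0 gets as far as "hello " before failing, and the search only succeeds on the fourteenth starting position. In the second call all four positions, including the one after the last character, are tried before failure is reported.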
Keeping this process in mind will help you make informed judgments about the kinds of problems that affect regular expression performance.
Original address: http://www.yiiyaa.net/1231