
Regular expression tutorial - detailed explanation of matching a single character

高洛峰
2017-01-09 16:17:24

This article explains, with examples, how to match a single character with regular expressions. The details are as follows:

Note: In all examples, the text matched by the regular expression is enclosed between [ and ] in the source text. Some examples are implemented in Java, and Java-specific regular expression usage is explained where it comes up. All Java examples were tested under JDK 1.6.0_13.

Java test code:

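// Place this method inside any class; the imports go at the top of the file.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Prints every match of a regular expression in the source text, one per line.
 * @param regex the regular expression
 * @param sourceText the source text to search
 */
public static void matchAndPrint(String regex, String sourceText){
  Pattern pattern = Pattern.compile(regex);
  Matcher matcher = pattern.matcher(sourceText);
  // find() advances to the next match; group() returns the matched text
  while(matcher.find()){
    System.out.println(matcher.group());
  }
}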

1. Match plain text

1. There is only one matching result

Start with a simple regular expression: today. Although it is just plain text, it is still a valid regular expression. Here is an example:

Source text: Yesterday is history, tomorrow is a mystery, but today is a gift.

Regular expression: today

Result: Yesterday is history, tomorrow is a mystery, but [today] is a gift.

Analysis: The regular expression used here is plain text, which matches today in the source text.

Calling the matchAndPrint method prints:

today
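To try this example as a standalone program, the test method can be wrapped in a class. The sketch below is one possible arrangement, not the article's original test harness: the class name PlainTextMatchDemo and the helper collectMatches are assumptions introduced here so that the matches can also be inspected as a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; only matchAndPrint comes from the article.
class PlainTextMatchDemo {
    // Assumed helper: collects every match instead of printing it.
    static List<String> collectMatches(String regex, String sourceText) {
        List<String> matches = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(sourceText);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    // Same behavior as the article's method: one match per line.
    static void matchAndPrint(String regex, String sourceText) {
        for (String match : collectMatches(regex, sourceText)) {
            System.out.println(match);
        }
    }

    public static void main(String[] args) {
        String text = "Yesterday is history, tomorrow is a mystery, but today is a gift.";
        matchAndPrint("today", text); // prints: today
    }
}
```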

2. There are multiple matching results

Source text: Yesterday is history, tomorrow is a mystery, but today is a gift.

Regular expression: is

Result: Yesterday [is] h[is]tory, tomorrow [is] a mystery, but today [is] a gift.

Analysis: The word is appears three times in the source text, but four matches are output, because the is inside history is matched as well.

Calling the matchAndPrint method prints:

is

is

is

is
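To see where each of the four matches comes from, the start index of every match can be printed next to the matched text. This is a small sketch, assuming a hypothetical helper named findWithPositions; it relies only on Matcher.start() from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class MatchPositions {
    // Returns each match as "text@startIndex", in the order find() reports them.
    static List<String> findWithPositions(String regex, String sourceText) {
        List<String> results = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(sourceText);
        while (matcher.find()) {
            results.add(matcher.group() + "@" + matcher.start());
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "Yesterday is history, tomorrow is a mystery, but today is a gift.";
        for (String hit : findWithPositions("is", text)) {
            System.out.println(hit);
        }
    }
}
```

The second match starts at index 14, which is inside the word history (it begins at index 13). The other three matches are the standalone word is.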

3. Letter case issues

Regular expressions are case-sensitive by default, but many regular expression implementations also support case-insensitive matching. In JavaScript, use the i flag to perform a case-insensitive match. In Java, pass the Pattern.CASE_INSENSITIVE flag when compiling the regular expression:

Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
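The effect of the flag can be checked by counting matches with and without it. The following is a minimal sketch: the class name, the countMatches helper, and the sample sentence are made up for illustration, while Pattern.compile(String, int) and Pattern.CASE_INSENSITIVE are standard java.util.regex API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class CaseInsensitiveDemo {
    // Counts the matches of regex in sourceText; flags = 0 means no flags.
    static int countMatches(String regex, String sourceText, int flags) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(sourceText);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample sentence containing "Today" and "today".
        String text = "Today is a gift. today is all we have.";
        // Case-sensitive: only the lowercase "today" matches.
        System.out.println(countMatches("today", text, 0));
        // Case-insensitive: "Today" matches as well.
        System.out.println(countMatches("today", text, Pattern.CASE_INSENSITIVE));
    }
}
```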

2. Match any character

The regular expressions so far have all been static plain text, which does not show the real power of regular expressions. Next, let's see how to use regular expressions to match characters that are not known in advance.

In regular expressions, special characters (or sets of characters) describe what to search for. The . character (the period, or full stop) matches any single character, much like the ? wildcard in DOS and the _ (underscore) character in SQL. In many implementations, including Java by default, it does not match line terminators. For example, the regular expression c.t matches cat, cut, cot, and so on. Let's look at an example.

Text:

orders1.txt

orders2.txt

sales1.txt

salesA.txt

orders3.txt

sales2.txt

sales.txt

Regular expression: sales.

Result:

orders1.txt

orders2.txt

[sales1].txt

[salesA].txt

orders3.txt

[sales2].txt

[sales.]txt

Analysis: The regular expression sales. finds file names made up of the string sales followed by one more character. The results show that . matches letters, digits, and the period itself. 4 of the 7 file names match this pattern.

If the matchAndPrint method is called, the output result is:

sales1

salesA

sales2

sales.
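The file-name example can be reproduced with a short program. This is a sketch under a few assumptions: the file names are joined with newlines into one source text, and the class name DotDemo and the helper findAll are invented for the demonstration. The c.t pattern mentioned earlier is included for comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class DotDemo {
    // Collects every match of regex in sourceText, in order.
    static List<String> findAll(String regex, String sourceText) {
        List<String> matches = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(sourceText);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // The seven file names from the example, one per line.
        String files = "orders1.txt\norders2.txt\nsales1.txt\nsalesA.txt\n"
                     + "orders3.txt\nsales2.txt\nsales.txt";
        // . stands for exactly one character after "sales".
        System.out.println(findAll("sales.", files)); // [sales1, salesA, sales2, sales.]
        // "ct" is not matched: there must be one character between c and t.
        System.out.println(findAll("c.t", "cat cut cot ct")); // [cat, cut, cot]
    }
}
```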

3. Match special characters

The . character has a special meaning in regular expressions. If you need a literal . in the pattern, you have to tell the regular expression engine that you mean the . character itself rather than its special meaning. To do this, escape the . by preceding it with a \ character. \ is itself a metacharacter (a character that has a special meaning rather than standing for itself). Consider the following example.

Find files whose names start with na or sa, whatever digit follows.

Text:

sales.txt

na1.txt

na2.txt

sa1.txt

sanatxt.txt

Regular expression: .a..txt

Result:

sales.txt

[na1.txt]

[na2.txt]

[sa1.txt]

[sanatxt].txt

Analysis: This regular expression finds na1.txt, na2.txt, and sa1.txt, but it also produces 1 unexpected result: sanatxt in sanatxt.txt. This happens because every . in the regular expression .a..txt matches any character, so the third ., which was meant to match the literal period before txt, matches the second a in sanatxt instead. To match the . character itself, escape it with \. Changing the regular expression to .a.\.txt meets our needs.

Note: In Java, the regular expression .a.\.txt must be written as .a.\\.txt inside a string literal, because \ is also the escape character in Java string literals.
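The two patterns can be run side by side to compare what they match. The sketch below assumes the five file names are joined with newlines; the class name EscapedDotDemo and the helper findAll are hypothetical. Note the doubled backslash in the Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class EscapedDotDemo {
    // Collects every match of regex in sourceText, in order.
    static List<String> findAll(String regex, String sourceText) {
        List<String> matches = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(sourceText);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String files = "sales.txt\nna1.txt\nna2.txt\nsa1.txt\nsanatxt.txt";
        // Unescaped: the third . matches any character, not only a period.
        System.out.println(findAll(".a..txt", files));
        // Escaped: "\\." in Java source is the regex \. (a literal period).
        System.out.println(findAll(".a.\\.txt", files));
    }
}
```

With the unescaped pattern the list also contains sanatxt; with the escaped pattern only na1.txt, na2.txt, and sa1.txt remain.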

4. Summary

Regular expressions are often referred to as patterns. They are strings made up of characters, which can be ordinary characters (plain text) or metacharacters (characters with a special meaning). This article introduced how to use ordinary characters and metacharacters to match a single character. The . character matches any single character, and \ is used to escape characters. In regular expressions, the escape sequences that carry a special meaning always begin with the \ character. In the next article, we'll look at how to match groups of characters.
