Regular expression tutorial: a detailed explanation of matching a single character
This article is a regular expression tutorial on matching a single character, explained through examples.
Note: In all examples, the matched text is enclosed in [ and ] within the source text. Some examples are implemented in Java, and Java-specific regular expression usage is explained where it comes up. All Java examples were tested under JDK 1.6.0_13.
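Java test code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Prints every match of the regular expression found in the source text.
 * @param regex      the regular expression
 * @param sourceText the source text to search
 */
public static void matchAndPrint(String regex, String sourceText) {
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(sourceText);
    // find() advances to the next match; group() returns the matched text
    while (matcher.find()) {
        System.out.println(matcher.group());
    }
}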
1. Match plain text
1. There is only one matching result
Let's start with a simple regular expression: today. Although it is just plain text, it is still a valid regular expression. Consider this example:
Source text: Yesterday is history, tomorrow is a mystery, but today is a gift.
Regular expression: today
Result: Yesterday is history, tomorrow is a mystery, but [today] is a gift.
Analysis: The regular expression used here is plain text, which matches today in the source text.
Calling the matchAndPrint method produces this output:
today
2. There are multiple matching results
Source text: Yesterday is history, tomorrow is a mystery, but today is a gift.
Regular expression: is
Result: Yesterday [is] h[is]tory, tomorrow [is] a mystery, but today [is] a gift.
Analysis: The word is appears three times in the source text, but four matches are output, because the is inside history is also matched.
Calling the matchAndPrint method produces this output:
is
is
is
is
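To see where each of the four matches comes from, a small sketch can record the start offset of every match. The class name MatchPositions and the helper matchStarts are illustrative names, not part of the original test code; only Pattern, Matcher, find() and start() come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used here only for illustration.
class MatchPositions {

    /** Returns the start index of every match of regex in sourceText. */
    static List<Integer> matchStarts(String regex, String sourceText) {
        List<Integer> starts = new ArrayList<Integer>();
        Matcher matcher = Pattern.compile(regex).matcher(sourceText);
        while (matcher.find()) {
            // start() is the index of the first character of the current match
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String text = "Yesterday is history, tomorrow is a mystery, but today is a gift.";
        for (int start : matchStarts("is", text)) {
            // Print each offset with a little surrounding context
            int from = Math.max(0, start - 1);
            int to = Math.min(text.length(), start + 3);
            System.out.println(start + ": ..." + text.substring(from, to) + "...");
        }
    }
}
```

The second offset falls inside the word history, which is why a plain-text pattern reports one more match than there are standalone occurrences of the word.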
3. Letter case
Regular expressions are case-sensitive by default, but many implementations also support case-insensitive matching. In JavaScript, the i flag performs a case-insensitive match. In Java, you can request case-insensitive matching when compiling the regular expression:
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
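As a minimal sketch of the difference the flag makes, the following counts matches with and without Pattern.CASE_INSENSITIVE. The class CaseInsensitiveDemo, the helper countMatches, and the sample sentence are assumptions made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original test code.
class CaseInsensitiveDemo {

    /** Counts the matches of regex in sourceText, compiled with the given flags (0 = none). */
    static int countMatches(String regex, String sourceText, int flags) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(sourceText);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample text invented for this example
        String text = "Today is a gift. Enjoy today.";
        // Case-sensitive: only the lowercase "today" matches
        System.out.println(countMatches("today", text, 0));
        // Case-insensitive: "Today" and "today" both match
        System.out.println(countMatches("today", text, Pattern.CASE_INSENSITIVE));
    }
}
```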
2. Match any character
The regular expressions we have seen so far are all static plain text, which does not show the real power of regular expressions. Next, let's see how to use regular expressions to match characters that are not known in advance.
In regular expressions, special characters (or sets of characters) describe what to search for. The . character (a period) matches any single character. It is equivalent to the ? character in DOS and the _ (underscore) character in SQL. For example, the regular expression c.t matches cat, cut, cot, and so on. Let's look at an example.
Text:
orders1.txt
orders2.txt
sales1.txt
salesA.txt
orders3.txt
sales2.txt
sales.txt
Regular expression: sales.
Result:
orders1.txt
orders2.txt
[sales1].txt
[salesA].txt
orders3.txt
[sales2].txt
[sales.]txt
Analysis: The regular expression sales. finds file names that contain the string sales followed by one more character. The results show that . matches letters, digits, and the period itself. 4 of the 7 file names match this pattern.
Calling the matchAndPrint method produces this output:
sales1
salesA
sales2
sales.
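The example above can be reproduced with a short sketch that collects matches into a list instead of printing them. The class DotDemo and the helper findAll are hypothetical names introduced for illustration; the file list is the one from the example, joined with newlines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original test code.
class DotDemo {

    /** Returns the text of every match of regex in sourceText. */
    static List<String> findAll(String regex, String sourceText) {
        List<String> matches = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(sourceText);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String files = "orders1.txt\norders2.txt\nsales1.txt\nsalesA.txt\n"
                + "orders3.txt\nsales2.txt\nsales.txt";
        // The trailing . matches a digit, a letter, or a literal period
        System.out.println(findAll("sales.", files));
        // c.t matches any three-character run that starts with c and ends with t
        System.out.println(findAll("c.t", "cat cut cot"));
    }
}
```

One Java-specific detail: by default . does not match line terminators, so a match never runs from one file name into the next here. Compiling with Pattern.DOTALL changes that behavior.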
3. Match special characters
The . character has a special meaning in regular expressions. If you need a literal . in the pattern, you have to tell the regular expression engine that you mean the . character itself rather than its special meaning. To do this, escape the . by preceding it with a \ character. \ is itself a metacharacter (a character that has a special meaning rather than standing for itself). Consider the following example.
Find the files whose names start with na or sa, no matter which digit follows.
Text:
sales.txt
na1.txt
na2.txt
sa1.txt
sanatxt.txt
Regular expression: .a..txt
Result:
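sales.txt
[na1.txt]
[na2.txt]
[sa1.txt]
[sanatxt].txt
Analysis: This regular expression finds na1.txt, na2.txt, and sa1.txt, but it also produces an unexpected match in sanatxt.txt. (sales.txt is not matched, because the two characters after its a are not followed by txt.) The unexpected match occurs because every . in .a..txt matches any character, so the third . matches the second a in sanatxt instead of a literal period. To match the . character itself, escape it with \. Changing the regular expression to .a.\.txt meets our needs: it matches only na1.txt, na2.txt, and sa1.txt.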
Note: In Java source code, the regular expression .a.\.txt must be written as the string literal ".a.\\.txt", because \ is also the escape character in Java string literals.
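The effect of escaping can be checked with a small sketch that runs both patterns over the same file list. The class EscapedDotDemo and the helper findAll are hypothetical names used only for this illustration; note the doubled backslash in the Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original test code.
class EscapedDotDemo {

    /** Returns the text of every match of regex in sourceText. */
    static List<String> findAll(String regex, String sourceText) {
        List<String> matches = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(sourceText);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String files = "sales.txt\nna1.txt\nna2.txt\nsa1.txt\nsanatxt.txt";
        // Unescaped: every . matches any character, so part of sanatxt.txt matches too
        System.out.println(findAll(".a..txt", files));
        // Escaped: "\\." in Java source is the regex \. and matches only a literal period
        System.out.println(findAll(".a.\\.txt", files));
    }
}
```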
4. Summary
Regular expressions are often referred to as patterns. They are strings made up of characters, which can be ordinary characters (plain text) or metacharacters (characters with a special meaning). This article introduced how to use ordinary characters and metacharacters to match a single character. The . character matches any single character, and \ is used to escape a metacharacter so that it is treated literally. In regular expressions, escape sequences begin with the \ character. In the next article, we'll look at how to match groups of characters.