The previous section mentioned regular expressions, which make text processing more expressive. This section discusses them: what they are, what they are used for, what the various special characters mean, how to process text with them in Java, and which regular expressions are commonly used. Because there is a lot to cover, the discussion is split into three sections. This first one gives a brief overview of regular expression syntax.
A regular expression is a string of characters that describes a text pattern. It makes text processing convenient, including searching, replacement, validation, and splitting.
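The four uses named above can be sketched in Java with the standard java.util.regex package and the regex-aware methods on String. This is only a minimal illustration: the class name, the method names, and the sample inputs are hypothetical, and the patterns rely on character groups and quantifiers, which are explained later.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexUses {
    // Search: return the first run of digits in the text, or null if there is none.
    static String firstNumber(String text) {
        Matcher m = Pattern.compile("[0-9]+").matcher(text);
        return m.find() ? m.group() : null;
    }

    // Replacement: collapse every run of spaces into a single space.
    static String collapseSpaces(String text) {
        return text.replaceAll(" +", " ");
    }

    // Validation: true only if the whole string consists of digits.
    static boolean isDigits(String text) {
        return text.matches("[0-9]+");
    }

    // Splitting: break a comma-separated line into fields.
    static String[] splitFields(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("order 4521 shipped"));       // 4521
        System.out.println(collapseSpaces("a   b  c"));              // a b c
        System.out.println(isDigits("2024"));                        // true
        System.out.println(Arrays.toString(splitFields("x,y,z")));   // [x, y, z]
    }
}
```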
Regular expressions contain two types of characters. Ordinary characters match themselves. Metacharacters have special meanings, and these metacharacters together with their meanings make up regular expression syntax.
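The difference between the two types can be seen in a small Java sketch. It assumes the dot as an example of a metacharacter (in Java it matches any character other than a line terminator by default); the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class MetacharDemo {
    // True if the whole input matches the regular expression.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Ordinary characters match themselves.
        System.out.println(fullMatch("abc", "abc"));    // true
        // The dot is a metacharacter, so it also matches 'b'.
        System.out.println(fullMatch("a.c", "abc"));    // true
        // Escaped with a backslash, the dot becomes an ordinary character.
        System.out.println(fullMatch("a\\.c", "abc"));  // false
        System.out.println(fullMatch("a\\.c", "a.c"));  // true
    }
}
```

Note that the backslash is doubled in the Java source because the string literal itself also uses the backslash as an escape character.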
Regular expressions have a relatively long history. Many text-processing tools, editors and systems support them, as do most programming languages. Although they all go by the same name, for historical reasons the syntax differs between languages, systems and tools. This article focuses on Java; other languages may differ.
The syntax is introduced in the following parts:
Single characters
Character group
Quantifier
Group
Special boundary matching
Lookaround boundary matching
Finally, we summarize escaping, matching modes, and the syntax as a whole.
Single characters
Most single characters are represented by the character itself, such as '0', '3', 'a', and '马'. Some single characters, however, are written with multiple characters, all starting with a backslash '\', such as:
Special characters, such as the tab character '\t', the line feed character '\n', the carriage return character '\r', etc.;
A character in octal notation, which starts with \0 followed by 1 to 3 octal digits, such as \0141, which corresponds to the character with ASCII code 97, that is, the character 'a';
A character in hexadecimal notation, which starts with \x followed by two hexadecimal digits, such as \x6A, which corresponds to the character with ASCII code 106, that is, the character 'j';
A character given by its Unicode number, which starts with \u followed by four hexadecimal digits, such as \u9A6C, which represents the Chinese character '马' (horse). This form can only represent characters with numbers up to 0xFFFF. For numbers above 0xFFFF, use the \x{...} form, such as the character '
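The escape forms listed above can be checked with a short Java sketch. The class and method names are hypothetical, and the code point 0x1F600 used for the \x{...} form is an arbitrary value above 0xFFFF chosen for illustration, not an example taken from this article.

```java
import java.util.regex.Pattern;

class SingleCharEscapes {
    // True if the whole input matches the regular expression.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Special character: the regex escape for a tab matches a real tab.
        System.out.println(fullMatch("\\t", "\t"));
        // Octal form: 141 in octal is 97, the character 'a'.
        System.out.println(fullMatch("\\0141", "a"));
        // Hexadecimal form: 6A in hexadecimal is 106, the character 'j'.
        System.out.println(fullMatch("\\x6A", "j"));
        // Four-digit Unicode form, matched against the same character
        // written as a Java string escape.
        System.out.println(fullMatch("\\u9A6C", "\u9A6C"));
        // Braced hexadecimal form for a code point above 0xFFFF
        // (0x1F600 is an assumed sample value).
        String supplementary = new String(Character.toChars(0x1F600));
        System.out.println(fullMatch("\\x{1F600}", supplementary));
    }
}
```

Every call prints true. In each pattern the backslash is doubled so that the escape reaches the regular expression engine instead of being interpreted by the Java compiler.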