Detailed explanation of the use of regular expression metacharacters
This article explains in detail how to use regular expression metacharacters and what to watch out for when using them, working through practical examples.
Note: In all examples, the text matched by the regular expression is shown between [ and ] in the source text. Some examples are implemented in Java. Where a usage is specific to regular expressions in Java, it is pointed out in the corresponding place. All Java examples were tested under JDK1.6.0_13.
1. Escape special characters
Metacharacters are characters that have special meanings in regular expressions, so they cannot be used to represent themselves. You can escape a metacharacter by preceding it with a backslash; the resulting escape sequence matches the character itself rather than its special meaning. For example, to match [ and ], you must escape them: \[ and \].
Escaping uses the backslash \ character, which means that \ is itself a metacharacter. To match a literal \, it must be escaped as \\, for example when matching a Windows file path.
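The escaping rules above can be sketched in Java as follows. This is a minimal illustration, not a complete path parser: the class and method names are hypothetical, and the path shape (drive letter, colon, backslash-separated segments without whitespace) is a simplifying assumption. Note that inside a Java string literal every backslash must itself be doubled, so the regular expression \\ is written as "\\\\".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {
    // Returns true if the input contains a literal "[" followed later by a literal "]".
    static boolean hasBracketPair(String input) {
        // Regex: \[.*\]  (each backslash is doubled in the Java string literal)
        return Pattern.compile("\\[.*\\]").matcher(input).find();
    }

    // Returns the first simplified Windows-style path (e.g. C:\dir\file.txt), or null.
    static String firstWindowsPath(String input) {
        // Regex: [A-Za-z]:(\\[^\\\s]+)+
        // \\ matches one literal backslash; [^\\\s] is any char except backslash or whitespace.
        Pattern p = Pattern.compile("[A-Za-z]:(\\\\[^\\\\\\s]+)+");
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(hasBracketPair("x[1]y"));
        System.out.println(firstWindowsPath("open C:\\temp\\a.txt now"));
    }
}
```

When the text to match is an arbitrary literal, Pattern.quote can be used instead of escaping each metacharacter by hand.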
2. Match white space characters
Metacharacters can be roughly divided into two types: those used to match text (such as .), and those required by regular expression syntax (such as [ and ]).
When performing regular expression searches, we often need to match non-printing whitespace characters in the original text. For example, we may need to find all tab characters, or all newline characters. Such characters are difficult to type directly into a regular expression. In this case, we can use the special metacharacters listed below to enter them:
[\b] | Backspace character (Backspace key); in many flavors it is written inside a character class, because a bare \b means a word boundary |
\f | Form feed character |
\n | Line feed character |
\r | Carriage return character |
\t | Tab character (Tab key) |
\v | Vertical Tab |
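As a small sketch of how these escapes are used from Java (the class name and sample text are made up for illustration), the helper below counts how often a pattern such as \t or \n occurs in a string:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceEscapes {
    // Counts how many times the given regex matches in the text.
    static int count(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Sample text: three tabs, one CRLF line ending, one LF line ending.
        String text = "a\tb\tc\r\nd\te\n";
        System.out.println(count("\\t", text)); // 3
        System.out.println(count("\\n", text)); // 2
        System.out.println(count("\\r", text)); // 1
    }
}
```

Here "\\t" passes the two-character escape \t to the regex engine, while "\t" in the sample text is a real tab character produced by the Java compiler.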
Let's look at an example that removes blank lines from a file:
Text:
8 5 4 1 6 3 2 7 9
7 6 2 9 5 8 3 4 1
9 3 1 4 2 7 8 5 6
6 9 3 8 7 5 1 2 4
5 1 8 3 4 2 6 9 7
2 4 7 6 1 9 5 3 8
3 2 6 7 8 4 9 1 5
4 8 9 5 3 1 7 6 2
1 7 5 2 9 6 4 8 3
Regular expression: \r\n\r\n
Analysis: \r\n matches a carriage return + line feed combination, which Windows uses to mark the end of a line of text. Searching with the regular expression \r\n\r\n matches two consecutive end-of-line markers, which is exactly where a blank line sits.
Note: Unix and Linux use only a line feed character to end a line of text. In other words, to match blank lines on Unix or Linux, just use \n\n; there is no need to add \r. A regular expression that works on both Windows and Unix/Linux should contain an optional \r and a mandatory \n, that is, \r?\n\r?\n. The ? quantifier will be discussed in a later article.
The Java code is as follows:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

public static void matchBlankLine() throws Exception {
    BufferedReader br = new BufferedReader(new FileReader(new File("E:/九宫格.txt")));
    StringBuilder sb = new StringBuilder();
    try {
        char[] cbuf = new char[1024];
        int len;
        // Read the whole file into sb, one buffer at a time
        while ((len = br.read(cbuf)) > 0) {
            sb.append(cbuf, 0, len);
        }
    } finally {
        br.close();
    }
    // Two consecutive Windows line endings, i.e. a blank line
    String reg = "\r\n\r\n";
    System.out.println("Original content:\n" + sb.toString());
    System.out.println("After processing:-----------------------------");
    System.out.println(sb.toString().replaceAll(reg, "\r\n"));
}
The running result is as follows:
Original content:
8 5 4 1 6 3 2 7 9
7 6 2 9 5 8 3 4 1
9 3 1 4 2 7 8 5 6
6 9 3 8 7 5 1 2 4
5 1 8 3 4 2 6 9 7
2 4 7 6 1 9 5 3 8
3 2 6 7 8 4 9 1 5
4 8 9 5 3 1 7 6 2
1 7 5 2 9 6 4 8 3
After processing:-----------------------------
8 5 4 1 6 3 2 7 9
7 6 2 9 5 8 3 4 1
9 3 1 4 2 7 8 5 6
6 9 3 8 7 5 1 2 4
5 1 8 3 4 2 6 9 7
2 4 7 6 1 9 5 3 8
3 2 6 7 8 4 9 1 5
4 8 9 5 3 1 7 6 2
1 7 5 2 9 6 4 8 3
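The platform-independent pattern \r?\n\r?\n mentioned in the note above could be applied along the following lines. This is only a sketch under assumptions: the class and method names are hypothetical, and the capture group is an addition of this example so that the replacement keeps whichever line ending the text already uses.

```java
class BlankLines {
    // Replaces two consecutive line endings (CRLF or LF) with the first one,
    // which removes a single blank line between two lines of text.
    static String removeBlankLines(String text) {
        // Regex: (\r?\n)\r?\n   ->  $1 refers to the first line ending
        return text.replaceAll("(\\r?\\n)\\r?\\n", "$1");
    }

    public static void main(String[] args) {
        System.out.println(removeBlankLines("8 5 4\r\n\r\n7 6 2"));
        System.out.println(removeBlankLines("8 5 4\n\n7 6 2"));
    }
}
```

Like the original \r\n\r\n pattern, one pass only shortens a run of several blank lines rather than removing it entirely.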
3. Match specific character categories
Character sets (matching one of multiple characters) are the most common form of matching, and some commonly used character sets can be replaced by special metacharacters. These metacharacters match a certain class of characters (class metacharacters). Class metacharacters are not essential, because you can match a class of characters by enumerating the relevant characters one by one or by defining a character range, but regular expressions built with them are concise and easy to understand, and they are commonly used in practice.
1. Match digits and non-digits
\d Any digit, equivalent to [0-9] or [0123456789]
\D Any non-digit, equivalent to [^0-9] or [^0123456789]
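A minimal Java sketch of \d and \D (hypothetical class and method names, made-up sample data): \D is handy for stripping everything that is not a digit, and \d+ for checking that a string consists only of digits.

```java
class DigitClasses {
    // Removes every non-digit character, keeping only 0-9.
    static String keepDigits(String s) {
        return s.replaceAll("\\D", "");
    }

    // Returns true if the whole string is one or more digits.
    static boolean isAllDigits(String s) {
        return s.matches("\\d+");
    }

    public static void main(String[] args) {
        System.out.println(keepDigits("Tel: 555-0134")); // 5550134
        System.out.println(isAllDigits("2024"));         // true
        System.out.println(isAllDigits("20a4"));         // false
    }
}
```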
2. Match alphanumeric and non-alphanumeric characters
Letters (A-Z, in either case), digits, and the underscore form a commonly used set of characters. The following metacharacters can be used:
\w Any letter (upper- or lowercase), digit, or underscore, equivalent to [0-9a-zA-Z_]
\W Any character that is not a letter, digit, or underscore, equivalent to [^0-9a-zA-Z_]
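As an illustration in Java (class and method names are hypothetical), \W+ can serve as a separator when splitting text into word-like tokens, and \w+ can check that a string contains only letters, digits, and underscores:

```java
class WordClasses {
    // Splits the text on runs of non-word characters.
    static String[] words(String text) {
        return text.split("\\W+");
    }

    // Returns true if the whole string consists of letters, digits, or underscores.
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        for (String w : words("user_name = value42;")) {
            System.out.println(w); // user_name, then value42
        }
        System.out.println(isWord("user-name")); // false, "-" is not a word character
    }
}
```

If the text starts with a non-word character, String.split returns an empty first element, so real code may need to skip it.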
3. Match whitespace and non-whitespace characters
\s Any whitespace character, equivalent to [ \f\n\r\t\v]
\S Any non-whitespace character, equivalent to [^ \f\n\r\t\v]
Note: The backspace metacharacter [\b] is not within the range of \s.
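A short Java sketch of \s (hypothetical names, made-up sample strings): collapsing runs of whitespace into single spaces, and checking that the backspace character is not treated as whitespace.

```java
class SpaceClasses {
    // Trims the string and collapses every run of whitespace into one space.
    static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // Returns true if the string is exactly one whitespace character.
    static boolean isWhitespace(String s) {
        return s.matches("\\s");
    }

    public static void main(String[] args) {
        System.out.println(normalize("  a \t b\r\n c  ")); // a b c
        System.out.println(isWhitespace("\t"));            // true
        // "\b" in a Java string literal is the backspace character (U+0008).
        System.out.println(isWhitespace("\b"));            // false
    }
}
```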
4. Match hexadecimal or octal values
Hexadecimal: given with the prefix \x, for example: \x0A corresponds to the ASCII character 10 (newline character), its effect is equivalent to \n.
Octal: given with the prefix \0, the value itself can be two or three digits, for example: \011 corresponds to ASCII character 9 (tab), and its effect is equivalent to \t.
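The equivalences above can be checked with a few lines of Java (the class and method names are only for illustration). Java's Pattern accepts \xhh for hexadecimal values and \0 followed by one to three octal digits for octal values.

```java
class CodePointEscapes {
    // Returns true if the whole string matches the regex.
    static boolean matches(String regex, String s) {
        return s.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\x0A", "\n")); // true: hex 0A is the line feed
        System.out.println(matches("\\011", "\t")); // true: octal 11 is the tab
        System.out.println(matches("\\x41", "A"));  // true: hex 41 is 'A'
        System.out.println(matches("\\0101", "A")); // true: octal 101 is 'A'
    }
}
```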
4. Use POSIX character classes
POSIX character classes are a shorthand form supported by many regular expression implementations. Java supports them through its own \p{...} syntax, but JavaScript does not support them. The POSIX character classes are as follows:
[:alnum:] | Any letter or digit, equivalent to [a-zA-Z0-9] |
[:alpha:] | Any letter, equivalent to [a-zA-Z] |
[:blank:] | Space or tab character, equivalent to [ \t] |
[:cntrl:] | ASCII control character (ASCII 0 to 31, plus ASCII 127) |
[:digit:] | Any digit, equivalent to [0-9] |
[:graph:] | Any printable character, but not including spaces |
[:lower:] | Any lowercase letter, equivalent to [a-z] |
[:print:] | Any printable character |
[:punct:] | Any printable character other than a space that does not belong to [:alnum:] |
[:space:] | Any whitespace character, including spaces, equivalent to [ \f\n\r\t\v] |
[:upper:] | Any uppercase letter, equivalent to [A-Z] |
[:xdigit:] | Any hexadecimal digit, equivalent to [a-fA-F0-9] |
In Java, the POSIX character classes (US-ASCII only) are written as follows:
\p{Alnum} | Alphanumeric characters: [\p{Alpha}\p{Digit}] |
\p{Alpha} | Alphabetic characters: [\p{Lower}\p{Upper}] |
\p{ASCII} | All ASCII: [\x00-\x7F] |
\p{Blank} | Space or tab character: [ \t] |
\p{Cntrl} | Control character: [\x00-\x1F\x7F] |
\p{Digit} | Decimal digits: [0-9] |
\p{Graph} | Visible characters: [\p{Alnum}\p{Punct}] |
\p{Lower} | Lowercase alphabetic characters: [a-z] |
\p{Print} | Printable characters: [\p{Graph}\x20] |
\p{Punct} | Punctuation: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ |
\p{Space} | Whitespace characters: [ \t\n\x0B\f\r] |
\p{Upper} | Uppercase alphabetic characters: [A-Z] |
\p{XDigit} | Hexadecimal digits: [0-9a-fA-F] |
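A brief Java sketch of the \p{...} classes (class name, method names, and sample strings are hypothetical). By default these classes cover US-ASCII only, which the last check illustrates with an accented letter.

```java
class PosixClasses {
    // Removes all ASCII punctuation characters.
    static String stripPunct(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    // Returns true if the whole string is one or more hexadecimal digits.
    static boolean isHex(String s) {
        return s.matches("\\p{XDigit}+");
    }

    public static void main(String[] args) {
        System.out.println(stripPunct("Hello, world!"));          // Hello world
        System.out.println(isHex("ff0A"));                        // true
        System.out.println("abc123".matches("\\p{Alnum}+"));      // true
        // U+00E9 (e with acute accent) is outside US-ASCII, so \p{Alpha} does not match it.
        System.out.println("\u00e9".matches("\\p{Alpha}"));       // false
    }
}
```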