
Implementing string matching in Java (based on regular expressions)

高洛峰 (Original)
2017-01-16 11:05:09

Given a String, how do we check whether it contains the character y or f? The most naive way is:

Program 1: I only know if, for, and charAt()

class Test{
 public static void main(String args[]) {
  String str="For my money, the important thing "+"about the meeting was bridge-building";
  char x='y';
  char y='f';
  boolean result=false;
  for(int i=0;i<str.length();i++){ // length() is a method on String, not a field
   char z=str.charAt(i);
   if(x==z||y==z) { // stop at the first y or f
    result=true;
    break;
   }
  }
  System.out.println(result);
 }
}

This looks intuitive, but the approach copes poorly with more complex cases. For example, does a piece of text contain is? Does it contain thing or ting? Written this way, that is a nasty job.

Java's java.util.regex package

Following the object-oriented idea, it is more natural to encapsulate the string you are looking for (is, thing, or ting) into an object and use that object as a template to match against a piece of text. The thing that serves as the template is the regular expression discussed below. Let's set the complexity aside and look at an example first.

Program 2: I don't understand it yet, but can we take a look first?

import java.util.regex.*;
 
class Regex1{
 public static void main(String args[]) {
  String str="For my money, the important thing "+"about the meeting was bridge-building";
  String regEx="a|f"; // means a or f
  Pattern p=Pattern.compile(regEx);
  Matcher m=p.matcher(str);
  boolean result=m.find();
  System.out.println(result);
 }
}

If some part of str matches regEx, result is true; otherwise it is false. To ignore case when searching, write:

Pattern p=Pattern.compile(regEx,Pattern.CASE_INSENSITIVE);
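To see what the flag changes, here is a small sketch that runs the same search with and without Pattern.CASE_INSENSITIVE. The class name CaseInsensitiveDemo and the helper found() are made up for this illustration; only Pattern.compile, matcher, and find come from java.util.regex.

```java
import java.util.regex.Pattern;

class CaseInsensitiveDemo {
    // Hypothetical helper: true if some part of str matches regEx under the given flags.
    static boolean found(String regEx, int flags, String str) {
        return Pattern.compile(regEx, flags).matcher(str).find();
    }

    public static void main(String[] args) {
        String str = "For my money, the important thing about the meeting was bridge-building";
        // Case-sensitive (flags = 0): the text contains "For" but not "FOR".
        System.out.println(found("FOR", 0, str));
        // Case-insensitive: "FOR" is now allowed to match "For".
        System.out.println(found("FOR", Pattern.CASE_INSENSITIVE, str));
    }
}
```

The first call prints false and the second prints true; nothing else in the program had to change.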

Even without knowing the details of Pattern (the template) and Matcher (the matcher), this program already feels more comfortable. If we first search for is and then for thing or ting, we only need to change the template, with no if statements, for statements, or charAt() calls to rework. The steps are:

1. Write a special string - a regular expression such as a|f.

2. Compile the regular expression into a template: p

3. Use template p to match the string str.
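The three steps can be sketched as one small method, so that changing the query means changing only the regular expression. The class name ThreeStepsDemo and the method search() are assumptions made for this example, not part of the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreeStepsDemo {
    // Step 1 happens at the call site: the caller writes the regular expression.
    static boolean search(String regEx, String str) {
        Pattern p = Pattern.compile(regEx); // step 2: compile the expression into a template
        Matcher m = p.matcher(str);         // step 3: apply the template to the string
        return m.find();
    }

    public static void main(String[] args) {
        String str = "For my money, the important thing about the meeting was bridge-building";
        System.out.println(search("is", str));         // false: no "is" in this text
        System.out.println(search("thing|ting", str)); // true: "thing" appears
    }
}
```

Compared with Program 1, no loop or if statement had to be touched to go from is to thing|ting.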

The idea is clear; now let's see how Java handles it. (Java programmers could not use these classes before JDK 1.4.)

Pattern class and search

 ① public final class java.util.regex.Pattern is the compiled representation of a regular expression. The following statement creates a Pattern object and assigns it to the handle p: Pattern p=Pattern.compile(regEx);

Interestingly, the Pattern class is a final class and its constructor is private. Perhaps someone has told you about design patterns, or you can look up the relevant material yourself. The conclusion here is that the Pattern class cannot be inherited from, and we cannot create a Pattern object with new.

Instead, the Pattern class provides two overloaded static methods whose return value is a Pattern object (reference), such as:

public static Pattern compile(String regex) {
 return new Pattern(regex, 0);
}

Of course, we can declare the handle of the Pattern class, such as Pattern p=null;
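As a rough illustration of the two overloads, the sketch below obtains a Pattern through each of them. The class CompileOverloadsDemo and the helper template() are invented for this example; compile(String), compile(String, int), pattern(), and flags() are the real Pattern methods.

```java
import java.util.regex.Pattern;

class CompileOverloadsDemo {
    // Hypothetical helper that picks one of the two static factory methods.
    static Pattern template(String regEx, boolean ignoreCase) {
        // new Pattern(regEx, 0) would not compile here: the constructor is private.
        return ignoreCase
                ? Pattern.compile(regEx, Pattern.CASE_INSENSITIVE) // overload with flags
                : Pattern.compile(regEx);                          // overload without flags
    }

    public static void main(String[] args) {
        Pattern plain = template("a|f", false);
        Pattern noCase = template("a|f", true);
        // pattern() returns the source expression, flags() the flags it was compiled with.
        System.out.println(plain.pattern() + " flags=" + plain.flags());
        System.out.println(noCase.pattern() + " flags=" + noCase.flags());
    }
}
```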

② p.matcher(str) uses the template p to generate a matcher for the string str; its return value is a reference to a Matcher object. Why do we need this thing? Wouldn't it be more natural to simply return a boolean value?

We can simply use the following method:



boolean result=Pattern.compile(regEx).matcher(str).find();


This is just the three statements merged into one, with no handles. Going without handles is often not a good approach. We will learn about the Matcher class later; let's look at regEx first.
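One reason to keep the handles: a Matcher can report more than a single boolean, and a compiled Pattern can be reused. The following is only a sketch; the class MatcherLoopDemo and the method allMatches() are made-up names, while find(), group(), and start() are real Matcher methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherLoopDemo {
    // Hypothetical helper: collect every match together with the index where it starts.
    static List<String> allMatches(Pattern p, String str) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(str);
        while (m.find()) {                          // each call moves on to the next match
            hits.add(m.group() + "@" + m.start());  // matched text and its start index
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("thing|ting");  // compiled once, used twice below
        System.out.println(allMatches(p, "For my money, the important thing about the meeting"));
        System.out.println(allMatches(p, "nothing to report"));
    }
}
```

With the one-line form, each of these searches would compile the expression again and could only answer yes or no.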

Regular expression quantifiers

A regular expression is a string that generates strings. For example, with String regEx="me+"; the strings that me+ can generate are me, mee, meee, meeeeeeeee, and so on. A regular expression may generate infinitely many strings, so it is impossible (and is it even necessary?) to output everything it produces.

Looking at it the other way around: for the strings me, mee, meee, meeeeeeeeeee, and so on, is there a language that can describe them? Clearly, the regular expression language is that language; it is a concise and profound description of a set of strings.

We use regular expressions for string searching, matching, replacing specified strings, splitting strings, and so on.

The string that generates strings, the regular expression, really is a bit complicated, because we want to describe arbitrary strings accurately using ordinary characters (such as a to z) and special characters (called metacharacters).
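We cannot print every string that me+ generates, but we can ask whether a given string is one of them. A minimal sketch, assuming the made-up names MePlusDemo and describedBy(); Pattern.matches is the real JDK method and requires the whole input to match.

```java
import java.util.regex.Pattern;

class MePlusDemo {
    // Hypothetical helper: true only if the ENTIRE string s is described by regEx.
    static boolean describedBy(String regEx, String s) {
        return Pattern.matches(regEx, s);
    }

    public static void main(String[] args) {
        String[] candidates = {"me", "mee", "meeeeeeeee", "m", "men"};
        for (String s : candidates) {
            // "m" has no e at all and "men" has a trailing n, so both are rejected.
            System.out.println(s + " -> " + describedBy("me+", s));
        }
    }
}
```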
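The four uses just listed can each be tried in a line or two. This is an illustrative sketch only: the class RegexUsesDemo and its two helper methods are invented here, and the sample strings are arbitrary.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class RegexUsesDemo {
    // Hypothetical helper: replace every match of regEx in str.
    static String replaceAll(String regEx, String str, String replacement) {
        return Pattern.compile(regEx).matcher(str).replaceAll(replacement);
    }

    // Hypothetical helper: cut str apart wherever regEx matches.
    static String[] split(String regEx, String str) {
        return Pattern.compile(regEx).split(str);
    }

    public static void main(String[] args) {
        // Searching: is there a match anywhere in the text?
        System.out.println(Pattern.compile("thing|ting").matcher("the important thing").find());
        // Matching: does the whole string fit the expression?
        System.out.println(Pattern.matches("me+", "meee"));
        // Replacing: every run described by me+ becomes ME.
        System.out.println(replaceAll("me+", "mee and meee", "ME"));
        // Splitting: a comma followed by optional whitespace is the separator.
        System.out.println(Arrays.toString(split(",\\s*", "a, b,c")));
    }
}
```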

Let’s look at a few regular expression examples:

Program 3: We always use this program to test regular expressions

import java.util.regex.*;
 
class Regex1{
 public static void main(String args[]) {
  String str="For my money, the important thing ";
  String regEx="ab*";
  boolean result=Pattern.compile(regEx).matcher(str).find();
  System.out.println(result);
 }
} // true

 ① "ab*": matches a, ab, abb, abbb, .... So * means that the preceding character can occur zero or more times. If you only need to search, "a" alone is enough, but think about what happens with replacement. Question: what is the result for regEx="abb*"?

 ② "ab+": matches ab, abb, abbb, .... Equivalent to "abb*". Question: what is the result for regEx="or+"?

 ③ "or?": matches o and or. ? means that the preceding character can appear zero times or once.
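For searching, "a" and "ab*" find something in exactly the same texts, but replacement shows the difference, because what gets replaced is the whole matched run. A small sketch, with the class name QuantifierReplaceDemo, the helper replace(), and the input "abbb ab a" all chosen just for this illustration:

```java
class QuantifierReplaceDemo {
    // Hypothetical helper: replace every match of regEx in input with X.
    static String replace(String regEx, String input) {
        return input.replaceAll(regEx, "X");
    }

    public static void main(String[] args) {
        String input = "abbb ab a";
        System.out.println(replace("a", input));    // Xbbb Xb X : only the a itself is replaced
        System.out.println(replace("ab*", input));  // X X X     : a plus any following b's
        System.out.println(replace("ab+", input));  // X X a     : a lone a does not match
        System.out.println(replace("abb*", input)); // X X a     : same result as ab+
    }
}
```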

The quantifiers *, +, and ? conveniently express the number of occurrences of the preceding character (or substring). Written with {}: x* means zero or more times ≡ x{0,}; x+ means one or more times ≡ x{1,}; x? means zero times or once ≡ x{0,1}.
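The shorthand and the {} form can be compared on a few inputs. This sketch assumes the standard brace forms {0,}, {1,}, and {0,1} of Java regular expressions; the class BraceEquivalenceDemo and the helper sameVerdict() are hypothetical names.

```java
import java.util.regex.Pattern;

class BraceEquivalenceDemo {
    // Hypothetical helper: do both expressions accept or reject the whole input alike?
    static boolean sameVerdict(String shorthand, String braces, String input) {
        return Pattern.matches(shorthand, input) == Pattern.matches(braces, input);
    }

    public static void main(String[] args) {
        String[][] pairs = {{"ab*", "ab{0,}"}, {"ab+", "ab{1,}"}, {"ab?", "ab{0,1}"}};
        String[] inputs = {"a", "ab", "abb", "abbb", "b"};
        for (String[] pair : pairs) {
            for (String input : inputs) {
                // Each line should end in true if the two notations agree on this input.
                System.out.println(pair[0] + " vs " + pair[1] + " on " + input + ": "
                        + sameVerdict(pair[0], pair[1], input));
            }
        }
    }
}
```

Agreement on a handful of inputs is of course not a proof, only a quick sanity check of the notation.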

The above is the entire content of this article. I hope it can help everyone realize the power of regular expressions.
