Parsing text using regular expressions in C++

PHPz (Original)
2023-08-22 14:58:43

Regular expressions are a powerful and flexible tool for matching and searching text patterns. In C++, we can use a regular expression library to parse text.

There are two main choices for regular expression libraries in C++: std::regex and Boost.Regex. Both provide similar interfaces and functionality, but because they are implemented differently, their performance can differ in some cases. Boost.Regex is often regarded as the faster option, although the difference depends on the standard library implementation and the pattern, and it requires adding Boost as a dependency.

In this article, we will introduce how to use std::regex, available in the standard library since C++11 through the <regex> header, to parse text in C++. We will demonstrate through several examples how to match and extract text using different regular expression syntax.

Example 1: Match basic text

In this example, we will match a string that contains "hello".

#include <iostream>
#include <regex>
#include <string>
 
int main() {
    std::string text = "hello world!";
    std::regex pattern("hello");
 
    if (std::regex_search(text, pattern)) {
        std::cout << "Match found!" << std::endl;
    } else {
        std::cout << "Match not found." << std::endl;
    }
 
    return 0;
}

This simple program uses the std::regex_search() function to check whether the pattern "hello" occurs anywhere in text. If a match is found, the program outputs "Match found!"; otherwise it outputs "Match not found.". Note that the text is a std::string and that the regular expression is passed as a string to the std::regex constructor.
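std::regex_search() can also report what it matched, not only whether it matched. The following is a minimal sketch, assuming a C++17 compiler (for std::optional); first_match is a hypothetical helper name introduced here for illustration, not something from the example above.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper (not part of the article's example): returns the first
// substring of `text` that matches `pattern`, or std::nullopt if none does.
std::optional<std::string> first_match(const std::string& text,
                                       const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch match;  // filled in by std::regex_search on success
    if (std::regex_search(text, match, re)) {
        // match.str() is the matched text; match.position() would give
        // its offset within `text`.
        return match.str();
    }
    return std::nullopt;
}
```

Calling first_match("hello world!", "hello") yields "hello", while a text without "hello" yields an empty optional.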

Example 2: Using metacharacters

Metacharacters in regular expressions are characters with special meanings. Here are some of the most commonly used metacharacters and their meanings. Note that in C++ source code a backslash must itself be escaped (for example "\\d") or written inside a raw string literal (for example R"(\d)").

  • . Matches any character except a line terminator.
  • ^ Matches the beginning of the string.
  • $ Matches the end of the string.
  • \d Matches a digit.
  • \w Matches a word character (letter, digit, or underscore).
  • \s Matches a whitespace character (space, tab, etc.).

In the example below, we will match any string that starts with "hello".

#include <iostream>
#include <regex>
#include <string>
 
int main() {
    std::string text1 = "hello world!";
    std::string text2 = "world hello!";
    std::regex pattern("^hello");
 
    if (std::regex_search(text1, pattern)) {
        std::cout << "Match found in text1!" << std::endl;
    }
 
    if (std::regex_search(text2, pattern)) {
        std::cout << "Match found in text2!" << std::endl;
    }
 
    return 0;
}

In this example, the metacharacter "^" anchors the pattern to the beginning of the string. The first text, "hello world!", starts with "hello", so the program outputs "Match found in text1!". In the second text, "world hello!", "hello" does not appear at the beginning, so the pattern does not match and the program prints nothing for text2.
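The character classes from the list above (\d, \w, \s) can be combined with the anchors "^" and "$". The sketch below is one possible illustration; is_name_number_pair is a hypothetical helper name, and the "name followed by a number" format is an assumption chosen only for the example. It uses a raw string literal so the backslashes reach the regex engine unchanged.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `line` consists of a word, some whitespace,
// and a run of digits, e.g. "port 8080". The raw string literal R"(...)"
// avoids having to write "\\w", "\\s" and "\\d".
bool is_name_number_pair(const std::string& line) {
    static const std::regex pattern(R"(^\w+\s+\d+$)");
    return std::regex_search(line, pattern);
}
```

Because the pattern is anchored at both ends, a line with leading whitespace or trailing non-digit characters is rejected.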

Example 3: Using quantifiers

Quantifiers in regular expressions specify how many times the preceding pattern may repeat. Here are some of the most commonly used quantifiers and their meanings:

  • * Matches the preceding pattern zero or more times.
  • + Matches the preceding pattern one or more times.
  • ? Matches the preceding pattern zero or one time.
  • {n} Matches the preceding pattern exactly n times.
  • {n,} Matches the preceding pattern at least n times.
  • {n,m} Matches the preceding pattern at least n times, but not more than m times.

In the following example, we will use the quantifier "+" to match one or more digits.

#include <iostream>
#include <regex>
#include <string>
 
int main() {
    std::string text1 = "1234";
    std::string text2 = "a1234";
    // The backslash is doubled so that the regex engine receives \d+
    std::regex pattern("\\d+");
 
    if (std::regex_search(text1, pattern)) {
        std::cout << "Match found in text1!" << std::endl;
    }
 
    if (std::regex_search(text2, pattern)) {
        std::cout << "Match found in text2!" << std::endl;
    }
 
    return 0;
}

In this example, we use the regular expression "\d+" to match one or more digits. In the first text "1234", the regular expression matches the entire string, so the program will output "Match found in text1!". In the second text "a1234", std::regex_search() still succeeds because the pattern matches the numeric substring "1234", so the program will output "Match found in text2!".
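The counted quantifiers {n} behave the same way, and the difference between matching a substring and matching the whole string can be made explicit by choosing between std::regex_search() and std::regex_match(). The following sketch assumes a date-like format of four, two, and two digits purely for illustration; both function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole string has the shape YYYY-MM-DD.
// It checks only the digit counts, not whether the date actually exists.
bool looks_like_iso_date(const std::string& s) {
    static const std::regex pattern(R"(\d{4}-\d{2}-\d{2})");
    return std::regex_match(s, pattern);  // the entire string must match
}

// Same pattern with std::regex_search: a match anywhere in the string counts.
bool contains_iso_date(const std::string& s) {
    static const std::regex pattern(R"(\d{4}-\d{2}-\d{2})");
    return std::regex_search(s, pattern);
}
```

For the input "released 2023-08-22", contains_iso_date returns true while looks_like_iso_date returns false, which mirrors the "a1234" case above.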

Example 4: Using Grouping

Grouping in regular expressions allows us to split a pattern into sub-patterns, which are written in parentheses. Combined with the alternation operator "|", a group matches if any one of its alternatives matches. In the example below, we will match strings containing "hello" or "world".

#include <iostream>
#include <regex>
#include <string>
 
int main() {
    std::string text1 = "hello";
    std::string text2 = "world";
    std::string text3 = "hello world!";
    std::regex pattern("(hello|world)");
 
    if (std::regex_search(text1, pattern)) {
        std::cout << "Match found in text1!" << std::endl;
    }
 
    if (std::regex_search(text2, pattern)) {
        std::cout << "Match found in text2!" << std::endl;
    }
 
    if (std::regex_search(text3, pattern)) {
        std::cout << "Match found in text3!" << std::endl;
    }
 
    return 0;
}

In this example, we use the regular expression "(hello|world)", a single group with two alternatives, "hello" and "world". In the first text "hello", the first alternative matches, so the program will output "Match found in text1!". In the second text "world", the second alternative matches, so the program will output "Match found in text2!". In the third text "hello world!", both words are present; std::regex_search() stops at the first match it finds, "hello", so the program will output "Match found in text3!".
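Parentheses also capture the text they match, which is what makes groups useful for extracting fields. The sketch below is a hedged illustration, assuming C++17 and a simple "key=value" line format that is not taken from the article; parse_key_value is a hypothetical helper name.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: splits a line of the form "key=value" into its two
// parts using capture groups. Returns std::nullopt if the line does not fit.
std::optional<std::pair<std::string, std::string>>
parse_key_value(const std::string& line) {
    static const std::regex pattern(R"((\w+)=(\w+))");
    std::smatch match;
    if (std::regex_match(line, match, pattern)) {
        // match[0] is the whole match; match[1] and match[2] are the
        // first and second parenthesized groups.
        return std::make_pair(match[1].str(), match[2].str());
    }
    return std::nullopt;
}
```

For the input "mode=fast", the returned pair holds "mode" and "fast"; a line without "=" produces an empty optional.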

Summary

In this article, we introduced how to use regular expressions to parse text in C++. We covered some of the most commonly used regular expression syntax, including metacharacters, quantifiers, and grouping. Hopefully these examples will help you better understand how to use regular expressions to process text data.
