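How do I extract specific content from a txt file in C++ and save it to another txt file?

File A contains the following:

[fullText]abcd[rating]
[fullText]efg[rating]

I want to extract the content between [fullText] and [rating] and save it to file B,
with the content from different tag pairs separated by spaces.
How should I write this?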

怪我咯 · 2814 days ago · 620

All replies (3)

  • 黄舟

    黄舟 2017-04-17 11:50:06

    The idea is similar to @snailcoder.

    It is better to teach a man to fish than to give him a fish, so I would like to share the usual way of thinking about this kind of problem.

    1. First look at the structure of [fullText]efg[rating]. It consists of an opening tag and a closing tag with the content sandwiched between them. Extracting the content between two specific tags therefore requires some form of string matching.

    2. Matching becomes easier if [fullText]efg[rating] is first split into three parts: fullText, efg, and rating. Compare the first and third parts against the expected tag names; if both match, extract the second part.

    3. The rest is basic file handling: read A.txt line by line and write each result to B.txt, separated by spaces.
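
Step 2 can be tried on a single in-memory line before any file handling is added. The sketch below is only an illustration: `tokens_of` is a hypothetical helper name, not the function used later in this answer, and it assumes the tags are delimited by `[` and `]`.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: cuts `line` at every character found in `delims`
// and returns the non-empty pieces in order.
std::vector<std::string> tokens_of(const std::string &line,
                                   const std::string &delims) {
    std::vector<std::string> parts;
    std::string::size_type begin = 0;
    // Jump to the start of the next token, skipping any run of delimiters.
    while ((begin = line.find_first_not_of(delims, begin)) != std::string::npos) {
        const auto end = line.find_first_of(delims, begin);
        // When end is npos, substr simply takes the rest of the line.
        parts.push_back(line.substr(begin, end - begin));
        if (end == std::string::npos) break;
        begin = end;
    }
    return parts;
}
```

For `[fullText]efg[rating]` this yields the three tokens `fullText`, `efg` and `rating`. If the content itself contains `[` or `]`, for example `[fullText]a[b]c[rating]`, the line splits into five tokens, so a strict three-token check would skip it.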


    Broken down, the task needs two key capabilities:

    1. Reading and writing files
    2. Splitting strings

    For 2, I suggest the asker read an article I once wrote summarizing string splitting techniques. For 1, these are basic C++ operations.

    Write a simple pickup function:

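    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>
    using namespace std;

    // Copies the text between label_front and label_back from each matching line
    // of `source` into `dest`, separated by spaces. Relies on split(), shown
    // below, which must be declared before this function when compiling.
    bool pickup(const string &source, const string &dest, const string &label_front, const string &label_back) {
        ifstream ifs( source );
        if ( ifs.fail() ) return false;
    
        ofstream ofs( dest );
        if ( ofs.fail() ) return false;
    
        for ( string line; std::getline(ifs, line); ) {
            vector<string> content;
            // Expect exactly three tokens: opening tag, content, closing tag.
            if ( 3 == split( line, "[]", content ).size() && content[0] == label_front && content[2] == label_back ) 
                ofs << content[1] << " ";
        }
    
        return true;
    }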
    

    For string splitting, I directly used the function in the article mentioned above:

    vector<string> &split( const string &str, const string &delimiters, vector<string> &elems, bool skip_empty = true ) {
        string::size_type pos, prev = 0;
        while ( ( pos = str.find_first_of(delimiters, prev) ) != string::npos ) {
            // Keep the token unless it is empty and empty tokens are being skipped.
            // (Breaking out of the loop on one-character tokens would lose content such as "a".)
            if ( pos > prev || !skip_empty )
                elems.emplace_back( str, prev, pos - prev );
            prev = pos + 1;
        }
        // Append whatever follows the last delimiter.
        if ( prev < str.size() ) elems.emplace_back( str, prev, str.size() - prev );
        return elems;
    }
    

    Finally you can call pickup to check whether a B.txt that meets the requirements is generated:

    int main()
    {
        if ( pickup("A.txt", "B.txt", "fullText", "rating") )
            std::cout << "pickup success!" << std::endl;
    }
    

    For the complete code, please see: https://gist.github.com/pezy/7d9fb9fa74eebe819eba

  • 天蓬老师

    天蓬老师 2017-04-17 11:50:06

    With regular expressions it takes only a few lines (reads from standard input, writes to standard output):

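    #include <iostream>
    #include <regex>
    #include <string>
    int main() {
        // Raw string literal, so the backslashes reach the regex engine intact.
        const std::regex r(R"(\[fullText\](.*?)\[rating\])");
        std::string l;
        std::smatch m;
        // Loop on getline itself so a failed final read is not processed.
        while (std::getline(std::cin, l)) {
            // Print only the captured text between the tags, followed by a space.
            if (std::regex_search(l, m, r))
                std::cout << m[1] << ' ';
        }
    }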
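
If one line may hold more than one tag pair, a regex iterator can collect every match. This is a sketch under that assumption, not part of the original reply; `extract_all` is a hypothetical helper name and the tag names are hard-coded from the question.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every piece of text found between
// [fullText] and [rating] in `line`, in order of appearance.
std::vector<std::string> extract_all(const std::string &line) {
    // Raw string literal keeps the backslashes; .*? is non-greedy so that
    // two adjacent tag pairs are matched separately rather than as one.
    static const std::regex tag_pair(R"(\[fullText\](.*?)\[rating\])");
    std::vector<std::string> found;
    for (std::sregex_iterator it(line.begin(), line.end(), tag_pair), end;
         it != end; ++it) {
        found.push_back((*it)[1].str());  // capture group 1 is the content
    }
    return found;
}
```

Calling this on each line read with `std::getline` and writing every returned element followed by a space would give the space-separated output the question asks for.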
    

    If you don’t insist on using C++, it can be shorter:

    perl -pe 's/\[fullText\](.*?)\[rating\]\n?/$1 /g' A.txt > B.txt
    

  • PHPz

    PHPz 2017-04-17 11:50:06

    The logic is simple; it only takes basic string and file operations. The code below does what you asked, with little attention paid to error handling or efficiency. Adapt it as needed.

    #include <iostream>
    #include <fstream>
    #include <string>
    
    using namespace std;
    
    class Solution {
     public:
      int ProcessFile(const string &src_file, 
                      const string &dest_file,
                      const string &head, 
                      const string &end) {
        ifstream input(src_file.c_str(), ifstream::in);
        if (!input) {
          return -1;
        }
        ofstream output(dest_file.c_str(), ofstream::out);
        if (!output) {
          return -1;
        }
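        string line;
        const string::size_type head_len = head.length();
        while (getline(input, line)) {
          string::size_type head_pos = line.find(head, 0);
          if (head_pos == string::npos) continue;  // no opening tag on this line
          string::size_type end_pos = line.find(end, head_pos + head_len);
          if (end_pos == string::npos) continue;   // no closing tag after it
          output << line.substr(head_pos + head_len, 
                                end_pos - head_pos - head_len) << ' ';
        }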
        input.close();
        output.close();
        return 0;
      }
    };
    int main() {
      string src_file = "input.txt", dest_file = "output.txt";
      string head_name = "[fullText]", end_name = "[rating]";
      Solution sln;
      if (sln.ProcessFile(src_file, dest_file, head_name, end_name) < 0) {
        cout << "Fail..." << endl;
      } else {
        cout << "Success..." << endl;
      }
      return 0;
    }
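
To check the extraction loop without touching the disk, the same logic can be written against generic streams. A minimal sketch, assuming one tag pair per line; `extract_stream` is a hypothetical name and not part of the reply above.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical variant of the loop above, written against std::istream and
// std::ostream so it accepts file streams and in-memory string streams alike.
// Lines that lack either tag are skipped.
void extract_stream(std::istream &in, std::ostream &out,
                    const std::string &head, const std::string &end) {
    std::string line;
    while (std::getline(in, line)) {
        const auto head_pos = line.find(head);
        if (head_pos == std::string::npos) continue;
        const auto begin = head_pos + head.size();  // first character after the opening tag
        const auto end_pos = line.find(end, begin);
        if (end_pos == std::string::npos) continue;
        out << line.substr(begin, end_pos - begin) << ' ';
    }
}
```

Passing a `std::istringstream` holding the two sample lines from the question and a `std::ostringstream` as the output leaves `abcd efg ` in the output stream; passing `std::ifstream` and `std::ofstream` objects gives the file-to-file behavior.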
    
