
How to use C++ multithreading to process large amounts of data?

王林 (Original)
2024-06-06 12:35:58

Using multi-threading to process large amounts of data in C++ can significantly improve performance. The steps are as follows. First, create a thread pool (a group of threads created in advance). Second, distribute data and tasks to the threads: either a queue stores the data and the threads take items from it, or an atomic counter tracks the unprocessed data and each thread increments the counter to claim its next item. Third, define the data processing logic (code that processes data, such as sorting, aggregation, or other calculations). Practical case: reading a large amount of data from a file and printing it on the screen.

How to use multi-threading in C++ to process large amounts of data

When processing large amounts of data, multi-threading can significantly improve performance, provided the work can be split into independent pieces that run on separate CPU cores. This article walks through using multithreading in C++ and provides a practical example of working with a large amount of data.

Create a thread pool

A thread pool is a group of threads created in advance, so the program does not pay the cost of creating and destroying a thread for every task. In C++, a basic thread pool can be built from std::thread (header <thread>) and std::atomic (header <atomic>):

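#include <atomic>
#include <thread>
#include <vector>

std::atomic<bool> stop{false};      // set to true to ask the workers to exit
std::vector<std::thread> workers;   // the threads that make up the pool

void WorkerThread() {
  while (!stop.load()) {
    // Put the data processing logic here.
    std::this_thread::yield();  // avoid a hot spin while there is no work
  }
}

void CreateThreadPool(int num_threads) {
  workers.reserve(num_threads);
  for (int i = 0; i < num_threads; ++i) {
    workers.emplace_back(WorkerThread);
  }
}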

Distributing data and tasks

Work can be handed to the thread pool in several ways. One is to store the data in a queue and have each thread take items from it; because several threads access the queue at once, access must be synchronized (for example with a mutex). Another is to keep an atomic counter that tracks how much data is still unprocessed: each thread atomically increments the counter to claim the item (or chunk) it will process next.
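The queue approach can be sketched as follows. This is a minimal illustration, not the article's own implementation: the WorkQueue class and the SumOfSquares function are hypothetical names introduced here, and the work items are assumed to be plain int values. Workers block on a condition variable until an item arrives or the queue is closed.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical helper: a minimal blocking queue of int work items.
class WorkQueue {
 public:
  void Push(int item) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      items_.push(item);
    }
    cv_.notify_one();
  }

  // Signals that no more items will be pushed.
  void Close() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      closed_ = true;
    }
    cv_.notify_all();
  }

  // Blocks until an item is available. Returns false once the queue is
  // closed and fully drained.
  bool Pop(int& out) {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return closed_ || !items_.empty(); });
    if (items_.empty()) return false;
    out = items_.front();
    items_.pop();
    return true;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<int> items_;
  bool closed_ = false;
};

// Squares every value in data on num_threads workers and returns the sum
// of the squares.
long long SumOfSquares(const std::vector<int>& data, int num_threads) {
  assert(num_threads > 0);
  WorkQueue queue;
  std::mutex total_mutex;
  long long total = 0;

  std::vector<std::thread> pool;
  pool.reserve(static_cast<std::size_t>(num_threads));
  for (int i = 0; i < num_threads; ++i) {
    pool.emplace_back([&] {
      long long local = 0;  // accumulate privately to keep locking rare
      int value = 0;
      while (queue.Pop(value)) {
        local += static_cast<long long>(value) * value;
      }
      std::lock_guard<std::mutex> lock(total_mutex);
      total += local;
    });
  }

  for (int v : data) queue.Push(v);
  queue.Close();  // lets the workers leave their loops
  for (auto& t : pool) t.join();
  return total;
}
```

Calling SumOfSquares with a vector of values and a thread count returns the aggregated result once all workers have drained the queue. A queue like this suits work that arrives over time; for a fixed array already in memory, the per-item locking may cost more than the work itself.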

Data processing logic

Data processing logic is defined in the WorkerThread function. You can use any code that processes data, such as sorting, aggregation, or other calculations.
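As one possible illustration of aggregation logic combined with the atomic-counter approach, the sketch below sums a vector in parallel. The function name ParallelSum and the chunk_size parameter are assumptions made for this example, not part of the original article. Each thread claims the next chunk by calling fetch_add on a shared index, so no mutex is needed to divide the work.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums data by letting each thread claim fixed-size
// chunks through a shared atomic index.
long long ParallelSum(const std::vector<int>& data, int num_threads,
                      std::size_t chunk_size = 1024) {
  assert(num_threads > 0 && chunk_size > 0);
  std::atomic<std::size_t> next{0};  // start index of the next unclaimed chunk
  std::atomic<long long> total{0};

  auto worker = [&] {
    long long local = 0;
    for (;;) {
      // fetch_add returns the previous value, so every chunk is claimed
      // by exactly one thread.
      const std::size_t begin = next.fetch_add(chunk_size);
      if (begin >= data.size()) break;
      const std::size_t end = std::min(begin + chunk_size, data.size());
      local = std::accumulate(
          data.begin() + static_cast<std::ptrdiff_t>(begin),
          data.begin() + static_cast<std::ptrdiff_t>(end), local);
    }
    total.fetch_add(local);  // publish this thread's partial sum once
  };

  std::vector<std::thread> threads;
  threads.reserve(static_cast<std::size_t>(num_threads));
  for (int i = 0; i < num_threads; ++i) threads.emplace_back(worker);
  for (auto& t : threads) t.join();
  return total.load();
}
```

Claiming chunks rather than single elements keeps contention on the counter low. The chunk size of 1024 is an arbitrary default for this sketch and would normally be tuned to the cost of processing one element.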

Practical case: file reading

This example starts the thread pool, reads a large file on a separate thread, and prints each line to the screen.

#include <atomic>
#include <fstream>
#include <functional>  // std::ref
#include <iostream>
#include <string>
#include <thread>

// Reads the file line by line, prints each line, and counts the lines read.
void ReadFile(const std::string& filename, std::atomic<int>& num_lines) {
  std::ifstream file(filename);
  if (file.is_open()) {
    std::string line;
    while (std::getline(file, line)) {
      std::cout << line << '\n';
      num_lines++;
    }
  }
}

int main() {
  const std::string filename = "data.txt";
  int num_threads = 4;
  std::atomic<int> num_lines{0};

  // The pool workers stay idle here; only file_reader touches the file.
  CreateThreadPool(num_threads);

  std::thread file_reader(ReadFile, filename, std::ref(num_lines));

  // Make the main thread wait for the reader thread to finish.
  file_reader.join();

  std::cout << "Total lines: " << num_lines << std::endl;

  // Stop the thread pool.
  stop.store(true);
  for (auto& worker : workers) {
    worker.join();
  }

  return 0;
}

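In this example, a single dedicated thread (file_reader) reads the file line by line and prints each line to the screen. The worker threads in the pool only loop until stop is set, because WorkerThread does not contain any processing logic yet. The atomic counter num_lines tracks the number of lines read so far.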

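Reading a single file is sequential, so this example by itself is not faster than a single-threaded version. Multi-threading reduces the total time when the work done on each line is expensive enough to be split across the worker threads.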
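One way to parallelize the per-line work, sketched under stated assumptions: read the lines sequentially, then give each thread its own contiguous slice of the lines to process. ReadLines and CountMatchingLines are hypothetical helpers introduced for this illustration, and "count the lines that contain a substring" stands in for whatever processing the data actually needs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: reads every line of a file into memory (sequential).
// Returns an empty vector if the file cannot be opened.
std::vector<std::string> ReadLines(const std::string& filename) {
  std::vector<std::string> lines;
  std::ifstream file(filename);
  std::string line;
  while (std::getline(file, line)) {
    lines.push_back(line);
  }
  return lines;
}

// Hypothetical helper: counts the lines containing needle, giving each
// thread one contiguous slice of the vector.
std::size_t CountMatchingLines(const std::vector<std::string>& lines,
                               const std::string& needle, int num_threads) {
  assert(num_threads > 0);
  const std::size_t n = static_cast<std::size_t>(num_threads);
  const std::size_t slice = (lines.size() + n - 1) / n;  // ceiling division
  std::vector<std::size_t> partial(n, 0);  // one result slot per thread
  std::vector<std::thread> threads;
  threads.reserve(n);

  for (std::size_t t = 0; t < n; ++t) {
    const std::size_t begin = std::min(t * slice, lines.size());
    const std::size_t end = std::min(begin + slice, lines.size());
    threads.emplace_back([&lines, &needle, &partial, t, begin, end] {
      std::size_t count = 0;
      for (std::size_t i = begin; i < end; ++i) {
        if (lines[i].find(needle) != std::string::npos) ++count;
      }
      partial[t] = count;  // each thread writes only its own slot
    });
  }
  for (auto& th : threads) th.join();

  std::size_t total = 0;
  for (std::size_t c : partial) total += c;
  return total;
}
```

A caller would pass the result of ReadLines("data.txt") to CountMatchingLines together with a search string and a thread count. This sketch loads the whole file into memory first, which is an assumption that only holds when the file fits in RAM; for larger inputs the lines could instead be streamed to the workers in batches.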
