Using multithreading to process large amounts of data in C++ can significantly improve performance. The steps are: create a thread pool (a group of threads created in advance); distribute data and tasks to the threads, either through a queue that the threads read from atomically or through an atomic counter that tracks unprocessed data and that each thread increments to claim work; and define the data processing logic (code that processes the data, such as sorting, aggregation, or other calculations). Practical case: reading a large amount of data from a file and printing it on the screen.
How to use multi-threading in C++ to process large amounts of data
When processing large amounts of data, multithreading can significantly improve performance on multi-core hardware, provided the work can be split into independent pieces. This article walks through using multithreading in C++ and gives a practical example of working with large amounts of data.
Create a thread pool
A thread pool is a group of threads created in advance, so the program avoids the cost of creating and destroying a thread for every task. The C++ standard library does not ship a ready-made thread pool, but a simple one can be built from std::thread and std::atomic (headers <thread> and <atomic>):
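    #include <atomic>
    #include <thread>
    #include <vector>

    std::atomic<bool> stop{false};     // set to true to ask the workers to exit
    std::vector<std::thread> workers;  // the threads that make up the pool

    // Runs on every pool thread; loops until stop is set.
    void WorkerThread() {
        while (!stop.load()) {
            // Put the data processing logic here
        }
    }

    // Starts num_threads workers and stores them in the pool.
    void CreateThreadPool(int num_threads) {
        workers.reserve(num_threads);
        for (int i = 0; i < num_threads; ++i) {
            workers.emplace_back(WorkerThread);
        }
    }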
Distributing data and tasks
Tasks assigned to the thread pool can take many forms. You can store the data in a queue and have each thread read items from the queue. Another approach is to use an atomic counter that tracks which data has not yet been processed, with each thread atomically incrementing the counter to claim the next item.
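The atomic counter approach can be sketched as follows. This is a minimal illustration, not the article's own implementation: the helper name SquareAllParallel and the choice of squaring each element are assumptions made for the example. Each thread calls fetch_add on a shared counter, which returns the previous value, so every index is claimed by exactly one thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: squares every element of `data` in place, using
// `num_threads` workers that claim indices from a shared atomic counter.
void SquareAllParallel(std::vector<long long>& data, int num_threads) {
    assert(num_threads > 0);
    std::atomic<std::size_t> next_index{0};  // next unclaimed element

    auto worker = [&data, &next_index]() {
        for (;;) {
            // fetch_add returns the value before the increment, so each
            // index is handed to exactly one thread.
            const std::size_t i = next_index.fetch_add(1);
            if (i >= data.size()) break;  // nothing left to claim
            data[i] *= data[i];           // stand-in for real processing
        }
    };

    std::vector<std::thread> threads;
    threads.reserve(static_cast<std::size_t>(num_threads));
    for (int t = 0; t < num_threads; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
}
```

Because the threads write to distinct elements, no lock is needed around the update itself. Claiming one element at a time keeps the sketch short; for cheap per-item work, claiming a block of indices per fetch_add would likely reduce contention on the counter.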
Data processing logic
The data processing logic is defined in the WorkerThread function. It can be any code that processes data, such as sorting, aggregation, or other calculations.
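As one example of aggregation logic, the sketch below sums a vector by giving each thread a contiguous chunk and combining the partial results after all threads have joined. The helper name ParallelSum and the chunking scheme are assumptions for illustration, not something prescribed by the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` by giving each thread one contiguous chunk.
long long ParallelSum(const std::vector<int>& data, int num_threads) {
    assert(num_threads > 0);
    const std::size_t n = static_cast<std::size_t>(num_threads);
    std::vector<long long> partial(n, 0);  // one result slot per thread
    std::vector<std::thread> threads;
    threads.reserve(n);

    // Round up so that n chunks always cover the whole vector.
    const std::size_t chunk = (data.size() + n - 1) / n;
    for (std::size_t t = 0; t < n; ++t) {
        const std::size_t first = std::min(data.size(), t * chunk);
        const std::size_t last = std::min(data.size(), first + chunk);
        threads.emplace_back([&data, &partial, t, first, last]() {
            // Each thread writes only its own slot, so no locking is needed.
            partial[t] = std::accumulate(
                data.begin() + static_cast<std::ptrdiff_t>(first),
                data.begin() + static_cast<std::ptrdiff_t>(last), 0LL);
        });
    }
    for (auto& th : threads) th.join();

    // Combine the per-thread results on the calling thread.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Keeping a separate slot per thread avoids a shared accumulator, at the cost of one short sequential pass at the end.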
Practical case: file reading
This example uses a separate thread to read a large amount of data from a file and print it to the screen.
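    #include <atomic>
    #include <fstream>
    #include <functional>  // std::ref
    #include <iostream>
    #include <string>
    #include <thread>
    #include <vector>

    // stop, workers, and CreateThreadPool are the definitions from the
    // thread pool listing earlier in this article.

    // Reads the file line by line, prints each line, and counts it.
    void ReadFile(std::string filename, std::atomic<int>& num_lines) {
        std::ifstream file(filename);
        if (file.is_open()) {
            std::string line;
            while (std::getline(file, line)) {
                std::cout << line << '\n';
                num_lines++;
            }
        }
    }

    int main() {
        const std::string filename = "data.txt";
        int num_threads = 4;
        std::atomic<int> num_lines{0};
        CreateThreadPool(num_threads);
        std::thread file_reader(ReadFile, filename, std::ref(num_lines));
        // Have the main thread wait for the reader thread to finish
        file_reader.join();
        std::cout << "Total lines: " << num_lines << std::endl;
        // Stop the thread pool
        stop.store(true);
        for (auto& worker : workers) {
            worker.join();
        }
        return 0;
    }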
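In this example, a dedicated file_reader thread reads the file line by line and prints each line to the screen, while the main thread waits for it to finish. The atomic counter num_lines tracks the number of lines read so far. The pool's worker threads are started but stay idle until processing logic is added to WorkerThread.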
As written, the file is still read sequentially by a single thread, so the time saving comes once the pool workers process the lines in parallel. This helps most when the per-line processing costs more than reading the line itself.
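To have the workers actually share the per-line work, one option is the queue approach described earlier: a single reader pushes lines into a queue and several workers pop and process them. The sketch below is a minimal version under stated assumptions: the names LineStats and ProcessFileParallel are hypothetical, and counting lines and characters stands in for real processing.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical result type: totals accumulated by the workers.
struct LineStats {
    std::size_t lines = 0;
    std::size_t chars = 0;
};

// Hypothetical helper: the calling thread reads the file and feeds a queue;
// `num_workers` threads pop lines from the queue and process them.
LineStats ProcessFileParallel(const std::string& filename, int num_workers) {
    assert(num_workers > 0);
    std::queue<std::string> pending;  // lines waiting to be processed
    std::mutex mtx;                   // guards pending, done, and totals
    std::condition_variable ready;    // signals new work or shutdown
    bool done = false;
    LineStats totals;

    auto worker = [&]() {
        for (;;) {
            std::string line;
            {
                std::unique_lock<std::mutex> lock(mtx);
                ready.wait(lock, [&] { return done || !pending.empty(); });
                if (pending.empty()) return;  // reader finished, queue drained
                line = std::move(pending.front());
                pending.pop();
            }
            // Stand-in for real processing, done outside the lock.
            const std::size_t length = line.size();
            std::lock_guard<std::mutex> lock(mtx);
            ++totals.lines;
            totals.chars += length;
        }
    };

    std::vector<std::thread> workers;
    for (int i = 0; i < num_workers; ++i) workers.emplace_back(worker);

    std::ifstream file(filename);
    std::string line;
    while (std::getline(file, line)) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            pending.push(std::move(line));
        }
        ready.notify_one();
    }
    {
        std::lock_guard<std::mutex> lock(mtx);
        done = true;  // tell the workers that no more lines will arrive
    }
    ready.notify_all();
    for (auto& w : workers) w.join();
    return totals;
}
```

Calling ProcessFileParallel("data.txt", 4) would process the article's example file with four workers. Unlike the spinning loop in WorkerThread, the workers here sleep on the condition variable while the queue is empty, and they exit only after the reader has finished and the queue has been drained. If the file cannot be opened, the function returns zero totals.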