
C++ multi-threaded programming summary

黄舟 (original), 2017-02-06 14:02:11

C++ programs usually face demanding requirements for throughput, concurrency, and real-time performance. In summary, the efficiency of a C++ program can be improved in three ways:

  • Concurrency

  • Asynchronous

  • Caching

The following examples come from problems I have run into in my daily work. The design ideas behind them all come down to the three points above.

1 Task Queue

1.1 Design the task queue based on the producer-consumer model

The producer-consumer model is one that most people know well. For example, in a server program, when the logic module modifies user data, it generates a task to update the database (produce) and delivers it to the IO module's task queue. The IO module takes the task out of the queue and performs the SQL operation (consume).

A general-purpose task queue can be designed as in the sample code below. For the detailed implementation, see:

http://ffown.googlecode.com/svn/trunk/fflib/include/detail/task_queue_impl.h

void task_queue_t::produce(const task_t& task_) {
    lock_guard_t lock(m_mutex);
    if (m_tasklist->empty()) {
        //! The queue is about to become non-empty: wake up a waiting thread
        m_cond.signal();
    }
    m_tasklist->push_back(task_);
}

int task_queue_t::consume(task_t& task_) {
    lock_guard_t lock(m_mutex);
    //! While there are no tasks, wait until woken up
    while (m_tasklist->empty()) {
        //! The queue has been closed: stop waiting
        if (false == m_flag) {
            return -1;
        }
        m_cond.wait();
    }
    task_ = m_tasklist->front();
    m_tasklist->pop_front();
    return 0;
}
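For readers without fflib at hand, the same producer-consumer queue can be sketched with the C++11 standard library. This is a minimal sketch, not the fflib implementation: the class name simple_task_queue and the close() method are assumptions made for illustration, and std::function<void()> stands in for task_t.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the article's task_queue_t, built on the
// C++11 standard library instead of fflib's own lock and condition types.
class simple_task_queue {
public:
    typedef std::function<void()> task_t;

    // Producer side: enqueue a task and wake one waiting consumer.
    void produce(task_t task) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_tasks.push_back(std::move(task));
        }
        m_cond.notify_one();
    }

    // Consumer side: block until a task is available or the queue is closed.
    // Returns 0 on success and -1 once the queue is closed and drained.
    int consume(task_t& task) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [this] { return !m_tasks.empty() || !m_open; });
        if (m_tasks.empty()) {
            return -1;
        }
        task = std::move(m_tasks.front());
        m_tasks.pop_front();
        return 0;
    }

    // Close the queue and wake every consumer so that it can exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_open = false;
        }
        m_cond.notify_all();
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::deque<task_t> m_tasks;
    bool m_open = true;
};
```

A consumer thread would typically loop on consume() and run each task until it returns -1.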


1.2 Tips for using the task queue

1.2.1 Separation of IO and logic

For example, in an online game server program, the network module receives a message packet, delivers it to the logic layer, returns immediately, and goes on to accept the next packet. The logic thread runs in an environment free of IO operations, which ensures real-time performance. Example:

void handle_xx_msg(long uid, const xx_msg_t& msg) {
    //! assumes service_t::process is a static member function
    logic_task_queue->post(boost::bind(&service_t::process, uid, msg));
}


Note that this mode uses a single task queue, and the queue is served by a single thread.
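The single-queue, single-thread mode can be sketched as one worker object that owns both the queue and the thread. This is a minimal sketch assuming C++11; logic_worker is a hypothetical name, not a class from the article or from fflib.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-threaded logic worker: post() returns immediately,
// and the tasks run one by one, in order, on the worker's own thread.
class logic_worker {
public:
    logic_worker() : m_stop(false), m_thread(&logic_worker::run, this) {}

    // Ask the thread to stop; tasks already posted are still executed.
    ~logic_worker() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stop = true;
        }
        m_cond.notify_one();
        m_thread.join();
    }

    // Called by the network (IO) side; never waits for a task to run.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_tasks.push_back(std::move(task));
        }
        m_cond.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cond.wait(lock, [this] { return m_stop || !m_tasks.empty(); });
                if (m_tasks.empty()) {
                    return;  // stop requested and nothing left to do
                }
                task = std::move(m_tasks.front());
                m_tasks.pop_front();
            }
            task();  // run outside the lock so that post() is never blocked by a task
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::deque<std::function<void()>> m_tasks;
    bool m_stop;
    std::thread m_thread;  // declared last so the other members exist before it starts
};
```

Because only one thread ever runs the tasks, the logic code itself needs no locks.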

1.2.2 Parallel Pipeline

The approach above only parallelizes IO and CPU operations; the logic operations on the CPU are still serial. In some cases the CPU logic can be parallelized as well. For example, in a game, the operations of user A and user B can run completely in parallel because they share no data. The simplest way is to assign the operations of A and B to different task queues. An example is as follows:

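void handle_xx_msg(long uid, const xx_msg_t& msg) {
    //! sizeof(array) is a byte count; divide by the element size to get the queue count
    size_t queue_num = sizeof(logic_task_queue_array) / sizeof(logic_task_queue_array[0]);
    logic_task_queue_array[uid % queue_num]->post(
        boost::bind(&service_t::process, uid, msg));
}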


Note that this mode uses multiple task queues, and each queue is served by a single thread.
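The queue selection deserves a closer look, because the index must be computed from the number of queues and not from the byte size of the array. A minimal sketch follows; array_size and queue_index are hypothetical helper names introduced here for illustration.

```cpp
#include <cstddef>

// Number of elements in a built-in array, deduced at compile time.
// Unlike sizeof(array), this is an element count, not a byte count.
template <typename T, std::size_t N>
std::size_t array_size(T (&)[N]) {
    return N;
}

// Map a user id onto one of queue_count task queues. Every message of one
// user lands on the same queue, so that user's operations stay ordered.
inline std::size_t queue_index(long uid, std::size_t queue_count) {
    // Convert to unsigned first so that a negative uid cannot yield a negative index.
    return static_cast<std::size_t>(static_cast<unsigned long>(uid) % queue_count);
}
```

With these helpers, the dispatch line would read logic_task_queue_array[queue_index(uid, array_size(logic_task_queue_array))]->post(...).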


1.2.3 Connection pool and asynchronous callback

For example, the logic-layer service module asks the database module to load user data asynchronously and then carries on with processing and calculation. The database module has a connection pool with a fixed number of connections. When a task to execute SQL arrives, it picks an idle connection, executes the SQL, and passes the result to the logic layer through the callback function. The steps are as follows:

  • Pre-allocate the thread pool; each thread creates a connection to the database.

  • Create a task queue for the database module; all threads are consumers of this queue.

  • The logic layer delivers the SQL execution task to the database module and passes along a callback function to receive the SQL execution result.

The example is as follows:

//! the callback's template arguments are not shown in the source
void db_t::load(long uid_, boost::function<...> func_);

task_queue->post(boost::bind(&db_t::load, uid, func));

Note that this mode uses a single task queue, and the queue is served by multiple threads.
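The three steps can be sketched as a small pool whose threads all consume one queue. This is a sketch under stated assumptions: db_pool and callback_t are hypothetical names, the result is a plain std::string, and the "query" is a placeholder that only builds a string, since no real database API is assumed here.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical database module: a fixed pool of threads sharing one queue.
// A real version would open one database connection per thread in run().
class db_pool {
public:
    typedef std::function<void(const std::string&)> callback_t;

    explicit db_pool(std::size_t thread_count) : m_stop(false) {
        for (std::size_t i = 0; i < thread_count; ++i) {
            m_threads.emplace_back(&db_pool::run, this);
        }
    }

    // Stop the pool; tasks already queued are still executed.
    ~db_pool() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stop = true;
        }
        m_cond.notify_all();
        for (std::thread& t : m_threads) t.join();
    }

    // Asynchronous load: returns at once, the callback runs on a pool thread.
    void load(long uid, callback_t callback) {
        std::function<void()> task = [uid, callback] {
            callback("user_" + std::to_string(uid));  // placeholder for a real query
        };
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_tasks.push_back(std::move(task));
        }
        m_cond.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cond.wait(lock, [this] { return m_stop || !m_tasks.empty(); });
                if (m_tasks.empty()) return;  // stop requested and queue drained
                task = std::move(m_tasks.front());
                m_tasks.pop_front();
            }
            task();
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::deque<std::function<void()>> m_tasks;
    bool m_stop;
    std::vector<std::thread> m_threads;
};
```

In this sketch the callback runs on a database thread. To keep the logic layer single-threaded, the callback would usually do nothing but post the result back to the logic task queue.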



2. Log

This article is mainly about C++ multi-threaded programming, and a logging system does not improve program efficiency. But for debugging and for troubleshooting at run time, logs are an irreplaceable tool, and I believe anyone who develops back-end programs uses them. There are two common styles of logging:


Stream style, such as: logstream << "start service time[%d]" << time(0) << " app name[%s]" << app_string.c_str() << endl;

Printf style, such as: logtrace(LOG_MODULE, "start service time[%d] app name[%s]", time(0), app_string.c_str());



Both have their advantages and disadvantages. The stream style is type-safe. The printf style formats strings more directly, but its drawback is that it is not type-safe: if you replace app_string.c_str() with app_string (a std::string), the code still compiles but crashes at run time (if you are lucky it crashes every time; if you are unlucky it crashes only occasionally). I personally like the printf style, and it can be improved as follows:


Add type safety by using the traits mechanism of C++ templates. Example:

template <typename ARG1>
void logtrace(const char* module, const char* fmt, ARG1 arg1) {
    boost::format f(fmt);
    f % arg1;
}

In this way, built-in types and std::string are formatted safely, and passing any other unsupported type fails at compile time instead of crashing at run time. This is only the one-parameter version; it can be overloaded to support more parameters, nine or more if you wish.
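Where C++11 is available, the per-arity overloads can be replaced by one variadic template. The sketch below deliberately swaps boost::format for a hand-rolled formatter so that it depends only on the standard library; safe_format and format_into are hypothetical names, the letter after each % is ignored, and "%%" escapes are not handled.

```cpp
#include <sstream>
#include <string>

// Base case: no arguments left, so copy the rest of the format string.
inline void format_into(std::ostringstream& out, const char* fmt) {
    out << fmt;
}

// Replace the next "%x" directive with the streamed argument, then recurse
// on the remaining format string and the remaining arguments.
template <typename T, typename... Rest>
void format_into(std::ostringstream& out, const char* fmt,
                 const T& value, const Rest&... rest) {
    for (; *fmt != '\0'; ++fmt) {
        if (*fmt == '%' && fmt[1] != '\0') {
            out << value;  // type-safe: chosen by operator<< overload, not by the letter
            format_into(out, fmt + 2, rest...);
            return;
        }
        out << *fmt;
    }
}

// Printf-looking front end that accepts any streamable argument types.
template <typename... Args>
std::string safe_format(const char* fmt, const Args&... args) {
    std::ostringstream out;
    format_into(out, fmt, args...);
    return out.str();
}
```

Passing a std::string where the format says %s is now safe, and an argument type without operator<< is rejected by the compiler.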

Add color to the log by putting control characters into printf; the color shows up on a terminal. Example under Linux: printf("\033[32;49;1m [DONE] \033[39;49;0m")

For more color schemes, see:

http://hi.baidu.com/jiemnij/blog/item/d95df8c28ac2815cb219a80e.html
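The escape sequences are easier to reuse when wrapped in a helper. A minimal sketch, assuming an ANSI-compatible terminal; green_bold is a hypothetical name.

```cpp
#include <string>

// Wrap text in ANSI escape sequences: bold green on the default background,
// followed by a reset. "\033" is the ESC character (octal 33).
inline std::string green_bold(const std::string& text) {
    return "\033[32;49;1m" + text + "\033[39;49;0m";
}
```

For example, printf("%s\n", green_bold(" [DONE] ").c_str()) prints the tag in green.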

When each thread starts, it should log which function it is responsible for. Then, while the program is running, top -H -p pid shows how much CPU each function uses. In fact, every line of my log prints the thread ID. This thread ID is not the pthread ID but the system-assigned process ID number that corresponds to the thread, which is the ID that top displays.
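The article does not say how this ID is obtained. On Linux, one common way is the gettid system call, sketched below; current_tid is a hypothetical helper name, and the code is Linux-only.

```cpp
#include <sys/syscall.h>
#include <unistd.h>

// Linux-only: return the kernel thread id of the calling thread, which is
// the number listed by `top -H -p pid`. In the main thread it equals getpid().
inline long current_tid() {
    return static_cast<long>(syscall(SYS_gettid));
}
```

Logging current_tid() once at thread start, together with the thread's role, is enough to match a busy line in top to a function.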


3. Performance Monitoring

Although many tools can analyze the run-time performance of C++ programs, most of them are used in the debugging stage. We need a way to monitor the program in both the debug and the release stage: on the one hand to know where the program's bottlenecks are, and on the other hand to find out as early as possible which components behave abnormally at run time.

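The cost of a function is usually measured with gettimeofday, which is precise to the microsecond. C++'s deterministic destruction makes it very convenient to build a small tool that measures function cost. Example: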

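#include <sys/time.h>

struct profiler {
    profiler(const char* func_name_) : func_name(func_name_) {
        gettimeofday(&tv, NULL);
    }
    ~profiler() {
        struct timeval tv2;
        gettimeofday(&tv2, NULL);
        //! elapsed microseconds between construction and destruction
        long cost = (tv2.tv_sec - tv.tv_sec) * 1000000 + (tv2.tv_usec - tv.tv_usec);
        //! post func_name and cost to some manager
    }
    const char* func_name;
    struct timeval tv;
};
//! a named local object, so that it lives until the end of the enclosing scope
#define PROFILER() profiler profiler_guard_(__FUNCTION__)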

The cost should be posted to a performance statistics manager, which periodically writes the statistics to a file.
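One possible shape for such a manager is sketched below. It is an illustration under stated assumptions, not the author's implementation: perf_manager and scoped_profiler are hypothetical names, std::chrono::steady_clock replaces gettimeofday, and the periodic dump to a file is left out.

```cpp
#include <chrono>
#include <map>
#include <mutex>
#include <string>

// Hypothetical statistics manager: accumulates the call count and the total
// cost in microseconds for each function name.
class perf_manager {
public:
    struct stat_t {
        long calls;
        long long total_us;
    };

    void post(const std::string& name, long long cost_us) {
        std::lock_guard<std::mutex> lock(m_mutex);
        stat_t& s = m_stats[name];  // zero-initialized on first use
        ++s.calls;
        s.total_us += cost_us;
    }

    stat_t get(const std::string& name) {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_stats[name];
    }

private:
    std::mutex m_mutex;
    std::map<std::string, stat_t> m_stats;
};

// RAII timer: measures from construction to destruction and posts the cost.
class scoped_profiler {
public:
    scoped_profiler(perf_manager& mgr, const char* name)
        : m_mgr(mgr), m_name(name), m_begin(std::chrono::steady_clock::now()) {}

    ~scoped_profiler() {
        auto elapsed = std::chrono::steady_clock::now() - m_begin;
        m_mgr.post(m_name,
                   std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count());
    }

private:
    perf_manager& m_mgr;
    const char* m_name;
    std::chrono::steady_clock::time_point m_begin;
};
```

A function is measured by declaring scoped_profiler guard(mgr, __FUNCTION__); as its first statement.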

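4 Lambda Programming

Use foreach instead of iterators

Many programming languages have foreach built in, but C++ does not yet. So I recommend writing your own foreach function wherever a container needs to be traversed. People used to functional programming tend to be very fond of foreach, and using it does bring some benefits, such as:

http://www.cnblogs.com/chsword/archive/2007/09/28/910011.html

But these are mainly at the level of programming philosophy.

Example: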

void user_mgr_t::foreach(boost::function<void (user_t&)> func_) {
    for (iterator it = m_users.begin(); it != m_users.end(); ++it) {
        func_(it->second);
    }
}


For example, implementing a dump interface does not require rewriting the iterator code:

void user_mgr_t::dump() {
    struct lambda {
        static void print(user_t& user) {
            //! print(tostring(user));
        }
    };
    this->foreach(lambda::print);
}

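In fact, the code above produces an anonymous function in a roundabout way. With a compiler that supports the C++11 standard, it could be written more concisely:

this->foreach([](user_t& user) {} );

But most of the programs I write have to run on CentOS, whose gcc version is gcc 4.1.2, so most of the time I use lambda functions in this workaround form.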
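For comparison, a complete C++11 version of the same idea might look like the sketch below. The user_t fields, the add() method, and the free function dump() are assumptions made so that the example is self-contained.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical user record; the article does not show its fields.
struct user_t {
    std::string name;
};

// User manager that exposes traversal without exposing its iterators.
class user_mgr_t {
public:
    void add(long uid, const std::string& name) { m_users[uid].name = name; }

    // Call func once for every user, in ascending uid order.
    void foreach(const std::function<void(user_t&)>& func) {
        for (auto& entry : m_users) {
            func(entry.second);
        }
    }

private:
    std::map<long, user_t> m_users;
};

// dump built on foreach with a real C++11 lambda that captures local state,
// which the static-member workaround cannot do.
inline std::string dump(user_mgr_t& mgr) {
    std::string out;
    mgr.foreach([&out](user_t& user) { out += user.name + ";"; });
    return out;
}
```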


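Lambda functions combined with a task queue for asynchronous execution


Common code that uses a task queue for asynchronous execution looks like this: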

void service_t::async_update_user(long uid) {
    task_queue->post(boost::bind(&service_t::sync_update_user_impl, this, uid));
}
void service_t::sync_update_user_impl(long uid) {
    user_t& user = get_user(uid);
    user.update();
}


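The drawback is that each interface needs two functions, and if the parameters of one function change, the other must change with it. The code is also not very pretty. Using a lambda makes the asynchronous code more intuitive, as if the work were completed right inside the interface function. Sample code: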

void service_t::async_update_user(long uid) {
    struct lambda {
        static void update_user_impl(service_t* service, long uid) {
            user_t& user = service->get_user(uid);
            user.update();
        }
    };
    task_queue->post(boost::bind(&lambda::update_user_impl, this, uid));
}

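This way, when the interface needs to change, the code is modified directly inside the interface function, which is very intuitive.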
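With C++11 the local struct disappears entirely. The sketch below assumes a deliberately simplified, single-threaded queue (pending_tasks, a hypothetical name) so that the effect of posting is easy to observe; the user_t and service_t members are likewise illustrative.

```cpp
#include <deque>
#include <functional>
#include <map>
#include <utility>

// Hypothetical minimal queue: a single-threaded stand-in for a real task queue.
class pending_tasks {
public:
    void post(std::function<void()> task) { m_tasks.push_back(std::move(task)); }

    // Run every queued task in FIFO order.
    void run_all() {
        while (!m_tasks.empty()) {
            std::function<void()> task = std::move(m_tasks.front());
            m_tasks.pop_front();
            task();
        }
    }

private:
    std::deque<std::function<void()>> m_tasks;
};

struct user_t {
    int version = 0;
    void update() { ++version; }
};

class service_t {
public:
    explicit service_t(pending_tasks& queue) : m_queue(queue) {}

    // The whole asynchronous operation is written in one place. The service
    // object must outlive the queued task, because the lambda captures `this`.
    void async_update_user(long uid) {
        m_queue.post([this, uid] { get_user(uid).update(); });
    }

    user_t& get_user(long uid) { return m_users[uid]; }

private:
    pending_tasks& m_queue;
    std::map<long, user_t> m_users;
};
```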



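5. Tricks


Implementing map/reduce with shared_ptr


The semantics of map/reduce are: first split a job into multiple tasks, deliver them to multiple workers for concurrent execution, and then let reduce merge their results into the final result. What are the semantics of shared_ptr? When the last shared_ptr is destroyed, the destructor of the managed object is called. That is very close to the map/reduce process. All we need to implement ourselves is splitting the request into multiple tasks. The process is as follows:


Define the managed object for the request. Suppose we need to count the occurrences of the string "oh nice" in 10 files; the managed structure is defined as follows: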

struct reducer {
    void set_result(int index, long result) {
        m_result[index] = result;
    }
    ~reducer() {
        long total = 0;
        //! sizeof(m_result) is a byte count; divide by the element size
        for (size_t i = 0; i < sizeof(m_result) / sizeof(m_result[0]); ++i) {
            total += m_result[i];
        }
        //! post total to somewhere
    }
    long m_result[10];
};



Define the worker that executes the tasks:

void worker_t::exe(int index_, shared_ptr<reducer> ret) {
    ret->set_result(index_, 100);
}


After the job has been split, deliver the tasks to different workers:

shared_ptr<reducer> ret(new reducer());
for (int i = 0; i < 10; ++i) {
    task_queue[i]->post(boost::bind(&worker_t::exe, i, ret));
}
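The whole trick can be tried out with the standard library alone. In this sketch, std::shared_ptr and one std::thread per task stand in for boost and the task queues; the on_done callback and the map_reduce function are assumptions added so that the final total can be observed.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Collects partial results. The destructor runs when the last shared_ptr
// is released, and that moment is the "reduce" step.
class reducer {
public:
    reducer(std::size_t parts, std::function<void(long)> on_done)
        : m_results(parts, 0), m_on_done(std::move(on_done)) {}

    // Each worker writes only its own slot, so no lock is needed here.
    void set_result(std::size_t index, long result) { m_results[index] = result; }

    ~reducer() {
        long total = 0;
        for (long r : m_results) total += r;
        m_on_done(total);
    }

private:
    std::vector<long> m_results;
    std::function<void(long)> m_on_done;
};

// Split the job into `parts` tasks, one thread each, and wait for all of them.
inline void map_reduce(std::size_t parts, std::function<void(long)> on_done) {
    std::vector<std::thread> workers;
    {
        std::shared_ptr<reducer> ret =
            std::make_shared<reducer>(parts, std::move(on_done));
        for (std::size_t i = 0; i < parts; ++i) {
            workers.emplace_back([i, ret]() mutable {
                // Move the reference into a local so that it is released
                // as soon as this task body finishes.
                std::shared_ptr<reducer> local = std::move(ret);
                local->set_result(i, 100);  // placeholder for the real search
            });
        }
    }  // the caller's reference is dropped here; the workers hold the rest
    for (std::thread& t : workers) t.join();
}
```

Whichever thread releases the last reference runs the destructor, so on_done may be called on a worker thread.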


