
Python - How can I use multiple processes to speed up slow nested loops?

The code has a loop nested inside a loop, and the inner loop body uses the variables of both the outer (big) loop and the inner (small) loop.

I have simplified it into the small model below. When the real functions are complex, this model becomes extremely slow. How can I use multiple processes to solve the speed problem?

My idea was to use multiple processes only for the small loop, writing the multi-process code inside the body of the big loop, but it keeps failing. Could someone please give me working code?

Thank you!

import random as r
list1 = list(range(100))
i = 0
reslist = []
while i < 2000:  # big loop
    alist = []  # three list variables, cleared at the start of each iteration
    blist = []
    clist = []
    for each in list1:  # small loop
        x = r.randint(i+30, i+60) + each  # several functions that use both loop variables; random is a stand-in here
        y = r.randint(i+60, i+120) + each
        z = r.randint(i+60, i+180) + each

        res = 2.5*x - y - z
        reslist.append(res)  # operate on the function result
        if res >= 50:
            alist.append(each)
        if -50 < res < 50:
            blist.append(each)
        if res <= -50:
            clist.append(each)

    for each in alist:  # in the big loop, do further operations on the small loop's results
        print(each)
    for each in blist:
        print(each)
    for each in clist:
        print(each)

    i += 1
Asked by 代言代言, 2685 days ago

All replies (6)

  • 学习ing, 2017-06-12 09:24:04

    First of all, parallel computing requires that the parallel subtasks have no causal dependence on one another.
    Inside the small loop, res depends closely on x, y, z and on alist, blist, and clist, so it is hard to split the small loop itself into parallel pieces.
    The code you posted is not the original, so I don't know whether the big-loop iterations in the original depend on each other. Judging from the schematic code, though,
    splitting the big loop across N threads (processes are not even needed) should work, with each thread computing 2000/N iterations.
    For example, with 8 threads, thread 1 computes i=0 to 249, thread 2 computes i=250 to 499, and so on.
    Choose N according to the number of CPU cores; going beyond the core count gains little and may even reduce efficiency.
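
    A minimal sketch of this chunking idea, with a hypothetical run_chunk helper built from the question's loop body. It uses processes via concurrent.futures rather than threads, since CPython's GIL keeps CPU-bound threads from running in parallel:

    import random as r
    from concurrent.futures import ProcessPoolExecutor

    def run_chunk(start, stop, list1):
        """Run outer-loop iterations i = start .. stop-1 and collect results."""
        reslist = []
        for i in range(start, stop):
            for each in list1:
                x = r.randint(i+30, i+60) + each
                y = r.randint(i+60, i+120) + each
                z = r.randint(i+60, i+180) + each
                reslist.append(2.5*x - y - z)
        return reslist

    if __name__ == '__main__':
        list1 = list(range(100))
        n_workers = 8                      # tune to the number of CPU cores
        chunk = 2000 // n_workers          # 250 iterations per worker
        bounds = [(k*chunk, (k+1)*chunk) for k in range(n_workers)]
        with ProcessPoolExecutor(max_workers=n_workers) as ex:
            futures = [ex.submit(run_chunk, a, b, list1) for a, b in bounds]
            reslist = [res for f in futures for res in f.result()]
        print(len(reslist))                # 2000 * 100 results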

  • PHP中文网, 2017-06-12 09:24:04

    You should use elif for the middle branch, and the indentation of the for loops at the end looks wrong.
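
    For illustration, the three ranges in the question are mutually exclusive, so an elif chain avoids re-testing res:

    if res >= 50:
        alist.append(each)
    elif -50 < res < 50:
        blist.append(each)
    else:  # res <= -50
        clist.append(each)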

  • 为情所困, 2017-06-12 09:24:04

    You can open multiple processes for the big loop. For example, if the big loop runs 2000 times and the CPU has 4 cores, open 4 processes and make each one responsible for 500 iterations.

    After each small loop ends, you can hand the follow-up operations below to a sub-thread so the big loop can keep moving forward:

    for each in alist:  # in the big loop, do further operations on the small loop's results
        print(each)
    for each in blist:
        print(each)
    for each in clist:
        print(each)
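
    A minimal sketch of that hand-off, wrapping the three print loops in a hypothetical follow_up helper:

    import threading

    def follow_up(alist, blist, clist):
        # the three follow-up print loops from the question
        for each in alist:
            print(each)
        for each in blist:
            print(each)
        for each in clist:
            print(each)

    # inside the big loop, once the small loop has filled the lists:
    t = threading.Thread(target=follow_up, args=(alist, blist, clist))
    t.start()  # the big loop continues while the thread prints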

  • phpcn_u1582, 2017-06-12 09:24:04

    You can use sub-processes to handle the small loop, but then you need two big loops: the first dispatches the small-loop work, and once it has finished, a second big loop handles the follow-up operations.

    Like this:

    import multiprocessing
    import random as r


    def cumput(i, list1):
        alist = []
        blist = []
        clist = []
        reslist = []
        for each in list1:  # small loop
            x = r.randint(i + 30, i + 60) + each  # functions that use both loop variables; random is a stand-in here
            y = r.randint(i + 60, i + 120) + each
            z = r.randint(i + 60, i + 180) + each

            res = 2.5 * x - y - z
            reslist.append(res)  # operate on the function result
            if res >= 50:
                alist.append(each)
            if -50 < res < 50:
                blist.append(each)
            if res <= -50:
                clist.append(each)
        return alist, blist, clist, reslist


    if __name__ == '__main__':
        multiprocessing.freeze_support()
        list1 = list(range(100))
        i = 0
        pool = multiprocessing.Pool(2)
        res = {}
        while i < 2000:  # big loop
            res[i] = pool.apply_async(cumput, (i, list1,))
            i += 1
        pool.close()
        pool.join()
        for i in res:
            for each in res[i].get()[0]:  # further operations on the small loop's results
                print(each)
            for each in res[i].get()[1]:
                print(each)
            for each in res[i].get()[2]:
                print(each)
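
    Note that Pool(2) starts two worker processes; multiprocessing.Pool() with no argument defaults to one worker per CPU core (os.cpu_count()), which is usually what you want for CPU-bound work.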

  • typecho, 2017-06-12 09:24:04

    If the functions executed inside the small loop are the time-consuming part, you can consider a producer-consumer model:

    
    import random as r
    from queue import Queue, Empty
    from threading import Thread

    resqueue = Queue()
    aqueue = Queue()
    bqueue = Queue()
    cqueue = Queue()

    def producer():
        list1 = list(range(100))

        for i in range(2000):
            for each in list1:
                x = r.randint(i+30, i+60) + each
                y = r.randint(i+60, i+120) + each
                z = r.randint(i+60, i+180) + each

                res = 2.5*x - y - z
                resqueue.put(res)

                if res >= 50:
                    aqueue.put(each)
                if -50 < res < 50:
                    bqueue.put(each)
                if res <= -50:
                    cqueue.put(each)

    def consumer_a():
        while True:
            try:
                data = aqueue.get(timeout=5)
            except Empty:
                return
            else:
                # time-consuming operation (deal_data is your own handler)
                deal_data(data)
                aqueue.task_done()

    def consumer_b():
        while True:
            try:
                data = bqueue.get(timeout=5)
            except Empty:
                return
            else:
                # time-consuming operation
                deal_data(data)
                bqueue.task_done()

    def consumer_c():
        while True:
            try:
                data = cqueue.get(timeout=5)
            except Empty:
                return
            else:
                # time-consuming operation
                deal_data(data)
                cqueue.task_done()

    def consumer_res():
        while True:
            try:
                data = resqueue.get(timeout=5)
            except Empty:
                return
            else:
                # time-consuming operation
                deal_data(data)
                resqueue.task_done()

    if __name__ == "__main__":
        t1 = Thread(target=producer)
        t2 = Thread(target=consumer_a)
        ...

        t1.start()
        t2.start()
                           
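    Note that task_done() only matters if something later calls join() on the queue to block until every item has been processed; with the 5-second get timeout above, each consumer simply returns once the producer stops feeding its queue.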

  • 怪我咯, 2017-06-12 09:24:04

    The questioner should first design each process's input and output. When multiple processes compute in parallel, inter-process communication is the crucial part. As far as I know, the classic approach is MPI: for multi-layer loops, first scatter part of the data to each process, let every process do its computation and send the results back to an integration point, then merge the results and output them.

    Another important point is to estimate the execution time of each process: where there is inter-process communication, waiting time will also reduce efficiency.

    @daijianke said that your nesting does not fit the input rules of parallel computing. You can take a look at this example:

    http://blog.csdn.net/zouxy09/...

    I have tested the examples in the article before and they work. If you follow those steps, you should be able to get it done.
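
    A minimal sketch of that scatter-compute-gather pattern, assuming the mpi4py package (the linked article may use a different binding); run it with something like mpiexec -n 4 python script.py:

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    if rank == 0:
        # root splits the outer-loop indices 0..1999, one chunk per process
        chunks = [list(range(k, 2000, size)) for k in range(size)]
    else:
        chunks = None

    my_indices = comm.scatter(chunks, root=0)    # distribute the work
    my_results = [2.5 * i for i in my_indices]   # stand-in for the real loop body

    gathered = comm.gather(my_results, root=0)   # collect at the integration point
    if rank == 0:
        merged = [res for part in gathered for res in part]
        print(len(merged))                       # 2000 results in total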
