
Python - How can I use multiple processes to speed up slow nested loops?

The code has a loop nested inside a loop, and the inner loop body uses the variables of both the outer (big) loop and the inner (small) loop.

I have simplified it into the small model below. When the real functions are complex, this model becomes extremely slow. How can I use multiple processes to solve the speed problem?

My idea was to use multiple processes only for the small loop, writing the multi-process code inside the body of the big loop, but it keeps failing. Could someone please give me working code?

Thank you!

import random as r
list1 = list(range(100))
i = 0
reslist = []
while i < 2000:  # big loop
    alist = []  # three list variables, cleared at the start of each iteration
    blist = []
    clist = []
    for each in list1:  # small loop
        x = r.randint(i+30, i+60) + each  # several functions that use both loop variables; random is a stand-in here
        y = r.randint(i+60, i+120) + each
        z = r.randint(i+60, i+180) + each

        res = 2.5*x - y - z
        reslist.append(res)  # operate on the function result
        if res >= 50:
            alist.append(each)
        if -50 < res < 50:
            blist.append(each)
        if res <= -50:
            clist.append(each)

    for each in alist:  # in the big loop, do further operations on the small loop's results
        print(each)
    for each in blist:
        print(each)
    for each in clist:
        print(each)

    i += 1
Asked by 代言代言, 2685 days ago

All replies (6)

  • 学习ing, 2017-06-12 09:24:04

    First of all, parallel computing requires that the parallel subtasks have no causal dependence on one another.
    Inside the small loop, res depends closely on x, y, z and on alist, blist, and clist, so it is hard to split the small loop itself into parallel pieces.
    The code you posted is not the original, so I don't know whether the big-loop iterations in the original depend on each other. Judging from the schematic code, though,
    splitting the big loop across N threads (processes are not even needed) should work, with each thread computing 2000/N iterations.
    For example, with 8 threads, thread 1 computes i=0 to 249, thread 2 computes i=250 to 499, and so on.
    Choose N according to the number of CPU cores; going beyond the core count gains little and may even reduce efficiency.
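
    A minimal sketch of this chunking idea, with a hypothetical run_chunk helper built from the question's loop body. It uses processes via concurrent.futures rather than threads, since CPython's GIL keeps CPU-bound threads from running in parallel:

    import random as r
    from concurrent.futures import ProcessPoolExecutor

    def run_chunk(start, stop, list1):
        """Run outer-loop iterations i = start .. stop-1 and collect results."""
        reslist = []
        for i in range(start, stop):
            for each in list1:
                x = r.randint(i+30, i+60) + each
                y = r.randint(i+60, i+120) + each
                z = r.randint(i+60, i+180) + each
                reslist.append(2.5*x - y - z)
        return reslist

    if __name__ == '__main__':
        list1 = list(range(100))
        n_workers = 8                      # tune to the number of CPU cores
        chunk = 2000 // n_workers          # 250 iterations per worker
        bounds = [(k*chunk, (k+1)*chunk) for k in range(n_workers)]
        with ProcessPoolExecutor(max_workers=n_workers) as ex:
            futures = [ex.submit(run_chunk, a, b, list1) for a, b in bounds]
            reslist = [res for f in futures for res in f.result()]
        print(len(reslist))                # 2000 * 100 results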

  • PHP中文网, 2017-06-12 09:24:04

    You should use elif for the middle branch, and the indentation of the for loops at the end looks wrong.
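
    For illustration, the three ranges in the question are mutually exclusive, so an elif chain avoids re-testing res:

    if res >= 50:
        alist.append(each)
    elif -50 < res < 50:
        blist.append(each)
    else:  # res <= -50
        clist.append(each)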

  • 为情所困, 2017-06-12 09:24:04

    You can open multiple processes for the big loop. For example, if the big loop runs 2000 times and the CPU has 4 cores, open 4 processes and make each one responsible for 500 iterations.

    After each small loop ends, you can hand the follow-up operations below to a sub-thread so the big loop can keep moving forward:

    for each in alist:  # in the big loop, do further operations on the small loop's results
        print(each)
    for each in blist:
        print(each)
    for each in clist:
        print(each)
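
    A minimal sketch of that hand-off, wrapping the three print loops in a hypothetical follow_up helper:

    import threading

    def follow_up(alist, blist, clist):
        # the three follow-up print loops from the question
        for each in alist:
            print(each)
        for each in blist:
            print(each)
        for each in clist:
            print(each)

    # inside the big loop, once the small loop has filled the lists:
    t = threading.Thread(target=follow_up, args=(alist, blist, clist))
    t.start()  # the big loop continues while the thread prints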

  • phpcn_u1582, 2017-06-12 09:24:04

    You can use sub-processes to handle the small loop, but then you need two big loops: the first dispatches the small-loop work, and once it has finished, a second big loop handles the follow-up operations.

    Like this:

    import multiprocessing
    import random as r


    def cumput(i, list1):
        alist = []
        blist = []
        clist = []
        reslist = []
        for each in list1:  # small loop
            x = r.randint(i + 30, i + 60) + each  # functions that use both loop variables; random is a stand-in here
            y = r.randint(i + 60, i + 120) + each
            z = r.randint(i + 60, i + 180) + each

            res = 2.5 * x - y - z
            reslist.append(res)  # operate on the function result
            if res >= 50:
                alist.append(each)
            if -50 < res < 50:
                blist.append(each)
            if res <= -50:
                clist.append(each)
        return alist, blist, clist, reslist


    if __name__ == '__main__':
        multiprocessing.freeze_support()
        list1 = list(range(100))
        i = 0
        pool = multiprocessing.Pool(2)
        res = {}
        while i < 2000:  # big loop
            res[i] = pool.apply_async(cumput, (i, list1,))
            i += 1
        pool.close()
        pool.join()
        for i in res:
            for each in res[i].get()[0]:  # further operations on the small loop's results
                print(each)
            for each in res[i].get()[1]:
                print(each)
            for each in res[i].get()[2]:
                print(each)
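
    Note that Pool(2) starts two worker processes; multiprocessing.Pool() with no argument defaults to one worker per CPU core (os.cpu_count()), which is usually what you want for CPU-bound work.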

  • typecho, 2017-06-12 09:24:04

    If the functions executed inside the small loop are the time-consuming part, you can consider a producer-consumer model:

    
    import random as r
    from queue import Queue, Empty
    from threading import Thread

    resqueue = Queue()
    aqueue = Queue()
    bqueue = Queue()
    cqueue = Queue()

    def producer():
        list1 = list(range(100))

        for i in range(2000):
            for each in list1:
                x = r.randint(i+30, i+60) + each
                y = r.randint(i+60, i+120) + each
                z = r.randint(i+60, i+180) + each

                res = 2.5*x - y - z
                resqueue.put(res)

                if res >= 50:
                    aqueue.put(each)
                if -50 < res < 50:
                    bqueue.put(each)
                if res <= -50:
                    cqueue.put(each)

    def consumer_a():
        while True:
            try:
                data = aqueue.get(timeout=5)
            except Empty:
                return
            else:
                # time-consuming operation (deal_data is your own handler)
                deal_data(data)
                aqueue.task_done()

    def consumer_b():
        while True:
            try:
                data = bqueue.get(timeout=5)
            except Empty:
                return
            else:
                # time-consuming operation
                deal_data(data)
                bqueue.task_done()

    def consumer_c():
        while True:
            try:
                data = cqueue.get(timeout=5)
            except Empty:
                return
            else:
                # time-consuming operation
                deal_data(data)
                cqueue.task_done()

    def consumer_res():
        while True:
            try:
                data = resqueue.get(timeout=5)
            except Empty:
                return
            else:
                # time-consuming operation
                deal_data(data)
                resqueue.task_done()

    if __name__ == "__main__":
        t1 = Thread(target=producer)
        t2 = Thread(target=consumer_a)
        ...

        t1.start()
        t2.start()
                           
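    Note that task_done() only matters if something later calls join() on the queue to block until every item has been processed; with the 5-second get timeout above, each consumer simply returns once the producer stops feeding its queue.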

  • 怪我咯, 2017-06-12 09:24:04

    The questioner should first design each process's input and output. When multiple processes compute in parallel, inter-process communication is the crucial part. As far as I know, the classic approach is MPI: for multi-layer loops, first scatter part of the data to each process, let every process do its computation and send the results back to an integration point, then merge the results and output them.

    Another important point is to estimate the execution time of each process: where there is inter-process communication, waiting time will also reduce efficiency.

    @daijianke said that your nesting does not fit the input rules of parallel computing. You can take a look at this example:

    http://blog.csdn.net/zouxy09/...

    I have tested the examples in the article before and they work. If you follow those steps, you should be able to get it done.
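
    A minimal sketch of that scatter-compute-gather pattern, assuming the mpi4py package (the linked article may use a different binding); run it with something like mpiexec -n 4 python script.py:

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    if rank == 0:
        # root splits the outer-loop indices 0..1999, one chunk per process
        chunks = [list(range(k, 2000, size)) for k in range(size)]
    else:
        chunks = None

    my_indices = comm.scatter(chunks, root=0)    # distribute the work
    my_results = [2.5 * i for i in my_indices]   # stand-in for the real loop body

    gathered = comm.gather(my_results, root=0)   # collect at the integration point
    if rank == 0:
        merged = [res for part in gathered for res in part]
        print(len(merged))                       # 2000 results in total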
