There is a pattern of a loop nested within a loop,
where the body of the inner loop must use the variables of both the outer (big) loop and the inner (small) loop.
I have reduced it to a simple model here.
If the functions involved are complex, this model runs extremely slowly.
How can I use a multi-process approach to solve the speed problem?
My idea was to use multiple processes only for the small loop,
writing the multiprocessing code inside the body of the big loop,
but it keeps failing.
Could an expert please provide the correct code?
Thank you!
import random as r

list1 = list(range(100))
i = 0
reslist = []
while i < 2000:  # big loop
    alist = []  # three list variables, cleared at the start of each iteration
    blist = []
    clist = []
    for each in list1:  # small loop
        # several functions involving both loop variables; random is used here as a stand-in
        x = r.randint(i + 30, i + 60) + each
        y = r.randint(i + 60, i + 120) + each
        z = r.randint(i + 60, i + 180) + each
        res = 2.5 * x - y - z
        reslist.append(res)  # operate on the function result
        if res >= 50:
            alist.append(each)
        if -50 < res < 50:
            blist.append(each)
        if res <= -50:
            clist.append(each)
    for each in alist:  # further operations in the big loop on the small loop's results
        print(each)
    for each in blist:
        print(each)
    for each in clist:
        print(each)
    i += 1
学习ing 2017-06-12 09:24:04
First of all, parallel computing requires that the parallel subtasks have no causal dependence on one another.
Within the small loop, res is closely tied to x, y, z and to alist, blist, clist, so it is difficult to split that loop into parallel computations.
The code the questioner posted is not the original, so I don't know whether the iterations of the big loop depend on each other in the original code. Judging from the schematic code, however,
the big loop can be split across N threads (no processes needed), with each thread computing 2000/N iterations.
For example, with 8 threads, thread 1 computes i = 0 to 249, thread 2 computes i = 250 to 499, and so on.
N can be chosen according to the number of CPU cores; making N exceed the core count gains little and may even reduce efficiency.
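That chunked split can be sketched with the standard library's concurrent.futures (a sketch only: run_chunk's body is a placeholder for the real per-i work, and the chunk arithmetic assumes the total is divisible by the worker count). One caveat to this answer: in CPython, threads only give a real speed-up when the per-iteration work releases the GIL (I/O, NumPy, etc.); for pure-Python CPU-bound work, swapping ThreadPoolExecutor for ProcessPoolExecutor is usually needed.

```python
from concurrent.futures import ThreadPoolExecutor

def run_chunk(bounds):
    # each worker handles its own slice [start, stop) of the big loop;
    # the body here is a placeholder for the real per-i computation
    start, stop = bounds
    return [i * 2 for i in range(start, stop)]

def run_parallel(total=2000, n_workers=8):
    # split the `total` iterations into n_workers contiguous chunks
    chunk = total // n_workers
    bounds = [(k * chunk, (k + 1) * chunk) for k in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as ex:
        parts = ex.map(run_chunk, bounds)  # preserves chunk order
    merged = []
    for part in parts:
        merged.extend(part)
    return merged
```

Because ex.map returns results in submission order, the merged list comes out in the same order the single-threaded loop would have produced.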
PHP中文网 2017-06-12 09:24:04
You should use elif for the middle condition. The indentation of the for loops at the end also looks wrong.
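Concretely, since the three res ranges are mutually exclusive, the chain can be written with elif/else so at most one branch runs (the helper name classify is just for illustration):

```python
def classify(each, res, alist, blist, clist):
    # the three ranges are mutually exclusive, so elif/else
    # evaluates at most two comparisons instead of always three
    if res >= 50:
        alist.append(each)
    elif -50 < res < 50:
        blist.append(each)
    else:  # res <= -50
        clist.append(each)
```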
为情所困 2017-06-12 09:24:04
You can open multiple processes in the big loop: for example, if the big loop runs 2000 times and the CPU has 4 cores, open 4 processes and have each process handle 500 iterations.
After a small loop finishes, you can start a sub-thread to perform the subsequent operations while the big loop keeps moving forward:

for each in alist:  # further operations in the big loop on the small loop's results
    print(each)
for each in blist:
    print(each)
for each in clist:
    print(each)
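A minimal sketch of that shape with the stdlib multiprocessing and threading: a pool of processes runs the small loops, and each finished result is handed off to a sub-thread so the big loop can keep going. small_loop and follow_up are hypothetical stand-ins for the real work.

```python
import multiprocessing
import threading

results = []  # filled by the follow-up threads

def small_loop(i):
    # stand-in for the inner loop's work for one value of i
    return [i, i + 1]

def follow_up(result):
    # stand-in for the "subsequent operations" on one result;
    # list.append is atomic under the GIL, so no lock is needed here
    results.append(sum(result))

def run(total=8, workers=4):
    threads = []
    with multiprocessing.Pool(workers) as pool:
        for result in pool.imap(small_loop, range(total)):
            # hand the finished result to a sub-thread and move on
            t = threading.Thread(target=follow_up, args=(result,))
            t.start()
            threads.append(t)
    for t in threads:
        t.join()
    return sorted(results)
```

sorted() is only there to make the outcome deterministic; the follow-up threads may finish in any order.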
phpcn_u1582 2017-06-12 09:24:04
You can use sub-processes to handle the small loop, but then you need two big loops: one big loop dispatches the small loops, and once it has finished, a second big loop handles the follow-up work.
Like this:
import multiprocessing
import random as r

def cumput(i, list1):
    alist = []
    blist = []
    clist = []
    reslist = []
    for each in list1:  # small loop
        # functions involving both loop variables; random is used here as a stand-in
        x = r.randint(i + 30, i + 60) + each
        y = r.randint(i + 60, i + 120) + each
        z = r.randint(i + 60, i + 180) + each
        res = 2.5 * x - y - z
        reslist.append(res)  # operate on the function result
        if res >= 50:
            alist.append(each)
        if -50 < res < 50:
            blist.append(each)
        if res <= -50:
            clist.append(each)
    return alist, blist, clist, reslist

if __name__ == '__main__':
    multiprocessing.freeze_support()
    list1 = list(range(100))
    i = 0
    pool = multiprocessing.Pool(2)
    res = {}
    while i < 2000:  # big loop
        res[i] = pool.apply_async(cumput, (i, list1,))
        i += 1
    pool.close()
    pool.join()
    for i in res:
        for each in res[i].get()[0]:  # further operations on the small loop's results
            print(each)
        for each in res[i].get()[1]:
            print(each)
        for each in res[i].get()[2]:
            print(each)
typecho 2017-06-12 09:24:04
If the functions executed in the small loop are time-consuming, you can consider the producer-consumer model:
import random as r
from threading import Thread
from queue import Queue, Empty

resqueue = Queue()
aqueue = Queue()
bqueue = Queue()
cqueue = Queue()

def deal_data(data):
    # placeholder for the time-consuming operation
    pass

def producer():
    list1 = list(range(100))
    for i in range(2000):
        for each in list1:
            x = r.randint(i + 30, i + 60) + each
            y = r.randint(i + 60, i + 120) + each
            z = r.randint(i + 60, i + 180) + each
            res = 2.5 * x - y - z
            resqueue.put(res)
            if res >= 50:
                aqueue.put(each)
            if -50 < res < 50:
                bqueue.put(each)
            if res <= -50:
                cqueue.put(each)

def consumer(q):
    # the four consumers were identical apart from the queue they read,
    # so one function parameterized by the queue covers them all
    while True:
        try:
            data = q.get(timeout=5)
        except Empty:
            return
        else:
            deal_data(data)  # time-consuming operation
            q.task_done()

if __name__ == "__main__":
    t1 = Thread(target=producer)
    t2 = Thread(target=consumer, args=(aqueue,))
    ...
    t1.start()
    t2.start()
怪我咯 2017-06-12 09:24:04
The questioner should first design the processes' input and output. When multiple processes do parallel computation, inter-process communication is the most important part. As far as I know, the standard approach is MPI: for a multi-layer loop, first distribute a share of the data to each process; each process does its computation and returns its data to an integration point, which merges the results and outputs them.
Another important point is to estimate the execution time of each process: with inter-process communication, waiting time will also reduce efficiency.
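Real MPI means mpi4py plus an MPI runtime, but the scatter/compute/gather shape described above can be sketched with the standard library's multiprocessing (compute_chunk's body is a stand-in for the real per-chunk work):

```python
import multiprocessing

def compute_chunk(chunk):
    # each worker computes its share; summing is a stand-in
    # for the real per-chunk computation
    return sum(chunk)

def scatter_compute_gather(data, n_procs=4):
    # "scatter": stride slicing hands every element to some worker,
    # even when len(data) is not divisible by n_procs
    chunks = [data[k::n_procs] for k in range(n_procs)]
    # parallel compute, then "gather" the partial results
    with multiprocessing.Pool(n_procs) as pool:
        partials = pool.map(compute_chunk, chunks)
    # merge at the integration point
    return sum(partials)
```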
@daijianke said that your nesting does not comply with the input rules of parallel computing. You can take a look at this example:
http://blog.csdn.net/zouxy09/...
I have tested the examples in that article before and they work; if you follow those steps, you should be able to get it done.