
Understanding multi-process programming in Python

高洛峰
2017-03-03 13:51:33

This article takes an in-depth look at multi-process programming in Python and is offered as a reference.

1. Python multi-process programming background

The biggest benefit of multiple processes in Python is that they make full use of a multi-core CPU. Python threads are restricted by the GIL, so in CPython only one thread executes Python code at a time; separate processes have no such limit and can run on separate cores. Multiple processes suit almost every situation: as a rule, wherever multi-threading can be used, multiple processes can be used too.

Multi-process programming is very similar to multi-threaded programming. The threading package has a Thread class, with three ways to create and start a thread; multiprocessing has a Process class that can be used in the same ways.

The differences lie in how data is handled. Threads can share in-memory data such as lists directly, whereas processes do not share memory, so dedicated data structures are needed to pass data between them. Because threads share data, locks are needed to keep that data correct; with processes, locks are rarely a concern, since each process has its own memory and all exchanged data goes through those dedicated structures. The main topics covered here are the Process class, Queue, Pipe, and Pool.

2. The multi-process class Process

The Process class and the Thread class have similar methods, and their interfaces are essentially the same, as the following code shows:

#!/usr/bin/env python

from multiprocessing import Process
import os
import time

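def func(name):
  # Runs in the child process; name is passed in but not used here.
  print('start a process')
  time.sleep(3)
  print('the process parent id : %s' % os.getppid())
  print('the process id is : %s' % os.getpid())

if __name__ =='__main__':
  processes = []
  for i in range(2):
    p = Process(target=func,args=(i,))
    processes.append(p)
  # Start every child before joining any, so they run concurrently.
  for i in processes:
    i.start()
  print('start all process')
  # join() blocks until the corresponding child has exited.
  for i in processes:
    i.join()
  print('all sub process is done!')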

As the example shows, the multi-process and multi-thread APIs are the same: create the process, call start to run it, then call join to wait for it to finish.
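To make the shared interface concrete, here is a minimal sketch, written for Python 3, that drives Thread and Process through exactly the same create, start, join sequence. The names work and run_workers are hypothetical helpers introduced only for this illustration.

```python
import multiprocessing
import threading


def work(n):
    # Trivial task; both Thread and Process ignore the return value.
    return n * n


def run_workers(worker_cls, count=2):
    """Create, start and join `count` workers of the given class."""
    workers = [worker_cls(target=work, args=(i,)) for i in range(count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # After join() returns, no worker should still be running.
    return [w.is_alive() for w in workers]


if __name__ == "__main__":
    print(run_workers(threading.Thread))
    print(run_workers(multiprocessing.Process))
```

The only thing that changes between the two calls is the class passed in, which is the sense in which the two interfaces are the same.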

The function func in the listing prints the parent process ID and the process's own ID, so you can see the IDs of both the parent and the child. On Linux, processes are created mainly by fork. The parent and child IDs can be queried when a process is created, whereas in multi-threading every thread reports the same process ID. The output is as follows:

start all process
start a process
start a process

the process parent id : 8036
the process parent id : 8036
the process id is : 8037
the process id is : 8038
all sub process is done!

To look up the IDs from the operating system, pstree gives the clearest view:

├─sshd(1508)─┬─sshd(2259)───bash(2261)───python(7520)─┬─python(7521)
    │      │                    ├─python(7522)
    │      │                    ├─python(7523)
    │      │                    ├─python(7524)
    │      │                    ├─python(7525)
    │      │                    ├─python(7526)
    │      │                    ├─python(7527)
    │      │                    ├─python(7528)
    │      │                    ├─python(7529)
    │      │                    ├─python(7530)
    │      │                    ├─python(7531)
    │      │                    └─python(7532)

Running the code shows that without the join calls the main process does not stop at that point to wait for the children; it carries on executing, and only waits for the children once it has finished its own work.
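The effect of join can be observed directly. The sketch below, written for Python 3, checks is_alive() before and after join() and reads exitcode afterwards; slow_task and observe_join are hypothetical names used only for this illustration.

```python
import multiprocessing
import time


def slow_task(seconds):
    # Stand-in for real work in the child process.
    time.sleep(seconds)


def observe_join(seconds=0.5):
    p = multiprocessing.Process(target=slow_task, args=(seconds,))
    p.start()
    alive_before = p.is_alive()  # start() returns at once; child is still running
    p.join()                     # block here until the child exits
    alive_after = p.is_alive()
    return alive_before, alive_after, p.exitcode


if __name__ == "__main__":
    print(observe_join())
```

An exitcode of 0 after join() indicates that the child ended normally.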

With multiple processes, how do you get the return value of each process? A first attempt is the following code:

#!/usr/bin/env python

import multiprocessing

class MyProcess(multiprocessing.Process):
  def __init__(self,name,func,args):
    super(MyProcess,self).__init__()
    self.name = name
    self.func = func
    self.args = args
    self.res = ''

  def run(self):
    # run() executes in the child, so self.res is set on the child's copy only.
    self.res = self.func(*self.args)
    print(self.name)
    print(self.res)
    return (self.res,'kel')  # the return value of run() is discarded

def func(name):
  print('start process...')
  return name.upper()

if __name__ == '__main__':
  processes = []
  result = []
  for i in range(3):
    p = MyProcess('process',func,('kel',))
    processes.append(p)
  for i in processes:
    i.start()
  for i in processes:
    i.join()
  # The parent's objects still hold the initial '' in res.
  for i in processes:
    result.append(i.res)
  for i in result:
    print(i)

The idea was to store the return value in res so that the main process could read each child's result; however, nothing came back. The reason is that memory is not shared between processes, so keeping the data in an attribute or a list is not feasible: run executes in the child, and the parent's copy of the object is never updated. Interaction between processes has to go through special data structures, so the code above only runs the processes and cannot obtain their return values. If the same code is changed to use threads, the return values can be obtained.
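For contrast, here is a minimal sketch of the thread variant, written for Python 3. It assumes the same structure as the process class with threading.Thread as the base; MyThread, to_upper and collect_with_threads are hypothetical names. Because threads share the parent's memory, the value stored in run is visible afterwards.

```python
import threading


class MyThread(threading.Thread):
    def __init__(self, func, args):
        super().__init__()
        self.func = func
        self.args = args
        self.res = ""

    def run(self):
        # Runs in the same process, so this assignment is seen by the caller.
        self.res = self.func(*self.args)


def to_upper(name):
    return name.upper()


def collect_with_threads(count=3):
    threads = [MyThread(to_upper, ("kel",)) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [t.res for t in threads]


if __name__ == "__main__":
    print(collect_with_threads())
```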

3. Inter-process interaction Queue

For interaction between processes, you can start with a Queue, the same structure used in multi-threading. With multiple processes, however, you must use the Queue from multiprocessing. The code is as follows:

#!/usr/bin/env python

import multiprocessing

class MyProcess(multiprocessing.Process):
  def __init__(self,name,func,args):
    super(MyProcess,self).__init__()
    self.name = name
    self.func = func
    self.args = args
    self.res = ''

  def run(self):
    self.res = self.func(*self.args)

def func(name,q):
  print('start process...')
  q.put(name.upper())

if __name__ == '__main__':
  processes = []
  q = multiprocessing.Queue()
  for i in range(3):
    p = MyProcess('process',func,('kel',q))
    processes.append(p)
  for i in processes:
    i.start()
  # Take one result per child before joining, so the queue is drained first.
  for i in processes:
    print(q.get())
  for i in processes:
    i.join()

This is an improvement on the previous example and uses nothing else new: the code simply uses a Queue to hold the data, which achieves the goal of exchanging data between processes.
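Results arrive on the queue in the order the children finish, not the order they were started. One possible way to match results back to their inputs, sketched here for Python 3, is to tag each result with an index. The names worker and gather are hypothetical, and the tagging scheme is an assumption of this sketch rather than something the article prescribes.

```python
import multiprocessing


def worker(index, name, q):
    # Tag the result with the worker's index so the parent can match them up.
    q.put((index, name.upper()))


def gather(names):
    q = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(i, n, q))
             for i, n in enumerate(names)]
    for p in procs:
        p.start()
    # One get() per child, done before join() so the queue is drained.
    tagged = [q.get() for _ in procs]
    for p in procs:
        p.join()
    # Sort by index to restore the input order.
    return [value for _, value in sorted(tagged)]


if __name__ == "__main__":
    print(gather(["kel", "smile", "tom"]))
```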

Using Queue feels much like using a socket, because underneath data is still sent on one side and received with recv on the other; multiprocessing.Queue is implemented on top of a pipe plus a few locks.

In this kind of data exchange, it is really the parent process that interacts with each child; the children basically do not interact with one another. They can, though: for example, every child can take data from the same Queue. The Queue handles its own locking, but any other resource the children share needs a lock, otherwise the data may get mixed up.
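A minimal sketch of several children taking data from one Queue, written for Python 3, might look like the following. It assumes a None sentinel to tell each child to stop; consumer and process_all are hypothetical names used only here.

```python
import multiprocessing


def consumer(tasks, results):
    # Each child pulls work until it sees the None sentinel.
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put(item.upper())


def process_all(items, n_workers=3):
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=consumer, args=(tasks, results))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for item in items:
        tasks.put(item)
    for _ in workers:
        tasks.put(None)  # one sentinel per worker
    # Drain the results before joining the workers.
    out = [results.get() for _ in items]
    for w in workers:
        w.join()
    return sorted(out)


if __name__ == "__main__":
    print(process_all(["a", "b", "c", "d", "e"]))
```

No explicit lock is needed here because the only shared objects are the two queues; which child handles which item is not predictable, so the results are sorted before being returned.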

4. Inter-process interaction Pipe

Pipe can also be used when exchanging data between processes. The code is as follows:

#!/usr/bin/env python

import multiprocessing

class MyProcess(multiprocessing.Process):
  def __init__(self,name,func,args):
    super(MyProcess,self).__init__()
    self.name = name
    self.func = func
    self.args = args
    self.res = ''

  def run(self):
    self.res = self.func(*self.args)

def func(name,conn):
  print('start process...')
  # Send through the connection that was passed in, not a global.
  conn.send(name.upper())

if __name__ == '__main__':
  processes = []
  parent_conn,child_conn = multiprocessing.Pipe()
  for i in range(3):
    p = MyProcess('process',func,('kel',child_conn))
    processes.append(p)
  for i in processes:
    i.start()
  for i in processes:
    i.join()
  # One recv() per child: each child sent exactly one message.
  for i in processes:
    print(parent_conn.recv())

In the code above, the two connection objects returned by Pipe() are used to send and receive data: the parent process uses parent_conn and the child processes use child_conn. The children send data with the send method, and the parent receives it with recv.

Pipes work best when you know exactly how many sends and receives there will be; if an exception occurs part-way, the pipe can no longer be relied on.
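One possible way to guard against a child that never sends, sketched here for Python 3, is to give each child its own one-way pipe and call poll() with a timeout before recv(). This layout is an assumption of the sketch rather than the article's approach; child and gather_via_pipes are hypothetical names.

```python
import multiprocessing


def child(name, conn):
    conn.send(name.upper())
    conn.close()


def gather_via_pipes(names, timeout=5.0):
    pairs = []
    for name in names:
        # duplex=False: the first end only receives, the second only sends.
        parent_conn, child_conn = multiprocessing.Pipe(duplex=False)
        p = multiprocessing.Process(target=child, args=(name, child_conn))
        p.start()
        child_conn.close()  # the parent does not need the sending end
        pairs.append((p, parent_conn))
    results = []
    for p, conn in pairs:
        try:
            # poll() waits at most `timeout` seconds for data to arrive.
            value = conn.recv() if conn.poll(timeout) else None
        except EOFError:
            value = None  # the child exited without sending anything
        results.append(value)
        p.join()
    return results


if __name__ == "__main__":
    print(gather_via_pipes(["kel", "smile"]))
```

With one pipe per child, the parent always knows that exactly one message is expected on each connection, and a missing message shows up as None instead of a recv() that never returns.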

5. Process pool Pool

When using multiple processes, Pool feels like the most convenient option. The threading module has no pool of its own.

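With a pool you can limit how many processes run at once: only the configured number of processes are running, and the remaining tasks wait in line. By default the number of processes is the number of CPUs, that is, the value returned by multiprocessing.cpu_count().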
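The limit can be checked by recording which worker process handled each task. In this Python 3 sketch, report_pid and distinct_workers are hypothetical names, and the short sleep is only there to keep the workers busy so that tasks spread across them.

```python
import multiprocessing
import os
import time


def report_pid(_):
    time.sleep(0.1)  # keep the worker busy so tasks spread out
    return os.getpid()


def distinct_workers(pool_size, n_tasks):
    pool = multiprocessing.Pool(pool_size)
    try:
        pids = pool.map(report_pid, range(n_tasks))
    finally:
        pool.close()
        pool.join()
    # However many tasks there are, at most pool_size processes ran them.
    return len(set(pids))


if __name__ == "__main__":
    print(multiprocessing.cpu_count())
    print(distinct_workers(2, 6))
```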

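Pool offers two methods, map and imap, that are extremely convenient: once execution finishes, you get the return value of every call. The drawback is that each call can pass only one argument, so the function being executed takes at most one parameter; otherwise the arguments have to be packed together into one. The code is as follows: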

#!/usr/bin/env python

import multiprocessing

def func(name):
  print('start process')
  return name.upper()

if __name__ == '__main__':
  p = multiprocessing.Pool(5)
  # map returns a list of results.
  print(p.map(func,['kel','smile']))
  # imap returns an iterator over the results.
  for i in p.imap(func,['kel','smile']):
    print(i)

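map returns a list directly, and that list holds the results of the function calls; imap returns an iterator over the results. If several arguments are needed, the function will probably have to accept *args and read its parameters from args.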
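Two possible ways to pass several arguments through a pool are sketched below for Python 3: packing them into a tuple and unpacking inside a wrapper, or using Pool.starmap, which exists in Python 3.3 and later. The names greet, greet_packed and run are hypothetical.

```python
import multiprocessing


def greet(greeting, name):
    return "%s %s" % (greeting, name.upper())


def greet_packed(pair):
    # map() passes a single argument, so unpack the tuple here.
    return greet(*pair)


def run(pairs):
    pool = multiprocessing.Pool(2)
    try:
        packed = pool.map(greet_packed, pairs)
        starred = pool.starmap(greet, pairs)  # unpacks each tuple itself
    finally:
        pool.close()
        pool.join()
    return packed, starred


if __name__ == "__main__":
    print(run([("hi", "kel"), ("hello", "smile")]))
```

Both calls produce the same list; the wrapper version also works on interpreters that lack starmap.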

With apply_async, multiple arguments can be passed directly, as shown below:

#!/usr/bin/env python

import multiprocessing
import time

def func(name):
  print('start process')
  time.sleep(2)
  return name.upper()

if __name__ == '__main__':
  results = []
  p = multiprocessing.Pool(5)
  # Submit all tasks first; apply_async returns immediately.
  for i in range(7):
    res = p.apply_async(func,args=('kel',))
    results.append(res)
  # Wait up to 2.1 seconds for each result.
  for i in results:
    print(i.get(2.1))

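When collecting the results, note that a list is used to append each result object. Otherwise, calling get straight after each submission would block and turn the multi-process program into a single-process one, which is why the results are stored in a list first. When calling get you can also set a timeout, for example get(timeout=5).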
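The following Python 3 sketch combines the two points: apply_async called with more than one argument, and get with a timeout. It assumes a timed-out result should simply be recorded as None; slow_upper and run_async are hypothetical names.

```python
import multiprocessing
import time


def slow_upper(name, delay):
    time.sleep(delay)
    return name.upper()


def run_async(timeout):
    pool = multiprocessing.Pool(2)
    try:
        # Submit everything first; each call returns an AsyncResult at once.
        pending = [pool.apply_async(slow_upper, args=("kel", d))
                   for d in (0.0, 3.0)]
        out = []
        for res in pending:
            try:
                out.append(res.get(timeout=timeout))
            except multiprocessing.TimeoutError:
                out.append(None)  # result not ready within the timeout
    finally:
        pool.terminate()  # stop any task that is still sleeping
        pool.join()
    return out


if __name__ == "__main__":
    print(run_async(1.0))
```

get raises multiprocessing.TimeoutError when the result is not ready in time, so the caller decides whether to retry, skip, or give up.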

Summary:

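In multi-process programming, pay attention to the interaction between processes and to how you obtain a function's result after it has run: use special data structures such as Queue or Pipe, among others. With Pool you can get the results directly: map returns a list and imap returns an iterable, while the results of apply_async need to be collected in a list and then retrieved one by one.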
