Home > Article > Backend Development > Is multi-processing or multi-threading faster in python?
The following editor will bring you an article on who is faster between python multi-process and multi-thread (detailed explanation). The editor thinks it’s pretty good, so I’ll share it with you now and give it as a reference. Let’s follow the editor and take a look.
python3.6
threading and multiprocessing
##Quad-core + Samsung 250G-850-SSD
Since used For multi-process and multi-thread programming, I still don’t understand which one is faster. Many people on the Internet say that Python multi-process is faster because of GIL (Global Interpreter Lock). But when I write code, the test time is faster with multi-threading, so what is going on? Recently I have been doing word segmentation work again. The original code is too slow and I want to speed it up, so let’s explore effective methods (there are codes and renderings at the end of the article)Here is the result of the program. Diagram to illustrate which thread or process is faster
Some definitions
Parallelism means two or more events occur at the same time. Concurrency means that two or more events occur within the same time interval. Threads are the smallest units that the operating system can perform computing scheduling. It is included in the process and is the actual operating unit in the process. An execution instance of a program is a process.Implementation process
The multi-threads in python obviously have to get the GIL, execute the code, and finally release the GIL. So because of GIL, you can't get it when using multiple threads. In fact, it is a concurrent implementation, that is, multiple events occur at the same time interval. But the process has an independent GIL, so it can be implemented in parallel. Therefore, for multi-core CPUs, theoretically using multiple processes can more effectively utilize resources.Real-life problems
In online tutorials, you can often see python multi-threading. For example, web crawler tutorials and port scanning tutorials. Take port scanning as an example. You can use multi-process to implement the following script, and you will find that python multi-process is faster. So isn’t it contrary to our analysis?import sys,threading from socket import * host = "127.0.0.1" if len(sys.argv)==1 else sys.argv[1] portList = [i for i in range(1,1000)] scanList = [] lock = threading.Lock() print('Please waiting... From ',host) def scanPort(port): try: tcp = socket(AF_INET,SOCK_STREAM) tcp.connect((host,port)) except: pass else: if lock.acquire(): print('[+]port',port,'open') lock.release() finally: tcp.close() for p in portList: t = threading.Thread(target=scanPort,args=(p,)) scanList.append(t) for i in range(len(portList)): scanList[i].start() for i in range(len(portList)): scanList[i].join()
Who is faster
Because of the python lock problem, threads competing for locks and switching threads will consume resources. So, let’s take a bold guess: In CPU-intensive tasks, multi-processing is faster or more effective; while in IO-intensive tasks, multi-threading can effectively improve efficiency. Let’s take a look at the following code:import time import threading import multiprocessing max_process = 4 max_thread = max_process def fun(n,n2): #cpu密集型 for i in range(0,n): for j in range(0,(int)(n*n*n*n2)): t = i*j def thread_main(n2): thread_list = [] for i in range(0,max_thread): t = threading.Thread(target=fun,args=(50,n2)) thread_list.append(t) start = time.time() print(' [+] much thread start') for i in thread_list: i.start() for i in thread_list: i.join() print(' [-] much thread use ',time.time()-start,'s') def process_main(n2): p = multiprocessing.Pool(max_process) for i in range(0,max_process): p.apply_async(func = fun,args=(50,n2)) start = time.time() print(' [+] much process start') p.close()#关闭进程池 p.join()#等待所有子进程完毕 print(' [-] much process use ',time.time()-start,'s') if name=='main': print("[++]When n=50,n2=0.1:") thread_main(0.1) process_main(0.1) print("[++]When n=50,n2=1:") thread_main(1) process_main(1) print("[++]When n=50,n2=10:") thread_main(10) process_main(10)
The results are as follows:
It can be seen that when the When the CPU usage becomes higher and higher (the more code cycles there are), the gap becomes wider and wider. Verify our conjectureCPU and IO intensive
1, CPU intensive code (various loop processing, counting, etc.)Judgment method:
1. Directly look at the CPU usage, hard disk IO read and write speedExamples of multi-process and multi-threading in Python (1)
2.Recommended to use multi-process instead of multi-threading in Python? Share the reasons for recommending the use of multi-process
3.Examples of multi-process and multi-threading in Python (2) Programming methods
4.About Python Detailed introduction to processes, threads, and coroutines
5.Python concurrent programming thread pool/process pool
The above is the detailed content of Is multi-processing or multi-threading faster in python?. For more information, please follow other related articles on the PHP Chinese website!