
Using threads and processes in Python crawlers (with code)

不言
2018-09-28 14:31:37

This article shows how to use processes and threads in Python crawlers, with code for each. It is meant as a reference for anyone who needs it.

Processes

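Usage

  • Import the module

import multiprocessing

  • Create a process

p1 = multiprocessing.Process(target=test1)

Process parameters: group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None

target is the callable the process runs, args and kwargs are its arguments, and name is a label for the process. group should always be None; it exists only for compatibility with threading.Thread.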
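As a rough illustration of how these parameters fit together, the sketch below builds a process with target, name, args and kwargs, then starts it and waits for it. The function fetch, its url and retries arguments, and the name crawler-1 are hypothetical and chosen only for this example.

```python
import multiprocessing


def fetch(url, retries=1):
    # Hypothetical worker: a real crawler would download the page here.
    print('%s fetching %s (retries=%d)'
          % (multiprocessing.current_process().name, url, retries))


def build_process():
    # args fills positional parameters, kwargs fills keyword parameters.
    return multiprocessing.Process(
        target=fetch,
        name='crawler-1',
        args=('http://example.com/',),
        kwargs={'retries': 3},
        daemon=False,
    )


if __name__ == '__main__':
    p = build_process()
    p.start()   # the child process begins running fetch(...)
    p.join()    # wait for the child process to finish
    print(p.name, 'exit code:', p.exitcode)
```

The if __name__ == '__main__' guard matters here: on platforms that start child processes by re-importing the main module, unguarded start() calls would run again in every child.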

  • Global variables

import time, os
import multiprocessing

nums = [11, 22, 33]

def test():
    # Modifies only this process's own copy of nums
    nums.append(44)
    print('in process 1 nums=%s' % str(nums), id(nums))
    time.sleep(3)

def test2():
    # Still sees the original list, without 44
    print('in process 2 nums=%s' % str(nums), id(nums))

def main():
    print('----in main process pid=%d----parent process pid=%d----' % (os.getpid(), os.getppid()))
    p = multiprocessing.Process(target=test)
    p.start()

    p2 = multiprocessing.Process(target=test2)
    p2.start()
    # test()
    # test2()

if __name__ == '__main__':
    main()

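Global variables are not shared between processes. Each process has its own memory space, so every child process works on its own copy of nums, and the 44 appended in process 1 never shows up in process 2.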
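One way to observe this, and to get data back from a child process anyway, is to send it through a multiprocessing.Queue. The sketch below is only an illustration under that assumption: the names child and run_demo are hypothetical, and the queue is an added technique that the code above does not use.

```python
import multiprocessing

nums = [11, 22, 33]


def child(q):
    # This append changes only the child process's own copy of nums.
    nums.append(44)
    q.put(list(nums))   # send a snapshot back to the parent


def run_demo():
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=child, args=(q,))
    p.start()
    child_view = q.get(timeout=30)   # read before join() so the child can flush the queue
    p.join()
    return child_view, list(nums)


if __name__ == '__main__':
    child_view, parent_view = run_demo()
    print('child saw :', child_view)
    print('parent has:', parent_view)
```

The child reports a list ending in 44 while the parent's nums stays unchanged, which is the copy behaviour described above.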

Threads

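Usage

  • Import the module

import threading

  • Create a thread

t1 = threading.Thread(target=test1, args=(1000000,))

Thread parameters: group=None, target=None, name=None, args=(), kwargs=None, *, daemon=None

target is the callable the thread runs, args and kwargs are its arguments, and name is a label for the thread. group should be None; it is reserved for future use.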
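A minimal sketch of these parameters in use is shown below. The function fetch, the results dictionary, and the thread name are hypothetical and exist only for illustration.

```python
import threading

results = {}


def fetch(url, retries=1):
    # Hypothetical worker: records which thread handled which URL.
    results[url] = (threading.current_thread().name, retries)


t = threading.Thread(
    target=fetch,
    name='crawler-thread-1',
    args=('http://example.com/',),   # positional arguments for fetch
    kwargs={'retries': 3},           # keyword arguments for fetch
    daemon=True,                     # daemon threads do not keep the interpreter alive at exit
)
t.start()
t.join()   # wait until fetch() has returned
print(results)
```

Calling join() before reading results ensures the thread has finished its work.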

  • Global variables

import threading

g_num = 0
mutex = threading.Lock()

def test1(num):
    global g_num
    # mutex.acquire()    # alternative: lock once around the whole loop
    for i in range(num):
        mutex.acquire()
        g_num += 1       # critical section
        mutex.release()
    # mutex.release()
    print('-------in test1 g_num=%d-----' % g_num)

def test2(num):
    global g_num
    # mutex.acquire()
    for i in range(num):
        mutex.acquire()
        g_num += 1
        mutex.release()
    # mutex.release()
    print('-------in test2 g_num=%d-----' % g_num)

def main():
    t1 = threading.Thread(target=test1, args=(1000000,))
    t2 = threading.Thread(target=test2, args=(1000000,))
    t1.start()
    t2.start()
    # Wait for both threads to finish instead of sleeping for a fixed time
    t1.join()
    t2.join()
    print('-------------in main Thread g_num = %d----' % g_num)

if __name__ == '__main__':
    main()

Unlike processes, threads share the same global variables, so concurrent updates to g_num have to be protected by a lock.

A critical section is a block of code that only one thread may execute at a time. The lock usually encloses only the statements that modify shared data.

If the lock is already held when a thread calls acquire(), that thread waits until the holder calls release().
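The same critical section can also be written with the lock as a context manager, which releases the lock even if the body raises an exception. The sketch below is one possible variant, not the original author's code; the names counter, add and run are hypothetical.

```python
import threading

counter = 0
lock = threading.Lock()


def add(n):
    global counter
    for _ in range(n):
        # "with lock" acquires on entry and releases on exit, even on error
        with lock:
            counter += 1


def run(n=100000):
    global counter
    counter = 0
    threads = [threading.Thread(target=add, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()   # wait for both threads before reading the total
    return counter


print(run())   # 200000: no increment is lost
```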

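Other synchronization primitives: threading.RLock() is a recursive (reentrant) lock that the owning thread can acquire more than once; threading.Condition is a condition variable (conditional lock); threading.Semaphore is a semaphore that limits how many threads may enter a section at the same time.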
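A small sketch of two of these primitives follows. It assumes a hypothetical limit of two concurrent tasks; the names outer, inner, limited_task, active and peak are made up for the example, and time.sleep stands in for a download.

```python
import threading
import time

rlock = threading.RLock()


def outer():
    with rlock:
        return inner()   # acquires rlock again in the same thread


def inner():
    with rlock:          # a plain Lock would deadlock here
        return 'ok'


sem = threading.Semaphore(2)   # at most 2 threads inside at once
state_lock = threading.Lock()
active = 0
peak = 0


def limited_task():
    global active, peak
    with sem:
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)       # stands in for a download
        with state_lock:
            active -= 1


threads = [threading.Thread(target=limited_task) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(outer(), 'peak concurrency:', peak)
```

In a crawler, a semaphore like this is one way to cap the number of simultaneous requests.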

  • Producer-consumer pattern

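The producer-consumer pattern is an application of inter-thread communication: producer threads put data into a shared container and consumer threads take it out.

When threads share a data structure, check whether it is thread-safe. queue.Queue is thread-safe because it does its own locking. A plain list ([]) or dictionary ({}) does not coordinate multi-step operations across threads, so protect it with a lock when several threads modify it.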
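The blocking behaviour of queue.Queue can be seen without any threads. The sketch below uses a queue of size 2 (an arbitrary choice for illustration) and the non-blocking or timed variants of put and get, which raise queue.Full and queue.Empty instead of waiting forever.

```python
import queue

q = queue.Queue(maxsize=2)
q.put('a')
q.put('b')

try:
    q.put_nowait('c')       # queue is full; a plain put() would block here
    overflow = False
except queue.Full:
    overflow = True

first = q.get()             # items come out in FIFO order
second = q.get()

try:
    q.get(timeout=0.1)      # queue is empty; a plain get() would block here
    underflow = False
except queue.Empty:
    underflow = True

print(overflow, first, second, underflow)   # True a b True
```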

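import time, threading
from queue import Queue

def set_value(q):
    # Producer: put two items, then pause
    index = 0
    while True:
        q.put(index)
        index += 1
        q.put(index)
        index += 1
        time.sleep(2)

def get_value(q):
    # Consumer
    while True:
        # If the queue is empty, q.get() blocks until data arrives
        print('consumer got data:', q.get())

def main():
    q = Queue(4)
    t1 = threading.Thread(target=set_value, args=[q])
    t2 = threading.Thread(target=get_value, args=[q])
    t1.start()
    t2.start()

if __name__ == '__main__':
    main()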
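The producer and consumer above loop forever. A common way to let both threads finish is to send a sentinel value through the queue. The sketch below is one such variant under that assumption; STOP, producer, consumer and run are hypothetical names.

```python
import threading
from queue import Queue

STOP = object()   # sentinel: tells the consumer that no more data will come


def producer(q, count):
    for i in range(count):
        q.put(i)        # blocks while the queue is full
    q.put(STOP)


def consumer(q, out):
    while True:
        item = q.get()  # blocks while the queue is empty
        if item is STOP:
            break
        out.append(item)


def run(count=10):
    q = Queue(4)
    out = []
    t1 = threading.Thread(target=producer, args=(q, count))
    t2 = threading.Thread(target=consumer, args=(q, out))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return out


print(run())   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With several consumers, one sentinel per consumer would be needed so that each of them stops.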


Statement:
This article is reproduced from cnblogs.com. In case of infringement, please contact admin@php.cn to have it removed.
