Python multithreading


Multi-threading is similar to executing multiple different programs at the same time. Multi-threading has the following advantages:

  • Threads can move long-running tasks into the background.

  • The user interface can be more attractive: for example, when the user clicks a button that triggers some processing, a progress bar can pop up to show its progress.

  • The program may run faster.

  • Threads are especially useful for tasks that spend time waiting, such as user input, file reading and writing, and sending and receiving data over the network. While one thread waits, scarce resources such as memory can be released for other work.
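
As a rough illustration of the first and last points, the sketch below runs a slow, waiting task in the background while the main thread keeps working. It is written for Python 3 (an assumption; the listings in this tutorial use Python 2 syntax), and slow_download and its arguments are hypothetical names invented for this example.

```python
import threading
import time

results = []

def slow_download(name, seconds):
    """Hypothetical stand-in for a blocking task such as a network fetch."""
    time.sleep(seconds)              # simulates waiting on the network
    results.append(name + " done")

# Run the slow task in a background thread.
worker = threading.Thread(target=slow_download, args=("report.csv", 0.2))
worker.start()

# The main thread stays free to do other work in the meantime.
results.append("main thread kept working")

worker.join()                        # wait for the background task to finish
print(results)
```

The main thread only blocks at join(), so anything placed before that call runs while the download is still waiting.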

Threads differ from processes in how they execute. Each thread has its own entry point, a sequential flow of execution, and an exit point. However, a thread cannot run on its own: it must live inside an application, and the application controls the execution of its threads.

Each thread has its own set of CPU registers, called the thread's context, which reflects the state of the CPU registers when the thread last ran.

The instruction pointer and stack pointer registers are the two most important registers in the thread context. A thread always runs in the context of a process, and these registers hold addresses in the address space of the process that owns the thread.

  • Threads can be preempted (interrupted).

  • A thread can be temporarily put on hold (also called sleeping) while other threads are running - this is called thread yielding.


Start learning Python threads

There are two ways to use threads in Python: pass a function to the thread module, or wrap the thread in a class.

Functional: Call the start_new_thread() function in the thread module to generate a new thread. The syntax is as follows:

thread.start_new_thread(function, args[, kwargs])

Parameter description:

  • function - thread function.

  • args - the arguments passed to the thread function; must be a tuple.

  • kwargs - an optional dictionary of keyword arguments.
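
The parameters above can be tried out with the short sketch below. It assumes Python 3, where the thread module is named _thread; greet and its punctuation parameter are hypothetical names made up for the example. A lock is used only so that the main thread can wait for the new thread to finish.

```python
import _thread   # Python 3 name of the Python 2 `thread` module

done = _thread.allocate_lock()
messages = []

def greet(name, punctuation="."):
    messages.append("hello " + name + punctuation)
    done.release()                   # signal the main thread that we finished

done.acquire()                       # held until the new thread releases it
# args must be a tuple (note the trailing comma); kwargs is a dictionary.
_thread.start_new_thread(greet, ("Thread-1",), {"punctuation": "!"})

done.acquire()                       # blocks until greet() has released the lock
print(messages)                      # prints ['hello Thread-1!']
```

Writing ("Thread-1") without the trailing comma would pass a plain string rather than a one-element tuple, which is a common mistake with this call.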

Example:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import thread
import time

# Define a function for the thread
def print_time( threadName, delay):
   count = 0
   while count < 5:
      time.sleep(delay)
      count += 1
      print "%s: %s" % ( threadName, time.ctime(time.time()) )

# Create two threads
try:
   thread.start_new_thread( print_time, ("Thread-1", 2, ) )
   thread.start_new_thread( print_time, ("Thread-2", 4, ) )
except:
   print "Error: unable to start thread"

# Keep the main thread alive so the worker threads can keep running
while 1:
   pass

Running the program above produces output like the following:

Thread-1: Thu Jan 22 15:42:17 2009
Thread-1: Thu Jan 22 15:42:19 2009
Thread-2: Thu Jan 22 15:42:19 2009
Thread-1: Thu Jan 22 15:42:21 2009
Thread-2: Thu Jan 22 15:42:23 2009
Thread-1: Thu Jan 22 15:42:23 2009
Thread-1: Thu Jan 22 15:42:25 2009
Thread-2: Thu Jan 22 15:42:27 2009
Thread-2: Thu Jan 22 15:42:31 2009
Thread-2: Thu Jan 22 15:42:35 2009

A thread normally ends when its thread function returns. The thread function can also call thread.exit(), which raises a SystemExit exception to exit the thread.
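
The following sketch shows the second way of ending a thread. It assumes Python 3, where the call is spelled _thread.exit(); the worker function and the events list are hypothetical names used only for illustration.

```python
import _thread
import threading

events = []

def worker():
    events.append("before exit")
    _thread.exit()                   # raises SystemExit inside this thread only
    events.append("after exit")      # never reached

t = threading.Thread(target=worker)
t.start()
t.join()
print(events)                        # prints ['before exit']
```

The SystemExit raised here ends only the thread that called exit(); the main thread carries on and prints the list.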


Thread module

Python provides support for threads through two standard libraries, thread and threading. thread provides low-level, primitive threads and a simple lock.

The threading module also provides the following functions:

  • threading.currentThread(): Returns the current Thread object.

  • threading.enumerate(): Returns a list of the running threads. A thread counts as running after it has started and before it ends; threads that have not started or have already terminated are excluded.

  • threading.activeCount(): Returns the number of running threads, which is the same as len(threading.enumerate()).
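
A small sketch of these three functions follows. It assumes Python 3 and uses the snake_case spellings current_thread() and active_count(), which are the Python 3 names for the functions listed above. The "Waiter" thread and the release event are hypothetical and exist only so that a second thread is alive while the functions are called.

```python
import threading

release = threading.Event()

def wait_for_release():
    release.wait()                   # keep the thread alive until told to stop

t = threading.Thread(target=wait_for_release, name="Waiter")
t.start()

names = [th.name for th in threading.enumerate()]
count = threading.active_count()
current = threading.current_thread().name

print(current, count, names)

release.set()                        # let the waiting thread finish
t.join()
```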

In addition to these functions, the threading module provides the Thread class for working with threads. The Thread class provides the following methods:

  • run(): The method that represents the thread's activity.

  • start(): Starts the thread's activity.

  • join([timeout]): Waits until the thread terminates. This blocks the calling thread until the thread whose join() method was called terminates, either normally or through an unhandled exception, or until the optional timeout occurs.

  • isAlive(): Returns whether the thread is alive.

  • getName(): Returns the thread's name.

  • setName(): Sets the thread's name.
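
The interplay of start(), join() with a timeout, and the liveness check can be sketched as follows. The sketch assumes Python 3, where isAlive() is spelled is_alive() and the name is available as the name attribute; the stop event and the thread name "Worker-1" are hypothetical.

```python
import threading

stop = threading.Event()

# The thread simply waits on the event, so it stays alive until stop.set().
t = threading.Thread(target=stop.wait, name="Worker-1")
t.start()                            # start() runs the target in a new thread

t.join(0.1)                          # give up waiting after 0.1 seconds
alive_after_timeout = t.is_alive()

stop.set()                           # let the thread finish
t.join()                             # now wait without a timeout
alive_after_join = t.is_alive()

print(t.name, alive_after_timeout, alive_after_join)   # prints Worker-1 True False
```

join() returns the same way whether the thread finished or the timeout expired, so the liveness check afterwards is what tells the two cases apart.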


Use the threading module to create threads

To create threads with the threading module, subclass threading.Thread and override its __init__ and run methods:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import thread      # needed for thread.exit() in print_time
import threading
import time

exitFlag = 0

class myThread (threading.Thread):   # Inherit from the parent class threading.Thread
    def __init__(self, threadID, name, counter):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter
    def run(self):                   # Put the code to execute in run(); the thread calls run() once it is started
        print "Starting " + self.name
        print_time(self.name, self.counter, 5)
        print "Exiting " + self.name

def print_time(threadName, delay, counter):
    while counter:
        if exitFlag:
            thread.exit()
        time.sleep(delay)
        print "%s: %s" % (threadName, time.ctime(time.time()))
        counter -= 1

# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

# Start the threads
thread1.start()
thread2.start()

print "Exiting Main Thread"

The program above produces the following output:

Starting Thread-1
Starting Thread-2
Exiting Main Thread
Thread-1: Thu Mar 21 09:10:03 2013
Thread-1: Thu Mar 21 09:10:04 2013
Thread-2: Thu Mar 21 09:10:04 2013
Thread-1: Thu Mar 21 09:10:05 2013
Thread-1: Thu Mar 21 09:10:06 2013
Thread-2: Thu Mar 21 09:10:06 2013
Thread-1: Thu Mar 21 09:10:07 2013
Exiting Thread-1
Thread-2: Thu Mar 21 09:10:08 2013
Thread-2: Thu Mar 21 09:10:10 2013
Thread-2: Thu Mar 21 09:10:12 2013
Exiting Thread-2
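
The listing above checks a global exitFlag inside the loop so that a thread can be asked to stop early. One possible variation, sketched here for Python 3 and not taken from the original program, replaces the flag with a threading.Event; stop_requested and ticks are hypothetical names.

```python
import threading
import time

stop_requested = threading.Event()
ticks = []

def print_time(thread_name, delay, counter):
    # Loop until the counter runs out or a stop has been requested.
    while counter and not stop_requested.is_set():
        time.sleep(delay)
        ticks.append(thread_name)
        counter -= 1

t = threading.Thread(target=print_time, args=("Thread-1", 0.05, 1000))
t.start()

time.sleep(0.2)                      # let the thread run for a moment
stop_requested.set()                 # ask the thread to stop early
t.join()
print(len(ticks))
```

The thread notices the request the next time it checks the event, so it ends long before its counter of 1000 is used up.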

Thread synchronization

If multiple threads modify the same data, the results can be unpredictable. To keep the data correct, the threads need to be synchronized.

Simple thread synchronization can be achieved with the threading module's Lock and RLock objects. Both have acquire and release methods. When a piece of data must be operated on by only one thread at a time, place the operations between the acquire and release calls.
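
A minimal sketch of the acquire and release pattern, assuming Python 3; the counter, counter_lock, and add_many names are hypothetical. Four threads increment a shared counter, and the lock ensures that only one of them performs the update at any moment.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        counter_lock.acquire()       # only one thread passes this point at a time
        try:
            counter += 1             # the protected read-modify-write
        finally:
            counter_lock.release()   # always release, even if an error occurs

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)                       # prints 40000
```

Lock objects also work as context managers, so the acquire/try/finally/release sequence can be written more compactly as a with counter_lock: block.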

The advantage of multi-threading is that it can run multiple tasks at the same time (or at least it feels that way). But when threads need to share data, the data can get out of sync.

Consider this situation: all elements in a list are 0, thread "set" changes all elements to 1 from back to front, and thread "print" reads the list from front to back and prints it.

If thread "print" prints the list while thread "set" is partway through its changes, the output will be half 0s and half 1s. The data is out of sync. Locks were introduced to avoid this situation.

Locks have two states: locked and unlocked. Whenever a thread such as "set" wants to access shared data, it must first acquire the lock. If another thread such as "print" already holds the lock, thread "set" is paused (this is synchronous blocking) until thread "print" finishes its access and releases the lock, after which thread "set" continues.

After such processing, when printing the list, either all 0s or all 1s will be output, and there will no longer be an embarrassing scene of half 0s and half 1s.
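
The "set" and "print" scenario can be sketched as follows, assuming Python 3. The names data, data_lock, set_ones, read_list, and snapshots are hypothetical, and the readers store copies of the list instead of printing them so that the result can be checked afterwards.

```python
import threading

data = [0] * 1000
data_lock = threading.Lock()
snapshots = []

def set_ones():
    with data_lock:                  # hold the lock for the whole update
        for i in range(len(data) - 1, -1, -1):   # back to front
            data[i] = 1

def read_list():
    with data_lock:                  # wait if "set" is in the middle of its update
        snapshots.append(list(data)) # front-to-back copy of the list

readers = [threading.Thread(target=read_list) for _ in range(5)]
setter = threading.Thread(target=set_ones)

# Start some readers before the setter and some after it.
for t in readers[:2] + [setter] + readers[2:]:
    t.start()
for t in readers + [setter]:
    t.join()

# Every snapshot is either all 0s or all 1s, never a mixture.
all_consistent = all(sum(snap) in (0, len(snap)) for snap in snapshots)
print(len(snapshots), all_consistent)            # prints 5 True
```

Which snapshots contain 0s and which contain 1s depends on scheduling, but the lock rules out a half-updated list.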

Example:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import threading
import time

class myThread (threading.Thread):
    def __init__(self, threadID, name, counter):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter
    def run(self):
        print "Starting " + self.name
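        # Acquire the lock; acquire() returns True once the lock is held.
        # Called with no arguments, it blocks until the lock is available;
        # with blocking set to False, it returns False at once if the lock is taken.
        threadLock.acquire()
        print_time(self.name, self.counter, 3)
        # Release the lock
        threadLock.release()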

def print_time(threadName, delay, counter):
    while counter:
        time.sleep(delay)
        print "%s: %s" % (threadName, time.ctime(time.time()))
        counter -= 1

threadLock = threading.Lock()
threads = []

# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

# Start the new threads
thread1.start()
thread2.start()

# Add the threads to the thread list
threads.append(thread1)
threads.append(thread2)

# Wait for all threads to finish
for t in threads:
    t.join()
print "Exiting Main Thread"

Thread priority queue (Queue)

Python’s Queue module provides synchronized, thread-safe queue classes: the FIFO (first in, first out) queue Queue, the LIFO (last in, first out) queue LifoQueue, and the priority queue PriorityQueue. These queues handle locking internally and can be used directly from multiple threads, so they can also be used to synchronize threads.
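
The difference between the three queue classes shows up in the order in which items come back out. The sketch below assumes Python 3, where the module is named queue; drain is a hypothetical helper written for this example.

```python
import queue   # Python 3 name of the Python 2 `Queue` module

def drain(q):
    """Remove and return all items currently in the queue."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items

fifo, lifo, prio = queue.Queue(), queue.LifoQueue(), queue.PriorityQueue()
for item in (3, 1, 2):
    fifo.put(item)
    lifo.put(item)
    prio.put(item)

fifo_order = drain(fifo)             # [3, 1, 2]: same order as inserted
lifo_order = drain(lifo)             # [2, 1, 3]: most recent item first
prio_order = drain(prio)             # [1, 2, 3]: smallest item first
print(fifo_order, lifo_order, prio_order)
```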

Commonly used methods in the Queue module:


  • Queue.qsize() Returns the approximate size of the queue

  • Queue.empty() Returns True if the queue is empty, otherwise False

  • Queue.full() Returns True if the queue is full, otherwise False

  • A queue is full when it holds maxsize items, where maxsize is the size given when the queue was created

  • Queue.get([block[, timeout]]) Removes and returns an item from the queue; timeout is the maximum time to wait

  • Queue.get_nowait() Equivalent to Queue.get(False)

  • Queue.put(item[, block[, timeout]]) Puts item into the queue; timeout is the maximum time to wait

  • Queue.put_nowait(item) Equivalent to Queue.put(item, False)

  • Queue.task_done() Signals to the queue that a previously fetched task is complete

  • Queue.join() Blocks until every item in the queue has been fetched and marked complete with task_done()
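
The way get(), task_done(), and join() fit together can be sketched as follows, assuming Python 3 and its queue module. The worker function, the processed list, and the use of None as a stop signal are choices made for this illustration only.

```python
import queue
import threading

work = queue.Queue(maxsize=10)
processed = []
processed_lock = threading.Lock()

def worker():
    while True:
        item = work.get()            # blocks until an item is available
        if item is None:             # None is used here as a stop signal
            work.task_done()
            break
        with processed_lock:
            processed.append(item.upper())
        work.task_done()             # one task_done() for each get()

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

for word in ["One", "Two", "Three", "Four", "Five"]:
    work.put(word)

work.join()                          # returns once every item has been marked done

for _ in threads:
    work.put(None)                   # one stop signal per worker
for t in threads:
    t.join()

print(sorted(processed))             # prints ['FIVE', 'FOUR', 'ONE', 'THREE', 'TWO']
```

Because get() blocks while the queue is empty, the workers do not need to poll the queue or hold a separate lock around it.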

Example:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import Queue
import threading
import time

exitFlag = 0

class myThread (threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q
    def run(self):
        print "Starting " + self.name
        process_data(self.name, self.q)
        print "Exiting " + self.name

def process_data(threadName, q):
    while not exitFlag:
        queueLock.acquire()
        if not q.empty():
            data = q.get()
            queueLock.release()
            print "%s processing %s" % (threadName, data)
        else:
            queueLock.release()
        time.sleep(1)

threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = Queue.Queue(10)
threads = []
threadID = 1

# Create new threads
for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1

# Fill the queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()

# Wait for the queue to empty
while not workQueue.empty():
    pass

# Tell the threads it is time to exit
exitFlag = 1

# Wait for all threads to finish
for t in threads:
    t.join()
print "Exiting Main Thread"

The program above produces the following output:

Starting Thread-1
Starting Thread-2
Starting Thread-3
Thread-1 processing One
Thread-2 processing Two
Thread-3 processing Three
Thread-1 processing Four
Thread-2 processing Five
Exiting Thread-3
Exiting Thread-1
Exiting Thread-2
Exiting Main Thread