Don't force kill python threads

高洛峰

Feb 28, 2017 am 09:06 AM

Foreword:

Don't try to kill a Python thread by force; it is poor service design. Threads exist so that tasks can cooperate concurrently, and killing one by force is very likely to cause unexpected bugs. Remember that a lock is not released just because the thread holding it exits!

We can cite two common examples:

1. Thread A acquires a lock and is forcibly killed before it can call release(). Every other thread then blocks when it tries to acquire the lock, which is a typical deadlock.

2. In the common producer-consumer scenario, a consumer takes a task from the task queue and is killed before finishing it, without putting the unfinished task back into the queue. The task is lost.
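
The first case can be reproduced without killing anything. The sketch below is illustrative only (the names `worker`, `still_locked` and `acquired` are not from the original article): a thread takes a `threading.Lock` and ends without calling release(), and the lock stays held after the thread is gone.

```python
import threading

lock = threading.Lock()

def worker():
    # Acquire the lock and return without calling release(),
    # which is the state a forcibly killed thread would leave behind.
    lock.acquire()

t = threading.Thread(target=worker)
t.start()
t.join()

# The thread has exited, yet the lock is still held.
still_locked = lock.locked()

# Any other thread that needs the lock now waits in vain;
# a timeout is used here so the demonstration itself does not hang.
acquired = lock.acquire(timeout=0.1)
```

Using `with lock:` inside the worker would release the lock on a normal exit or an ordinary exception, but it cannot help a thread that is torn down from outside the interpreter.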

The following are the ways to terminate threads in Java and Python.

Java has three ways to terminate a thread:

1. Use an exit flag so that the thread exits normally, that is, the thread terminates when its run method completes.
2. Use the stop method to terminate the thread by force (not recommended: stop, like suspend and resume, is deprecated and may produce unpredictable results).
3. Use the interrupt method to interrupt the thread.

Python has two:

1. Use an exit flag
2. Use ctypes to kill the thread by force

Whether in Python or Java, the ideal way to stop a thread is to let it end itself: you give it a flag, and it exits when it sees the flag.

Below we try several ways of stopping a Python thread and look at what goes wrong. First, list all the threads of the process. A process is the unit that owns resources, while a thread is the unit of scheduling, so a process must have at least one thread in order to run. The main thread's ID is the same as the PID of the process.

ps -mp 31449 -o THREAD,tid
 
USER   %CPU PRI SCNT WCHAN USER SYSTEM  TID
root   0.0  -  - -     -   -   -
root   0.0 19  - poll_s  -   - 31449
root   0.0 19  - poll_s  -   - 31450

After listing the threads of the process, strace tells us that 31450 is the thread we want to kill. But when we run kill against it, the entire process dies. In a multi-threaded program a signal generated this way is delivered to the process as a whole, and in general any thread may receive it. The handler runs in the context of whichever thread receives the signal, and it is hard to know in advance which one that will be. In other words, the signal goes to an arbitrary thread of the process.

strace -p 31450
Process 31450 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 320326}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})   = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})   = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})   = ? ERESTARTNOHAND (To be restarted)
--- SIGTERM (Terminated) @ 0 (0) ---
Process 31450 detached

This behaviour matches what the pthread documentation describes. If we register a signal handler in the Python code, the callback keeps the whole process from exiting. But then a new problem appears: the handler cannot tell which thread you wanted to kill, so it cannot kill one specific thread. Although you sent the signal to thread ID 31450, the receiver can be any thread in the process it belongs to. Moreover, the only arguments passed to a Python signal handler are the signal number and the current stack frame.

After adding signal processing, the process will not exit

select(0, NULL, NULL, NULL, {1, 0})   = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})   = ? ERESTARTNOHAND (To be restarted)
--- SIGTERM (Terminated) @ 0 (0) ---
rt_sigreturn(0xffffffff)        = -1 EINTR (Interrupted system call)
select(0, NULL, NULL, NULL, {1, 0})   = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})   = 0 (Timeout)
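
A handler of the kind described above might look like the following minimal sketch. It assumes a POSIX system and CPython; the names `on_sigterm` and `handled` are illustrative and do not come from the original program. Note that CPython always runs Python-level signal handlers in the main thread, so the handler cannot be used to single out a worker thread.

```python
import os
import signal
import threading
import time

handled = []  # illustrative log of (signal number, name of thread running the handler)

def on_sigterm(signum, frame):
    # The handler gets only the signal number and the current stack frame;
    # nothing tells it which thread the sender had in mind.
    handled.append((signum, threading.current_thread().name))

# signal.signal() may only be called from the main thread.
signal.signal(signal.SIGTERM, on_sigterm)

# Send SIGTERM to our own process. With the handler installed,
# the process keeps running instead of terminating.
os.kill(os.getpid(), signal.SIGTERM)
time.sleep(0.1)  # give the interpreter a moment to run the handler
```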

If you want an external notification to stop a thread, you can build an RPC service or use some other communication channel. Signals are not suitable, because they cannot carry any additional information.
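
One possible shape for such a channel is sketched below. `StopRegistry` and `request_stop` are hypothetical names, and the RPC transport itself is left out: the assumption is that whatever RPC or socket handler receives the external request simply calls `request_stop(name)` with the name of the worker to stop.

```python
import threading

class StopRegistry:
    """Hypothetical helper that maps worker names to their stop events."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def register(self, name):
        # Create (or reuse) the stop event for this worker name.
        with self._lock:
            return self._events.setdefault(name, threading.Event())

    def request_stop(self, name):
        # Would be called by the RPC handler; returns False for unknown names.
        with self._lock:
            event = self._events.get(name)
        if event is None:
            return False
        event.set()
        return True

def worker(stop_event):
    while not stop_event.is_set():
        stop_event.wait(0.05)  # stand-in for one unit of real work

registry = StopRegistry()
t = threading.Thread(target=worker, args=(registry.register("worker-1"),))
t.start()

stopped = registry.request_stop("worker-1")
unknown = registry.request_stop("no-such-worker")
t.join(timeout=2)
```

Unlike a signal, the request names the exact worker to stop, and the worker still exits at a point of its own choosing.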

Python threads are not simulated; they are real kernel threads, created through the pthread library on POSIX systems. However, Python's upper layers provide no method for closing a thread, so we have to manage that ourselves. Using an Event or a custom flag is strongly recommended. If you really must kill a thread by force, you can call PyThreadState_SetAsyncExc through ctypes; this does not take down the rest of the running Python service.

The principle behind this function is fairly simple: it sets a flag in the Python virtual machine, and the virtual machine then raises an exception in the target thread to cancel it. The virtual machine also takes care of the try/catch for you. Remember not to kill a Python thread from outside the interpreter: you can obtain the thread ID through ctypes, but killing the thread directly kills the entire process.

The following code is an example of using ctypes to kill a thread. It is not recommended because it is too heavy-handed.

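import ctypes

def terminate_thread(thread):
    """Ask the interpreter to raise SystemExit inside the given thread."""
    if not thread.is_alive():
        return

    # Thread ids are unsigned long in the C API (Python 3.7+)
    tid = ctypes.c_ulong(thread.ident)
    exc = ctypes.py_object(SystemExit)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, exc)
    if res == 0:
        raise ValueError("nonexistent thread id")
    elif res > 1:
        # More than one thread state was modified: undo by passing NULL
        ctypes.pythonapi.PyThreadState_SetAsyncExc(tid, None)
        raise SystemError("PyThreadState_SetAsyncExc failed")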
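
To see what the call actually does, here is a small self-contained sketch that assumes CPython 3.7 or later; `async_raise`, `worker` and `cleanup_ran` are illustrative names. The exception is raised inside the target thread the next time it executes Python bytecode, so `finally` blocks in the thread still run, while a thread stuck inside a long blocking C call would not notice until that call returns.

```python
import ctypes
import threading
import time

cleanup_ran = threading.Event()

def worker():
    try:
        while True:
            time.sleep(0.01)  # short sleeps so the thread re-enters bytecode often
    finally:
        # Runs because the thread is ended by an ordinary Python exception.
        cleanup_ran.set()

def async_raise(thread, exc_type=SystemExit):
    # Returns the number of thread states modified (1 on success).
    return ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type))

t = threading.Thread(target=worker, daemon=True)
t.start()
time.sleep(0.05)

modified = async_raise(t)
t.join(timeout=2)
```

Even here, any lock the worker took without a `with` block or a `finally` would stay held, which is why the flag-based approach remains the better choice.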

Let's take a brief look at the PyThreadState_SetAsyncExc source code. In short, it sets a pending exception on the target thread. Those who are interested can read pystate.c in the CPython source, alongside some of the talks shared on YouTube.

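int
PyThreadState_SetAsyncExc(long id, PyObject *exc) {
  PyInterpreterState *interp = GET_INTERP_STATE();
  ...
  HEAD_LOCK();
  for (p = interp->tstate_head; p != NULL; p = p->next) {
    if (p->thread_id == id) {
      /* Found the thread's id in the linked list. To avoid deadlock,
         head_mutex must be released before the decref below. */
      PyObject *old_exc = p->async_exc;
      Py_XINCREF(exc);      /* increment the reference count of exc */
      p->async_exc = exc;   /* install exc as the pending exception */
      HEAD_UNLOCK();
      Py_XDECREF(old_exc);  /* the old exception is replaced, so drop its reference */
      ...
      return 1;             /* success: the thread was found and marked */
    }
  }
  HEAD_UNLOCK();
  return 0;
}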

Native POSIX pthreads can use pthread_cancel(tid) to end a child thread from the main thread, but Python's threading library does not support this. The reason is that a thread should not be ended by force: doing so introduces many hidden dangers, and the thread should be allowed to finish by itself. The recommended approach in Python is therefore to have the child thread check a flag in its loop; the main thread changes the flag, and the child thread ends itself once it sees the change.

Similar to this logic:

import threading
import time

def thread1(arg1, stop_event):
    while not stop_event.is_set():
        # Similar to time.sleep(1), but returns as soon as the event is set
        stop_event.wait(1)

def thread2(arg1, stop_event):
    while not stop_event.is_set():
        stop_event.wait(1)

def consumer_threading(duration):
    t1_stop = threading.Event()
    t1 = threading.Thread(target=thread1, args=(1, t1_stop))

    t2_stop = threading.Event()
    t2 = threading.Thread(target=thread2, args=(2, t2_stop))

    t1.start()
    t2.start()

    time.sleep(duration)
    # Stop thread2 first, then thread1, and wait for both to exit
    t2_stop.set()
    t1_stop.set()
    t1.join()
    t2.join()

To summarize briefly: although PyThreadState_SetAsyncExc can be reached through ctypes to control threads, interrupting a thread this abruptly is unreasonable. Let the thread exit on its own! And what if your thread is blocked on I/O and cannot check the event? Then your program needs to be improved: at the very least, the network I/O layer should use timeouts so that it never blocks indefinitely.
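
As one way to combine a stop flag with blocking network I/O, the sketch below gives the socket a short timeout so that the thread wakes up regularly and checks the event. It is an illustration under assumptions: `reader` and the 0.1 second timeout are made-up choices, and `socket.socketpair()` stands in for a real network connection.

```python
import socket
import threading
import time

def reader(sock, stop_event, received):
    sock.settimeout(0.1)  # never block in recv() for longer than this
    while not stop_event.is_set():
        try:
            data = sock.recv(1024)
        except socket.timeout:
            continue  # nothing arrived; loop around and re-check the flag
        if not data:
            break     # peer closed the connection
        received.append(data)

left, right = socket.socketpair()
stop = threading.Event()
received = []
t = threading.Thread(target=reader, args=(right, stop, received))
t.start()

left.sendall(b"ping")
# Wait (up to 2 seconds) until the reader has picked up the data.
deadline = time.monotonic() + 2
while not received and time.monotonic() < deadline:
    time.sleep(0.01)

stop.set()          # ask the reader to finish
t.join(timeout=2)   # it notices within roughly one timeout period
left.close()
right.close()
```

The timeout bounds how long a stop request can go unnoticed, so a shorter value gives a faster shutdown at the cost of more wake-ups.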
