Pitfalls encountered in python multi-threaded crawlers

鸟救山, 2020-05-20 11:03:05

A multi-threaded crawler in Python can be written in two ways: function style and class style. 1. Function style: call start_new_thread(func, args) from the thread module (named _thread in Python 3). The code example is as follows:

Figure 1: Functional multi-threading
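The original screenshot is not available, so the following is only a minimal sketch of the function style, assuming Python 3 (where the module is named _thread). The function name print_time is taken from later in the article; the thread names, delays, and the results list are illustrative assumptions.

```python
import _thread
import time

results = []                            # shared record of finished ticks (illustrative)
results_lock = _thread.allocate_lock()  # protects the shared list


def print_time(thread_name, delay, count):
    """Sleep `delay` seconds `count` times and record each tick."""
    for i in range(count):
        time.sleep(delay)
        with results_lock:
            results.append((thread_name, i))


# start_new_thread takes the function and a tuple of positional arguments.
_thread.start_new_thread(print_time, ("Thread-1", 0.01, 3))
_thread.start_new_thread(print_time, ("Thread-2", 0.02, 3))

# _thread offers no join(), so the main thread polls until both workers
# have recorded all of their ticks (or a 5 second deadline passes).
deadline = time.time() + 5
while time.time() < deadline:
    with results_lock:
        if len(results) == 6:
            break
    time.sleep(0.01)
```

Because this low-level module has no join(), the main thread has to wait by other means, which is one reason the class style below is usually easier to work with.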

2. Class style: subclass the Thread class of the threading module. The code example is as follows:

Figure 2: Multi-threading in class object mode

Code structure and process in class object mode:

Import the threading module.

Define a subclass myThread that inherits from threading.Thread.

Override the run() method of the parent class Thread and put the task code in it.

Instantiate the thread object.

Call start() to begin executing the thread.

Call join() to block the calling thread until the worker thread has finished.
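The steps above can be sketched roughly as follows. The class name myThread comes from the article; the constructor parameters (delay, count) and the ticks attribute are assumptions made for illustration and are not taken from the original code.

```python
import threading
import time


class myThread(threading.Thread):
    """Subclass of threading.Thread whose work lives in run()."""

    def __init__(self, name, delay, count):
        super().__init__(name=name)  # always initialise the parent class
        self.delay = delay
        self.count = count
        self.ticks = []              # per-thread state, not shared

    def run(self):
        # start() arranges for run() to execute in the new thread.
        for i in range(self.count):
            time.sleep(self.delay)
            self.ticks.append(i)


# Instantiate, start, then join each thread.
threads = [myThread("Thread-1", 0.01, 3), myThread("Thread-2", 0.02, 3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # blocks until that thread's run() has returned
```

Note that the code calls start(), not run(): calling run() directly would execute the work in the current thread instead of a new one.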

Problems encountered:

1. When defining the subclass, an error occurred in how the class was defined and in how the method print_time() was referenced inside the class. The code and the error message are shown in Figure 2 and Figure 3.

Figure 2: Error code

Figure 3: Error message
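The screenshots of the failing code and its error message are not available, so the exact mistake cannot be confirmed. One common error of this kind, shown here purely as an assumed illustration, is defining print_time() as a method of the class and then calling it as a bare name inside run(), which raises a NameError. The class names BrokenThread and FixedThread are hypothetical.

```python
import threading


class BrokenThread(threading.Thread):
    def print_time(self, label):
        return f"{label} done"

    def run(self):
        # Bug (assumed for illustration): a method is not visible as a bare
        # name inside another method, so this lookup fails with NameError.
        self.result = print_time(self.name)


class FixedThread(threading.Thread):
    def print_time(self, label):
        return f"{label} done"

    def run(self):
        # Fix: reference the method through the instance.
        self.result = self.print_time(self.name)


# Call run() directly here only so the exception can be caught and inspected;
# inside a started thread it would be reported on stderr instead.
try:
    BrokenThread().run()
    error = None
except NameError as exc:
    error = str(exc)

fixed = FixedThread(name="worker")
fixed.start()
fixed.join()
```

If the original error was of a different kind (for example, a missing self parameter or wrong indentation of the method), the same check applies: methods must be defined inside the class body with self as the first parameter and called as self.print_time(...).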

2. Problems encountered in a real crawling task: fetching the page content at http://www.78b2b.com/lianghuizhuanti/324826_1.html. The code is shown in Figure 4:

Figure 4: Specific application code

The code is intended to use multiple threads to crawl the 2020 Liaoning Government Work Report, which spans 13 web pages, and save it to a local TXT file. When it runs, every page is fetched, but the data saved in the TXT file is incomplete and some content is written more than once.
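Since the code in Figure 4 is not available, the actual cause cannot be confirmed. Symptoms like these often appear when threads share a variable holding the page text or write to the same file without coordination. The following is a minimal sketch of one way to avoid that, under those assumptions: each thread keeps its page in a local variable, stores it under its page number, and the file is written once, in order, after all threads have joined. The names crawl_pages and fake_fetch are hypothetical, and fake_fetch stands in for a real HTTP request.

```python
import os
import tempfile
import threading


def crawl_pages(fetch, page_count, out_path):
    """Fetch pages 1..page_count in parallel, then write them once, in order."""
    results = {}
    lock = threading.Lock()

    def worker(page_no):
        text = fetch(page_no)        # local variable, never shared
        with lock:                   # guard the shared dictionary
            results[page_no] = text

    threads = [threading.Thread(target=worker, args=(n,))
               for n in range(1, page_count + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                     # wait for every page before writing

    # A single writer, after all threads finish, avoids interleaved or
    # repeated writes and keeps the pages in their original order.
    with open(out_path, "w", encoding="utf-8") as f:
        for n in range(1, page_count + 1):
            f.write(results[n])
    return results


def fake_fetch(page_no):
    """Hypothetical stand-in for downloading and parsing one page."""
    return f"page {page_no}\n"


# Demonstration with 13 pages, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_file = os.path.join(tmp_dir, "report.txt")
    crawl_pages(fake_fetch, 13, out_file)
    with open(out_file, encoding="utf-8") as f:
        saved_lines = f.read().splitlines()
```

In a real crawler, fake_fetch would be replaced by a function that downloads one page and extracts its text; the threading and writing logic would stay the same.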
