Crawling URLs in a loop with the Python Scrapy framework: after running for a while, it hangs without any error.

Question

Every time it has been running for about half an hour, it freezes. There is no error in the log, and CPU usage is very high while it is frozen.

I set a download timeout in settings.py, so a request timeout does not seem to be the cause.
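For reference, a download timeout in a Scrapy project's settings.py usually looks like the sketch below. DOWNLOAD_TIMEOUT and LOG_LEVEL are standard Scrapy settings, but the values here are assumed examples, not the ones used in the question.

```python
# settings.py (sketch; the values are assumed examples)

# Give up on a single request after this many seconds.
DOWNLOAD_TIMEOUT = 30

# Verbose logging makes it easier to see the last request handled before a hang.
LOG_LEVEL = "DEBUG"
```

Note that this setting only bounds individual downloads. It would not interrupt code that is spinning on the CPU, which fits the symptoms described here.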

Ctrl-C cannot stop it normally. After stopping it with Ctrl-Z and running it again, the same problem occurs: it freezes again after about half an hour.

高洛峰 · Answer

First run top to see whether memory or CPU usage is too high, and find out which processes are responsible.
If they are all your crawler processes, check your code for resources that are never released.

In short, investigate from every angle.
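One way to act on the "check the code" step is to make the frozen process report where it is stuck. The sketch below uses the standard library's faulthandler module to write every thread's Python stack to a file when the process receives SIGUSR1 (Unix only). The helper name enable_stack_dump and the file name crawler_stacks.log are hypothetical, chosen only for illustration.

```python
import faulthandler
import signal


def enable_stack_dump(path="crawler_stacks.log", signum=signal.SIGUSR1):
    """Write the Python stack of all threads to `path` whenever `signum` arrives.

    Hypothetical helper for illustration; call it once at startup,
    for example at the top of the spider module.
    """
    # The file must stay open for as long as the handler is registered.
    log_file = open(path, "a")
    faulthandler.register(signum, file=log_file, all_threads=True)
    return log_file
```

When the crawler freezes, running kill -USR1 <pid> from another terminal should append the current tracebacks to the file. Because the dump is written by a low-level signal handler, it may still work when Ctrl-C does not. If the traceback looks the same on several consecutive dumps, that code path is a likely place for the busy loop.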

PHP中文网 · Answer

Use strace (for example, strace -p <pid>) to see which system calls the stuck process is making.
