How to use phpspider crawler

小云云 (Original)
2018-03-20 10:38:29

This article shows how to use the phpspider crawler framework. Python crawlers are very convenient, but I found that PHP holds up well in this respect too; building a crawler on top of a framework is much more efficient than writing one from scratch.

1. First, look at the structure of phpspider:


2. As an example, I crawled one category of Nanchang News Network:


This comment must be included in the script, otherwise phpspider reports an error. The source code is worth reading, since it exposes many more methods.

3. Then configure the crawler:
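
A minimal sketch of what such a configuration might look like, assuming the config keys documented for phpspider 3.x (`name`, `domains`, `scan_urls`, `list_url_regexes`, `content_url_regexes`, `fields`). The domain, URLs, and XPath selectors are placeholders, not the ones used for Nanchang News Network, and `url_matches` is a hypothetical helper added only to try the patterns out; it is not part of phpspider.

```php
<?php
// Hypothetical crawler configuration; every URL and XPath below is a placeholder.
$configs = array(
    'name' => 'news-demo',
    // Only links on these domains are followed.
    'domains' => array('news.example.com'),
    // Entry page(s) where the crawl starts.
    'scan_urls' => array('https://news.example.com/category/local/'),
    // Pagination pages of the category.
    'list_url_regexes' => array(
        'https://news\.example\.com/category/local/index_\d+\.html',
    ),
    // Article pages whose fields are extracted.
    'content_url_regexes' => array(
        'https://news\.example\.com/article/\d+\.html',
    ),
    // Fields to extract from each content page (XPath selectors by default).
    'fields' => array(
        array('name' => 'title', 'selector' => '//h1', 'required' => true),
        array('name' => 'content', 'selector' => "//div[@class='article-body']", 'required' => true),
    ),
);

// Illustrative helper (not part of phpspider): test a URL against a pattern list.
function url_matches(array $regexes, $url)
{
    foreach ($regexes as $regex) {
        if (preg_match('#' . $regex . '#i', $url)) {
            return true;
        }
    }
    return false;
}

var_dump(url_matches($configs['content_url_regexes'], 'https://news.example.com/article/12345.html'));
```

Checking a few real URLs against the patterns before starting a crawl is a cheap way to catch a regex that silently matches nothing.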



4. Then pass the configuration to the framework class and instantiate it:

The on_scan_page callback here handles the entry page of the crawl. These URLs match the content_url_regexes patterns I configured, so the data on those pages is extracted as the crawl proceeds.
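
The instantiation and the entry-page callback could be sketched roughly as follows. This assumes the namespaced layout of phpspider 3.x (an `autoloader.php` one directory above the script and the class `phpspider\core\phpspider`); older releases are loaded differently. The configuration values and the article URL pattern are placeholders, and the `is_file` guard is only there so the sketch does nothing when the library is absent.

```php
<?php
/* Do NOT delete this comment */
/* 不要删除这段注释 */

// Hypothetical configuration; domain, URLs, and selector are placeholders.
$configs = array(
    'name' => 'news-demo',
    'domains' => array('news.example.com'),
    'scan_urls' => array('https://news.example.com/category/local/'),
    'content_url_regexes' => array('https://news\.example\.com/article/\d+\.html'),
    'fields' => array(
        array('name' => 'title', 'selector' => '//h1', 'required' => true),
    ),
);

// Called with the downloaded entry page; queues every article link found in it.
$on_scan_page = function ($page, $content, $phpspider) {
    preg_match_all('#https://news\.example\.com/article/\d+\.html#', $content, $matches);
    foreach (array_unique($matches[0]) as $url) {
        // The queued URLs match content_url_regexes, so they are treated as content pages.
        $phpspider->add_url($url);
    }
    // Returning true keeps phpspider's own link discovery on this page enabled.
    return true;
};

// Assumed location of the phpspider 3.x autoloader relative to this script.
$autoloader = __DIR__ . '/../autoloader.php';
if (is_file($autoloader)) {
    require_once $autoloader;
    $spider = new \phpspider\core\phpspider($configs);
    $spider->on_scan_page = $on_scan_page;
    $spider->start();
}
```

Keeping the callback in its own variable makes it possible to exercise it with a stub object that implements `add_url` before running a real crawl.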


5. Process the matched fields in a callback:
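
As a rough illustration, a field callback might look like the sketch below. It assumes the `on_extract_field` callback of phpspider 3.x, which receives the field name, the extracted value, and the page, and whose return value replaces the field. The field names `title` and `content` are hypothetical, and the sketch assumes each field holds a single string rather than a repeated (array) value.

```php
<?php
// Called once per extracted field; the returned value replaces the field value.
$on_extract_field = function ($fieldname, $data, $page) {
    if ($fieldname === 'title') {
        // Collapse runs of whitespace and trim the ends.
        return trim(preg_replace('/\s+/u', ' ', $data));
    }
    if ($fieldname === 'content') {
        // Keep only the text of the article body.
        return trim(strip_tags($data));
    }
    // Leave any other field untouched.
    return $data;
};

// With phpspider loaded, attach it before start():
// $spider->on_extract_field = $on_extract_field;
```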


6. Write the crawled data to the database, then run it:
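
A sketch of the export settings, assuming the `export` and `db_config` keys documented for phpspider 3.x. The table name, credentials, and database name are placeholders, and the table's columns are assumed to match the configured field names.

```php
<?php
// Hypothetical export settings; merge them into the crawler configuration.
$export_configs = array(
    'export' => array(
        'type' => 'db',      // write each extracted page as a row
        'table' => 'news',   // placeholder table name
    ),
    'db_config' => array(
        'host' => '127.0.0.1',
        'port' => 3306,
        'user' => 'crawler', // placeholder credentials
        'pass' => 'secret',
        'name' => 'demo',    // placeholder database name
    ),
);

// $configs = array_merge($configs, $export_configs);
```

phpspider is meant to be started from the command line, for example `php news.php`, where `news.php` is a hypothetical name for the crawler script.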


The above is just a simple example; phpspider also supports multi-process crawling, crawling through proxies, and much more.
