
Introduction to Scrapy's Common Command Line Tools

零下一度
2017-06-28 15:55:04

View all commands

scrapy -h

View help information

scrapy --help

View version information

(venv)ql@ql:~$ scrapy version
Scrapy 1.1.2
(venv)ql@ql:~$ scrapy version -v
Scrapy    : 1.1.2
lxml      : 3.6.4.0
libxml2   : 2.9.4
Twisted   : 16.4.0
Python    : 2.7.12 (default, Jul  1 2016, 15:12:24) - [GCC 5.4.0 20160609]
pyOpenSSL : 16.1.0 (OpenSSL 1.0.2g-fips  1 Mar 2016)
Platform  : Linux-4.4.0-36-generic-x86_64-with-Ubuntu-16.04-xenial
(venv)ql@ql:~$

Create a new project

scrapy startproject project_name

Create a spider with genspider (generate spider)

A project can contain multiple spiders, but each spider's name must be unique within the project.

scrapy genspider name domain
# For example:
# scrapy genspider sohu sohu.org

List the spiders in the current project

scrapy list

view: open a web page in the browser, as Scrapy sees it

scrapy view http://www.baidu.com

shell: enter Scrapy's interactive environment

# Enter the interactive environment for a URL
scrapy shell http://www.dmoz.org/Computers/Programming/Languages/Python/Books/

Once inside the interactive environment, the object we use most is response. For example:

response.xpath()  # put the XPath expression inside the parentheses

The runspider command runs a spider that is self-contained in a single Python file, without needing a project.

scrapy runspider spider_file.py

