
How to crawl web images by keyword

Y2J (original)
2017-05-09 14:21:01, 5153 views

This article introduces a Python crawler that downloads Baidu images by keyword. It should serve as a useful reference.

Tools used: Python 2.7, the Scrapy framework, and Sublime Text 3.

One. Set up Python (Windows version)

1. Install Python 2.7, then enter python in cmd. If the Python interactive prompt (>>>) appears, the installation was successful.

2. Install the Scrapy framework. On the command line, enter: pip install Scrapy

The successful installation interface is as follows:

There are many failure situations, here is an example:

Solution:

Remaining errors can be searched on Baidu.
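As a quick check that the install worked, a short script can ask Python whether the scrapy package is importable. This is a minimal sketch and not part of the original article: the helper name is_installed is hypothetical, and it assumes Python 3.4 or later (importlib.util.find_spec is not available in Python 2.7).

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None


# "re" ships with Python, so it doubles as a sanity check for the helper itself.
for name in ("re", "scrapy"):
    print(name, "installed" if is_installed(name) else "missing")
```

If scrapy is reported as missing, pip most likely installed it into a different Python installation than the one running the script.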

Two. Start programming

1. Crawl static websites without anti-crawler measures, such as Baidu Tieba and Douban Reading.

For example, a post in the "Desktop Bar" forum: tieba.baidu.com/p/2460150866?red_tag=3569129009

The Python code is as follows:

Code comments: Two modules, urllib and re, are imported, and two functions are defined. The first function fetches the full HTML of the target page. The second function finds the target images in that page, iterates over them, and saves them numbered from 0.
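The two functions described here might look roughly like the sketch below. It is not the author's code: it is written for Python 3 (in Python 2.7 the corresponding calls are urllib.urlopen and urllib.urlretrieve), and the function names, the regular expression, and the https:// scheme in the example call are all assumptions.

```python
import re
import urllib.request


def get_html(url):
    """Fetch the whole target page and return it as text."""
    with urllib.request.urlopen(url) as page:
        return page.read().decode("utf-8", errors="ignore")


def get_img(html, retrieve=urllib.request.urlretrieve):
    """Find .jpg URLs in the page and save them as 0.jpg, 1.jpg, ..."""
    # Assumed pattern: each image tag carries src="<something>.jpg".
    img_urls = re.findall(r'src="(.+?\.jpg)"', html)
    for index, img_url in enumerate(img_urls):
        # Relative file name, so files land in the current working directory.
        retrieve(img_url, "%d.jpg" % index)
    return img_urls


# Example call (needs network access, so it is left commented out):
# get_img(get_html("https://tieba.baidu.com/p/2460150866"))
```

The retrieve parameter only exists so the download step can be swapped out when trying the function on a saved HTML string.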

Note: Key points of the re module:
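The re features that a crawler like this typically relies on are re.compile, findall with a capture group, and non-greedy matching. The snippet below is an illustrative sketch on a made-up HTML string; the example.com URLs and the pattern are assumptions, not taken from the article.

```python
import re

# Made-up HTML with two image tags (illustration only).
html = ('<img src="http://example.com/a.jpg" pic_ext="jpeg">'
        '<img src="http://example.com/b.jpg" pic_ext="jpeg">')

# re.compile builds a reusable pattern object.
# The parentheses form a capture group; findall returns only the group text.
non_greedy = re.compile(r'src="(.+?\.jpg)"')
print(non_greedy.findall(html))
# ['http://example.com/a.jpg', 'http://example.com/b.jpg']

# Without the "?", ".+" is greedy and runs on to the LAST '.jpg"' in the
# string, so both tags collapse into a single, useless match.
greedy = re.compile(r'src="(.+\.jpg)"')
print(len(greedy.findall(html)))
# 1
```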

Result of crawling the images:

By default the images are saved to the directory the script is run from, which is normally the same directory as the .py file.

2. Crawl websites with anti-crawler measures, such as Baidu Images.

For example, searching for the keyword "emoticon package" (表情包): https://image.baidu.com/search/index?tn=baiduimage&ct=201326592&lm=-1&cl=2&ie=gbk&word=%B1%ED%C7%E9%B0%FC&fr=ala&ori_query=%E8%A1%A8%E6%83%85%E5%8C%85&ala=0&alatpl=sp&pos=0&hs=2&xthttps=111111
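In this URL the keyword appears twice: word is the keyword percent-encoded as GBK (matching ie=gbk), and ori_query is the same keyword percent-encoded as UTF-8. The sketch below reproduces both encodings in Python 3. The helper build_search_url is hypothetical, and it keeps only three of the query parameters, which is an assumption; Baidu may require others.

```python
from urllib.parse import quote, urlencode

keyword = "表情包"  # "emoticon package"

# The same keyword, percent-encoded in the two encodings seen in the URL.
gbk_word = quote(keyword, encoding="gbk")
utf8_word = quote(keyword, encoding="utf-8")
print(gbk_word)   # %B1%ED%C7%E9%B0%FC
print(utf8_word)  # %E8%A1%A8%E6%83%85%E5%8C%85


def build_search_url(word):
    """Build a search URL for a keyword (hypothetical helper, reduced parameter set)."""
    params = {"tn": "baiduimage", "ie": "gbk", "word": word}
    # encoding="gbk" makes urlencode percent-encode the keyword as GBK bytes.
    return "https://image.baidu.com/search/index?" + urlencode(params, encoding="gbk")


print(build_search_url(keyword))
```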

The images load as the page scrolls, so start by crawling the first 30 images.

The code is as follows:

Code comments: Four modules are imported; the os module is used to specify the save path. The first two functions are the same as above. The third function uses an if statement and try/except exception handling.
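A third function of the kind described, with os for the save path, an if check, and try/except around each download, could be sketched as follows. This is not the author's code: it is Python 3, the names fetch_bytes and save_images are hypothetical, and sending a browser-like User-Agent header is an assumed way of getting past basic anti-crawler checks.

```python
import os
import urllib.request


def fetch_bytes(url):
    """Download one URL with a browser-like User-Agent (an assumption)."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def save_images(img_urls, out_dir, fetch=fetch_bytes, limit=30):
    """Save up to `limit` images into out_dir, skipping any that fail."""
    # if statement: create the save directory only when it is missing.
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    saved = []
    for index, img_url in enumerate(img_urls[:limit]):
        try:
            data = fetch(img_url)
        except Exception as error:  # broad on purpose: one bad URL should not stop the crawl
            print("skipped %s: %s" % (img_url, error))
            continue
        path = os.path.join(out_dir, "%d.jpg" % index)
        with open(path, "wb") as image_file:
            image_file.write(data)
        saved.append(path)
    return saved
```

The default limit of 30 mirrors the first 30 images mentioned above. A failed download is skipped but still uses up its index, so gaps in the file numbers show which URLs failed.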

The crawling process is as follows:

Crawling results:

Note: When writing Python code, pay attention to indentation, and do not mix tabs and spaces, because doing so easily causes errors.
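The effect of mixing tabs and spaces can be shown without touching a file by compiling a small source string. This is a sketch that assumes Python 3, which rejects inconsistent mixing with a TabError (Python 2.7 is more lenient by default); the helper name check_indentation is hypothetical.

```python
def check_indentation(source):
    """Return the name of the indentation error raised by compiling source, or None."""
    try:
        compile(source, "<demo>", "exec")
    except IndentationError as error:  # TabError is a subclass of IndentationError
        return type(error).__name__
    return None


# Line 2 is indented with a tab, line 3 with eight spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"
# Both lines are indented with four spaces.
consistent = "if True:\n    x = 1\n    y = 2\n"

print(check_indentation(mixed))       # TabError
print(check_indentation(consistent))  # None
```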
