
A step-by-step guide to using a Python web crawler to obtain fund information

Go语言进阶学习 · Reposted: 2023-07-24 14:53:20 · 1053 views

1. Preface

A few days ago, a fan came to me asking how to get fund information. I am sharing the approach here; interested readers are encouraged to try it themselves.


2. Data Acquisition

Our target website here is the official website of a certain fund. The data to be crawled is shown in the figure below.

You can see that each entry in the fund code column of the figure above has a different number. Click on any one to enter that fund's detail page. The links are also very regular, with the fund code as the identifier.
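Because the detail links differ only in the fund code, they can be generated rather than scraped. The following is a minimal sketch, not the author's code: the article does not name the site, so `DETAIL_URL_TEMPLATE` and its domain are placeholders, and the helper names are hypothetical.

```python
# Hypothetical URL template: the real site and path layout are not given
# in the article, so replace this placeholder with the actual pattern.
DETAIL_URL_TEMPLATE = "https://fund.example.com/{code}.html"


def build_detail_url(code: str) -> str:
    """Return the detail-page URL for one fund code."""
    # Fund codes stay strings so that leading zeros are preserved.
    return DETAIL_URL_TEMPLATE.format(code=code.strip())


def build_detail_urls(codes):
    """Return one detail-page URL per fund code, in the same order."""
    return [build_detail_url(code) for code in codes]
```

Keeping the codes as strings matters here: converting them to integers would drop leading zeros and produce the wrong link.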

In fact, this website is not difficult to crawl: the data is not encrypted, and the information shown on the page can be seen directly in the page source.


This reduces the difficulty of crawling. By capturing the requests in the browser's developer tools, you can see the specific request parameters. Only pi changes between requests, and its value corresponds to the page number, so you can construct the request parameters directly.
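Constructing the per-page parameters can be sketched as follows. Only the `pi` key comes from the article; the helper names are hypothetical, and any other parameters should be copied from the captured request and passed in as `base_params`.

```python
def build_page_params(page, base_params=None):
    """Return request parameters for one list page.

    base_params holds the parameters that stay constant between pages
    (copied from the captured request); only "pi" is set per page.
    """
    params = dict(base_params or {})  # copy so the caller's dict is untouched
    params["pi"] = page
    return params


def iter_page_params(first_page, last_page, base_params=None):
    """Yield one parameter dict per page, from first_page to last_page inclusive."""
    for page in range(first_page, last_page + 1):
        yield build_page_params(page, base_params)
```

Each yielded dict can then be passed as `params=` to `requests.get` for the list page.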

Code implementation process

After finding the data source, the next step is to implement the code. Some of the key code is shown below.

Get the stock id data

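import re
import requests

# url, headers and params are assumed to be defined for the list-page request
response = requests.get(url, headers=headers, params=params, verify=False)
# Each double-quoted string in the response is one fund record
pattern = re.compile(r'.*?"(?P<items>.*?)".*?', re.S)
result = re.finditer(pattern, response.text)
ids = []
for item in result:
    # print(item.group('items'))
    gp_id = item.group('items').split(',')[0]  # the first field is the fund code
    ids.append(gp_id)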

The result is as shown below:
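To try the extraction step offline, the same idea can be run against a small made-up string. This is only a sketch: `SAMPLE_TEXT` is invented for illustration (the real response format is not shown in the article), and the pattern is a simplified variant that assumes every double-quoted string is one comma-separated fund record.

```python
import re

# Invented sample, NOT real site data: two quoted, comma-separated records.
SAMPLE_TEXT = 'var db = {datas: ["000001,Fund A,1.23","000002,Fund B,2.34"]}'

# Simplified variant of the article's pattern: capture the text between quotes.
RECORD_PATTERN = re.compile(r'"(?P<items>[^"]*)"')


def extract_fund_codes(text):
    """Return the first comma-separated field of every quoted record."""
    codes = []
    for match in RECORD_PATTERN.finditer(text):
        record = match.group("items")
        codes.append(record.split(",")[0])
    return codes


print(extract_fund_codes(SAMPLE_TEXT))  # ['000001', '000002']
```

If the real response also contains quoted strings that are not fund records, they would need to be filtered out, for example by checking that the first field consists only of digits.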

Next, construct the detail-page link for each fund code and fetch the fund information from the detail page. The key code is as follows:

import requests
from lxml import etree

# url is the detail-page link built from a fund code
response = requests.get(url, headers=headers)
response.encoding = response.apparent_encoding  # use the detected page encoding
selectors = etree.HTML(response.text)
# Unit net value block: first and second <span>
danweijingzhi1 = selectors.xpath('//dl[@class="dataItem02"]/dd[1]/span[1]/text()')[0]
danweijingzhi2 = selectors.xpath('//dl[@class="dataItem02"]/dd[1]/span[2]/text()')[0]
# Cumulative net value
leijijingzhi = selectors.xpath('//dl[@class="dataItem03"]/dd[1]/span/text()')[0]
# All text nodes of the fund information table
lst = selectors.xpath('//div[@class="infoOfFund"]/table//text()')

The result is as shown in the figure below:
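One likely pitfall in the XPath step is that indexing with `[0]` raises `IndexError` when a page lacks the expected element, and `//text()` also returns whitespace-only nodes. The helpers below are a hedged sketch for handling both; `first_text` and `clean_texts` are hypothetical names, not part of lxml or the original code.

```python
def first_text(nodes, default=""):
    """Return the first XPath text result, stripped, or default if there is none."""
    return nodes[0].strip() if nodes else default


def clean_texts(nodes):
    """Strip whitespace from each text node and drop the empty ones."""
    return [text.strip() for text in nodes if text.strip()]
```

For example, `first_text(selectors.xpath(...))` can replace `selectors.xpath(...)[0]`, and `clean_texts(lst)` tidies the table text before further processing.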

Process the specific information into the corresponding strings and save them to a CSV file. The results are shown below:
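Saving the rows can be sketched with the standard `csv` module. The column names in `FIELDNAMES` and the function name are assumptions made for illustration; the article does not list the exact columns it writes.

```python
import csv

# Hypothetical column names: adjust them to the fields actually collected.
FIELDNAMES = ["fund_code", "unit_net_value", "daily_change", "cumulative_net_value"]


def write_funds_csv(rows, fileobj):
    """Write a header line and one CSV line per fund dict to an open text file."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
```

A typical call opens the file with `open("funds.csv", "w", newline="", encoding="utf-8-sig")` and passes the handle in; `newline=""` avoids blank lines on Windows, and `utf-8-sig` helps spreadsheet programs display Chinese text correctly.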

With this data, you can do further statistics and data analysis.

3. Summary

Hello everyone, I am a Python advanced person. This article mainly shares how to use a Python web crawler to obtain fund data. The project is not too difficult, but there are a few pitfalls. Everyone is welcome to try it. If you encounter any problems, please add me as a friend and I will help solve them.

This article covers only the [stock type] category; I have not done the other types. You are welcome to try them. The logic is the same; only the parameters need to change.


Statement: This article is reproduced from Go语言进阶学习. If there is any infringement, please contact admin@php.cn to have it removed.
