Practical crawler practice: using PHP to crawl stock information
The stock market has always drawn close attention, because the daily rise and fall of stock prices directly affects investors' decisions. To follow the latest developments in the market, you need to obtain and analyze stock information promptly. The traditional approach, manually opening the major financial websites and checking stock data one by one, is cumbersome and inefficient. A crawler offers an efficient, automated alternative.
Next, we will demonstrate how to use PHP to write a simple stock crawler program to obtain stock data.
Before writing the crawler program, you need the following tools ready: an HTTP request library, an HTML DOM parser, and XPath. The HTTP request library sends HTTP requests and fetches the HTML source code of the target website; the HTML DOM parser parses and traverses HTML pages; XPath is a query language for selecting nodes in XML and HTML documents.
Before we start writing the crawler program, we need to know the URL of the target website and the stock code that needs to be obtained. Taking Sina Finance as an example, the URL of its stock data is as follows:
http://finance.sina.com.cn/realstock/company/sh600000/nc.shtml
Here, sh600000 is the stock code: the sh prefix marks a stock listed on the Shanghai Stock Exchange, and stock codes for the Shenzhen Stock Exchange start with sz. We can build the URL from the stock code we want and use the HTTP request library to fetch the HTML source code.
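The URL construction described above can be sketched as a small function. This is a minimal sketch, not part of the original program: the function name buildStockUrl is hypothetical, and the check that a code is sh or sz followed by exactly six digits is an assumption about the code format.

```php
<?php
// Hypothetical helper: builds the Sina Finance page URL for a stock code.
function buildStockUrl(string $code): string
{
    // Assumption: a valid code is the prefix "sh" or "sz" followed by six digits.
    if (!preg_match('/^(sh|sz)\d{6}\z/', $code)) {
        throw new InvalidArgumentException("Unexpected stock code: {$code}");
    }
    return 'http://finance.sina.com.cn/realstock/company/' . $code . '/nc.shtml';
}

echo buildStockUrl('sh600000'), PHP_EOL;
// prints http://finance.sina.com.cn/realstock/company/sh600000/nc.shtml
```

Rejecting unexpected input before building the URL keeps a typo in the stock code from turning into a request for a page that does not exist.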
After obtaining the HTML source code, we need to use the HTML DOM parser to parse the HTML page and use XPath syntax to filter out the required stock data. In this example, we need to filter out the name and current price of the stock.
Finally, we can print out the obtained stock data. The specific code is as follows:
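$code = 'sh600000'; // stock code
$url = 'http://finance.sina.com.cn/realstock/company/' . $code . '/nc.shtml'; // build the URL
$html = file_get_contents($url); // fetch the HTML source
if ($html === false) {
    exit('Failed to fetch ' . $url);
}
$dom = new DOMDocument();
@$dom->loadHTML($html); // parse the HTML; @ suppresses warnings about malformed markup
$xpath = new DOMXPath($dom);
$nameNode = $xpath->query('//h1[@class="name"]/text()')->item(0); // select the stock name
$priceNode = $xpath->query('//span[@class="price"]/text()')->item(0); // select the current price
if ($nameNode === null || $priceNode === null) {
    exit('Could not find the stock name or price in the page');
}
echo 'Current price of ' . $nameNode->nodeValue . ': ' . $priceNode->nodeValue;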
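To see how the XPath queries behave without touching the network, the parsing step can be tried on a small inline page. The sketch below is an illustration only: the markup is made up to mirror the class names used in the queries above, and the helper firstText is a hypothetical name. It returns null instead of failing when a query matches nothing, which is what happens if the real page layout changes.

```php
<?php
// Hypothetical helper: trimmed text of the first node matching $query, or null if none.
function firstText(DOMXPath $xpath, string $query): ?string
{
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->nodeValue);
}

// Made-up markup that mirrors the class names used in the queries above.
$html = '<html><body><h1 class="name">Example Bank</h1>'
      . '<span class="price"> 10.25 </span></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
var_dump(firstText($xpath, '//h1[@class="name"]'));     // string(12) "Example Bank"
var_dump(firstText($xpath, '//span[@class="price"]'));  // string(5) "10.25"
var_dump(firstText($xpath, '//div[@class="missing"]')); // NULL
```

Using libxml_use_internal_errors() is an alternative to the @ operator: it silences warnings from malformed HTML without hiding other errors on the same line.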
Before running the test, we need to make sure that the HTTP request library and the related extensions are installed in the local PHP environment. Taking a Windows system as an example, you can install them with the following Composer commands:
composer require php-http/guzzle6-adapter
composer require php-http/message
Next, we can try to obtain the stock data of the Shanghai Composite Index (stock code sh000001):
require 'vendor/autoload.php'; // Composer autoloader
$code = 'sh000001'; // Shanghai Composite Index
$url = 'http://finance.sina.com.cn/realstock/company/' . $code . '/nc.shtml';
$client = new \Http\Adapter\Guzzle6\Client();
$request = new \GuzzleHttp\Psr7\Request('GET', $url); // PSR-7 request class installed with Guzzle 6
$response = $client->sendRequest($request);
$html = $response->getBody()->getContents();
$dom = new DOMDocument();
@$dom->loadHTML($html); // parse the HTML
$xpath = new DOMXPath($dom);
$name = $xpath->query('//h1[@class="name"]/text()')->item(0)->nodeValue;
$price = $xpath->query('//span[@class="price"]/text()')->item(0)->nodeValue;
echo 'Current price of ' . $name . ': ' . $price;
After running the code, we can see the current price of the Shanghai Composite Index printed on the console.
The code above is only a simple example; a real application needs further optimization. In short, a stock crawler has to balance security, efficiency, and practicality, and should be designed and implemented with all three in mind.
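As one illustration of those considerations, the sketch below crawls several stock codes in turn, pauses between requests, and keeps going when a single request fails. It is an assumption-laden sketch rather than part of the original program: the function name crawlCodes, the $fetch callable, and the 500 ms default delay are all hypothetical choices. The demonstration uses a stub fetcher so that it runs without network access.

```php
<?php
// Hypothetical helper: fetches several stock pages in turn, pausing between requests.
// $fetch is any callable that takes a URL and returns the HTML, or throws on failure.
function crawlCodes(array $codes, callable $fetch, int $delayMs = 500): array
{
    $pages = [];
    foreach (array_values($codes) as $i => $code) {
        if ($i > 0) {
            usleep($delayMs * 1000); // pause so the target site is not flooded
        }
        $url = 'http://finance.sina.com.cn/realstock/company/' . $code . '/nc.shtml';
        try {
            $pages[$code] = $fetch($url);
        } catch (Throwable $e) {
            $pages[$code] = null; // record the failure and carry on with the next code
        }
    }
    return $pages;
}

// Stub fetcher for demonstration: succeeds for one code and fails for the other.
$stub = function (string $url): string {
    if (strpos($url, 'sz000001') !== false) {
        throw new RuntimeException('simulated failure');
    }
    return '<html>stub page</html>';
};

var_dump(crawlCodes(['sh600000', 'sz000001'], $stub, 0));
```

In real use, $fetch could be a closure that wraps file_get_contents() or the Guzzle client and throws when the request fails.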