
Crawl and analyze: PHP tutorial

WBOY
2016-07-13 17:25:06

Crawling and analyzing a file is very simple. This tutorial will take you step by step through an example to implement it. Let's get started!

First, we must decide which URL we will crawl. It can be set in the script or passed in through the query string ($QUERY_STRING in old PHP versions; $_SERVER['QUERY_STRING'] or $_GET in current ones). For simplicity's sake, let's set the variable directly in the script.


<?php
$url = "http://www.php.net";
?>
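If the URL should come from the query string instead, it is worth validating it before crawling. The following is a minimal sketch, not part of the original tutorial: the helper name choose_url() and the parameter name "url" are assumptions made for illustration.

```php
<?php
// Hypothetical helper: pick the URL to crawl from a parsed query string,
// falling back to a default when nothing usable was passed.
function choose_url(array $query, string $default): string
{
    $candidate = $query['url'] ?? '';

    // Accept only well-formed http/https URLs; otherwise use the default.
    if (is_string($candidate)
        && filter_var($candidate, FILTER_VALIDATE_URL) !== false
        && preg_match('#^https?://#i', $candidate) === 1) {
        return $candidate;
    }
    return $default;
}

// In a web request you would pass $_GET; here the query string is parsed by hand.
parse_str('url=http://www.php.net', $query);
$url = choose_url($query, 'http://www.php.net');
echo $url, "\n";
```

Restricting the scheme to http and https keeps a visitor from pointing the script at local files or other stream wrappers.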

In the second step, we fetch the specified file with the file() function, which stores it in an array, one line per element.


<?php
$url = "http://www.php.net";
$lines_array = file($url);
?>
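To see what file() actually returns, here is a small sketch that reads a temporary local file rather than a URL, so it runs without network access. The sample content is made up. Reading a URL the same way requires the allow_url_fopen setting to be enabled.

```php
<?php
// Write a small sample document to a temporary file (made-up content).
$path = tempnam(sys_get_temp_dir(), 'crawl');
file_put_contents($path, "<html>\n<head>\n<title>Sample</title>\n</head>\n</html>\n");

// file() returns one array element per line, trailing newline included,
// or false if the file could not be read.
$lines_array = file($path);
if ($lines_array === false) {
    throw new RuntimeException("Could not read $path");
}
echo count($lines_array), " lines read\n";

// FILE_IGNORE_NEW_LINES strips the trailing newline from each element.
$trimmed = file($path, FILE_IGNORE_NEW_LINES);
echo $trimmed[1], "\n";

unlink($path);
```

Checking for false matters more with URLs than with local files, because a remote fetch can fail for reasons outside the script's control.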

Now the file's lines are in the array. However, the text we want to analyze may not all be on one line. To parse the file as a whole, we can convert the array $lines_array into a single string with the implode(x, y) function, where x is the separator placed between the elements and y is the array to join. If you want to split the string back into an array with explode() later, it may be better to set x to "|", "!", or a similar delimiter. For our purposes, it is best to set x to a space.


<?php
$url = "http://www.php.net";
$lines_array = file($url);
$lines_string = implode(" ", $lines_array);
?>
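The difference between the two separator choices can be sketched on a small hand-written array. The sample lines are made up and stand in for what file() would return.

```php
<?php
// Sample lines as file() would return them (made-up content).
$lines_array = ["<head>\n", "<title>Sample</title>\n", "</head>\n"];

// Join with a space: convenient when matching a pattern across lines.
$lines_string = implode(" ", $lines_array);

// Join with a delimiter that does not occur in the data, so that
// explode() can split the string back into the original elements.
$joined   = implode("|", $lines_array);
$restored = explode("|", $joined);

var_dump($restored === $lines_array);
```

The round trip only works when the delimiter never appears in the data itself, which is why a space is a poor choice if you plan to call explode() later.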

Now that the crawling work is done, it's time to analyze. For the purposes of this example, we want to get everything between <head> and </head>. To parse that part out of the string, we need something called a regular expression.


<?php
$url = "http://www.php.net";
$lines_array = file($url);
$lines_string = implode(" ", $lines_array);
// eregi(): case-insensitive POSIX regex match (removed in PHP 7.0)
eregi("<head>(.*)</head>", $lines_string, $head);
?>

Let's take a look at the code. As you can see, the eregi() function is called in the following form:

eregi("<head>(.*)</head>", $lines_string, $head);

Here "(.*)" matches any run of characters, so the whole pattern can be read as "capture everything between <head> and </head>". $lines_string is the string we are analyzing, and $head is the array in which the results are stored. Because eregi() matches case-insensitively, <HEAD> is found as well.
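eregi() was deprecated in PHP 5.3 and removed in PHP 7.0, so on a current PHP version the same match has to be written with preg_match(). The following is a sketch of an equivalent call, run against a made-up sample string rather than a live page.

```php
<?php
// Sample input standing in for the joined page (made-up content).
$lines_string = "<html> <HEAD>\n <title>Sample</title>\n </HEAD> <body></body> </html>";

// i: case-insensitive, like eregi(); s: let "." match newlines too.
// "#" is the pattern delimiter, so the "/" in "</head>" needs no escaping.
$found = preg_match('#<head>(.*)</head>#is', $lines_string, $head);

if ($found === 1) {
    echo $head[0], "\n"; // full match, including the <head> tags
    echo $head[1], "\n"; // only the text captured by (.*)
}
```

The s modifier is needed here because the lines returned by file() keep their newlines, and without it "." in a PCRE pattern stops at a line break.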

Finally, we can output the data. Since a page has only one <head> element, there is only one match. $head[0] holds the entire matched text, tags included, and $head[1] holds only the part captured by (.*). Let's print the full match.


<?php
$url = "http://www.php.net";
$lines_array = file($url);
$lines_string = implode(" ", $lines_array);
// Requires a PHP version older than 7.0, where eregi() still exists
eregi("<head>(.*)</head>", $lines_string, $head);
echo $head[0];
?>

That is the complete code.
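For readers on a current PHP version, the whole tutorial can be condensed into one function. This is a sketch under stated assumptions, not the original author's code: the name extract_head() is hypothetical, file_get_contents() stands in for the file() and implode() pair, and preg_match() stands in for eregi().

```php
<?php
// Hypothetical helper: fetch a document and return the contents of its
// <head> element, or null if it cannot be read or has no <head>.
function extract_head(string $source): ?string
{
    // file_get_contents() reads the whole document into one string.
    // "@" suppresses the warning on failure; the false return value
    // is checked instead.
    $html = @file_get_contents($source);
    if ($html === false) {
        return null;
    }
    // \b keeps <header> from matching; (.*?) stops at the first </head>.
    if (preg_match('#<head\b[^>]*>(.*?)</head>#is', $html, $head) !== 1) {
        return null;
    }
    return trim($head[1]);
}

// Demonstrated on a temporary local file; a URL such as
// "http://www.php.net" works too when allow_url_fopen is enabled.
$path = tempnam(sys_get_temp_dir(), 'crawl');
file_put_contents($path, "<html>\n<head>\n<title>Sample</title>\n</head>\n<body></body>\n</html>\n");
echo extract_head($path), "\n";
unlink($path);
```

Returning null on failure lets the caller tell an unreachable page apart from an empty <head>. For anything beyond a quick extraction like this one, an HTML parser is usually a sturdier choice than a regular expression.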

Reprinted from WeberDev.com
