PHP Collection Program Principle Analysis

WBOY · Original · 2016-07-29 08:42:04 · 912 views

After thinking hard for a few days, I finally figured out how it works. I am writing it down here and ask the experts to correct me.
The idea of a collection program is very simple. It is nothing more than opening a page, usually a list page, getting the addresses of all the links in it, and then opening the links one by one to look for what we are interested in. If it is found, it is put into the database or processed in some other way. Let's walk through a very simple example.
First decide on a page to collect, usually a list page. The target here is: http://www.jb51.net/article/11/index.htm. This is a list page, and our goal is to collect all the articles listed on it.
Now that we have a list page, the first step is to open it and read its content into our program. Generally one of two functions is used, fopen or file_get_contents. We use fopen as an example here. How to open it? It's very simple: $source=fopen("http://www.jb51.net/article/11/index.htm",'r'); With that, the page has been opened by our program. Note that the $source obtained is a resource, not text that can be processed, so the function fread is used to read the content into a variable. Only then is it real, editable text. Example:
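A minimal sketch of the open-and-read step, assuming allow_url_fopen is enabled so that fopen accepts a URL. The helper names read_all and fetch_page are made up for illustration. Because fread on a network stream may return less than the requested length, the sketch reads in a loop until feof instead of relying on a single call.

```php
<?php
// Hypothetical helper: read everything from an already-opened stream.
function read_all($handle, int $chunkSize = 8192): string
{
    $content = '';
    while (!feof($handle)) {
        // On network streams one fread may return only part of the page.
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            break;
        }
        $content .= $chunk;
    }
    return $content;
}

// Hypothetical helper: open a URL (or local path) and return its text,
// or null when it cannot be opened.
function fetch_page(string $url): ?string
{
    $source = @fopen($url, 'r');   // $source is a resource, not text
    if ($source === false) {
        return null;
    }
    $content = read_all($source);  // now it is an ordinary string
    fclose($source);
    return $content;
}
```

Calling fetch_page("http://www.jb51.net/article/11/index.htm") would then give the list page as a string, or null if the page could not be opened.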
$c…//www.jb51.net/article/7/all/545.1.htm)]. By looking at the page source, we can see that the link addresses of the articles on the list page all look like this:

Encapsulate the database connection code in a function and call it when you need to read..
So we can write a regular expression for them: $count=preg_match_all("/…(.+?)/",$content,$art_list);
The array element $art_list[1][$s] contains the link address of an article, and $art_list[2][$s] contains its title. At this point, half the battle is won.
Then use a for loop to visit each link in turn and get the content in the same way as the title. Everything above is similar to the tutorials I found online, but when it comes to this for loop, the online tutorials are terrible. I haven't found an article that explains it clearly. At the beginning, I used js to help with the loop, or used … Let me give you an example. At the beginning, I did this:
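A minimal sketch of the matching step. The link markup it assumes, an <a> tag with href followed by title, is only a guess for illustration, and extract_links is a hypothetical helper name. The pattern has to be adapted to whatever the real page source looks like.

```php
<?php
// Hypothetical helper: pull link addresses and titles out of a list page.
// Assumed markup (an illustration only): <a href="..." title="...">
function extract_links(string $content): array
{
    $pattern = '/<a\s+href="(.+?)"\s+title="(.+?)"/i';

    // In the default PREG_PATTERN_ORDER, $art_list[1] holds every first
    // group (the addresses) and $art_list[2] every second group (the titles).
    $count = (int) preg_match_all($pattern, $content, $art_list);

    $articles = [];
    for ($s = 0; $s < $count; $s++) {
        $articles[] = [
            'url'   => $art_list[1][$s],
            'title' => $art_list[2][$s],
        ];
    }
    return $articles;
}
```

The lazy quantifier (.+?) stops at the first closing quote, so each group captures one attribute value rather than running on to the end of the line.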
for($i=0;$i<20;$i++) {
The middle is the part that collects the content, which is omitted.
After one page has been collected, the next page has to be collected.
But it doesn't work when I use fopen to open the link: the request fails or something like that, and it doesn't work with js either. In the end I found that we need to use this: echo "…
}
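A minimal sketch of such a loop, written without js. It is not the echo-based approach mentioned above, only a plain loop that survives failed requests. collect_articles and the $fetch callable are hypothetical names; $fetch stands for any function that returns the page text or null on failure, for example one built on fopen and fread.

```php
<?php
// Hypothetical sketch: visit each collected link in turn.
// $links is a list of addresses; $fetch returns page text or null.
function collect_articles(array $links, callable $fetch, int $limit = 20): array
{
    $collected = [];
    $failed = [];
    $total = min($limit, count($links));

    for ($i = 0; $i < $total; $i++) {
        $url = $links[$i];
        $page = $fetch($url);
        if ($page === null) {
            // Record the failed request and carry on with the next link.
            $failed[] = $url;
            continue;
        }
        // The part that picks the article content out of $page is omitted.
        $collected[$url] = $page;
    }
    return ['collected' => $collected, 'failed' => $failed];
}
```

Recording a failed link and skipping it means one bad request does not stop the whole run, and the failed addresses can be retried afterwards.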
My head is a bit foggy and the writing is a bit messy, so please make do with it. In the eyes of experts this may be no big deal, but for novices like me it's really helpful.
