In a previous PHP project I built a navigation website that needed Baidu's hot search terms and the TOP50 of the Baidu search rankings.
The scraping relies on simple_html_dom.php. Get simple_html_dom.php (a Baidu search will turn it up) and put it in the same directory as your code; I use ThinkPHP, so I put it in the same directory as the Action. The for loop controls how many items are kept, so you can collect all 50 by changing its limit. These addresses can be scraped in the same way:
//http://top.baidu.com/buzz/top10.html
//http://top.baidu.com/buzz?b=1&c=513
//http://top.baidu.com/buzz?b=1&fr=topcategory_c513
- $now_url = 'http://top.baidu.com/buzz.php?p=top10';
- $content = '';
- if (function_exists ( 'curl_init' )) {
- $ch = curl_init ( $now_url );
- curl_setopt ( $ch, CURLOPT_HEADER, 0 );
- curl_setopt ( $ch, CURLOPT_TIMEOUT, 30 ); // Cap the request at 30 seconds so a stalled connection cannot hang the script
- curl_setopt ( $ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" );
- // curl_setopt ( $ch, CURLOPT_USERAGENT,
- // "Baiduspider+(+http://www.baidu.com/search/spider.htm)" );
- curl_setopt ( $ch, CURLOPT_RETURNTRANSFER, 1 ); // Return the response as a string instead of printing it
- $content = curl_exec ( $ch );
- curl_close ( $ch );
- } elseif (function_exists ( 'file_get_contents' )) {
- $content = file_get_contents ( $now_url );
- } else {
- exit ( 'Your server supports neither cURL nor file_get_contents, so collection cannot start!' );
- }
- include_once ('simple_html_dom.php');
- // Create a new Dom instance
- $html = new simple_html_dom ();
- // Load from string
- $html->load ( $content ); // syncad_3
- $new1 = $html->find ( 'table .keyword .list-title text' ); // Find the text nodes under .list-title inside the table's .keyword cells
- $keyArray = array ();
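- $limit = min ( 20, count ( $new1 ) ); // Raise 20 to 50 for the full list; never read past the matches found
- for($i = 0; $i < $limit; $i ++) {
- $item = iconv ( "GB2312", "UTF-8", $new1 [$i] . '' ); // Convert the GB2312 text to UTF-8
- $keyArray [] = $item;
- }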
- $this->assign ( 'keyArray', $keyArray );
- $html->clear () ;
- unset ( $html );
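The conversion loop above can also be pulled out into a small function, which makes the item limit and the source encoding easier to change. The following is only a minimal sketch: the function name `toUtf8Keywords`, its parameters, and the sample data are assumptions made for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): converts scraped
// keyword strings to UTF-8 and keeps at most $limit non-empty entries.
function toUtf8Keywords(array $rawItems, int $limit = 50, string $fromEncoding = 'GB2312'): array
{
    $keywords = array();
    foreach ($rawItems as $raw) {
        if (count($keywords) >= $limit) {
            break; // stop once the requested number of keywords is collected
        }
        // Casting to string mirrors the `. ''` concatenation used in the loop above.
        $text = @iconv($fromEncoding, 'UTF-8', (string) $raw);
        if ($text === false) {
            continue; // skip items that cannot be converted
        }
        $text = trim($text);
        if ($text !== '') {
            $keywords[] = $text;
        }
    }
    return $keywords;
}

// Made-up sample data: two UTF-8 strings encoded as GB2312, the way they
// would arrive from the scraped page, plus one blank entry.
$sample = array(
    iconv('UTF-8', 'GB2312', '百度'),
    '   ',
    iconv('UTF-8', 'GB2312', '热词'),
);
$result = toUtf8Keywords($sample, 50);
```

Inside the Action, the result of `$html->find ( ... )` could be passed in place of `$sample`, and the returned array handed to `$this->assign ( 'keyArray', ... )` as before.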