Using simple_html_dom to crawl and display the entire novel in laravel

L先生

May 07, 2020 pm 02:14 PM


As mentioned in "Programmers also read novels with advertisements", most novel websites carry very annoying advertisements, or wrap the whole page div in a link so that an accidental tap jumps to some other website, sometimes in an endless loop. Many mobile apps are full of ads too. This article applies the approach from that earlier article to the Laravel framework. It is best to read the previous article first and then deploy this yourself.

1. Introduce third-party classes into laravel

1. Create a new folder in the app directory under the project root and name it Lib (the name is up to you).

2. If you introduce many third-party libraries, you can create subdirectories under Lib to group them. Only one class is introduced here, so no subfolder is created. (Decide for yourself based on how many classes you import.)

Copy simple_html_dom.php into Lib.

3. Find the composer.json file in the project root and add the path of the third-party class to the classmap under autoload so that it can be loaded automatically:

"autoload": {
    "classmap": [
        "database/seeds",
        "database/factories",
        "app/Lib/simple_html_dom.php"
    ]
},

4. Switch to the project root directory in the cmd console and execute the command:

composer dumpautoload

5. Use this class in the controller

use simple_html_dom;

$html = new simple_html_dom();
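To make the classmap step less of a black box, here is a minimal sketch of classmap-style autoloading in plain PHP. It is not Composer's actual implementation. The class name DemoParser and the temporary file are hypothetical stand-ins for simple_html_dom and app/Lib/simple_html_dom.php.

```php
<?php
// Hypothetical stand-in for app/Lib/simple_html_dom.php: write a tiny class file to a temp path.
$file = sys_get_temp_dir() . '/DemoParser_' . uniqid() . '.php';
file_put_contents($file, "<?php\nclass DemoParser { public function name() { return 'demo'; } }\n");

// A classmap is simply "class name => file path"; Composer generates one from composer.json.
$classMap = ['DemoParser' => $file];

spl_autoload_register(function (string $class) use ($classMap): void {
    if (isset($classMap[$class])) {
        require $classMap[$class];
    }
});

// No explicit require here: the autoloader loads the file the first time the class is used.
$parser = new DemoParser();
echo $parser->name(), PHP_EOL;

// Clean up the temporary file; the class stays loaded for the rest of the script.
unlink($file);
```

This is why the controller can write use simple_html_dom; without a require statement once composer dumpautoload has rebuilt the map.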

2. Create routing

Route::get('/novel_list','index\Spnovel@index');

3. Create controller Spnovel.php

<?php
namespace App\Http\Controllers\index;

use simple_html_dom;
use Illuminate\Http\Request;
use App\Http\Controllers\Controller;

class Spnovel extends Controller
{
    public function index(){
        // Chapter-list page of the novel to crawl
        $url = "https://www.7kzw.com/85/85445/";
        $list_html = mySpClass::getCurl($url);
        $data['List'] = self::getList($list_html);
        return view('index.spnovel.index',$data);
    }

    private static function getList($list_html){
        $html = new simple_html_dom();
        @$html->load($list_html);
        $list = $html->find('#list dd a');
        $content = []; // initialise so an empty result does not raise a notice
        foreach ($list as $k=>$v) {
            $arr1=$arr2=[];
            $p1 = '/<a .*?>(.*?)<\/a>/i';            // captures the chapter name
            $p2 = '/<a .*? href="(.*?)">.*?<\/a>/i'; // captures the chapter URL
            preg_match($p1,$v->outertext,$arr1);
            preg_match($p2,$v->outertext,$arr2);
            // Fall back to an empty string when a pattern does not match
            $content[$k][0]=$arr1[1] ?? '';
            $content[$k][1]=$arr2[1] ?? '';
        }
        // Drop the first 12 links, as the original code does
        array_splice($content,0,12);
        return $content;
    }
}

class mySpClass{
    // Send a plain GET request to the server
    public static function getCurl($url,$header=null){
        // 1. Initialise
        $ch = curl_init($url); // address to request
        // 2. Set options
        curl_setopt($ch,CURLOPT_RETURNTRANSFER,1); // return the response as a string instead of printing it (required)
        curl_setopt($ch,CURLOPT_TIMEOUT,10); // timeout in seconds (required)
        curl_setopt($ch,CURLOPT_HEADER,0); // 1 includes the response headers in the output, 0 omits them
        curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false); // do not verify the certificate
        curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,false); // do not verify the host name
        if(!empty($header)){
            curl_setopt($ch,CURLOPT_HTTPHEADER,$header); // set the request headers
        }else{
            $_head = [
                'User-Agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0'
            ];
            curl_setopt($ch,CURLOPT_HTTPHEADER,$_head);
        }
        // 3. Execute
        $res = curl_exec($ch);
        // 4. Close
        curl_close($ch);
        return $res;
    }
}

Explanation of the above code: you first need a basic understanding of the Laravel framework and of PHP classes.

After the route above is accessed, the index method in the Spnovel.php controller runs. $url is the address of the novel's chapter list. It is passed to the getCurl method of the custom class mySpClass, which returns the HTML document of that page as a string. That string is then passed to getList, a private method of the controller. getList parses the HTML with simple_html_dom and uses regular expressions to extract the URL and name of each chapter, then returns them as an array. Finally, return view('index.spnovel.index',$data); renders the view index/spnovel/index.blade.php, which is created in the next step.
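The extraction step can be tried without Laravel or simple_html_dom. The sketch below runs a single regular expression over a hard-coded HTML string. The markup is made-up sample data assumed to resemble the #list dd a structure, the second chapter URL is invented, and extractChapters is a hypothetical helper, not part of the controller above.

```php
<?php
// Sample markup (assumed structure, made-up data).
$listHtml = <<<HTML
<div id="list"><dl>
<dd><a style="" href="/85/85445/27248645.html">Chapter 1</a></dd>
<dd><a style="" href="/85/85445/27248646.html">Chapter 2</a></dd>
</dl></div>
HTML;

/**
 * Return [[name, href], ...] for every anchor in the given HTML,
 * in the same [0 => name, 1 => url] layout the view expects.
 */
function extractChapters(string $html): array
{
    $chapters = [];
    // [^>]*? tolerates attributes before href; group 1 is the href, group 2 the link text.
    $pattern = '/<a\b[^>]*?\bhref="([^"]*)"[^>]*>(.*?)<\/a>/is';
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $chapters[] = [trim(strip_tags($m[2])), $m[1]];
        }
    }
    return $chapters;
}

$chapters = extractChapters($listHtml);
print_r($chapters);
```

Unlike the two patterns in getList, this one also matches anchors where href is the first attribute, which may or may not matter for the target site.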

4. Create the view index.blade.php

<!DOCTYPE html>
<html>
<head>
	<title>Crawled novel list</title>
	<style type="text/css">
	body{padding:0px;margin:0px;}
	#lists{width:100%;padding:30px 50px;box-sizing:border-box;}
	ul{margin:0;padding: 0;overflow:hidden;}
	ul li{list-style:none;display:inline-block;float:left;width:25%;color:#444;}
	ul li:hover{color:#777;cursor: pointer;}
	img {z-index:-1;width:100%;height:100%;position:fixed;}
	</style>
</head>
<body>
	<img src="/static/img/index/novelbg.jpg">
	<div id="lists">
		<ul>
			@foreach($List as $item)
			<li>
			<a href="/novel_con{{$item[1]}}">{{$item[0]}}</a>
			</li>
			@endforeach
		</ul>
	</div>
</body>
</html>

Explanation of the above code: the CSS here is kept simple, and the img serves as the background image. Inside the li loop in the ul, {{$item[1]}} is the extracted chapter address and {{$item[0]}} is the extracted chapter name. Take a look at the array and the final result.

[Screenshot: the array and the rendered result]

5. Run

[Screenshot: the chapter list page]

Next comes the content of each chapter.

Look at the routing first:

Route::get('/novel_con/{a}/{b}/{c}','index\Spnovel@get_nContent');

The three route parameters correspond to the path segments of each chapter's URL. For example, the path of one chapter is: novel_con/85/85445/27248645.html
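As a rough illustration of how the three segments map onto the source site's address, the following sketch joins them into a URL. The function name upstreamChapterUrl and the format check are assumptions added here for illustration; the article's own method concatenates the parameters without checking them.

```php
<?php
/**
 * Map the three route segments onto the source site's chapter URL.
 * The format check is an added assumption, not part of the original code.
 */
function upstreamChapterUrl(string $a, string $b, string $c): ?string
{
    // Accept only "<digits>/<digits>/<digits>.html"; anything else is rejected.
    if (preg_match('#^\d+/\d+/\d+\.html\z#', "{$a}/{$b}/{$c}") !== 1) {
        return null;
    }
    return "https://www.7kzw.com/{$a}/{$b}/{$c}";
}

var_dump(upstreamChapterUrl('85', '85445', '27248645.html'));
var_dump(upstreamChapterUrl('85', '85445', '../etc'));
```

Restricting the segments this way keeps the controller from requesting arbitrary paths on the source site when someone edits the URL by hand.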

Write the get_nContent method:

public function get_nContent(Request $req){
    // Rebuild the chapter path on the source site from the three route parameters
    $url1 = $req->a.'/'.$req->b.'/'.$req->c;
    $url = "https://www.7kzw.com/".$url1;
    $res = mySpClass::getCurl($url); // fetch the chapter page
    // Start parsing
    $data['artic'] = self::getContent($res);
    // (int) keeps the leading digits of e.g. "27248645.html"; the next chapter is assumed to be id + 1
    $next = (int)$req->c;
    $next = $next+1;
    $data['artic']['next']="/novel_con/".$req->a.'/'.$req->b.'/'.$next.'.html';
    return view('index.spnovel.ncontent',$data);
}

private static function getContent($get_html){
    $html = new simple_html_dom();
    @$html->load($get_html);
    // Defaults, so a page without the expected elements does not raise notices
    $artic = ['title' => '', 'content' => ''];
    $content = '';
    $h1 = $html->find('.bookname h1');
    foreach ($h1 as $k=>$v) {
        $artic['title'] = $v->innertext;
    }
    // Find the body text of the chapter
    $divs = $html->find('#content');
    foreach ($divs as $k=>$v) {
        $content = $v->innertext;
    }
    // Strip the unwanted parts with a regular expression
    $pattern = "/(<p>.*?<\/p>)|(<div .*?>.*?<\/div>)/";
    $artic['content'] = preg_replace($pattern,'',$content);
    return $artic;
}

Explanation: $req->a, $req->b and $req->c are the three route parameters. They are joined into the full address of a chapter, and mySpClass::getCurl fetches that chapter's HTML string. getContent in this class then parses the page. As in the previous article, it extracts the chapter's title and content, writes them into an array, and removes the leftover text advertisements. $next holds the address of the next chapter and is used by the link on the chapter page.
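The two less obvious steps, computing the next-chapter link and stripping the advertisement blocks, can be checked in isolation. This is a minimal sketch: nextChapterPath and stripAdBlocks are hypothetical helper names, the sample content string is made up, and the regular expression is the one used in getContent.

```php
<?php
/**
 * Build the local link to the following chapter, assuming chapter ids are consecutive.
 */
function nextChapterPath(string $a, string $b, string $c): string
{
    // (int) "27248645.html" yields 27248645: the cast reads only the leading digits.
    $next = (int) $c + 1;
    return "/novel_con/{$a}/{$b}/{$next}.html";
}

/**
 * Remove whole <p>...</p> and <div ...>...</div> blocks from the chapter body.
 */
function stripAdBlocks(string $content): string
{
    $pattern = '/(<p>.*?<\/p>)|(<div .*?>.*?<\/div>)/';
    return preg_replace($pattern, '', $content);
}

echo nextChapterPath('85', '85445', '27248645.html'), PHP_EOL;
echo stripAdBlocks('Line one<br><p>ad text</p>Line two<div class="ad">more ads</div>'), PHP_EOL;
```

Two limits are worth knowing: the id + 1 rule only works while the site numbers chapters consecutively, and the pattern has no s modifier, so it will not remove a block that spans several lines.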

The view ncontent.blade.php:

<!DOCTYPE html>
<html>
<head>
	<title>{{$artic['title']}}</title>
	<style type="text/css">
	h2{text-align:center;padding-top:30px;}
	div{margin:20px 50px;font-size:20px;}
	img {z-index:-1;width:100%;height:100%;position:fixed;}
	.next {position:fixed;right:10px;bottom:20px;background:coral;border-radius:3px;padding:4px;}
	.next:hover{color:#fff;}
	</style>
</head>
<body>
	<img src="/static/img/index/novelbg.jpg">
	<h2 id="artic-title">{{$artic['title']}}</h2>
	<a href="{{$artic['next']}}" class="next">Next chapter</a>
	<div>
		{!!$artic['content']!!}
	</div>
</body>
</html>

Explanation: there is only one article here, so no loop is needed. {{$artic['title']}} is the chapter title, and it is also written into the title tag. {!!$artic['content']!!} outputs the article content without escaping it; otherwise the markup inside the content, such as <br>, would appear on the page as literal characters. The address of the next chapter is passed straight into the button. position:fixed pins the button in place, so you can go to the next chapter at any time.
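The difference between the two Blade echo forms can be reproduced in plain PHP. Blade's {{ }} passes the value through Laravel's e() helper, which wraps htmlspecialchars(), while {!! !!} echoes the value as it is. The sample string below is made up.

```php
<?php
$content = 'First line<br>Second line';

// Roughly what {{ $content }} does: HTML special characters are converted to entities.
$escaped = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');

// Roughly what {!! $content !!} does: the string is echoed unchanged.
$raw = $content;

echo $escaped, PHP_EOL; // First line&lt;br&gt;Second line
echo $raw, PHP_EOL;     // First line<br>Second line
```

Since the unescaped form sends whatever markup the source page contained to the browser, one option is to pass the content through strip_tags($content, '<br>') first so that only line breaks survive.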

Run:

[Screenshot: a chapter page]

Summary: the most important part of this article is how to bring a third-party class into Laravel and use it. The rest is Laravel basics, mostly controllers and views. If you use a model, write and verify that part yourself.

This is enough for a single novel. It could of course be extended to list every novel on the site by continuing to pass the appropriate parameters along.
