I had the evening off and wanted to watch a couple of good movies.
I searched for a long time but couldn't find anything I wanted to watch.
Then I remembered that someone had once crawled Zhihu's user data, and I had a sudden idea:
why not crawl the movie information from BT Paradise, so that next time I can query the database directly?
All I can say is that I was really bored, haha, but at least I can still code ^_^
1. Fetch the site's HTML source
$url = "www.bttiantang.cc";
$html = shell_exec("curl $url");
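Shelling out to curl works, but the same fetch can also be done inside PHP. The following is a minimal sketch, assuming `allow_url_fopen` is enabled; the helper name `fetchHtml`, the timeout, and the user-agent string are illustrative and not part of the original script.

```php
<?php
// Hypothetical helper: fetch a URL (or local path) and return its body, or null on failure.
function fetchHtml(string $url, int $timeoutSeconds = 10): ?string
{
    // The 'http' options only apply to http(s) URLs; they are ignored for local files.
    $context = stream_context_create([
        'http' => [
            'timeout' => $timeoutSeconds,
            'user_agent' => 'movie-crawler-sketch/0.1', // assumed value
        ],
    ]);
    // @ suppresses the warning so a failed fetch is reported only through the return value.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

// Demo on a local temporary file so the sketch runs without network access.
$tmp = tempnam(sys_get_temp_dir(), 'html');
file_put_contents($tmp, '<html><body>hello</body></html>');
$html = fetchHtml($tmp);
echo $html === null ? "fetch failed\n" : strlen($html) . " bytes fetched\n";
```

For the real site the URL would need an explicit scheme, for example `fetchHtml('http://www.bttiantang.cc')`. If `shell_exec` is kept instead, wrapping the URL in `escapeshellarg()` is safer whenever the URL is built from variables.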
2. Get the total number of pages and the total number of movies (regular-expression matching)
// The closing <\/span> is assumed; the end of this pattern was lost from the original post.
preg_match("/<span class=\"pageinfo\">.*?<\/span>/", $html, $pageCount);
preg_match_all("/\d+/", $pageCount[0], $pageCount);
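To make the two-step match concrete, here is a small sketch that runs the same idea on a sample string. The sample markup and the function name `parsePageInfo` are assumptions for illustration; the real pager markup on the site may differ.

```php
<?php
// Hypothetical helper: read the page count and movie count out of the pager element.
function parsePageInfo(string $html): ?array
{
    // Step 1: isolate the pager element.
    if (!preg_match('/<span class="pageinfo">(.*?)<\/span>/s', $html, $m)) {
        return null;
    }
    // Step 2: collect every run of digits inside it.
    preg_match_all('/\d+/', $m[1], $numbers);
    if (count($numbers[0]) < 2) {
        return null;
    }
    return [
        'pages' => (int) $numbers[0][0],
        'movies' => (int) $numbers[0][1],
    ];
}

// Assumed sample markup, not copied from the site.
$sample = '<div><span class="pageinfo">Total <strong>3</strong> pages, '
    . '<strong>85</strong> movies</span></div>';
$info = parsePageInfo($sample);
printf("%d pages, %d movies\n", $info['pages'], $info['movies']);
```

Returning `null` when the pager is missing keeps a changed page layout from silently producing a page count of zero.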
3. Extract the movie information (regular-expression matching)
preg_match("/\d{4}\/\d{2}\/\d{2}/", $pageInfo[0][$i], $updateTime);
preg_match("/<font color=\"#FF6600\">(.*?)<i>/", $pageInfo[0][$i], $movieName);
preg_match("/<strong>(\d{1})/", $pageInfo[0][$i], $movieScore_int);
preg_match("/<em class=\"fm\">(\d{1})/", $pageInfo[0][$i], $movieScore_decimal);
preg_match("/href=\"(.*?)\"/", $pageInfo[0][$i], $movieUrl);
// The closing <\/p> is assumed; the end of this pattern was lost from the original post.
preg_match("/<p class=\"des\">(.*?)<\/p>/", $pageInfo[0][$i], $actor);
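The field extraction can be sketched as one function applied to a single movie block. The sample block below is invented so that the patterns have something to match, and `firstMatch` and `parseMovie` are hypothetical names. The sketch also swaps the nested `str_replace` calls for `strip_tags`, which removes the `<em>` tags in one step.

```php
<?php
// Hypothetical helper: return one capture group of the first match, or null.
function firstMatch(string $pattern, string $subject, int $group = 1): ?string
{
    return preg_match($pattern, $subject, $m) ? $m[$group] : null;
}

// Hypothetical helper: pull the fields the post collects out of one movie block.
function parseMovie(string $block): array
{
    $int = firstMatch('/<strong>(\d)/', $block);
    $dec = firstMatch('/<em class="fm">(\d)/', $block);
    $actor = firstMatch('/<p class="des">(.*?)<\/p>/s', $block);

    return [
        'update_ts' => firstMatch('/\d{4}\/\d{2}\/\d{2}/', $block, 0),
        // Fall back to a plain <b> title when the coloured <font> is absent.
        'name' => firstMatch('/<font color="#FF6600">(.*?)<i>/', $block)
            ?? firstMatch('/<b>(.*?)<i>/', $block),
        'score' => ($int !== null && $dec !== null) ? (float) "$int.$dec" : null,
        'url' => firstMatch('/href="(.*?)"/', $block),
        'actor' => $actor === null ? null : strip_tags($actor),
    ];
}

// Assumed sample block, not copied from the site.
$block = '<div class="item"><em>2016/03/12</em>'
    . '<a href="/subject/123.html"><b><font color="#FF6600">Example Movie'
    . '<i>/Alias</i></font></b></a>'
    . '<strong>7</strong><em class="fm">5</em>'
    . '<p class="des">Starring: <em>Alice</em> / <em>Bob</em></p></div>';
$movie = parseMovie($block);
print_r($movie);
```

Fields that fail to match come back as `null`, so a layout change shows up as missing data rather than as an undefined-index notice.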
4. Insert into the database and you’re done
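On current PHP versions the insert step can be written with PDO and a prepared statement, which also copes with titles that contain quotes. This is a sketch under assumptions: the column names come from the insert statement in the source code attached to this post, but the table definition, the function name `saveMovies`, and the in-memory SQLite database (a stand-in for MySQL so the example is self-contained) are illustrative.

```php
<?php
// Hypothetical helper: insert all movies in one transaction, return how many were saved.
function saveMovies(PDO $pdo, array $movies): int
{
    $stmt = $pdo->prepare(
        'INSERT INTO movie (name, actor, url, update_ts, score)
         VALUES (:name, :actor, :url, :update_ts, :score)'
    );
    $pdo->beginTransaction();
    foreach ($movies as $movie) {
        // Values are bound as parameters, never pasted into the SQL string.
        $stmt->execute([
            ':name' => $movie['name'],
            ':actor' => $movie['actor'],
            ':url' => $movie['url'],
            ':update_ts' => $movie['update_ts'],
            ':score' => $movie['score'],
        ]);
    }
    $pdo->commit();
    return count($movies);
}

// In-memory SQLite stands in for the MySQL database; the schema is assumed.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE movie (name TEXT, actor TEXT, url TEXT, update_ts TEXT, score REAL)');

$saved = saveMovies($pdo, [[
    'name' => "It's an Example", // the apostrophe would break interpolated SQL
    'actor' => 'Alice / Bob',
    'url' => '/subject/123.html',
    'update_ts' => '2016/03/12',
    'score' => 7.5,
]]);
echo "$saved row(s) inserted\n";
```

With MySQL, only the connection line would change, for example to a `mysql:host=...;dbname=...;charset=utf8mb4` DSN with real credentials.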
Overall, crawling with PHP is quite fast: it took less than 4 minutes (3 minutes 17 seconds) to collect more than 20,000 records.
start: 01:22:54
end: 01:26:11
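The throughput implied by these two timestamps can be worked out directly. In this small sketch the calendar date is arbitrary (only the clock times come from the post), and 20,000 is used as a lower bound for the record count.

```php
<?php
// The date is arbitrary; only the clock times come from the post.
$utc = new DateTimeZone('UTC');
$start = new DateTimeImmutable('2000-01-01 01:22:54', $utc);
$end = new DateTimeImmutable('2000-01-01 01:26:11', $utc);

$elapsedSeconds = $end->getTimestamp() - $start->getTimestamp();
$recordsPerSecond = 20000 / $elapsedSeconds; // lower bound: "more than 20,000" records

// Prints: 197 s elapsed, at least 101.5 records/s
printf("%d s elapsed, at least %.1f records/s\n", $elapsedSeconds, $recordsPerSecond);
```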
[Figure: database screenshot]
Attached source code:
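<?php
// Note: the mysql_* functions used below were removed in PHP 7.0; current PHP needs mysqli or PDO.
$url = "www.bttiantang.cc";
$html = shell_exec("curl $url");

// Total page count and total movie count from the pager.
// The closing <\/span> is assumed; the end of this pattern was lost from the original post.
preg_match("/<span class=\"pageinfo\">.*?<\/span>/", $html, $pageCount);
preg_match_all("/\d+/", $pageCount[0], $pageCount);
$pageSize = intval($pageCount[0][0]);
$movieCount = $pageCount[0][1];

$conn = mysql_connect('***', '***', '');
mysql_select_db('***', $conn);
mysql_query('set names utf8', $conn);

// The loop bound is assumed; the original condition was lost.
for ($j = 1; $j <= $pageSize; $j++) {
    // [Lost from the original post: the code that fetches page $j into $movieHtml and
    //  the start of the preg_match_all pattern that fills $pageInfo. Surviving fragment:
    //  .*?/s", $movieHtml, $pageInfo);]
    for ($i = 0; $i < count($pageInfo[0]); $i++) {
        // [Partly lost; surviving fragments: "preg_match ad if str_replace".]
        // Update time, taken from step 3 above.
        preg_match("/\d{4}\/\d{2}\/\d{2}/", $pageInfo[0][$i], $updateTime);
        $updateTime = $updateTime[0];

        preg_match("/<font color=\"#FF6600\">(.*?)<i>/", $pageInfo[0][$i], $movieName);
        /***** same field, fallback patterns *****/
        if (empty($movieName)) preg_match("/<b>(.*?)<i>/", $pageInfo[0][$i], $movieName);
        // The closing <\/b> is assumed; the end of this pattern was lost.
        if (empty($movieName)) preg_match("/<b>(.*?)<\/b>/", $pageInfo[0][$i], $movieName);
        /*****************************************/
        $movieName = $movieName[1];

        preg_match("/<strong>(\d{1})/", $pageInfo[0][$i], $movieScore_int);
        $movieScore_int = $movieScore_int[1];
        preg_match("/<em class=\"fm\">(\d{1})/", $pageInfo[0][$i], $movieScore_decimal);
        $movieScore_decimal = $movieScore_decimal[1];
        $movieScore = floatval($movieScore_int . '.' . $movieScore_decimal);

        preg_match("/href=\"(.*?)\"/", $pageInfo[0][$i], $movieUrl);
        $movieUrl = $movieUrl[1];

        // The closing <\/p> is assumed; the end of this pattern was lost.
        preg_match("/<p class=\"des\">(.*?)<\/p>/", $pageInfo[0][$i], $actor);
        $movieActor = str_replace("<em>", '', str_replace("</em>", '', $actor[1]));

        mysql_unbuffered_query("insert into movie (name,actor,url,update_ts,score) values ('$movieName','$movieActor','$movieUrl','$updateTime','$movieScore')");
    }
}
?>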
This movie information was collected from BT Paradise and contains no confidential information, so I accept no legal liability for it.
If any of this movie information infringes your copyright, intellectual property rights, or other interests, please let me know and it will be deleted as soon as possible once confirmed.
Copyright Statement: This article is an original article by the blogger and may not be reproduced without the blogger's permission.
