


PHP implementation code for batch collecting and downloading pictures of beautiful women
Design Idea
Collecting pictures from one web page at a time is too tedious, so we collect from the list page instead: get the URLs in the list, then collect those pages one by one. Matching the list page's URLs in PHP is troublesome, though. The first list page contains many invalid URLs, which is a real problem for me as a newcomer to regular expressions. After looking at the structure of the list page, I decided to use jQuery to obtain the URLs; jQuery's versatile selectors prove powerful once again.
The flow: jQuery gets the URLs -> ajax passes them to the corresponding PHP file -> PHP traverses the url parameter -> each page is collected and its images are saved
jQuery program
<script>
$(document).ready(function () {
    var hrefs = '';
    $('.f_folder>a').each(function (i) {
        var href = $('.f_folder:eq(' + i + ')>a:eq(0)').attr('href');
        // attr() returns undefined (not the string 'undefined') when nothing matches
        if (typeof href !== 'undefined') {
            hrefs += href + ',';
        }
    });
    // Encode the list so characters such as & inside a URL do not break the query string
    $.getJSON("http://www.****.com/365/getimg.php?hrefs=" + encodeURIComponent(hrefs) + "&callback=?", function (data) {
        //alert(data.info);
    });
});
</script>
Here the URLs are joined into a single ','-separated string so they can be passed in one request. getJSON (with callback=?) is used because the request is cross-domain.
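On the PHP side, the ','-separated string has to be split again, and a getJSON call with callback=? expects the response to be wrapped in the callback name it sent. The following is a minimal sketch of both steps, not the original script's code: the helper names parseHrefs and jsonpResponse, the example.com URLs and the callback name are all assumptions made for illustration.

```php
<?php
// Hypothetical helpers, not part of the original script.

/**
 * Split the ','-separated "hrefs" parameter into a clean list of URLs.
 */
function parseHrefs(string $raw): array
{
    $urls = array_map('trim', explode(',', $raw));
    // Drop empty entries (the jQuery code leaves a trailing comma).
    $urls = array_filter($urls, function ($u) {
        return $u !== '';
    });
    // Remove duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

/**
 * Wrap a payload in a JSONP callback so that $.getJSON(... callback=?) can read it.
 * The callback name is restricted to identifier characters.
 */
function jsonpResponse(string $callback, array $payload): string
{
    if (!preg_match('/^[A-Za-z_$][A-Za-z0-9_$]*$/', $callback)) {
        $callback = 'callback'; // fall back to a fixed, safe name
    }
    return $callback . '(' . json_encode($payload) . ');';
}

$urls = parseHrefs('http://example.com/a.php,,http://example.com/b.php,');
echo jsonpResponse('jQuery123_456', ['info' => count($urls) . ' urls received']), "\n";
// prints: jQuery123_456({"info":"2 urls received"});
```

With a response of this shape, the commented-out alert(data.info) in the jQuery code would have something to display.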
PHP collection program
<?php
// Grab 365 pictures
error_reporting(E_ALL ^ E_NOTICE);
set_time_limit(0); // Remove PHP's execution time limit
/**
 * Get the current time
 */
function getMicrotime() {
    list($usec, $sec) = explode(" ", microtime());
    return ((float) $usec + (float) $sec);
}
$stime = getMicrotime();
$callback = $_GET['callback'];
$hrefs = $_GET['hrefs'];
$urlarray = explode(',', $hrefs);
// Get all images of the specified url
function getimgs($url) {
    $dirname = basename($url, ".php");
    // Check the same path that mkdir() creates; also create 365/ if it is missing
    if (!file_exists('365/' . $dirname)) {
        mkdir('365/' . $dirname, 0777, true);
    }
    clearstatcache();
    $data = file_get_contents($url);
    preg_match_all("/(href|src)=([\"']?)([^ \"'>]+\.(jpg|png|gif))\\2/i", $data, $matches);
    //$matches[3] = array_unique($matches[3]);
    unset($data);
    $i = 0;
    if (count($matches[3]) > 0) {
        foreach ($matches[3] as $k => $v) {
            // Simply determine whether it is a standard url, not a relative path
            if (substr($v, 0, 4) == 'http') {
                $ext = pathinfo($v, PATHINFO_EXTENSION); // Image extension
                if (!file_exists('365/' . $dirname . '/' . $k . '.' . $ext)) {
                    file_put_contents('365/' . $dirname . '/' . $k . '.' . $ext, file_get_contents($v));
                    $i++;
                } else {
                    unset($v);
                }
                clearstatcache();
            } else {
                unset($v);
            }
        }
    }
    unset($matches);
    return $i; // Return the count even when no image matched
}
$j = 0;
foreach ($urlarray as $k => $v) {
    if ($v != '') {
        $j += getimgs($v);
    }
}
$etime = getMicrotime();
echo "A total of " . $j . " pictures were collected. ";
echo "Time spent: " . ($etime - $stime) . " seconds";
To limit memory use, the variables used in the getimgs function are unset once they are no longer needed.
Key points involved
Determine whether it is a standard, valid image URL
if(substr($v,0,4)=='http') is only a simple check of whether the matched image URL is a standard (absolute) URL, because the collected images may use relative paths. Here I simply skip such images, although you could also convert them into standard image paths. Another problem is that even an image with a standard URL may fail to download, because you cannot tell whether the image still exists; the URL may no longer be valid. For a stricter check of whether an image URL is real and valid, see my earlier article "PHP Methods to Determine whether a Remote URL is Valid", which describes three ways to verify a URL.
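Restoring a relative image path to a standard URL could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name toAbsoluteUrl and the example.com addresses are made up, and "../" segments are not normalized.

```php
<?php
/**
 * Hypothetical helper: turn an image path found on $pageUrl into an absolute URL.
 * Handles absolute, protocol-relative, root-relative and simple relative paths;
 * "../" segments are not normalized in this sketch.
 */
function toAbsoluteUrl(string $src, string $pageUrl): string
{
    // Already a standard URL: return it unchanged.
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }
    $parts  = parse_url($pageUrl);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    $origin = $scheme . '://' . $host . $port;

    // Protocol-relative: //cdn.example.com/x.gif
    if (substr($src, 0, 2) === '//') {
        return $scheme . ':' . $src;
    }
    // Root-relative: /img/2.jpg
    if (substr($src, 0, 1) === '/') {
        return $origin . $src;
    }
    // Relative to the directory of the page.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}

echo toAbsoluteUrl('img/1.jpg', 'http://example.com/365/list.php'), "\n";
// prints: http://example.com/365/img/1.jpg
```

The page URL passed to getimgs() would serve as the second argument, so relative matches could be downloaded instead of skipped.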
Get the image format
$ext = pathinfo($v,PATHINFO_EXTENSION);//Image extension
The pathinfo() function is used here. Altogether there are about seven ways to get a file's format; see the recommended article "Seven Methods for PHP to Determine Image Format".
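One caveat worth illustrating: when an image URL carries a query string, pathinfo() includes it in the extension. A small sketch of one way around this follows; the helper name imageExtension and the example URL are assumptions, not part of the original script.

```php
<?php
/**
 * Hypothetical helper: get a lowercase image extension from a URL,
 * ignoring any query string or fragment.
 */
function imageExtension(string $url): string
{
    // Keep only the path component, e.g. /a/b.JPG
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }
    return strtolower(pathinfo($path, PATHINFO_EXTENSION));
}

// pathinfo() alone keeps the query string in the extension:
echo pathinfo('http://example.com/a/b.jpg?w=200', PATHINFO_EXTENSION), "\n"; // jpg?w=200
// Extracting the path first gives the plain extension:
echo imageExtension('http://example.com/a/b.JPG?w=200'), "\n";               // jpg
```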
Download and save locally
file_put_contents('365/'.$dirname.'/'.$k .'.'.$ext,file_get_contents($v));
The file_put_contents() function writes a string to a file.
It has the same effect as calling fopen(), fwrite() and fclose() in sequence.
The file_get_contents() function reads an entire file into a string.
This works here because the server allows file_get_contents() to open remote URLs. If the server disables it, you can use curl instead, which is more powerful than file_get_contents(). I recommend studying "CURL Learning and Application (with Multi-Threading)"; with curl you can download and save several files concurrently, which works even better.
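As a rough sketch of the curl alternative, the function below fetches a URL with curl when the extension is loaded and otherwise falls back to file_get_contents(). The name fetchUrl, the timeout values and the paths in the usage comment are assumptions for illustration; it handles one request at a time and does not cover concurrent downloads.

```php
<?php
/**
 * Hypothetical helper: fetch a URL with curl when the extension is loaded,
 * otherwise fall back to file_get_contents().
 * Returns the response body as a string, or false on failure.
 */
function fetchUrl(string $url, int $timeout = 30)
{
    if (!function_exists('curl_init')) {
        return @file_get_contents($url);
    }
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_CONNECTTIMEOUT => 10,       // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout, // seconds allowed for the whole transfer
    ]);
    $body = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // Treat transport errors and HTTP error statuses as failures.
    if ($body === false || $code >= 400) {
        return false;
    }
    return $body;
}

// Usage, with placeholder names:
// $img = fetchUrl('http://example.com/1.jpg');
// if ($img !== false) { file_put_contents('365/demo/1.jpg', $img); }
```

Returning false on failure keeps the calling code close to the file_get_contents() version, so the download line in getimgs() would only need an extra check before saving.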
Clear the file status cache
The clearstatcache() function clears the file status cache. PHP caches the return values of certain functions (such as file_exists() and filesize()) for better performance. But sometimes, for example when a script checks the same file several times and the file may be deleted or modified while the script is running, you need to clear the file status cache to get correct results. To do this, call clearstatcache(). See the official PHP manual for details.
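A small illustration of the idea, assuming a temporary file and a made-up helper name freshFilesize: the cache is cleared for one path right before the size is read, so a change made earlier in the same script is reflected.

```php
<?php
/**
 * Hypothetical helper: read a file's size after clearing the cached
 * status information for that path only.
 */
function freshFilesize(string $path)
{
    clearstatcache(true, $path);
    return filesize($path);
}

$file = tempnam(sys_get_temp_dir(), 'stat');
file_put_contents($file, 'a');
filesize($file);                              // this result may now be cached by PHP
file_put_contents($file, 'bcd', FILE_APPEND); // the file grows to 4 bytes
echo freshFilesize($file), "\n";              // prints: 4
unlink($file);
```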
Program execution time calculation
/**
 * Get the current time
 */
function getMicrotime() {
    list($usec, $sec) = explode(" ", microtime());
    return ((float) $usec + (float) $sec);
}
You can also refer to this blog article: "Get PHP page execution time, database read and write times, function call times, etc. [THINKPHP]"
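For reference, microtime(true) returns the timestamp directly as a float, so the explode() step is not needed. A minimal sketch follows; the helper name timeIt is an assumption and the usleep() call is only a stand-in for the real download work.

```php
<?php
/**
 * Hypothetical helper: run a callable and return how long it took, in seconds.
 * microtime(true) returns the current Unix timestamp as a float.
 */
function timeIt(callable $task): float
{
    $start = microtime(true);
    $task();
    return microtime(true) - $start;
}

$elapsed = timeIt(function () {
    usleep(20000); // stand-in for the real download work (20 ms)
});
printf("Time spent: %.3f seconds\n", $elapsed);
```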
Finally, take a look at the result:

214 pictures were collected in 409 seconds, which is about 2 seconds to download and save each picture; the total size of the pictures is about 62 MB. At that rate:
an hour has 60*60 = 3600 seconds, so about 1800 pictures of beautiful women can be downloaded in one hour.
