How to Web Scrape

Linda Hamilton
2024-10-17 19:08:02

Web Scraping with PHP

Question:

How can I extract the title, an image, and text or description from a specified URL without external libraries in PHP?

Answer:

The simplest approach is the simple_html_dom library, although it is an external dependency. The following example retrieves the page title and the first image with it:

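<code class="php">// simple_html_dom.php is a third-party file and must be available on the include path.
require 'simple_html_dom.php';

// Fetch the page and parse it into a DOM object.
$html = file_get_html('http://www.google.com/');

// find() with index 0 returns the first matching element.
$title = $html->find('title', 0);
$image = $html->find('img', 0);

echo $title->plaintext . "<br>\n";
echo $image->src;</code>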

If you prefer to avoid external libraries, you can extract the data with regular expressions. This approach is not recommended for HTML, because patterns break easily on attribute variations and malformed markup.

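<code class="php">$data = file_get_contents('http://www.google.com/');

// Capture the text between <title> and </title>; use an empty string if nothing matches.
$title = preg_match('/<title>([^<]+)<\/title>/i', $data, $matches) ? $matches[1] : '';

// Capture the src attribute of the first <img> tag, quoted with either " or '.
$img = preg_match('/<img[^>]*src=["\']([^\'"]+)["\'][^>]*>/i', $data, $matches) ? $matches[1] : '';

echo $title . "<br>\n";
echo $img;</code>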
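
Another option that needs no external library is PHP's bundled DOM extension, which parses the markup instead of pattern-matching it. The following is a minimal sketch, not part of the original answer: it assumes the DOM extension is enabled, that the page declares its summary in a meta tag named description, and it uses a hard-coded sample page in place of a fetched one. The helper name extractPageInfo() and the sample HTML are hypothetical.

<code class="php"><?php
// Hypothetical sample page; in practice $data could come from file_get_contents($url).
$data = <<<'HTML'
<html>
<head>
  <title>Example Page</title>
  <meta name="description" content="A short summary of the page.">
</head>
<body>
  <img src="/images/logo.png" alt="Logo">
  <p>Hello, world.</p>
</body>
</html>
HTML;

// extractPageInfo() is a hypothetical helper name, not a built-in PHP function.
function extractPageInfo(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // item(0) returns the first match, or null when nothing matches.
    $titleNode = $xpath->query('//title')->item(0);
    $imageNode = $xpath->query('//img[@src]')->item(0);
    $descNode  = $xpath->query('//meta[@name="description"]/@content')->item(0);

    return [
        'title'       => $titleNode ? trim($titleNode->textContent) : null,
        'image'       => $imageNode ? $imageNode->getAttribute('src') : null,
        'description' => $descNode ? trim($descNode->nodeValue) : null,
    ];
}

$info = extractPageInfo($data);

echo $info['title'] . "<br>\n";
echo $info['image'] . "<br>\n";
echo $info['description'];</code>

Each field is null when the corresponding element is missing, so callers can tell "not found" apart from an empty value. The XPath comparison on the name attribute is case-sensitive, so a page that writes name="Description" would need a broader query.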
