
How to Web Scrape

Linda Hamilton (original article)
2024-10-17 19:08:02

Web Scraping with PHP

Question:

How can I extract the title, an image, and text or description from a specified URL without external libraries in PHP?

Answer:

The question asks for a solution without external libraries, but the third-party simple_html_dom library (PHP Simple HTML DOM Parser) makes this task much simpler. The following example uses it to obtain the title and the first image:

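<code class="php">// simple_html_dom.php is the third-party PHP Simple HTML DOM Parser
require 'simple_html_dom.php';

$html = file_get_html('http://www.google.com/');
if (!$html) {
    exit('Could not load the page.');
}

// find() with an index returns a single element, or null when nothing matches
$title = $html->find('title', 0);
$image = $html->find('img', 0);

echo ($title ? $title->plaintext : '') . "<br>\n";
echo $image ? $image->src : '';</code>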

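If you prefer to avoid external libraries, you can extract the same data with regular expressions. This approach is not recommended for HTML, because small variations in the markup (attribute order, quoting style, line breaks) can easily break the patterns.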

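<code class="php">$data = file_get_contents('http://www.google.com/');
if ($data === false) {
    exit('Could not load the page.');
}

// Capture the text between <title> and </title>
$title = preg_match('/<title>([^<]+)<\/title>/i', $data, $matches) ? $matches[1] : '';

// Capture the src attribute of the first <img> tag
$img = preg_match('/<img[^>]*src=["\']([^\'"]+)["\'][^>]*>/i', $data, $matches) ? $matches[1] : '';

echo $title . "<br>\n";
echo $img;</code>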
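
A middle ground between the two approaches is PHP's bundled DOM extension, which parses HTML properly without a third-party library. The sketch below is one possible way to do this, not part of the original answer. It also reads the page description, which the question asks for, on the assumption that it is stored in a meta tag named "description". The function name extractPageSummary and the sample HTML are hypothetical, chosen only for illustration.

<code class="php">// Hypothetical helper: extracts the title, first image URL, and meta
// description from an HTML string using only PHP's bundled DOM extension.
function extractPageSummary(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // XPath string() returns the first matching node's text,
    // or an empty string when nothing matches.
    return [
        'title'       => trim($xpath->evaluate('string(//title)')),
        'image'       => $xpath->evaluate('string(//img/@src)'),
        'description' => $xpath->evaluate('string(//meta[@name="description"]/@content)'),
    ];
}

// Inline sample document, so the sketch runs without network access.
$sample = '<html><head><title>Example page</title>'
        . '<meta name="description" content="A short summary."></head>'
        . '<body><img src="/logo.png" alt="Logo"><p>Hello</p></body></html>';

$summary = extractPageSummary($sample);

echo $summary['title'] . "<br>\n";
echo $summary['image'] . "<br>\n";
echo $summary['description'];</code>

To use this on a live page, pass in the result of file_get_contents() for the URL after checking that it is not false or empty. Image paths such as /logo.png are returned exactly as written in the page, so relative URLs still need to be resolved against the page address.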
