Parsing HTML with phpQuery, jQuery-Style
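Introduction
How to parse HTML conveniently in PHP is a problem that probably every PHP developer runs into sooner or later. With phpQuery, PHP can handle HTML almost as conveniently as jQuery does.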
Project page: https://code.google.com/p/phpquery/
GitHub repository: https://github.com/TobiaszCudnik/phpquery
DEMO
Download the library: https://code.google.com/p/phpquery/downloads/list
I downloaded the onefile version: phpQuery-0.9.5.386-onefile.zip
Official demo: https://code.google.com/p/phpquery/source/browse/branches/dev/demo.php
Then include the library in your project.
The HTML file, test.html:
<div class="thumb" id="Thumb-13164-3640" style="position: absolute; left: 0px; top: 0px;">
    <a href="/Spiderman-City-Drive">
        <img src="/thumb/12/Spiderman-City-Drive.jpg" alt="">
        <span class="GameName" id="GameName-13164-3640" style="display: none;">Spiderman City Drive</span>
        <span class="GameRating" id="GameRating-13164-3640" style="display: none;">
            <span style="width: 68.14816px;"></span>
        </span>
    </a>
</div>
<div class="thumb" id="Thumb-13169-5946" style="position: absolute; left: 190px; top: 0px;">
    <a href="/Spiderman-City-Raid">
        <img src="/thumb/12/Spiderman-City-Raid.jpg" alt="">
        <span class="GameName" id="GameName-13169-5946" style="display: none;">Spiderman - City Raid</span>
        <span class="GameRating" id="GameRating-13169-5946" style="display: none;">
            <span style="width: 67.01152px;"></span>
        </span>
    </a>
</div>
Processing it in PHP:
<?php
include('phpQuery-onefile.php');

// Base URL prepended to the relative paths; it matches the URLs shown in the output.
$domain = 'http://www.gahe.com';

$filePath = 'test.html';
$fileContent = file_get_contents($filePath);
$doc = phpQuery::newDocumentHTML($fileContent);
phpQuery::selectDocument($doc);

$data = array(
    'name' => array(),
    'href' => array(),
    'img'  => array()
);

// Iterating over a pq() result yields plain DOM elements.
foreach (pq('a') as $t) {
    $href = $t->getAttribute('href');
    $data['href'][] = $domain . $href;
}
foreach (pq('img') as $img) {
    $data['img'][] = $domain . $img->getAttribute('src');
}
foreach (pq('.GameName') as $name) {
    $data['name'][] = $name->nodeValue;
}
var_dump($data);
?>
The code above covers both reading attributes (via getAttribute) and reading an element's inner text (via nodeValue).
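Both calls belong to PHP's built-in DOM extension rather than to phpQuery itself, because iterating over a pq() result hands back plain DOM elements. The sketch below shows the same two operations with DOMDocument alone, so it runs without the library; the inline markup is a shortened, hypothetical stand-in for test.html.

```php
<?php
// Shortened stand-in for one entry of test.html (illustrative only).
$html = '<div class="thumb"><a href="/Spiderman-City-Drive">'
      . '<img src="/thumb/12/Spiderman-City-Drive.jpg" alt="">'
      . '<span class="GameName">Spiderman City Drive</span></a></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

// getElementsByTagName() returns a DOMNodeList; item(0) is the first match.
$link = $dom->getElementsByTagName('a')->item(0);
$span = $dom->getElementsByTagName('span')->item(0);

$href = $link->getAttribute('href');   // attribute value
$name = $span->nodeValue;              // text content of the element

echo $href, "\n", $name, "\n";
```

For an element with nested children, nodeValue returns the concatenated text of all descendants, which is why it behaves like innerText here.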
Output:
array (size=3)
  'name' =>
    array (size=2)
      0 => string 'Spiderman City Drive' (length=20)
      1 => string 'Spiderman - City Raid' (length=21)
  'href' =>
    array (size=2)
      0 => string 'http://www.gahe.com/Spiderman-City-Drive' (length=40)
      1 => string 'http://www.gahe.com/Spiderman-City-Raid' (length=39)
  'img' =>
    array (size=2)
      0 => string 'http://www.gahe.com/thumb/12/Spiderman-City-Drive.jpg' (length=53)
      1 => string 'http://www.gahe.com/thumb/12/Spiderman-City-Raid.jpg' (length=52)
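The real strength is the pq() selector: its syntax closely follows jQuery's, which makes picking out elements very convenient.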
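To get a feel for what a class selector such as .GameName saves you from writing, here is a rough equivalent using only PHP's built-in DOMXPath. This is a swapped-in technique for comparison, not phpQuery's API; the classTest() helper is hypothetical, and the markup is a trimmed copy of test.html. It also groups the fields per game instead of building three parallel arrays.

```php
<?php
// Hypothetical helper: XPath predicate that mimics the CSS class selector ".$class".
function classTest(string $class): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
}

// Trimmed copy of test.html (ids and styles omitted for brevity).
$html = <<<HTML
<div class="thumb"><a href="/Spiderman-City-Drive"><img src="/thumb/12/Spiderman-City-Drive.jpg" alt=""><span class="GameName">Spiderman City Drive</span></a></div>
<div class="thumb"><a href="/Spiderman-City-Raid"><img src="/thumb/12/Spiderman-City-Raid.jpg" alt=""><span class="GameName">Spiderman - City Raid</span></a></div>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$games = [];
foreach ($xpath->query('//div[' . classTest('thumb') . ']') as $thumb) {
    // A leading "." keeps each query inside the current thumb element.
    $link = $xpath->query('.//a', $thumb)->item(0);
    $img  = $xpath->query('.//img', $thumb)->item(0);
    $name = $xpath->query('.//span[' . classTest('GameName') . ']', $thumb)->item(0);

    $games[] = [
        'name' => trim($name->nodeValue),
        'href' => $link->getAttribute('href'),
        'img'  => $img->getAttribute('src'),
    ];
}

print_r($games);
```

The one-word selector '.GameName' in pq() replaces the whole predicate built by classTest(), which is the convenience the jQuery-like syntax buys.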
