
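Web crawler - How do I parse HTML content with XPath in PHP?

I have looked through a lot of material online, but all of it covers using XPath in PHP to parse XML. Does PHP have any functions or libraries that can parse HTML? Thanks.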

天蓬老师 · 2774 days ago · 432

Replies (2)

  • 伊谢尔伦

    伊谢尔伦 2017-04-10 14:55:16

    Just use zend-dom, it is much more convenient!
    http://framework.zend.com/manual/2.3/en/modules/zend.dom.query.html
    I assume you don't need to be shown how to include it?

  • 怪我咯

    怪我咯 2017-04-10 14:55:16

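    // fetch the page and keep the response body as a string
    $url = 'http://www.baidu.com';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // make curl_exec() return the body instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $html = curl_exec($ch);
    if ($html === false) {
      die('curl error: ' . curl_error($ch));
    }
    curl_close($ch);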
    
    // create document object model
    $dom = new DOMDocument();
    // load html into document object model
    @$dom->loadHTML($html);
    // create domxpath instance
    $xPath = new DOMXPath($dom);
    // select the src attribute of every img directly under the element with id "lg", then print each value
    $elements = $xPath->query('//*[@id="lg"]/img/@src');
    foreach ($elements as $e) {
      echo ($e->nodeValue);
    }

    Something roughly like this.
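
    To try the same DOMDocument + DOMXPath approach without a network request, here is a minimal sketch that runs against an inline HTML string. The helper name xpathValues and the sample markup are made up for illustration, and it swaps the @ operator for libxml_use_internal_errors() so that parse warnings from messy real-world HTML are collected rather than printed.

    ```php
    <?php
    // Hypothetical helper (not from the answer above): returns the string
    // value of every node matched by an XPath query against an HTML string.
    function xpathValues(string $html, string $query): array
    {
        // collect libxml parse warnings instead of suppressing them with @
        $previous = libxml_use_internal_errors(true);
        $dom = new DOMDocument();
        $dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xpath = new DOMXPath($dom);
        $values = [];
        foreach ($xpath->query($query) as $node) {
            $values[] = $node->nodeValue;
        }
        return $values;
    }

    // Made-up markup standing in for a fetched page.
    $html = '<html><body><div id="lg"><img src="logo.png"></div>'
          . '<a href="/a">A</a><a href="/b">B</a></body></html>';

    print_r(xpathValues($html, '//*[@id="lg"]/img/@src')); // src of the img inside #lg
    print_r(xpathValues($html, '//a/@href'));              // every link target
    ```

    The same helper should work on the $html string returned by curl_exec(); only the query needs to change to match the page being scraped.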
