How to preserve HTML tags in nodeValue when scraping with PHP XPath

WBOY
Original
2016-06-06 20:29:45 (1783 views)

The code is as follows:

<code>$html = <<<EOF
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Test</title>
</head>
<body>
<div id="content">
  <p>
    <span>
      abcdefghijklmn<br>opqrstuvwxyz
    </span>
  </p>
</div>
</body>
</html>
EOF;
// create document object model
$dom = new DOMDocument();
// load html into document object model (@ suppresses parser warnings)
@$dom->loadHTML($html);
// create domxpath instance
$xPath = new DOMXPath($dom);
// select the span inside the paragraph of the element with id="content"
$elements = $xPath->query('//*[@id="content"]/p/span');
// nodeValue returns only the text content, so the <br> is lost here
$content = $elements->item(0)->nodeValue;
echo $content;</code>

The <br> inside the content gets stripped. Is there some operation, for example something like $e->innerHtml, that keeps the HTML tags?

Update (8.18):

<code>$html = <<<EOF
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Test</title>
</head>
<body>
<div id="content">
  <p>
    <span class="aaa">
      abcdefghijklmn<br><span>opq</span>rstuvwxyz
    </span>
  </p>
</div>
</body>
</html>
EOF;

// create document object model
$dom = new DOMDocument();
// load html into document object model (@ suppresses parser warnings)
@$dom->loadHTML($html);
// create domxpath instance
$xPath = new DOMXPath($dom);
// select the span inside the paragraph of the element with id="content"
$elements = $xPath->query('//*[@id="content"]/p/span');
$nodeName = $elements->item(0)->nodeName;
// $content = $elements->item(0)->nodeValue;
$content = $dom->saveXml($elements->item(0));
// the saveHtml() result overwrites the saveXml() one above
$content = $dom->saveHtml($elements->item(0));
// strip the outer opening and closing tag so that only the inner HTML remains
$content = preg_replace(array("#^<{$nodeName}.*>#isU", "#</{$nodeName}>$#isU"), array('', ''), $content);
echo $content;</code>

Replies:

I found a way myself...

<code>$content = $elements->item(0)->nodeValue;

// >> change to >>

$content = $dom->saveXml($elements->item(0));</code>
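
Note that saveXml() on the matched node returns the node's outer markup, including the enclosing span tag itself. If only the inner markup is wanted (closer to what innerHtml would give), one possible alternative to the regex in the update is to serialize each child node separately. The following is only a sketch under a few assumptions: `innerHtml()` is a hypothetical helper name, not a method of the DOM extension, and `libxml_use_internal_errors()` is used here in place of the `@` operator from the original code.

```php
<?php
// Hypothetical helper (not part of PHP's DOM extension): concatenates the
// serialized markup of every child of $node, leaving out $node's own tag.
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML($child) serializes a single node, tags included
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$html = <<<EOF
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Test</title>
</head>
<body>
<div id="content">
  <p>
    <span class="aaa">
      abcdefghijklmn<br><span>opq</span>rstuvwxyz
    </span>
  </p>
</div>
</body>
</html>
EOF;

$dom = new DOMDocument();
// collect parser warnings internally instead of silencing them with @
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xPath = new DOMXPath($dom);
$span = $xPath->query('//*[@id="content"]/p/span')->item(0);

// inner markup of the matched span, surrounding whitespace trimmed for display
$inner = innerHtml($span);
echo trim($inner), PHP_EOL;
```

With this approach the `<br>` and the nested `<span>` stay in the result, while the outer `<span class="aaa">` wrapper is left out without any regex on the serialized string.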