How to Highlight Keywords in HTML While Ignoring Tags?

Mary-Kate Olsen
2024-11-12 22:40:02

How to Ignore HTML Tags in preg_replace

In your code snippet, you use preg_replace to highlight searched keywords within HTML text. A regular expression cannot tell markup from text, so the approach breaks the HTML whenever the keyword also occurs inside a tag, for example in a tag name or an attribute value.
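
To make the problem concrete, here is a minimal sketch. The input string and the choice of keyword are assumptions made up for illustration; the point is only that the pattern also matches inside the class attribute.

```php
<?php
// Hypothetical input: the keyword "class" also appears as an attribute name.
$html   = '<p class="intro">A class act</p>';
$search = 'class';

// Naive highlighting: every occurrence is wrapped, including the one in the tag.
$naive = preg_replace(
    '/' . preg_quote($search, '/') . '/',
    '<span class="search_hightlight">$0</span>',
    $html
);

// The opening <p> tag now contains a <span>, so the markup is no longer valid.
echo $naive, PHP_EOL;
```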

Instead of using regular expressions, it is recommended to leverage XPath and DOMDocument for this task. Consider the following approach:

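  1. Create a DOMDocument Object: Parse the HTML text into a DOMDocument object using loadXML. This requires the input to be well-formed XML; for ordinary HTML that is not, loadHTML is the more forgiving parser.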
  2. Use DOMXPath for Search: Create a DOMXPath object and use it to query for elements containing the search term.
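  3. Ignore HTML Tags in Search: contains(., ...) tests an element's string value, which is its text content with all tags stripped, so tag names and attributes never match. An XPath expression like //*[contains(., "{$search}")]/*[FALSE = contains(., "{$search}")]/.. selects the elements that contain the search text and have at least one child element that does not contain it. In XPath 1.0 the comparison FALSE = contains(...) behaves like not(contains(...)).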
  4. Process Search Results: Extract the matching text nodes and wrap them in the desired tags dynamically.
  5. Save the Modified HTML: Save the updated DOMDocument back to the HTML string.
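
The steps above can be sketched with built-in DOM classes alone for the simpler case where each match lies inside a single text node. This is an illustrative variant, not the article's method: the sample markup and keyword are assumptions, and a keyword that spans several nodes (such as text interrupted by a tag) is not handled here.

```php
<?php
// Assumed sample input; it must be well-formed XML for loadXML().
$str    = '<html><body><p title="some text">Some text with <b>bold text</b>.</p></body></html>';
$search = 'text';

$doc = new DOMDocument();
$doc->loadXML($str);
$xp = new DOMXPath($doc);

// Snapshot the text nodes first, because the loop below modifies the tree.
$textNodes = iterator_to_array($xp->query('//body//text()'));

foreach ($textNodes as $text) {
    // strpos()/strlen() count bytes while splitText() counts characters,
    // so this sketch assumes ASCII content.
    while (false !== ($pos = strpos($text->data, $search))) {
        $match = $text->splitText($pos);              // $match starts at the keyword
        $rest  = $match->splitText(strlen($search));  // $rest is what follows it

        $span = $doc->createElement('span');
        $span->setAttribute('class', 'search_hightlight');
        $match->parentNode->replaceChild($span, $match);
        $span->appendChild($match);

        $text = $rest; // keep scanning the remainder of the original node
    }
}

// Serialize only the root element, which omits the XML declaration.
$result = $doc->saveXML($doc->documentElement);
echo $result, PHP_EOL;
```

Because only text nodes are visited, the occurrence of the keyword in the title attribute is left untouched.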

Code Example:

$str = '...'; // HTML String
$search = 'text that span';

$doc = new DOMDocument;
$doc->loadXML($str);
$xp = new DOMXPath($doc);

$anchor = $doc->getElementsByTagName('body')->item(0);
if (!$anchor) {
    throw new Exception('Anchor element not found.');
}

$r = $xp->query('//*[contains(., "'.$search.'")]/*[FALSE = contains(., "'.$search.'")]/..', $anchor);
if (!$r) {
    throw new Exception('XPath failed.');
}

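foreach ($r as $node) {
    // TextRange is a user-defined helper, not part of PHP and not shown here.
    // It is assumed to treat a list of text nodes as one string (hence the
    // string cast in strpos) and to be splittable at a character offset.
    $textNodes = $xp->query('.//child::text()', $node);
    $range = new TextRange($textNodes);
    $ranges = []; // reset for every matched element
    while (FALSE !== $start = strpos($range, $search)) {
        $base = $range->split($start);          // assumed: $base begins at the match
        $range = $base->split(strlen($search)); // assumed: $range is the text after it
        $ranges[] = $base;
    }

    // Wrap each text node of every match in its own highlight span.
    foreach ($ranges as $match) {
        foreach ($match->getNodes() as $textNode) {
            $span = $doc->createElement('span');
            $span->setAttribute('class', 'search_hightlight');
            $textNode = $textNode->parentNode->replaceChild($span, $textNode);
            $span->appendChild($textNode);
        }
    }
}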

echo $doc->saveXML();
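
The loop above depends on a TextRange class that PHP does not provide and that the article does not show. The following is a hypothetical minimal version, inferred only from how the loop calls it (string cast, split, getNodes); the real helper may differ. It assumes ASCII text, since it measures lengths in bytes.

```php
<?php
// Hypothetical helper: a run of text nodes treated as one string.
final class TextRange
{
    /** @var DOMText[] */
    private $nodes = [];

    public function __construct(iterable $nodes)
    {
        foreach ($nodes as $node) {
            if ($node instanceof DOMText) {
                $this->nodes[] = $node;
            }
        }
    }

    // Concatenated text of all nodes, so strpos($range, ...) works via string cast.
    public function __toString(): string
    {
        $text = '';
        foreach ($this->nodes as $node) {
            $text .= $node->data;
        }
        return $text;
    }

    public function getNodes(): array
    {
        return $this->nodes;
    }

    // Truncates this range to its first $offset characters and returns the
    // remainder as a new range, splitting a DOM text node if the cut is inside it.
    public function split(int $offset): self
    {
        $head = [];
        $tail = [];
        foreach ($this->nodes as $node) {
            if ($offset <= 0) {          // already past the cut
                $tail[] = $node;
                continue;
            }
            $length = strlen($node->data); // byte length; assumes ASCII
            if ($offset < $length) {
                $tail[] = $node->splitText($offset);
            }
            $head[] = $node;
            $offset -= $length;
        }
        $this->nodes = $head;
        return new self($tail);
    }
}

// Usage: a keyword that spans three text nodes (sample markup is an assumption).
$demo = new DOMDocument();
$demo->loadXML('<p>some <b>text</b> that <i>span</i>s nodes</p>');
$all   = new TextRange((new DOMXPath($demo))->query('//text()'));
$match = $all->split(strpos($all, 'text that span'));
$rest  = $match->split(strlen('text that span'));
```

After these calls, $match holds exactly the text nodes that make up the keyword, which is what the highlighting loop then wraps node by node.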

This approach allows you to effectively highlight search terms while disregarding HTML tags, preserving the structural integrity of your HTML content.
