How to Replace Text URLs with Hyperlinks While Excluding URLs within HTML Tags?

Mary-Kate Olsen
2024-10-28 06:28:02

Overcoming the Challenge of Replacing Text URLs while Excluding URLs within HTML Tags

Problem: Converting plain-text URLs into hyperlinks is straightforward until the same document also contains URLs inside HTML tags, such as existing links or image sources. The goal here is to wrap URLs that appear as text in anchor tags while leaving alone URLs that already sit inside anchor elements or image src attributes.

Solution:

The key to addressing this issue lies in using an XPath expression to select only those text nodes that contain URLs but are not descendants of anchor elements.

Here's a refined version of the XPath expression:

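// Parse the markup first; $html is assumed to hold the HTML string to process.
$dom = new DOMDocument();
$dom->loadHTML($html);

$xPath = new DOMXPath($dom);
// Select text nodes that mention a URL scheme and are not inside an <a> element.
$texts = $xPath->query(
    '/html/body//text()[
        not(ancestor::a) and (
        contains(.,"http://") or
        contains(.,"https://") or
        contains(.,"ftp://") )]'
);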

This expression effectively excludes text nodes that are contained within anchor tags, ensuring that only plain text URLs are targeted for conversion.
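To see which nodes the expression actually picks up, the following minimal sketch runs it against a small, made-up snippet. The sample markup and the `$selected` array are assumptions for illustration only; the XPath expression is the one shown above.

```php
<?php
// Hypothetical sample markup: one linked URL, one bare URL, one image URL.
$html = '<html><body><p>See <a href="http://example.com/1">http://example.com/1</a>, '
      . 'then http://example.com/2 and <img src="http://example.com/foo.png"/></p></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xPath = new DOMXPath($dom);
$texts = $xPath->query(
    '/html/body//text()[
        not(ancestor::a) and (
        contains(.,"http://") or
        contains(.,"https://") or
        contains(.,"ftp://") )]'
);

// Collect the content of every selected text node for inspection.
$selected = [];
foreach ($texts as $text) {
    $selected[] = trim($text->data);
}

print_r($selected);
```

Only the text node containing http://example.com/2 should be selected here. The anchor's text is filtered out by not(ancestor::a), and the image URL never qualifies because an attribute value is not a text node.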

Replacing Text URLs without Affecting Image URLs:

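URLs inside attributes, such as an image's src, are never selected in the first place: attribute values are not text nodes, so text() does not match them. What remains is swapping each selected text node for its linked version. Instead of splitting the text node apart around every URL, a non-standard but efficient approach is used: the modified markup is parsed into a document fragment with appendXML(), and that fragment replaces the entire text node.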

Here's the code that performs this task:

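foreach ($texts as $text) {
    // Build a fragment from the text with each URL wrapped in an anchor tag.
    $fragment = $dom->createDocumentFragment();
    $fragment->appendXML(
        preg_replace(
            "~((?:http|https|ftp)://(?:\S*?\.\S*?))(?=\s|\;|\)|\]|\[|\{|\}|,|\"|'|:|\<|$|\.\s)~i",
            '<a href="$1">$1</a>',
            $text->data
        )
    );
    // Swap the original text node for the fragment in one step.
    $text->parentNode->replaceChild($fragment, $text);
}

// Serialize the modified document.
echo $dom->saveXML($dom->documentElement);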

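In this code, preg_replace searches each text node for http, https, or ftp URLs that contain at least one dot, and wraps every match in an anchor tag whose href and link text are both the matched URL. The lookahead at the end of the pattern stops the match at whitespace, at common delimiters such as brackets, quotes, commas, and colons, or at a sentence-ending period, so trailing punctuation stays outside the link.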
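The behavior of the pattern is easier to check on a plain string, away from the DOM. The sketch below applies the same regular expression to a made-up sentence; the `$input` text is an assumption, and the replacement string uses the `$1` backreference to reuse the captured URL as both href and link text.

```php
<?php
// Same pattern as in the loop above.
$pattern = "~((?:http|https|ftp)://(?:\S*?\.\S*?))(?=\s|\;|\)|\]|\[|\{|\}|,|\"|'|:|\<|$|\.\s)~i";

// Hypothetical input: one URL ends a sentence, the other sits in parentheses.
$input = 'Docs are at http://example.com/docs. Mirror (ftp://mirror.example.org/pub) too.';

// $1 refers to the first capture group, i.e. the URL itself.
$output = preg_replace($pattern, '<a href="$1">$1</a>', $input);

echo $output, PHP_EOL;
```

In this input the sentence-ending period and the closing parenthesis both stay outside the generated links, which is the job of the lookahead.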

Example:

Consider the following HTML:

<html>
<body>
<p>
    This is a text with a <a href="http://example.com/1">link</a>
    and another <a href="http://example.com/2">http://example.com/2</a>
    and also another http://example.com with the latter being the
    only one that should be replaced. There is also images in this
    text, like <img src="http://example.com/foo"/> but these should
    not be replaced either. In fact, only URLs in text that is no
    a descendant of an anchor element should be converted to a link.
</p>
</body>
</html>

Applying the above solution will convert the text URLs to anchor tags while leaving the image URL untouched, producing the following output:

<html><body>
<p>
    This is a text with a <a href="http://example.com/1">link</a>
    and another <a href="http://example.com/2">http://example.com/2</a>
    and also another <a href="http://example.com">http://example.com</a> with the latter being the
    only one that should be replaced. There is also images in this
    text, like <img src="http://example.com/foo"/> but these should
    not be replaced either. In fact, only URLs in text that is no
    a descendant of an anchor element should be converted to a link.
</p>
</body></html>
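One caveat: appendXML() parses its argument as XML, so a text node containing a raw ampersand (for example a URL with a query string such as ?a=1&b=2) is not well-formed input and the fragment cannot be built. A possible alternative, sketched below, is to split the text around the URLs and create the nodes through the DOM so that escaping is handled automatically. The helper name linkifyTextNode and the sample markup are hypothetical, and the pattern is the one from the article rewritten with a character class in a single-quoted string.

```php
<?php
// Hypothetical helper: replaces one text node with a mix of text nodes and
// <a> elements, without parsing any XML string.
function linkifyTextNode(DOMDocument $dom, DOMText $text, string $pattern): void
{
    // PREG_SPLIT_DELIM_CAPTURE keeps the captured URLs in the result, so the
    // pieces alternate: plain text, URL, plain text, URL, ...
    $parts = preg_split($pattern, $text->data, -1, PREG_SPLIT_DELIM_CAPTURE);

    $fragment = $dom->createDocumentFragment();
    foreach ($parts as $i => $part) {
        if ($part === '') {
            continue; // skip empty pieces, e.g. when the text starts with a URL
        }
        if ($i % 2 === 1) {
            // Odd positions hold captured URLs: build <a href="URL">URL</a>.
            $a = $dom->createElement('a');
            $a->setAttribute('href', $part);
            $a->appendChild($dom->createTextNode($part));
            $fragment->appendChild($a);
        } else {
            // Even positions hold the surrounding plain text.
            $fragment->appendChild($dom->createTextNode($part));
        }
    }
    $text->parentNode->replaceChild($fragment, $text);
}

// Same alternatives as the article's lookahead, grouped into a character class.
$pattern = '~((?:http|https|ftp)://\S*?\.\S*?)(?=\s|[;)\]\[{},"\':<]|$|\.\s)~i';

$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Visit http://example.com/?a=1&amp;b=2 today</p></body></html>');

$xPath = new DOMXPath($dom);
foreach ($xPath->query('/html/body//text()[not(ancestor::a) and contains(.,"http://")]') as $text) {
    linkifyTextNode($dom, $text, $pattern);
}

$result = $dom->saveXML($dom->documentElement);
echo $result, PHP_EOL;
```

This trades the single appendXML() call for a few more DOM operations per text node, but it keeps working when the text contains characters that are special in XML.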
