How to Use Regex to Avoid Modifying Text Inside HTML Tags During Replacement?

Barbara Streisand
2024-12-01 22:54:11

Regex to Match Outside of HTML Tags for Selective Tagging

When preg_replace is used to wrap specific words in an HTML page with tags, the regular expression must be written so that it does not match inside the HTML tags themselves.

Original Pattern:

preg_replace("/(asf|gfd|oyws)/", '<span>$1</span>', $html); // $html is a placeholder name for the page markup

Weakness:

The pattern above also matches the target words when they appear inside HTML tags, for example in attribute values, which corrupts the markup.
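The weakness can be reproduced with a short sketch. The sample anchor tag, the target word "yasar", and the plain <span> replacement are assumptions chosen for illustration:

```php
<?php
// Hypothetical sample markup: the target word appears once inside
// the alt attribute and once as the visible link text.
$html = '<a href="example.com" alt="yasar home page">yasar</a>';

// Naive pattern: it has no notion of tag boundaries.
$naive = preg_replace('/(yasar)/', '<span>$1</span>', $html);

echo $naive, PHP_EOL;
// <a href="example.com" alt="<span>yasar</span> home page"><span>yasar</span></a>
```

Both occurrences are wrapped, so the alt attribute now contains a nested tag.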

Enhanced Pattern:

/(asf|foo|barr)(?=[^>]*(<|$))/

Breakdown:

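  • (asf|foo|barr): Matches one of the target words and captures it.
  • (?=...): Lookahead assertion. It checks the text that follows the match without consuming it.
  • [^>]*: Matches zero or more characters other than the closing angle bracket (>).
  • (<|$): Requires that run of characters to end at an opening angle bracket (<) or at the end of the string.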

How it Works:

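This pattern matches a target word only if the next angle bracket after it is an opening bracket (<), or if no angle bracket follows at all. A word inside a tag is normally followed by that tag's closing bracket (>) first, so the lookahead fails there and matching is restricted to text outside of tags. The approach assumes that text content contains no literal > and that attribute values contain no literal <.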
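To see which occurrences the lookahead accepts, the following sketch lists every match with its byte offset. The sample string is an assumption made up for illustration; the pattern is the enhanced pattern from this article:

```php
<?php
$pattern = '/(asf|foo|barr)(?=[^>]*(<|$))/';

// Hypothetical sample: "foo" appears inside the class attribute
// and again in the text content.
$html = '<div class="foo">asf and foo</div>';

// PREG_OFFSET_CAPTURE stores each match as [text, byte offset].
preg_match_all($pattern, $html, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[1] as [$word, $offset]) {
    echo "$word at offset $offset", PHP_EOL;
}
// asf at offset 17
// foo at offset 25
```

The "foo" at offset 12 sits inside the opening tag, where the next angle bracket is >, so it is skipped.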

Example:

Consider the following HTML:

<a href="example.com" alt="yasar home page">yasar</a>

With "yasar" as the target word, the enhanced pattern matches and tags the link text, while the occurrence inside the alt attribute of the anchor tag is left untouched.
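A minimal sketch of this example in PHP follows. It assumes the replacement wraps each match in a plain <span>; the original replacement string is not fully shown, so the real markup may differ:

```php
<?php
$html = '<a href="example.com" alt="yasar home page">yasar</a>';

// Enhanced pattern with "yasar" as the only target word.
// Assumption: each match is wrapped in a bare <span>.
$result = preg_replace('/(yasar)(?=[^>]*(<|$))/', '<span>$1</span>', $html);

echo $result, PHP_EOL;
// <a href="example.com" alt="yasar home page"><span>yasar</span></a>
```

For markup that may contain literal angle brackets in text or attribute values, an HTML parser is likely a more robust choice than a regular expression.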
