Replace text in a string and ignore matches inside HTML tags

Question

For a given string (usually a paragraph), I want to replace certain words/phrases, but ignore them if they are surrounded by tags in any way. This also needs to be case-insensitive. Take this for example: You can find a link here link and a lot of things in different styles. Public platform can appear in bold:

P粉594941301 · Answer

I found a mention of regex negative lookahead and, after racking my brain over it, I came up with this regex (assuming your HTML tags are VALIDLY paired)

// made function a bit ugly just to try to show how it comes together
public function replaceTextOutsideTags($sourceText = null, $toReplace = 'inner text', $dummyText = '(REPLACED TEXT HERE)')
{
  $string = $sourceText ?? "Inner text
  You can find a link here link and a lot 
  of things in different styles. Public platform can appear in bold: 
  public platform, and we also have italics here too: italics. 
  While I like soft pillows I am picky about soft pillows. 
  While I want to find fox, I din't want foxes to show up.
  The text \"shiny fruits\" is in a span tag:  one of the shiny fruits.
  The inner text like this inner inner text  here to test too, event inner text
  omg thats sad... or not
  ";
  // it would be nice to use [[:punct:]], but that class also treats < and > as punctuation
  $punctuation = "\.,!\?:;\|\/=\"#"; // this part might need additional attention, but you get the point
  $stringPart = "\b$toReplace\b";
  $excludeSequence = "(?![\w\n\s>$punctuation]*?";
  $excludeOutside = "$excludeSequence<\/)"; // note the closing )
  $excludeTag = "$excludeSequence>)"; // note the closing )
  $pattern = "/" . $stringPart . $excludeOutside . $excludeTag . "/im";
  
  return preg_replace($pattern, $dummyText, $string);
}

Example output with the default parameters

"""
     (REPLACED TEXT HERE)

     You can find a link here link and a lot 

     of things in different styles. Public platform can appear in bold: 

     public platform, and we also have italics here too: italics. 

     While I like soft pillows I am picky about soft pillows. 

     While I want to find fox, I din't want foxes to show up.

     The text "shiny fruits" is in a span tag:  one of the shiny fruits.

     The (REPLACED TEXT HERE) like this inner inner text  here to test too, event (REPLACED TEXT HERE)

     omg thats sad... or not     
     """

Now step by step

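No match if the text continues into a longer word (if the text says pillowS, we don't want to match pillow); this is what the `\b` word boundaries are for.
If the text is followed by a sequence of `\w` word characters, `\s` whitespace, `\n` newlines and the allowed punctuation that ends at the start of a closing tag (`</`), we don't want that match; this is where the negative lookahead `(?![\w\n\s>$punctuation]*?</)` comes in. Here we can be sure the lookahead will not run into a new tag, because `<` is not part of the described sequence (the `$excludeOutside` variable), whatever its length.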


The `$excludeTag` variable is basically the same as `$excludeOutside`, but it covers the case where `$toReplace` could be an HTML tag name itself, for example `a`.
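To see the two lookaheads working together, here is a minimal sketch on a short input. The helper name `buildOutsideTagsPattern`, the sample string and the use of `preg_quote` are assumptions added for illustration and are not part of the original answer; the pattern pieces follow the variables described above.

```php
<?php
// Hypothetical helper: assembles the pattern from the pieces described above.
function buildOutsideTagsPattern(string $toReplace): string
{
    $punctuation = "\.,!\?:;\|\/=\"#";
    // Any run of word characters, whitespace, newlines, > and punctuation (never <) ...
    $excludeSequence = "(?![\w\n\s>$punctuation]*?";
    // ... that ends at a closing tag: the text sits inside an element.
    $excludeOutside = "$excludeSequence<\/)";
    // ... that ends at >: the text sits inside the tag itself (name or attribute).
    $excludeTag = "$excludeSequence>)";

    // preg_quote is an addition here, so that special characters in the
    // search text are not interpreted as regex syntax.
    return "/\b" . preg_quote($toReplace, "/") . "\b" . $excludeOutside . $excludeTag . "/im";
}

$html = "Soft pillows are nice, <b>soft pillows</b> in bold are kept.";
$result = preg_replace(buildOutsideTagsPattern("soft pillows"), "[X]", $html);

echo $result, "\n";
// prints: [X] are nice, <b>soft pillows</b> in bold are kept.
```

The first occurrence is followed by an opening tag, so neither lookahead applies and it is replaced; the second is immediately followed by `</b>`, so the first lookahead rejects it.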


Please note that this code cannot handle text that contains `<` or `>`; using these symbols may cause unexpected behavior.
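The caveat can be reproduced with a small sketch. The sample sentences are made up, and writing the symbol as the entity `&gt;` is only one possible workaround, assumed here for illustration rather than taken from the answer.

```php
<?php
// Same pattern pieces as in the answer; the search text is hard-coded for brevity.
$punctuation = "\.,!\?:;\|\/=\"#";
$excludeSequence = "(?![\w\n\s>$punctuation]*?";
$pattern = "/\bsoft pillows\b" . $excludeSequence . "<\/)" . $excludeSequence . ">)/im";

// A bare > later in the text looks like the end of a tag to the second lookahead,
// so this occurrence is skipped even though it is not inside any tag.
$withBareSymbol = preg_replace($pattern, "[X]", "I like soft pillows > hard ones");

// Written as an entity, the symbol no longer trips the lookahead.
$withEntity = preg_replace($pattern, "[X]", "I like soft pillows &gt; hard ones");

echo $withBareSymbol, "\n"; // prints: I like soft pillows > hard ones
echo $withEntity, "\n";     // prints: I like [X] &gt; hard ones
```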
