Replace text in a string while ignoring matches inside HTML tags

Question

For a given string (usually a paragraph), I want to replace some words/phrases, but ignore them if they happen to be surrounded by tags in any way. The matching also needs to be case-insensitive. Take the following, for example: You can find a link here and a lot of things in different styles. Public platform can appear in bold:

P粉594941301 · Answer

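I came across mentions of regex negative lookahead and, after racking my brain, ended up with this regex (assuming your HTML tags are VALID, that is, properly paired):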

// the function is deliberately verbose to show how the pattern comes together
function replaceTextOutsideTags($sourceText = null, $toReplace = 'inner text', $dummyText = '(REPLACED TEXT HERE)')
{
  // sample text; in the original answer some phrases are wrapped in HTML tags (link, bold, italics, span)
  $string = $sourceText ?? "Inner text
  You can find a link here link and a lot 
  of things in different styles. Public platform can appear in bold: 
  public platform, and we also have italics here too: italics. 
  While I like soft pillows I am picky about soft pillows. 
  While I want to find fox, I din't want foxes to show up.
  The text \"shiny fruits\" is in a span tag:  one of the shiny fruits.
  The inner text like this inner inner text  here to test too, event inner text
  omg thats sad... or not
  ";
  // it would be nice to use [[:punct:]], but that POSIX class also contains < and >
  $punctuation = "\.,!\?:;\|\/=\"#"; // this list may need extending, but you get the point
  $stringPart = "\b$toReplace\b"; // whole words only
  $excludeSequence = "(?![\w\n\s>$punctuation]*?";
  // the two endings below follow the step-by-step explanation: "<\/" opens a closing tag, ">" ends a tag
  $excludeOutside = "$excludeSequence<\/)"; // note the closing )
  $excludeTag = "$excludeSequence>)"; // note the closing )
  $pattern = "/" . $stringPart . $excludeOutside . $excludeTag . "/im";
  
  return preg_replace($pattern, $dummyText, $string);
}

Sample output with the default parameters

"""
     (REPLACED TEXT HERE)
     You can find a link here link and a lot 
     of things in different styles. Public platform can appear in bold: 
     public platform, and we also have italics here too: italics. 
     While I like soft pillows I am picky about soft pillows. 
     While I want to find fox, I din't want foxes to show up.
     The text "shiny fruits" is in a span tag:  one of the shiny fruits.
     The (REPLACED TEXT HERE) like this inner inner text  here to test too, event (REPLACED TEXT HERE)
     omg thats sad... or not     
     """

Now step by step

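No matches that continue into a longer word (if the text only contains pillowS, we do not want to match pillow). The \b word boundaries in $stringPart take care of this.
If the text is followed by any number of \w word characters, \s whitespace or newlines, and the allowed punctuation, and that run ends with the start of a closing tag, we do not want this match. This is where the negative lookahead (?![\w\n\s>$punctuation]*?<\/) comes in. We can be sure the lookahead does not run into a new tag, because < is not part of the described sequence ($excludeOutside variable).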


The $excludeTag variable is basically the same as $excludeOutside, but it covers the case where $toReplace can itself be an HTML tag name, for example a
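The steps above can be tried on two short inputs. This is only a sketch: `buildOutsideTagsPattern` is a hypothetical helper name, and the lookahead endings `<\/` and `>` are an assumption taken from the step-by-step description, not a confirmed copy of the answer's code.

```php
<?php
// Hypothetical helper that assembles the pattern described in the steps above.
// Assumption: $excludeOutside ends in "<\/" and $excludeTag ends in ">".
function buildOutsideTagsPattern(string $toReplace): string
{
    $punctuation = "\.,!\?:;\|\/=\"#";
    $excludeSequence = "(?![\w\n\s>$punctuation]*?";
    $excludeOutside = "$excludeSequence<\/)"; // following text must not lead into a closing tag
    $excludeTag = "$excludeSequence>)";       // following text must not lead into a ">"
    return "/\b$toReplace\b" . $excludeOutside . $excludeTag . "/im";
}

// Plain occurrences are replaced, the one wrapped in <b>...</b> is kept.
$text = 'Inner text and <b>inner text</b> plus inner text.';
$resultText = preg_replace(buildOutsideTagsPattern('inner text'), '(REPLACED TEXT HERE)', $text);
echo $resultText, "\n";
// (REPLACED TEXT HERE) and <b>inner text</b> plus (REPLACED TEXT HERE).

// The search term is also a tag name: neither "<a", "</a>" nor the link text is touched.
$link = '<a href="#">a link</a> and a dog';
$resultLink = preg_replace(buildOutsideTagsPattern('a'), 'X', $link);
echo $resultLink, "\n";
// <a href="#">a link</a> and X dog
```

In the second input the `a` inside `</a>` passes the first lookahead (no closing tag follows it) and is only rejected by the second one, which is why both lookaheads are needed.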


Note that this code cannot handle text that itself contains < or >; using these symbols may lead to unexpected behavior
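To make this caveat concrete, here is a small sketch of two inputs that trip the pattern up. As before, `buildOutsideTagsPattern` is a hypothetical helper, and the lookahead endings `<\/` and `>` are assumed from the explanation rather than confirmed.

```php
<?php
// Hypothetical helper; same assumed pattern as in the answer's explanation.
function buildOutsideTagsPattern(string $toReplace): string
{
    $punctuation = "\.,!\?:;\|\/=\"#";
    $excludeSequence = "(?![\w\n\s>$punctuation]*?";
    return "/\b$toReplace\b" . "$excludeSequence<\/)" . "$excludeSequence>)" . "/im";
}

$pattern = buildOutsideTagsPattern('inner text');

// A literal ">" later in plain text looks like the end of a tag,
// so the match is skipped although no tag is involved.
$withGt = 'inner text is used when 3 > 2';
$resultGt = preg_replace($pattern, 'X', $withGt);
echo $resultGt, "\n"; // inner text is used when 3 > 2

// A literal "<" stops the lookahead before it can reach "</b>",
// so the match is replaced although it sits inside a tag.
$withLt = '<b>inner text when 1 < 2</b>';
$resultLt = preg_replace($pattern, 'X', $withLt);
echo $resultLt, "\n"; // <b>X when 1 < 2</b>
```

If such characters can occur in the text, encoding them as `&lt;` and `&gt;` before running the replacement would probably avoid both cases, since the pattern only reacts to the raw symbols.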
