I'm a beginner. I'm trying to extract the full name from either of the lines below, without the "Obituary for" prefix.
<h2>Obituary for John Doe</h2> <h1>James Michael Lee</h1>
My regular expression looks like this:
(<h1>(.+?)<\/h1>|<h2>Obituary\sfor\s(.+?)<\/h2>)
I still get "Obituary for John Doe" back. How can I remove the "Obituary for" part?
P粉775788723 2024-04-02 10:10:01
Could you do something like this instead, using one simple regular expression plus ordinary string functions rather than a single complex pattern?

    <?php
    /**
     * @description Extracts names from HTML header tags
     * @example "<h2>Obituary for John Doe</h2><h1>James Michael Lee</h1>" -> ["John Doe", "James Michael Lee"]
     * @param string $html
     * @return string[] list of full names
     */
    function extractFullNames($html)
    {
        // Capture everything up to each closing </h1> or </h2>. The opening
        // tag is captured as well and is removed by strip_tags() below.
        $regex = '/(.*?)<\/h[1-2]>/';
        preg_match_all($regex, $html, $matches);
        $names = $matches[1];
        $names = array_map('trim', $names);
        $names = array_map('strip_tags', $names);
        // Normalise case so that "Obituary for" becomes "Obituary For",
        // which is the exact string removeObituary() looks for.
        $names = array_map('strtolower', $names);
        $names = array_map('ucwords', $names);
        $names = array_map('removeObituary', $names);
        return $names;
    }

    /**
     * @description Removes "Obituary For" if present
     * @example "Obituary For John Doe" -> "John Doe"
     * @param string $name
     * @return string name without "Obituary For"
     */
    function removeObituary($name)
    {
        $name = str_replace("Obituary For ", "", $name);
        return $name;
    }

    // Test cases
    $html = '<h2>Obituary for John Doe</h2> <h1>James Michael Lee</h1>';
    $names = extractFullNames($html);
    $expected = ['John Doe', 'James Michael Lee'];
    echo "Expected: " . implode(', ', $expected) . "\n";
    echo "Actual: " . implode(', ', $names);
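If the goal is to avoid regular expressions altogether, one possible variant is to let PHP's DOM extension parse the headings. This is only a sketch: the function name `extractFullNamesDom` is hypothetical, it assumes `ext-dom` is enabled, and it assumes the prefix is always spelled "Obituary for " (matched case-insensitively).

```php
<?php
// Hypothetical helper, not part of the answer above.
// Assumes the DOM extension (ext-dom) is available.
function extractFullNamesDom(string $html): array
{
    // DOMDocument::loadHTML() rejects empty input, so bail out early.
    if (trim($html) === '') {
        return [];
    }

    $prefix = 'Obituary for ';

    $doc = new DOMDocument();
    // Suppress libxml warnings about the fragment not being a full document.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $names = [];
    $xpath = new DOMXPath($doc);
    // The union query returns <h1> and <h2> nodes in document order.
    foreach ($xpath->query('//h1 | //h2') as $node) {
        $text = trim($node->textContent);
        // Drop the prefix only when the heading starts with it.
        if (stripos($text, $prefix) === 0) {
            $text = substr($text, strlen($prefix));
        }
        $names[] = $text;
    }
    return $names;
}

$html = '<h2>Obituary for John Doe</h2> <h1>James Michael Lee</h1>';
echo implode(', ', extractFullNamesDom($html)), "\n";
```

Unlike the strtolower/ucwords approach, this leaves the original capitalisation of each name untouched, which may matter for names such as "McDonald".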
P粉394812277 2024-04-02 09:09:58
There is more than one way to do this. You could, for example, use:
<h(?:1>|2>Obituary\sfor\s)\K[^><]+
See this demo at regex101. The matches will be in $out[0].
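\K resets the start of the reported match, so everything matched before it is left out of the result. For details, see the Stack Overflow Regular Expressions FAQ.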
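To make the role of \K concrete, here is a minimal sketch of how such a pattern might be used from PHP. It assumes the full pattern is `<h(?:1>|2>Obituary\sfor\s)\K[^><]+`; the wrapper function `extractNamesWithK` is a hypothetical name used only for illustration.

```php
<?php
// Hypothetical wrapper; the pattern is assumed to be the one from this answer.
function extractNamesWithK(string $html): array
{
    // Match "<h1>" or "<h2>Obituary for ", then \K discards what has been
    // matched so far from the reported match. [^><]+ then consumes the
    // text up to the next tag, which is the name itself.
    $pattern = '/<h(?:1>|2>Obituary\sfor\s)\K[^><]+/';
    preg_match_all($pattern, $html, $out);

    // No capture groups are needed: the full matches are already the names.
    return $out[0];
}

$html = '<h2>Obituary for John Doe</h2> <h1>James Michael Lee</h1>';
print_r(extractNamesWithK($html));
```

Because the prefix is consumed before \K, both alternatives report their result in the same place, which avoids having to check two different capture groups as the pattern in the question would require.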