Regular expression to exclude text between brackets from matching

WBOY
2024-02-09 21:51:09

Regular expressions are a widely used string processing tool. This article looks at how to match a token while skipping any text enclosed in brackets, using a regex engine that does not support lookahead.

Question content

Given the following text:

{field1} == value1 && {field2} == value2  && ({field3} == value3 && {field4} == value4) && {field5} == value5

I'm trying to create a regex that matches every && in that text but excludes anything between the brackets (so the && between value3 and field4 should be ignored). I've been able to do this with the following regex, which works and does what I need: (\&{2})(?![^\(]*\)). The problem is that I'm using Go, whose regexp package doesn't support negative lookahead. Is there a way to do this without negative lookahead? Parentheses cannot be nested.

Basically I want to split by && but ignore what is between the brackets and get something like:

[&&, &&, &&]
[{field1} == value1, {field2} == value2, ({field3} == value3 && {field4} == value4), {field5} == value5]

Thanks!

Workaround

One technique is to match everything you don't want without capturing it, and to match and capture everything you do want (that is, save it to a capture group). The following regular expression does this.

\([^)]*\)|(&&)

This reads: "match a string enclosed in parentheses, or (|) match the string "&&" and save it to capture group 1". The idea is to ignore (in code) any match that did not capture anything.

For the following string, the matches (marked with "m") and captures (marked with "c") are shown beneath it.

{f1} == v1 && {f2} == v2  && ({f3} == v3 && {f4} == v4) && {f5} == v5
           mm             mm mmmmmmmmmmmmmmmmmmmmmmmmmm mm
           cc             cc                            cc

At the beginning of the string ({), neither \([^)]*\) nor (&&) matches, so the string pointer advances one character to f. Again there is no match, so the pointer advances one character to 1. This continues (without a match) until the first & is reached. There \([^)]*\) does not match, but (&&) does. We are interested in this match because it has been captured (capture group 1).

The string pointer then moves forward one character at a time. The second && is matched and captured in the same way, and the pointer continues until it reaches (. At this point, ({f3} == v3 && {f4} == v4) matches \([^)]*\), but since nothing is captured, we ignore it. Matching resumes after the closing parenthesis, where the final && is matched and captured, and continues until the end of the string is reached.
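
The walk-through above can be sketched in Go roughly as follows. This is a minimal illustration, not a definitive implementation: the names andOutsideParens and topLevelAnds are made up for the example, the character class is written [^)] so that the parenthesized alternative stops at the first closing parenthesis, and parentheses are assumed never to be nested.

```go
package main

import (
	"fmt"
	"regexp"
)

// andOutsideParens matches either a parenthesized run (not captured) or
// "&&" (captured as group 1). Assumes parentheses are never nested.
var andOutsideParens = regexp.MustCompile(`\([^)]*\)|(&&)`)

// topLevelAnds returns the [start, end) byte offsets of every "&&" that
// lies outside parentheses. The name is hypothetical.
func topLevelAnds(s string) [][2]int {
	var out [][2]int
	for _, m := range andOutsideParens.FindAllStringSubmatchIndex(s, -1) {
		// m[0], m[1] bound the whole match; m[2], m[3] bound group 1
		// and are -1 when the parenthesized alternative matched.
		if m[2] >= 0 {
			out = append(out, [2]int{m[2], m[3]})
		}
	}
	return out
}

func main() {
	s := "{f1} == v1 && {f2} == v2  && ({f3} == v3 && {f4} == v4) && {f5} == v5"
	fmt.Println(topLevelAnds(s)) // [[11 13] [26 28] [56 58]]
}
```

The && inside the parentheses is consumed as part of the uncaptured alternative, so it never appears as a match of its own.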

Alternatively, we can use the regular expression

\([^)]*\)|&&

and check the first character of each match. If it is (, we discard the match.
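
A rough Go sketch of this variant, assuming non-nested parentheses, might look like the following. The names parenOrAnd and countTopLevelAnds are illustrative only, and the character class is written [^)] so that a parenthesized match ends at the first closing parenthesis.

```go
package main

import (
	"fmt"
	"regexp"
)

// parenOrAnd has no capture group: both alternatives are plain matches.
var parenOrAnd = regexp.MustCompile(`\([^)]*\)|&&`)

// countTopLevelAnds counts "&&" outside parentheses by discarding every
// match that begins with "(". The name is hypothetical.
func countTopLevelAnds(s string) int {
	n := 0
	for _, m := range parenOrAnd.FindAllString(s, -1) {
		if m[0] == '(' {
			continue // a parenthesized group, not an operator
		}
		n++
	}
	return n
}

func main() {
	s := "{f1} == v1 && {f2} == v2  && ({f3} == v3 && {f4} == v4) && {f5} == v5"
	fmt.Println(countTopLevelAnds(s)) // 3
}
```

Indexing m[0] is safe here because every match of this pattern is at least two characters long.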

When each occurrence of && is found, we may (in code) replace it with another (possibly empty) string, get its offset in the string (for some purpose), or simply increment a counter of the number of matches. Which of these we do depends on why we want to match these strings.
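
As one possible illustration of these uses, the sketch below splits the question's text on the top-level && occurrences and, separately, replaces them with another string. The helper names splitPattern, splitTopLevel and replaceTopLevel are assumptions made for this example, and parentheses are again assumed not to be nested.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// splitPattern matches a parenthesized run (not captured) or "&&" (group 1).
var splitPattern = regexp.MustCompile(`\([^)]*\)|(&&)`)

// splitTopLevel splits s on every "&&" outside parentheses and trims the
// surrounding whitespace from each piece.
func splitTopLevel(s string) []string {
	var parts []string
	last := 0
	for _, m := range splitPattern.FindAllStringSubmatchIndex(s, -1) {
		if m[2] < 0 {
			continue // parenthesized group: keep it inside the current piece
		}
		parts = append(parts, strings.TrimSpace(s[last:m[2]]))
		last = m[3]
	}
	return append(parts, strings.TrimSpace(s[last:]))
}

// replaceTopLevel replaces each "&&" outside parentheses with repl and
// returns parenthesized groups unchanged.
func replaceTopLevel(s, repl string) string {
	return splitPattern.ReplaceAllStringFunc(s, func(m string) string {
		if m == "&&" {
			return repl
		}
		return m
	})
}

func main() {
	s := "{field1} == value1 && {field2} == value2  && ({field3} == value3 && {field4} == value4) && {field5} == value5"
	for _, p := range splitTopLevel(s) {
		fmt.Println(p)
	}
	fmt.Println(replaceTopLevel("a && (b && c) && d", "AND")) // a AND (b && c) AND d
}
```

The split produces the four operands listed in the question, with the parenthesized expression kept as a single piece.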

Statement:
This article is reproduced from stackoverflow.com. In case of infringement, please contact admin@php.cn to have it removed.