Given a string:
<code>str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))' </code>
or
<code>str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))' </code>
I need a regex that matches the innermost pair of parentheses and its contents, without matching parentheses that appear inside quotes, i.e.:
<code>str1 => (status_id = "C" OR level_id = "D")
str2 => (level_id = "D" AND subject_id = "(Cat)") </code>
So how should such a complicated regex be written?
If a regex cannot do it, how can it be done in JS?
Addendum: for str1, I found a regex that produces the right match:
<code>\([^()]+\) </code>
But for str2 I still have no solution. Looking forward to your answers!
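To make the problem with str2 concrete, here is a minimal check of the simple pattern against both strings. The variable name `simple` is only for illustration. Because the pattern knows nothing about quotes, on str2 it picks up the parentheses inside the quoted value:

```javascript
// The simple pattern from the question: a "(" followed by non-parenthesis characters and a ")".
const simple = /\([^()]+\)/;

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

console.log(str1.match(simple)[0]); // (status_id = "C" OR level_id = "D")
console.log(str2.match(simple)[0]); // (Cat)
```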
For str2, I found this one:
<code>\([^()]*\"[^"]*\"[^()]*\)</code>
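Looking at the requirement, I did not even consider a regex; it seems too complicated... so let's use a traditional approach.
You can borrow the idea behind operator precedence, that is, use a stack to get at the contents of the inner parentheses.
Key points:
Match the innermost parentheses
Content inside quotes does not count for matching
Designing the algorithm along these lines:
The algorithm computes the startIndex and endIndex of the target substring, then extracts it with substring().
When a "(" is encountered, push its index onto the stack; when the first ")" is encountered, pop the stack. The text between the two indices is the target string.
When a '"' is encountered, stop matching "(" and ")" until the next '"' is found, and only then resume looking for parentheses.
This is an algorithm off the top of my head; corrections and additions are welcome.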
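The steps above can be sketched in JavaScript roughly as follows. This is only one possible reading of the description: the function name `innermostGroup` is made up for illustration, and the sketch assumes that quotes are balanced and never escaped. It returns the first innermost group it finds, which is enough for str1 and str2.

```javascript
// Returns the first parenthesised group that contains no nested group,
// ignoring any parentheses that appear between double quotes.
function innermostGroup(input) {
  const stack = [];      // indices of "(" characters that are still open
  let inQuotes = false;  // true while scanning inside "..."

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (ch === '"') {
      inQuotes = !inQuotes;            // entering or leaving a quoted value
    } else if (!inQuotes && ch === '(') {
      stack.push(i);                   // remember where this group starts
    } else if (!inQuotes && ch === ')') {
      if (stack.length === 0) return null; // unbalanced input
      // The first ")" outside quotes closes the most recently opened "(",
      // so nothing between the two indices can be a nested group.
      const startIndex = stack.pop();
      return input.substring(startIndex, i + 1);
    }
  }
  return null; // no complete group found
}

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

console.log(innermostGroup(str1)); // (status_id = "C" OR level_id = "D")
console.log(innermostGroup(str2)); // (level_id = "D" AND subject_id = "(Cat)")
```

Since only the most recent "(" is ever used, a single index would do; the stack is kept here to stay close to the description.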
// How about this, try /(([^()]*?"[^"()]*([^"()]+)[^()]*?"[^() ]*)+)|([^()]+)/
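Analyze the requirement > find a solution for each requirement point > combine the solutions = problem solved
We need to match ( a )
where a can take one of two forms, a1 or a2:
a1 contains one or more b " c " b
where b is a run of characters containing no ", ( or )
and c is a run of characters containing no "
a2 contains no ( or )
Regular expressions:
2.2 => a2 = [^()]*
2.1.1 => b = [^()"]*
2.1.2 => c = [^"]*
2.1 => a1 = (b"c"b)+ = (b"c")+b = ([^()"]*"[^"]*")+[^()"]*
(a1)|(a2) = (([^()"]*"[^"]*")+[^()"]*)|([^()]*)
With the literal outer parentheses added to each branch: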
<code>/\(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)/</code>
<code class="javascript">var reg = /\(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)/;

'(the (quick "brown" fox "jumps over, (the) lazy" dog ))'.match(reg)[0];
// "(quick "brown" fox "jumps over, (the) lazy" dog )"

'(the ("(quick)" brown fox "jumps (over, the)" lazy) dog )'.match(reg)[0];
// "("(quick)" brown fox "jumps (over, the)" lazy)"

'(the (quick brown fox (jumps "over", ((the) "lazy"))) dog )'.match(reg)[0];
// "(the)"</code>
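<code>substr=str.match(/\([^()]+\)/g)[0] </code>
This gets the innermost parentheses and their contents. Then check whether the character just before it is a " and the character just after it is a ":
<code>index=str.indexOf(str.match(/\([^()]+\)/g)[0])
length=str.match(/\([^()]+\)/g)[0].length
str.substr(index+length,1)
str.substr(index-1,1) </code>
If they are not quotes, this is the answer you want. If they are, first replace substr in str with a placeholder, then match again, and finally put the original text back:
<code>str.replace(substr,"&&&")
str.replace(substr,"&&&").match(/\([^()]+\)/g)[0]
str.replace(substr,"&&&").match(/\([^()]+\)/g)[0].replace("&&&",substr) </code>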
The hard part of this problem is that the "" pairs need to be counted recursively, for example
<code>(level_id = "D AND subject_id = "(Cat)"")</code>
where (Cat) should qualify.
<code>\([^()]*?\"((?:[^\"\"]|\"(?1)\")*+)\"[^()]*?\)|\([^()]*?\) </code>
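Cherish your life and stay away from regex. This regex meets your requirement. It works in PHP (PHP supports recursion), but it cannot be used with the standard regex engines of Java or Python.
A suggested approach: find the index of ( and process the string by slicing.
I can't type out the regex on my phone, sigh.
The OP's [^()] keeps matching as long as it does not hit ( or ).
Drop the "does not match (" condition and change the greedy + to the lazy *? and that is enough.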
console.log('(subject_id = “A” OR (status_id = “Open” AND (status_id = “C” OR level_id = “D”)))'.match(/(1*)/))
Hope this helps.
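Matching this with a single regex gets complicated. I suggest first replacing the interfering sequences "( and )" with something else, for example "[ and ]", then applying a simple regex, and finally swapping them back.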
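A minimal JavaScript sketch of this replace, match, restore idea is shown below. The function name `innermostByMasking` is hypothetical. As an assumption that differs slightly from the suggestion above, it masks with the control characters U+0001 and U+0002 instead of [ and ], so that brackets already present in the input cannot be confused with the placeholders. It also assumes quotes are balanced and unescaped.

```javascript
// Mask parentheses inside "..." values, run the simple regex, then restore them.
function innermostByMasking(input) {
  // Step 1: inside every quoted value, swap "(" and ")" for placeholders.
  const masked = input.replace(/"[^"]*"/g, (quoted) =>
    quoted.replace(/\(/g, '\u0001').replace(/\)/g, '\u0002'));

  // Step 2: with quoted parentheses out of the way, the simple pattern is enough.
  const m = masked.match(/\([^()]*\)/);
  if (!m) return null;

  // Step 3: put the original parentheses back into the matched text.
  return m[0].replace(/\u0001/g, '(').replace(/\u0002/g, ')');
}

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

console.log(innermostByMasking(str1)); // (status_id = "C" OR level_id = "D")
console.log(innermostByMasking(str2)); // (level_id = "D" AND subject_id = "(Cat)")
```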
A regex implementation in Python:
<code>import re

str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))'
str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))'

# re.X (verbose mode) ignores the whitespace inside the pattern
pat = re.compile(r"""(?<=[^"]) \([^()]+? ("\(.+?\)")* \) (?=[^"]) """, re.X)

print(pat.search(str1).group(0))
print(pat.search(str2).group(0))</code>
Output:
<code>(status_id = "C" OR level_id = "D")
(level_id = "D" AND subject_id = "(Cat)") </code>