javascript - Matching the contents of the innermost parentheses with a regular expression

WBOY (original) · 2016-08-04 09:19:46 · 1586 views

Given a string:

<code>str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))'
</code>

or

<code>str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))'
</code>

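I need a regular expression that matches the innermost pair of parentheses in the string together with its contents (parentheses inside quotes must not count), i.e.: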

<code>str1 => (status_id = "C" OR level_id = "D")

str2 => (level_id = "D" AND subject_id = "(Cat)")
</code>

So how should such a complicated regex be written?

If a regex cannot do it, how can it be done in JS?


Addendum: for str1, I found a regex that matches what I need:

<code>\([^()]+\)
</code>

For str2, however, I still have no solution. Looking forward to your answers!
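To make the problem concrete, here is a small sketch that runs the simple regex from the addendum against both strings. The variable names `simple`, `r1` and `r2` are only illustrative. For str2 the first parenthesized run without nested parentheses is the quoted `(Cat)`, which is why the simple pattern is not enough.

```javascript
const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

// The simple pattern: an opening parenthesis, then anything that is not a parenthesis, then ")".
const simple = /\([^()]+\)/;

const r1 = str1.match(simple)[0];
const r2 = str2.match(simple)[0];

console.log(r1); // (status_id = "C" OR level_id = "D")
console.log(r2); // (Cat)   <- the parentheses inside the quotes win
```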

Replies:

For str2, I found this one:

<code>\([^()]*\"[^"]*\"[^()]*\)</code>
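A quick check of this pattern, as a sketch (the names `quotedPattern`, `q1` and `q2` are illustrative): it does return the wanted group for str2, but because `[^"]*` is allowed to run across parentheses, on str1 it appears to return an unbalanced span instead of the innermost group. So it looks specific to str2 rather than a general answer.

```javascript
const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

// The pattern from the reply, written as a JavaScript regex literal.
const quotedPattern = /\([^()]*\"[^"]*\"[^()]*\)/;

const q2 = str2.match(quotedPattern)[0];
const q1 = str1.match(quotedPattern)[0];

console.log(q2); // (level_id = "D" AND subject_id = "(Cat)")
console.log(q1); // (status_id = "Open" AND (status_id = "C" OR level_id = "D")
```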

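Looking at the requirement, I didn't even consider a regex; it seems too complicated... so let's just use a traditional approach.
We can borrow the idea of operator precedence, i.e. use a stack to get at the contents of the inner parentheses.
Key points:

  1. Match the innermost parentheses

  2. Content inside quotes does not count toward matching

Designing the algorithm along these lines:
The algorithm computes the startIndex and endIndex of the target substring, then extracts it with substring().

  • On encountering a "(" character, push its index onto the stack; on encountering the first ")", pop. The substring between the two indices is the target string.

  • On encountering a "\"", stop matching "(" until the next "\"" is found, then resume searching for "(".

I came up with this algorithm off the top of my head; corrections are welcome.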
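The steps above can be sketched in JavaScript roughly as follows. This is one possible reading of the description, not the poster's own code: the function name `innermostParens` is made up here, quotes are assumed to be plain `"` characters without escaping, and the function returns the first group that closes, which is not necessarily the most deeply nested one when the string has several branches.

```javascript
// Returns the first parenthesized group that closes, ignoring
// parentheses inside double-quoted segments; null if there is none.
function innermostParens(str) {
  const stack = [];      // indices of "(" seen outside quotes
  let inQuotes = false;  // true while scanning between two '"' characters

  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === '"') {
      inQuotes = !inQuotes;
    } else if (!inQuotes && ch === '(') {
      stack.push(i);
    } else if (!inQuotes && ch === ')' && stack.length > 0) {
      // First ")" outside quotes: its partner is on top of the stack.
      const startIndex = stack.pop();
      return str.substring(startIndex, i + 1);
    }
  }
  return null;
}

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

console.log(innermostParens(str1)); // (status_id = "C" OR level_id = "D")
console.log(innermostParens(str2)); // (level_id = "D" AND subject_id = "(Cat)")
```

A single pass with a boolean quote flag keeps the logic easy to follow; finding the deepest group in a string with several branches would need the stack depth to be tracked as well.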

// Try this:
/\(([^\(\)]*?"[^\"\(\)]*([^\"\(\)]+\)[^\(\)]*?\"[^\(\)]*)+)|([^\(\)]+\)/


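Addendum:

Analyze the requirements > find a solution for each requirement > combine the solutions = problem solved

Requirements analysis:

  1. We need to match the form ( a )

  2. The characters in a fall into two cases, denoted a1 and a2

    1. a1 contains one or more strings of the form b " c " b,

      1. where b is a string containing none of ", ( or )

      2. where c is a string containing no "

    2. a2 contains no ( or )

Working backwards: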

2.2 => a2 = [^\(\)]*
2.1.1 => b = [^\(\)\"]*
2.1.2 => c = [^\"]*
2.1 => a1= (b\"c\"b)+ = (b\"c\")+b =([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*
1 => \(a\) = \(a1\)|\(a2\) = \(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)

Regular expression:

<code>/\(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)/</code>

Verification:

<code class="javascript">var reg = /\(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)/;

'(the (quick "brown" fox "jumps over, (the) lazy" dog ))'
    .match(reg)[0]
//"(quick "brown" fox "jumps over, (the) lazy" dog )"

'(the ("(quick)" brown fox "jumps (over, the)" lazy) dog )'
    .match(reg)[0];
//"("(quick)" brown fox "jumps (over, the)" lazy)"

'(the (quick brown fox (jumps "over", ((the) "lazy"))) dog )'
    .match(reg)[0];
//"(the)"</code>
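As an additional check, the same regex can be run against the two strings from the question. The third input, `tricky`, is a hypothetical string added here for illustration: when a quoted parenthesis appears in a group that also has a real nested group after it, the second alternative of the regex seems to pick up the quoted `(Cat)`, so the pattern should be treated as covering inputs shaped like str1 and str2 rather than every case.

```javascript
const reg = /\(([^\(\)\"]*\"[^\"]*\")+[^\(\)\"]*\)|\([^\(\)]*\)/;

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';
const tricky = '(x = "(Cat)" AND (y = 1))'; // hypothetical input, not from the question

const m1 = str1.match(reg)[0];
const m2 = str2.match(reg)[0];
const m3 = tricky.match(reg)[0];

console.log(m1); // (status_id = "C" OR level_id = "D")
console.log(m2); // (level_id = "D" AND subject_id = "(Cat)")
console.log(m3); // (Cat)   <- not the wanted (y = 1)
```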

Then change it like this:

<code>substr=str.match(/\([^()]+\)/g)[0]
</code>

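This yields the innermost parentheses and their contents. Then check whether the character just before the match is a " and the character just after it is a ":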

<code>index=str.indexOf(str.match(/\([^()]+\)/g)[0])
length=str.match(/\([^()]+\)/g)[0].length
str.substr(index+length,1)
str.substr(index-1,1)
</code>

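If not, the match is the answer. If so, first replace substr in str with a placeholder, run match again, and finally put substr back in place of the placeholder: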

<code>str.replace(substr,"&&&")
str.replace(substr,"&&&").match(/\([^()]+\)/g)[0]
str.replace(substr,"&&&").match(/\([^()]+\)/g)[0].replace("&&&",substr)
</code>
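The fragments above can be combined into one function, sketched below. The function name `innermostWithPlaceholders` and the numbered NUL-delimited placeholder (used instead of `&&&` so that several hidden groups can be told apart) are assumptions of this sketch, and the input is assumed never to contain the NUL character.

```javascript
// Sketch of the "replace, match again, restore" idea.
function innermostWithPlaceholders(str) {
  const simple = /\([^()]+\)/;
  const saved = [];  // quoted groups that were swapped out, indexed by placeholder number
  let work = str;

  while (true) {
    const m = work.match(simple);
    if (m === null) return null;

    const end = m.index + m[0].length;
    const quoted = work[m.index - 1] === '"' && work[end] === '"';

    if (!quoted) {
      // Put back any swapped-out group that ended up inside the match.
      return m[0].replace(/\u0000(\d+)\u0000/g, (_, k) => saved[Number(k)]);
    }

    // The match sits directly between quotes: hide it and try again.
    work = work.slice(0, m.index) + '\u0000' + saved.length + '\u0000' + work.slice(end);
    saved.push(m[0]);
  }
}

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

console.log(innermostWithPlaceholders(str1)); // (status_id = "C" OR level_id = "D")
console.log(innermostWithPlaceholders(str2)); // (level_id = "D" AND subject_id = "(Cat)")
```

Like the original fragments, this only recognizes a quoted group when the quotes sit immediately next to the parentheses.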

The difficulty in this problem is that the quotes need to be counted recursively, for example

<code>(level_id = "D AND subject_id = "(Cat)"")</code>

Here (Cat) meets the requirement.

<code>\([^()]*?\"((?:[^\"\"]|\"(?1)\")*+)\"[^()]*?\)|\([^()]*?\)
</code>

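Cherish life, stay away from regex. This regex meets your requirement; it works in PHP (PHP supports recursive patterns), but it cannot be used in Java or with Python's built-in re module.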

One suggested approach: find the index of ( and process the string by slicing.

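I can't type the regex out properly on my phone, sigh.
The OP's [^()] keeps going as long as the character is not ( or ).
Remove the condition that excludes (, and change the greedy + to the lazy *?, and that will do.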

console.log('(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))'.match(/\([^()]*\)/))
Hope this helps.

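Doing this purely with a regex gets complicated. I suggest replacing the interfering sequences "( and )" with something else, e.g. "[ and ]", then matching with the simple regex, and swapping them back afterwards.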

A regex implementation in Python:

<code>import re

str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))'
str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))'

pat = re.compile(r"""(?</code>

Output:

<code>(status_id = "C" OR level_id = "D")
(level_id = "D" AND subject_id = "(Cat)")
</code>
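The substitution idea suggested in this reply (swap "( and )" for "[ and ]", match with the simple regex, then swap back) might look like this in JavaScript. It is only a sketch: the function name `innermostViaBrackets` is made up, and it assumes that the input contains no "[ or ]" sequences of its own and that quoted parentheses always sit directly next to the quotes.

```javascript
function innermostViaBrackets(str) {
  // Step 1: neutralize parentheses that touch a quote.
  const swapped = str.split('"(').join('"[').split(')"').join(']"');

  // Step 2: the simple regex now only sees real grouping parentheses.
  const m = swapped.match(/\([^()]+\)/);
  if (m === null) return null;

  // Step 3: restore the original characters inside the match.
  return m[0].split('"[').join('"(').split(']"').join(')"');
}

const str1 = '(subject_id = "A" OR (status_id = "Open" AND (status_id = "C" OR level_id = "D")))';
const str2 = '(subject_id = "A" OR subject_id = "Food" OR (subject_id = "C" OR (status_id = "Open" AND (status_id = "C" OR (level_id = "D" AND subject_id = "(Cat)")))))';

console.log(innermostViaBrackets(str1)); // (status_id = "C" OR level_id = "D")
console.log(innermostViaBrackets(str2)); // (level_id = "D" AND subject_id = "(Cat)")
```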