
Python: matching the text of specific passages with a regular expression

a='''
[Scene: Central Perk, Chandler, Joey, Phoebe, and Monica are there.]
Monica: There's nothing to tell! He's just some guy I work with!
Joey: C'mon, you're going out with the guy! There's gotta be something wrong with him!
Chandler: All right Joey, be nice. So does he have a hump? A hump and a hairpiece?
Phoebe: Wait, does he eat chalk?
[Scene: Chandler, Joey,abcsde.]
Phoebe: Just, 'cause, I don't want her to go through what I went through with Carl- oh!
Monica: Okay, everybody relax. This is not even a date. It's just two people going out to dinner and- not having sex.
Chandler: Sounds like a date to me.
[Scene: Joey.]
'''

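I have a block of text, a, shown above.
I want to extract the dialogue of each scene and save the results in a list. Scenes are separated by a marker of the form [Scene: some English text.], like the [Scene: ...] lines above.
I tried to do this with a regular expression: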
paragraphs = re.findall('[Scene: w+.](.*?)[Scene: w+.]',a,re.S)

Nothing was matched: paragraphs comes back empty.
What is wrong with the pattern, and how should I match the dialogue of each scene?
Thanks.


All replies (1)

  • 滿天的星座, 2017-05-18 10:59:26

    There are a few errors:
    The pattern is not written as a raw string.
    The literal [ is not escaped.
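To illustrate the points above, here is a minimal sketch using two of the scene headers from the question. The variable names are only illustrative. It shows that an unescaped [...] is read as a character class that matches one character at a time, and that \w+ on its own cannot cover a header containing spaces and commas.

```python
import re

header_simple = "[Scene: Joey.]"
header_long = "[Scene: Chandler, Joey,abcsde.]"

# Unescaped brackets: [Scene: w+.] is a character class, so each match
# is a single character taken from the set inside the brackets.
char_class_hits = re.findall("[Scene: w+.]", header_simple)

# Escaped brackets in a raw string match the literal header text.
literal_simple = re.findall(r"\[Scene: \w+\.\]", header_simple)

# \w+ stops at the first comma or space, so a longer header is not matched.
literal_long_w = re.findall(r"\[Scene: \w+\.\]", header_long)

# Allowing word characters, whitespace and commas covers the longer header.
literal_long_ok = re.findall(r"\[Scene: [\w\s,]+\.\]", header_long)

print(char_class_hits)
print(literal_simple, literal_long_w, literal_long_ok)
```

The raw-string prefix keeps Python from interpreting the backslashes before the regex engine sees them.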

    Here is my corrected code.

    import re
    paragraphs = re.findall(r"\[Scene: [\w\s,]+\.\]\s([^\[]+)\s(?=\[Scene: [\w\s,]+\.\])", a, re.S)
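The lookahead (?=...) in the corrected pattern matters because re.findall returns non-overlapping matches: if the closing header were consumed, it could no longer open the next scene. The sketch below compares the two on a shortened, made-up sample, and also shows re.split as one possible alternative that keeps a final scene that has no header after it. The sample text and the names HEADER, consuming, lookahead and split_scenes are assumptions for illustration, not part of the original answer.

```python
import re

# Shortened sample in the same shape as the text in the question.
text = """
[Scene: Central Perk.]
Monica: There's nothing to tell!
[Scene: Chandler, Joey.]
Chandler: Sounds like a date to me.
[Scene: Joey.]
Joey: Be nice.
"""

# Header pattern taken from the corrected code above.
HEADER = r"\[Scene: [\w\s,]+\.\]"

# Closing header consumed: the second scene has lost its opening header.
consuming = re.findall(HEADER + r"\s([^\[]+)\s" + HEADER, text)

# Closing header only looked at: it is still available to start the next match.
lookahead = re.findall(HEADER + r"\s([^\[]+)\s(?=" + HEADER + ")", text)

# Splitting on the header returns the text between headers, including
# whatever follows the last one; empty pieces are dropped.
split_scenes = [p.strip() for p in re.split(HEADER, text) if p.strip()]

print(consuming)
print(lookahead)
print(split_scenes)
```

With the lookahead, a scene is only captured when another header follows it, which is fine for the text in the question because its last header has no dialogue after it. If the last scene does carry dialogue, the split variant may be the simpler choice.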
    

    Python regular expression guide:
    http://www.cnblogs.com/huxi/a...
