
java - regular expression problem

I want to use a regular expression to extract the following information. How should I write it?

123 男 北京          张三
343 女 河北 石家庄   李四
2343 男 山东         王五



Extract:
男 张三
女 李四
男 王五
ringa_lee  2684 days ago  883

All replies (1)

  • PHP中文网  2017-06-14 10:55:06

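    For Chinese text, especially in this whitespace-separated format, I don't recommend regular expressions, although it can be done with some effort. The pattern needs an end-of-line anchor so that the second group lands on the last field (the name) rather than on the first field after the gender: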

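    import re

    filename = '2.txt'
    # Skip the leading ID, capture the second field (gender), then lazily
    # skip ahead to the last non-space run on the line (name).
    # Without the trailing \s*$ the lazy .*? stops at the province instead.
    pattern = re.compile(r'^\d+\s+(\S+).*?(\S+)\s*$')
    with open(filename, encoding='utf-8') as f:
        for line in f:
            match = pattern.match(line)
            if match:
                print(match.group(1), match.group(2))

    # Output:
    男 张三
    女 李四
    男 王五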
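    To make the effect of the end-of-line anchor concrete, here is a small sketch that runs a pattern without the anchor and one with it over the sample lines from the question. The names `LINES`, `UNANCHORED`, `ANCHORED` and `extract` are made up for illustration, and the input is an in-memory list rather than `2.txt`.

```python
import re

# Sample lines copied from the question.
LINES = [
    "123 男 北京          张三",
    "343 女 河北 石家庄   李四",
    "2343 men".replace("men", "男") + " 山东         王五",
]

# Without an end anchor the lazy .*? stops as early as possible,
# so the second group captures the field right after the gender.
UNANCHORED = re.compile(r'^\d+\s+(\S+).*?(\S+)')
# With \s*$ the second group is forced onto the last field of the line.
ANCHORED = re.compile(r'^\d+\s+(\S+).*?(\S+)\s*$')


def extract(pattern, lines):
    """Return the two captured groups for every line the pattern matches."""
    results = []
    for line in lines:
        m = pattern.match(line)
        if m:
            results.append(m.groups())
    return results


if __name__ == "__main__":
    print(extract(UNANCHORED, LINES))  # gender + province
    print(extract(ANCHORED, LINES))    # gender + name
```

    The only difference between the two patterns is the trailing `\s*$`, which is what pushes the lazy match all the way to the name.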

    You can also use the split method (recommended), which takes the second field and the last field of each line:

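    filename = '2.txt'
    with open(filename, encoding='utf-8') as f:
        for line in f:
            # split() with no argument splits on any run of whitespace.
            fields = line.split()
            # Skip blank or short lines; field 1 is the gender, the last is the name.
            if len(fields) >= 3:
                print(fields[1], fields[-1])

    # Output:
    男 张三
    女 李四
    男 王五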
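    If the file might contain blank lines or headers, the split approach can be wrapped in a small helper that returns nothing for lines it cannot parse. This is only a sketch: `parse_line` is a hypothetical name, and it assumes each record is laid out as ID, gender, any place fields, then name.

```python
def parse_line(line):
    """Return (gender, name) from a whitespace-separated record, or None.

    Assumed layout: numeric ID, gender, any place fields, name.
    """
    fields = line.split()
    # Need at least ID, gender and name, and the ID must be numeric.
    if len(fields) < 3 or not fields[0].isdigit():
        return None
    return fields[1], fields[-1]


if __name__ == "__main__":
    sample = [
        "123 男 北京          张三",
        "",
        "343 女 河北 石家庄   李四",
    ]
    for line in sample:
        parsed = parse_line(line)
        if parsed is not None:
            print(*parsed)
```

    Returning `None` instead of raising keeps the reading loop simple; a stricter variant could raise `ValueError` if malformed lines should not pass silently.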
