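python - Why can't my regular expression capture any data?

I have already written the HTML I want to scrape into a txt file, but the regular expression never captures anything; the result is always empty. Why does this happen?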

import re

def zhengze():
    material=open(r"c:\test.txt","r")
    print(material.read())
    pattern=re.compile(r"<p>")
    joke=re.search(pattern,material.read())
    print(joke)

This code should capture every "<p>" in the HTML, but the result is always None. Why?

高洛峰  2889 days ago  724

All replies (2)

  • 怪我咯

    怪我咯  2017-04-18 09:19:41

    You called material.read() twice.
    The second call cannot return the data. read() reads the entire file, which leaves the file pointer at the end of the file, so the next read() returns ''. Your regex is therefore searching an empty string. Read once, keep the result in a variable, and search that:

    import re

    def zhengze():
        # Read the file once and reuse the result; `with` closes the file afterwards
        with open(r"c:\test.txt","r") as material:
            res = material.read()
        print(res)
        pattern=re.compile(r"<p>")
        joke=re.search(pattern,res)
        print(joke)
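
The file-pointer behaviour described above can be checked with a small sketch. The temporary file and its contents are made up for illustration and stand in for c:\test.txt; seek(0) is shown only as an alternative way to rewind, not as something the original code needs.

```python
import os
import tempfile

# Hypothetical sample file standing in for c:\test.txt
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("<html><p>one</p><p>two</p></html>")
    path = tmp.name

with open(path, "r", encoding="utf-8") as material:
    first = material.read()    # whole file; the pointer is now at the end
    second = material.read()   # nothing left to read, so this is ''
    material.seek(0)           # move the pointer back to the start
    third = material.read()    # whole file again

os.remove(path)  # clean up the sample file

print(repr(first))
print(repr(second))
print(third == first)
```

Storing the first read() in a variable is usually simpler than rewinding with seek(0), since the file is only read once.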

  • 伊谢尔伦

    伊谢尔伦  2017-04-18 09:19:41

    Save the result of read() in a variable first, then run the regular expression against that variable.
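
One further point, since the question asks for every "<p>" in the HTML: re.search returns only the first match (or None), while re.findall returns all of them. A minimal sketch, assuming a made-up HTML string in place of the saved file contents; the third pattern, which pulls out the paragraph text, is an illustration and not part of the original question.

```python
import re

# Hypothetical sample standing in for the text saved from read()
res = "<p>one</p><p>two</p>"

pattern = re.compile(r"<p>")

first_match = pattern.search(res)   # match object for the first "<p>" only
all_tags = pattern.findall(res)     # list with every "<p>" occurrence

# Non-greedy group captures the text between each <p>...</p> pair;
# re.S lets "." also match newlines inside a paragraph
texts = re.findall(r"<p>(.*?)</p>", res, re.S)

print(first_match.span())
print(all_tags)
print(texts)
```

For anything beyond simple, well-formed snippets, an HTML parser is generally more reliable than a regular expression.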
