Python scraping returns a list of elements: how do I remove the span tags from it?

Question

I used Python 3.6 to scrape some data, but the final result is a list that still contains span tags. When I call get_text, contents, etc. on it, an error is raised. Why is this? The initial result returned is as follows: {code...} My code is as follows: {code...}

仅有的幸福 · Answer

I don’t remember the bs API very clearly, but there should be a function that gets the text directly; I think it is get_text(). Since you used find_all(), which returns a list of tags rather than a single tag, you need to loop over the returned result and call it on each item. That’s all:

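rs = []  # text of every span found
# soup is assumed to be the BeautifulSoup object built in the question's code
for data in soup.find("p", {"class": "list-main-eventset-finan"}).find_all("li"):
    contents = data.find("i", {"class": "cell date"}).find_all("span")
    for content in contents:
        # get_text() exists on each tag, not on the list returned by find_all()
        rs.append(content.get_text())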
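
For anyone who wants to try this without the original page, here is a minimal self-contained sketch of the same idea. The HTML string and its class name are made up for illustration (the question's real markup was not preserved), and it assumes the bs4 package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the scraped page.
html = (
    '<ul class="events">'
    "<li><span>2017-01-05</span></li>"
    "<li><span>2017-03-12</span></li>"
    "</ul>"
)

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a list-like collection of tags, so get_text()
# has to be called on each element rather than on the collection.
spans = soup.find_all("span")
texts = [span.get_text(strip=True) for span in spans]

print(texts)
```

Calling get_text() on the result of find_all() itself is what typically triggers the error described in the question, because the method belongs to individual tags.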

In addition, you can use a regular expression to match the text between the tags directly, with a pattern such as >(.*?)< . But you still have to loop over the contents list as above.
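
A rough sketch of the regular-expression route, assuming each list item is (or can be converted with str() to) a string like the hypothetical ones below, and assuming the intended pattern is >(.*?)< :

```python
import re

# Hypothetical items; in practice these would be str(tag) for each scraped tag.
items = ["<span>2017-01-05</span>", "<span>2017-03-12</span>"]

# Lazily capture whatever sits between the first ">" and the next "<".
pattern = re.compile(r">(.*?)<")

texts = []
for item in items:
    match = pattern.search(str(item))
    if match:
        texts.append(match.group(1))

print(texts)
```

This only works for flat markup like the above; nested tags inside the span would need a parser such as BeautifulSoup rather than a pattern.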

phpcn_u1582 · Answer

You could also try the text_content() method, which is available on elements if you parse the page with lxml instead of BeautifulSoup.

ringa_lee · Answer

Regular expressions, or split plus substring slicing, also work; use whichever fits your data.
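
As an illustration of the split and slicing idea, here is a small sketch using plain string operations. The sample strings are hypothetical and assume every item has exactly the form <span>text</span>.

```python
# Hypothetical items; in practice these would be str(tag) for each scraped tag.
items = ["<span>2017-01-05</span>", "<span>2017-03-12</span>"]

# Option 1: split off the opening tag, then cut at the start of the closing tag.
texts = []
for item in items:
    after_open = item.split(">", 1)[1]        # drops "<span>"
    texts.append(after_open.split("<", 1)[0])  # drops "</span>"

# Option 2: slice by the known lengths of the opening and closing tags.
open_tag, close_tag = "<span>", "</span>"
sliced = [item[len(open_tag):-len(close_tag)] for item in items]

print(texts)
print(sliced)
```

Both options break as soon as the tag carries attributes or nested markup, so they are best kept for very regular output.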
