I want the content inside this tag:
<p class="talk-article__body talk-transcript__body">
Python code:
neirong=soup.find('p',{'class':'talk-article__body talk-transcript__body'})
But the returned result is empty. Is this selector written incorrectly?
某草草 2017-05-27 17:41:42
neirong=soup.find_all('p',class_='talk-article__body talk-transcript__body')
https://www.crummy.com/softwa...
阿神 2017-05-27 17:41:42
According to the documentation at https://www.crummy.com/softwa..., the correct usage is: neirong=soup.find('p',class_='talk-article__body talk-transcript__body')
To get the content inside the p tag, just call neirong.contents on the result.
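A minimal sketch of the approach above, assuming the paragraph is present in static HTML. The `html` string is a made-up stand-in for the real page, and `html.parser` is just one possible parser choice:

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML standing in for the real page.
html = """
<div>
  <p class="talk-article__body talk-transcript__body">Hello <b>world</b></p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A class_ string containing a space is compared against the
# full value of the class attribute, in the same order.
neirong = soup.find("p", class_="talk-article__body talk-transcript__body")

# find() returns None when nothing matches, so check before using it.
if neirong is not None:
    contents = neirong.contents   # list of child nodes (text and tags)
    text = neirong.get_text()     # all text inside the tag, joined
```

If `neirong` comes back as `None` on the real page, the tag is probably not in the HTML that was downloaded.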
怪我咯 2017-05-27 17:41:42
The content you see in the browser is generated dynamically by JavaScript, so it is not in the HTML that BeautifulSoup parses and cannot be matched. In my experience, odd-looking class names like these are almost always generated by JavaScript.
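One rough way to test this is to check whether the class name appears at all in the HTML the server sends back. The sketch below uses a made-up `raw_html` string in place of a real response body, and `class_in_raw_html` is a hypothetical helper name:

```python
def class_in_raw_html(raw_html: str, class_name: str) -> bool:
    """Return True if class_name appears anywhere in the server-sent HTML."""
    return class_name in raw_html


# Hypothetical response body: an empty container that a script fills in later.
raw_html = (
    '<html><body><div id="app"></div>'
    '<script src="bundle.js"></script></body></html>'
)

found = class_in_raw_html(raw_html, "talk-transcript__body")

if not found:
    # The class is absent from the raw HTML, so find() has nothing to match.
    print("Class not in the downloaded HTML; it is probably added by JavaScript.")
```

In practice `raw_html` would be the text of the downloaded page. If the class is missing there but visible in the browser's inspector, the selector is not the problem.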
曾经蜡笔没有小新 2017-05-27 17:41:42
Personally, when parsing web pages with BeautifulSoup, if you want to locate elements the way CSS does, I think it is best to use soup.select(). It accepts a class value, a tag name, or an attribute as its argument, which is very convenient, and it works well for pinpointing a single tag. The argument can be any CSS selector string, such as: soup.select("#id > .class a.title").
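A short sketch of the select() approach, applied to the class names from the question. The `html` string and the `talk` id are invented for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML; the id "talk" is made up for this example.
html = """
<div id="talk">
  <p class="talk-article__body talk-transcript__body">First paragraph</p>
  <p class="other">Unrelated</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Chaining .class selectors requires the tag to carry both classes,
# regardless of the order they appear in the attribute.
matches = soup.select("p.talk-article__body.talk-transcript__body")
texts = [p.get_text() for p in matches]

# select_one() returns the first match (or None) instead of a list.
first = soup.select_one("#talk > p.talk-transcript__body")
```

Unlike the exact-string match used with `class_`, the chained selector does not depend on the order of the class names.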
The soup.find() method doesn't seem to be used much these days; I wonder whether BeautifulSoup4 has deprecated it. Nowadays, whenever find shows up, it is usually find_all() or a similar method.
For details on the above, see the Chinese documentation for Beautiful Soup: http://beautifulsoup.readthed...