python - How do I extract values with Beautiful Soup when the tags are siblings?

Question

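I ran into a page whose tags are all at the same level, as shown here: {code...} The values I ultimately want are Test Title One {code...} ... Test Title Four {code...} I was originally using {code...} This does fetch the h2 tags I need, but when I then try to keep looping to get those subheadings, because...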

PHP中文网 · Answer

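# Assumes soup is an existing BeautifulSoup object for the page.
h2_a = soup.find_all('h2')
for i_a in h2_a:
    if i_a.a:
        # This h2 wraps a link: print its text and the link target.
        print(i_a.text, '，', i_a.a['href'])
    else:
        print(i_a.text)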

This is for Python 3; I no longer remember how print is written in Python 2. I'm not sure whether it meets your requirements.
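If the goal is to collect the entries that sit between one title and the next, one possible approach is to walk the siblings of each h2 and stop at the following h2. The sketch below is only an illustration: the page from the question is not shown, so the markup (plain h2 titles followed by sibling p entries) and the helper name group_by_heading are assumptions.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page is not shown in the question, so this
# flat structure (h2 titles followed by sibling <p> entries) is assumed.
html = """
<div>
  <h2>Test Title One</h2>
  <p><a href="/a">Sub item A</a></p>
  <p><a href="/b">Sub item B</a></p>
  <h2>Test Title Two</h2>
  <p><a href="/c">Sub item C</a></p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def group_by_heading(soup):
    """Map each <h2> text to the texts of the sibling tags that follow it."""
    groups = {}
    for h2 in soup.find_all("h2"):
        items = []
        # find_next_siblings() yields only Tag objects, in document order.
        for sib in h2.find_next_siblings():
            if sib.name == "h2":
                break  # the next title starts a new group
            items.append(sib.get_text(strip=True))
        groups[h2.get_text(strip=True)] = items
    return groups


groups = group_by_heading(soup)
for title, items in groups.items():
    print(title, items)
```

If the subheadings are themselves h2 tags with a class, the stop condition would need to check for a plain h2 instead of any h2.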

PHP中文网 · Answer

I can't type Chinese on this OS.

I think we can solve this with re.

import re
resList = b = re.findall(r'(.*?)([\w\W]*?)(?=(()|()))', html.replace('\n', ''))

Then, for each a in resList, a[0] is the parent title and a[1] is the sub-content.
Try it.
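The tag literals in the pattern above are not visible, so the following is only a sketch of the same idea under assumptions: titles are taken to be bare h2 tags, and the sub-content is everything up to the next h2 or the end of the input. The sample markup and the name res_list are hypothetical.

```python
import re

# Hypothetical markup; the real page is not shown in the question.
html = """<h2>Test Title One</h2>
<p>Sub A</p>
<p>Sub B</p>
<h2>Test Title Two</h2>
<p>Sub C</p>
"""

# Group 1: the title text. Group 2: everything up to (but not including)
# the next <h2>, or up to the end of the input for the last title.
pattern = re.compile(r"<h2>(.*?)</h2>([\w\W]*?)(?=<h2>|$)")

# Newlines are removed first, as in the answer above.
res_list = pattern.findall(html.replace("\n", ""))

for title, sub_content in res_list:
    print(title, "->", sub_content)
```

This pattern only matches an h2 with no attributes; for anything less regular, an HTML parser is usually the more robust choice.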

阿神 · Answer

soup.find_all('h2', class_=None)
This finds exactly the tags you need directly.
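If the titles differ from the subheadings only in that they carry no class attribute (an assumption, since the page is not shown), the same selection can also be spelled out with a filter function passed to find_all. The markup and the helper name is_plain_h2 below are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: titles are plain <h2>, subheadings are <h2 class="sub">.
html = """
<h2>Test Title One</h2>
<h2 class="sub"><a href="/a">Sub item A</a></h2>
<h2>Test Title Two</h2>
"""

soup = BeautifulSoup(html, "html.parser")


def is_plain_h2(tag):
    """Match <h2> tags that have no class attribute at all."""
    return tag.name == "h2" and not tag.has_attr("class")


titles = [h2.get_text(strip=True) for h2 in soup.find_all(is_plain_h2)]
print(titles)
```

Writing the condition as a function makes the "no class" rule explicit and easy to extend, for example to exclude only one particular class.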