
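(The code block from this reply was not preserved.) This is written for Python 3. I don't know how print is written in Python 2, so I'm not sure whether it meets your requirements.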
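Since the code in the reply above did not survive, here is a minimal Python 3 sketch of one way to get the grouping the question asks for. It is not the answerer's code. It assumes that an h2 without a class attribute is a parent title and that every h2 with a class is a subheading of the most recent parent; `HTML` is a shortened copy of the page from the question and `group_headings` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Shortened copy of the page from the question (assumption: same structure).
HTML = """
<h2>1. 测试标题一</h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1374.html" target="_blank">测试一小标题1</a></h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1410.html" target="_blank">测试一小标题2</a></h2>
<h2>2. 测试标题二</h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/779.html" target="_blank">测试二小标题1</a></h2>
"""


def group_headings(html):
    """Return [(title, [(sub_title, link), ...]), ...] in document order."""
    soup = BeautifulSoup(html, "html.parser")
    groups = []
    for h2 in soup.find_all("h2"):
        if h2.get("class"):
            # An h2 with a class is a subheading: attach it to the latest parent.
            link = h2.find("a")
            if groups and link is not None:
                groups[-1][1].append((link.get_text(strip=True), link.get("href")))
        else:
            # An h2 without a class starts a new group.
            groups.append((h2.get_text(strip=True), []))
    return groups


for title, subs in group_headings(HTML):
    print(title)
    for text, href in subs:
        print("   ", text, href)
```

The titles come back with their numeric prefix ("1. " and so on); if only the bare title is wanted, it can be split off afterwards.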

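I can't type Chinese on this OS. I think this question can be solved with re. (The code block from this reply was not preserved.) Then, for each a in resList: a[0] is the parent title and a[1] is the sub-content. Give it a try.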
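The regular expression from the reply above was lost, so the following is only a guess at the shape it describes: a list whose items hold the parent title in position 0 and the sub-content in position 1. The patterns `SECTION_RE` and `LINK_RE` and the helper `split_sections` are my own assumptions, written against the exact markup in the question; regular expressions over HTML break easily if the markup changes.

```python
import re

# Shortened copy of the page from the question (assumption: same structure).
HTML = """
<h2>1. 测试标题一</h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1374.html" target="_blank">测试一小标题1</a></h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1410.html" target="_blank">测试一小标题2</a></h2>
<h2>2. 测试标题二</h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/779.html" target="_blank">测试二小标题1</a></h2>
"""

# A bare <h2> (no attributes), followed by the run of <h2 class=...> after it.
SECTION_RE = re.compile(
    r'<h2>(?P<title>.*?)</h2>(?P<body>(?:\s*<h2 class=.*?</h2>)*)',
    re.S,
)
# One link inside the sub-content: its href and its text.
LINK_RE = re.compile(r'<a href="(?P<href>[^"]*)"[^>]*>(?P<text>.*?)</a>', re.S)


def split_sections(html):
    """Return a list of (parent_title, [(sub_title, link), ...]) tuples."""
    res_list = []
    for section in SECTION_RE.finditer(html):
        links = [
            (a.group("text"), a.group("href"))
            for a in LINK_RE.finditer(section.group("body"))
        ]
        res_list.append((section.group("title"), links))
    return res_list


for a in split_sections(HTML):
    print(a[0], a[1])
```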

soup.find_all('h2', class_=None) With this you can directly find the ones you need (the h2 tags that have no class, i.e. the parent titles).
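Finding the parent titles is only half of what the question asks for, so here is a sketch that also collects each parent's subheadings by walking its following siblings. As an assumption on my part, it expresses "h2 without a class" with a filter function (`is_parent_heading`, a hypothetical name) rather than `class_=None`, so the condition is spelled out explicitly. `HTML` is a shortened copy of the page from the question.

```python
from bs4 import BeautifulSoup

# Shortened copy of the page from the question (assumption: same structure).
HTML = """
<h2>1. 测试标题一</h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1374.html" target="_blank">测试一小标题1</a></h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1410.html" target="_blank">测试一小标题2</a></h2>
<h2>2. 测试标题二</h2>
<h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/779.html" target="_blank">测试二小标题1</a></h2>
"""


def is_parent_heading(tag):
    # Treat an <h2> that carries no class attribute as a parent title.
    return tag.name == "h2" and not tag.has_attr("class")


def subheadings_by_parent(html):
    """Map each parent title to its [(sub_title, link), ...] list."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for parent in soup.find_all(is_parent_heading):
        subs = []
        # Walk the h2 siblings that follow this parent, in document order.
        for sibling in parent.find_next_siblings("h2"):
            if not sibling.has_attr("class"):
                break  # reached the next parent title
            if sibling.a is not None:
                subs.append((sibling.a.get_text(strip=True), sibling.a["href"]))
        result[parent.get_text(strip=True)] = subs
    return result


for title, subs in subheadings_by_parent(HTML).items():
    print(title, subs)
```

Compared with a single pass over all h2 tags, this version keeps the "which tags are parents" rule in one place, at the cost of rescanning siblings for every parent.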

python - How do I extract values with Beautiful Soup when the tags are at the same level?

I ran into a page whose tags are all at the same level, as shown below:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>

    <h2>1. 测试标题一</h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1374.html" target="_blank">测试一小标题1</a></h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1410.html" target="_blank">测试一小标题2</a></h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1520.html" target="_blank">测试一小标题3</a></h2>
    <h2>2. 测试标题二</h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/779.html" target="_blank">测试二小标题1</a></h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/842.html" target="_blank">测试二小标题2</a></h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/997.html" target="_blank">测试二小标题3</a></h2>
    <h2>3. 测试标题三</h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/2301.html" target="_blank">测试三小标题1</a></h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1976.html" target="_blank">测试三小标题2</a></h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1905.html" target="_blank">测试三小标题3</a></h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/2440.html" target="_blank">测试三小标题4</a></h2>
    <h2>4. 测试标题四</h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1722.html" target="_blank">测试四小标题1</a></h2>
    <h2 class="lesson-info-h2"><a href="http://www.xxx.xxx.com/1518.html" target="_blank">测试四小标题2</a></h2>

</body>
</html>

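What I ultimately want to get is:

测试标题一

测试一小标题1, the link for subheading 1
测试一小标题2, the link for subheading 2

...

测试标题四

测试四小标题1, the link for subheading 1
测试四小标题2, the link for subheading 2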

What I originally used was:

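from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')  # assumption: html holds the page source shown above
h2_a = soup.find_all('h2')  # every <h2> on the page, parent titles and subheadings alike
for i_a in h2_a:
    print(i_a)  # each i_a is a bs4.element.Tag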

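This does retrieve the h2 tags I need, but when I try to go on and loop over the subheadings, type(i_a) turns out to be <class 'bs4.element.Tag'>,
and I don't know how to extract the values from it.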

Could someone more experienced give me some pointers?
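On the bs4.element.Tag part of the question: a Tag can be queried directly, so there is no need to convert it to anything else first. The sketch below parses a single subheading copied from the page above and reads its text, its link, and its class; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# One subheading copied from the page in the question.
snippet = (
    '<h2 class="lesson-info-h2">'
    '<a href="http://www.xxx.xxx.com/1374.html" target="_blank">测试一小标题1</a>'
    '</h2>'
)
tag = BeautifulSoup(snippet, "html.parser").h2  # a bs4.element.Tag

text = tag.get_text(strip=True)  # visible text of the tag and its children
link = tag.a                     # first <a> inside the tag, or None if absent
href = link["href"]              # attribute access works like a dict
classes = tag.get("class")       # list of class names, or None if no class

print(text, href, classes)
```

Because `tag.get("class")` returns None for an h2 that has no class, it can be used inside the original loop to tell parent titles from subheadings.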

巴扎黑 · 2813 days ago · 971

