A Detailed Guide to Parsing HTML with Python's BeautifulSoup


高洛峰

Mar 31, 2017, 11:36 AM


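1) Searching for tags:

find(tagname)          # find the tag named tagname, e.g. find('head')
find(list)             # find a tag whose name is in the list, e.g. find(['head', 'body'])
find(dict)             # find a tag whose name is a key of the dict, e.g. find({'head': True, 'body': True})
find(re.compile(''))   # find a tag whose name matches the regular expression, e.g. find(re.compile('^p')) finds tags whose names start with p
find(lambda)           # find a tag for which the function returns true, e.g. find(lambda tag: len(tag.name) == 1) finds tags with one-character names
find(True)             # matches every tag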
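The filter forms listed above can be tried side by side. The following is a minimal sketch, assuming bs4 is installed and using a small made-up HTML string (`html`) and the built-in `html.parser`; neither comes from the original article. The dict form is left out because a list expresses the same thing.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = ("<html><head><title>Page title</title></head>"
        "<body><p>One <b>bold</b></p><p>Two</p></body></html>")
soup = BeautifulSoup(html, "html.parser")

by_name = soup.find("head")                           # exact tag name
by_list = soup.find(["head", "body"])                 # first tag whose name is in the list
by_regex = soup.find(re.compile("^p"))                # first tag whose name starts with "p"
by_func = soup.find(lambda tag: len(tag.name) == 1)   # first tag with a one-character name
all_names = [tag.name for tag in soup.find_all(True)] # True matches every tag

print(by_name.name, by_list.name, by_regex.name, by_func.name)
print(all_names)
```

Note that `find` returns only the first match in document order, while `find_all` returns every match as a list.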

2) Searching for text
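Text searches accept the same kinds of filters as tag searches, but match against the strings inside the document instead of tag names. Here is a short sketch, assuming bs4 with its `string=` argument (the current name for the older `text=`) and a made-up two-paragraph snippet:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
soup = BeautifulSoup(
    "<p>This is paragraph <b>one</b>.</p><p>This is paragraph <b>two</b>.</p>",
    "html.parser",
)

exact = soup.find_all(string="one")                      # strings equal to "one"
either = soup.find_all(string=["one", "two"])            # strings equal to any list item
pattern = soup.find_all(string=re.compile("paragraph"))  # strings containing "paragraph"

print(exact, either, pattern)
print(exact[0].parent.name)  # the tag that encloses the matched string
```

The results are string objects rather than tags; use `.parent` to get back to the enclosing tag.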

3) recursive, limit:

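from bs4 import BeautifulSoup
import re

# Sample document, built from fragments and joined into one string
doc = ['<title>Page title</title>',
       '<p>This is paragraph <b>one</b>.',
       '</p><p>This is paragraph <b>two</b>.',
       '</p>']
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(''.join(doc), 'html.parser')

print(soup.prettify() + "\n")
print(soup.find_all('b'))  # every <b> tag

# string= is the bs4 name for the older text= argument
print(soup.find_all(string=re.compile("paragraph")))  # strings containing "paragraph"
print(soup.find_all(string=True))  # every string in the document
# The source line is cut off after len(x), so the intended comparison is lost;
# only the recoverable part of the filter is kept.
print(soup.find_all(string=lambda s: len(s)))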
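The `recursive` and `limit` arguments named in section 3 can be illustrated as follows. This is a minimal sketch, assuming bs4's `find_all` and a made-up nested snippet: `recursive=False` restricts the search to direct children, and `limit` stops after the given number of matches.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one <p> is nested a level deeper than the others.
html = "<div><p>one</p><section><p>two</p></section><p>three</p></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.div

all_p = div.find_all("p")                       # searches all descendants
direct_p = div.find_all("p", recursive=False)   # direct children of <div> only
first_two = div.find_all("p", limit=2)          # stops after two matches

print([p.get_text() for p in all_p])
print([p.get_text() for p in direct_p])
print([p.get_text() for p in first_two])
```

Limiting a search this way can save time on large documents when only the first few matches are needed.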
