from bs4 import BeautifulSoup
html='''
<a class="vip_class fl font14" href="/site/baidu" target="_blank">baidu</a>
<a href="/site/google/
" target="_blank">google</a>
<a href="/mobile/list/?" target="_blank">android</a>
<a href="/mobile/list/?" target="_blank">ios</a>
'''
soup = BeautifulSoup(html,'lxml')
links=soup.findAll("a")
print(links)
For example, I only want the links whose href contains the keyword mobile. Is there a way to filter them out directly in findAll?
高洛峰 2017-04-17 17:58:42
You can use 2 methods:
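import re  # needed for re.compile below
alls = soup.findAll("a", href=re.compile("mobile"))  # the href filter accepts a regular expression
# alls = soup.select("a[href*=\"mobile\"]")  # Option 2: a CSS selector. Not strictly what was asked (it is not findAll), but worth mentioning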
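To see both approaches side by side, here is a minimal self-contained sketch. It reuses the HTML from the question (with the google href kept on one line) and assumes the built-in "html.parser" backend instead of lxml so that nothing beyond bs4 is required; the variable names by_regex, by_css, regex_texts and css_texts are made up for illustration.

```python
import re
from bs4 import BeautifulSoup

html = '''
<a class="vip_class fl font14" href="/site/baidu" target="_blank">baidu</a>
<a href="/site/google/" target="_blank">google</a>
<a href="/mobile/list/?" target="_blank">android</a>
<a href="/mobile/list/?" target="_blank">ios</a>
'''

# Assumption: "html.parser" is used here only to avoid the lxml dependency.
soup = BeautifulSoup(html, "html.parser")

# Method 1: pass a compiled regex as the href filter.
# The pattern is applied as a search, so it matches anywhere in the value.
by_regex = soup.find_all("a", href=re.compile("mobile"))

# Method 2: CSS attribute selector; *= means "contains substring".
by_css = soup.select('a[href*="mobile"]')

regex_texts = [a.get_text() for a in by_regex]
css_texts = [a.get_text() for a in by_css]
print(regex_texts)
print(css_texts)
```

Both calls should return the same two tags (android and ios). find_all is the current name for findAll; the old spelling still works as an alias.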