Problem getting tags with BeautifulSoup's find and find_all functions

Question

While scraping a page, I ran into the following navigation tree:

——div.center
    ——div.ft_ggbox_1 balck_ggbox_1
        ——div.black_jubao_right black_jubao_right_xxbh black_jub

三叔 · Answer

First use find_all to collect every a tag, then use a list comprehension to save each tag's href into a list, skipping tags that have no href:

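from bs4 import BeautifulSoup

# html_string holds the page source; naming the parser avoids a bs4 warning
soup = BeautifulSoup(html_string, 'html.parser')
atags = soup.find_all('a')
hrefs = [item.get('href') for item in atags if item.get('href')]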
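
If only the links inside the innermost div from the question are wanted, the search can be narrowed with find before calling find_all. The following is a minimal sketch, not the original poster's page: the sample markup and its href values are made up for illustration, and only the class names that appear in full in the question are used.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mirrors the class names from the question.
html_string = """
<div class="center">
  <div class="ft_ggbox_1 balck_ggbox_1">
    <div class="black_jubao_right black_jubao_right_xxbh">
      <a href="/report/1">first</a>
      <a>no href</a>
      <a href="/report/2">second</a>
    </div>
  </div>
  <a href="/outside">outside</a>
</div>
"""

soup = BeautifulSoup(html_string, "html.parser")

# find returns the first matching tag or None. class_ matches a tag that
# carries this class, even when the tag has other classes as well.
container = soup.find("div", class_="black_jubao_right")

hrefs = []
if container is not None:
    # href=True keeps only the a tags that actually have an href attribute.
    hrefs = [a["href"] for a in container.find_all("a", href=True)]

print(hrefs)  # ['/report/1', '/report/2']
```

Checking for None before calling find_all matters because find does not raise when nothing matches, so a misspelled class name would otherwise surface as an AttributeError on the next line.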