PHPz 2017-04-18 10:05:34
<p class="l_post l_post_bright j_l_post clearfix " data-field='{"author":{"user_id":348570172, "user_name":"\u6446\u6446\u821e\u66f2","props":null},"content":{"post_id":31489927386,"is_anonym":false,"forum_id":874949,"thread_id":2108034524,"content":"912904081@qq.com\u8c22\u8c22\u6492","post_no":94,"type":"0","comment_num":0,"props":null,"post_index":0,"pb_tpoint":null}}'> <p class="d_author"> <ul class="p_author">
...
</p>
I want to extract the user_name and content values from the data-field attribute on the outermost p tag. There are many nested tags inside it, and my current code pulls down every tag inside this p as well. How do I keep only the outermost tag's data that I need?
天蓬老师 2017-04-18 10:05:34
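import requests
from bs4 import BeautifulSoup

r = requests.get("http://tieba.baidu.com/p/2108034524?pn=4")
soup = BeautifulSoup(r.content, "lxml")
# class_ matches any p tag whose class list includes "l_post"
users = soup.find_all("p", class_="l_post")
for user in users:
    # Read only the attribute; the nested child tags are ignored
    print(user["data-field"])
    # other processing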
Then process the extracted content. Each data-field value is a JSON string, so decoding it gives you user_name and content directly.
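As a rough sketch of that processing step, the attribute string can be decoded with the standard json module. The helper name parse_data_field is made up for illustration, and the sample string is simply the data-field value copied from the question, so no network request is needed to try it:

```python
import json


def parse_data_field(raw):
    """Return (user_name, content) from one data-field attribute string.

    parse_data_field is a hypothetical helper name, not part of any library.
    """
    data = json.loads(raw)
    return data["author"]["user_name"], data["content"]["content"]


# The data-field value from the question's HTML. A raw string keeps the
# \uXXXX sequences intact so that json.loads decodes them.
sample = r'{"author":{"user_id":348570172, "user_name":"\u6446\u6446\u821e\u66f2","props":null},"content":{"post_id":31489927386,"is_anonym":false,"forum_id":874949,"thread_id":2108034524,"content":"912904081@qq.com\u8c22\u8c22\u6492","post_no":94,"type":"0","comment_num":0,"props":null,"post_index":0,"pb_tpoint":null}}'

user_name, content = parse_data_field(sample)
print(user_name, content)
```

Inside the loop above you would call parse_data_field(user["data-field"]) instead of printing the raw attribute. If some posts lack the attribute or use a different JSON layout, this raises KeyError, so a real crawler would probably want to guard for that.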