BeautifulSoup: Combine top-level text with classic tag lookup functionality?

Question

I'm trying to use BeautifulSoup to extract information from a non-uniformly structured block of HTML. I'm looking for a way to combine the blocks of text that sit between tags in the search/filter output. For example, from HTML like this:

<strong>Description</strong> Section1

line1
line2

P粉905144514 · Answer

To get that output, first select each <strong> element and then read its next_sibling, which is the text node that directly follows the tag.

Example

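    from bs4 import BeautifulSoup

    # The <strong> tags follow from the selector and the output below.
    # The <ul>/<li> wrapper around line1..line3 is an assumption; any
    # tag in that position gives the same result.
    html = '''
    <strong>Description</strong>
    Section1
    <ul>
    <li>line1</li>
    <li>line2</li>
    <li>line3</li>
    </ul>
    <strong>Section2</strong>
    Content2
    '''

    # Name the parser explicitly so the result does not depend on
    # which parsers happen to be installed.
    soup = BeautifulSoup(html, 'html.parser')

    data = []
    for e in soup.select('strong'):
        # e is the <strong> tag; its next sibling is the text node after it.
        data.extend([e, e.next_sibling.strip()])

    data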

Output

[<strong>Description</strong>, 'Section1', <strong>Section2</strong>, 'Content2']
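
Note that next_sibling returns only the single node right after each tag, and calling .strip() on it fails if that node is another tag or is missing. If the goal is to combine everything between one <strong> and the next, a possible variation is to walk next_siblings and stop at the following <strong>. This is only a sketch: the helper name sections_by_heading and the <ul>/<li> markup in the sample are assumptions, and it presumes the <strong> tags are siblings at the same level.

```python
from bs4 import BeautifulSoup, Tag

# Sample markup; the <ul>/<li> wrapper is an assumption for illustration.
html = '''
<strong>Description</strong>
Section1
<ul>
<li>line1</li>
<li>line2</li>
<li>line3</li>
</ul>
<strong>Section2</strong>
Content2
'''


def sections_by_heading(markup, heading='strong'):
    """Map each heading's text to the text fragments that follow it.

    Hypothetical helper: collects siblings until the next heading tag.
    """
    soup = BeautifulSoup(markup, 'html.parser')
    sections = {}
    for head in soup.find_all(heading):
        parts = []
        for node in head.next_siblings:
            if isinstance(node, Tag) and node.name == heading:
                break  # reached the next section
            if isinstance(node, Tag):
                # Nested tags: take every non-empty string inside them.
                parts.extend(node.stripped_strings)
            else:
                # Top-level text node: keep it unless it is only whitespace.
                text = node.strip()
                if text:
                    parts.append(text)
        sections[head.get_text(strip=True)] = parts
    return sections


result = sections_by_heading(html)
print(result)
# {'Description': ['Section1', 'line1', 'line2', 'line3'], 'Section2': ['Content2']}
```

Each value is a list, so the fragments can be joined with whatever separator suits the output, for example ' '.join(result['Description']).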
