How can a Python crawler extract the text between span elements and store each piece in a dictionary?

Question

I want to extract each field of the house overview separately and store the fields in a dictionary as independent columns, but I cannot find a way to extract the inline elements directly with a for loop.
This is my code:

曾经蜡笔没有小新 · Answer

This is actually quite simple. The text follows a pattern: the fields are separated by |. Here is a demo:

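from bs4 import BeautifulSoup

# Sample listing text: "House overview: residential | 1 bed, 1 living room, 1 bath | 46 m² | (high floor) / 18 floors | north-south | luxury decoration"
# The original markup did not survive in this post; the <li> wrapper is an assumption so that select('li') matches.
something = '''<li>房屋概况：住宅  |1室1厅1卫|46m²| (高层)/共18层

                        |南北

                        | 豪华装修

                    </li>'''

soup = BeautifulSoup(something, 'lxml')
# Take the first <li> and flatten everything inside it to plain text
plaintext = soup.select('li')[0].get_text().strip()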

get_text() returns all of the inner text, and strip() removes the leading and trailing whitespace. You can then use split to separate the fields; I will leave the rest to you.
If you have any questions, feel free to ask.
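The split step left out above could look roughly like this. It is only a sketch: the function name parse_overview and the English key names are hypothetical, and it assumes every listing has the same six fields in the same order, with a full-width colon after the label.

```python
# Hypothetical column names, in the order the fields appear in the sample text
KEYS = ['type', 'layout', 'area', 'floor', 'orientation', 'decoration']

def parse_overview(plaintext):
    """Split 'label：a|b|c...' on the | separator and return a dict."""
    # Drop the leading label ("房屋概况：") if the full-width colon is present
    label, sep, body = plaintext.partition('：')
    if not sep:
        body = plaintext
    # Split on the separator and trim the whitespace and newlines around each field
    fields = [part.strip() for part in body.split('|')]
    fields = [field for field in fields if field]
    # zip stops at the shorter sequence, so missing trailing fields are simply absent
    return dict(zip(KEYS, fields))

sample = '房屋概况：住宅  |1室1厅1卫|46m²| (高层)/共18层\n\n    |南北\n\n    | 豪华装修\n    '
record = parse_overview(sample)
```

If a listing omits a field in the middle, the later values shift into the wrong keys, so this only holds while the template stays fixed.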

给我你的怀抱 · Answer

I think this HTML is written incorrectly: most of the text content sits outside the tags.

Only two pieces of text are actually inside tags:

House Overview:
46m²

巴扎黑 · Answer

innerText

滿天的星座 · Answer

In your case, if every listing follows this same fixed template, I think a for loop combined with regular expressions is the most convenient approach.
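A minimal sketch of the loop-plus-regex idea, assuming the listing text has already been extracted as plain strings. The patterns, the function name extract_fields, and the key names are all hypothetical and are fitted only to the sample text shown in this thread.

```python
import re

# Hypothetical patterns, written against the sample "1室1厅1卫|46m²| (高层)/共18层"
LAYOUT_RE = re.compile(r'(\d+)室(\d+)厅(\d+)卫')   # bedrooms, living rooms, bathrooms
AREA_RE = re.compile(r'(\d+(?:\.\d+)?)m²')          # floor area in square metres
FLOORS_RE = re.compile(r'共(\d+)层')                 # total number of floors

def extract_fields(listings):
    """Return one dict per listing; a field is omitted when its pattern does not match."""
    records = []
    for text in listings:
        record = {}
        match = LAYOUT_RE.search(text)
        if match:
            record['rooms'], record['halls'], record['baths'] = map(int, match.groups())
        match = AREA_RE.search(text)
        if match:
            record['area_m2'] = float(match.group(1))
        match = FLOORS_RE.search(text)
        if match:
            record['total_floors'] = int(match.group(1))
        records.append(record)
    return records

records = extract_fields(['房屋概况：住宅 |1室1厅1卫|46m²| (高层)/共18层 |南北 | 豪华装修'])
```

Compared with splitting on |, this does not depend on field order, but each pattern has to be adjusted whenever the site changes its wording.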

黄舟 · Answer

Use pyquery:

from pyquery import PyQuery as Q

Q(text).find('.house-info li').text()
