1. When I include the <dl> tag in the pattern, the match comes back empty. How should I write the matching rule? Without the <dl> tag, I get the content I want.
2. Code in question:
pattern = re.compile(r'<dl>.*?<dd><a href="(.*?)">(.*?)</a></dd>.*?</dl>')
3. Without the <dl> tag in the pattern, the match returns the content I want.
4. Web page source code:
<dl>
<dt>《明末工程师》正文</dt>
<dd><a href="/book/1440/xx">第一章 穿越后的窘境</a></dd>
</dl>
黄舟 2017-05-18 10:51:18
import re

# You may need to add a flag:
# re.S makes . match any character, including newlines.
pattern = re.compile(r'<dl>.*?<dd><a href="(.*?)">(.*?)</a></dd>.*?</dl>', re.S)
# a is the page source string
print(re.findall(pattern, a))
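One caveat worth checking: with the <dl>...</dl> wrapper in the pattern, each <dl> block can produce at most one match, so a block with several <dd> links would yield only the first one. The sketch below is one possible way around that, not the only one. It first cuts out each <dl> block and then searches for links inside it. The second <dd> line and the names html, link, wrapped, blocks and links are made up for illustration and are not from the original page.

```python
import re

# Page source from the question, plus one HYPOTHETICAL extra <dd> line
# added only to show the multi-link case.
html = """<dl>
<dt>《明末工程师》正文</dt>
<dd><a href="/book/1440/xx">第一章 穿越后的窘境</a></dd>
<dd><a href="/book/1440/yy">(hypothetical second chapter)</a></dd>
</dl>"""

link = r'<dd><a href="(.*?)">(.*?)</a></dd>'
wrapped = r'<dl>.*?' + link + r'.*?</dl>'

# Without re.S, "." stops at the newline after <dl>, so nothing matches.
no_flag = re.findall(wrapped, html)

# With re.S the pattern matches, but only once per <dl> block.
with_flag = re.findall(wrapped, html, re.S)

# Two-step approach: isolate each <dl> block, then collect every link in it.
blocks = re.findall(r'<dl>(.*?)</dl>', html, re.S)
links = [m for block in blocks for m in re.findall(link, block)]

print(no_flag)
print(with_flag)
print(links)
```

If the page only ever has one link per <dl>, the single pattern with re.S is enough. For anything beyond simple pages, an HTML parser is usually more robust than regular expressions.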