<p class="red" id='123' onclick="do()">
<h1>"哈哈"</h1>
<a href="1" title="123"></a>
</p>
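For example, given the code above, I want to replace every double quote and single quote inside a tag's <> with the text aaa, while leaving quotes everywhere else untouched. How should I write this? The expected result is: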
<p class=aaaredaaa id=aaa123aaa onclick=aaado()aaa>
<h1>"哈哈"</h1>
<a href=aaa1aaa title=aaa123aaa></a>
</p>
巴扎黑2017-04-18 09:18:45
First match each outer <...> tag, then replace the ' and " characters inside it:
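import re

ss = '''
<p class="red" id='123' onclick="do()">
<h1>"哈哈"</h1>
<a href="1" title="123"></a>
</p>
'''

def quoterepl(matchobj):
    # matchobj.group(0) is one whole tag; replace every quote inside it.
    pattern = re.compile(r'\'|"')
    return pattern.sub('aaa', matchobj.group(0))

# Match each tag (<...> with no nested angle brackets) and rewrite its quotes.
print(re.sub(r'<[^<>]+?>', quoterepl, ss))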
伊谢尔伦2017-04-18 09:18:45
Update: I now understand the question. I originally read it as replacing the text inside the quotation marks, but the goal is to replace the quotation marks themselves.
<([^<>]*)['"]([^<>]*)>
Because only the quotation marks change and everything else stays the same, the pattern captures the text around the quotation mark rather than the mark itself. Replace each match with the two captured groups placed around aaa (written $1 and $2 here; Python's re module uses \1 and \2):
<$1aaa$2>
This replaces a quotation mark with aaa. Note that each pass replaces only one quotation mark per tag, so the substitution must be repeated until no match remains. I suggest matching the contents of each tag first:
<([^<>]*=[^<>]*)>
Then, inside each matched tag, match
['"]
and replace it with
aaa
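The two approaches described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the answer does not name a language, so the use of Python's re module, the \1 and \2 back-references, and the helper names replace_by_looping and replace_in_two_steps are all hypothetical choices made for this example.

```python
import re

html = '''<p class="red" id='123' onclick="do()">
<h1>"哈哈"</h1>
<a href="1" title="123"></a>
</p>'''

# Approach 1: a single regex. The pattern consumes the whole tag, so each
# pass rewrites only one quote per tag and has to be repeated.
single = re.compile(r'''<([^<>]*)['"]([^<>]*)>''')

def replace_by_looping(text):
    """Hypothetical helper: repeat the substitution until nothing matches."""
    passes = 0
    while True:
        text, count = single.subn(r'<\1aaa\2>', text)
        if count == 0:
            return text, passes
        passes += 1

# Approach 2: match each tag that contains an attribute, then replace
# every quote inside that tag in one go.
tag = re.compile(r'<([^<>]*=[^<>]*)>')
quote = re.compile(r'''['"]''')

def replace_in_two_steps(text):
    """Hypothetical helper: tag first, quotes second."""
    return tag.sub(lambda m: quote.sub('aaa', m.group(0)), text)

looped, passes = replace_by_looping(html)
two_step = replace_in_two_steps(html)

print(looped == two_step)  # True
print(passes)              # 6, because the <p> tag holds six quotes
print(two_step)
```

Both give the same result on the sample input, but the two-step version needs a single pass, which is why it is the one suggested. Note that the tag pattern requires an = sign, so tags without attributes are skipped entirely.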
--- The following is the original, incorrect answer ---
For example:
class="(.*?)"
This matches class=" followed by as few characters as possible, up to the next double quote.
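To make the "as few characters as possible" behaviour concrete, here is a small Python sketch. The sample tag and the greedy variant class="(.*)" are assumptions added only for contrast; they are not part of the original answer.

```python
import re

# Hypothetical sample tag with two quoted attributes.
sample = '<p class="red" id="123">'

# Lazy quantifier: stops at the first closing double quote.
lazy = re.search(r'class="(.*?)"', sample)

# Greedy quantifier (for contrast): runs to the last double quote in the tag.
greedy = re.search(r'class="(.*)"', sample)

print(lazy.group(1))    # red
print(greedy.group(1))  # red" id="123
```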