Python code:
Page HTML structure:
Return result:
[u'\xe7\x89\x88\xe3\x80\x80\xe3\x80\x80\xe6\x9c\xac\xef\xbc\x9a']
What character encoding is this? How do I convert it back to the version text shown in the HTML?
phpcn_u1582 2017-05-18 11:04:10
This is not a separate encoding. When Python prints a list, it shows the repr of each element, so Chinese characters appear as escape sequences. Try printing the members of the list individually:
print softcontent[0]
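
If printing the element on its own still shows garbage, one possible cause is that the page's UTF-8 bytes were decoded as Latin-1 somewhere along the way, which the u'\xe7\x89\x88...' form in the question suggests. The following is only a sketch under that assumption: the softcontent value is reconstructed from the output in the question, and fix_mojibake is a hypothetical helper name, not part of any library. It is written for Python 3; the same round trip also works on a Python 2 unicode string.

```python
# Assumed value, reconstructed from the output shown in the question.
softcontent = [u'\xe7\x89\x88\xe3\x80\x80\xe3\x80\x80\xe6\x9c\xac\xef\xbc\x9a']


def fix_mojibake(text):
    """Hypothetical helper: undo a UTF-8 -> Latin-1 mis-decoding.

    Each code point below U+0100 is turned back into the single byte
    it came from, and the resulting bytes are decoded as UTF-8.
    """
    return text.encode('latin-1').decode('utf-8')


fixed = fix_mojibake(softcontent[0])
print(fixed)        # the readable text
print(repr(fixed))  # the escaped form, which is what a printed list shows
```

If this is indeed the cause, the cleaner fix is upstream: decode the downloaded page bytes as UTF-8 before parsing, so that the extracted strings are correct from the start.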