Converting Chinese text to URL encoding in Python

高洛峰
2017-03-03 13:07:45

This article shows how to convert Chinese text to its URL-encoded form in Python. The details are as follows:

Today I was working with Baidu Tieba and wanted to keep a list of keywords, appending a new one whenever I needed it. However, when a keyword is Chinese (such as '丽江', Lijiang), the form that appears in the URL is '%E4%B8%BD%E6%B1%9F', so a conversion is required. The urllib module handles this. The sessions in this article are Python 2, where quote and unquote live directly in urllib; in Python 3 they are in urllib.parse.

>>> import urllib
>>> data = '丽江'
>>> print data
丽江
>>> data
'\xe4\xb8\xbd\xe6\xb1\x9f'
>>> urllib.quote(data)
'%E4%B8%BD%E6%B1%9F'
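
For readers on Python 3, the same conversion can be sketched with urllib.parse.quote, which encodes str input as UTF-8 by default. This is a minimal sketch, not the author's code: the `encode_keyword` helper and the `keywords` list are hypothetical names introduced only for illustration.

```python
from urllib.parse import quote


def encode_keyword(keyword):
    """Percent-encode a keyword; quote() uses UTF-8 unless told otherwise."""
    return quote(keyword)


# Hypothetical keyword list: one Chinese entry, one plain ASCII entry.
keywords = ['丽江', 'python']
encoded = [encode_keyword(k) for k in keywords]

# ASCII letters pass through unchanged; each UTF-8 byte becomes %XX.
print(encoded)
```

ASCII keywords are returned unchanged, so the helper can be applied to every entry in the list without checking whether it contains Chinese characters.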

To convert the encoded form back, use urllib.unquote:

>>> urllib.unquote('%E4%B8%BD%E6%B1%9F')
'\xe4\xb8\xbd\xe6\xb1\x9f'
>>> print urllib.unquote('%E4%B8%BD%E6%B1%9F')
丽江
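
In Python 3 the reverse step might look like the sketch below, assuming the encoded text is UTF-8. urllib.parse.unquote returns a str directly, while urllib.parse.unquote_to_bytes returns the raw bytes that the Python 2 session displays.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

encoded = '%E4%B8%BD%E6%B1%9F'

# unquote() decodes the percent-escaped bytes as UTF-8 by default.
decoded = unquote(encoded)
assert decoded == '丽江'

# unquote_to_bytes() stops one step earlier and returns the raw bytes.
raw = unquote_to_bytes(encoded)
print(raw)

# Quoting the decoded text reproduces the original encoded form.
assert quote(decoded) == encoded
```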

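Careful readers will notice that the Tieba URL contains '%C0%F6%BD%AD' rather than '%E4%B8%BD%E6%B1%9F'. This is an encoding difference: Baidu uses GBK, whereas most other sites, such as Google, use UTF-8. To produce either form, first decode the byte string using the terminal's encoding (sys.stdin.encoding) to get a unicode string, then encode it in the target charset before quoting: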

>>> import sys, urllib
>>> s = '丽江'
>>> urllib.quote(s.decode(sys.stdin.encoding).encode('gbk'))
'%C0%F6%BD%AD'
>>> urllib.quote(s.decode(sys.stdin.encoding).encode('utf8'))
'%E4%B8%BD%E6%B1%9F'
>>>
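
In Python 3, strings are already Unicode, so the decode step is not needed, and urllib.parse.quote accepts an encoding argument that selects the target charset. The following is a rough sketch under that assumption; `quote_for_charset` is a hypothetical helper name, not something from the original article.

```python
from urllib.parse import quote, unquote


def quote_for_charset(text, charset):
    """Percent-encode text after encoding it in the given charset."""
    return quote(text, encoding=charset)


s = '丽江'
gbk_form = quote_for_charset(s, 'gbk')     # the form seen in Tieba URLs
utf8_form = quote_for_charset(s, 'utf-8')  # the form most other sites use

print(gbk_form)
print(utf8_form)

# Decoding must use the same charset that was used for encoding.
assert unquote(gbk_form, encoding='gbk') == s
assert unquote(utf8_form, encoding='utf-8') == s
```

The charset has to match on both sides: a GBK-encoded value only round-trips if it is unquoted with encoding='gbk'.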
