How to Decode UTF-8 URL Encoded Strings in Python 2.7?

Barbara Streisand
2024-11-04 07:33:02

URL Decode UTF-8 in Python

Problem: Given a URL whose percent-encoded (URL-quoted) characters represent UTF-8 bytes, how can it be decoded to the intended text in Python 2.7?

Solution:

The URL contains UTF-8 encoded bytes that have been escaped with URL quoting (percent-encoding). Decoding it correctly takes two steps:

  1. URL decoding: Use urllib.unquote() in Python 2 (urllib.parse.unquote() in Python 3) to turn the percent escapes back into the bytes they represent. In Python 2 the result is a byte string; in Python 3, unquote() also decodes those bytes as UTF-8 by default and returns text.
  2. UTF-8 decoding: In Python 2, convert the resulting byte string to a unicode text string explicitly with decode('utf8').
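<code class="python"># -*- coding: utf-8 -*-
from urllib import unquote  # Python 2.7; on Python 3 use: from urllib.parse import unquote

url = 'example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0'
# In Python 2, unquote() returns a byte string, so decode the UTF-8 bytes to unicode text.
# (In Python 3, unquote() already returns text, so the .decode() call must be dropped.)
decoded_url = unquote(url).decode('utf8')

print(decoded_url)  # Output: example.com?title=правовая+защита</code>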
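
The decoded output above still contains a literal +, because unquote() only expands percent escapes. If the value comes from a form-encoded query string, where + stands for a space, unquote_plus() from the same module may be a better fit. The following is a minimal sketch, written to import on both Python 2.7 and Python 3; the variable names are illustrative and not part of the original example:

```python
# -*- coding: utf-8 -*-
try:
    from urllib.parse import unquote, unquote_plus  # Python 3
except ImportError:
    from urllib import unquote, unquote_plus  # Python 2.7

# Illustrative value: the title parameter taken from the example URL.
quoted = '%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0'

with_plus = unquote(quoted)        # '+' is left untouched
with_space = unquote_plus(quoted)  # '+' is turned into a space

# Python 2.7 returns byte strings here, so decode them as UTF-8.
# Python 3 already returns text, so this branch is skipped.
if isinstance(with_plus, bytes):
    with_plus = with_plus.decode('utf8')
    with_space = with_space.decode('utf8')

# with_plus  is now the text: правовая+защита
# with_space is now the text: правовая защита
```

Which of the two functions is appropriate depends on how the URL was produced; a + in a path segment is usually a real plus sign, while in form data it usually encodes a space.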

The percent escapes are first turned back into UTF-8 bytes, and those bytes are then decoded to text, so the result is a proper text string rather than raw bytes.
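
For code that has to run on both Python 2.7 and Python 3, the two steps can be wrapped in a small helper. This is only a sketch under stated assumptions: the function name url_decode_utf8 is hypothetical, and on Python 2 the input is assumed to be a plain byte str, not a unicode object.

```python
import sys

if sys.version_info[0] >= 3:
    from urllib.parse import unquote
else:
    from urllib import unquote  # Python 2.7


def url_decode_utf8(quoted):
    """Return text for a percent-encoded string whose escapes are UTF-8 bytes.

    Hypothetical helper for illustration. Assumes `quoted` is a native str
    (a byte string on Python 2, a text string on Python 3).
    """
    result = unquote(quoted)
    # Python 2: unquote() of a byte string returns bytes, so decode explicitly.
    # Python 3: unquote() already returns text decoded as UTF-8, so the
    # isinstance check is False and the value is returned unchanged.
    if isinstance(result, bytes):
        result = result.decode('utf-8')
    return result


decoded = url_decode_utf8('example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE')
```

The isinstance check works because bytes is an alias for str on Python 2.7, so the same line selects the explicit decode there and is skipped on Python 3.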
