
How to Decode UTF-8 Encoded URLs in Python?

Linda Hamilton
2024-11-04 06:51:02

Decoding UTF-8 Encoded URLs in Python

In Python, decoding a URL encoded with UTF-8 is straightforward. Consider a URL string like "example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0" that needs to be decoded to "example.com?title=правовая+защита".

The key to decoding such URLs is understanding how they were encoded. Here, the text was first encoded to UTF-8 bytes, and those bytes were then escaped with URL quoting (percent-encoding). To reverse this, use Python's urllib.parse.unquote() function, which converts the percent-escapes back into bytes and decodes those bytes as UTF-8 text (UTF-8 is its default encoding).

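<code class="python">from urllib.parse import unquote

# Percent-encoded UTF-8 input
url = "example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0"
url = unquote(url)  # percent-escapes -> UTF-8 bytes -> str</code>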

This code will decode the URL to its intended form:

example.com?title=правовая+защита
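Note that unquote() leaves the "+" untouched. In form-encoded query strings, "+" usually stands for a space. The following is a minimal sketch, assuming the title parameter was form-encoded; the variable names are illustrative only. It uses unquote_plus() and parse_qs() from urllib.parse to recover the space as well:

```python
from urllib.parse import parse_qs, unquote_plus, urlsplit

# Percent-encoded example URL (assumed to use "+" for a space in the query)
encoded = (
    "example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F"
    "+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0"
)

# unquote_plus() decodes percent-escapes and also turns "+" into a space.
decoded = unquote_plus(encoded)

# parse_qs() splits the query string into parameters and decodes each
# value the same way, returning a dict of lists.
query = urlsplit(encoded).query
title = parse_qs(query)["title"][0]

print(decoded)
print(title)
```

Whether "+" should become a space depends on how the URL was produced, so plain unquote() remains the safer choice when a literal plus sign is possible.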

In Python 2, the equivalent function is urllib.unquote(). It returns a byte string, which must then be decoded manually:

<code class="python">from urllib import unquote

url = unquote(url).decode('utf8')</code>
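The same two steps can be written out explicitly in Python 3, which may help when the intermediate bytes are needed or the encoding is not UTF-8. This is a sketch rather than a recommended replacement for unquote(); the variable names are illustrative:

```python
from urllib.parse import unquote, unquote_to_bytes

encoded = (
    "example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F"
    "+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0"
)

# Step 1: percent-escapes -> raw bytes (still UTF-8 encoded).
raw = unquote_to_bytes(encoded)

# Step 2: UTF-8 bytes -> str, mirroring the manual .decode() in Python 2.
text = raw.decode("utf-8")

# unquote() performs both steps in a single call.
print(text == unquote(encoded))
```

One difference worth knowing: unquote() defaults to errors='replace', so invalid byte sequences become replacement characters, whereas the explicit raw.decode("utf-8") raises UnicodeDecodeError on malformed input.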

In short, use urllib.parse.unquote() in Python 3, or urllib.unquote() followed by .decode('utf8') in Python 2, to turn a percent-encoded UTF-8 URL back into readable text.
