Reading and Writing Unicode Text Files in Python
Python offers several ways to handle Unicode characters in text files. A common source of errors is converting between bytes and Unicode strings by hand with the encode() and decode() methods each time a file is read or written.
To overcome this challenge, it's recommended to specify the file encoding when opening it. With the introduction of the io module in Python 2.6, the io.open function became available, allowing us to specify the desired encoding:
<code class="python">import io

# Assuming the file is encoded in UTF-8
f = io.open("test", mode="r", encoding="utf-8")
unicodeString = f.read()</code>
In Python 3.x, the io.open function is an alias for the built-in open function, eliminating the need for an import.
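As a minimal sketch of the Python 3 form, the snippet below uses the built-in open with a with block so the file is closed automatically. The temporary path and the sample text are assumptions made only so the example runs on its own; they are not part of the original article.

```python
import io
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "test")

# Write some non-ASCII text with the built-in open (no import needed for this).
with open(path, mode="w", encoding="utf-8") as f:
    f.write("naïve résumé\n")

# Read it back; the encoding argument makes read() return decoded text.
with open(path, mode="r", encoding="utf-8") as f:
    unicode_string = f.read()

# In Python 3, io.open and the built-in open are the same function object.
same_function = io.open is open
```

The with statement is a stylistic choice here; calling f.close() explicitly works as well.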
Another option is to use open() from the codecs standard library module:
<code class="python">import codecs

f = codecs.open("test", "r", "utf-8")
unicodeString = f.read()</code>
However, codecs.open() is known to cause problems when read() and readline() calls are mixed on the same file object, which is one reason to prefer io.open.
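For comparison, here is a small sketch of mixing readline() and read() on a file opened with io.open, where the two calls cooperate as expected. The file name and its three-line content are hypothetical, chosen only for illustration.

```python
import io
import os
import tempfile

# Hypothetical three-line sample file.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with io.open(path, mode="w", encoding="utf-8") as f:
    f.write(u"first\nsecond\nthird\n")

with io.open(path, mode="r", encoding="utf-8") as f:
    first_line = f.readline()  # consumes only the first line
    rest = f.read()            # returns everything after that line
```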
To write Unicode strings to a file in UTF-8 encoding, you can use the following code snippet:
<code class="python">import io

# assumes unicodeString is a Unicode string
outputFile = io.open("output.txt", mode="w", encoding="utf-8")
outputFile.write(unicodeString)
outputFile.close()</code>
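One way to check that the write produced UTF-8 is to read the file back twice: once in binary mode to inspect the raw bytes, and once in text mode to confirm the round trip. This is only a sketch; the sample string and the temporary output path are assumptions for the example.

```python
import io
import os
import tempfile

# Hypothetical sample text containing non-ASCII characters.
unicode_string = u"Gr\u00fc\u00dfe, \u4e16\u754c"
output_path = os.path.join(tempfile.mkdtemp(), "output.txt")

# Write the text; io.open encodes it to UTF-8 on the way out.
with io.open(output_path, mode="w", encoding="utf-8") as output_file:
    output_file.write(unicode_string)

# Binary mode shows the bytes that actually landed on disk.
with io.open(output_path, mode="rb") as raw_file:
    raw_bytes = raw_file.read()

# Text mode with the same encoding recovers the original string.
with io.open(output_path, mode="r", encoding="utf-8") as input_file:
    round_trip = input_file.read()
```

The sample string deliberately has no newline, so the bytes on disk are the same on every platform.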
By following these guidelines, you can ensure that Unicode characters are handled correctly when reading and writing text files in Python.