Solution to the problem of garbled Chinese file names in Python 2
Python 2 does not handle Chinese text in source files by default, because the default source encoding is ASCII. The usual fix is to add # -*- coding: utf-8 -*- at the top of the program, but when I opened a file with open(), the Chinese file name still came out as garbled characters.
Let me first explain how encoding works in Python 2. Strings come in roughly two forms: str and unicode. A str holds an array of bytes in some particular encoding, commonly utf-8, gb2312, or gbk; how those bytes appear when written to a file or printed depends entirely on which encoding is used to decode them. Python 2 uses unicode as the intermediate type when converting between encodings. Unicode itself is an abstract character set: it assigns each symbol a code point but does not specify how that code point is stored as bytes. A unicode object is therefore only an internal representation and cannot be saved directly, so a storage encoding such as utf-8 has to be chosen when writing it out.
The methods that perform encoding conversion in Python are:
decode(char_set): decodes a str encoded in char_set into Unicode
encode(char_set): encodes Unicode into a str encoded in char_set
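As a minimal sketch of the two conversions, the snippet below decodes UTF-8 bytes into Unicode and re-encodes the result as GBK. The sample text (the two characters "中文") and the variable names are assumptions chosen for illustration; the bytes are written as explicit escapes so the snippet does not depend on how the source file is saved.

```python
# -*- coding: utf-8 -*-

# Byte string: the UTF-8 encoding of the two characters "中文" (sample text).
utf8_bytes = b'\xe4\xb8\xad\xe6\x96\x87'

# decode(): bytes in a known encoding -> Unicode text
text = utf8_bytes.decode('utf-8')

# encode(): Unicode text -> bytes in a chosen encoding
gbk_bytes = text.encode('gbk')

# Same characters, different byte representations.
print('characters: %d, utf-8 bytes: %d, gbk bytes: %d'
      % (len(text), len(utf8_bytes), len(gbk_bytes)))

# Decoding with the wrong encoding is what produces garbled characters:
# here the UTF-8 bytes are misread as GBK.
garbled = utf8_bytes.decode('gbk', 'replace')
print(garbled != text)
```

Converting between two byte encodings always goes through Unicode in the middle: decode with the encoding the bytes are actually in, then encode with the target encoding.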
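Looking at the Python documentation, you will find that for open(filename, 'w') the filename argument should be passed as a Unicode object. A plain str is handed to the operating system as raw bytes, so UTF-8 bytes are misread on a system whose file name encoding is different, for example GBK on Chinese Windows.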
Because I had added # -*- coding: utf-8 -*- at the top, string literals in the source file are UTF-8 encoded str values. Before passing the variable filename to open(), it therefore needs to be decoded into Unicode.
For example, with filename = '中文.txt', write open(filename.decode('utf-8'), 'w') when calling open(), and the Chinese file name that gets created will no longer be garbled.
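The fix can be sketched as follows. The file name "中文.txt", the scratch directory, and the variable names are assumptions for illustration. The name is written as explicit UTF-8 byte escapes, which is what the plain literal '中文.txt' holds in a Python 2 source file saved as UTF-8, so the same snippet also runs on Python 3.

```python
# -*- coding: utf-8 -*-
import os
import shutil
import tempfile

# UTF-8 bytes for the hypothetical file name "中文.txt".
filename = b'\xe4\xb8\xad\xe6\x96\x87.txt'

# Decode to Unicode before handing the name to open().
unicode_name = filename.decode('utf-8')

# Use a scratch directory so the sketch leaves nothing behind.
# (Under Python 2 this join assumes the temp directory path is ASCII.)
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, unicode_name)

# Create the file using the Unicode name, as in the article's fix.
with open(path, 'w') as f:
    f.write('hello')

# Check that the file exists under the Unicode name, then clean up.
created = os.path.exists(path)
print(created)
shutil.rmtree(workdir)
```

The decoding step uses 'utf-8' only because that is the encoding declared for the source file; if the bytes come from elsewhere, such as a GBK console, decode with that encoding instead.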