
Fixing garbled file names when extracting zip files on Linux

黄舟 (Original)
2017-06-02 10:25:19

This article describes how to fix garbled file names when extracting zip files on Linux. It may be a useful reference for anyone who runs into the same problem.

Cause

The zip format traditionally does not record which encoding its file names use, so zip files created on Windows store file names in the system's local encoding, such as GBK or GB2312. Linux defaults to UTF-8, so those file names come out garbled when the archive is extracted there.
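The mismatch can be reproduced without any zip file. The sketch below is only an illustration and assumes iconv is installed; gbk_name is a hypothetical helper that emits the four bytes GBK uses for the name 中文.

```shell
# gbk_name is a hypothetical helper: it prints the bytes D6 D0 CE C4,
# which is how GBK encodes the two-character name "中文".
gbk_name() {
    printf '\326\320\316\304'
}

# Decoding the bytes as GBK and re-encoding them as UTF-8 recovers the name.
gbk_name | iconv -f GBK -t UTF-8
echo

# Read as UTF-8, the same bytes are not a valid sequence, so iconv rejects
# them; a terminal or file manager would show garbled characters instead.
if gbk_name | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    echo "raw bytes are valid UTF-8"
else
    echo "raw bytes are not valid UTF-8"
fi
```

The first command should print 中文, and the check should report that the raw bytes are not valid UTF-8. Both solutions in this article amount to telling the tools that the stored names are GBK.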

Solution 1

Use 7z to decompress.

Install p7zip and convmv

# fedora
$ su -c 'yum install p7zip convmv'
# ubuntu (the 7za binary is shipped in p7zip-full)
$ sudo apt-get install p7zip-full convmv

Run the following commands to extract the archive and fix the file names

# Extract with 7z
$ LANG=C 7za x your-zip-file.zip
# Recursively convert the file names from GBK to UTF-8
$ convmv -f GBK -t utf8 --notest -r .
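Because convmv -r . renames everything under the current directory, it can be safer to extract into a directory of its own and preview the renames first. The following is a minimal sketch under that assumption; extract_gbk_zip and the default directory name extracted are hypothetical, not part of p7zip or convmv.

```shell
# extract_gbk_zip is a hypothetical helper, not part of p7zip or convmv.
# Usage: extract_gbk_zip ARCHIVE [DEST_DIR]
extract_gbk_zip() {
    archive=$1
    dest=${2:-extracted}   # assumed default directory name

    mkdir -p "$dest" || return 1

    # Extract into DEST_DIR only; 7za expects the directory glued to -o.
    LANG=C 7za x -o"$dest" "$archive" || return 1

    # Without --notest, convmv only prints the renames it would perform.
    convmv -f GBK -t utf8 -r "$dest" || return 1

    # With --notest, the renames are actually applied.
    convmv -f GBK -t utf8 --notest -r "$dest"
}
```

For example, extract_gbk_zip your-zip-file.zip out leaves the converted files under out and does not touch other files in the current directory.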

Solution 2

Files zipped on Windows have their names stored in the system's default encoding, which on a Chinese system is a Chinese code page. Since the zip file does not declare that encoding, unzip on Linux generally decodes the names with its own default encoding, and Chinese file names come out garbled.

Although someone reported this as a bug in 2005, the Info-ZIP project has not put automatic encoding detection on its roadmap; perhaps the maintainers do not consider it a problem. Sun took the same stance toward the zip encoding problem that persisted in Java for years.

There are two ways to solve the problem:

1. Specify the character set on the unzip command line

unzip -O CP936 xxx.zip

GBK or GB18030 can be used in place of CP936.

Interestingly, the unzip manual does not describe this option, while unzip --help has one short line explaining it.
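Since the option is undocumented in the manual, it may be worth checking that the installed unzip accepts it before relying on it. The sketch below assumes the help text advertises the option as "-O CHARSET"; that wording, and the helper names unzip_supports_charset and unzip_gbk, are assumptions for illustration.

```shell
# unzip_supports_charset is a hypothetical helper. It assumes that a build
# with the option lists it as "-O CHARSET" in its help output.
unzip_supports_charset() {
    unzip -h 2>&1 | grep -q -e '-O CHARSET'
}

# unzip_gbk is also hypothetical: it passes -O CP936 when the option is
# available and stops otherwise, rather than extracting garbled names.
unzip_gbk() {
    if unzip_supports_charset; then
        unzip -O CP936 "$@"
    else
        echo "this unzip build has no -O option; try 7za and convmv instead" >&2
        return 1
    fi
}
```

With these definitions, unzip_gbk xxx.zip behaves like the command above on builds that have the option and fails with a message on builds that do not.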

2. Set the unzip options in environment variables, so that archives are always listed and extracted using the specified character set

Add the following two lines to /etc/environment:

UNZIP="-O CP936"
ZIPINFO="-O CP936"

With these variables set, the GNOME archive manager (file-roller) can extract archives with Chinese file names correctly through unzip, even though file-roller itself has no way to set an encoding and pass it to unzip.
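Editing /etc/environment by hand works, but the change can also be scripted so that running it twice does not duplicate the lines. This is a minimal sketch; add_unzip_defaults is a hypothetical helper, and the file path is a parameter so it can be tried on a scratch file first.

```shell
# add_unzip_defaults is a hypothetical helper. It appends each line only if
# the file does not already contain it, so repeated runs are harmless.
# Usage: add_unzip_defaults [FILE]   (FILE defaults to /etc/environment)
add_unzip_defaults() {
    env_file=${1:-/etc/environment}
    for line in 'UNZIP="-O CP936"' 'ZIPINFO="-O CP936"'; do
        # -x matches whole lines, -F treats the pattern as a fixed string.
        grep -qxF -e "$line" "$env_file" 2>/dev/null ||
            printf '%s\n' "$line" >> "$env_file" || return 1
    done
}
```

Writing to /etc/environment requires root, and the variables are normally picked up at the next login. To try the setting in the current shell only, export UNZIP="-O CP936" before running unzip should have the same effect for that session.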
