


ZipInputStream fails to decompress Chinese file names? How to set the character set correctly
ZipInputStream decompression of Chinese file names and solutions
Developers often run into character encoding problems when using ZipInputStream to extract Zip archives that contain Chinese file or folder names. Extraction fails with an error such as "malformed input off : 1, length : 1". This article analyzes the cause and provides a solution.
The root of the problem is the character set used to decode entry names. The charset passed to the ZipInputStream constructor has nothing to do with the contents of the compressed files. It is used to decode the file names and comments stored in the entry headers of the Zip file itself, and it defaults to UTF-8 when none is given. How those names were encoded depends on the tool and operating system that created the archive. Tools on Chinese-locale Windows systems usually write names in GBK (or its subset GB2312), while macOS and Linux tools usually write UTF-8.
Therefore, if a Zip archive was created on Windows, its entry names are likely to be GBK-encoded. If the code specifies UTF-8, or relies on the UTF-8 default, ZipInputStream cannot decode the GBK bytes as UTF-8 and extraction fails.
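The mismatch described above can be reproduced in memory with the JDK alone. The following is a minimal sketch, not a definitive test: the class name GbkNameDemo and its helper methods are made up for illustration, and the exact exception type and message may differ between JDK versions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; the names are illustrative only.
final class GbkNameDemo {

    /** Builds an in-memory zip with one entry whose name is encoded with the given charset. */
    static byte[] zipWithName(String name, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes, charset)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write("hello".getBytes(StandardCharsets.US_ASCII));
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the name of the first entry, decoding it with the given charset. */
    static String firstEntryName(byte[] zipBytes, Charset charset) {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes), charset)) {
            ZipEntry entry = zis.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // A two-character Chinese name (U+4E2D U+6587) followed by ".txt".
        byte[] zip = zipWithName("\u4e2d\u6587.txt", gbk);

        try {
            // The GBK bytes of the name are not valid UTF-8, so decoding fails here.
            System.out.println(firstEntryName(zip, StandardCharsets.UTF_8));
        } catch (RuntimeException e) {
            System.out.println("UTF-8 decoding failed: " + e);
        }
        // With the matching charset the name is decoded correctly.
        System.out.println(firstEntryName(zip, gbk));
    }
}
```

The archive bytes are identical in both reads. Only the charset passed to the constructor changes, which is why the fix is a one-argument change.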
Solution:
For Zip archives created on Windows systems, pass GBK as the charset so that the entry names are decoded correctly:
FileInputStream input = new FileInputStream(targetPath);
ZipInputStream zipInputStream = new ZipInputStream(new BufferedInputStream(input), Charset.forName("GBK"));
Since GBK is a superset of GB2312, this also works for names that were encoded in GB2312.
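For completeness, the two lines above can be extended into a full extraction loop. This is a sketch under assumptions: the class name GbkUnzip and the method extract are hypothetical, and the check against entries that would land outside the target directory is an addition that the original snippet does not cover.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper class; the names are illustrative only.
final class GbkUnzip {

    /** Extracts zipFile into destDir, decoding entry names with the given charset. */
    static void extract(Path zipFile, Path destDir, Charset charset) throws IOException {
        Path root = destDir.toAbsolutePath().normalize();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(zipFile));
             ZipInputStream zis = new ZipInputStream(in, charset)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                Path target = root.resolve(entry.getName()).normalize();
                // Reject names such as "../x" that would escape the target directory.
                if (!target.startsWith(root)) {
                    throw new IOException("Entry outside target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                } else {
                    Files.createDirectories(target.getParent());
                    // The stream yields only this entry's data until the next getNextEntry() call.
                    Files.copy(zis, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

A call might look like GbkUnzip.extract(Paths.get("archive.zip"), Paths.get("out"), Charset.forName("GBK")), where both paths are placeholders. The try-with-resources block closes the streams even when decoding or writing fails.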
Cross-platform solutions:
To write more robust cross-platform code, the Apache Commons Compress library is recommended. It offers richer compression and decompression support than java.util.zip. In particular, it honors the UTF-8 language encoding flag and the Unicode extra fields that some tools write alongside legacy-encoded names, which avoids many extraction failures caused by character set differences. For archives that carry neither hint, the encoding still has to be specified explicitly, just as with ZipInputStream.
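Where adding a dependency is not an option, a JDK-only fallback can approximate cross-platform handling. The sketch below is not how Commons Compress works. It is a simple heuristic that tries candidate charsets in order and keeps the first one that decodes every entry name. The class name ZipCharsetProbe and the method detect are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.List;
import java.util.zip.ZipInputStream;

// Hypothetical helper class; the names are illustrative only.
final class ZipCharsetProbe {

    /** Returns the first candidate that decodes every entry name, or null if none does. */
    static Charset detect(byte[] zipBytes, List<Charset> candidates) {
        for (Charset cs : candidates) {
            try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes), cs)) {
                while (zis.getNextEntry() != null) {
                    // Only the names matter here; entry data is skipped.
                }
                return cs;
            } catch (IllegalArgumentException | IOException e) {
                // The name bytes are not valid in this charset (or the archive is
                // unreadable); move on to the next candidate.
            }
        }
        return null;
    }
}
```

The order of candidates matters. UTF-8 should come first, for example Arrays.asList(StandardCharsets.UTF_8, Charset.forName("GBK")), because GBK bytes are rarely valid UTF-8, whereas UTF-8 bytes often decode as GBK without error and produce garbled names. The result is a best guess, not a guarantee.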