I am using Python 3.5.2. I am trying to use the zipfile module's zipfile.ZipFile.open method to open a text file inside a zip archive. Even when I pass the rU mode described in the documentation, the file is still opened as binary data, which is puzzling.
Code:
>>> import zipfile
>>> zf = zipfile.ZipFile('/Users/chiqingjun/Downloads/top-1m.csv.zip')
>>> zf.namelist()
['top-1m.csv']
>>> f = zf.open(zf.namelist()[0], mode='rU')
>>> f
<zipfile.ZipExtFile name='top-1m.csv' mode='rU' compress_type=deflate>
>>> f.readline()
b'1,google.com\n'
# still binary data
Official documentation (version 3.5.2):
巴扎黑 2017-06-22 11:53:42
The bytes output is not a problem with zipfile itself. It comes from how Python 3 (here 3.5) separates bytes from str: the object returned by ZipFile.open reads bytes. Decode the result to get a string:
content = f.readline()
print(content.decode('utf8'))
女神的闺蜜爱上我 2017-06-22 11:53:42
As the documentation says, rU only enables universal newlines handling. It does not switch the file to text mode, and this mode will be removed in 3.6.
The contents of a compressed file have to be read as bytes. How to decode them afterwards is up to the programmer.
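One way to get text lines without calling decode on every line is to wrap the binary stream in io.TextIOWrapper, which, as far as I know, is what the zipfile documentation suggests as the replacement for rU. The sketch below is only an illustration: it builds a small in-memory archive as a stand-in for top-1m.csv.zip (the second row is made up), and it assumes the file is UTF-8 encoded.

```python
import io
import zipfile

# Build a small in-memory archive so the example is self-contained.
# This is a stand-in for top-1m.csv.zip; the rows are illustrative only.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("top-1m.csv", "1,google.com\r\n2,youtube.com\r\n")

with zipfile.ZipFile(buf) as zf:
    name = zf.namelist()[0]
    # zf.open() returns a binary stream. TextIOWrapper decodes it and,
    # with the default newline=None, translates \r\n and \r to \n.
    # The encoding is an assumption; use whatever the file really uses.
    with zf.open(name) as raw, io.TextIOWrapper(raw, encoding="utf-8") as text:
        first_line = text.readline()          # a str, not bytes
        remaining = text.read().splitlines()  # rest of the file as str lines

print(repr(first_line))
print(remaining)
```

With a real archive, replace buf with the path to the zip file. The wrapper gives both decoding and universal newlines, so the rU mode is not needed.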