
Python Tutorial: How to split and merge large files using Python?

WBOY
2023-04-22 11:43:08

Sometimes we need to send a large file to someone, but the transmission channel gets in the way: email attachments have a size limit, or the network connection is unreliable. In that case we can split the large file into smaller files, send them separately, and have the receiving end merge them back together. This article shows how to split and merge large files using Python.

Ideas and Implementation

A text file can be split by number of lines. Any file, whether text or binary, can be split by a specified size in bytes.

Python's built-in file reading and writing is enough to split and merge files. To split, set the size of each part, then repeatedly read that many bytes and write them into a new file. To merge, the receiving end reads the small files in order and writes their bytes into a single file.

Split

size = 1024 * 1000 * 10  # about 10 MB per part
with open("bigfile", "rb") as reader:
    part = 1
    while True:
        part_content = reader.read(size)
        if not part_content:
            print("split done.")
            break
        with open(f"bigfile_part{part}", "wb") as writer:
            writer.write(part_content)
        part += 1  # move on to the next part number
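For text files, the line-count idea mentioned earlier could be sketched roughly as follows. The function name `split_by_lines` and the `<name>_part<N>` naming scheme are assumptions made for illustration, not part of any library. The file is opened in binary mode so that line endings and encoding are preserved byte for byte.

```python
from pathlib import Path


def split_by_lines(source, out_dir, lines_per_part):
    """Split a file into parts of at most `lines_per_part` lines each.

    Hypothetical helper: returns the list of part paths in order.
    """
    if lines_per_part < 1:
        raise ValueError("lines_per_part must be at least 1")
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    parts = []
    writer = None
    try:
        # Binary mode keeps line endings and encoding intact.
        with open(source, "rb") as reader:
            for index, line in enumerate(reader):
                # Start a new part every `lines_per_part` lines.
                if index % lines_per_part == 0:
                    if writer is not None:
                        writer.close()
                    part_path = out_dir / f"{source.name}_part{len(parts) + 1}"
                    writer = open(part_path, "wb")
                    parts.append(part_path)
                writer.write(line)
    finally:
        if writer is not None:
            writer.close()
    return parts
```

Because the file is iterated line by line, only one line is held in memory at a time, which matters when the source file is very large.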

Merge

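total_parts = 5  # number of parts produced by the split step
with open("bigfile", "wb") as writer:
    # Parts are numbered from 1, matching the split code.
    for i in range(1, total_parts + 1):
        with open(f"bigfile_part{i}", "rb") as reader:
            writer.write(reader.read())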
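If the receiving end does not know the number of parts in advance, one possible variation is to discover them from the folder. The sketch below assumes the parts are named `<prefix><number>` (for example `bigfile_part3`); `merge_parts` is a hypothetical helper name. It sorts by the numeric suffix, since a plain alphabetical sort would place `bigfile_part10` before `bigfile_part2`.

```python
import re
import shutil
from pathlib import Path


def merge_parts(parts_dir, prefix, target):
    """Concatenate files named `<prefix><number>` into `target`, in numeric order.

    Hypothetical helper: returns the list of part paths that were merged.
    """
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    numbered = []
    for path in Path(parts_dir).iterdir():
        match = pattern.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    # Sort by the numeric suffix so that part10 comes after part9.
    numbered.sort()

    with open(target, "wb") as writer:
        for _, path in numbered:
            with open(path, "rb") as reader:
                # copyfileobj streams in chunks rather than loading a whole part.
                shutil.copyfileobj(reader, writer)
    return [path for _, path in numbered]
```

A call such as `merge_parts(".", "bigfile_part", "bigfile")` would rebuild the file from the parts in the current directory.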

Use a third-party library

You can write this yourself, but someone else already has, so why not save some time and use it directly? Install it with pip:

pip install filesplit

Split

from filesplit.split import Split
split = Split("./data.rar", "./output")
split.bysize(size = 1024*1000*10) # at most 10 MB per file

After execution, the split files appear in the output folder.

You can also split according to the number of file lines:

split.bylinecount(linecount = 10000) # at most 10000 lines per file

Merge

Merging requires the small files to be in one folder, and the tool requires a manifest file in that same folder. Its format is as follows:

filename,filesize,header
data_1.rar,10000000,False
data_2.rar,10000000,False
data_3.rar,10000000,False
data_4.rar,10000000,False
data_5.rar,1304145,False
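Before merging, the receiving end may want to confirm that every part listed in the manifest has arrived intact. The following is a minimal sketch using only the standard library; `check_manifest` is a hypothetical helper, and the manifest file name `manifest` is an assumption that should be adjusted to whatever name the split step actually produced.

```python
import csv
import os


def check_manifest(folder, manifest_name="manifest"):
    """Compare the manifest against the files on disk.

    Hypothetical helper: returns a list of problem descriptions,
    which is empty when every listed part exists with the listed size.
    """
    problems = []
    manifest_path = os.path.join(folder, manifest_name)
    with open(manifest_path, newline="") as handle:
        # The header row (filename,filesize,header) supplies the keys.
        for row in csv.DictReader(handle):
            path = os.path.join(folder, row["filename"])
            if not os.path.isfile(path):
                problems.append(f"missing: {row['filename']}")
            elif os.path.getsize(path) != int(row["filesize"]):
                problems.append(f"size mismatch: {row['filename']}")
    return problems
```

An empty result suggests that all parts were received with the expected sizes; it does not prove that their contents are unchanged.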

To merge the files, specify only the directory to merge from, the target directory, and the merged file name:

from filesplit.merge import Merge
merge = Merge(inputdir = "./output", outputdir="./merge", outputfilename = "merged.rar")
merge.merge()

After execution, the merged file appears in the merge directory.
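To check that the merged file is identical to the original, one common approach is to compare checksums computed on both ends. The sketch below uses the standard `hashlib` module; `sha256_of` is a hypothetical helper name, and the choice of SHA-256 is an assumption rather than something the filesplit library requires.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as reader:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: reader.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The sender would compute `sha256_of("./data.rar")` and share the result along with the parts; the receiver would compare it with `sha256_of("./merge/merged.rar")`. Matching digests indicate that the merge reproduced the original file.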


Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it removed.