Downloading large files in Python: which method is faster?

王林
2023-04-14 21:19:01

The requests library is the usual choice for downloading files in Python because it is so convenient to use.

Method 1

The streaming code below keeps Python's memory usage flat no matter how large the downloaded file is:

import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # Note: pass stream=True so the body is not loaded into memory at once
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            # Write the body to disk 8 KiB at a time
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)
    return local_filename
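The expression url.split('/')[-1] keeps any query string in the file name and returns an empty string for a URL that ends in a slash. A minimal sketch of a more defensive alternative follows; the helper name filename_from_url and the fallback name download.bin are assumptions made for illustration, not part of the original code.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="download.bin"):
    """Return the last path segment of url, ignoring query and fragment."""
    # urlsplit separates the path from "?query" and "#fragment"
    path = urlsplit(url).path
    # unquote turns percent-escapes such as %20 back into characters
    name = posixpath.basename(unquote(path))
    # Fall back to a fixed name when the path ends in "/"
    return name or default


name = filename_from_url("https://example.com/files/big.iso?token=abc")
```

Inside download_file, local_filename = filename_from_url(url) could then replace the split-based line.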

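If the response uses chunked transfer encoding, set chunk_size to None instead of a fixed size (leaving the argument out falls back to the default of 1 byte per iteration), and add an if check to skip empty chunks. The version below also decodes the body to text as it is written:

import codecs

def download_file(url):
    local_filename = url.split('/')[-1]
    # An incremental decoder copes with characters split across two chunks
    decoder = codecs.getincrementaldecoder('utf-8')()
    # Note: pass stream=True
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'w', encoding='utf-8') as f:
            # chunk_size=None yields the data in whatever sizes it arrives
            for chunk in r.iter_content(chunk_size=None):
                if chunk:
                    f.write(decoder.decode(chunk))
            # Flush any bytes still buffered in the decoder
            f.write(decoder.decode(b'', final=True))
    return local_filename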

iter_content[1] can also do the decoding itself: just pass decode_unicode=True. This only takes effect when the response has a known encoding (r.encoding is not None); otherwise the chunks are still returned as bytes.
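Decoding each chunk on its own with chunk.decode("utf-8") can raise UnicodeDecodeError when a multi-byte character happens to straddle two chunks. The sketch below shows one way around this with the standard library's incremental decoder. It runs on an in-memory list of byte chunks rather than a real response, and the helper name decode_chunks is an assumption made for illustration.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Yield text for each bytes chunk, buffering incomplete characters."""
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush the decoder; this raises if the stream ended mid-character
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


data = "下载大文件".encode("utf-8")  # five characters, three bytes each
# Four-byte chunks deliberately cut through the middle of characters
chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
text = "".join(decode_chunks(chunks))
```

With a real response, the chunks argument would be the iterator returned by r.iter_content(...).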

Note that the number of bytes iter_content returns per iteration is not guaranteed to equal chunk_size: it is often larger and can vary from one iteration to the next.

Method 2

Use Response.raw[2] together with shutil.copyfileobj[3]:

import requests
import shutil

def download_file(url):
    local_filename = url.split('/')[-1]
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            # Copy the raw response stream straight into the file
            shutil.copyfileobj(r.raw, f)
    return local_filename

This streams the file to disk without using too much memory, and the code is simpler.
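shutil.copyfileobj reads from the source and writes to the destination in blocks, and its optional length argument sets the block size. The sketch below uses in-memory streams as stand-ins for r.raw and the output file so that it runs without a network connection. The 1 MiB value is an assumption for illustration, not a recommendation from the original article.

```python
import io
import shutil

CHUNK = 1024 * 1024  # 1 MiB buffer; an assumed value, tune it for your workload

source = io.BytesIO(b"x" * (3 * CHUNK + 123))  # stand-in for r.raw
target = io.BytesIO()                          # stand-in for the output file

# Copy in CHUNK-sized blocks until the source is exhausted
shutil.copyfileobj(source, target, length=CHUNK)
copied = len(target.getvalue())
```

In download_file the equivalent call would be shutil.copyfileobj(r.raw, f, length=CHUNK).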

Note: according to the documentation, Response.raw does not decode the response body (for example gzip or deflate content encoding), so if you need decoded content, replace the r.raw.read method before copying (this requires import functools):

r.raw.read = functools.partial(r.raw.read, decode_content=True)

Speed

Method 2 is faster: where Method 1 manages 2-3 MB/s, Method 2 can reach nearly 40 MB/s. The exact figures depend on the network, the disk and the chunk size.
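To check figures like these on your own connection, a small timing helper is enough. The following is a minimal sketch, not the measurement used for the numbers above. The name measure_throughput is hypothetical, and it assumes the download function returns the path of the saved file, as download_file does.

```python
import os
import time


def measure_throughput(download, *args):
    """Run download(*args) and return (path, speed in MiB per second).

    download must return the path of the file it saved.
    """
    start = time.perf_counter()
    path = download(*args)
    elapsed = time.perf_counter() - start
    size_mib = os.path.getsize(path) / (1024 * 1024)
    # Guard against a zero reading from the timer on very fast runs
    speed = size_mib / elapsed if elapsed > 0 else float("inf")
    return path, speed
```

Calling measure_throughput(download_file, url) once for each version of download_file, with the same URL, gives two directly comparable numbers.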

References

[1] iter_content: https://requests.readthedocs.io/en/latest/api/#requests.Response.iter_content

[2] Response.raw: https://requests.readthedocs.io/en/latest/api/#requests.Response.raw

[3] shutil.copyfileobj: https://docs.python.org/3/library/shutil.html#shutil.copyfileobj

Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it removed.