Downloading large files in Python: which method is faster?

Reposted by 王林 on 2023-04-14 21:19:01 (2249 views)

We usually download files with the requests library, which is very convenient to use.

Method 1

With the following streaming code, Python's memory usage stays flat no matter how large the downloaded file is:

import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # Note: pass stream=True so the body is fetched as it is iterated
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            # Write the body to disk in chunks of about 8 KiB
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)
    return local_filename

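If the response uses chunked transfer encoding, do not pass a fixed chunk_size (use chunk_size=None, since the default is one byte per chunk) and add an if check to skip empty chunks.

def download_file(url):
    local_filename = url.split('/')[-1]
    # Note: pass stream=True
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        # Binary mode: decoding each chunk separately could split a multi-byte character
        with open(local_filename, 'wb') as f:
            # chunk_size=None yields data in whatever sizes it arrives
            for chunk in r.iter_content(chunk_size=None):
                if chunk:  # skip empty chunks
                    f.write(chunk)
    return local_filename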

If you need text rather than bytes, iter_content [1] can do the decoding itself: just pass decode_unicode=True.

Please note that the number of bytes returned by each iteration of iter_content is not guaranteed to be exactly chunk_size. It can be larger, and it may vary from one iteration to the next.
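Because chunk boundaries are unpredictable, a multi-byte character can end up split across two chunks, and calling decode("utf-8") on each chunk separately may then fail. The following is a minimal sketch of that effect using only the standard library. The chunks are simulated by hand rather than fetched with requests, and decode_chunks is a hypothetical helper name, not part of any library.

```python
import codecs

def decode_chunks(chunks, encoding="utf-8"):
    """Yield text from an iterable of byte chunks, tolerating split characters."""
    # An incremental decoder buffers an incomplete character until the rest arrives
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush the decoder; this raises UnicodeDecodeError if the stream ended mid-character
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail

# Simulated chunks: the two bytes of "é" straddle the chunk boundary
data = "café au lait".encode("utf-8")
chunks = [data[:4], data[4:]]

print("".join(decode_chunks(chunks)))  # prints: café au lait
```

Here chunks[0].decode("utf-8") on its own would raise UnicodeDecodeError, which is why writing raw bytes, or letting iter_content decode via decode_unicode=True, is the safer choice.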

Method 2

Use Response.raw [2] and shutil.copyfileobj [3]:

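import requests
import shutil

def download_file(url):
    local_filename = url.split('/')[-1]
    with requests.get(url, stream=True) as r:
        # Fail early instead of saving an error page to disk
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            # Copy the raw response stream to the file in fixed-size blocks
            shutil.copyfileobj(r.raw, f)

    return local_filename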

This streams the file to disk without using too much memory, and the code is simpler.
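Memory stays low because shutil.copyfileobj only ever holds one buffer of data at a time: it repeatedly calls read() on the source with a fixed size and writes the result to the destination. The sketch below illustrates this with an in-memory object. CountingReader is a hypothetical stand-in for r.raw, and the 256 KiB buffer size is an arbitrary choice for the demonstration.

```python
import io
import shutil

class CountingReader:
    """Hypothetical stand-in for r.raw that records the size of each read()."""

    def __init__(self, payload):
        self._buf = io.BytesIO(payload)
        self.read_sizes = []

    def read(self, size=-1):
        data = self._buf.read(size)
        if data:
            self.read_sizes.append(len(data))
        return data

source = CountingReader(b"x" * (1024 * 1024))  # 1 MiB of dummy data
target = io.BytesIO()

# The third argument is the buffer length copyfileobj uses for each read
shutil.copyfileobj(source, target, 256 * 1024)

print(source.read_sizes)  # prints: [262144, 262144, 262144, 262144]
```

With a real download, the same third argument can be passed as shutil.copyfileobj(r.raw, f, length) to experiment with different buffer sizes.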

Note: According to the documentation, Response.raw does not decode the response body (for example, gzip or deflate content encoding). If you need the decoded content, you can wrap the r.raw.read method before copying:

import functools
r.raw.read = functools.partial(r.raw.read, decode_content=True)

Speed

Method 2 is faster. Where Method 1 reaches 2-3 MB/s, Method 2 can reach nearly 40 MB/s.
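Figures like these depend on the network, the server and the chunk size, so it may be worth measuring on your own setup. A minimal timing sketch follows, assuming a download function that takes a URL and returns the local filename, as both download_file versions above do. measure_throughput is a hypothetical helper name, and 1 MB is counted as 1024 * 1024 bytes here.

```python
import os
import time

def measure_throughput(download, url):
    """Run download(url), which must return a local filename, and report MB/s."""
    start = time.perf_counter()
    filename = download(url)
    elapsed = time.perf_counter() - start
    # Use the size of the file actually written to disk
    size_mb = os.path.getsize(filename) / (1024 * 1024)
    return size_mb / elapsed
```

For example, calling measure_throughput(download_file, url) once with each version of download_file, against the same URL, gives two numbers that can be compared directly.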

References

[1] iter_content: https://requests.readthedocs.io/en/latest/api/#requests.Response.iter_content

[2] Response.raw: https://requests.readthedocs.io/en/latest/api/#requests.Response.raw

[3] shutil.copyfileobj: https://docs.python.org/3/library/shutil.html#shutil.copyfileobj

Statement:

This article is reproduced from 51cto.com. In case of any infringement, please contact admin@php.cn to have it removed.
