Python实现批量下载文件-Python教程-PHP中文网

首页

后端开发

Python教程

Python实现批量下载文件

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 06, 2016 am 11:16 AM

python

Python实现批量下载文件

#!/usr/bin/env python
# -*- coding:utf-8 -*-

from gevent import monkey
monkey.patch_all()
from gevent.pool import Pool
import requests
import sys
import os

def download(url):
 chrome = 'Mozilla/5.0 (X11; Linux i86_64) AppleWebKit/537.36 ' + 
 '(KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36'
 headers = {'User-Agent': chrome}
 filename = url.split('/')[-1].strip()
 r = requests.get(url.strip(), headers=headers, stream=True)
 with open(filename, 'wb') as f:
 for chunk in r.iter_content(chunk_size=1024):
 if chunk:
f.write(chunk)
f.flush()
 print filename,"is ok"

def removeLine(key, filename):
 os.system('sed -i /%s/d %s' % (key, filename))

if __name__ =="__main__":
 if len(sys.argv) == 2:
 filename = sys.argv[1]
 f = open(filename,"r")
 p = Pool(4)
 for line in f.readlines():
 if line:
 p.spawn(download, line.strip())
 key = line.split('/')[-1].strip()
 removeLine(key, filename)
f.close()
p.join()
else:
 print 'Usage: python %s urls.txt' % sys.argv[0]

其他网友的方法：

from os.path import basename
from urlparse import urlsplit
def url2name(url):
  return basename(urlsplit(url)[2])

def download(url, localFileName = None):
  localName = url2name(url)
  req = urllib2.Request(url)
  r = urllib2.urlopen(req)
  if r.info().has_key('Content-Disposition'):
    # If the response has Content-Disposition, we take file name from it
    localName = r.info()['Content-Disposition'].split('filename=')[1]
    if localName[0] == '"' or localName[0] == "'":
      localName = localName[1:-1]
  elif r.url != url:
    # if we were redirected, the real file name we take from the final URL
    localName = url2name(r.url)
  if localFileName:
    # we can force to save the file as specified name
    localName = localFileName
  f = open(localName, 'wb')
  f.write(r.read())
  f.close()

download(r'你要下载的python文件的url地址')

以上便是本文给大家分享的全部内容了，小伙伴们可以测试下哪种方法效率更高呢。

声明

本文内容由网友自发贡献，版权归原作者所有，本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容，请联系admin@php.cn

为什么数组通常比存储数值数据列表更高？May 05, 2025 am 12:15 AM

ArraySareAryallyMoremory-Moremory-forigationDataDatueTotheIrfixed-SizenatureAntatureAntatureAndirectMemoryAccess.1）arraysStorelelementsInAcontiguxufulock，ReducingOveringOverheadHeadefromenterSormetormetAdata.2）列表，通常

如何将Python列表转换为Python阵列？May 05, 2025 am 12:10 AM

ToconvertaPythonlisttoanarray,usethearraymodule:1)Importthearraymodule,2)Createalist,3)Usearray(typecode,list)toconvertit,specifyingthetypecodelike'i'forintegers.Thisconversionoptimizesmemoryusageforhomogeneousdata,enhancingperformanceinnumericalcomp

您可以将不同的数据类型存储在同一Python列表中吗？举一个例子。May 05, 2025 am 12:10 AM

Python列表可以存储不同类型的数据。示例列表包含整数、字符串、浮点数、布尔值、嵌套列表和字典。列表的灵活性在数据处理和原型设计中很有价值，但需谨慎使用以确保代码的可读性和可维护性。

Python中的数组和列表之间有什么区别？May 05, 2025 am 12:06 AM

Pythondoesnothavebuilt-inarrays;usethearraymoduleformemory-efficienthomogeneousdatastorage,whilelistsareversatileformixeddatatypes.Arraysareefficientforlargedatasetsofthesametype,whereaslistsofferflexibilityandareeasiertouseformixedorsmallerdatasets.

通常使用哪种模块在Python中创建数组？May 05, 2025 am 12:02 AM

theSostCommonlyusedModuleForCreatingArraysInpyThonisnumpy.1）NumpyProvidEseffitedToolsForarrayOperations，Idealfornumericaldata.2）arraysCanbeCreatedDusingsnp.Array（）for1dand2Structures.3）

您如何将元素附加到Python列表中？May 04, 2025 am 12:17 AM

toAppendElementStoApythonList，usetheappend（）方法forsingleements，Extend（）formultiplelements，andinsert（）forspecificpositions.1）useeAppend（）foraddingoneOnelementAttheend.2）useextendTheEnd.2）useextendexendExendEnd（

您如何创建Python列表？举一个例子。May 04, 2025 am 12:16 AM

TocreateaPythonlist,usesquarebrackets[]andseparateitemswithcommas.1)Listsaredynamicandcanholdmixeddatatypes.2)Useappend(),remove(),andslicingformanipulation.3)Listcomprehensionsareefficientforcreatinglists.4)Becautiouswithlistreferences;usecopy()orsl

讨论有效存储和数值数据的处理至关重要的实际用例。May 04, 2025 am 12:11 AM

金融、科研、医疗和AI等领域中，高效存储和处理数值数据至关重要。 1)在金融中，使用内存映射文件和NumPy库可显着提升数据处理速度。 2)科研领域，HDF5文件优化数据存储和检索。 3)医疗中，数据库优化技术如索引和分区提高数据查询性能。 4)AI中，数据分片和分布式训练加速模型训练。通过选择适当的工具和技术，并权衡存储与处理速度之间的trade-off，可以显着提升系统性能和可扩展性。

See all articles