11 ways to download files with Python, each more advanced than the last

In this tutorial, you'll learn several ways to download files with Python and how to overcome various challenges you may encounter, such as downloading redirected files, downloading large files, and completing a multi-threaded download.

1. Using requests

You can use the requests module to download files from a URL.

Consider the following code:
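The original listing did not survive in this copy. A minimal sketch of what the text describes, assuming a hypothetical helper name download_file and a caller-supplied URL and path (the timeout and the status check are additions, not part of the original):

```python
import requests


def download_file(url, path):
    """Fetch url in a single request and write the response body to path."""
    # The variable name myfile follows the article's description.
    myfile = requests.get(url, timeout=30)
    # Raise on 4xx/5xx so an error page is not saved as if it were the file.
    myfile.raise_for_status()
    with open(path, "wb") as f:
        f.write(myfile.content)
    return path
```

Calling download_file with an image URL and a name such as "logo.png" would save the response body under that name; both values are illustrative.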

You simply fetch the URL with the get method of the requests module and store the result in a variable called myfile. Then you write the contents of this variable to a file.

2. Using wget

You can also use Python's wget module to download files from a URL. You can install the wget module with pip by running: pip install wget

Consider the following code, which we will use to download the logo image for Python.

In this code, the URL and the path where the image will be stored are passed to the download method of the wget module (for example, wget.download(url, path)).

3. Downloading Redirected Files

In this section, you will learn how to use requests to download a file from a URL that redirects to another URL pointing to a .pdf file. The URL looks like this:

To download this pdf file, use the following code:
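The listing is missing from this copy, so here is a minimal sketch under stated assumptions: the function name download_redirected is hypothetical, and the URL and path are supplied by the caller. requests already follows redirects for GET requests by default, so allow_redirects=True only makes that behavior explicit.

```python
import requests


def download_redirected(url, path):
    """Follow any redirects from url and save the final response body to path."""
    myfile = requests.get(url, allow_redirects=True, timeout=30)
    myfile.raise_for_status()
    with open(path, "wb") as f:
        f.write(myfile.content)
    # myfile.url holds the address the request ended up at after redirection.
    return myfile.url
```

Returning the final URL is a design choice of this sketch; it makes it easy to check where the redirect chain ended.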

In this code, we first specify the URL. Then we use the get method of the requests module to fetch it. In the get call, we set allow_redirects to True, which allows redirection, and the content at the redirected location is assigned to the variable myfile.

Finally, we open a file to write the obtained content.

4. Download large files in chunks

Consider the following code:
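The original listing was not preserved. The sketch below follows the description in this section, assuming a hypothetical function name download_in_chunks and a caller-supplied URL; the file name PythonBook.pdf and the 1024-byte chunk size come from the text.

```python
import requests


def download_in_chunks(url, path="PythonBook.pdf", chunk_size=1024):
    """Stream url to path without holding the whole body in memory."""
    # stream=True defers downloading the body until it is iterated over.
    with requests.get(url, stream=True, timeout=30) as r:
        r.raise_for_status()
        with open(path, "wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty chunks
                    f.write(chunk)
    return path
```

Using the response as a context manager releases the connection even if writing fails partway through.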

First, we use the get method of the requests module as before, but this time we set the stream parameter to True.

Next, we create a file named PythonBook.pdf in the current working directory and open it for writing.

Then, we specify the chunk size to be downloaded each time. We've set it to 1024 bytes, and we iterate over the chunks, writing each one to the file until there are no more chunks.

Not pretty? Don't worry, we will display a progress bar for the download process later.

5. Download multiple files (parallel/batch download)

To download multiple files at the same time, import the following modules:

We import the os and time modules to check how long the downloads take. ThreadPool, from multiprocessing.pool, lets you run multiple tasks on a pool of threads.

Let's create a simple function that sends the response in chunks to a file:

The URL variable is a two-dimensional array that specifies the path and the URL of each page you want to download.

Just like we did in the previous section, we pass this URL to requests.get. Finally, we open the file (the path specified in the URL) and write the page content.

Now, we can call this function for each URL individually, or we can call this function for all URLs at the same time. Let's call this function for each URL individually in a for loop, paying attention to the timer:

Now, replace the for loop with the following lines of code:

Run the script.
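The listings for this section are missing from this copy. The steps above can be sketched as follows. The names url_response, download_sequentially, and download_in_parallel are hypothetical, as is the default of four worker threads; each entry in urls is assumed to be a (path, url) pair, matching the two-dimensional array described above. The os import mentioned in the text is omitted because this sketch does not need it.

```python
import time
from multiprocessing.pool import ThreadPool

import requests


def url_response(entry):
    """Download one (path, url) pair, writing the body to path in chunks."""
    path, url = entry
    r = requests.get(url, stream=True, timeout=30)
    with open(path, "wb") as f:
        for chunk in r.iter_content(chunk_size=1024):
            f.write(chunk)
    return path


def download_sequentially(urls):
    """Call url_response for each entry in a for loop; return elapsed seconds."""
    start = time.time()
    for entry in urls:
        url_response(entry)
    return time.time() - start


def download_in_parallel(urls, workers=4):
    """Spread the same calls over a thread pool; return elapsed seconds."""
    start = time.time()
    with ThreadPool(workers) as pool:
        # imap_unordered hands entries to worker threads as they become free;
        # list() consumes the iterator so we wait until every download is done.
        list(pool.imap_unordered(url_response, urls))
    return time.time() - start
```

Comparing the two return values for the same list gives a rough idea of how much the thread pool helps; the gain depends on the network and the number of files.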

6. Download using the progress bar

The progress bar is a UI component of the clint module. Install the clint module by running: pip install clint

Consider the following code:

In this code, we first import the requests module, and then import the progress component from clint.textui. The only difference from the chunked download is in the for loop: when writing content to the file, we wrap the chunk iterator with the bar method of the progress module.

7. Use urllib to download a web page

In this section, we will use urllib to download a web page.

The urllib library is part of Python's standard library, so you don't need to install it.

The following lines of code can easily download a web page:

Here, specify the URL of the page you want to download and the path where you want to store it.

In this code, we use the urlretrieve method and pass the URL of the file, and the path to save the file. The file extension will be .html.
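Since the listing is missing, here is a minimal sketch of the urlretrieve call described above. The function name download_page is hypothetical, and the URL and path are supplied by the caller.

```python
import urllib.request


def download_page(url, path):
    """Save the resource at url to path using urllib.request.urlretrieve."""
    # urlretrieve returns a (local_filename, headers) tuple.
    saved_path, headers = urllib.request.urlretrieve(url, path)
    return saved_path
```

Passing a path that ends in .html, such as "page.html", gives the saved file the extension mentioned in the text.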

8. Downloading through a proxy

If you need to use a proxy to download your files, you can use the ProxyHandler class of the urllib.request module. Consider the following code:

In this code, we create the proxy object and build an opener by calling urllib's build_opener method with the proxy object passed in. Then we make a request to get the page.
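The original listing is not available here. A minimal sketch of the approach, assuming hypothetical function names and a caller-supplied proxies mapping (for example {"http": "http://127.0.0.1:3128"}, which is a placeholder address, not one from the article):

```python
import urllib.request


def build_proxy_opener(proxies):
    """Return an opener that routes requests through the given proxies."""
    # proxies maps a URL scheme to a proxy address, e.g. {"http": "http://host:port"}.
    proxy = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(proxy)


def fetch_via_proxy(url, proxies, timeout=30):
    """Fetch url through the proxy and return the response body as bytes."""
    opener = build_proxy_opener(proxies)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

This sketch calls the opener directly rather than installing it globally, so other urllib calls in the same program are unaffected.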

In addition, you can also use the requests module as described in the official documentation:

You only need to import the requests module and create your proxy object. Then, you can get the file.
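As a sketch of the requests variant (the listing itself is missing), assuming a hypothetical function name get_with_proxy and a caller-supplied proxies dictionary keyed by scheme:

```python
import requests


def get_with_proxy(url, proxies, timeout=30):
    """Fetch url through the given proxies and return the body as bytes."""
    # proxies looks like {"http": "http://host:port", "https": "http://host:port"}.
    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()
    return response.content
```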

9. Using urllib3

urllib3 is a third-party HTTP client library with features the standard urllib module lacks, such as connection pooling. You can install it with pip by running: pip install urllib3

We will use urllib3 to get a web page and store it in a text file.

Import the following modules:

We use the shutil module when working with files.

Now, we initialize the URL string variable like this:

Then, we use urllib3’s PoolManager, which keeps track of the necessary connection pools.

Create a file:

Finally, we send a GET request to fetch the URL, open a file, and write the response to that file:
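The individual listings for these steps were lost, so the sketch below combines them into one function. The name download_with_urllib3 is hypothetical, and the URL and file name are supplied by the caller.

```python
import shutil

import urllib3


def download_with_urllib3(url, filename):
    """Fetch url with a PoolManager and copy the response stream to filename."""
    # PoolManager keeps track of the connection pools it needs.
    c = urllib3.PoolManager()
    # preload_content=False leaves the body unread so it can be streamed
    # to disk with shutil.copyfileobj instead of being loaded into memory.
    with c.request("GET", url, preload_content=False) as res, \
            open(filename, "wb") as out_file:
        shutil.copyfileobj(res, out_file)
    return filename
```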

10. Download files from S3 using Boto3

To download files from Amazon S3, you can use the Python boto3 module.

Before starting, install the awscli module with pip by running: pip install awscli

To configure AWS, run: aws configure

Now enter your details when prompted: your AWS access key ID, secret access key, default region name, and default output format.

To download files from Amazon S3, you need to import boto3 and botocore. Boto3 is the Amazon SDK for Python, which gives Python code access to Amazon web services such as S3. Botocore is the lower-level library that both boto3 and the AWS CLI are built on.

Botocore comes with awscli. To install boto3, run: pip install boto3

Now, import these two modules:

When downloading a file from Amazon S3, we need three parameters:

  • Bucket name
  • The name of the file you need to download
  • The name of the file after downloading

Initialize variables:

Now, we initialize a variable to use the session’s resources. To do this, we will call boto3's resource() method and pass in the service, which is s3:

Finally, use the download_file method to download the file, passing in the variables:
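The listings for these steps are missing, so the sketch below puts them together under stated assumptions. The function name download_from_s3 and the optional s3 parameter are hypothetical; the parameter exists only so the function can be tried with a stub object instead of a real AWS session. The bucket name, key, and local file name are supplied by the caller.

```python
def download_from_s3(bucket, key, filename, s3=None):
    """Download the object key from bucket and save it locally as filename."""
    if s3 is None:
        # Imported lazily so the function can also be exercised with a stub
        # resource on a machine without boto3 or AWS credentials.
        import boto3
        # resource("s3") uses the credentials set up with aws configure.
        s3 = boto3.resource("s3")
    # Bucket(...).download_file takes the remote key, then the local file name.
    s3.Bucket(bucket).download_file(key, filename)
    return filename
```

In normal use the s3 argument is left out, and the call needs only the three parameters listed above.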

11. Using asyncio

The asyncio module is mainly used to handle system events. It works around an event loop that waits for an event to occur and then reacts to that event. The reaction can be to call another function. This process is called event processing. The asyncio module uses coroutines for event handling.

To use asyncio event handling and coroutine functionality, we will import the asyncio module:

Now, define the asyncio coroutine method like this:

The async keyword indicates that this is a native asyncio coroutine. Inside the coroutine, the await keyword suspends execution until the awaited operation completes and then yields its result. A coroutine can also use the return keyword to return a value.
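As a small illustration of async, await, and return (the original listing is missing), here is a sketch with a hypothetical coroutine named greet. asyncio.sleep stands in for real I/O.

```python
import asyncio


async def greet(delay, message):
    """Wait for delay seconds without blocking the event loop, then return."""
    # await suspends this coroutine until asyncio.sleep finishes, which lets
    # the event loop run other tasks in the meantime.
    await asyncio.sleep(delay)
    return message


# asyncio.run starts an event loop, runs the coroutine, and closes the loop.
result = asyncio.run(greet(0.01, "done"))
```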

Now, let’s create a piece of code to download a file from a website using a coroutine:

In this code, we create an asynchronous coroutine function, which downloads our file and returns a message.

Then we define another asynchronous coroutine, main_func, which awaits the downloads and collects all the URLs into a queue. asyncio's wait function waits for the coroutines to complete.

Now, in order to start the coroutine, we have to put the coroutine into the event loop using asyncio's get_event_loop() method, and finally, we execute the event loop using asyncio's run_until_complete() method.
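The download listing is missing from this copy, and the sketch below deliberately differs from the description in two ways. It uses asyncio.gather and asyncio.run instead of asyncio.wait, get_event_loop(), and run_until_complete(), because passing bare coroutines to asyncio.wait is no longer accepted in Python 3.11 and later. It also runs the blocking urllib call in a worker thread with asyncio.to_thread (Python 3.9+) so the downloads can overlap. The names coroutine and main_func follow the text; the helper _fetch_to_file and the (url, filename) pairs are assumptions.

```python
import asyncio
import urllib.request


def _fetch_to_file(url, filename):
    """Blocking helper: stream the response body at url into filename."""
    with urllib.request.urlopen(url) as r, open(filename, "wb") as f:
        for chunk in r:
            f.write(chunk)
    return f"Successfully downloaded {filename}"


async def coroutine(url, filename):
    """Download one file and return a message."""
    # urllib blocks, so hand it to a worker thread to keep the loop free.
    return await asyncio.to_thread(_fetch_to_file, url, filename)


async def main_func(downloads):
    """Start every download and wait until all of them have finished."""
    tasks = [coroutine(url, filename) for url, filename in downloads]
    # gather returns the results in the same order as the input list.
    return await asyncio.gather(*tasks)
```

A call such as asyncio.run(main_func(downloads)), where downloads is a list of (url, filename) pairs, starts the event loop and returns one message per file.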

Downloading files using Python is fun. I hope you find this tutorial useful!

Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.