


Introduction to the basic crawler process: Request and Response
A Python-based crawler obtains data from a website through a cycle that runs from request to response. We disguise the crawler as a browser and send a Request to the server; the server accepts it and replies with a Response.
In the previous article we explained what a crawler is and introduced its basic process. Today's article looks at that process in detail: what Request and Response are.
Request
1. What is a Request?
The browser sends a message to the server that hosts the URL. This process is called an HTTP Request.
2. What is included in the Request?
Request method: The most common request methods are GET and POST; others include HEAD, PUT, and DELETE. The parameters of a GET request are appended to the URL. For example, if we open Baidu and search for "picture", the requested URL is https://www.baidu.com/s?wd=picture. The parameters of a POST request are carried in the request body and do not appear in the URL. For example, when we log in to Zhihu with a user name and password, the Network panel of the browser developer tools shows the request's Form Data, a set of key-value pairs holding our login information. Keeping this data out of the URL keeps it out of the address bar and the browser history, although only HTTPS protects it in transit.
Request URL: URL stands for Uniform Resource Locator, which is what we commonly call a web address. A picture, a music file, a web document, and so on can each be identified by a unique URL. The information it contains indicates where the resource is located and how the browser should handle it.
Request Headers: The header information sent with the request, such as User-Agent (which identifies the browser or client making the request), Host, and Cookie.
Request body: The additional data carried by the request, such as the login data submitted by a login form.
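The difference between GET and POST parameters can be sketched with Python's standard library, without sending anything over the network. This is only an illustration: the login URL https://example.com/login and the username/password values are hypothetical placeholders, not taken from any real site.

```python
from urllib.parse import urlencode
from urllib.request import Request

# GET: the parameters are encoded into the query string of the URL.
get_url = "https://www.baidu.com/s?" + urlencode({"wd": "picture"})
get_req = Request(get_url, headers={"User-Agent": "Mozilla/5.0"}, method="GET")

# POST: the parameters travel in the request body, not in the URL.
# The URL and the credentials below are hypothetical placeholders.
form_data = urlencode({"username": "alice", "password": "secret"}).encode("utf-8")
post_req = Request(
    "https://example.com/login",
    data=form_data,
    headers={"User-Agent": "Mozilla/5.0"},
    method="POST",
)

print(get_req.get_method(), get_req.full_url)    # method and URL, parameters included
print(post_req.get_method(), post_req.full_url)  # method and URL, no parameters
print(post_req.data)                             # the form data sits in the body
```

Both objects are only built here, never sent, so the sketch shows where each kind of parameter ends up rather than how a particular site handles it.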
Response
1. What is a Response?
After the server receives the message sent by the browser, it processes it according to its content and then sends a message back to the browser. This process is called an HTTP Response.
2. What is included in the Response?
Response status: There are many response status codes, such as 200 for success, 301 for a permanent redirect, 404 for page not found, and 502 for a bad gateway (a server-side error).
Response Headers: Such as the content type, content length, server information, and cookie settings.
Response body: The most important part of the response. It contains the content of the requested resource, such as a web page's HTML code or an image's binary data.
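As a rough illustration of how a crawler might interpret status codes, the sketch below groups a code by its first digit and looks up its standard reason phrase with the standard library's http.HTTPStatus. The helper name describe_status is made up for this example.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the code, its standard reason phrase, and its general class."""
    status = HTTPStatus(code)  # raises ValueError for codes the enum does not know
    if 100 <= code < 200:
        category = "informational"
    elif 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirection"
    elif 400 <= code < 500:
        category = "client error"
    else:
        category = "server error"
    return f"{code} {status.phrase} ({category})"


for code in (200, 301, 404, 502):
    print(describe_status(code))
```

A crawler would typically parse the body only for success codes, follow redirects for the 3xx class, and log or retry on errors.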
Simple demonstration
import requests  # third-party library; install it first (pip install requests)

# Headers that make the request look like it comes from a browser
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'}

resp = requests.get('https://www.baidu.com', headers=headers)
print(resp.text)         # print the page's HTML source
print(resp.status_code)  # print the status code

After it runs successfully, you will see the page's HTML source followed by the status code 200. This is the crawler's basic Request and Response process.
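To try the same request and response cycle without depending on an external site, one option is to start a small local server and fetch from it with the standard library. This is a sketch under assumptions, not part of the original demonstration: EchoHandler, fetch_once, and the demo-crawler/0.1 User-Agent string are names invented for the example, and urllib is used in place of requests so that nothing needs to be installed.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that echoes the User-Agent request header."""

    def do_GET(self):
        user_agent = self.headers.get("User-Agent")
        body = f"<html><body>UA: {user_agent}</body></html>".encode("utf-8")
        self.send_response(200)  # response status
        self.send_header("Content-Type", "text/html; charset=utf-8")  # response headers
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # response body

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_once():
    """Start a throwaway local server, send one GET request, return the response parts."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0 lets the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        req = Request(url, headers={"User-Agent": "demo-crawler/0.1"})
        opener = build_opener(ProxyHandler({}))  # bypass any proxy configured in the environment
        with opener.open(req, timeout=5) as resp:
            return resp.status, resp.headers.get("Content-Type"), resp.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


status, content_type, text = fetch_once()
print(status)        # status code
print(content_type)  # one of the response headers
print(text)          # response body
```

The handler fills in the three parts of a Response described earlier (status, headers, body), and the client reads each of them back.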
