Introduction and in-depth understanding of HTTP protocol

巴扎黑

Aug 23, 2017 pm 03:56 PM


This article summarizes my understanding of the parts of the HTTP protocol that I have encountered in real work scenarios.

Request & Response

Request format

Request line. For example: GET /api/index.json HTTP/1.1

Request headers. For example: Accept: */*; User-Agent: Mozilla/4.0;……

[Request body] For example: id=1&timestamp=xxxxxx

Response format

Status line. For example: HTTP/1.1 200 OK

Response headers. For example: Content-Type: application/json;……

[Response body] For example: {"id": 1,"username":"testuser"}
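To make the three parts of a request concrete, here is a minimal Python sketch that splits a raw request into its request line, headers, and body. The function name parse_request and the sample message are illustrative assumptions; the sample uses POST (rather than the GET shown above) only so that it can carry a body. A real server would use a proper HTTP parser.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP request into request line, headers, and body."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, request target, protocol version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "target": target, "version": version,
            "headers": headers, "body": body}


# Hypothetical sample request, modeled on the examples in this section.
raw_request = (
    "POST /api/index.json HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "User-Agent: Mozilla/4.0\r\n"
    "Content-Length: 21\r\n"
    "\r\n"
    "id=1&timestamp=xxxxxx"
)
request = parse_request(raw_request)
print(request["method"], request["target"], request["body"])
```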

Status Code

There are nearly 60 HTTP status codes. Here I mainly record some common ones that show up under abnormal circumstances. We all run into them sooner or later in daily work, and knowing them helps us understand and locate problems.

206 - Partial content. Used for resumable (breakpoint) downloads: the client requested part of the content, and the server successfully returned that part.

301 - Permanent redirect. The original address no longer exists and the URL now points to another address. This mainly matters for search engines, because it affects how crawlers retrieve the page.

302 - Temporary redirect. The server returns a new URL to the client, and the client follows this URL to obtain the content.

304 - The resource has not changed and the client can use locally cached content, which is common for static content access.

413 - The request entity is too large. A common case is uploading a large file that exceeds the limit of the server (such as nginx). It can also occur when the request header or request body exceeds the settings of the back-end server (such as tomcat), for example when there are too many cookies under the current domain name and the request header limit is exceeded.

416 - Related to resumable downloads: the range requested by the client exceeds the size of the file on the server.

500 - Internal server error; the server cannot return a normal result. The most common example is an application throwing an unhandled null pointer exception.

502 - Bad gateway. A common case is that the backend server behind the reverse proxy (such as resin or tomcat) has not been started.

503 - Service unavailable. For example, the server load is too high or the server has stopped serving.

504 - Gateway timeout. For example, the backend takes longer to respond than the timeout configured on the gateway or reverse proxy.
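As a quick reference, the codes above can be looked up with Python's standard http.HTTPStatus enum. This is only a small sketch; the describe helper and its class labels are my own naming, not part of any standard.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return 'code phrase (class)' for an HTTP status code."""
    # The first digit of the code identifies its class.
    classes = {2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({classes.get(code // 100, 'other')})"


for code in (206, 301, 302, 304, 502, 503, 504):
    print(describe(code))
```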

Headers

HTTP headers fall into two categories: request headers and response headers. The following are some headers we use often.

1. Cache control

In Internet website applications, caches are almost everywhere. In HTTP-based services, we can also have content that does not change frequently cached on the client side, so that the cached content can be reused across multiple visits, which speeds up access and improves the user experience. The HTTP protocol defines several message headers for cache control:

Cache-Control (HTTP/1.1) / Pragma (HTTP/1.0): Indicates whether the client may cache the response and for how long. The default value is private, which means the content is cached in the user's private space. For example, Cache-Control: max-age=86400, must-revalidate tells the client that the requested resource can be cached for one day (max-age is in seconds and is a relative time), and that it must be revalidated with the server after it expires.

Expires: Specifies the point in time until which the client (if no forced refresh is required) can read the local cache directly without sending a request to the server.

Note:

Cache-Control takes priority over Expires;

Detailed parameter description: http://condor.depaul.edu/dmumaugh/readings/handouts/SE435/HTTP/node24.html

Different browsers may implement the same user actions (refresh, back, pressing Enter in the address bar, etc.) differently;
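The priority rule between Cache-Control and Expires can be sketched as follows. This is a simplified illustration, not a full cache implementation: the function name freshness_seconds is hypothetical, and directives such as no-cache or no-store are deliberately ignored.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_seconds(headers: dict, now: datetime) -> float:
    """Seconds a response may be reused without contacting the server."""
    cache_control = headers.get("Cache-Control", "")
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            # max-age is relative (in seconds) and wins over Expires.
            return float(value)
    expires = headers.get("Expires")
    if expires:
        # Expires is an absolute HTTP date; convert it to remaining seconds.
        remaining = (parsedate_to_datetime(expires) - now).total_seconds()
        return max(remaining, 0.0)
    return 0.0


now = datetime(2013, 10, 19, 9, 20, 15, tzinfo=timezone.utc)
both = {"Cache-Control": "max-age=86400, must-revalidate",
        "Expires": "Sat, 19 Oct 2013 10:20:15 GMT"}
only_expires = {"Expires": "Sat, 19 Oct 2013 10:20:15 GMT"}
print(freshness_seconds(both, now), freshness_seconds(only_expires, now))
```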

Last-Modified/If-Modified-Since: Last-Modified is the timestamp of the resource's last modification, returned by the server to the client. On the next request (such as a forced refresh), the client sends the If-Modified-Since header so the server can check whether the resource has been updated. If it has not been updated, the server returns a 304 status code and the client uses the locally cached resource. In that case there is only the request overhead and no overhead for transmitting the content. Note: the timestamp must be in Greenwich Mean Time (GMT), for example: Last-Modified: Sat, 19 Oct 2013 09:20:15 GMT
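A rough sketch of the server-side check, using only the Python standard library. The helper names http_date and conditional_status are assumptions made for illustration; the date is the one from the example above.

```python
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime
from typing import Optional


def http_date(timestamp: float) -> str:
    """Format a Unix timestamp as an HTTP date in GMT."""
    return formatdate(timestamp, usegmt=True)


def conditional_status(if_modified_since: Optional[str],
                       last_modified_ts: float) -> int:
    """Return 304 if the resource is unchanged since the client's copy."""
    if if_modified_since is None:
        return 200  # first visit: send the full resource
    since = parsedate_to_datetime(if_modified_since).timestamp()
    return 304 if last_modified_ts <= since else 200


modified = datetime(2013, 10, 19, 9, 20, 15, tzinfo=timezone.utc).timestamp()
last_modified = http_date(modified)
print("Last-Modified:", last_modified)
print(conditional_status(last_modified, modified))       # unchanged
print(conditional_status(last_modified, modified + 60))  # changed later
```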

ETag/If-None-Match: ETag is a resource identifier generated from file attributes by some algorithm, and it is also used to determine whether the resource requested by the client has been updated. If the server returns an ETag value to the client, the client sends it back in the If-None-Match header on the next request so the server can check whether the resource has changed. If it has not, a 304 status code is returned. (The effect is basically the same as Last-Modified.)

Note:

ETag has to be computed, which is a cost for servers with tight computing resources, so some websites do not use ETag;

If the servers are behind a load balancer, requests for the same resource may be distributed to different backend machines. Since the ETag calculation depends on file attributes, files with the same content on different machines may produce different ETags, so ETag validation can fail for files whose content has not changed. There are two solutions: one is to make the ETag calculation independent of the local machine, for example by computing the md5 of the file content directly; the other is to have the load balancer send requests for the same URL to the same backend machine.
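The first solution (an ETag derived only from content) might look like the sketch below. The names content_etag and respond are hypothetical, and the tuple returned is just a stand-in for a real response object.

```python
import hashlib


def content_etag(content: bytes) -> str:
    """Derive the ETag from content only, so every backend agrees on it."""
    return '"' + hashlib.md5(content).hexdigest() + '"'


def respond(content: bytes, if_none_match=None):
    """Return (status, headers, body) for a request with optional If-None-Match."""
    etag = content_etag(content)
    if if_none_match == etag:
        # Client copy is still valid: no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, content


page = b"<html>hello</html>"
status, headers, body = respond(page)
revalidated = respond(page, if_none_match=headers["ETag"])
after_change = respond(b"<html>changed</html>", if_none_match=headers["ETag"])
print(status, revalidated[0], after_change[0])
```

Because the value depends only on the bytes of the file, two machines serving identical content produce the same ETag, at the price of hashing the content.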

In real business scenarios, HTTP caching is very useful. Some examples:

Make full use of the client's resources. Static files that the client accesses frequently, such as logos and advertising images, can be cached locally on the client. This reduces network requests, speeds up rendering on the client, and reduces the request load on the server.

When static content such as news and blog posts is crawled by search engine crawlers, controlling the cache parameters can reduce the crawl frequency and avoid unnecessary waste of resources.

If our static resources are served through a CDN, setting HTTP cache headers lets the CDN node keep a copy of the file, which reduces the number of times the CDN goes back to the origin, and so reduces network latency and the load on the origin server.

2. Range requests (resumable downloads)

Accept-Ranges: When the server supports resumable downloads, it returns this response header to the client. Once the client knows this, it can send range requests.

Content-Length: The length of the response body, telling the client how much data the current request returns. Note that when a request is sent with the HEAD method, no body is returned, but Content-Length still gives the size of the complete data.

Range/Content-Range: The client sends a Range header with the request to tell the server which part of the data it wants. For example, Range: bytes=0-1023 requests bytes 0 through 1023. The server then returns these 1024 bytes to the client and includes Content-Range in the response headers, that is: Content-Range: bytes 0-1023/4096, where 4096 is the total file size. The client's next request can start from byte 1024: Range: bytes=1024-xxxx
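The exchange described above can be sketched on the server side as follows. This is a minimal illustration that handles only a single "bytes=start-end" or "bytes=start-" range; the function name serve_range is an assumption, and any other form is simply ignored by returning the full content.

```python
def serve_range(data: bytes, range_header: str):
    """Return (status, headers, body) for a single-range request."""
    total = len(data)
    unit, _, spec = range_header.partition("=")
    start_text, _, end_text = spec.strip().partition("-")
    full = (200, {"Content-Length": str(total)}, data)
    # Unsupported forms (other units, suffix or multiple ranges) are ignored.
    if unit.strip() != "bytes" or "," in spec or not start_text.isdigit():
        return full
    start = int(start_text)
    # An open-ended range ("bytes=1024-") runs to the last byte.
    end = min(int(end_text), total - 1) if end_text.isdigit() else total - 1
    if start >= total:
        # The requested range lies beyond the end of the file.
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    if end < start:
        return full
    body = data[start:end + 1]
    headers = {"Content-Range": f"bytes {start}-{end}/{total}",
               "Content-Length": str(len(body))}
    return 206, headers, body


data = bytes(4096)
first = serve_range(data, "bytes=0-1023")
rest = serve_range(data, "bytes=1024-")
beyond = serve_range(data, "bytes=5000-6000")
print(first[0], first[1]["Content-Range"])
print(rest[0], rest[1]["Content-Range"])
print(beyond[0], beyond[1]["Content-Range"])
```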

3. Encoding

Accept-Encoding/Content-Encoding: The former lists the content encodings the client can accept. The default is identity; other possible values include gzip, compress, etc. The latter is the content encoding of the server's response, most commonly a compression format. The benefit of compression is obvious: it can greatly reduce the cost of network transmission, and this saving usually outweighs the CPU consumed by compressing on the server. Common values of Content-Encoding: gzip, deflate, compress. We usually compress response types such as html, js, css, xml, and json.
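A small sketch of this negotiation using Python's gzip module. The function name encode_body is hypothetical, and the sketch ignores quality values (such as gzip;q=0) that a complete implementation would have to honor.

```python
import gzip


def encode_body(body: bytes, accept_encoding: str):
    """Compress the body with gzip if the client says it accepts gzip."""
    # Keep only the encoding names, dropping any ";q=..." parameters.
    offered = {token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")}
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    # Otherwise fall back to identity (no Content-Encoding header).
    return body, {}


payload = b'{"id": 1,"username":"testuser"}' * 100
compressed, headers = encode_body(payload, "gzip, deflate")
plain, plain_headers = encode_body(payload, "identity")
print(len(payload), len(compressed), headers)
```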

Transfer-Encoding: response header. It specifies the transfer encoding of the response message, that is, the form in which it is sent over the network. It usually appears as Transfer-Encoding: chunked. When the server generates dynamic content and does not know the total length of the response in advance, it can send the response in chunks, returning data as soon as it has been processed instead of waiting until everything is ready and returning it all at once. It can be combined with a content encoding such as gzip, so that the content is compressed and then sent in chunks. Also note that with this transfer encoding there is no Content-Length header, because the content has not been fully generated yet.
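On the wire, each chunk is preceded by its size in hexadecimal and the body ends with a zero-length chunk. A minimal encoder sketch (the function name encode_chunked is my own) could look like this:

```python
def encode_chunked(chunks):
    """Yield the wire format of a chunked body, one piece at a time."""
    for chunk in chunks:
        if chunk:  # an empty chunk would prematurely end the body
            # Each chunk: size in hex, CRLF, the data, CRLF.
            yield f"{len(chunk):X}\r\n".encode("ascii") + chunk + b"\r\n"
    # A zero-length chunk marks the end of the body.
    yield b"0\r\n\r\n"


wire = b"".join(encode_chunked([b"Hello, ", b"world"]))
print(wire)
```

Because each piece is emitted as soon as it is available, the total length never has to be known in advance.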

4. Others

X-Forwarded-For: request header. Used to identify the user's real IP, especially when the server is accessed through a proxy (forward or reverse) or sits behind a load balancer. Format: X-Forwarded-For: client, proxy1, proxy2, ... The leftmost entry is the IP closest to the client.
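Extracting the client address from this header can be sketched as below. The function name client_ip and the sample addresses are illustrative; note that the header is supplied by the request and can be forged, so it should only be trusted when it was set by your own proxies.

```python
def client_ip(x_forwarded_for: str, remote_addr: str) -> str:
    """Return the leftmost address in the header, or the socket peer address."""
    hops = [hop.strip() for hop in x_forwarded_for.split(",") if hop.strip()]
    # The leftmost entry is the one closest to the client.
    return hops[0] if hops else remote_addr


print(client_ip("203.0.113.7, 10.0.0.2, 10.0.0.3", "10.0.0.3"))
print(client_ip("", "198.51.100.4"))
```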

User-Agent: request header. It lets the server identify basic information about the client. It is generally useful for recognizing search engine crawlers, and in some scenarios it can also be used for client statistics.

Referer: request header. It tells the server where the request came from, for example which website linked to it. We often use it for statistics. Another important use is filtering out unauthorized request sources when resources need anti-hotlinking protection (note, however, that the client can forge the Referer).
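A simple anti-hotlinking check might be sketched like this. The allowed host names and the function name is_allowed_referer are hypothetical, and whether requests without a Referer should be allowed is a policy decision left as a parameter.

```python
from urllib.parse import urlsplit

# Hypothetical list of sites allowed to embed our resources.
ALLOWED_HOSTS = {"www.example.com", "static.example.com"}


def is_allowed_referer(referer, allow_empty: bool = True) -> bool:
    """Decide whether a request may fetch a protected resource."""
    if not referer:
        # Direct visits and some privacy tools send no Referer at all.
        return allow_empty
    return urlsplit(referer).hostname in ALLOWED_HOSTS


print(is_allowed_referer("http://www.example.com/news/1.html"))
print(is_allowed_referer("http://other.example.net/steal.html"))
```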

Location: response header. This Location header will be included in the response header of the 301/302 status code to instruct the client to use the new address to access the required resources.

Connection: request/response header. In HTTP/1.1, the client and server keep the connection open by default, that is, Connection: keep-alive. If either side does not want to keep the connection, it can set this value to close. With a persistent connection, the client can send multiple HTTP requests over the same connection, which avoids the cost of creating connections frequently. Making good use of this usually requires further settings on the server side, such as the keep-alive timeout and some TCP-related kernel parameters.

Session and Cookie

HTTP is a stateless protocol, but Internet applications often need to track user state to complete interactive operations. For example, user authentication needs to record the login status, shopping cart applications need to remember the products the user selected, and advertising applications need to record the user's browsing history. This is where sessions and cookies come in.

session: refers to the interaction state between the client and the server during the http request-response process. This information is stored on the server side, such as memory, database, etc. Each session has a unique identifier, which is generated by the server. This identifier must also be saved on the client, so that the client can bring this identifier with the next request to facilitate the server to determine the client's status.

Client support for session:

Save the session id through cookie and send it to the server when requesting.

Communicate with the server by carrying the session id in the url parameters.

Communicate with the server by carrying the session id in the hidden field of the form.

Session sharing problem:

In distributed applications, our HTTP servers usually sit behind a reverse proxy or load balancing device, which raises the session sharing problem: multiple requests from the same user may be distributed to different machines. If the session is kept in the local memory of one machine, it cannot be shared among the other machines. Generally speaking, there are two ways to solve this:

Store the session in distributed memory (e.g. memcached) or centralized storage (e.g. a database).

Distribute the requests of the same user to the same machine on the reverse proxy or load balancing device (here we need to deal with the problem of request redistribution after the machine goes down).
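The second approach can be illustrated with a very simple hash-based routing sketch. The backend names and the function pick_backend are assumptions; real load balancers provide their own mechanisms for this.

```python
import hashlib


def pick_backend(session_id: str, backends: list) -> str:
    """Map a session id to one backend, always the same one for the same id."""
    digest = hashlib.md5(session_id.encode("utf-8")).digest()
    # Use the hash as a stable number and spread it over the backends.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["app-1", "app-2", "app-3"]  # hypothetical machine names
first_pick = pick_backend("session-abc", backends)
second_pick = pick_backend("session-abc", backends)
print(first_pick, second_pick)
```

With plain modulo hashing, removing a machine changes the mapping for many sessions, which is exactly the redistribution problem mentioned above.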

Cookie: Maintains state information on the client. Each cookie belongs to a specific domain and path. For security reasons, cookies from different domains or paths cannot be shared.

Session cookie: No expiration time is specified, it is stored in memory and will expire after the browser is closed.

Persistent cookie: Specifies the expiration time and is saved locally in the browser.

For details, please refer to: http://en.wikipedia.org/wiki/HTTP_cookie
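The difference between the two cookie types comes down to whether an expiration attribute is present. A small sketch with Python's standard http.cookies module (the cookie names and values are made up for illustration):

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Session cookie: no expiration attribute, discarded when the browser closes.
cookies["sessionid"] = "abc123"
cookies["sessionid"]["path"] = "/"

# Persistent cookie: Max-Age (or Expires) tells the browser to keep it.
cookies["theme"] = "dark"
cookies["theme"]["path"] = "/"
cookies["theme"]["max-age"] = 86400  # one day, in seconds

session_header = cookies["sessionid"].OutputString()
persistent_header = cookies["theme"].OutputString()
print("Set-Cookie:", session_header)
print("Set-Cookie:", persistent_header)
```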

It should be noted that cookies will have some security issues.

Here I have only summarized my understanding of the parts of the HTTP protocol that I have encountered at work. There is much more to the protocol, and it is worth continuing to explore and understand it, because doing so makes developing applications much easier.

Finally, I recommend two very powerful HTTP debugging tools: fiddler (windows) and charles (mac). Both provide an HTTP proxy function, so for HTTP applications that are not browser-based (such as mobile apps), you can use these two tools to monitor HTTP requests.

