
In-depth analysis of browser caching mechanism (picture and text)

Reposted by 不言, 2018-11-14 09:55:03

This article gives an in-depth analysis of the browser caching mechanism.

1. Foreword

Browser caching is an unavoidable topic in page performance optimization. The most intuitive measure of a website's performance is how quickly its pages open, and caching is one way to improve that response speed. A good caching strategy shortens the distance a resource request has to travel and reduces latency. Because cached files can be reused, it also saves bandwidth and reduces network load. It is therefore particularly important to understand the browser's caching mechanism.

2. Cache Type

At a macro level, caches fall into two categories: private caches and shared caches. A shared cache holds responses that proxies at any level are allowed to store. A private cache is exclusive to a single user, and its contents may not be stored by proxies.

At a finer level, caches can be divided into the following categories:

1. Browser cache

The browser cache exists so that pages respond faster when the user clicks the back button or revisits a page. It is especially useful on multi-page sites: if the same image is used on several pages, caching it avoids repeated downloads. The browser first sends its Web request to the proxy server, which then forwards the request to the origin server. The browser cache includes the strong cache and the negotiated cache, both described in detail below. Browser caching is the main focus of this article.

2. CDN cache

A CDN cache is generally deployed by website administrators themselves to make their sites easier to scale and to obtain better performance. Normally, the browser first sends its Web request to the CDN gateway. Behind the gateway are one or more load-balanced origin servers, and the gateway dynamically forwards each request to an appropriate origin server based on their load. From the browser's point of view, the entire CDN is a single origin server, so the caching mechanism between browser and server also applies under this architecture.

3. Proxy server caching

The proxy server is an intermediate server between the browser and the origin server. When a caching proxy forwards a response, it saves a copy of the resource on the proxy server. When the proxy receives another request for the same resource, it does not fetch the resource from the origin server but returns the previously cached copy as the response.

4. Database caching

Database caching addresses the case where a web application's data relationships are complex and the database has many tables, so that frequent queries can easily overwhelm the database. To improve query performance, query results are cached in memory, and the next identical query is answered directly from the memory cache, improving response efficiency.
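As a rough illustration of the idea, the sketch below places an in-memory dictionary in front of a slow query function. The names QueryCache and fake_database are hypothetical and stand in for whatever data layer an application really uses; invalidation when the underlying data changes is deliberately left out.

```python
from typing import Callable, Dict, Tuple


class QueryCache:
    """Hypothetical in-memory cache placed in front of a slow query function."""

    def __init__(self, run_query: Callable[[str], Tuple]) -> None:
        self._run_query = run_query
        self._results: Dict[str, Tuple] = {}
        self.misses = 0  # how many times the real database was queried

    def query(self, sql: str) -> Tuple:
        # Only go to the database when this exact query has not been seen.
        if sql not in self._results:
            self.misses += 1
            self._results[sql] = self._run_query(sql)
        return self._results[sql]


def fake_database(sql: str) -> Tuple:
    # Stand-in for a real (slow) database call; an assumption for this sketch.
    return ("row for", sql)


cache = QueryCache(fake_database)
first = cache.query("SELECT * FROM users WHERE id = 1")
second = cache.query("SELECT * FROM users WHERE id = 1")  # served from memory
```

A real deployment would also need to expire or invalidate entries when rows are updated.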

5. Application layer caching

Application layer caching refers to caching done at the code level. Code logic caches data or resources that have already been requested, and when the data is needed again, the same logic selects the usable cached copy.

3. Analysis of the Caching Process

The browser communicates with the server in request-response mode: the browser initiates an HTTP request and the server responds to it. How does the browser decide whether a resource should be cached, and how does it cache it? After the browser sends a request to the server for the first time and receives the result, it stores the result and a cache identifier in the browser cache. The browser decides how to handle the cache based on the response headers returned when the resource was first requested. The specific process is as follows:

[Figure: flow of a first request, with the result and cache identifier stored in the browser cache]

We can see from the above picture:

  • Every time the browser initiates a request, it first searches the browser cache for the result of that request and its cache identifier

  • Every time the browser receives a request result, it stores the result and its cache identifier in the browser cache

These two conclusions are the key to the browser cache mechanism: they ensure that the cache is read and written for every request. Once we understand the rules by which the browser uses its cache, the remaining questions are easy to answer, and the rest of this article analyzes those rules in detail. To make this easier to follow, we divide the caching process into two parts according to whether an HTTP request has to be sent to the server again: strong caching and negotiated caching.
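The two conclusions above can be sketched as a lookup step before every request and a store step after every response. This is only a minimal model: BrowserCache, CacheEntry, and fetch are hypothetical names, and the rules that decide whether a cached entry may actually be used are omitted here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    body: bytes
    headers: Dict[str, str]  # the response headers carry the cache identifier


class BrowserCache:
    """Hypothetical stand-in for the browser's cache store."""

    def __init__(self) -> None:
        self._entries: Dict[str, CacheEntry] = {}

    def lookup(self, url: str) -> Optional[CacheEntry]:
        return self._entries.get(url)

    def store(self, url: str, entry: CacheEntry) -> None:
        self._entries[url] = entry


def fetch(url: str, cache: BrowserCache,
          send_request: Callable[[str], CacheEntry]) -> CacheEntry:
    # Conclusion 1: every request first looks in the browser cache.
    cached = cache.lookup(url)
    if cached is not None:
        # Simplification: a real browser would now apply the usage rules
        # (strong cache, then negotiated cache) before trusting this entry.
        return cached
    # Conclusion 2: every returned result is stored together with its headers.
    entry = send_request(url)
    cache.store(url, entry)
    return entry
```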

4. Strong Cache

Strong cache: the browser does not send a request to the server and reads the resource directly from the cache. In the Network panel of the Chrome console, such a request shows a status code of 200, and the Size column displays from disk cache or from memory cache.

[Figure: Chrome Network panel with requests whose Size column shows from memory cache and from disk cache]

Take the requests from my Jianshu blog as an example. Requests with a gray status code use the strong cache. The Size value of each request shows where the cached copy came from: from memory cache or from disk cache. You may wonder what from memory cache and from disk cache mean, and when each one is used.

from memory cache means the copy cached in memory is used, and from disk cache means the copy cached on the hard disk is used. The browser reads the caches in the order memory -> disk. After parsing and executing files such as JS and images, the browser stores them directly in the memory cache, so when the page is refreshed they only need to be read from memory (from memory cache). CSS files, by contrast, are stored in files on the hard disk, so each time the page is rendered they have to be read from the hard disk (from disk cache).
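The memory-then-disk read order described above can be modeled with two dictionaries. This is an illustrative sketch only: read_cache and the two dictionaries are hypothetical, and a real browser's choice of where to place each resource is more involved.

```python
from typing import Dict, Optional, Tuple


def read_cache(url: str,
               memory: Dict[str, bytes],
               disk: Dict[str, bytes]) -> Optional[Tuple[bytes, str]]:
    """Return the cached body and where it came from, checking memory first."""
    if url in memory:
        return memory[url], "from memory cache"
    if url in disk:
        return disk[url], "from disk cache"
    return None  # not cached anywhere: a network request would be needed


# Hypothetical cache contents for the example.
memory_cache = {"/app.js": b"console.log(1)"}
disk_cache = {"/app.js": b"console.log(1)", "/site.css": b"body{}"}

js_hit = read_cache("/app.js", memory_cache, disk_cache)
css_hit = read_cache("/site.css", memory_cache, disk_cache)
```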

Related headers:
1. Expires: The expiration time in the response header. When the browser loads the resource again before this time, the strong cache is hit. Its value is an absolute time as a GMT-format string, such as Expires: Sun, 21 Jan 2018 23:39:02 GMT
2. Cache-Control: In HTTP/1.1, Cache-Control is the most important rule and is mainly used to control web page caching. For example, Cache-Control: max-age=300 means that if the resource is loaded again within 5 minutes of the time this response was returned (the browser records that time), the strong cache is hit. The following six values are common:

  • public: The response may be cached by everyone (both the client and proxy servers). Specifically, it may be cached by any intermediate node between the browser and the server.

  • private: The response may be cached only by the client; intermediate nodes are not allowed to cache it. This is often described as the default behavior of Cache-Control.

  • no-cache: The client may cache the content, but whether the cached copy can be used must be verified through the negotiated cache. In other words, Cache-Control does not decide up front whether the copy is fresh; the ETag or Last-Modified field controls the cache instead. Note that the name no-cache is somewhat misleading: it does not mean the browser stops caching the data, but that before using the cached data the browser must first confirm with the server that it is still current.

  • no-store: Nothing is cached; neither the strong cache nor the negotiated cache is used.

  • max-age: max-age=xxx (xxx is a number) means the cached content expires after xxx seconds.

  • s-maxage: Same as max-age, in seconds, but applies only to shared caches (such as a CDN cache). For example, with s-maxage=60 the shared cache keeps serving its stored copy for those 60 seconds without going back to the origin server, even if the origin content has been updated. max-age applies to ordinary caches, while s-maxage applies to proxy caches. In a shared cache, s-maxage takes priority over max-age: if s-maxage is present, it overrides the max-age directive and the Expires header.
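To make the directives above concrete, here is a minimal sketch of parsing a Cache-Control value and deriving how long a cache may reuse the response. The function names are hypothetical, and treating no-store, no-cache, and (for shared caches) private as a lifetime of 0 seconds is a simplification chosen for this example rather than the full HTTP rules.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split 'public, max-age=300' into {'public': None, 'max-age': '300'}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(value: str, shared_cache: bool) -> Optional[int]:
    """Seconds the response may be reused without contacting the server.

    Returns None when the header gives no explicit lifetime.
    """
    d = parse_cache_control(value)
    # Simplification: both directives rule out reuse without a server round trip.
    if "no-store" in d or "no-cache" in d:
        return 0
    # A shared cache (proxy, CDN) may not reuse a private response.
    if shared_cache and "private" in d:
        return 0
    # In a shared cache, s-maxage takes priority over max-age.
    if shared_cache and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return None
```

For example, the same header value yields a different lifetime depending on whether the cache is the browser or a shared cache such as a CDN.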


Comparison between Expires and Cache-Control: there is little practical difference between the two. Expires is a product of HTTP/1.0 and Cache-Control is a product of HTTP/1.1. If both are present, Cache-Control takes priority over Expires; Expires only comes into play in environments that do not support HTTP/1.1. Expires is therefore effectively outdated, and it is kept today only for compatibility.

Strong caching decides whether to use the cache purely by whether a point in time or a time period has passed; it does not care whether the file on the server has been updated. As a result, the loaded file may not be the latest content on the server. So how do we know whether the server-side content has been updated? This is where the negotiated cache strategy comes in.
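Returning to the precedence rule between the two headers, the sketch below checks freshness using max-age when it is present and falls back to Expires only otherwise. is_fresh and its parameters are hypothetical names, and the check ignores other directives for brevity.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Dict


def is_fresh(headers: Dict[str, str],
             response_time: datetime,
             now: datetime) -> bool:
    """Strong-cache check: Cache-Control max-age wins over Expires."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"\bmax-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        # Relative lifetime, counted from when the response was received.
        return now - response_time < timedelta(seconds=int(match.group(1)))
    expires = headers.get("Expires")
    if expires:
        # Absolute expiry time, only consulted when max-age is absent.
        return now < parsedate_to_datetime(expires)
    return False


received = datetime(2018, 1, 21, 23, 0, 0, tzinfo=timezone.utc)
ten_minutes_later = received + timedelta(minutes=10)
both = {
    "Cache-Control": "max-age=300",
    "Expires": "Sun, 21 Jan 2018 23:39:02 GMT",
}
only_expires = {"Expires": "Sun, 21 Jan 2018 23:39:02 GMT"}
```

With both headers present the 300-second max-age has already run out after ten minutes, even though the Expires time has not yet been reached.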

5. Negotiation Caching

Negotiated caching is the process in which, after the strong cache has expired, the browser sends a request to the server carrying the cache identifier, and the server decides from that identifier whether the cache may be used. There are two main cases:

  • The negotiated cache takes effect: the server returns 304 Not Modified

  • The negotiated cache is invalid: the server returns 200 and the request result

Related headers:

1. Last-Modified and If-Modified-Since

When the browser accesses a resource for the first time, the server returns the resource and adds a Last-Modified header to the response. Its value is the time the resource was last modified on the server. The browser caches the file and this header after receiving them:

Last-Modified: Fri, 22 Jul 2016 01:47:00 GMT

The next time the browser requests this resource, it finds the stored Last-Modified value and adds an If-Modified-Since header with that value. When the server receives the request, it compares the value of If-Modified-Since with the resource's last modification time on the server. If the resource has not changed, the server returns 304 with an empty response body, and the browser reads the resource from its cache. If the time in If-Modified-Since is earlier than the last modification time on the server, the file has been updated, so the server returns the new resource file with 200.
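The server side of this exchange can be sketched as follows. respond_with_last_modified is a hypothetical handler, not any particular server's API; it returns a status code, response headers, and body.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Tuple


def respond_with_last_modified(
    request_headers: Dict[str, str],
    body: bytes,
    last_modified: datetime,
) -> Tuple[int, Dict[str, str], bytes]:
    """Answer 304 if the resource is unchanged since If-Modified-Since."""
    # HTTP dates have one-second resolution, so drop the microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since and last_modified <= parsedate_to_datetime(since):
        return 304, headers, b""  # empty body: the browser uses its cache
    return 200, headers, body


modified = datetime(2016, 7, 22, 1, 47, 0, tzinfo=timezone.utc)

# First visit: no validator is sent, so the full resource comes back.
status, headers, _ = respond_with_last_modified({}, b"data", modified)

# Next visit: the browser echoes Last-Modified back as If-Modified-Since.
revisit = {"If-Modified-Since": headers["Last-Modified"]}
status_again, _, body_again = respond_with_last_modified(revisit, b"data", modified)
```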


But Last-Modified has some shortcomings:

① Some servers cannot obtain a precise modification time

② The file's modification time may change even though its content has not

Since deciding by file modification time has these shortcomings, can the caching decision be based directly on whether the file content has changed? That is the role of ETag and If-None-Match.

2. ETag and If-None-Match

ETag is a response header returned by the server when the resource was last loaded; it is a unique identifier for the resource. Whenever the resource changes, the ETag is regenerated. The next time the browser requests the resource, it puts the previously returned ETag value into the If-None-Match request header. The server only needs to compare the If-None-Match value sent by the client with the ETag of the resource on the server to determine whether the resource has changed relative to the client's copy. If the ETags do not match, the server sends the new resource (including the new ETag) to the client as a regular GET 200 response; if they match, it returns 304 to tell the client to use its local cache.
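A minimal sketch of this comparison is shown below. Deriving the ETag from a SHA-256 hash of the body is an assumption made for the example; real servers are free to generate ETags in other ways, and respond_with_etag is a hypothetical handler name.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    # Assumption: the identifier is a truncated content hash in quotes.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_with_etag(
    request_headers: Dict[str, str],
    body: bytes,
) -> Tuple[int, Dict[str, str], bytes]:
    """Answer 304 when the client's If-None-Match equals the current ETag."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # unchanged: client uses its local cache
    return 200, headers, body     # changed or first visit: send new resource


# First visit returns the resource and its ETag.
status, headers, _ = respond_with_etag({}, b"version 1")

# The browser sends the ETag back; the content is unchanged.
revisit = {"If-None-Match": headers["ETag"]}
unchanged_status, _, _ = respond_with_etag(revisit, b"version 1")

# The content changed on the server, so the ETags no longer match.
changed_status, new_headers, new_body = respond_with_etag(revisit, b"version 2")
```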


Comparison between the two:
First, in terms of accuracy, ETag is better than Last-Modified. Last-Modified has a resolution of one second, so if a file changes several times within one second, Last-Modified does not reflect the changes, whereas the ETag changes every time. On load-balanced servers, the Last-Modified values generated by different servers may also be inconsistent.
Second, in terms of performance, ETag is inferior to Last-Modified: Last-Modified only records a time, while ETag requires the server to compute a hash value.
Third, in terms of priority, the server gives priority to ETag when validating.
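The accuracy point can be illustrated with two writes that land within the same second. The helper names and the hash-based ETag are assumptions for this sketch; the timestamps are made up.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def last_modified_value(mtime: datetime) -> str:
    # HTTP dates carry whole seconds only, so sub-second detail is lost.
    return format_datetime(mtime.replace(microsecond=0), usegmt=True)


def etag_value(body: bytes) -> str:
    # Assumption: a content hash is used as the ETag.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


# Two hypothetical writes 0.8 seconds apart, inside the same second.
first_write = datetime(2016, 7, 22, 1, 47, 0, 100000, tzinfo=timezone.utc)
second_write = datetime(2016, 7, 22, 1, 47, 0, 900000, tzinfo=timezone.utc)

same_last_modified = (
    last_modified_value(first_write) == last_modified_value(second_write)
)
same_etag = etag_value(b"version 1") == etag_value(b"version 2")
```

Here the two Last-Modified values are identical while the ETags differ, so only the ETag reveals the second change.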

6. Caching mechanism

Strong caching takes precedence over negotiated caching. If the strong cache (Expires and Cache-Control) takes effect, the cache is used directly. If it does not, the negotiated cache (Last-Modified / If-Modified-Since and ETag / If-None-Match) is used, and the server decides whether the cache may be used. If the negotiated cache is invalid, the cached copy for that request is invalid: the server returns 200 together with the resource and a new cache identifier, which are then stored in the browser cache. If the negotiated cache takes effect, the server returns 304 and the browser continues to use the cache. The specific flow chart is as follows:
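The overall order of checks can be condensed into one small function. This is a sketch under simplifying assumptions: the cache entry is a plain dictionary with hypothetical fresh_until and validator fields, time is a number of seconds, and revalidate stands in for the round trip to the server.

```python
from typing import Callable, Dict, Optional


def load(entry: Optional[Dict],
         now: float,
         revalidate: Callable[[str], int]) -> str:
    """Return which path serves the resource, in the order described above."""
    if entry is None:
        return "200 from server"      # nothing cached yet
    if now < entry["fresh_until"]:
        return "200 from cache"       # strong cache hit: no request sent
    # Strong cache expired: ask the server, sending the cache identifier.
    if revalidate(entry["validator"]) == 304:
        return "304 use cache"        # negotiated cache takes effect
    return "200 from server"          # negotiated cache invalid: new resource


def server_says_unchanged(validator: str) -> int:
    # Hypothetical server that still holds the version with ETag "abc".
    return 304 if validator == '"abc"' else 200


cached = {"fresh_until": 300.0, "validator": '"abc"'}
stale_copy = {"fresh_until": 300.0, "validator": '"old"'}
```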

[Figure: flow chart of the overall caching mechanism]

7. The impact of user behavior on browser cache

If a resource has been cached by the browser, then when it is requested again before the cache expires, the browser by default first checks whether the strong cache hits. If it does, the cache is read directly. If it does not, a request is sent to the server to check whether the negotiated cache hits. If the negotiated cache hits, the server tells the browser that it can still read from the cache; otherwise the server returns the latest resource. This is the default handling, which user actions in the browser can change:

  1. Entering a URL in the address bar and following links are normal user behaviors and trigger the browser cache mechanism;

  2. F5 refresh: the browser sets max-age=0, skips the strong cache check, and goes straight to the negotiated cache check;

  3. Ctrl+F5 refresh: the browser skips both the strong cache and the negotiated cache and pulls the resource directly from the server.
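The three behaviors above amount to a small lookup table of which cache checks still run. The action names and the cache_checks function are hypothetical labels chosen for this sketch.

```python
from typing import Dict

# Which cache checks the browser still performs for each user action,
# following the three cases listed above.
CACHE_CHECKS: Dict[str, Dict[str, bool]] = {
    "navigate": {"strong": True, "negotiated": True},    # address bar, links
    "f5": {"strong": False, "negotiated": True},         # max-age=0
    "ctrl+f5": {"strong": False, "negotiated": False},   # always refetch
}


def cache_checks(action: str) -> Dict[str, bool]:
    """Look up the checks for an action; unknown actions raise KeyError."""
    return CACHE_CHECKS[action]


f5_checks = cache_checks("f5")
```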


Statement:
This article is reproduced from segmentfault.com.