
What is browser cache? What kind of mechanism is there?

不言
2018-11-17 16:43:14

This article explains what the browser cache is and which mechanisms it relies on. It should be a useful reference for readers who need it.

Many developers have a love-hate relationship with the browser cache. On the one hand, it greatly improves the user experience; on the other hand, stale content is sometimes displayed because it was read from the cache, so during development people go to great lengths to disable it. So what exactly is the browser cache?

What is browser caching:

Simply put, browser caching means that the browser keeps a local copy of Web resources it has already requested (HTML pages, images, JS, data, and so on). The cache stores a copy of the response for each request that passes through it. When the next request arrives for the same URL, the caching mechanism decides whether to answer it directly from the copy or to send the request to the origin server again. The most common case is that the browser caches the pages of a website it has visited. When the URL is visited again and the page has not been updated, the page is not downloaded again and the locally cached copy is used instead. Only when the website clearly signals that the resource has been updated does the browser download it again.

For example, after a page has been requested once, its Web resources are cached. On subsequent repeat requests, many resources are read directly from the cache (from cache) instead of being requested from the server again.

Why use caching:

(1) Reduce network bandwidth consumption

For website operators and users alike, bandwidth means money, and excessive bandwidth consumption only benefits the network operators. When a cached copy is used, only minimal network traffic is generated, which effectively reduces operating costs.

(2) Reduce server pressure

After setting the validity period for network resources, users can reuse the local cache, reducing requests to the source server and indirectly reducing server pressure. At the same time, search engine crawler robots can also reduce the frequency of crawling based on the expiration mechanism, which can also effectively reduce the pressure on the server.

(3) Reduce network delay and speed up page opening speed

Bandwidth matters a great deal to individual website operators, while large Internet companies with deep pockets may not care much about it. So does Web caching still have a role? The answer is yes. For end users, caching noticeably speeds up page loading and gives a better experience.

Browser-side caching rules:

The browser-side caching rules are defined in the HTTP headers and in the meta tags of the HTML page. They specify, along the two dimensions of freshness and validation, whether the browser can use the cached copy directly or must go to the origin server for an updated version.

Freshness (expiration mechanism): This is the validity period of the cached copy. The browser considers a cached copy valid and fresh enough if it meets one of the following conditions:

  1. It carries complete expiration control information (HTTP headers) and is still within its validity period;

  2. The browser has already used this cached copy and has already checked its freshness during the current session.

If either condition is met, the browser takes the copy directly from the cache and renders it.
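The first freshness condition can be sketched roughly as follows. This is only an illustration, not how any particular browser implements it: the `CacheEntry` class, its fields, and the `is_fresh` and `lookup` functions are hypothetical names, and the session-based second condition is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    """A hypothetical cached copy of one resource."""
    body: bytes
    stored_at: float           # when the copy was stored (seconds since epoch)
    freshness_lifetime: float  # validity period in seconds, taken from the headers


def is_fresh(entry: CacheEntry, now: float) -> bool:
    """Return True while the copy is still inside its validity period."""
    age = now - entry.stored_at
    return age < entry.freshness_lifetime


def lookup(cache: Dict[str, CacheEntry], url: str, now: float) -> Optional[bytes]:
    """Return the cached body if a fresh copy exists, else None (ask the server)."""
    entry = cache.get(url)
    if entry is not None and is_fresh(entry, now):
        return entry.body
    return None
```

A caller would treat a `None` result as "go back to the origin server", which is the branch the validation mechanism then handles.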

Validation (verification mechanism): When the server returns a resource, it sometimes includes the resource's entity tag, ETag (Entity Tag), in the response headers. The browser can use it as a validator when it requests the resource again. If the validators do not match, the resource has been modified or has expired, and the browser needs to fetch the resource content again.
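A server-side view of this validation could look like the sketch below. It assumes the ETag is derived from a hash of the content, which is just one possible scheme, and the function names `make_etag` and `respond` are made up for illustration. The browser sends the stored tag back in the If-None-Match request header.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content (one possible scheme, assumed here)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a request to an unchanged or changed resource."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # Validator matches: the cached copy is still good, send no body.
        return 304, {"ETag": etag}, b""
    # No validator or a mismatch: send the full resource with its current tag.
    return 200, {"ETag": etag}, body
```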

Browser cache control:

(1) Using the HTML Meta tag

Web developers can add a <meta> tag to the <head> node of the HTML page, as follows:

<meta http-equiv="Pragma" content="no-cache">

The code above tells the browser not to cache the current page, so the page has to be fetched from the server on every visit. However, there is a pitfall here.

In practice, this way of disabling the cache is of very limited use:

a. Only IE recognizes the meaning of this meta tag; other mainstream browsers only recognize the "Cache-Control: no-store" meta tag.

b. Even when IE recognizes the meta tag, it does not necessarily add Pragma to the request headers, but it does cause the current page to send a new request every time (this applies only to the page itself; resources on the page are not affected).

(2) Using cache-related HTTP headers

First, some background on HTTP. A complete HTTP interaction for a URI consists of an HTTP request and an HTTP response. For details about HTTP, see "Hypertext Transfer Protocol -- HTTP/1.1", "HTTP Protocol Detailed Explanation", and similar references.

In the message headers of HTTP requests and responses, common cache-related message headers are:

[Figure: table of common cache-related HTTP request and response headers]

Now that we have a basic understanding of some of the HTTP request and response header fields, let us look at the relationships and differences between them:

· Cache-Control and Expires

Cache-Control serves the same purpose as Expires. Both indicate the validity period of the current resource and control whether the browser takes the data directly from its cache or sends a new request to the server. Cache-Control simply offers more options and finer-grained settings. If both are set, Cache-Control takes priority over Expires.
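The priority rule can be sketched as a small function that computes the validity period from response headers. This is a simplified illustration: `freshness_lifetime` is a hypothetical name, only the max-age directive is considered, and malformed dates are not handled.

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def freshness_lifetime(headers: Dict[str, str]) -> Optional[float]:
    """Validity period in seconds, or None if the response gives no expiration info."""
    # Cache-Control: max-age takes priority when it is present.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return float(value)
    # Otherwise fall back to Expires, measured relative to the Date header.
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max((expires - date).total_seconds(), 0.0)
    return None
```

With both headers set, the max-age value is returned and the Expires date is never consulted.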

· Last-Modified/ETag and Cache-Control/Expires

When Last-Modified/ETag is configured and the browser accesses a resource at the same URI again, it still sends a request to the server to ask whether the file has been modified. If not, the server returns only a 304 response, telling the browser to take the data from its local cache; if it has been modified, the server sends the entire resource to the browser again.

Cache-Control/Expires is different. If it is detected that the local cache is still within the valid time range, the browser will directly use the local copy and will not send any request. When the two are used together, Cache-Control/Expires has a higher priority than Last-Modified/ETag. That is, when the local copy is found to be still valid according to Cache-Control/Expires, it will not send another request to the server to ask for the modification time (Last-Modified) or entity identification (Etag).
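Seen from the browser side, the interplay of the two mechanisms might be sketched like this. The function `plan_request` and its return convention are assumptions made for illustration: it returns `None` when no request is needed, and otherwise the extra request headers to send.

```python
from typing import Dict, Optional


def plan_request(cached_headers: Optional[Dict[str, str]], is_fresh: bool) -> Optional[Dict[str, str]]:
    """Return None to serve from cache without a request, else the headers to send."""
    if cached_headers is None:
        return {}  # nothing cached yet: plain, unconditional request
    if is_fresh:
        return None  # Cache-Control/Expires says the copy is valid: no request at all
    # The copy is stale: ask the server whether it changed, using the stored validators.
    conditional: Dict[str, str] = {}
    if "ETag" in cached_headers:
        conditional["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        conditional["If-Modified-Since"] = cached_headers["Last-Modified"]
    return conditional
```

The freshness check comes first, so the validators are only sent once the validity period has run out.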

In general, Cache-Control/Expires is used together with Last-Modified/ETag. Even if the server sets a cache lifetime, the browser ignores the cache and sends a request to the server anyway when the user clicks the "Refresh" button. In that case Last-Modified/ETag makes good use of 304 and reduces the response overhead.

· Last-Modified and ETag

You may think that Last-Modified is enough to let the browser know whether the local cached copy is fresh enough, so why is ETag (entity tag) needed as well? ETag was introduced in HTTP/1.1 mainly to solve several problems that Last-Modified handles poorly:

  1. Last-Modified is only accurate to the second. If a file is modified several times within one second, Last-Modified cannot accurately reflect its freshness.

  2. Some files are regenerated periodically. Sometimes their content does not change but Last-Modified does, so the cached copy cannot be reused.

  3. The server may not obtain the file modification time accurately, or its clock may be inconsistent with that of a proxy server, and so on.
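The first two problems can be made concrete with a short sketch. It assumes a content-hash ETag, which is only one possible scheme; the function names and timestamps are made up for illustration.

```python
import hashlib
from email.utils import formatdate


def last_modified(mtime: float) -> str:
    """Format a modification time as an HTTP date, which has whole-second precision."""
    return formatdate(mtime, usegmt=True)


def content_etag(body: bytes) -> str:
    """ETag derived from the content only (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


# Problem 1: two edits inside the same second share one Last-Modified value,
# but their content-based ETags differ.
same_second = last_modified(1542472994.2) == last_modified(1542472994.8)
tags_differ = content_etag(b"version 1") != content_etag(b"version 2")

# Problem 2: a file regenerated later with identical content gets a new
# Last-Modified value, while its content-based ETag stays the same.
date_changed = last_modified(1542472994.0) != last_modified(1542476594.0)
tag_stable = content_etag(b"unchanged") == content_etag(b"unchanged")
```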

An ETag is a unique identifier for the corresponding resource on the server, generated automatically by the server or by the developer, and it allows more precise cache control. Last-Modified and ETag can be used together. The server validates the ETag first; if it matches, it then compares Last-Modified and finally decides whether to return 304. For the rules servers use to generate ETags and for strong versus weak ETags, see "Interactive Encyclopedia-Etag" and "HTTP Header definition"; they are not discussed in detail here.

Note:

1. Although ETag allows more precise cache control, note that in a distributed system the Last-Modified values of a file must be consistent across machines; otherwise requests load-balanced to different machines will fail the comparison. Yahoo recommends that distributed systems turn ETag off where possible, because the ETag generated by each machine will differ: besides Last-Modified, the inode is also hard to keep consistent.

2. Last-Modified/If-Modified-Since should be used together with Cache-Control, and so should ETag/If-None-Match.
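The first note can be illustrated with a small sketch. It assumes a simplified ETag built from the file's inode, size, and modification time; the function names and numbers are hypothetical. A content-hash ETag, which is not part of the original article, is shown as one alternative that stays identical across machines.

```python
import hashlib


def metadata_etag(inode: int, size: int, mtime: int) -> str:
    """Simplified file-metadata ETag: changes if inode, size, or mtime differs."""
    return f'"{inode:x}-{size:x}-{mtime:x}"'


def content_etag(body: bytes) -> str:
    """ETag derived only from the bytes served, so every machine produces the same value."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


body = b"body { color: red }"

# The same file deployed on two machines: same size and mtime, different inode.
etag_machine_a = metadata_etag(inode=1001, size=len(body), mtime=1542472994)
etag_machine_b = metadata_etag(inode=2002, size=len(body), mtime=1542472994)

# A request validated against machine B with a tag issued by machine A would not match.
metadata_tags_match = etag_machine_a == etag_machine_b
content_tags_match = content_etag(body) == content_etag(body)
```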

Browser HTTP request process:

First request:

[Figure: flow of the first request]

Request again:

[Figure: flow of a repeat request]

User behavior and cache:

Browser cache behavior also depends on user behavior, as follows:

[Figure: table of user actions and their effect on the cache]

Uncacheable requests:

Of course, not every request can be cached. The following requests cannot be cached by the browser:

1. Requests whose HTTP headers contain Cache-Control: no-cache, Pragma: no-cache (HTTP/1.0), Cache-Control: max-age=0, or similar directives telling the browser not to cache

2. Dynamic requests whose content is determined by cookies, authentication information, and the like

3. Requests encrypted with HTTPS (some people have found through testing that IE can in fact cache HTTPS resources once Cache-Control: max-age is added to the header, and Firefox once Cache-Control: Public is added; see "Seven Misunderstandings of HTTPS")

4. POST requests cannot be cached

5. Requests that do not contain Last-Modified/Etag or Cache-Control/Expires in the HTTP response header cannot be cached
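Rules 1, 4, and 5 from the list above can be expressed as a rough check. This sketch mirrors the article's list rather than the full HTTP caching rules, and it leaves out rules 2 and 3; the function name `is_cacheable` is an assumption.

```python
from typing import Dict


def is_cacheable(method: str, response_headers: Dict[str, str]) -> bool:
    """Apply rules 1, 4 and 5 of the list above to one request/response pair."""
    headers = {k.lower(): v.lower() for k, v in response_headers.items()}
    # Rule 4: POST requests are not cached.
    if method.upper() == "POST":
        return False
    # Rule 1: explicit "do not cache" directives.
    cache_control = [d.strip() for d in headers.get("cache-control", "").split(",")]
    if "no-cache" in cache_control or "max-age=0" in cache_control:
        return False
    if headers.get("pragma", "") == "no-cache":
        return False
    # Rule 5: at least one validator or expiration header must be present.
    has_validator = "last-modified" in headers or "etag" in headers
    has_expiry = "cache-control" in headers or "expires" in headers
    return has_validator or has_expiry
```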



Statement:
This article is reproduced from segmentfault.com. In case of infringement, please contact admin@php.cn to have it removed.