One article to solve 'caching'

coldplay.xixi
2020-10-28 17:05:02

The JavaScript column explains how to get caching right.

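Preface

Caching means that, to lower the frequency of requests to the server and reduce the amount of communication, the front end saves the data it has fetched and reuses the saved data when it is needed again.

Caching has a large impact on both user experience and communication cost, so the caching mechanism should be used as flexibly as possible.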

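How caching works

HTTP caching is a time-based cache.

The browser caches the response to its first request, and later requests can retrieve that response from the cache. This reduces latency and also lowers bandwidth consumption: since the request may never be sent at all, the load on the network drops as well.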

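How it works

The browser sends its first request and the server returns a response. If the response contains information telling the browser that it may be cached, the browser stores the response in its cache.

When the same request is made again, the browser first checks whether the cached response has expired. If it has not, the browser does not send a request to the server at all and takes the result directly from the cache.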
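The flow above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `TinyCache`, `Entry` and the `origin` callback are hypothetical names, time comes from an injected clock, and an expired entry is simply fetched again (revalidation is covered later in the article).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Entry:
    body: str
    stored_at: float  # time at which the response was cached (seconds)
    max_age: float    # lifetime announced by the server (seconds)


class TinyCache:
    """Hypothetical private cache keyed by URL."""

    def __init__(self, origin: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float]) -> None:
        self.origin = origin        # returns (body, max_age) for a URL
        self.clock = clock          # returns the current time in seconds
        self.entries: Dict[str, Entry] = {}
        self.origin_hits = 0        # how many requests reached the server

    def get(self, url: str) -> str:
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now - entry.stored_at < entry.max_age:
            return entry.body       # not expired: no request is sent
        self.origin_hits += 1       # missing or expired: ask the server
        body, max_age = self.origin(url)
        self.entries[url] = Entry(body, now, max_age)
        return body
```

With a 60-second lifetime, a second `get` after 30 seconds never reaches the server, while one after 61 seconds does.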

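For example, when visiting the Juejin site:
[Figure: request list for the Juejin site, showing the Size column]
In the Size column, "disk cache" indicates that the response was read from the cache on disk.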

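When the cache has expired

If the cache has expired, the server does not necessarily return a full response as it did for the first request.

Once the browser's cached copy has expired, the browser sends the request to the server together with the validators of the cached copy. If the server decides the cached copy is still usable, it returns a 304 status code and the browser keeps using the cached copy.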

For example, select one of the cached files in the picture above, copy its request URL, and replay the request with curl.
First add -I to fetch the headers of the original response and look at the etag or last-modified header.

After the browser's cached copy expires, the browser sends these values back to the server with the request so that the server can decide whether the cached copy is still usable.
For the etag header, add an if-none-match header carrying the etag value when querying the server. You can likewise add an if-modified-since header carrying the last-modified value.
The server returns 304. The advantage of 304 is that the response carries no message body, which saves a lot of bandwidth.
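As a rough sketch of what the browser does when a cached response has expired, the helper below turns the validators of a cached response into the headers of a conditional request. The function name `conditional_headers` is made up for this illustration.

```python
from typing import Dict


def conditional_headers(cached_response_headers: Dict[str, str]) -> Dict[str, str]:
    """Build revalidation headers from the validators of a cached response."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value
               for name, value in cached_response_headers.items()}
    headers: Dict[str, str] = {}
    if "etag" in lowered:
        headers["If-None-Match"] = lowered["etag"]
    if "last-modified" in lowered:
        headers["If-Modified-Since"] = lowered["last-modified"]
    return headers
```

A response cached without either validator yields an empty dictionary, in which case only an unconditional request is possible.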

Shared cache

The browser cache is a private cache and is only available to one user.

A shared cache sits on a server and can be used by multiple users. For example, hot resources such as a popular video are kept in a proxy server's cache to reduce the pressure on the origin server and improve network efficiency.

How can you tell whether a resource was served from a proxy server's cache or sent by the origin server?

Take the Juejin example again:
[Figure: Response Headers of a request to the Juejin site, including an age header]
The picture shows an age header in the Response Headers of this request. Its unit is seconds.

The age header indicates that the response was returned by a shared cache, and its value is the time the response has spent in that cache. The value in the figure is 327784, which means the response has been in the shared cache for 327784 seconds.

The shared cache also expires. Let's take a look at how the shared cache works.
[Figure: client1, client2 and client3 sending requests through Cache to the origin server]
As shown in the figure:
1. client1 initiates a request. Cache, the proxy server acting as a shared cache, forwards the request to the origin server. The origin server returns the response and states in the Cache-Control header that it may be cached for 100 seconds. Cache then starts an Age timer and returns the response to client1 with the Age: 0 header.

2. After 10 seconds, client2 sends the same request. The cached response in Cache has not expired, so Cache returns it to client2 with the Age: 10 header.

3. After 100 seconds, client3 sends the same request. By now the cached response in Cache has expired, so Cache sends the origin server a conditional request, using the If-None-Match header described earlier to carry the fingerprint of the cached response. The origin server decides the cached response is still usable and returns a 304 status code to Cache. Cache restarts the timer, takes the response from its cache, and returns it to client3 with the Age: 0 header.
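The three steps can be replayed with a small simulation. Everything here is an assumption made for illustration: `SharedCache` and `origin` are hypothetical, time is passed in as an integer number of seconds, and the resource never changes, so the origin always answers the conditional request with 304.

```python
from typing import List, Optional, Tuple

MAX_AGE = 100   # seconds, as announced in the origin's Cache-Control header
ETAG = '"v1"'   # hypothetical fingerprint of the resource


def origin(if_none_match: Optional[str]) -> Tuple[int, Optional[str]]:
    """Hypothetical origin server: 304 when the fingerprint still matches."""
    if if_none_match == ETAG:
        return 304, None
    return 200, "video-bytes"


class SharedCache:
    def __init__(self) -> None:
        self.body: Optional[str] = None
        self.stored_at = 0            # when the Age timer was last (re)started
        self.log: List[int] = []      # status codes received from the origin

    def handle(self, now: int) -> Tuple[Optional[str], int]:
        """Return (body, Age) for a client request arriving at time `now`."""
        if self.body is not None:
            age = now - self.stored_at
            if age < MAX_AGE:
                return self.body, age        # fresh: answer from the cache
            status, body = origin(ETAG)      # expired: conditional request
        else:
            status, body = origin(None)      # nothing cached yet
        self.log.append(status)
        if status == 200:
            self.body = body                 # store the full response
        self.stored_at = now                 # restart the Age timer
        return self.body, 0
```

Calling `handle(0)`, `handle(10)` and `handle(100)` on one instance corresponds to client1, client2 and client3.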

Caching mechanism

HTTP defines caching mechanisms, and an API can use them directly to manage its cache. The HTTP caching mechanism is defined in detail in RFC 7234 and falls into two categories: the expiration model and the validation model.

  • The expiration model fixes the retention period of the response data in advance. Once the period has passed, the client accesses the server again to fetch the data anew.
  • The validation model asks the server whether the cached data is still the latest, and fetches the data again only when it has been updated on the server.

In HTTP, a cache entry that can be used is called fresh, and one that cannot is called stale.

Expiration model

The expiration model is implemented by including the expiration information in the server's response. HTTP 1.1 defines two ways of doing this: the Cache-Control response header and the Expires response header.

// 1
Expires: Thu, 01 Oct 2020 00:00:00 GMT
// 2
Cache-Control: max-age=3600

The Expires header has existed since HTTP 1.0. It expresses the expiration as an absolute time, written in the time format defined in RFC 1123. Cache-Control is defined in HTTP 1.1 and expresses the expiration as a number of seconds counted from the current time.

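Which of the two headers to use depends on the nature of the data returned. For data known from the outset to be updated at a specific time, such as a weather forecast updated at the same time every day, the Expires header can state when the update takes place. For data that will not be updated in the future, or static data, a date far in the future can be specified so that the cached data is kept indefinitely. However, HTTP 1.1 says a server should not set a time more than one year ahead, so the furthest "far future" date is one year from now.

Expires: Fri, 01 Oct 2021 00:00:00 GMT

The Cache-Control header suits data that is not updated on a fixed schedule but whose update frequency is reasonably stable, or data that is updated fairly often but for which frequent access to the server is undesirable.

If the Expires and Cache-Control headers are used together, the Cache-Control header takes precedence.

The Cache-Control example above uses the max-age keyword. The max-age calculation relies on the Date header, which shows the time at which the server generated the response. Counting from that time, the cache is considered expired once the elapsed time exceeds the max-age value.

Date: Wed, 30 Sep 2020 00:00:00 GMT

The Date header records the time at which the server generated the response. According to the HTTP specification, HTTP messages must carry a Date header except in a few special cases.

The time in the Date header must be written in the format known as HTTP-date. Because this time is used when calculating how long a response has been cached, the Date header can serve to synchronize time, so the calculation works even if the client has arbitrarily changed its date settings.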
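A minimal sketch of the expiration calculation, assuming the response headers arrive as a plain dictionary with exactly these capitalised names and that a Date header is present. The function names are invented for the example. The standard-library function `email.utils.parsedate_to_datetime` parses the HTTP date format.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def freshness_lifetime(headers: Dict[str, str]) -> Optional[float]:
    """Seconds the response stays fresh, counted from its Date header."""
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:  # Cache-Control takes precedence over Expires
        return float(match.group(1))
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return (expires - date).total_seconds()
    return None  # the server gave no explicit expiration


def is_fresh(headers: Dict[str, str], now: datetime) -> bool:
    """True while the time elapsed since Date is below the lifetime."""
    lifetime = freshness_lifetime(headers)
    if lifetime is None:
        return False
    elapsed = (now - parsedate_to_datetime(headers["Date"])).total_seconds()
    return elapsed < lifetime
```

Because the elapsed time is measured against the server's Date value rather than the moment the client stored the response, both headers describe the same timeline.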

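Validation model

Whereas the expiration model decides how long to keep the cache purely from the response it received, the validation model asks the server whether the currently stored cache is still valid.

The validation model accesses the network from time to time while checking the cache. To use it, the application server must support conditional requests. A conditional request is the front end telling the server: "if the information I have stored has been updated, please send me the updated version." Throughout the process, the front end sends information about "the data obtained at some point in the past". Only if the data on the server has been updated does the server return the new data. Otherwise it returns just a 304 (Not Modified) status code to tell the front end that there is nothing newer on the server.

To make a conditional request, the front end has to tell the server the state of the information it currently holds. The last-modified date or the entity tag (Entity Tag) serves as the indicator. As the name suggests, the last-modified date is the date on which the data was last updated. The entity tag is an identifier for a specific version of a resource: a string that acts as a fingerprint, for example the MD5 hash of the response data, which changes whenever the message content changes. Both are generated on the server and sent to the front end in the response headers. The front end stores them together with the cached data and uses them for conditional requests.

The last-modified date and the entity tag are returned to the front end in the Last-Modified and ETag response headers respectively.

Last-Modified: Fri, 01 Oct 2021 00:00:00 GMT
ETag: "ff568sdf4545687fadf4dsa545e4f5s4f5se45"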
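As one possible way for a server to produce these two headers, the sketch below derives the entity tag from the MD5 hash of the body, as the text suggests, and formats the update time as an HTTP date. `make_validators` is a hypothetical helper, and the update time is assumed to be a timezone-aware UTC datetime. Entity tags are written as double-quoted strings.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def make_validators(body: bytes, updated_at: datetime) -> Dict[str, str]:
    """Build Last-Modified and ETag headers for a response body."""
    digest = hashlib.md5(body).hexdigest()  # fingerprint of the content
    return {
        # usegmt=True requires a datetime whose tzinfo is timezone.utc
        "Last-Modified": format_datetime(updated_at, usegmt=True),
        "ETag": f'"{digest}"',
    }
```

Any change to the body changes the hash, and with it the ETag.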

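When the front end makes a conditional request with the last-modified date, it uses the If-Modified-Since header. With an entity tag, it uses the If-None-Match header.

GET /v1/user/1
If-Modified-Since: Fri, 01 Oct 2021 00:00:00 GMT

GET /v1/user/1
If-None-Match: "ff568sdf4545687fadf4dsa545e4f5s4f5se45"

The server compares the information sent by the front end with the current information and returns a 304 status code if nothing has changed. If there has been an update, it responds as it would to an ordinary request, returning a 200 status code together with the updated content, along with the new last-modified date and entity tag. When the server returns a 304 status code the response body is empty, which saves on the amount of data transferred.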
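The server-side decision can be sketched as follows. This is a simplified illustration rather than a complete implementation: `respond` is a hypothetical function, If-None-Match is compared as a single exact string (real requests may list several tags or send `*`), and when both conditional headers are present only If-None-Match is evaluated.

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Optional, Tuple


def respond(request_headers: Dict[str, str], etag: str, last_modified: str,
            body: str) -> Tuple[int, Dict[str, str], Optional[str]]:
    """Return (status, headers, body) for a possibly conditional GET."""
    validators = {"ETag": etag, "Last-Modified": last_modified}
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    if if_none_match is not None:
        # The entity tag is checked first; If-Modified-Since is then ignored.
        not_modified = if_none_match == etag
    elif if_modified_since is not None:
        not_modified = (parsedate_to_datetime(last_modified)
                        <= parsedate_to_datetime(if_modified_since))
    else:
        not_modified = False  # ordinary, unconditional request

    if not_modified:
        return 304, validators, None  # no body is sent
    return 200, validators, body      # full response with fresh validators
```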

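In HTTP, ETag validation comes in two forms: strong and weak.

  • An ETag for strong validation
    ETag: "ffsd5f46s12wef13we2f13dsd21fsd32f1"

  • An ETag for weak validation
    ETag: W/"ffsd5f46s12wef13we2f13dsd21fsd32f1"

Strong validation means the data on the server and on the client must not differ by even a single byte. Weak validation means that data can be treated as the same even if it is not byte-for-byte identical, as long as the resource has not changed in meaning. Take advertisements: their content changes on every visit, yet they remain the same resource, so weak validation is appropriate.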
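The difference shows up when two entity tags are compared. The sketch below follows the strong and weak comparison rules from the HTTP conditional requests specification (RFC 7232). The function names are chosen for this example.

```python
def is_weak(etag: str) -> bool:
    """Weak entity tags carry a W/ prefix."""
    return etag.startswith("W/")


def opaque_tag(etag: str) -> str:
    """The quoted part of the tag, without any W/ prefix."""
    return etag[2:] if is_weak(etag) else etag


def strong_match(a: str, b: str) -> bool:
    # Strong comparison: neither tag is weak and both are identical.
    return not is_weak(a) and not is_weak(b) and a == b


def weak_match(a: str, b: str) -> bool:
    # Weak comparison: the quoted parts match, whether or not a tag is weak.
    return opaque_tag(a) == opaque_tag(b)
```

Under these rules `W/"1"` and `"1"` match weakly but not strongly.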

Heuristic expiration

HTTP 1.1 states that when the server gives no explicit expiration time, the client may decide for itself how long to keep the cached data. The client then has to work out the expiration time from information such as how often the server updates the data and the circumstances at hand. This approach is called heuristic expiration.

For example, if the front end sees from Last-Modified that the last update was a year ago, keeping the cached data for a while longer is unlikely to cause problems. If the requests made so far show that the data is updated only once a day, keeping the cache for half a day may be feasible. In this way the front end can reduce the number of requests by making its own judgment.
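One heuristic of this kind, which RFC 7234 mentions as a typical setting, is to keep the response for a fraction such as 10% of the time that has passed since Last-Modified. The sketch below assumes that rule. The function name and the default fraction are illustrative choices, not something this article prescribes.

```python
from email.utils import parsedate_to_datetime


def heuristic_lifetime(date: str, last_modified: str,
                       fraction: float = 0.1) -> float:
    """Seconds to keep a response that has no explicit expiration time."""
    unchanged_for = (parsedate_to_datetime(date)
                     - parsedate_to_datetime(last_modified)).total_seconds()
    # The longer the data has gone unchanged, the longer it may be kept.
    return max(0.0, unchanged_for * fraction)
```

For data last modified a year before the response was generated, this gives roughly 36.5 days.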

Whether heuristic expiration is acceptable depends on the characteristics of the API. Because the server understands cache updates and control best, the ideal approach for both sides is for the server to tell the front end exactly how long to keep the cached data through headers such as Cache-Control and Expires. If it does not, the server should at least inform the front end through header information such as Last-Modified.

Use Vary to specify the cache unit

When implementing caching, you may also need to specify the Vary header. Vary specifies which request headers, in addition to the URI, are used to identify a unique piece of data. It is needed because even when the URI is the same, the data returned can differ depending on the request headers. A cached response can only be used when the headers named by the Vary header match those of the new request.

The definition of Vary:

  • "*": matching always fails
  • One or more field-names: the named headers must match those of the request for the cache to be used

[Figure: Client1, Client2 and Client3 sending GET requests through Cache to server]

As shown in the figure:
1. Client1 sends server a GET request carrying the Accept-Encoding: * header. server returns a gzip-encoded response together with the Vary: Accept-Encoding header, indicating that the cached response may only be used when the requested encoding is the same.

2. Client2 sends server a GET request carrying the Accept-Encoding: br header, so the encoding requested this time is br. Cache cannot use its cached response because the request does not match on the header named in Vary, and can only forward the request to the origin server.

3. Client3 sends server a GET request carrying the Accept-Encoding: br header. By now Cache holds a br-encoded response that matches on the header named in Vary, so it can answer from the cache.
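The matching rule can be sketched as a cache that keeps several variants per URL. `VaryCache` is a hypothetical class, header values are compared as exact strings, and the Vary value is assumed to name request headers such as Accept-Encoding.

```python
from typing import Dict, List, Optional, Tuple

Variant = Tuple[List[str], Dict[str, Optional[str]], str]


class VaryCache:
    """Hypothetical cache that stores one variant per Vary combination."""

    def __init__(self) -> None:
        self.variants: Dict[str, List[Variant]] = {}

    def store(self, url: str, request_headers: Dict[str, str],
              vary: str, body: str) -> None:
        names = [n.strip().lower() for n in vary.split(",") if n.strip()]
        request = {k.lower(): v for k, v in request_headers.items()}
        # Remember the request header values this variant was selected by.
        snapshot = {name: request.get(name) for name in names}
        self.variants.setdefault(url, []).append((names, snapshot, body))

    def lookup(self, url: str, request_headers: Dict[str, str]) -> Optional[str]:
        request = {k.lower(): v for k, v in request_headers.items()}
        for names, snapshot, body in self.variants.get(url, []):
            if "*" in names:
                continue  # Vary: * never matches
            if all(request.get(name) == snapshot[name] for name in names):
                return body
        return None  # no variant matches: forward to the origin server
```

Storing the response to an `Accept-Encoding: *` request and then looking up with `Accept-Encoding: br` returns `None`, as in step 2. After the br variant has been stored, the same lookup succeeds, as in step 3.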

Generally speaking, the Vary header matters when HTTP traffic passes through a proxy server, especially one that caches. However, the server often cannot know whether the front end is reaching it through a proxy server. So whenever server-driven content negotiation is used, the Vary header is required.

Cache-Control

The range of values for the Cache-Control header is quite complex.

The definition of a Cache-Control value is:

  • A required token value
  • An optional "=" followed by a quoted value or one or more decimal digits, that is, a number of seconds

Cache-Control can be used in a request or in a response, and the same value has a different meaning in each.

A Cache-Control value takes one of three forms:

  • 1. The token used directly
  • 2. token '=' decimal number
  • 3. token '=' the corresponding headers, or the token used directly
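A minimal parser for these forms might look like the sketch below. It is a simplification: `parse_cache_control` is a hypothetical helper that splits on commas, so it assumes no quoted value contains a comma (a list of several header names inside quotes would need a real tokenizer).

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Map each directive token to its argument, or None if it has none."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        token, separator, argument = part.partition("=")
        # Tokens are case-insensitive; quotes around the argument are dropped.
        directives[token.strip().lower()] = (
            argument.strip().strip('"') if separator else None
        )
    return directives
```

For instance, `public, max-age=3600` yields `public` with no argument and `max-age` with the argument `3600`.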
Application in the request

The values, usages and meanings of Cache-Control in a request (the number after @ indicates which of the three forms above is used):

  • max-age@2: Tell the server that the client will not accept a cached response whose Age exceeds max-age seconds
  • max-stale@2: Tell the server that the client is still willing to use a cached response that is no longer fresh, as long as it has been expired for no more than max-stale seconds. If max-stale has no value, the client will use the response no matter how long ago it expired
  • min-fresh@2: Tell the server that the cached response must stay fresh for at least another min-fresh seconds before it can be used
  • no-cache@1: Tell the server that an existing cached response must not be returned directly. It may only be used after a conditional request to the upstream server has returned a 304 status code
  • no-store@1: Tell each proxy server not to cache the response to this request
  • no-transform@1: Tell the proxy server not to modify the content of the message body
  • only-if-cached@1: Tell the server to return only a cached response. If there is none, it returns a 504 error code
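How a cache might apply the three numeric request directives can be sketched as below. The function `acceptable` and its parameters are assumptions for illustration: `age` is the current Age of the stored response, `lifetime` is its freshness lifetime, both in seconds, and a max-stale directive without a value is modelled as `float("inf")`.

```python
from typing import Optional


def acceptable(age: float, lifetime: float,
               max_age: Optional[float] = None,
               max_stale: Optional[float] = None,
               min_fresh: Optional[float] = None) -> bool:
    """Can a stored response satisfy a request with these directives?"""
    if max_age is not None and age > max_age:
        return False                   # older than the client accepts
    remaining = lifetime - age         # zero or negative once expired
    if min_fresh is not None and remaining < min_fresh:
        return False                   # will not stay fresh long enough
    if remaining > 0:
        return True                    # still fresh
    # Expired: usable only within the staleness the client tolerates.
    return max_stale is not None and -remaining <= max_stale
```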

Application in the response

The values and meanings of Cache-Control in a response:

  • max-age@2: Tell the client that the cached response expires once its Age exceeds max-age seconds
  • s-maxage@2: Similar to max-age, but applies only to shared caches, where it takes priority over max-age and Expires
  • must-revalidate@1: Tell the client that once the cached response has expired, it must be validated with the server before it is used
  • proxy-revalidate@1: Similar to must-revalidate, but applies only to the shared cache of a proxy server
  • no-cache@3: 1. Tell the client that the cached response cannot be used directly. Before use, it must be validated with the origin server and a 304 status code obtained. 2. If header names are given after no-cache, the cached response may be used directly for later requests as long as those headers are left out of it
  • no-store@1: Tell all downstream servers not to cache the response
  • no-transform@1: Tell the proxy server not to modify the content of the message body
  • public@1: Indicates that the response may be cached by both private and shared caches
  • private@3: 1. Indicates that the response must not be stored by a proxy server as a shared cache entry. 2. If header names are given after private, the proxy server must not cache the named headers but may cache the rest
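To show how the shared-cache and private-cache rules differ, the sketch below takes already parsed directives (a dictionary from token to argument, with `None` for tokens that have no argument) and answers two questions: may the response be stored, and for how long. The names are hypothetical, and the forms of no-cache and private that list header names are not handled.

```python
from typing import Dict, Optional


def may_store(directives: Dict[str, Optional[str]], shared: bool) -> bool:
    """May a cache of the given kind store the response at all?"""
    if "no-store" in directives:
        return False
    # A bare `private` keeps the response out of shared caches only.
    if shared and "private" in directives and directives["private"] is None:
        return False
    return True


def lifetime_seconds(directives: Dict[str, Optional[str]],
                     shared: bool) -> Optional[int]:
    """Lifetime from s-maxage (shared caches only) or max-age."""
    s_maxage = directives.get("s-maxage")
    if shared and s_maxage is not None:
        return int(s_maxage)           # takes priority in a shared cache
    max_age = directives.get("max-age")
    if max_age is not None:
        return int(max_age)
    return None                        # fall back to Expires or heuristics
```

With `max-age=60, s-maxage=600`, a proxy keeps the response for 600 seconds while a browser keeps it for 60.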
