
Nginx cache cleaning

WBOY (Original)
2016-08-08 09:22:27

Several ways to cache content in nginx

1. nginx's proxy_cache feature

Starting from nginx 0.7.44, nginx supports a formal cache similar to Squid's. The cache stores each link under the MD5 hash of its key, so it can handle any link and also caches non-200 statuses such as 404/301/302.

Configuration: first define a cache space (in the http block, outside any server block):

proxy_cache_path /xok/to/cache levels=1:2 keys_zone=xok1:50m inactive=5m max_size=1g;

The proxy_temp_path directory must be on the same partition as the proxy_cache_path above, otherwise an error is reported. levels=1:2 gives the cache space two levels of hashed directories: the first-level directory name is 1 character long and the second-level name is 2 characters, so a saved file name looks like /xok/to/cache/e/4a/0f1019b0db2f97d17c2238c0271a74ae. keys_zone gives the space a name; the 50m allocates 50 MB of shared memory for the active key index. inactive=5m sets the default cache duration to 5 minutes. max_size=1g caps the total on-disk cache at 1 GB; when the cache exceeds this value, an LRU policy evicts entries.

Then reference the zone in a location:

location / {
    proxy_pass http://xok.la/;
    proxy_cache xok1;              # use the xok1 keys_zone
    proxy_cache_valid 200 302 1h;  # 200 and 302 responses are kept for 1 hour
    proxy_cache_valid 301 1d;      # 301 responses are kept for one day
    proxy_cache_valid any 1m;      # everything else for one minute
}

Here is a commonly used example:

location ~ ^/(xx|yy)\.action {
    proxy_set_header Accept-Encoding "";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    add_header X-Cache $upstream_cache_status;                   # see Explanation 1
    proxy_cache xok1;                                            # the keys_zone defined earlier
    proxy_cache_key $host$uri$is_args$arg_pageNo$arg_subjectId;  # see Explanation 2
    proxy_cache_methods GET POST;
    proxy_cache_valid 200 301 302 5s;
    proxy_cache_use_stale error timeout updating http_404 http_500 http_502 http_503;  # see Explanation 3
    proxy_cache_lock on;                                         # see Explanation 4
    proxy_cache_lock_timeout 5s;
    proxy_pass http://localhost;
}
Explanation 1:
add_header X-Cache $upstream_cache_status; lets us conveniently observe the cache hit status in the response headers.

nginx provides the variable $upstream_cache_status to expose the state of the cache. Adding it as an HTTP header in the configuration achieves an effect similar to Squid's X-Cache header.

$upstream_cache_status can take the following values:

·MISS: cache miss; the request is forwarded to the backend
·HIT: cache hit
·EXPIRED: the cached entry has expired; the request is forwarded to the backend
·UPDATING: the cache is being refreshed; the stale response is served in the meantime
·STALE: a stale response was served from the cache
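As a rough mental model (an illustration only, not nginx's actual implementation), the way these status values arise can be sketched in Python:

```python
def cache_status(entry, now, updating_elsewhere=False, allow_stale=False):
    """Toy model of how $upstream_cache_status values arise.

    `entry` is None (nothing cached) or a dict with an `expires` timestamp.
    This is an illustration only, not nginx's real logic.
    """
    if entry is None:
        return "MISS"          # request goes to the backend
    if now <= entry["expires"]:
        return "HIT"           # fresh entry served from cache
    if updating_elsewhere:
        return "UPDATING"      # another request is refreshing; serve stale
    if allow_stale:
        return "STALE"         # proxy_cache_use_stale served the old copy
    return "EXPIRED"           # expired; request goes to the backend
```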



Explanation 2:

proxy_cache_key $host$uri$is_args$arg_pageNo$arg_subjectId; specifies the cache key, which is usually derived from the URL. Two variables deserve attention here:

(1) $is_args: expands to '?' if the request has a query string, and to the empty string otherwise
(2) $args: all parameters after the question mark

The point of writing the key this way is that only the named parameters, pageNo and subjectId, take part in the key, while all other parameters are ignored. For example, when clients append a timestamp to a URL, a key like this ignores the timestamp; otherwise the cache would never hit.
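To see why this matters, here is a hedged Python sketch (the hostname, parameter names, and key layout come from the example above; the function itself only emulates nginx's variable expansion) showing that two URLs differing only in a timestamp parameter produce the same key:

```python
from urllib.parse import parse_qs, urlsplit

def cache_key(url):
    """Build a key like $host$uri$is_args$arg_pageNo$arg_subjectId:
    only pageNo and subjectId take part in the key; any other query
    parameter (for example an anti-cache timestamp) is ignored."""
    parts = urlsplit(url)
    args = parse_qs(parts.query)
    is_args = "?" if parts.query else ""
    page_no = args.get("pageNo", [""])[0]
    subject_id = args.get("subjectId", [""])[0]
    return f"{parts.hostname}{parts.path}{is_args}{page_no}{subject_id}"
```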

Explanation 3:

proxy_cache_use_stale error timeout updating http_404 http_500 http_502 http_503;

proxy_cache_use_stale specifies the backend failure conditions under which nginx may serve an expired cached response.
For example, when 'updating' is specified, only one request refreshes the cache at a time, and while that refresh is in progress all other requests are answered with the expired version currently in the cache.
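The "only one updater" idea can be sketched in Python (a deliberately simplified model with one global lock and blocking waiters; real nginx locks per cache key and, with 'updating', answers waiters with the stale copy instead of blocking them):

```python
import threading

backend_calls = 0          # how many requests actually reached the backend
cache = {}
lock = threading.Lock()

def fetch_from_backend(key):
    global backend_calls
    backend_calls += 1
    return f"body-for-{key}"

def get(key):
    """Only the first of several concurrent misses for a key contacts the
    backend; the rest find the entry already filled when they get the lock."""
    if key in cache:
        return cache[key]
    with lock:
        if key not in cache:             # re-check after acquiring the lock
            cache[key] = fetch_from_backend(key)
    return cache[key]

threads = [threading.Thread(target=get, args=("page",)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```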

Explanation 4:

proxy_cache_lock on;


# For identical requests, allow only one to reach the backend at a time; it
# writes a new entry into shared memory according to proxy_cache_key, and the
# lock is released when proxy_cache_lock_timeout expires.

proxy_cache_lock on | off;        # default: off
proxy_cache_lock_timeout time;    # default: 5s

Attached below are some notes on the individual directives:

1 proxy_cache

Syntax: proxy_cache zone_name;
Default value: None
Used fields: http, server, location

Sets the name of a cache zone; the same zone can be used in different places. Since 0.7.48, the cache honors the backend's "Expires", "Cache-Control: no-cache", "Cache-Control: max-age=XXX" and similar headers. However, nginx currently ignores some cache-control directives, such as "private" and "no-store". Likewise, nginx does not process the "Vary" header during caching. To ensure that private data is not visible to all users, the backend must set the "no-cache" or "max-age=0" header, or the proxy_cache_key must contain user-specific data such as $http_cookie_xxx. Using part of a cookie value in proxy_cache_key can prevent private data from being cached; private and public data can also be split into separate location blocks.

Caching relies on proxy buffering. If proxy_buffering is set to off, caching will not take effect.
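As an illustration of the header handling described above, here is a simplified Python sketch of such a cacheability check (an assumption-laden model, not nginx source: it honors no-cache and max-age but, like nginx, ignores "private" and "no-store"):

```python
def is_cacheable(headers):
    """Simplified model of nginx's response-header handling (>= 0.7.48):
    honors no-cache and max-age, but ignores "private" and "no-store",
    as described above. Real nginx logic is more involved."""
    cc = headers.get("Cache-Control", "").lower()
    directives = [d.strip() for d in cc.split(",") if d.strip()]
    for d in directives:
        if d == "no-cache":
            return False
        if d.startswith("max-age="):
            try:
                return int(d.split("=", 1)[1]) > 0
            except ValueError:
                pass
    return True   # "private" / "no-store" fall through and are ignored
```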

2 proxy_cache_key

Syntax: proxy_cache_key line;
Default value: $scheme$proxy_host$request_uri;
Used fields: http, server, location
This directive specifies the key under which a response is cached.

proxy_cache_key "$host$request_uri$cookie_user";
proxy_cache_key "$scheme$host$request_uri";

3 proxy_cache_path

Syntax: proxy_cache_path path [levels=number] keys_zone=name:size [inactive=time] [max_size=size];
Default value: None
Used fields: http

The directive specifies the cache path and some other parameters; cached data is stored as files. The cached file name is the MD5 hash of the proxied URL used as the key. The levels parameter sets the number of characters in each level of cache subdirectory, for example:

proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=one:10m;

The file name will be similar to: /data/nginx/cache/c/29/b7f54b2df7773722d382f4809d65029c

All active keys and metadata are stored in a shared memory zone, specified with the keys_zone parameter. If cached data is not requested within the time specified by the inactive parameter, it is deleted; the default inactive time is 10 minutes. A cache manager process controls the on-disk cache size defined by the max_size parameter; once this size is exceeded, the least recently used data is deleted. The size of the zone should be set in proportion to the number of cached pages. The metadata size of one page (file) depends on the operating system: 64 bytes on FreeBSD/i386 and 128 bytes on FreeBSD/amd64. When the zone is full, new keys are processed according to LRU (least recently used). proxy_cache_path and proxy_temp_path should be on the same file system.
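The mapping from key to file name can be illustrated with a short Python sketch (a reimplementation for illustration only; with levels=1:2 the subdirectory names come from the end of the hex MD5 digest, matching the example path above):

```python
import hashlib
import os

def cache_file_path(root, key, levels=(1, 2)):
    """Illustrate how a levels=1:2 cache lays out files: the hex MD5 of the
    key becomes the file name, and the subdirectory names are taken from the
    end of the digest (the last 1 character, then the 2 characters before it)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    parts, pos = [], len(digest)
    for n in levels:
        parts.append(digest[pos - n:pos])
        pos -= n
    return os.path.join(root, *parts, digest)

# e.g. cache_file_path("/data/nginx/cache", "http://xok.la/some/page")
```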

4 proxy_cache_methods

Syntax: proxy_cache_methods [GET HEAD POST];
Default value: proxy_cache_methods GET HEAD;
Used fields: http, server, location
GET and HEAD are always implied and cannot be disabled, even if you only list other methods, as in the following statement:

proxy_cache_methods POST;

5 proxy_cache_min_uses

Syntax: proxy_cache_min_uses the_number;
Default value: proxy_cache_min_uses 1;
Used fields: http, server, location

Sets how many times a resource must be requested before its response is cached. The default is 1.
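The effect of proxy_cache_min_uses can be modeled with a small Python sketch (a toy in-process cache, purely illustrative):

```python
from collections import defaultdict

class MinUsesCache:
    """Toy model of proxy_cache_min_uses: a response is stored only after
    the same key has been requested min_uses times."""
    def __init__(self, min_uses=1):
        self.min_uses = min_uses
        self.requests = defaultdict(int)
        self.store = {}
        self.backend_calls = 0

    def _fetch(self, key):
        self.backend_calls += 1          # request reached the backend
        return f"body-for-{key}"

    def get(self, key):
        if key in self.store:
            return self.store[key]       # served from cache
        self.requests[key] += 1
        body = self._fetch(key)
        if self.requests[key] >= self.min_uses:
            self.store[key] = body       # popular enough: cache it now
        return body

cache = MinUsesCache(min_uses=2)
for _ in range(3):
    cache.get("/page")                   # three requests for the same URL
```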

6 proxy_cache_valid

Syntax: proxy_cache_valid reply_code [reply_code ...] time;
Default value: None
Use fields: http, server, location
Set different cache times for different responses, for example:
proxy_cache_valid 200 302 10m;
proxy_cache_valid 301 1h;
proxy_cache_valid 404 1m;
proxy_cache_valid any 1m;

7 proxy_cache_use_stale


To prevent multiple requests from refreshing the same cache entry at the same time, specify the 'updating' parameter: it ensures that only one request updates the cache, and while that update is in progress other requests are answered with the expired version currently in the cache.

Code and configure notes:

Hooks (i.e. callbacks) for each directive are defined in ngx_http_proxy_module.c and are invoked when the configuration file is read. At configure time you only need to set "HTTP_CACHE" to YES (look for the HTTP_CACHE line in auto/options). The "proxy_cache_purge" directive requires downloading the nginx "Cache Purge" add-on module and loading it at configure time with "--add-module=".


The sections below cover some other nginx caching approaches. They are rarely used nowadays, but may be worth knowing:

2. Traditional cache 1 (404)

This method directs nginx's 404 error to the backend, then uses proxy_store to save the page the backend returns.

location / {
    root /home/html/;                       # home directory
    expires 1d;                             # expiry time of the page
    error_page 404 =200 /fetch$request_uri; # direct 404 to the /fetch directory
}

location /fetch/ {                          # 404 is directed here
    internal;                               # this directory is not directly accessible from outside
    expires 1d;                             # expiry time of the page
    alias /home/html/;                      # must match location /; proxy_store saves files here
    proxy_pass http://xok.la/;              # backend upstream address; /fetch also acts as a proxy
    proxy_set_header Accept-Encoding '';    # ask the backend not to return compressed (gzip/deflate) content; storing compressed content causes trouble
    proxy_store on;                         # tell nginx to save the file returned by the proxy
    proxy_temp_path /home/tmp;              # temporary directory; should be on the same disk partition as /home/html
}

Note that nginx must have permission to write files to /home/tmp and /home/html. Under Linux nginx is generally configured to run as the nobody user, so these two directories must be chowned to nobody. You could also chmod 777, but any experienced system administrator will advise against using 777 casually.

3. Traditional cache 2 (!-e)

The principle is basically the same as the 404 redirect, but more concise:

location / {
    root /home/html/;
    proxy_store on;
    proxy_set_header Accept-Encoding '';
    proxy_temp_path /home/tmp;
    if ( !-f $request_filename ) {
        proxy_pass http://xok.la/;
    }
}

This configuration saves a lot of code compared with the 404 variant. It uses !-f to test whether the requested file exists on the file system; if not, the request is proxy_passed to the backend, and the response is again saved with proxy_store.
Both traditional caches share essentially the same advantages and disadvantages:

Disadvantage 1: dynamic links with parameters are not supported, such as read.php?id=1. Because nginx only saves the file name, this link is saved as read.php on the file system, so a user accessing read.php?id=2 would get the wrong result. Likewise, the home page and directory URLs such as http://xok.la/ and http://xok.la/download/ are not supported, because nginx faithfully writes the request to the file system according to the link, and such a link is obviously a directory, so saving fails. These cases require rewrite rules to be saved correctly.

Disadvantage 2: nginx has no internal mechanism for cache expiry or cleanup; the cached files are stored on the machine permanently. If there is a lot of content to cache, it will fill the whole disk. You can use a shell script for periodic cleanup, and write dynamic programs such as PHP to do real-time updates.

Disadvantage 3: only 200 status codes can be cached, so 301/302/404 responses from the backend are not cached. If a heavily visited pseudo-static link happens to be deleted, requests keep penetrating the cache and put considerable pressure on the backend.

Disadvantage 4: nginx does not automatically choose between memory and disk as the storage medium; everything is determined by the configuration. Of course, modern operating systems have their own file caching, so there is no need to worry too much about IO performance from concurrent reads of the on-disk cache.

The disadvantages of nginx's traditional caching are also what distinguishes it from caching software such as Squid, so they can equally be regarded as advantages. In production it is often deployed as a partner to Squid.
Squid often cannot block links with '?', whereas nginx can block access to them. For example, http://xok.la/? and http://xok.la/ are treated as two links by Squid, causing two penetrations, while nginx saves only one copy; whether the link becomes http://xok.la/?1 or http://xok.la/?123, it cannot be cached by nginx, which effectively protects the backend host. nginx saves the link form to the file system very faithfully, so for any link you can easily check its cache status and content on the cache machine, and easily combine it with file managers or tools such as rsync: it is a plain file system structure.

Both traditional caches can save files to /dev/shm under Linux (I generally do this), so that system memory is used for caching; with memory, cleaning expired content is much faster. When using /dev/shm/, besides pointing the tmp directory to the /dev/shm partition, if there are a large number of small files and directories you must also adjust the inode count and maximum capacity of the memory partition:

mount -o size=2500M -o nr_inodes=480000 -o noatime,nodiratime -o remount /dev/shm

This command was used on a machine with 3 GB of memory. Because /dev/shm defaults to half the system memory (1500 MB here), the command raises it to 2500 MB. The default shm inode count may not be enough, but conveniently it can be adjusted at will; 480000 here is a bit conservative, but basically enough.

4. Cache based on memcached

nginx supports memcached. The functionality is not particularly strong, but the performance is still very good.
location /mem/ {
    if ( $uri ~ "^/mem/([0-9A-Za-z_]*)$" ) {
        set $memcached_key "$1";
        memcached_pass 192.168.6.2:11211;
    }
    expires 70;
}

With this configuration, a request for http://xok.la/mem/abc returns the data stored in memcached under the key abc. nginx currently has no mechanism for writing to memcached, so writing data must be done by a dynamic language on the backend; you can use a 404 redirect to the backend to write the data.

5. Based on the third-party plug-in ncache

ncache is a good project developed at Sina. It uses nginx and memcached to implement some caching functions similar to Squid's. I have no experience using this plug-in. Reference: http://code.google.com/p/ncache/

The above introduced nginx caching. I hope it is helpful to friends who are interested in PHP tutorials.

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn