Establishment and optimization of caching mechanism for website system

藏色散人

Apr 13, 2019 pm 05:24 PM

Having covered the external network environment of the Web system, we now turn to the performance of the Web system itself.

As the number of visits to our website grows, we run into many challenges. Solving them is not as simple as adding machines; establishing and using an appropriate caching mechanism is fundamental.

In the beginning, our Web system architecture may be very simple, with only one machine at each tier.

1. Using the internal caching mechanisms of the MySQL database

Let's start from inside MySQL. The following content is based on the most common storage engine, InnoDB.

1. Create an appropriate index

The simplest measure is to create an index. When a table holds a lot of data, an index speeds up retrieval, but it also has costs. First, it takes up disk space. Composite indexes are the most prominent case and should be used with care, since the index they generate may even be larger than the source data. Second, insert/update/delete operations take longer once an index exists, because the index has to be updated as well. In practice, though, our system as a whole is dominated by select queries, so indexes can still improve system performance significantly.
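As a rough illustration of the read-side benefit, the sketch below compares a query plan before and after creating an index. It uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so that the example runs without a database server; the table `users` and the index `idx_users_email` are hypothetical names.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# SQLite stands in for MySQL here (assumption for the sake of a runnable demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"user{i}") for i in range(1000)],
)

sql = "SELECT * FROM users WHERE email = ?"
args = ("user500@example.com",)

plan_before = query_plan(conn, sql, args)   # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, sql, args)    # the lookup now goes through the index

print(plan_before)
print(plan_after)
```

In MySQL the equivalent check would be an EXPLAIN on the same select statement; the exact plan text differs between engines.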

2. Database connection thread pool cache

If every database request had to create and destroy a connection, the overhead for the database would be huge. To reduce it, thread_cache_size can be configured in MySQL to specify how many threads are kept for reuse. When there are not enough threads, new ones are created; when there are too many idle threads, they are destroyed.

In fact, there is a more aggressive approach: pconnect (persistent database connections), where a connection, once created, is kept open for a long time. However, when traffic is heavy and there are many machines, this approach can easily exhaust the database's connections, because the connections are never recycled and the database eventually reaches max_connections (its maximum number of connections). Persistent connections therefore usually require a "connection pool" service between CGI and MySQL to limit the connections that the CGI machines would otherwise create "blindly".
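The idea of a connection pool can be sketched as a bounded queue of reusable connections. This is a minimal single-process illustration, not a production pool service: the class `ConnectionPool` is hypothetical, and sqlite3 connections stand in for MySQL connections.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hands out at most max_size connections; callers wait when none are free."""

    def __init__(self, connect, max_size):
        self._idle = queue.Queue(maxsize=max_size)
        for _ in range(max_size):
            self._idle.put(connect())   # connections are created once and reused

    @contextmanager
    def connection(self, timeout=1.0):
        conn = self._idle.get(timeout=timeout)  # raises queue.Empty if exhausted
        try:
            yield conn
        finally:
            self._idle.put(conn)                # return to the pool, do not close


# sqlite3 is only a stand-in for a real MySQL driver (assumption).
pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), max_size=2)

with pool.connection() as conn:
    answer = conn.execute("SELECT 1 + 1").fetchone()[0]
```

Because the pool size is fixed, the total number of connections a group of web machines can open is capped well below max_connections, whatever the request volume.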

3. InnoDB cache settings (innodb_buffer_pool_size)

innodb_buffer_pool_size is the memory cache area that holds indexes and data. If the machine is dedicated to MySQL, the general recommendation is 80% of the machine's physical memory. When fetching table data, it reduces disk I/O. Generally speaking, the larger this value, the higher the cache hit rate.
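The 80% rule of thumb and the hit rate can be worked out with a few lines of arithmetic. The sketch below assumes the two counters come from MySQL's `Innodb_buffer_pool_read_requests` and `Innodb_buffer_pool_reads` status variables; the function names and the sample numbers are made up for illustration.

```python
def suggested_buffer_pool_bytes(physical_ram_bytes, fraction=0.8):
    """Apply the article's rule of thumb for a machine dedicated to MySQL."""
    return int(physical_ram_bytes * fraction)


def buffer_pool_hit_rate(read_requests, disk_reads):
    """Share of logical reads served from memory.

    read_requests: value of Innodb_buffer_pool_read_requests
    disk_reads:    value of Innodb_buffer_pool_reads
    """
    if read_requests == 0:
        return 0.0
    return 1.0 - disk_reads / read_requests


GIB = 1024 ** 3
size = suggested_buffer_pool_bytes(16 * GIB)       # hypothetical 16 GiB machine
hit = buffer_pool_hit_rate(1_000_000, 5_000)       # made-up sample counters

print(f"innodb_buffer_pool_size = {size // (1024 ** 2)}M")
print(f"hit rate = {hit:.3f}")
```

A hit rate that stays low under steady traffic suggests that the working set does not fit in the buffer pool.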

4. Splitting databases, splitting tables, and partitioning

A MySQL table can generally handle data volumes in the millions; beyond that, performance drops significantly. So when we foresee the data volume exceeding this level, it is advisable to split databases, split tables, or partition. The best approach is to design the service around a split-database, split-table storage model from the start, which fundamentally removes the risk in the middle and later stages. Some conveniences, such as list-based queries, are sacrificed, and maintenance becomes more complex. But once the data reaches tens of millions or more, the trade-off proves worthwhile.
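One common way to route a record to its database and table is to take the key modulo the number of shards. The sketch below assumes a numeric user ID as the sharding key; the naming scheme (`user_db_N`, `user_N`) and the shard counts are hypothetical.

```python
def route(user_id, db_count=4, tables_per_db=8):
    """Map a numeric key to (database name, table name) by modulo."""
    slot = user_id % (db_count * tables_per_db)
    db_index = slot // tables_per_db      # which database holds the slot
    table_index = slot % tables_per_db    # which table inside that database
    return f"user_db_{db_index}", f"user_{table_index}"


print(route(10))
print(route(42))
```

This also shows why list-based queries become harder: a query that is not keyed by the user ID has to visit every database and table and merge the results.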

2. Set up multiple MySQL database services

A single MySQL machine is a high-risk single point of failure: if it goes down, our Web service is no longer available. Moreover, as visits to the Web system keep increasing, one day we will find that a single MySQL server can no longer keep up, and we will need more MySQL machines. Introducing multiple MySQL machines brings many new problems.

1. Establish MySQL master-slave, with the slave database as a backup.

This approach only solves the "single point of failure" problem: when the master database fails, we switch to the slave database. It is somewhat wasteful, however, because the slave database sits idle.

2. MySQL read/write separation: write to the master database, read from the slave database.

The two databases split reads and writes: the master database handles write operations, and the slave database handles read operations. If the master database fails, reads are not affected, and all reads and writes can be switched temporarily to the slave database (pay attention to the traffic, because too much of it may bring the slave database down as well).
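The routing decision behind read/write separation can be sketched as follows. This is a simplified illustration that classifies statements by their first keyword and only returns a server name; the class `ReadWriteRouter` and the server names are hypothetical, and a real failover would also need to promote the slave.

```python
import itertools


class ReadWriteRouter:
    """Chooses a server name for each SQL statement."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self.master_up = True
        self._next_slave = itertools.cycle(self.slaves)

    def target_for(self, sql):
        is_read = sql.lstrip().lower().startswith("select")
        if is_read or not self.master_up:
            # Reads go to a slave; writes go there only while the master is down.
            return next(self._next_slave)
        return self.master


router = ReadWriteRouter("master-db", ["slave-db-1", "slave-db-2"])

write_target = router.target_for("INSERT INTO news (title) VALUES ('x')")
read_targets = [router.target_for("SELECT * FROM news") for _ in range(2)]

router.master_up = False    # simulate a master failure
failover_target = router.target_for("UPDATE news SET title = 'y'")
```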

3. Master-master mutual backup.

Each of the two MySQL servers is both the other's slave database and a master database. This solution spreads the traffic load and also solves the "single point of failure" problem: if either machine fails, the other is still available.

However, this solution only works with two machines. If the business keeps growing rapidly, you can split the business and set up several master-master mutual-backup pairs.

3. Establish a cache between the Web server and the database

In fact, to cope with heavy traffic we cannot focus on the database level alone. According to the "80/20 rule", 80% of requests concentrate on 20% of the data, the hot data. We should therefore establish a caching mechanism between the Web server and the database. It can use either disk or memory as the cache, and it stops most hot-data queries before they reach the database.

1. Page staticization

When a user visits a page on the website, most of the content on that page may not change for a long time. A news report, for example, is almost never modified once it is published. In this case, the static HTML page generated by CGI is cached on the local disk of the Web server. Only the first request goes through dynamic CGI and queries the database; every later request gets the local disk file directly.

When the Web system is relatively small, this approach looks perfect. But once the system grows, say to 100 Web servers, there will be 100 copies of these disk files, which wastes resources and is hard to maintain. At this point some people may think of storing the files centrally on one server. That is essentially what the next caching method does.
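The flow described here (render once, then serve the file from disk) can be sketched in a few lines. The function `render_article` is a hypothetical stand-in for the dynamic CGI path, and a temporary directory stands in for the Web server's cache directory.

```python
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp(prefix="static_pages_"))
render_calls = 0


def render_article(article_id):
    """Stand-in for the dynamic CGI path that queries the database."""
    global render_calls
    render_calls += 1
    return f"<html><body>Article {article_id}</body></html>"


def get_page(article_id):
    """Serve the cached file if it exists; otherwise render and store it."""
    path = CACHE_DIR / f"article_{article_id}.html"
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = render_article(article_id)
    path.write_text(html, encoding="utf-8")
    return html


first = get_page(7)    # rendered dynamically, then written to disk
second = get_page(7)   # read back from the disk file
```

The sketch has no invalidation step: if an article is edited, its cached file has to be deleted or regenerated.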

2. Single memory cache

The page staticization example shows that a "cache" kept on the Web machine itself is hard to maintain and brings more problems (in fact, PHP's apc extension lets you operate on the Web server's local memory through key/value pairs). The memory cache service we build must therefore be an independent service.

The main choices for a memory cache are Redis and memcache. In terms of performance there is little difference between the two; in terms of features, Redis is richer.
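Whichever service is chosen, the application typically checks the cache first and only queries the database on a miss. The sketch below illustrates that pattern with a small in-process class standing in for a Redis or memcache server; `MemoryCache`, `load_user_from_db`, and `get_user` are hypothetical names.

```python
import time


class MemoryCache:
    """Tiny in-process stand-in for a Redis/memcache server."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]      # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


db_queries = 0


def load_user_from_db(user_id):
    """Hypothetical database lookup."""
    global db_queries
    db_queries += 1
    return {"id": user_id, "name": f"user{user_id}"}


def get_user(cache, user_id, ttl_seconds=60):
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:                 # miss: fall through to the database
        user = load_user_from_db(user_id)
        cache.set(key, user, ttl_seconds)
    return user


cache = MemoryCache()
get_user(cache, 1)   # miss, queries the database
get_user(cache, 1)   # hit, served from memory
```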

3. Memory cache cluster

A single memory cache has the same single point of failure problem, so it has to become a cluster. The simple way is to add a slave as a backup machine. But what if there really are a lot of requests, the cache hit rate turns out to be low, and more memory is needed? For that reason we recommend configuring a cluster, for example something like Redis cluster.

In a Redis cluster, the Redis instances form multiple master-slave groups. Every node can accept requests, which makes expanding the cluster more convenient. A client can send a request to any node; if that node is "responsible" for the content, it returns the content directly. Otherwise, the node finds the Redis node that is actually responsible and tells the client its address, and the client sends the request again.

All this is transparent to clients using the cache service.
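The redirect flow can be simulated in a few lines. Redis Cluster maps each key to one of 16384 hash slots using a CRC16 checksum and answers requests for slots it does not own with a MOVED redirection. The sketch below is a toy model of that behaviour, not the real protocol: the `Node` class, the three-way slot split, and `client_get` are assumptions for illustration, and hash tags are ignored.

```python
import binascii

SLOT_COUNT = 16384


def key_slot(key: bytes) -> int:
    """CRC16 (XMODEM variant) of the key, modulo the number of slots.

    Hash tags ({...}) are ignored in this sketch.
    """
    return binascii.crc_hqx(key, 0) % SLOT_COUNT


class Node:
    def __init__(self, name, first_slot, last_slot):
        self.name = name
        self.slots = range(first_slot, last_slot + 1)
        self.data = {}

    def get(self, key, cluster):
        slot = key_slot(key)
        if slot in self.slots:
            return ("OK", self.data.get(key))
        owner = next(n for n in cluster if slot in n.slots)
        return ("MOVED", owner.name)      # tell the client where to go


def client_get(key, cluster):
    """Ask the first node; follow at most one redirect."""
    by_name = {n.name: n for n in cluster}
    status, payload = cluster[0].get(key, cluster)
    if status == "MOVED":
        status, payload = by_name[payload].get(key, cluster)
    return payload


# Hypothetical three-node layout covering all 16384 slots.
cluster = [Node("a", 0, 5460), Node("b", 5461, 10922), Node("c", 10923, 16383)]
owner = next(n for n in cluster if key_slot(b"123456789") in n.slots)
owner.data[b"123456789"] = b"hello"
value = client_get(b"123456789", cluster)
```

Client libraries usually hide the redirect from application code, which is what makes the process transparent to users of the cache service.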

Switching the memory cache service carries some risk. When switching from cluster A to cluster B, cluster B must be "warmed up" in advance: the hot data in the memory of cluster B should match that of cluster A as closely as possible. Otherwise, at the moment of the switch a large number of requests will not find their content in the memory cache of cluster B, and the traffic will hit the back-end database service directly, which is likely to bring the database down.
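A warm-up step can be sketched as copying the hot keys from the old cluster into the new one before the switch. Plain dictionaries stand in for the two clusters here, and the list of hot keys is assumed to be known already (for example from access statistics); all names are hypothetical.

```python
def warm_up(source, target, hot_keys):
    """Copy hot keys from the old cache to the new one; return the count copied."""
    copied = 0
    for key in hot_keys:
        if key in source and key not in target:
            target[key] = source[key]
            copied += 1
    return copied


def hit_rate(cache, requested_keys):
    """Fraction of the requested keys that the cache can answer."""
    hits = sum(1 for key in requested_keys if key in cache)
    return hits / len(requested_keys)


cluster_a = {f"item:{i}": i for i in range(100)}   # old cluster, fully populated
cluster_b = {}                                     # new cluster, still cold
hot = [f"item:{i}" for i in range(20)]             # assumed list of hot keys

before = hit_rate(cluster_b, hot)   # every hot request would reach the database
copied = warm_up(cluster_a, cluster_b, hot)
after = hit_rate(cluster_b, hot)    # hot requests are now served from memory
```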

4. Reduce database “writes”

The mechanisms above all reduce database "read" operations, but write operations are also a major source of load. Writes cannot be eliminated, but the pressure can be reduced by merging requests. To do this, we need a modification synchronization mechanism between the memory cache cluster and the database cluster.

First, apply the modification in the cache so that outside queries see the correct result. Then put the SQL modifications into a queue. When the queue is full, or at regular intervals, merge them into a single request and send it to the database to update it.

In addition to improving write performance through the architectural change above, MySQL itself can adjust its disk write strategy through the innodb_flush_log_at_trx_commit parameter. If the budget allows, the problem can also be addressed at the hardware level with the older RAID (Redundant Array of Independent Disks) or the newer SSD (solid-state drive).
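The merge step can be sketched as a buffer that keeps only the latest value per key and flushes when it is full or too old. This is a minimal single-threaded illustration: `WriteBuffer` and the `flush_to_db` callback are hypothetical, and a dictionary stands in for the cache cluster.

```python
import time


class WriteBuffer:
    """Collects updates and flushes them to the database in one batch."""

    def __init__(self, flush_to_db, max_pending=100, max_age_seconds=1.0):
        self._flush_to_db = flush_to_db
        self._max_pending = max_pending
        self._max_age = max_age_seconds
        self._pending = {}                  # key -> latest value only
        self._oldest = None

    def write(self, cache, key, value):
        cache[key] = value                  # readers see the change at once
        if not self._pending:
            self._oldest = time.monotonic()
        self._pending[key] = value          # later writes replace earlier ones
        too_many = len(self._pending) >= self._max_pending
        too_old = time.monotonic() - self._oldest >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        if self._pending:
            self._flush_to_db(dict(self._pending))
            self._pending.clear()


batches = []    # each element represents one merged request to the database
cache = {}
buffer = WriteBuffer(batches.append, max_pending=3, max_age_seconds=60)
for views in (1, 2, 3):
    buffer.write(cache, "article:7:views", views)   # same key, merged
buffer.write(cache, "article:8:views", 1)
buffer.write(cache, "article:9:views", 1)           # third distinct key: flush
```

Here five writes reach the database as one batch. The trade-off is durability: updates still waiting in the buffer are lost if the process dies before the next flush.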

5. NoSQL storage

Whether for reads or writes, as traffic grows further the database will eventually reach a point where no amount of effort is enough. Adding more machines is relatively expensive and may not really solve the problem. At this point, consider moving some core data to a NoSQL database. Most NoSQL stores use a key-value model. Redis, introduced above, is recommended: it is a memory cache, but it can also serve as storage by persisting its data directly to disk.

In this case, we separate some of the frequently read and written data from the database and put it in a newly built Redis storage cluster, which further reduces the pressure on the original MySQL database. And because Redis is a memory-level cache, read and write performance improves greatly.

Top-tier Internet companies in China use many architectures similar to the ones above. However, the cache service they use is not necessarily Redis; they have a richer set of options, and some even develop their own NoSQL services based on their business characteristics.

6. Empty node query problem

When we have built all the services above and believe the Web system is already very robust, the same rule applies: new problems will still come. An empty node query is a request for data that does not exist in the database at all. For example, if I query the information of a person who does not exist, the system searches the caches at every level one by one, finally reaches the database itself, concludes that nothing can be found, and returns that result to the front end. Because every cache level is useless for such a request, it consumes a lot of system resources, and a large volume of empty node queries can overwhelm system services.
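The effect is easy to reproduce in a small simulation. The sketch below assumes a simple cache that only stores records that exist; the dictionaries and the function `find_person` are hypothetical.

```python
database = {"alice": {"age": 30}}
cache = {}
db_lookups = 0


def find_person(name):
    """Cache lookup with database fallback; only existing records are cached."""
    global db_lookups
    if name in cache:
        return cache[name]
    db_lookups += 1                 # every miss reaches the database
    record = database.get(name)
    if record is not None:
        cache[name] = record
    return record


for _ in range(1000):
    find_person("alice")            # one database lookup, then cache hits
existing_lookups = db_lookups

for _ in range(1000):
    find_person("nobody")           # never cached, so every request hits the database
missing_lookups = db_lookups - existing_lookups
```

A thousand requests for an existing record cost one database lookup, while a thousand requests for a missing record cost a thousand.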

Statement

This article is reproduced from hcoder. If there is any infringement, please contact admin@php.cn to have it removed.
