PHP Tutorial

Redis interview questions and distributed clusters

不言

Jun 02, 2018 pm 04:11 PM

redisdistributedcluster

This article mainly introduces the Redis interview questions and distributed clusters. It has certain reference value. Now I share it with you. Friends in need can refer to it

Redis interview questions and distributed clusters

1. What are the benefits of using Redis?

(1) It is fast because the data is stored in memory, similar to HashMap. The advantage of HashMap is that the time complexity of search and operation is O(1)
(2) It supports rich data types, Supports string, list, set, sorted set, hash
(3) Supports transactions, operations are atomic. The so-called atomicity means that all changes to the data are either executed or not executed
(4) Rich Features: Can be used for caching, messaging, and setting the expiration time by key. It will be automatically deleted after expiration

Topic recommendation: 2020 redis interview questions collection (latest)

2. What are the advantages of redis compared to memcached?

(1) All values in memcached are simple strings. As its replacement, redis supports richer data types
(2) Redis is much faster than memcached
( 3) redis can persist its data

3. Redis common performance problems and solutions:

(1) Master is best not to do any persistence work, such as RDB memory snapshots and AOF logs File
(2) If the data is more important, a Slave enables AOF backup data, and the policy is set to synchronize once per second
(3) For the speed of master-slave replication and the stability of the connection, Master and Slave are the best In the same LAN
(4) Try to avoid adding slave libraries to the master library that is under great pressure
(5) Do not use a graph structure for master-slave replication. It is more stable to use a one-way linked list structure, that is: Master Such a structure is convenient to solve the single point of failure problem and realize the replacement of Master by Slave. If the Master hangs up, you can immediately enable Slave1 as the Master, leaving everything else unchanged.

4. There are 20 million data in MySQL, but only 200,000 data are stored in redis. How to ensure that the data in redis are hot data

Related knowledge: The size of the redis memory data set has increased to a certain size When the time comes, a data elimination strategy will be implemented. redis provides 6 data elimination strategies:

voltile-lru: Select the least recently used data from the data set (server.db[i].expires) with an expiration time set to eliminate it

volatile-ttl: Select the data that is about to expire from the data set (server.db[i].expires) with an expiration time set for elimination

volatile-random: Randomly select data for elimination from the data set (server.db[i].expires) with expiration time set

allkeys-lru: From the data set (server.db[i] .dict) to select the least recently used data for elimination

allkeys-random: Select any data from the data set (server.db[i].dict) for elimination

no-enviction (eviction ): Prohibit eviction of data

5. What are the differences between Memcache and Redis?

1), Storage method

Memecache stores all data in the memory. It will hang up after power failure, and the data cannot exceed the memory size.

Redis is partially stored on the hard disk, which ensures data persistence.

2), Data support type

Memcache’s support for data types is relatively simple.

Redis has complex data types.

3). Different underlying models are used.

The underlying implementation methods and application protocols used to communicate with clients are different.

Redis directly built its own VM mechanism, because if the general system calls system functions, it will waste a certain amount of time to move and request.

4), value size

Redis can reach a maximum of 1GB, while memcache is only 1MB

6. What are the common performance problems of Redis? How to solve?

1).Master writes memory snapshots, and the save command schedules the rdbSave function, which will block the work of the main thread. When the snapshot is relatively large, the impact on performance will be very large, and the service will be suspended intermittently, so Master is the best Do not write memory snapshots.

2). Master AOF persistence. If the AOF file is not rewritten, this persistence method will have the smallest impact on performance, but the AOF file will continue to grow. If the AOF file is too large, it will affect the recovery of the Master restart. speed. It is best not to do any persistence work on the Master, including memory snapshots and AOF log files. In particular, do not enable memory snapshots for persistence. If the data is critical, a Slave should enable AOF backup data, and the strategy is to synchronize once per second.

3). Master calls BGREWRITEAOF to rewrite the AOF file. AOF will occupy a large amount of CPU and memory resources during rewriting, causing the service load to be too high and temporary service suspension.

4). Redis master-slave replication performance issues, for the speed of master-slave replication and the stability of the connection, it is best for Slave and Master to be in the same LAN

7, redis is most suitable Scenario

Redis is most suitable for all data in-momory scenarios. Although Redis also provides persistence function, it is actually more of a disk-backed function, which is quite different from persistence in the traditional sense. The difference, then you may have questions. It seems that Redis is more like an enhanced version of Memcached, so when to use Memcached and when to use Redis?
If you simply compare the difference between Redis and Memcached, most of them will get it. The following points of view:
1. Redis not only supports simple k/v type data, but also provides storage of data structures such as list, set, zset, and hash.
2. Redis supports data backup, that is, data backup in master-slave mode.
3. Redis supports data persistence, which can keep data in memory on disk and can be loaded again for use when restarting.
(1) Session Cache

The most commonly used scenario for using Redis is session cache. The advantage of using Redis to cache sessions over other storage (such as Memcached) is that Redis provides persistence. When maintaining a cache that doesn't strictly require consistency, most people would be unhappy if all of the user's shopping cart information was lost. Now, would they still be?

Fortunately, as Redis has improved over the years, it is easy to find how to properly use Redis to cache session documents. Even the well-known commercial platform Magento provides Redis plug-ins.

(2), Full Page Cache (FPC)

In addition to the basic session token, Redis also provides a very simple FPC platform. Back to the consistency issue, even if the Redis instance is restarted, users will not see a decrease in page loading speed because of disk persistence. This is a great improvement, similar to PHP local FPC.

Taking Magento as an example again, Magento provides a plug-in to use Redis as a full-page cache backend.

In addition, for WordPress users, Pantheon has a very good plug-in wp-redis, which can help you load the pages you have browsed as quickly as possible.

(3) Queue

One of the great advantages of Redis in the field of memory storage engines is that it provides list and set operations, which allows Redis to be used as a good message queue platform. The operations used by Redis as a queue are similar to the push/pop operations of list in local programming languages (such as Python).

If you quickly search "Redis queues" in Google, you will immediately find a large number of open source projects. The purpose of these projects is to use Redis to create very good back-end tools to meet various queue needs. . For example, Celery has a backend that uses Redis as a broker. You can view it from here.

(4), Ranking/Counter

Redis implements the operation of incrementing or decrementing numbers in memory very well. Sets and Sorted Sets also make it very simple for us to perform these operations. Redis just provides these two data structures. So, we want to get the top 10 users from the sorted set - we call them "user_scores", we just need to execute like the following:

Of course, this is assuming that you are Sort in ascending order based on your users' scores. If you want to return the user and the user's score, you need to execute it like this:

ZRANGE user_scores 0 10 WITHSCORES

Agora Games is a good example, implemented in Ruby, with its ranking list It uses Redis to store data, you can see it here.

(5), Publish/Subscribe

Last (but certainly not least) is the publish/subscribe function of Redis. There are indeed many use cases for publish/subscribe. I've seen people use it in social network connections, as triggers for publish/subscribe based scripts, and even to build chat systems using Redis' publish/subscribe functionality! (No, this is true, you can check it out).

Of all the features provided by Redis, I feel that this is the one that people like the least, although it provides users with this multi-function.

High availability distributed cluster

1. High availability

High availability (High Availability) means that when a server stops serving, it will have no impact on the business and users. . The reason for stopping the service may be due to unpredictable reasons such as network cards, routers, computer rooms, excessive CPU load, memory overflow, natural disasters, etc. In many cases, it is also called a single point problem.

(1) There are two main ways to solve single-point problems:

Main and backup methods
This is usually one host and one or more backup machines. Under normal circumstances, The lower host provides services to the outside world and synchronizes data to the standby machine. When the host machine goes down, the standby machine starts serving immediately.
Keepalived is commonly used in Redis HA, which allows the host and backup machines to provide the same virtual IP to the outside world. The client performs data operations through the virtual IP. During normal periods, the host always provides services to the outside world. After a downtime, the VIP automatically drifts to the backup machine. on board.

The advantage is that it has no impact on the client and still operates through VIP.
The shortcomings are also obvious. Most of the time, the backup machine is not used and is wasted.

Master-slave method
This method adopts one master and multiple slaves, and data synchronization is performed between masters and slaves. When the Master goes down, a new Master is elected from the slave through the election algorithm (Paxos, Raft) to continue providing services to the outside world. After the master recovers, it rejoins as the slave.
Another purpose of master-slave is to separate reading and writing. This is a general solution when the reading and writing pressure of a single machine is too high. Its host role only provides write operations or a small amount of reading, and offloads redundant read requests to a single or multiple slave servers through a load balancing algorithm.

The disadvantage is that after the host goes down, although the Slave is elected as the new Master, the IP service address provided to the outside world has changed, which means that it will affect the client. Solving this situation requires some additional work. When the host address changes, the client is notified in time. After the client receives the new address, it uses the new address to continue sending new requests.

(2) Data synchronization
Both master and slave or master-slave are involved in the problem of data synchronization. This is also divided into two situations:

Synchronization method: When the host receives the client After the end-side write operation, the data is synchronized to the slave machine in a synchronous manner. When the slave machine also successfully writes, the host returns success to the client, which is also called strong data consistency. Obviously, the performance of this method will be reduced a lot. When there are many slave machines, it is not necessary to synchronize each one. After the master synchronizes a certain slave machine, the slave machine will then synchronize the data distribution to other slave machines, thus improving the performance sharing of the host machine. Synchronous pressure. This configuration is supported in redis, with one master and one slave. At the same time, this salve serves as the master of other slaves.

Asynchronous mode: After receiving the write operation, the host directly returns success, and then synchronizes the data to the slave in the background in asynchronous mode. This kind of synchronization performance is relatively good, but it cannot guarantee the integrity of the data. For example, if the host suddenly crashes during the asynchronous synchronization process, this method is also called weak data consistency.

Redis master-slave synchronization uses an asynchronous method, so there is a risk of losing a small amount of data. There is also a special case of weak consistency called eventual consistency. Please refer to the CAP principle and consistency model for details.

(3) Solution selection
The keepalived solution is simple to configure and has low labor costs. It is recommended when the amount of data is small and the pressure is low. If the amount of data is relatively large and you don’t want to waste too much machines, and you also want to take some customized measures after the downtime, such as alarming, logging, data migration, etc., it is recommended to use the master-slave method, because it matches the master-slave method. Usually there is also a management and monitoring center.

Downtime notification can be integrated into the client component or separated separately. Redis official Sentinel supports automatic failover, notification, etc. For details, see Low-cost and High-Availability Solution Design (4).

Logic diagram:
Redis interview questions and distributed clusters

2. Distributed

Distributed (distributed) means that when the business volume and data volume increase, it can be processed through any Increase or decrease the number of servers to solve the problem.

Cluster Era
Deploy at least two Redis servers to form a small cluster, with two main purposes:

High availability: After the host hangs up, automatic failover makes the front-end service accessible to users no effect.
Read and write separation: offload the read pressure from the host to the slave.
Load balancing can be achieved on the client component, and different proportions of read request pressure can be shared according to the operating conditions of different servers.

Logic diagram:
Redis interview questions and distributed clusters

3. Distributed cluster era

When the amount of cached data continues to increase, the memory of a single machine is not enough, and the data needs to be cut. Divide into different parts and distribute them to multiple servers.
Data can be fragmented on the client side. For details on the data fragmentation algorithm, see C# Consistent Hash Detailed Explanation and C# Virtual Bucket Fragmentation.

Logic diagram:
Redis interview questions and distributed clusters

Large-scale distributed cluster era
When the amount of data continues to increase, applications can apply for corresponding distributed services based on the business in different scenarios cluster. The most critical part of this is cache management, and the most important part is the addition of proxy services. The application accesses the real Redis server through a proxy for reading and writing. The advantage of this is:

It avoids the risk caused by more and more clients directly accessing the Redis server, which is difficult to manage and causes risks.
Corresponding security measures can be implemented at the proxy layer, such as current limiting, authorization, and sharding.
Avoid more and more logic codes on the client side, which is not only bloated but also troublesome to upgrade.
The proxy layer is stateless and can expand nodes arbitrarily. For the client, accessing the proxy is the same as accessing stand-alone Redis.
Currently, the host company uses two solutions: client component and proxy, because using a proxy will affect certain performance. The corresponding solutions for proxy implementation include Twitter's Twemproxy and Wandoujia's codis.

Logic diagram:
Redis interview questions and distributed clusters

Four, Summary

The distributed cache is followed by the cloud service cache, which completely shields the details from the user. Each application can apply for its own size and traffic plan, such as Taobao OCS cloud service cache.
The required implementation components for distributed cache are:

A cache monitoring, migration, and management center.
A custom client component, SmartClient in the picture above.
A stateless proxy service.
N servers.

Related learning recommendations: redis video tutorial

Related learning recommendations: mysql video tutorial

The above is the detailed content of Redis interview questions and distributed clusters. For more information, please follow other related articles on the PHP Chinese website!

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

How do you modify data stored in a PHP session?Apr 27, 2025 am 12:23 AM

TomodifydatainaPHPsession,startthesessionwithsession_start(),thenuse$_SESSIONtoset,modify,orremovevariables.1)Startthesession.2)Setormodifysessionvariablesusing$_SESSION.3)Removevariableswithunset().4)Clearallvariableswithsession_unset().5)Destroythe

Give an example of storing an array in a PHP session.Apr 27, 2025 am 12:20 AM

Arrays can be stored in PHP sessions. 1. Start the session and use session_start(). 2. Create an array and store it in $_SESSION. 3. Retrieve the array through $_SESSION. 4. Optimize session data to improve performance.

How does garbage collection work for PHP sessions?Apr 27, 2025 am 12:19 AM

PHP session garbage collection is triggered through a probability mechanism to clean up expired session data. 1) Set the trigger probability and session life cycle in the configuration file; 2) You can use cron tasks to optimize high-load applications; 3) You need to balance the garbage collection frequency and performance to avoid data loss.

How can you trace session activity in PHP?Apr 27, 2025 am 12:10 AM

Tracking user session activities in PHP is implemented through session management. 1) Use session_start() to start the session. 2) Store and access data through the $_SESSION array. 3) Call session_destroy() to end the session. Session tracking is used for user behavior analysis, security monitoring, and performance optimization.

How can you use a database to store PHP session data?Apr 27, 2025 am 12:02 AM

Using databases to store PHP session data can improve performance and scalability. 1) Configure MySQL to store session data: Set up the session processor in php.ini or PHP code. 2) Implement custom session processor: define open, close, read, write and other functions to interact with the database. 3) Optimization and best practices: Use indexing, caching, data compression and distributed storage to improve performance.

Explain the concept of a PHP session in simple terms.Apr 26, 2025 am 12:09 AM

PHPsessionstrackuserdataacrossmultiplepagerequestsusingauniqueIDstoredinacookie.Here'showtomanagethemeffectively:1)Startasessionwithsession_start()andstoredatain$_SESSION.2)RegeneratethesessionIDafterloginwithsession_regenerate_id(true)topreventsessi

How do you loop through all the values stored in a PHP session?Apr 26, 2025 am 12:06 AM

In PHP, iterating through session data can be achieved through the following steps: 1. Start the session using session_start(). 2. Iterate through foreach loop through all key-value pairs in the $_SESSION array. 3. When processing complex data structures, use is_array() or is_object() functions and use print_r() to output detailed information. 4. When optimizing traversal, paging can be used to avoid processing large amounts of data at one time. This will help you manage and use PHP session data more efficiently in your actual project.

Explain how to use sessions for user authentication.Apr 26, 2025 am 12:04 AM

The session realizes user authentication through the server-side state management mechanism. 1) Session creation and generation of unique IDs, 2) IDs are passed through cookies, 3) Server stores and accesses session data through IDs, 4) User authentication and status management are realized, improving application security and user experience.

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Assassin's Creed Shadows: Seashell Riddle Solution

1 months agoByDDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

3 weeks agoByDDD

Where to find the Crane Control Keycard in Atomfall

1 months agoByDDD

How to fix KB5055523 fails to install in Windows 11?

2 weeks agoByDDD

InZoi: How To Apply To School And University

3 weeks agoByDDD

Hot Tools

Dreamweaver Mac version

Visual web development tools

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

mPDF

mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),