Summarize and share some interview questions about redis cache

青灯夜游
2021-05-07 10:43:26

This article collects common interview questions about the Redis cache, together with short answers. It is intended as a reference for anyone preparing for such interviews.

redis cache interview questions

1. What is the difference between redis and memcached? Why is single-threaded redis sometimes more efficient than multi-threaded memcached under high concurrency?

Difference:

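  • memcached stores opaque key/value data (so it can cache things such as images and videos); Redis supports more data structures besides plain k/v;

  • Redis can persist its data (RDB snapshots and AOF logs) for disaster recovery, and supports data backup through master-slave replication. Early Redis versions also had a virtual memory feature, which was deprecated in Redis 2.4 and later removed;

  • Redis can be used as a message queue.

Reason: the memcached multi-threading model introduces cache-consistency and locking problems, and locking brings performance losses. Redis executes commands on a single thread and avoids that overhead.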

2. How is redis master-slave replication implemented? How to implement redis cluster mode? How is the key of redis addressed?

Master-slave replication: the master node takes a snapshot of the data in its own memory and sends it to the slave node, which loads the snapshot into its memory. After that, every time data is written, the master sends the corresponding command to the slave, in a way similar to MySQL's binary log, and the slave replays the commands it receives.
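The snapshot-then-replay flow can be modelled in a few lines. This is a toy in-memory sketch, not Redis's actual replication protocol; the `Master` and `Replica` classes and their method names are hypothetical and exist only for illustration.

```python
class Replica:
    """Toy replica: loads a snapshot, then replays propagated write commands."""

    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        # Full sync: replace local state with a copy of the master's snapshot.
        self.data = dict(snapshot)

    def replay(self, command):
        # Incremental sync: apply one propagated write command.
        op, key, *args = command
        if op == "SET":
            self.data[key] = args[0]
        elif op == "DEL":
            self.data.pop(key, None)


class Master:
    """Toy master: applies writes locally and forwards them to replicas."""

    def __init__(self):
        self.data = {}
        self.replicas = []

    def attach(self, replica):
        # Step 1: send a point-in-time copy of the in-memory data.
        replica.load_snapshot(self.data)
        self.replicas.append(replica)

    def execute(self, command):
        # Step 2: apply the write, then propagate it to every replica.
        op, key, *args = command
        if op == "SET":
            self.data[key] = args[0]
        elif op == "DEL":
            self.data.pop(key, None)
        for replica in self.replicas:
            replica.replay(command)


master = Master()
master.execute(("SET", "a", "1"))   # written before the replica exists
replica = Replica()
master.attach(replica)              # the snapshot carries "a"
master.execute(("SET", "b", "2"))   # propagated as a command
master.execute(("DEL", "a"))        # propagated as a command
```

After these calls the replica holds the same data as the master, even though it never saw the first write as a command.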

Sharding method:

  • Client-side sharding

  • Proxy-based sharding: Twemproxy, codis

  • Routing query sharding: Redis Cluster (Redis itself provides the ability to automatically spread data across the nodes of the cluster; which node stores a given subset of the data is transparent to the user)

  • Redis Cluster sharding principle: the cluster has 16384 slots (virtual slots), numbered 0-16383. Each master node is responsible for a part of the slots; when a key maps to a slot that a master is responsible for, that master serves the key. Which master is responsible for which slot can be specified by the user or generated automatically during initialization, and only masters own slots. Each master maintains a bit sequence of 16384/8 = 2048 bytes and uses one bit per slot to mark whether it owns that slot. For example, for slot number 1 the master only needs to check whether the second bit of the sequence (the bit at index 1, counting from 0) is 1. This structure makes it easy to add or remove nodes: to add a new node D, some slots are moved from nodes A, B, and C to D.
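The slot-ownership bitmap can be sketched as follows. This is a minimal illustration, assuming one bit per slot packed into a 2048-byte array; the class name `SlotBitmap`, the helper `migrate`, and the bit order within each byte are assumptions made for the example, not Redis internals.

```python
SLOT_COUNT = 16384


class SlotBitmap:
    """2048-byte bitmap recording which of the 16384 slots a master owns."""

    def __init__(self):
        self.bits = bytearray(SLOT_COUNT // 8)  # 16384 / 8 = 2048 bytes

    def assign(self, slot):
        self.bits[slot // 8] |= 1 << (slot % 8)

    def release(self, slot):
        self.bits[slot // 8] &= ~(1 << (slot % 8)) & 0xFF

    def owns(self, slot):
        return bool(self.bits[slot // 8] & (1 << (slot % 8)))


def migrate(source, target, slots):
    """Move ownership of the given slots from one master to another."""
    for slot in slots:
        if source.owns(slot):
            source.release(slot)
            target.assign(slot)


node_a, node_d = SlotBitmap(), SlotBitmap()
for slot in range(0, 5501):
    node_a.assign(slot)                   # node A starts with slots 0-5500
migrate(node_a, node_d, range(0, 1000))   # new node D takes slots 0-999 from A
```

Checking ownership is a single bit test, which is why moving slots to a new node only requires flipping bits on the two masters involved.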

3. How to design distributed locks using redis? Tell me about the implementation idea? Is it possible to use zk? How to achieve? What's the difference between these two?

redis:

  • Thread A calls setnx with the lock key and, as its value, the timestamp t1 at which the lock times out. If setnx returns true, A has obtained the lock.

  • Thread B uses get to read t1 and compares it with the current timestamp to determine whether the lock has timed out. If it has not, B fails to obtain the lock. If it has, B goes on to the next step.

  • B calculates a new timeout t2 and uses the getset command to store it, which returns the previous value t3 (this value may have been modified by another thread in the meantime). If t1 == t3, B obtains the lock; if t1 != t3, the lock has already been acquired by another thread.

  • After acquiring the lock, process the business logic, then check whether the lock has timed out. If it has not, delete the lock. If it has, do nothing, so that a lock now held by another thread is not deleted.
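The four steps can be traced with a small in-memory model. `FakeRedis` below is a hypothetical stand-in that implements only setnx, get, getset, and delete on a dict, so the sketch runs without a server; `try_lock` and `unlock` are illustrative names, and the current time is passed in explicitly to keep the example deterministic.

```python
class FakeRedis:
    """In-memory stand-in exposing only the four commands the steps use."""

    def __init__(self):
        self.store = {}

    def setnx(self, key, value):
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def get(self, key):
        return self.store.get(key)

    def getset(self, key, value):
        old = self.store.get(key)
        self.store[key] = value
        return old

    def delete(self, key):
        self.store.pop(key, None)


def try_lock(r, key, now, ttl):
    """Return the timeout we stored if the lock was acquired, else None."""
    expires_at = now + ttl
    if r.setnx(key, expires_at):        # step 1: the lock was free
        return expires_at
    t1 = r.get(key)                     # step 2: read the current timeout
    if t1 is None or t1 > now:
        return None                     # still held (or just released): retry later
    t3 = r.getset(key, expires_at)      # step 3: try to take over the stale lock
    return expires_at if t3 == t1 else None


def unlock(r, key, expires_at, now):
    """Step 4: delete the lock only if our own timeout has not passed."""
    if now < expires_at:
        r.delete(key)


r = FakeRedis()
a = try_lock(r, "lock:order", now=100, ttl=10)        # A acquires, timeout 110
b_early = try_lock(r, "lock:order", now=105, ttl=10)  # B: lock not timed out
b_late = try_lock(r, "lock:order", now=111, ttl=10)   # B: A's lock is stale
unlock(r, "lock:order", a, now=112)                   # A is past 110: no delete
```

In the last call A has overrun its timeout, so it leaves the key alone and B's lock survives.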

zk:

  • To lock a method, the client creates a unique ephemeral sequential node, node1, under the zk directory of the node that corresponds to that method.

  • The client lists all the child nodes created under that path. If the sequence number of its own node1 is the smallest, the client has acquired the lock.

  • If node1 is not the smallest, the client watches the node with the largest sequence number that is smaller than its own and waits.

  • After acquiring the lock and finishing the business logic, the client deletes the node1 it created.

Difference: zk has worse performance and higher overhead, but the implementation is simpler.

4. Do you know the persistence of redis? How is the bottom layer implemented? What are the advantages and disadvantages?

RDB (Redis DataBase: snapshots of the Redis data set are written to disk or other media at certain points in time): periodic snapshots from memory to the hard disk. Disadvantages: time-consuming and performance-consuming (fork plus I/O), and data written since the last snapshot is easily lost.

AOF (Append Only File: records every write command executed by Redis, so that on the next restart Redis only needs to replay the commands): a write log. Disadvantages: large file size and slow recovery.

bgsave performs full snapshot persistence, and AOF performs incremental persistence. Because bgsave takes a long time and is not real-time enough, relying on it alone can lose a lot of data on shutdown, so AOF is needed alongside it. When a Redis instance restarts, it uses the AOF first to restore the memory state; if there is no AOF log, it uses the RDB file. Redis periodically rewrites the AOF to compact the log.

Since Redis 4.0 there is a hybrid persistence mode that combines the full snapshot of bgsave with the increments of AOF, which keeps recovery efficient while still protecting the data.

bgsave relies on fork and COW. fork means that Redis performs bgsave in a child process; COW means copy-on-write. After the child process is created, parent and child share the data pages. The parent keeps serving reads and writes, and the pages it modifies are copied, so its data gradually diverges from what the child sees.

5. What are the expiration strategies of redis? Do you know the LRU algorithm? Write some java code to implement it?

Expiration strategy:

  • Scheduled expiration: each key with an expiration time has its own timer, and the key is removed when the timer fires.

  • Lazy expiration: a key is checked only when it is accessed, and removed then if it has expired.

  • Periodic expiration: a compromise between the first two, in which expired keys are looked for and removed at intervals.

LRU: new LinkedHashMap(capacity, DEFAULT_LOAD_FACTOR, true). Setting the third parameter to true orders the linked list by access order, so the map can be used as an LRU cache; setting it to false orders it by insertion order, so it can be used as a FIFO cache.

LRU algorithm implementation:

  • Implemented through a two-way linked list, new data is inserted into the head of the linked list;

  • Whenever the cache hits (that is, the cached data is accessed), the data is moved to the head of the linked list;

  • When the linked list is full, the data at the end of the linked list is discarded.

LinkedHashMap: The combination of HashMap and doubly linked list is LinkedHashMap. HashMap is unordered, and LinkedHashMap ensures the iteration order by maintaining an additional doubly linked list. The iteration order can be insertion order (default) or access order.
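The question asks for Java, where LinkedHashMap in access order does the bookkeeping; the same idea is sketched here in Python, with `collections.OrderedDict` playing the role of the hash map plus doubly linked list. The class name `LRUCache` and its methods are illustrative choices, not part of any library.

```python
from collections import OrderedDict


class LRUCache:
    """LRU cache: a hash map whose entries are kept in access order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # least recently used first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # a hit makes the entry most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" becomes most recently used
cache.put("c", 3)    # capacity exceeded: "b" is evicted
```

Here the "most recently used" end is the tail of the ordering rather than the head of the list, which is only a difference in orientation from the description above.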

6. Cache penetration, cache breakdown, cache avalanche solution?

Cache penetration: a query for data that definitely does not exist. Because the data cannot be found in the storage layer, nothing is written to the cache, so every request for the non-existent data goes to the DB, which may bring the DB down.

Solution:

  • If a query returns an empty result, cache the empty result anyway, but with a shorter expiration time;

  • Bloom filter: hash all data that can possibly exist into a sufficiently large bitmap. Queries for data that definitely does not exist are intercepted by the bitmap, which avoids the DB query.
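The Bloom filter approach can be sketched as below. This is a minimal illustration, assuming a 65536-bit bitmap and four bit positions per item derived from salted SHA-256; the sizes, the `BloomFilter` class, the `lookup` helper, and the plain dicts standing in for the cache and the DB are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def lookup(user_id, bloom, cache, db):
    """Return the value, or None without touching the DB if the id cannot exist."""
    if not bloom.might_contain(user_id):
        return None                   # intercepted: definitely not in the DB
    if user_id in cache:
        return cache[user_id]
    value = db.get(user_id)           # may still miss: false positives are possible
    if value is not None:
        cache[user_id] = value
    return value


db = {"u1": "Alice", "u2": "Bob"}
bloom = BloomFilter()
for uid in db:
    bloom.add(uid)                    # register every id that really exists
cache = {}
```

A Bloom filter never reports an existing item as absent, but it can occasionally report an absent item as present, so the DB lookup behind it must still handle a miss.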

Cache breakdown: a key has an expiration time set, and at the moment the cache entry expires there happens to be a large number of concurrent requests for that key. When these requests find that the cache has expired, they all load the data from the back-end DB and write it back to the cache, and the burst of concurrent requests may instantly overwhelm the DB.

Solution:

  • Use a mutex lock: when the cache misses, do not load from the DB immediately. First set a mutex key with an operation such as Redis setnx. If the operation succeeds, load from the DB and refill the cache; otherwise, retry the cache get.

  • Never expire: the entry does not expire physically but expires logically (a background asynchronous thread refreshes it).

Cache avalanche: many cache entries are set with the same expiration time, so they all expire at the same moment, all requests are forwarded to the DB, and the sudden pressure on the DB causes an avalanche. The difference from cache breakdown: an avalanche involves many keys, whereas a breakdown involves a single key.

Solution:

Spread out the cache expiration times. For example, add a random value, such as 1-5 minutes, to the original expiration time, so that few entries share the same expiration time and a collective expiration becomes unlikely.
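Adding the random offset takes one line. The sketch below assumes expiration times in seconds and a 60-300 second offset (the 1-5 minutes mentioned above); the function name `jittered_ttl` and its parameters are illustrative.

```python
import random


def jittered_ttl(base_seconds, min_extra=60, max_extra=300, rng=random):
    """Base expiration plus a random offset of 1-5 minutes, in seconds."""
    return base_seconds + rng.randint(min_extra, max_extra)


rng = random.Random(42)   # seeded only to make the demo repeatable
ttls = [jittered_ttl(3600, rng=rng) for _ in range(1000)]
```

A thousand entries written at the same moment with a one-hour base now expire over a four-minute window instead of all at once.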

7. When choosing cache, when to choose redis and when to choose memcached

Situations when choosing redis:

  • Complex data structures. When the value is a hash, list, set, ordered set, and so on, choose Redis, because memcache cannot provide these data structures. Typical usage scenarios are user order lists, user messages, and post comments.

  • Data needs to be persisted, but be careful not to use Redis as a database. If Redis goes down, hot data can be restored to memory quickly after a restart, so the database is not put under sudden pressure and there is no cache warm-up process. For read-only scenarios and scenarios where data consistency requirements are not high, the persisted data can be used as storage.

  • High availability. Redis supports clustering and can provide replication and read-write separation. With memcache, achieving high availability requires secondary development.

  • The stored content is relatively large. The maximum value that memcache stores is 1 MB.

Scenarios for choosing memcache:

Pure KV, for businesses with very large amounts of data, memcache is more suitable for the following reasons:

  • Memory allocation. memcache manages memory with a pre-allocated memory pool, which saves allocation time. Redis allocates memory on demand, which may lead to fragmentation.

  • Virtual memory. memcache stores all data in physical memory. Early versions of Redis had their own VM mechanism, which could in theory store more data than physical memory by swapping cold data to disk once the limit was exceeded (the feature was deprecated in Redis 2.4 and later removed). From this point of view, memcache is faster when the amount of data is large.

  • Network model. memcache uses a non-blocking I/O multiplexing model, and so does Redis. However, Redis also offers functions beyond KV storage, such as sorting and aggregation, whose heavier CPU work blocks the whole I/O loop. From this point of view, because Redis provides more functions, memcache is faster.

  • Threading model. memcache is multi-threaded: the main thread listens and worker threads accept requests and perform reads and writes, so lock contention is possible. Redis uses a single thread and has no lock contention, but it is hard for it to use multiple cores to improve throughput.

8. What should I do if the cache is inconsistent with the database?

Assume a master-slave database with read-write separation. Thread A first deletes the cached data and then writes the new data to the master. Before master-slave synchronization has completed, thread B misses the cache, reads the old data from the slave, and writes it to the cache. The cache now contains the old data.

The root cause of this inconsistency is that the master and slave databases are temporarily inconsistent. Adding the cache lengthens the time during which the stale data is visible.

Approach: update the cache once the slave has been updated. That is, after the change has been synchronized to the slave, delete the cache entry again to evict any old data written during the window.

9. How to solve the inconsistency between master and slave databases?

Scenario description: with a master-slave database and read-write separation, any delay in synchronizing updates from master to slave leaves the two temporarily inconsistent.

  • Ignore the inconsistency. In businesses with low data consistency requirements, real-time consistency may not be necessary.

  • Force reads from the master. Use a highly available master, perform all database reads and writes on it, and add a cache to improve read performance.

  • Selectively read from the master. Add a cache that records which data must be read from the master, using the library, table, and primary key as the cache key, and set the cache expiration time to the master-slave synchronization time. If the key is in the cache, read from the master; if it is not, read from the corresponding slave.
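The selective-read option can be sketched as a small router. In this illustration a dict stands in for the cache that would hold the markers (in practice something like Redis with a TTL); the class `ReadRouter`, its method names, and the 2-second replication lag are assumptions made for the example.

```python
class ReadRouter:
    """Routes reads to the master for rows written within the replication lag."""

    def __init__(self, replication_lag_seconds):
        self.lag = replication_lag_seconds
        self.recent_writes = {}   # cache key -> time after which the slave is safe

    @staticmethod
    def cache_key(db, table, primary_key):
        return f"{db}:{table}:{primary_key}"

    def record_write(self, db, table, primary_key, now):
        # The marker expires once the slave should have caught up.
        key = self.cache_key(db, table, primary_key)
        self.recent_writes[key] = now + self.lag

    def choose(self, db, table, primary_key, now):
        key = self.cache_key(db, table, primary_key)
        expires_at = self.recent_writes.get(key)
        if expires_at is not None and now < expires_at:
            return "master"
        self.recent_writes.pop(key, None)   # marker expired or never existed
        return "slave"


router = ReadRouter(replication_lag_seconds=2)
router.record_write("shop", "orders", 42, now=100.0)
```

Only rows written within the last synchronization window are read from the master, so the master carries a small share of the read traffic.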

10. Redis common performance problems and solutions

  • It is best for the master not to do persistence work such as RDB memory snapshots and AOF log files.

  • If the data is important, enable AOF on one slave as a backup, with the policy set to sync once per second.

  • For replication speed and connection stability, it is best for the master and slaves to be in the same local area network.

  • Try to avoid adding slaves to a master that is already under heavy load.

  • Do not use a mesh structure for master-slave replication; prefer a linear chain: Master <- Slave1 <- Slave2 …

11. What are the Redis data elimination strategies?

volatile-lru: evicts the least recently used data from the data set with an expiration time set

volatile-ttl: evicts the data that is closest to expiring from the data set with an expiration time set

volatile-random: evicts randomly chosen data from the data set with an expiration time set

allkeys-lru: evicts the least recently used data from the whole data set

allkeys-random: evicts randomly chosen data from the whole data set

noeviction: prohibits eviction of data

12. What data structures are there in Redis

String, Hash (dictionary), List, Set, and SortedSet (ordered set). Intermediate and advanced Redis users should also mention HyperLogLog, Geo, and Pub/Sub.

13. Suppose there are 100 million keys in Redis, and 100,000 keys start with a fixed, known prefix. How to find them all?

Use the keys command to list the keys that match the specified pattern.

The interviewer then asks: if this Redis instance is serving online business, what is the problem with using the keys command?

At this point you need to mention a key characteristic of Redis: it is single-threaded. The keys command blocks that thread for a while, the online service stalls, and service only resumes when the command finishes. Use the scan command instead. scan retrieves the keys matching a pattern incrementally without blocking, but it may return some keys more than once, so deduplicate on the client. The overall time spent is longer than with a single keys call.
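Cursor iteration with client-side deduplication looks roughly like this. The sketch takes any callable of the shape (cursor, match, count) -> (next_cursor, keys), which is understood to be the shape of the `scan` method in the redis-py client; `scan_keys` and `fake_scan` are hypothetical names, and `fake_scan` is a canned stand-in so the example runs without a server.

```python
def scan_keys(scan, pattern, count=1000):
    """Collect all keys matching pattern, deduplicated on the client.

    `scan` is any callable taking (cursor, match, count) and returning
    (next_cursor, keys); iteration ends when the cursor comes back as 0.
    """
    seen = set()
    cursor = 0
    while True:
        cursor, keys = scan(cursor, pattern, count)
        seen.update(keys)        # the same key may be returned more than once
        if cursor == 0:
            return seen


def fake_scan(cursor, match, count):
    """Stand-in for a server: three pages, with "user:2" returned twice."""
    pages = {
        0: (7, ["user:1", "user:2"]),
        7: (3, ["user:2", "user:3"]),
        3: (0, ["user:4"]),
    }
    return pages[cursor]


found = scan_keys(fake_scan, "user:*")
```

Each call handles only a small batch, so other clients' commands are served between batches, at the cost of more round trips than a single keys call.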

14. Have you ever used Redis to create an asynchronous queue? How is it implemented?

Use the list type to hold the messages: rpush produces messages and lpop consumes them. When lpop returns no message, the consumer can sleep for a while and then check again. If you do not want to sleep, use blpop, which blocks until a message arrives. For one producer and multiple consumers, Redis offers the pub/sub topic subscription model. It has a drawback: messages produced while a consumer is offline are lost to that consumer.

15. How to implement delay queue in Redis

Use a sorted set with the timestamp as the score and the message content as the member. Producers call zadd to add messages, and consumers poll with zrangebyscore to fetch the data from n seconds ago and process it.
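The delay queue can be modelled without a server. `FakeSortedSet` below is a hypothetical in-memory stand-in for a single sorted set, offering simplified versions of zadd, zrangebyscore, and zrem; `produce` and `poll` are illustrative names, and the score is assumed to be the time at which the message becomes due.

```python
class FakeSortedSet:
    """In-memory stand-in for one sorted set: member -> score."""

    def __init__(self):
        self.scores = {}

    def zadd(self, member, score):
        self.scores[member] = score

    def zrangebyscore(self, min_score, max_score):
        hits = [m for m, s in self.scores.items() if min_score <= s <= max_score]
        return sorted(hits, key=lambda m: (self.scores[m], m))

    def zrem(self, member):
        # True only for the caller that actually removed the member.
        return self.scores.pop(member, None) is not None


def produce(queue, message, now, delay_seconds):
    queue.zadd(message, now + delay_seconds)   # score = time the message is due


def poll(queue, now):
    """Return the messages that are due and that this consumer claimed."""
    due = queue.zrangebyscore(float("-inf"), now)
    return [m for m in due if queue.zrem(m)]


queue = FakeSortedSet()
produce(queue, "send-email:1", now=100, delay_seconds=30)    # due at 130
produce(queue, "close-order:9", now=100, delay_seconds=5)    # due at 105
first = poll(queue, now=110)
second = poll(queue, now=130)
```

Removing each message before processing it, and treating only a successful removal as a claim, is one way to keep two polling consumers from handling the same message.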

16. What is Redis? Briefly describe its advantages and disadvantages?

Redis is essentially a Key-Value type in-memory database, much like memcached. The entire database is loaded into the memory for operation, and the database data is flushed to the hard disk for storage through asynchronous operations on a regular basis.

Because it operates purely in memory, Redis has excellent performance and can handle more than 100,000 read and write operations per second, which makes it one of the fastest key-value DBs known.

The strength of Redis is not just performance. Its biggest attraction is that it supports a variety of data structures. In addition, the maximum size of a single value is 512 MB (see question 23), unlike memcached, which can only store 1 MB per value. Redis can therefore be used to implement many useful functions.

For example, use its List as a FIFO doubly linked list to implement a lightweight high-performance message queue service, or use its Set to build a high-performance tag system.

In addition, Redis can set an expiration time on stored key-values, so it can also be used as an enhanced version of memcached.

The main disadvantage of Redis is that the database capacity is limited by physical memory, so it cannot be used for high-performance reads and writes over massive data. The scenarios suitable for Redis are therefore mainly high-performance operations and calculations on smaller amounts of data.

17. What are the advantages of Redis compared to memcached?

  • All values in memcached are simple strings; Redis, as its replacement, supports richer data types

  • Redis can be faster than memcached in some workloads (see question 1), although memcached has the edge in others (see question 7)

  • redis can persist its data

18. What data types does Redis support?

String, List, Set, SortedSet, hashes

19. What physical resources does Redis mainly consume?

Memory.

20. What is the full name of Redis?

Remote Dictionary Server

21. What data elimination strategies does Redis have?

noeviction: returns an error when the memory limit has been reached and the client tries to execute a command that would use more memory (most write commands, with DEL and a few other exceptions).

allkeys-lru: tries to evict the least recently used (LRU) keys first, so that there is space for newly added data.

volatile-lru: tries to evict the least recently used (LRU) keys first, but only among keys that have an expiration set, so that there is space for newly added data.

allkeys-random: evicts random keys, so that there is space for newly added data.

volatile-random: evicts random keys, but only among keys that have an expiration set, so that there is space for newly added data.

volatile-ttl: evicts keys that have an expiration set, preferring keys with a shorter time to live (TTL), so that there is space for newly added data.

22. Why doesn’t Redis officially provide a Windows version?

Because the current Linux version is quite stable and has a large number of users, there is no need to develop a Windows version, which would only introduce compatibility and other problems.

23. What is the maximum capacity that a string type value can store?

512M

24. Why does Redis need to put all data in memory?

In order to achieve the fastest reading and writing speed, Redis reads all the data into the memory and writes the data to the disk asynchronously.

So redis has the characteristics of fast speed and data persistence. If the data is not placed in memory, disk I/O speed will seriously affect the performance of redis.

With memory getting cheaper and cheaper, Redis will become more and more popular. If a maximum memory size is set, new values cannot be inserted once the stored data reaches the memory limit.

25. How should the Redis cluster solution be implemented? What are the plans?

  • codis.
    The most commonly used cluster solution at present has basically the same effect as twemproxy, but it supports the recovery of old node data to new hash nodes when the number of nodes changes.

  • Redis Cluster, built in since Redis 3.0. Its distribution algorithm is not consistent hashing but hash slots, and it natively supports attaching slave nodes to each node. See the official documentation for details.

  • Implement it in the business code layer. Create several unrelated Redis instances; in the code, hash the key and then operate on the corresponding instance. This approach places high demands on the hashing code, which has to consider fallback algorithms after a node fails, automatic script-driven recovery after data fluctuations, instance monitoring, and so on.

26. Under what circumstances will the Redis cluster solution cause the entire cluster to be unavailable?

In a cluster with three nodes A, B, and C and no replication, if node B fails, the whole cluster considers the slots in the range 5501-11000 to be missing and becomes unavailable.

27. There are 20 million rows of data in MySQL, but only 200,000 are stored in Redis. How do you ensure that the data in Redis is the hot data?

When the size of the redis memory data set increases to a certain size, the data elimination strategy will be implemented.

28. What are the suitable scenarios for Redis?

  • Session cache
    One of the most common uses of Redis is as a session cache. The advantage of caching sessions in Redis rather than in other stores (such as Memcached) is that Redis provides persistence. Even for a cache that does not strictly require consistency, most people would be unhappy if all of a user's shopping cart information were lost; with persistence, that no longer has to happen.

    Fortunately, as Redis has improved over the years, it is easy to find documentation on how to use Redis properly for session caching. Even the well-known e-commerce platform Magento provides a Redis plug-in.

  • Full Page Cache (FPC)
    In addition to basic session tokens, Redis also provides a very simple FPC platform. Returning to the consistency issue: even if the Redis instance is restarted, users will not see slower page loads, because of disk persistence. This is a great improvement, similar to a local PHP FPC.

    Taking Magento as an example again, Magento provides a plug-in that uses Redis as a full-page cache backend.
    In addition, for WordPress users, Pantheon has a very good plug-in, wp-redis, which helps load pages you have already browsed as quickly as possible.

  • Queue
    One of the great advantages of Redis among in-memory storage engines is that it provides list and set operations, which makes Redis a good message queue platform. The operations Redis uses for queues are similar to the push/pop operations on lists in programming languages such as Python.

    A quick search for "Redis queues" turns up a large number of open source projects whose purpose is to use Redis to build back-end tools for all kinds of queueing needs. For example, Celery has a backend that uses Redis as a broker.

  • Leaderboard/Counter
    Redis handles incrementing and decrementing numbers in memory very well. Sets and sorted sets (SortedSet) also make these operations very simple, and Redis provides both data structures. For example, to get the top 10 users from a sorted set called "user_scores", run:
    ZRANGE user_scores 0 10

    This assumes that the set is ordered ascending by user score. To return both the users and their scores, run:
    ZRANGE user_scores 0 10 WITHSCORES
    Note that ZRANGE indexes are inclusive and results are ordered from the lowest score, so the ten highest scores are returned by ZREVRANGE user_scores 0 9.
    Agora Games is a good example: its leaderboard, implemented in Ruby, uses Redis to store the data.

  • Publish/Subscribe
    Last (but certainly not least) is the publish/subscribe function of Redis. There are indeed many use cases for publish/subscribe. I've seen people use it in social network connections, as triggers for publish/subscribe based scripts, and even to build chat systems using Redis' publish/subscribe functionality!

29. What are the Java clients supported by Redis? Which one is officially recommended?

Redisson, Jedis, lettuce, etc., the official recommendation is to use Redisson.

30. What is the relationship between Redis and Redisson?

Redisson is an advanced distributed coordination Redis client that can help users easily implement some Java objects (Bloomfilter, BitSet, Set, SetMultimap, ScoredSortedSet, SortedSet, Map, ConcurrentMap, List, ListMultimap, Queue, BlockingQueue, Deque, BlockingDeque, Semaphore, Lock, ReadWriteLock, AtomicLong, CountDownLatch, Publish/Subscribe, HyperLogLog).

31. What are the advantages and disadvantages of Jedis and Redisson?

Jedis is a Java client for Redis. Its API provides relatively comprehensive support for Redis commands.
Redisson implements distributed and scalable Java data structures. Compared with Jedis, its functionality is relatively simple: it does not support string operations, nor Redis features such as sorting, transactions, pipelines, and partitions. The purpose of Redisson is to separate users' concerns from Redis itself, so that they can focus on business logic.

32. How to set password and verify password for Redis?

Set password: config set requirepass 123456
Verify password: auth 123456

33. Talk about the concept of Redis hash slot?

Redis Cluster does not use consistent hashing; it introduces the concept of hash slots instead. A Redis cluster has 16384 hash slots. The slot for a key is found by computing the CRC16 checksum of the key and taking the result modulo 16384. Each node in the cluster is responsible for a part of the hash slots.
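The slot calculation can be sketched as follows. The sketch assumes the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0), which is the variant the Redis Cluster specification describes; the function names are illustrative, and hash tags (the `{...}` syntax that real clusters honour) are deliberately left out.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots."""
    return crc16_xmodem(key.encode()) % 16384


check = crc16_xmodem(b"123456789")   # standard check input for CRC16 variants
slot = key_slot("user:1000")
```

The standard check string "123456789" yields 0x31C3 for this CRC variant, which is a quick way to confirm that an implementation matches.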

34. What is the master-slave replication model of Redis cluster?

In order to make the cluster still available when some nodes fail or most nodes cannot communicate, the cluster uses a master-slave replication model, and each node will have N-1 replicas.

35. Will write operations be lost in the Redis cluster? Why?

Redis does not guarantee strong consistency of data, which means that in practice, the cluster may lose write operations under certain conditions.

36. How are Redis clusters replicated?

Asynchronous replication

37. What is the maximum number of nodes in a Redis cluster?

16384.

38. How to choose a database for Redis cluster?

Redis cluster currently cannot select database, and defaults to database 0.

39. How to test the connectivity of Redis?

ping

40. What are the uses of pipelines in Redis?

A request/response server can process new requests even if the client has not yet read the responses to old ones. This makes it possible to send multiple commands to the server without waiting for each reply, and to read all the replies in a single step at the end.

This is pipelining, a technique that has been in wide use for decades. For example, many POP3 implementations support it, which greatly speeds up downloading new emails from the server.

41. How to understand Redis transactions?

A transaction is a single isolated operation: all commands in the transaction are serialized and executed in order, and while it executes it is not interrupted by command requests from other clients. A transaction is also atomic in the sense that either all of its commands are processed or none are. Note that Redis does not roll back: if a command fails while the transaction is executing, the remaining commands still run.

42. What are the commands related to Redis transactions?

MULTI, EXEC, DISCARD, WATCH

43. How to set the expiration time and permanent validity of a Redis key respectively?

EXPIRE and PERSIST commands.

44. How does Redis optimize memory?

Use hashes (hash tables) as much as possible. Small hashes (those that store only a few fields) use very little memory, so you should model your data as hashes wherever you can. For example, if your web system has a user object, do not set a separate key for the user's name, surname, email, and password. Instead, store all of the user's information in one hash.

45. How does the Redis recycling process work?

  • A client runs a new command that adds new data.

  • Redis checks the memory usage. If it is greater than the maxmemory limit, keys are evicted according to the configured policy.

  • A new command is executed, and so on.

So we keep crossing the boundary of the memory limit by constantly reaching the boundary and then constantly recycling back below the boundary.

If the result of a command causes a large amount of memory to be used (such as saving the intersection of a large set to a new key), it will not take long for the memory limit to be exceeded by this memory usage.
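The write-check-evict cycle can be simulated with a toy store. This sketch measures memory as the total length of the stored values and evicts randomly chosen keys (in the spirit of allkeys-random); the `BoundedStore` class, its fields, and the byte accounting are assumptions made for the example and do not reflect how Redis measures memory.

```python
import random


class BoundedStore:
    """Toy store that evicts random keys whenever usage exceeds maxmemory."""

    def __init__(self, maxmemory, rng=None):
        self.maxmemory = maxmemory
        self.data = {}
        self.used = 0
        self.peak = 0
        self.rng = rng or random.Random()

    def set(self, key, value):
        # 1. The client command adds new data.
        self.used += len(value) - len(self.data.get(key, ""))
        self.data[key] = value
        self.peak = max(self.peak, self.used)   # usage may briefly exceed the limit
        # 2. Check memory and evict until usage is back under the limit.
        while self.used > self.maxmemory and self.data:
            victim = self.rng.choice(sorted(self.data))
            self.used -= len(self.data.pop(victim))


store = BoundedStore(maxmemory=100, rng=random.Random(1))
for i in range(50):
    store.set(f"k{i}", "x" * 10)   # each write adds 10 units
```

The recorded peak shows the limit being crossed by one write before eviction brings usage back down, which is the boundary-crossing behaviour described above.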

Statement:
This article is reproduced at:csdn.net. If there is any infringement, please contact admin@php.cn delete