
How to solve the simple dynamic string problem of SDS in Redis

王林
2023-05-26 12:50:20

1. The structure of SDS

C has no built-in string type; a string is essentially a char[] array. The size of a C array must be fixed when it is created and cannot be changed afterwards, and the last element of the character array is always the null character '\0'.

For example, a C string with the value "Redis" is stored as the six-element array 'R', 'e', 'd', 'i', 's', '\0'.

Redis does not use C strings directly. Instead, it builds its own type called simple dynamic string (SDS). Strings inside Redis are stored using the SDS structure; for example, the keys and values of key-value pairs that contain strings are implemented with SDS.

The SDS structure is defined in sds.h:

struct sdshdr {
    int len;    // length of the string saved in the SDS
    int free;   // number of unused bytes in the buf array
    char buf[]; // character array that holds the string
};

The last byte of buf holds the null character '\0', following the C string convention, so that an SDS string can reuse some functions of the C standard library.
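As an illustration of the layout above, here is a minimal sketch that allocates such a header together with its buffer. The names sdshdr_sketch and sketch_new are hypothetical and are not part of the Redis API; the real constructors in sds.c do more work.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the sdshdr layout shown above. */
struct sdshdr_sketch {
    int len;    /* bytes used in buf, not counting the trailing '\0' */
    int free;   /* unused bytes in buf, not counting the trailing '\0' */
    char buf[]; /* string bytes followed by a '\0' terminator */
};

/* Hypothetical helper: build a header from a C string, with no spare space. */
static struct sdshdr_sketch *sketch_new(const char *init) {
    assert(init != NULL);
    size_t n = strlen(init);
    /* One allocation holds the header, the bytes, and the terminator. */
    struct sdshdr_sketch *sh = malloc(sizeof *sh + n + 1);
    if (sh == NULL) return NULL;
    sh->len = (int)n;
    sh->free = 0;
    memcpy(sh->buf, init, n + 1); /* n + 1 copies the '\0' as well */
    return sh;
}
```

Because buf ends with '\0', a value created with sketch_new("Redis") can be passed to functions such as strcmp or printf through sh->buf, and it is released with a single free(sh).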

2. Why not use C string

Mainly because C string has the following shortcomings:

Time complexity of getting the length is O(N): to get the length of a C string, you have to traverse the whole string until you reach the null character '\0'.

Buffer overflow: an overflow can occur if not enough memory has been allocated before a string is appended to.

Memory reallocation: every time a string grows or is truncated, the program must reallocate the array that holds the C string. Memory reallocation involves complex algorithms and may require system calls, so it is usually relatively time-consuming.

Null character problem: a null character cannot be stored in the middle of a C string, otherwise the program will mistake it for the end of the string when traversing. Because of this limitation, C strings can only hold text data and are not suitable for binary data such as pictures, audio, video, and compressed files.
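The length and null character problems can be seen in a few lines. This is only a sketch: scan_length is a hypothetical stand-in for what strlen does in essence, and sample_bytes is made-up data.

```c
#include <assert.h>
#include <stddef.h>

/* Six bytes of data with a '\0' in the middle (hypothetical sample). */
static const char sample_bytes[6] = {'R', 'e', '\0', 'i', 's', '\0'};

/* Walk the array until the first '\0'. The cost grows with the
 * string length, which is why this is O(N). */
static size_t scan_length(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') n++;
    return n;
}
```

Calling scan_length(sample_bytes) stops at the embedded '\0' and reports 2, although the array holds 6 bytes. Any data after the first null character is invisible to code that relies on the terminator.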

3. How to solve the C string problem


1. SDS records its length in the len attribute, so getting the length takes O(1) time; that is, the time complexity of the Redis STRLEN command is O(1).

2. The SDS space allocation strategy avoids buffer overflow: before an SDS is modified, the API first checks whether its space is large enough for the modification. If it is not, the SDS is automatically expanded to the required size before the modification is performed.

3. Fewer memory reallocations when modifying strings: the free attribute in SDS records the number of unused bytes in the buf array.

Through the free attribute, Redis implements two optimization strategies: space pre-allocation and lazy space release.

Space pre-allocation: when an SDS is grown, the program allocates not only the space the modification needs but also extra unused space. This reduces the number of memory reallocations when a string is grown repeatedly.

Lazy space release: when an SDS is truncated, the program does not immediately reclaim the bytes that are no longer used. Instead, it records them in the free attribute for future use, so a later growth operation does not necessarily require a memory reallocation.
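The three points above can be sketched together in one small example. All names here (sds_sketch, sketch_cat, sketch_truncate, SKETCH_MAX_PREALLOC) are hypothetical, and the fields use size_t rather than int for convenience. The growth rule, doubling below 1 MB and adding 1 MB above it, is not stated in this article; it is an assumption based on how sdsMakeRoomFor behaves in the Redis source, so treat it as illustrative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 1 MB cap on extra space, modeled on Redis's SDS_MAX_PREALLOC. */
#define SKETCH_MAX_PREALLOC (1024 * 1024)

struct sds_sketch {
    size_t len;  /* bytes in use */
    size_t free; /* spare bytes after len */
    char buf[];  /* len + free bytes, plus one for '\0' */
};

static struct sds_sketch *sketch_empty(void) {
    struct sds_sketch *sh = malloc(sizeof *sh + 1);
    if (sh == NULL) return NULL;
    sh->len = 0;
    sh->free = 0;
    sh->buf[0] = '\0';
    return sh;
}

/* O(1): the length is read from the header, not found by scanning. */
static size_t sketch_len(const struct sds_sketch *sh) { return sh->len; }

/* Append n bytes. Returns the (possibly moved) string, or NULL on
 * allocation failure, in which case the original is left untouched. */
static struct sds_sketch *sketch_cat(struct sds_sketch *sh, const void *src, size_t n) {
    assert(sh != NULL && src != NULL);
    if (sh->free < n) { /* the space check happens before any copy */
        size_t needed = sh->len + n;
        /* Pre-allocation: reserve more than is strictly needed. */
        size_t cap = needed < SKETCH_MAX_PREALLOC ? needed * 2
                                                  : needed + SKETCH_MAX_PREALLOC;
        struct sds_sketch *grown = realloc(sh, sizeof *grown + cap + 1);
        if (grown == NULL) return NULL;
        sh = grown;
        sh->free = cap - sh->len;
    }
    memcpy(sh->buf + sh->len, src, n);
    sh->len += n;
    sh->free -= n;
    sh->buf[sh->len] = '\0';
    return sh;
}

/* Lazy release: shrinking moves bytes from len to free; nothing is freed. */
static void sketch_truncate(struct sds_sketch *sh, size_t newlen) {
    assert(sh != NULL && newlen <= sh->len);
    sh->free += sh->len - newlen;
    sh->len = newlen;
    sh->buf[newlen] = '\0';
}
```

In this sketch, appending "Redis" to an empty string reserves 10 bytes, so a following append of " SDS" fits in the spare space without calling realloc. Truncating back to 5 bytes returns the 4 removed bytes to free instead of shrinking the allocation.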

Because the length is recorded in len instead of being marked by '\0', the buf byte array in the SDS structure is binary safe: it can hold not only text but also arbitrary binary data.

SDS still keeps the C string convention of ending the data with the null character '\0'. It does so in order to reuse some functions of the C string library, such as those that append strings.
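A minimal sketch of what binary safety means in practice, assuming a hypothetical fixed-size buffer bin_sketch and helper bin_set (neither exists in Redis): the length comes from the caller and is stored next to the bytes, so an embedded '\0' is just another byte.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer that, like SDS, tracks its own length. */
struct bin_sketch {
    size_t len;
    char buf[32];
};

/* Store n raw bytes. The length is given by the caller, not found by
 * scanning for '\0'. Returns 0 on success, -1 if the data does not fit. */
static int bin_set(struct bin_sketch *b, const void *data, size_t n) {
    assert(b != NULL && data != NULL);
    if (n >= sizeof b->buf) return -1; /* keep one byte for the terminator */
    memcpy(b->buf, data, n);
    b->buf[n] = '\0'; /* terminator kept for C library compatibility */
    b->len = n;
    return 0;
}
```

If the five bytes 0x89, 'P', 0x00, 'N', 'G' are stored this way, len is 5 and memcmp over len bytes sees all of them, while strlen on the same buffer would stop after 2. Length-aware code must therefore use len, and the trailing '\0' only helps when the content is known to be text.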

4. Further optimization of strings

Three encodings of Redis string:

int: stores an 8-byte long integer (long, up to 2^63-1)

embstr: SDS in embstr format, stores short strings of 44 bytes or fewer

raw: SDS in raw format, stores long strings greater than 44 bytes

The int encoding is used for numbers. Both raw and embstr represent strings, so what are the similarities and differences between them? Let's analyze them below.

The difference between the two is in memory layout. embstr stores the redisObject and the SDS in one contiguous 64-byte block, so only one memory allocation is required. With raw, the SDS and the redisObject are separate, which requires two memory allocations and takes up more memory.

Since Redis 3.2, embstr uses a structure called sdshdr8, whose metadata takes only 3 bytes, whereas the SDS header in earlier versions takes 8 bytes. Out of the 64 bytes in total, subtracting the redisObject (16 bytes), the SDS metadata, and the terminating '\0' (1 byte) leaves 44 bytes for the actual content in 3.2 and later, and 39 bytes in earlier versions.
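The arithmetic behind the two limits can be written out as a small sketch. The constant and function names are hypothetical; the sizes are the ones given in the text, plus one byte for the trailing '\0', which has to be counted to arrive at 44 and 39.

```c
#include <assert.h>

enum {
    EMBSTR_BLOCK    = 64, /* total size of the contiguous block */
    ROBJ_SIZE       = 16, /* redisObject header */
    SDSHDR8_SIZE    = 3,  /* SDS metadata in sdshdr8 (Redis 3.2 and later) */
    OLD_SDSHDR_SIZE = 8,  /* SDS metadata before 3.2: two 4-byte ints */
    NUL_SIZE        = 1   /* trailing '\0' */
};

/* Bytes left for string content once all overhead is subtracted. */
static int embstr_payload(int sds_header_size) {
    assert(sds_header_size >= 0);
    return EMBSTR_BLOCK - ROBJ_SIZE - sds_header_size - NUL_SIZE;
}
```

With these numbers, embstr_payload(SDSHDR8_SIZE) is 64 - 16 - 3 - 1 = 44 and embstr_payload(OLD_SDSHDR_SIZE) is 64 - 16 - 8 - 1 = 39.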

When a string is no longer than 44 bytes, Redis creates it as an embedded string, which reduces memory allocations and memory fragmentation.

Embedded strings are created by the createEmbeddedStringObject function.

In short, just remember that Redis deliberately allocates one contiguous block of memory and places the redisObject structure and the SDS structure in it compactly.

In this way, for strings no longer than 44 bytes, memory fragmentation and the overhead of two memory allocations can be avoided.
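The single-allocation idea can be sketched as follows. The types obj_sketch and hdr_sketch and the function make_embedded are hypothetical stand-ins; they are much simpler than Redis's real redisObject and sdshdr8, and the code only illustrates placing the string header directly after the object in one malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, not the real Redis layouts. */
struct obj_sketch { int encoding; char *ptr; };
struct hdr_sketch { unsigned char len; unsigned char alloc; char buf[]; };

enum { ENC_RAW = 0, ENC_EMBSTR = 1 };

/* embstr style: one malloc holds the object, the string header, and the bytes. */
static struct obj_sketch *make_embedded(const char *s, size_t n) {
    assert(s != NULL && n <= 44);
    struct obj_sketch *o = malloc(sizeof *o + sizeof(struct hdr_sketch) + n + 1);
    if (o == NULL) return NULL;
    /* The string header starts right after the object in the same block. */
    struct hdr_sketch *sh = (struct hdr_sketch *)(o + 1);
    sh->len = (unsigned char)n;
    sh->alloc = (unsigned char)n;
    memcpy(sh->buf, s, n);
    sh->buf[n] = '\0';
    o->encoding = ENC_EMBSTR;
    o->ptr = sh->buf; /* points into the same allocation */
    return o;
}
```

An object built with make_embedded("Redis", 5) is released with a single free(o). A raw-style object would instead allocate the string separately and store that pointer in ptr, which costs a second malloc and a second free.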

SDS is an efficient string implementation in Redis. Its advantages include automatic expansion, binary safety, O(1) length lookup, and fewer memory reallocations on modification. In real applications, SDS makes string operations efficient and avoids some common string problems, such as buffer overflow. With a solid understanding of the internal structure and implementation of SDS, we can better understand the underlying mechanisms of Redis and use Redis more effectively.


Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to request removal.