1. We need to store POJO in the cache. The class is defined as follows
public class TestPOJO implements Serializable { private String testStatus; private String userPin; private String investor; private Date testQueryTime; private Date createTime; private String bizInfo; private Date otherTime; private BigDecimal userAmount; private BigDecimal userRate; private BigDecimal applyAmount; private String type; private String checkTime; private String preTestStatus; public Object[] toValueArray(){ Object[] array = {testStatus, userPin, investor, testQueryTime, createTime, bizInfo, otherTime, userAmount, userRate, applyAmount, type, checkTime, preTestStatus}; return array; } public CreditRecord fromValueArray(Object[] valueArray){ //具体的数据类型会丢失,需要做处理 } }
2. Use the following example as test data
TestPOJO pojo = new TestPOJO(); pojo.setApplyAmount(new BigDecimal("200.11")); pojo.setBizInfo("XX"); pojo.setUserAmount(new BigDecimal("1000.00")); pojo.setTestStatus("SUCCESS"); pojo.setCheckTime("2023-02-02"); pojo.setInvestor("ABCD"); pojo.setUserRate(new BigDecimal("0.002")); pojo.setTestQueryTime(new Date()); pojo.setOtherTime(new Date()); pojo.setPreTestStatus("PROCESSING"); pojo.setUserPin("ABCDEFGHIJ"); pojo.setType("Y");
System.out.println(JSON.toJSONString(pojo).length());
Use JSON to directly serialize and print length=284**, **This method is the simplest way , which is also the most commonly used method. The specific data is as follows:
{"applyAmount":200.11,"bizInfo":"XX","checkTime":"2023-02-02","investor":"ABCD ","otherTime":"2023-04-10 17:45:17.717","preCheckStatus":"PROCESSING","testQueryTime":"2023-04-10 17:45:17.717","testStatus":"SUCCESS ","type":"Y","userAmount":1000.00,"userPin":"ABCDEFGHIJ","userRate":0.002}
We found that the above contains a lot of useless data, among which the attribute names There is no need to store it.
System.out.println(JSON.toJSONString(pojo.toValueArray()).length());
By selecting the array structure instead of the object structure, the attribute name is removed, and length=144 is printed. The data size has been reduced by 50%. The specific data is as follows:
["SUCCESS","ABCDEFGHIJ","ABCD","2023-04-10 17:45:17.717",null,"XX"," 2023-04-10 17:45:17.717",1000.00,0.002,200.11,"Y","2023-02-02","PROCESSING"]
We found that there is no need to store null. The time format is serialized into a string. Unreasonable serialization results lead to data expansion, so we should choose a better serialization tool.
//我们仍然选取JSON格式,但使用了第三方序列化工具 System.out.println(new ObjectMapper(new MessagePackFactory()).writeValueAsBytes(pojo.toValueArray()).length);
Choose better serialization tools to achieve field compression and reasonable data format, print** length=92, the space is reduced by 40% compared with the previous step.
This is a piece of binary data. Redis needs to be operated in binary. After converting the binary to a string, print it as follows:
��SUCCESS�ABCDEFGHIJ�ABCD� �j�6� ��XX� �j�6�� ��?`bM����@i � �Q�Y�2023-02-02�PROCESSING
Follow this idea further Digging, we found that we can achieve more extreme optimization effects by manually selecting data types. Choosing to use smaller data types will achieve further improvements.
In the above use case, the three fields testStatus, preCheckStatus, and investor are actually enumeration string types. If they can Using simpler data types (such as byte or int, etc.) instead of string can further save space. You can use the Long type instead of a string to represent checkTime, so that the serialization tool outputs fewer bytes.
public Object[] toValueArray(){ Object[] array = {toInt(testStatus), userPin, toInt(investor), testQueryTime, createTime, bizInfo, otherTime, userAmount, userRate, applyAmount, type, toLong(checkTime), toInt(preTestStatus)}; return array; }
After manual adjustment, a smaller data type was used instead of String type, printinglength=69
In addition to the above points, you can also consider using ZIP compression to obtain a smaller volume. When the content is large or repetitive, the effect of ZIP compression is obvious. If the storage The content is an array of TestPOJOs, probably suitable for use with ZIP compression.
For files smaller than 30 bytes, ZIP compression may increase the file size but may not necessarily reduce the file size. In the case of less repetitive content, no significant improvement can be obtained. And there is CPU overhead.
After the above optimization, ZIP compression is no longer a required option, and testing based on actual data is required to determine the ZIP compression effect.
The above improvement steps reflect the optimization ideas, but the deserialization process will lead to the loss of types, which is more cumbersome to handle, so We also need to consider the issue of deserialization.
When the cache object is predefined, we can completely process each field manually. Therefore, in actual combat, it is recommended to use manual serialization to achieve the above purpose, achieve refined control, and achieve the best compression. effect and minimal performance overhead.
You can refer to the following msgpack implementation code. The following is the test code. Please package better Packer and UnPacker tools by yourself:
<dependency> <groupId>org.msgpack</groupId> <artifactId>msgpack-core</artifactId> <version>0.9.3</version> </dependency>
public byte[] toByteArray() throws Exception { MessageBufferPacker packer = MessagePack.newDefaultBufferPacker(); toByteArray(packer); packer.close(); return packer.toByteArray(); } public void toByteArray(MessageBufferPacker packer) throws Exception { if (testStatus == null) { packer.packNil(); }else{ packer.packString(testStatus); } if (userPin == null) { packer.packNil(); }else{ packer.packString(userPin); } if (investor == null) { packer.packNil(); }else{ packer.packString(investor); } if (testQueryTime == null) { packer.packNil(); }else{ packer.packLong(testQueryTime.getTime()); } if (createTime == null) { packer.packNil(); }else{ packer.packLong(createTime.getTime()); } if (bizInfo == null) { packer.packNil(); }else{ packer.packString(bizInfo); } if (otherTime == null) { packer.packNil(); }else{ packer.packLong(otherTime.getTime()); } if (userAmount == null) { packer.packNil(); }else{ packer.packString(userAmount.toString()); } if (userRate == null) { packer.packNil(); }else{ packer.packString(userRate.toString()); } if (applyAmount == null) { packer.packNil(); }else{ packer.packString(applyAmount.toString()); } if (type == null) { packer.packNil(); }else{ packer.packString(type); } if (checkTime == null) { packer.packNil(); }else{ packer.packString(checkTime); } if (preTestStatus == null) { packer.packNil(); }else{ packer.packString(preTestStatus); } } public void fromByteArray(byte[] byteArray) throws Exception { MessageUnpacker unpacker = MessagePack.newDefaultUnpacker(byteArray); fromByteArray(unpacker); unpacker.close(); } public void fromByteArray(MessageUnpacker unpacker) throws Exception { if (!unpacker.tryUnpackNil()){ this.setTestStatus(unpacker.unpackString()); } if (!unpacker.tryUnpackNil()){ this.setUserPin(unpacker.unpackString()); } if (!unpacker.tryUnpackNil()){ this.setInvestor(unpacker.unpackString()); } if (!unpacker.tryUnpackNil()){ this.setTestQueryTime(new Date(unpacker.unpackLong())); } if (!unpacker.tryUnpackNil()){ this.setCreateTime(new Date(unpacker.unpackLong())); } if (!unpacker.tryUnpackNil()){ this.setBizInfo(unpacker.unpackString()); } if (!unpacker.tryUnpackNil()){ this.setOtherTime(new Date(unpacker.unpackLong())); } if (!unpacker.tryUnpackNil()){ this.setUserAmount(new BigDecimal(unpacker.unpackString())); } if (!unpacker.tryUnpackNil()){ this.setUserRate(new BigDecimal(unpacker.unpackString())); } if (!unpacker.tryUnpackNil()){ this.setApplyAmount(new BigDecimal(unpacker.unpackString())); } if (!unpacker.tryUnpackNil()){ this.setType(unpacker.unpackString()); } if (!unpacker.tryUnpackNil()){ this.setCheckTime(unpacker.unpackString()); } if (!unpacker.tryUnpackNil()){ this.setPreTestStatus(unpacker.unpackString()); } }
Assume that we store data for 200 million users. Each user contains 40 fields. The length of the field key is 6 bytes, and the fields are managed separately.
Under normal circumstances, we will think of the hash structure, and the hash structure stores key information, which will occupy additional resources. The field key is unnecessary data. According to the above ideas, you can use list instead of hash structure.
Through the Redis official tool test, using the list structure requires 144G of space, while using the hash structure requires 245G of space** (When more than 50% of the attributes are empty, you need to test whether it is still applicable)* *
In the above case, we took several very simple measures, with just a few lines of simple code, which can reduce the space by more than 70%. When the amount of data is relatively large, It is highly recommended in scenarios with large and high performance requirements. :
• Use arrays instead of objects (if a large number of fields are empty, you need to use serialization tools to compress nulls)
• Use better serialization tools
• Use Smaller data types
• Consider using ZIP compression
• Use list instead of hash structure (if a large number of fields are empty, testing and comparison are required)
The above is the detailed content of How to optimize Redis cache space. For more information, please follow other related articles on the PHP Chinese website!