In database optimization, one of the most commonly used and most effective techniques is to keep the data as small as possible, so that each table occupies as little disk space as possible. With less data to move, disk reads and writes are faster, and processing a small table during a query uses fewer system resources. Likewise, an index built on a smaller column occupies fewer resources. So how can database administrators slim down their data? The author offers the following suggestions.
Suggestion 1: Null values can still take up space
First, a common misconception needs clearing up. Some database administrators believe that null values do not occupy any system resources. This is wrong. Because of it, they avoid declaring fields NOT NULL when designing the database and simply let users enter whatever data they need. The author believes this approach hurts database performance.
The author's advice is to declare columns NOT NULL wherever possible, so that null values are not allowed. This speeds up subsequent processing and, in terms of storage, saves one bit per column, which contributes to slimming down the data. In practice, when users are not required to enter a value, a default value can be used to keep the column non-null. For example, in a payroll system, an employee's years of service can default to 0 instead of being left blank. Of course, if you really need NULL, use it. But as a database engineer you should avoid NULL values where you can.
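The default-value approach can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, and the table name payroll and its columns are hypothetical. It shows the NOT NULL DEFAULT behavior, not the storage saving.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL here; the table and column
# names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE payroll (
        employee_code TEXT    NOT NULL,
        working_years INTEGER NOT NULL DEFAULT 0
    )
    """
)

# No working_years supplied: the column default is stored instead of NULL.
conn.execute("INSERT INTO payroll (employee_code) VALUES ('E0001')")
working_years = conn.execute(
    "SELECT working_years FROM payroll WHERE employee_code = 'E0001'"
).fetchone()[0]

# An explicit NULL is rejected by the NOT NULL constraint.
try:
    conn.execute("INSERT INTO payroll VALUES ('E0002', NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

print(working_years, null_rejected)
```

The row inserted without a value ends up with 0 years of service, and the attempt to store an explicit NULL fails, so the column never holds a null.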
Suggestion 2: Use as small a data type as possible
The size of a data type also affects the size of the underlying table. For example, both MEDIUMINT and INT can store integers, but the range of values they can hold differs. In terms of storage, MEDIUMINT takes 3 bytes and INT takes 4, so the former needs 25% less space than the latter. For this reason, do not use INT where MEDIUMINT is sufficient.
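The arithmetic behind this choice can be worked through in a few lines. The sketch below uses the storage sizes of MySQL's integer types; the helper names smallest_signed_type and bytes_saved and the row count are hypothetical, and only the raw column data is counted, not row or index overhead.

```python
# Storage size in bytes of MySQL's integer types.
INTEGER_TYPES = [
    ("TINYINT", 1),
    ("SMALLINT", 2),
    ("MEDIUMINT", 3),
    ("INT", 4),
    ("BIGINT", 8),
]


def smallest_signed_type(min_value, max_value):
    """Return the narrowest signed integer type that holds the range."""
    for name, size in INTEGER_TYPES:
        low = -(2 ** (8 * size - 1))
        high = 2 ** (8 * size - 1) - 1
        if low <= min_value and max_value <= high:
            return name, size
    raise ValueError("range does not fit in BIGINT")


def bytes_saved(rows, from_size, to_size):
    """Bytes saved in the column data alone (ignores row and index overhead)."""
    return rows * (from_size - to_size)


# A column whose values stay between 0 and 5,000,000.
name, size = smallest_signed_type(0, 5_000_000)
saving = 1 - size / 4  # fraction saved relative to a 4-byte INT
print(name, size, saving, bytes_saved(10_000_000, 4, size))
```

For this range the function picks MEDIUMINT, and over ten million rows the one byte saved per value adds up to about 10 MB for that column alone.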
In addition, declared lengths should be as short as the requirements allow. For example, suppose a salary assessment system has an employee code field, and the company has settled on employee codes of exactly five characters. Then the field only needs a declared length of 5 characters. This not only reduces storage space but also provides a degree of data validation: a code longer than 5 characters cannot be saved as entered (in strict SQL mode MySQL rejects it; otherwise the value is truncated with a warning).
Several data types can often hold the same data, and it is tempting to declare a generous length. However, choosing the smallest data type that meets the need reduces storage space, slims down the data, and thereby further improves database performance.
Suggestion 3: The relationship between index and data table size
As mentioned at the beginning of the article, an index on a relatively small column occupies relatively few resources. Index size and table size are therefore closely related, and setting the right index in the right place at the right time also helps slim down the data.
Normally a table may have several indexes but at most one primary index. The primary index of each table should therefore be kept as short as possible, which lets the database identify each row more quickly.
Another technique is to index a prefix where possible. Suppose you need an index on a column whose values are already unique within their first few characters. In that case it is better to index only this prefix rather than the whole column. MySQL supports creating an index on the leftmost part of a character column. In effect, the database splits the field into two parts at a given length; if the leading part remains unique after the split, you only need to index that leading part, not the entire field. This reduces the resources the index occupies and helps slim down the data. Shorter indexes also make queries faster: they take up less disk space and give more hits in the index cache, which reduces the number of disk seeks and improves query efficiency.
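One way to choose the prefix length is to look for the shortest prefix that still distinguishes every existing value. The following is a minimal sketch of that check, run on sample data outside the database; the function name, the sample codes, and the table and index names (shipment, code, idx_code_prefix) are hypothetical. The generated statement uses MySQL's col_name(length) prefix syntax.

```python
def shortest_unique_prefix_length(values):
    """Return the smallest n such that the first n characters of every
    value are distinct, or None if the full values are not unique."""
    if len(set(values)) != len(values):
        return None
    longest = max((len(v) for v in values), default=0)
    for n in range(1, longest + 1):
        if len({v[:n] for v in values}) == len(values):
            return n
    return None


# Hypothetical sample of the column to be indexed.
codes = ["A1023-north", "A1047-north", "B2210-south", "B2291-south"]
n = shortest_unique_prefix_length(codes)

# MySQL prefix-index syntax: col_name(length).
statement = f"CREATE INDEX idx_code_prefix ON shipment (code({n}))"
print(statement)
```

A prefix that is unique for today's data may stop being unique as rows are added, so in practice it is reasonable to leave a small margin above the computed length.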
Finally, indexes must not be overused. Indexes can indeed speed up data retrieval, but they also bring additional overhead. Indexes improve database performance only when their benefit outweighs that overhead; otherwise they have the opposite effect. For example, if a table needs fast inserts and updates, too many indexes on it will slow those writes down. The author therefore suggests that if a table is accessed mainly by searching on a combination of columns, it is best to create a single index on them rather than one index per column. The first part of this index should be the column used most often in daily work. If you do need to use many columns, it is best to lead with the column that has the most duplicate values, to get better index compression. This reduces the extra resource consumption that indexes cause.
Suggestion 4: Do not trim where the data needs to stay “full”
Be slim where slimness helps, and full where fullness is needed. The same is true for databases. Wherever disk space can be saved, save it. But what should not be trimmed must not be cut just for the sake of slimming down; sometimes that backfires.
Take VARCHAR as an example. With MyISAM, if a table has no variable-length columns (VARCHAR, TEXT, or BLOB), a fixed-size row format is used, so it is better to stay with fixed-size data types. Fixed-length types do tend to waste some storage space, because a value shorter than the declared length is still stored at the full fixed length. Even so, if fixed length is an option in this case, use it: some disk space is wasted, but queries become faster.
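The space side of this trade-off can be estimated with simple arithmetic. The sketch below assumes a single-byte character set such as latin1, where CHAR(n) always occupies n bytes and VARCHAR stores the value plus a 1-byte length prefix (2 bytes when the column can exceed 255 bytes). The function names and sample values are hypothetical, and the speed benefit of fixed-size rows is not measured here.

```python
def char_bytes(declared_length):
    """CHAR(n) in a single-byte character set always occupies n bytes."""
    return declared_length


def varchar_bytes(value, declared_length):
    """VARCHAR(n) stores the value plus a length prefix: 1 byte when the
    column can hold at most 255 bytes, otherwise 2 bytes."""
    prefix = 1 if declared_length <= 255 else 2
    return len(value.encode("latin-1")) + prefix


# Hypothetical values stored in a column declared with length 10.
names = ["Li", "Wang", "Ouyang", "Zhang"]
fixed_total = sum(char_bytes(10) for _ in names)
variable_total = sum(varchar_bytes(v, 10) for v in names)
print(fixed_total, variable_total)
```

Here the fixed-length column uses 40 bytes against 21 for the variable-length one, which is the wasted space the text accepts in exchange for faster access to fixed-size rows.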
As this shows, slimming down the data does not improve database performance in every circumstance. It is like cutting costs to raise profit: the savings must be made where they count. Otherwise you not only fail to save anything but also shoot yourself in the foot. In plain terms: be slim where you should be slim, and full where you should be full. Just remember this sentence.
Suggestion 5: Split the table to achieve weight loss
When ants carry food and a piece is too large to move, they may break it up until the pieces can be carried. This is the principle of dividing the cake, and the same situation is common in daily work. For example, if a table contains a very large number of records, working with it becomes very slow. In that case the table can be divided into multiple workbooks according to some rule. Suppose there is a table of attendance records for a company's employees, and querying, sorting, and counting over it takes a very long time. You can then split it into separate workbooks by department and analyze each one. Although this means more work up front, processing will be much faster.
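The department-based split described above can be sketched in a few lines. This is an in-memory illustration only, with plain Python dictionaries standing in for table rows; the function name split_by_key, the field names, and the sample records are all hypothetical.

```python
from collections import defaultdict


def split_by_key(rows, key):
    """Group rows (dicts) into separate lists, one per value of `key`."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return dict(parts)


# Hypothetical attendance records.
attendance = [
    {"employee": "E0001", "department": "Sales", "days_present": 21},
    {"employee": "E0002", "department": "Sales", "days_present": 19},
    {"employee": "E0003", "department": "Finance", "days_present": 22},
    {"employee": "E0004", "department": "R&D", "days_present": 20},
]

parts = split_by_key(attendance, "department")

# Each smaller part can now be analyzed on its own.
totals = {
    dept: sum(r["days_present"] for r in rows) for dept, rows in parts.items()
}
print(totals)
```

Each per-department part is smaller than the original, so sorting or counting within one department touches only that department's rows.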
By the same principle, when optimizing a database, it can be very helpful to split a large table that is scanned often into two or more tables. For example, in my daily work, when I have a dynamic-format table that is scanned often, I use a smaller static-format table to find the relevant rows.
Splitting a table in this way divides one large cake into several smaller ones, which makes subsequent statistics and analysis easier. Of course, how well this works depends directly on the rule used for the split. How to split a table to get the desired effect is a large topic in its own right. Space is limited here, so the author will not go into detail; a later article may expand on it.