Home  >  Article  >  Database  >  Evolution of high-performance MySql (1): Optimization of data types_Part 1

Evolution of high-performance MySql (1): Optimization of data types_Part 1

黄舟
黄舟Original
2017-02-09 15:00:461042browse

In the process of database performance tuning, a lot of knowledge will be involved, including whether the field attribute settings are appropriate, whether the index establishment is appropriate, whether the table structure is reasonable, whether the database/operating system settings are correct... Each topic may be a field.

In my opinion, among the key technologies for improving database performance, optimizing fields is relatively difficult and has a very large impact on performance. Since MySQL supports many data types, and each type has its own unique characteristics, sometimes when choosing a specific data type, one often chooses a usable type at will without taking this into consideration. Whether the type is optimal. Before describing the specific types, let’s first look at some main principles for selecting data types:

a) Try to choose types that take up small space
Because small types are either on disk or in memory The space occupied is small, and the space required by the temporary table for querying or sorting will be relatively small. You may not feel it when the amount of data is relatively small, but when the amount of data is relatively large, the importance of this principle may become apparent.

For example, there is a "product information" table with 20 million records. This table has a "remaining product quantity" (COUNT) field. Generally speaking, SMALLINT (len :16 range:0-65535) is enough to express this field, but if you use BIGINT (len:64 range:0-18446744073709551615) to express it during the design process, although the program may run correctly, this field It will add about 95M of additional disk storage space (64-16)/8*20,000,000 Bytes). In addition, when doing data selection and sorting, just this field will increase your memory consumption by 95M. Based on the impact of the above behavior, The performance of the database will inevitably be affected

The premise of keeping it as small as possible is to ensure that the type you are going to choose can meet the needs of future business development, because the table structure needs to be updated when the amount of data is relatively large. It's a very slow and troublesome thing.

b) Try to choose simple/appropriate types

When selecting and sorting tables, simple types often consume less CPU clocks cycle. For example, for MySql server, Compare of integer type values ​​is often simpler and faster than Compare of string type values, so when you need to sort a specific table, you should try to choose the integer type as the basis for sorting

c) Try to set the field to NOTNULL
Under normal circumstances, if you do not explicitly specify a field as NULL, then this field will be considered NULLABLE by the database system. The default behavior will cause the following three problems
(1) The query optimization function of the Mysql server itself will be affected
(2) Mysql requires additional storage space and processing for fields with null values
(3 ) If a null value is part of the index, the index effect will also be affected

Since this principle does not have a great effect on improving database performance, there is a NULLABLE field for the existing DB schema. Or if the index is NULLABLE, there is no need to modify it specifically, but for newly designed DBs or indexes, this principle needs to be adhered to as much as possible.

After introducing the principles of data type selection, we will introduce the common data types in Mysql and what needs to be paid attention to in performance optimization.

· Intersection
mainly includes Tinyint (8bit), Smallint (16bit), Mediumint (24bit), int (32bit), or Bigint (64bit).

For signed integers, the storage range of these types is (-2(n-1), 2(n-1)-1), and for unsigned numbers, the expression range is (0, 2n-1), for the database, signed numbers and unsigned numbers occupy the same storage space, so when selecting the type, you can only consider the range of the number, without considering whether it is signed or unsigned

Mysql allows You specify the width when defining an integer type, such as INT(10). There is a difference in the output of INT(10) for Client/CMD Line, but from the perspective of Mysql Server, there is no difference between INT(10) and INT(32) in terms of actual storage space/computation consumption/number range.

·                                                                                             # More space, so unless necessary, you should try to avoid using decimal types

When creating a decimal type field, you can use FLOAT(10,3) to specify the precision of the decimal, >= The maximum precision in MySQL 5.0 version supports 65 decimal places.

Since the database uses Binary Array String to store the numbers after the decimal point, the higher the precision you require, the higher the storage space/calculation CPU clock may be consumed.

Although using decimals may consume more storage space and CPU resources, and for early Mysql versions, the accuracy will be lost when two decimals are involved in calculations, but it is necessary in many cases. For example, in the financial field, the storage of amounts. In many cases, in order to reduce storage overhead and ensure accuracy, decimals are often expanded to integers and stored in the database, and the decimals are converted and calculated in the Application. For example, a user's account balance remains 999.35 yuan, then the amount stored in the data is 99935 points. After the bank's processing program gets 99935 points, it will first convert it into 999.35 yuan, and then perform the corresponding processing

· String

No matter which language it is, string is a relatively important and complex type. This rule also applies to MYSQL.
In MYSQL, there are two main string types: VARCHAR and CHAR. The storage methods of these two string types in disk and memory are determined by the Storage engine, and different storage engines may have different storage methods. Generally speaking, for a storage engine, the storage methods in disk and memory are also different. When data is transferred between disk and memory, the storage engine will be responsible for converting the data
VARCHAR
First of all, it needs to be pointed out that Mysql uses variable length to store VARCHAR. Compared with fixed length, this method adopts a storage space strategy of "use as much as you need", which is a relatively space-saving storage solution. It can be used as the default type without special requirements

The reason why VARCHAR can achieve fixed length is because each VARCHAR value will be appended with a length indicator of 1-2byte, for example, when it needs to store " "I Love Java", the underlying storage content is "11I Love Java", where 11 (1 Byte) represents the length. When the length of the content to be stored is 1000, the length indicator requires two bytes. Because the maximum value of 2bytes is 216, when the stored string exceeds this length, unpredictable exceptions will occur. In this case, CLOB needs to be used to store such ultra-long strings.

In different versions of MYSQL, the processing of trailing spaces in VARCHAR fields is also different
Version>=5.0 Keep the trailing spaces
Version
MYSQL 5.6 is an example:

Evolution of high-performance MySql (1): Optimization of data types_Part 1


The space overhead of using VARCHAR(5) and VARCHAR(200) to store 'hello' is the same. So are there any advantages to using shorter columns?

Facts have proved to have great advantages. Larger columns consume more memory because MySQL typically allocates fixed-size blocks of memory to hold internal values. This is especially bad when using in-memory temporary tables for sorting or operations. It's equally bad when sorting using disk temporary tables.

So the best strategy is to allocate only the space you really need.

CHAR
The biggest difference between the CHAR type and the VARCHAR type is that it is of fixed length. At the same time, compared with VARCHAR, it has the following main features
1) In all MYSQL versions, the spaces at the end will be intercepted

Evolution of high-performance MySql (1): Optimization of data types_Part 1

2) For some short and Fields with basically the same length are a good choice, such as MD5, ID Number
3) For fields that often need to be changed, the CHAR type will be more efficient
4) For some ultra-short fields, it is also very space-saving. For example, if you save "Y" or "N", using CHAR only requires one byte, while using VARCHAR requires two bytes (1byte length+1 byte value)

For fixed-length CHAR, Mysql The server will allocate sufficient storage space by padding spaces according to its defined length. One thing to note is that the operations of "filling spaces" and "removing trailing spaces" for VARCHAR/CHAR are implemented by the Mysql server and have nothing to do with the Storage engine.

The above is the evolution theory of high-performance MySql (1) :Data type optimization_Contents on, for more related content, please pay attention to the PHP Chinese website (www.php.cn)!


Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn