
MySQL big data query performance optimization tutorial (picture)

Author: php是最好的语言 (original)
2018-07-26 16:42:59

MySQL performance optimization includes table optimization and column type selection. Table optimization can be broken down into three points: 1. separate fixed-length fields from variable-length fields; 2. separate commonly used fields from rarely used fields; 3. add a redundant field for one-to-many relationships that need aggregate statistics.

1. Table optimization and column type selection

Table optimization:

1. Separate fixed-length fields from variable-length fields

For example, id int occupies 4 bytes, and char(4) occupies 4 characters, so it is also fixed length. Time types likewise occupy a fixed number of bytes per value.

Core, commonly used fields should be defined as fixed length and placed in one table.

Variable-length fields such as varchar, text, and blob are better placed in a separate table that is associated with the core table through the primary key.
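As a rough sketch of this split, the statements below assume a hypothetical member table; every table and column name here is illustrative and does not come from the original article.

```sql
-- Core table: only fixed-length, frequently used columns.
CREATE TABLE member (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    account   CHAR(16)     NOT NULL DEFAULT '',
    reg_time  INT UNSIGNED NOT NULL DEFAULT 0,   -- Unix timestamp
    PRIMARY KEY (id)
);

-- Extension table: variable-length columns, tied to the core table
-- through the same primary key value.
CREATE TABLE member_profile (
    member_id INT UNSIGNED NOT NULL,
    nickname  VARCHAR(50)  NOT NULL DEFAULT '',
    intro     TEXT,
    PRIMARY KEY (member_id)
);

-- Fetch the long columns only when a page actually needs them.
SELECT m.account, p.intro
FROM member AS m
JOIN member_profile AS p ON p.member_id = m.id
WHERE m.id = 1;
```

Queries that only touch the core columns never have to read the long text values, which is the point of the split.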

         

2. Separate commonly used fields from rarely used fields

This has to be analyzed against the specific business of the website: look at the scenarios in which each field is queried, and split rarely queried fields out into a separate table.

           

3. Add a redundant field for one-to-many relationships that need aggregate statistics

Consider a forum as an example:

Each board contains n posts, and the home page displays the board information together with the number of posts in each board.

Without a redundant field, the page would have to run select count(*) from post group by board_id to get the number of posts in each board. Keeping the count in a redundant field on the board table avoids this query.
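The redundant field described above could look like the following sketch. The post table and board_id come from the query in the text; the post_num, board_name, and title columns are assumptions made for illustration, and the transaction assumes a transactional engine such as InnoDB.

```sql
-- Hypothetical board table with a redundant post_num column.
CREATE TABLE board (
    board_id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    board_name CHAR(20)     NOT NULL DEFAULT '',
    post_num   INT UNSIGNED NOT NULL DEFAULT 0,  -- redundant count of posts in this board
    PRIMARY KEY (board_id)
);

-- When a post is added to board 3, bump the counter in the same transaction
-- so the count and the post table cannot drift apart.
START TRANSACTION;
INSERT INTO post (board_id, title) VALUES (3, 'hello');
UPDATE board SET post_num = post_num + 1 WHERE board_id = 3;
COMMIT;

-- The home page then reads the counts without touching the post table.
SELECT board_id, board_name, post_num FROM board;
```

This trades a little extra work on every insert (and delete) for a much cheaper read on the home page.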

2. Column type selection

1. Field type priority

integer > date, time > enum, char > varchar > blob, text

Integer types: fixed length, with no country/region distinction and no character set differences. For example:

tinyint storing 1, 2, 3, 4, 5 versus char(1) storing a, b, c, d, e

With a single-byte character set both occupy 1 byte, but the former sorts faster under order by, because the latter has to take the character set and the collation (that is, the sorting rules) into account.

Time types: fixed length, fast to operate on, and space-saving. Because the time zone has to be considered, conditions such as where ... > '2018-08-08' are inconvenient to write.

enum: serves as a constraint on the allowed values and is stored internally as an integer, but when it is joined against a char column, the values have to go through an internal conversion between strings and numbers.

char: fixed length; the character set and the collation have to be considered.

varchar: variable length; character set conversion and the collation have to be considered when sorting, so it is slow.

text/blob: cannot use an in-memory temporary table (sorting and similar operations can only be performed on disk).

Note: on the choice between date and time types, one expert's clear recommendation is to use int unsigned not null directly and store a Unix timestamp.

For example, storing gender (taking utf8 as the character set):

char(1): 3 bytes long

enum('Male', 'Female'): converted internally to a number for storage, which adds one conversion step

tinyint: fixed length, 1 byte
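Putting the gender and timestamp recommendations together, a table might be declared as in this sketch. The table name, the column names, and the 0/1/2 coding of gender are assumptions for illustration only.

```sql
CREATE TABLE member_info (
    id         INT UNSIGNED     NOT NULL AUTO_INCREMENT,
    gender     TINYINT UNSIGNED NOT NULL DEFAULT 0,  -- assumed coding: 0 = unknown, 1 = male, 2 = female
    created_at INT UNSIGNED     NOT NULL DEFAULT 0,  -- Unix timestamp
    PRIMARY KEY (id)
);

INSERT INTO member_info (gender, created_at) VALUES (1, UNIX_TIMESTAMP());

-- The range condition compares plain integers; the date literal is
-- converted once, using the session time zone.
SELECT id, FROM_UNIXTIME(created_at) AS created
FROM member_info
WHERE created_at > UNIX_TIMESTAMP('2018-08-08 00:00:00');
```

The cost of this design is readability: raw values need FROM_UNIXTIME() (or application code) to be shown as dates.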

2. Use a type that is just large enough; do not be generous (for example with smallint or varchar(N))

Reason: oversized fields waste memory and slow things down.

Taking age as an example, tinyint unsigned not null can store values up to 255, which is enough. Using int wastes 3 bytes.

When varchar(10) and varchar(300) store the same content, varchar(300) still takes more memory during table join queries.

3. Try to avoid using NULL

Reason: NULL is not index-friendly, and it has to be marked with special bytes.

It actually takes up more space on disk (MySQL 5.5 improved the handling of NULL, but queries are still inconvenient).
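The query inconvenience can be seen in a small sketch like the one below; the table t_null_demo and its columns are hypothetical.

```sql
CREATE TABLE t_null_demo (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name CHAR(10) NULL,
    PRIMARY KEY (id)
);

INSERT INTO t_null_demo (name) VALUES ('a'), ('b'), (NULL);

-- Neither query returns the NULL row: a comparison with NULL is never true.
SELECT * FROM t_null_demo WHERE name = 'a';
SELECT * FROM t_null_demo WHERE name <> 'a';

-- NULL needs its own predicate.
SELECT * FROM t_null_demo WHERE name IS NULL;

-- Replacing NULL with a default value and declaring the column NOT NULL
-- removes the special case.
UPDATE t_null_demo SET name = '' WHERE name IS NULL;
ALTER TABLE t_null_demo MODIFY name CHAR(10) NOT NULL DEFAULT '';
```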

3. Index optimization strategy

1. Index type

1.1 B-tree index

It is called a btree index. Broadly speaking, every engine uses a balanced tree, but the specific implementation differs slightly from engine to engine. Strictly speaking, for example, the NDB engine uses a T-tree.

Abstractly, however, the B-tree family can be understood as a "sorted structure for fast lookups".

1.2 Hash index

Memory tables use hash indexes by default, and the theoretical lookup time complexity of a hash is O(1).

Question: since hash lookup is so efficient, why not use hash indexes everywhere?

Answer:

1. The result of a hash function is effectively random. If the data is stored on disk, taking a primary key id as an example, then as id grows, the rows for successive ids end up scattered randomly across the disk.

2. Range queries cannot be optimized.

3. Prefix lookups cannot use the index. In a btree, for example, if the indexed column x holds the value "helloworld", the query x = 'helloworld' can naturally use the index, and a lookup on the prefix "hello" (such as x like 'hello%') can also use it (left-prefix index). With a hash, the hash of "hello" has no relation to the hash of "helloworld".

4. Sorting cannot be optimized.

5. A row lookup is always required: the index only yields the location of the data, so the query has to go back to the table to fetch the row.
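The contrast between the two index types can be tried on a MEMORY table, which supports both. This is a minimal sketch with hypothetical table and column names.

```sql
CREATE TABLE mem_lookup (
    id   INT UNSIGNED NOT NULL,
    code CHAR(10)     NOT NULL DEFAULT '',
    PRIMARY KEY (id),                     -- hash index by default in MEMORY tables
    KEY idx_code (code) USING BTREE       -- explicitly ask for a B-tree instead
) ENGINE = MEMORY;

-- Equality lookup: a good fit for the hash index on id.
SELECT * FROM mem_lookup WHERE id = 42;

-- Prefix and range lookups with sorting: the cases a hash index cannot
-- help with, so they are pointed at the B-tree index on code.
SELECT * FROM mem_lookup WHERE code LIKE 'hello%';
SELECT * FROM mem_lookup WHERE code BETWEEN 'a' AND 'm' ORDER BY code;
```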

2. Common misconceptions about btree indexes

2.1 Adding an index on every column commonly used in where conditions, for example:

where cat_id = 3 and price > 100; (find products in category 3 priced above 100 yuan)

Misconception: add one index on cat_id and another on price.

Why it is wrong: for a query like this MySQL will typically use only the cat_id index or only the price index, because they are independent indexes.
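One common alternative to two independent indexes is a single composite index covering both columns. The sketch below assumes a hypothetical goods table; only cat_id and price come from the example in the text.

```sql
CREATE TABLE goods (
    goods_id INT UNSIGNED      NOT NULL AUTO_INCREMENT,
    cat_id   SMALLINT UNSIGNED NOT NULL DEFAULT 0,
    price    DECIMAL(10,2)     NOT NULL DEFAULT 0.00,
    PRIMARY KEY (goods_id),
    KEY idx_cat_price (cat_id, price)   -- equality column first, range column second
);

-- Inspect the key and key_len columns of the plan to see how much of
-- the composite index the optimizer decides to use.
EXPLAIN SELECT goods_id, price
FROM goods
WHERE cat_id = 3 AND price > 100;
```

The actual plan depends on the data and the server version, so the EXPLAIN output should be checked rather than assumed.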

2.2 Believing that once an index is created on multiple columns (a joint index), it helps no matter which of those columns is queried

Why it is wrong: for a multi-column index to take effect, the query has to satisfy the left-prefix requirement.

Take index(a,b,c) as an example (note that the column order matters).
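The left-prefix requirement for index(a,b,c) can be explored with a sketch like the following. The table t_abc and the extra column d are hypothetical, and the comments describe which index columns can be used to locate rows, not a guaranteed plan; newer MySQL versions may additionally filter on later index columns after the lookup.

```sql
CREATE TABLE t_abc (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    a  INT      NOT NULL DEFAULT 0,
    b  INT      NOT NULL DEFAULT 0,
    c  INT      NOT NULL DEFAULT 0,
    d  CHAR(10) NOT NULL DEFAULT '',   -- not part of the index
    PRIMARY KEY (id),
    KEY idx_abc (a, b, c)
);

-- a, b and c can all be used for the lookup.
EXPLAIN SELECT * FROM t_abc WHERE a = 1 AND b = 2 AND c = 3;

-- a and b can be used; c is not in the condition.
EXPLAIN SELECT * FROM t_abc WHERE a = 1 AND b = 2;

-- Only a can be used: b is missing, so c cannot continue the prefix.
EXPLAIN SELECT * FROM t_abc WHERE a = 1 AND c = 3;

-- a and the range on b can be used; columns after a range column cannot.
EXPLAIN SELECT * FROM t_abc WHERE a = 1 AND b > 2 AND c = 3;

-- No leading column a: the index cannot be used to locate rows.
EXPLAIN SELECT * FROM t_abc WHERE b = 2 AND c = 3;
```

Comparing key_len across these plans shows how many leading columns of the index each query actually uses.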

4. Index experiment

For example: select * from t4 where c1=3 and c2 = 4 and c4>5 and c3=2;

To see which index columns are used, run:

explain select * from t4 where c1=3 and c2=4 and c4>5 and c3=2 \G

In the result, note key_len: 4.
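The article does not show how t4 was created, so the definition and sample rows below are assumptions chosen to be consistent with key_len: 4. With five tinyint not null columns and a joint index on (c1, c2, c3, c4), each indexed column contributes 1 byte to key_len.

```sql
-- Assumed definition; not taken from the original article.
CREATE TABLE t4 (
    c1 TINYINT NOT NULL DEFAULT 0,
    c2 TINYINT NOT NULL DEFAULT 0,
    c3 TINYINT NOT NULL DEFAULT 0,
    c4 TINYINT NOT NULL DEFAULT 0,
    c5 TINYINT NOT NULL DEFAULT 0,
    KEY idx_c1234 (c1, c2, c3, c4)
);

-- A few illustrative rows so the optimizer has something to plan against.
INSERT INTO t4 VALUES (1, 3, 5, 6, 7), (2, 3, 9, 8, 3), (4, 3, 2, 7, 5);

-- The order of AND-ed conditions in the statement does not matter:
-- c1 = 3, c2 = 4, c3 = 2 are equality matches on the first three index
-- columns, and c4 > 5 is a range on the fourth.
-- \G is the mysql client's vertical-output terminator.
EXPLAIN SELECT * FROM t4
WHERE c1 = 3 AND c2 = 4 AND c4 > 5 AND c3 = 2\G
```

Under these assumptions, key_len: 4 would mean that all four index columns take part in the lookup, even though c4 appears before c3 in the where clause.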

5. Clustered index and non-clustered index

Similarities and differences between the index files of the MyISAM and InnoDB engines:

MyISAM: a table consists of two separate files, the data file (news.myd) and the index file (news.myi), which is why this is called a non-clustered index. Both the primary index and the secondary indexes point to the physical row (its location on disk).

InnoDB: the index and the data are stored together, so it is a clustered index. The row data is stored directly in InnoDB's primary index, and a secondary index holds a reference to the primary key index.

Note: for InnoDB:

1. The primary key index stores the index value, and the row data is stored in its leaves.

2. If there is no primary key, a unique key (one whose columns are all not null) is used as the primary key.

3. If there is no such unique key, the system generates an internal rowid to serve as the primary key.

4. A structure like InnoDB's, in which the primary key index stores both the primary key value and the row data, is called a clustered index.

Clustered index

Advantage: when a query fetches a small number of rows by primary key, no extra row lookup is needed (the data sits under the primary key node).

Disadvantage: when rows are inserted with irregular (non-sequential) primary key values, page splits occur frequently.
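A common way to limit page splits is to use a monotonically increasing primary key. The sketch below reuses the news table name from the MyISAM example; the columns are assumptions for illustration.

```sql
-- InnoDB: rows live in the leaves of the primary key index.
CREATE TABLE news (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- monotonically increasing key
    title   VARCHAR(100) NOT NULL DEFAULT '',
    content TEXT,
    PRIMARY KEY (id),
    KEY idx_title (title)        -- secondary index: its entries carry the id value
) ENGINE = InnoDB;

-- New rows get ever-larger ids, so they are appended at the end of the
-- clustered index instead of being squeezed into the middle of full pages.
INSERT INTO news (title, content) VALUES ('first', '...'), ('second', '...');

-- Primary key lookup: the row is read straight from the clustered index.
SELECT * FROM news WHERE id = 1;

-- Secondary index lookup: find the id via idx_title, then fetch the row by id.
SELECT * FROM news WHERE title = 'first';
```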
