Home >Database >Mysql Tutorial >Detailed introduction to mysql innodb index principle (code example)

Detailed introduction to mysql innodb index principle (code example)

不言
不言forward
2019-03-04 15:06:482681browse

This article brings you a detailed introduction (code example) about the mysql innodb index principle. It has certain reference value. Friends in need can refer to it. I hope it will be helpful to you.

Clustered index (clustered index)

The innodb storage engine table is an index-organized table, and the data in the table is stored in the order of the primary key. The clustered index constructs a B-tree in the order of the primary keys of each table, and its leaf nodes store the row record data of the entire table, and these leaf nodes become data pages. (Related recommendations: MySQL Tutorial)

The storage of the clustered index is not physically continuous, but logically continuous. The leaf nodes are sorted according to the order of the primary key and connected through a two-way linked list. . In most cases, the query optimizer tends to use clustered indexes because clustered indexes can find data directly at leaf nodes, and because the logical order of data is defined, queries for range values ​​can be accessed very quickly.

This feature of the clustered index determines that the data in the index-organized table is also part of the index. Since the data in the table can only be sorted according to a B-tree, a table can only have one clustered index.

In Innodb, the clustered index is the primary key index by default. If there is no primary key, follow the following rules to build a clustered index:

  • When there is no primary key, a non-empty and unique index column will be used as the primary key and become the clustered index of this table;
  • If there is no such index, InnoDB will implicitly define a primary key as a clustered index.

Since the primary key uses a clustered index, if the primary key is an auto-incrementing id, the corresponding data will also be stored adjacently on the disk, and the writing performance is high. If it is in the form of a string such as uuid, frequent insertion will cause innodb to frequently move disk blocks, and the writing performance will be relatively low.

B-tree (multi-path balanced search tree)

We know that the innodb engine index uses the B-tree structure, so why not other types of tree structures, such as binary trees?

When a computer stores data, it has a minimum storage unit, just like the minimum unit of RMB circulation is cents. The smallest unit of the file system is a block. The size of a block is 4K (this value varies according to the system and can be set). The InnoDB storage engine also has its own minimum storage unit - page (Page). The size of a page is 16K (this value is also configurable).

The size of a file in the file system is only 1 byte, but it has to occupy 4KB of space on the disk. In the same way, the size of all data files in innodb is always an integer multiple of 16384 (16k).

Detailed introduction to mysql innodb index principle (code example)

So in MySQL, a block node that stores the index occupies 16K, and each IO operation of MySQL will use the system's read-ahead capability to load 16K at a time. In this way, it is very wasteful to put only one index value on this node, because only one index value can be obtained at one time IO, so a binary tree cannot be used.

B tree is a multi-path search tree. One node can hold n values, n = 16K / the size of each index value.
For example, if the index field size is 1Kb, then each node can theoretically store 16 index values. In this case, the binary tree can only load one index value per IO, while the B-tree can load 16.

B The number of ways in the tree is n 1, where n is the number of values ​​that exist in each node. For example, each node stores 16 values, then the tree has 17 ways.

It can also be seen from here that B-tree nodes can store multiple values, so the B-tree index cannot find a specific row with a given key value. The B-tree can only find the specific page where the data row is stored, then read the page into memory, and then search for the specified data in memory.

Attachment: The difference between B-tree and B-tree is that the non-leaf nodes of B-tree only contain navigation information and do not contain actual values. All leaf nodes and connected nodes are connected using linked lists to facilitate interval Find and traverse.

Auxiliary index

is also called a non-clustered index. Its leaf nodes do not contain all the data of the row records. In addition to the key values, the leaf nodes contain the index rows in each leaf node. Also contains a bookmark, which is the clustered index key for the corresponding row.

The following figure can show the relationship between the auxiliary index and the clustered index (the picture is from the Internet, just look at the general meaning):

Detailed introduction to mysql innodb index principle (code example)

When using the auxiliary index When searching for data, the innodb storage engine will obtain the primary key that only wants the primary key index through the auxiliary index leaf node, and then find the complete row record through the primary key index.

For example, if you want to search for data in an auxiliary index tree with a height of 3, you need to perform 3 IOs on the auxiliary index tree to find the specified primary key. If the height of the clustered index tree is also 3, then you need to Search the clustered index tree three times to finally find a page where the complete row data is located, so a total of six IO accesses are needed to get the final data page.

The indexes created, such as joint indexes, unique indexes, etc., are all non-clustered indexes.

Joint index

Joint index refers to indexing multiple columns on the table. The joint index is also a B-tree. The difference is that the number of key values ​​in the joint index is not 1, but greater than or equal to 2.

For example, there is a user table with fields id, age, and name. It is found that the following two SQLs are used most frequently:

Select * from user where age = ? ;
Select * from user where age = ? and name = ?;

At this time, there is no need to build two separate indexes for age and name. You only need to build the following joint index:

create index idx_age_name on user(age, name)

Another benefit of the joint index is that the second key value has been sorted, which can sometimes avoid an additional sorting operation.

Covering index

Covering index means that all the field values ​​required for the query can be obtained from the auxiliary index without querying the records in the clustered index. The advantage of the covering index is that the auxiliary index does not contain all the information of the entire row record, so its size is much smaller than the clustered index, so it can reduce a large number of IO operations.

For example, if there is a joint index (age, name) above, if it is as follows:

select age,name from user where age=?

You can use the covering index.

Another benefit of covering indexes is for statistical issues, such as:

select count(*) from user

The innodb storage engine does not choose to perform statistics by querying the clustered index. Since there is an auxiliary index on the user table, and the auxiliary index is much smaller than the clustered index, selecting the auxiliary index can reduce IO operations.

Notes

  • Only build appropriate indexes, not redundant ones
Because every time data is added or deleted, the B-tree must be adjusted. If multiple indexes are created, multiple B-trees must be adjusted. The more trees and the larger the structure, the more time-consuming and resource-consuming this adjustment is. If these unnecessary indexes are reduced, disk usage may be greatly reduced.
  • The data length of the index column can be as little as possible.

The smaller the index data length, the more indexes are stored in each block, and the more values ​​can be obtained in one IO.

  • The matching column prefix can be used in the index like 9999%, like �99%, like �99 cannot use the index;
  • Where conditions in in and or can use the index , not in and operations cannot use the index;

If it is not in or , facing the B-tree, the engine does not know which node it should start from.

  • Match range values, order by can also be used to index;
  • Use specified column queries more often, only return the data columns you think of, use select * less;

There is no need to query useless fields, and if you do not use *, you may still hit the covering index;

  • If you do not start searching from the leftmost column of the index in the joint index, you cannot Use the index;

Leftmost matching principle;

  • In the joint index, the index can be used to accurately match the leftmost front column and range match another column;
  • In the joint index, if there is a range query for a certain column in the query, all columns to the right of it cannot be indexed

The above is the detailed content of Detailed introduction to mysql innodb index principle (code example). For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:segmentfault.com. If there is any infringement, please contact admin@php.cn delete