How to use clustered index, auxiliary index, covering index and joint index in MySQL
A clustered index is a B+ tree built on the primary key of a table, and the row data of the entire table is stored in its leaf nodes.
Let's look at an example to get an intuitive feel for the clustered index.
Create table t, and arrange for each page to hold only two row records (I'm not sure how to force exactly two row records per page):
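One way to get roughly two rows per page is to make each row take up a little under half of an InnoDB page. The following is a minimal sketch under that assumption. Columns a and c match the queries used later in this article; the wide filler column b and the inserted values are assumptions made for illustration and may not match the book's exact statements.

```sql
-- Sketch only: assumes the default 16 KB InnoDB page size.
-- Column b is a hypothetical wide filler column. Each row stores about
-- 7000 bytes in it, so only two rows fit in one leaf page.
CREATE TABLE t (
  a INT NOT NULL,
  b VARCHAR(8000),
  c INT NOT NULL,
  PRIMARY KEY (a)
) ENGINE=InnoDB;

INSERT INTO t VALUES (1, REPEAT('a', 7000), -1);
INSERT INTO t VALUES (2, REPEAT('a', 7000), -2);
INSERT INTO t VALUES (3, REPEAT('a', 7000), -3);
INSERT INTO t VALUES (4, REPEAT('a', 7000), -4);
```

With four such rows the clustered index should need at least two leaf pages plus a root page, which is enough to see the tree structure.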
Using analysis tools, the author of "MySQL Technology Insider" obtained the following rough structure of this clustered index tree:
The leaf nodes of a clustered index are called data pages. The data pages are linked to one another in a doubly linked list and are arranged in primary key order.
As shown in the figure, each data page stores complete row records, whereas the non-leaf index pages store only key values and offsets pointing to data pages, not complete row records.
If a primary key is defined, InnoDB uses it as the clustered index. When no primary key is defined, InnoDB uses the first unique index whose columns are all NOT NULL as the clustered index. If there is no such index either, InnoDB generates a hidden clustered index on an internal row ID column.
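The three cases can be sketched with three small tables. The table names below are hypothetical, and the final query assumes MySQL 8.0, where the information_schema.INNODB_INDEXES and INNODB_TABLES views are available (older versions name these views differently, and reading them requires the PROCESS privilege).

```sql
-- Case 1: explicit primary key, so the table is clustered on id.
CREATE TABLE ci_pk (id INT NOT NULL, v INT, PRIMARY KEY (id)) ENGINE=InnoDB;

-- Case 2: no primary key, but a UNIQUE index on a NOT NULL column,
-- so that unique index serves as the clustered index.
CREATE TABLE ci_uk (code INT NOT NULL, v INT, UNIQUE KEY uk_code (code)) ENGINE=InnoDB;

-- Case 3: neither, so InnoDB builds a hidden clustered index on an internal row ID.
CREATE TABLE ci_none (v INT) ENGINE=InnoDB;

-- List the indexes InnoDB actually created for these tables.
SELECT t.NAME AS table_name, i.NAME AS index_name
FROM information_schema.INNODB_INDEXES AS i
JOIN information_schema.INNODB_TABLES AS t ON t.TABLE_ID = i.TABLE_ID
WHERE t.NAME LIKE '%/ci%';
```

For ci_none the listed index is expected to be GEN_CLUST_INDEX, the name InnoDB gives to the hidden clustered index.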
An auxiliary index is also called a non-clustered index. Unlike the clustered index, its leaf nodes do not contain all the data of the row records. In addition to the key value, each index entry in a leaf node contains a bookmark, which tells InnoDB where to find the row data corresponding to the index entry.
Let's use the example in "MySQL Technology Insider" to see what an auxiliary index looks like.
Still taking the above table t as an example, create a non-clustered index on column c:
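A minimal sketch of that step, assuming table t has an integer column c as in the rest of this example. The ALTER TABLE ... ADD KEY form is the same one used later in this article:

```sql
-- Add a secondary (non-clustered) index named idx_c on column c.
ALTER TABLE t ADD KEY idx_c (c);
```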
Using analysis tools, the author then obtained the following diagram of the relationship between the auxiliary index and the clustered index:
You can see that the leaf node of the auxiliary index idx_c contains the value of column c and the value of the primary key.
For example, suppose the Key value is 0x7fffffff. The leading hex digit 7 is 0111 in binary, and its highest bit, 0, marks a negative number, because InnoDB stores signed integers with the sign bit flipped. Flipping that bit back gives 0xffffffff, which is -1 in two's complement, and this is the value in column c. The Pointer value 0x80000001 holds the primary key: the leading hex digit 8 is 1000 in binary, its highest bit, 1, marks a positive number, and the remaining bits give the primary key value 1.
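The decoding can be checked with plain arithmetic. The sketch below assumes each field is a 4-byte signed INT stored with its sign bit flipped, so reading the four bytes as an unsigned number and subtracting 2^31 (2147483648) recovers the value:

```sql
-- 0x7fffffff read as unsigned is 2147483647; minus 2^31 gives the c value.
SELECT CAST(CONV('7FFFFFFF', 16, 10) AS SIGNED) - 2147483648 AS c_value;   -- -1

-- 0x80000001 read as unsigned is 2147483649; minus 2^31 gives the primary key.
SELECT CAST(CONV('80000001', 16, 10) AS SIGNED) - 2147483648 AS pk_value;  -- 1
```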
InnoDB supports covering indexes: when an auxiliary index contains every column a query needs, the query result can be read directly from the auxiliary index, without looking up records in the clustered index.
What are the benefits of using a covering index?
It can greatly reduce IO operations
We know from the above figure that to query fields that are not included in the auxiliary index, you must first traverse the auxiliary index and then traverse the clustered index. If the field values to be queried all exist in the auxiliary index, there is no need to look in the clustered index, which obviously reduces IO operations.
For example, in the picture above, the following SQL can be answered from the auxiliary index alone:
select a from t where c = -2;
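Whether a query is served from the auxiliary index alone can be checked with EXPLAIN. A sketch, assuming table t has primary key a, the index idx_c on c, and at least one other column; the comments describe the expected outcome, which can vary with table size and MySQL version:

```sql
-- Covered: idx_c entries hold c plus the primary key a, so the
-- Extra column is expected to include "Using index".
EXPLAIN SELECT a FROM t WHERE c = -2;

-- Not covered: the remaining columns live only in the clustered index,
-- so each match needs an extra lookup by primary key.
EXPLAIN SELECT * FROM t WHERE c = -2;
```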
It helps with statistical queries such as counts
Assume the following table:
CREATE TABLE `student` (
  `id` bigint(20) NOT NULL,
  `name` varchar(255) NOT NULL,
  `age` varchar(255) NOT NULL,
  `school` varchar(255) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_name` (`name`),
  KEY `idx_school_age` (`school`,`age`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
If executed on this table:
select count(*) from student
How will the optimizer handle it?
The optimizer will choose an auxiliary index for the count. Traversing either the clustered index or an auxiliary index gives the same result, but an auxiliary index is much smaller than the clustered index, so scanning it takes less IO. Execute the explain command:
The key and Extra columns of the output show that the auxiliary index idx_name is used.
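The statement in question would look like the following. This is a sketch; the exact plan depends on the MySQL version and the table statistics.

```sql
-- Expect key = idx_name and Extra = "Using index" when the optimizer
-- scans the smaller secondary index instead of the clustered index.
EXPLAIN SELECT COUNT(*) FROM student;
```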
Also, assume that the following sql is executed:
select * from student where age > 10 and age < 15
Because the joint index idx_school_age is ordered by school first and then by age, a query that filters only on age usually cannot use the index:
However, if the conditions stay the same and the query returns a row count instead of all fields:
select count(*) from student where age > 10 and age < 15
The optimizer will choose this joint index:
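The two plans can be compared side by side. A sketch against the student table defined earlier in this article; the comments describe the commonly expected outcome, not a guaranteed one:

```sql
-- Needs every column, and age is not the leading column of any index:
-- a full table scan (type = ALL) is the usual result.
EXPLAIN SELECT * FROM student WHERE age > 10 AND age < 15;

-- Needs only age, which idx_school_age (school, age) contains, so the
-- optimizer can scan that whole index (type = index, "Using index")
-- rather than the wider clustered index.
EXPLAIN SELECT COUNT(*) FROM student WHERE age > 10 AND age < 15;
```

In the second plan the index is used as a covering index for a full index scan, not for a range lookup on age.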
A joint index is an index on multiple columns of a table.
The following is an example of creating a joint index idx_a_b:
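A minimal sketch of such a table. The table name t_ab and the id column are hypothetical; only the index name idx_a_b and the columns a and b come from the text, and the sample rows are the (a, b) pairs discussed in this section.

```sql
-- Hypothetical table with a joint index on (a, b).
CREATE TABLE t_ab (
  id INT NOT NULL AUTO_INCREMENT,
  a INT NOT NULL,
  b INT NOT NULL,
  PRIMARY KEY (id),
  KEY idx_a_b (a, b)
) ENGINE=InnoDB;

-- Rows can be inserted in any order; the index keeps its entries sorted by (a, b).
INSERT INTO t_ab (a, b) VALUES (3, 2), (1, 2), (2, 4), (1, 1), (3, 1), (2, 1);
```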
Internal structure of the joint index:
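A joint index is also a B+ tree; the difference is that it has two or more key columns. The key values are sorted, so all the data can be read in logical order through the leaf nodes. The entries (1,1), (1,2), (2,1), (2,4), (3,1), (3,2) are stored in (a,b) order, comparing a first and then b.
Given this structure, the following queries can clearly use the joint index (a,b):
select * from table where a=xxx and b=xxx;
select * from table where a=xxx;
However, the following SQL cannot use this joint index, because the b values in the leaf nodes, 1, 2, 1, 4, 1, 2, are clearly not sorted.
select * from table where b=xxx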
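The same leftmost-prefix behaviour can be observed on the student table defined earlier in this article, whose joint index idx_school_age is on (school, age). A sketch; the literal values are made up, and on a very small table the optimizer may still prefer a full scan:

```sql
-- Leading column plus second column, or leading column alone: index usable.
EXPLAIN SELECT * FROM student WHERE school = 'A' AND age = '12';
EXPLAIN SELECT * FROM student WHERE school = 'A';

-- Second column alone: idx_school_age cannot be used for a lookup.
EXPLAIN SELECT * FROM student WHERE age = '12';
```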
A second benefit of a joint index is that the second key column is already sorted. For example:
create table buy_log(
    userid int not null,
    buy_date DATE
)ENGINE=InnoDB;
insert into buy_log values(1, '2009-01-01');
insert into buy_log values(2, '2009-02-01');
alter table buy_log add key(userid);
alter table buy_log add key(userid, buy_date);
When you execute
select * from buy_log where userid = 2;
the optimizer chooses key(userid). But when you execute the following SQL:
select * from buy_log where userid = 2 order by buy_date desc;
the optimizer chooses key(userid, buy_date), because buy_date is already sorted within each userid value.
If key(userid, buy_date) is dropped and the same statement is executed again:
select * from buy_log where userid = 2 order by buy_date desc;
the optimizer chooses key(userid), but it then has to run a filesort on the result, that is, sort the rows again by buy_date. So a benefit of the joint index is that it can avoid the filesort.
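The three cases can be reproduced with EXPLAIN. A sketch, assuming the two buy_log indexes were created without explicit names, in which case MySQL should auto-name them userid and userid_2 (confirm with SHOW INDEX FROM buy_log before dropping anything):

```sql
-- Plain lookup, then lookup with ORDER BY while the joint index exists.
EXPLAIN SELECT * FROM buy_log WHERE userid = 2;
EXPLAIN SELECT * FROM buy_log WHERE userid = 2 ORDER BY buy_date DESC;

-- Drop the joint index, then look for "Using filesort" in the Extra column.
ALTER TABLE buy_log DROP KEY userid_2;
EXPLAIN SELECT * FROM buy_log WHERE userid = 2 ORDER BY buy_date DESC;
```

With only two rows in the table the optimizer's choices may differ from those described; loading more rows makes the difference between the plans easier to see.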