
MySql index things

little bottle
2019-04-04 16:46:50

Today we will talk about MySQL indexes. This article mainly covers the data structure of indexes under InnoDB, how indexes work, and how to use indexes well to improve query efficiency.


1. What is an index?

A database index is a sorted data structure in a database management system that helps to quickly query and update data in database tables. Just like the table of contents of the Xinhua Dictionary, it helps us quickly find a certain word.

2. Index classification

Classification angle    Index name
Data structure          B-tree, Hash index, R-Tree, etc.
Storage level           Clustered index, non-clustered index
Logical level           Primary key index, ordinary index, compound index, unique index, spatial index, etc.

3. Index instance analysis (taking InnoDB as an example)

3.1 Index structure under InnoDB

Under InnoDB, a table is stored as an index ordered by its primary key. This storage method is called a clustered index. "Clustered" means that data rows with adjacent key values are stored compactly together; that is, the data rows are actually stored in the leaf pages of the index. Let's create a table to illustrate the index structure under InnoDB. The table creation statement is as follows:

create table person(
  id int primary key,
  age int not null,
  index (age)
) engine=InnoDB;

Then we insert five rows: (1,15), (2,17), (6,20), (10,18), (19,21). The index tree structure is as follows:

[Figure: index trees of the person table: the clustered index on id and the non-clustered index on age]

The picture above shows two parts. The first is the clustered index (primary key index): the data is sorted by id, and each index entry contains the entire row of data.

The second is the index built on age, which is a non-clustered index (non-primary key index). This index is sorted by age, but unlike the primary key index, each age entry points to an id. In other words, a non-primary key index record stores the value of the primary key.
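As a rough mental model of the two index trees described here, the following Python sketch uses sorted lists in place of B+tree leaf pages. The names `clustered` and `age_index` are illustrative only and are not InnoDB internals.

```python
# Toy model: sorted Python lists stand in for the leaf level of each index tree.
rows = [(1, 15), (2, 17), (6, 20), (10, 18), (19, 21)]  # (id, age) from the text

# Clustered (primary key) index: ordered by id, each entry holds the whole row.
clustered = sorted(
    ((row_id, {"id": row_id, "age": age}) for row_id, age in rows),
    key=lambda entry: entry[0],
)

# Non-clustered index on age: ordered by age, each entry holds only (age, id).
age_index = sorted((age, row_id) for row_id, age in rows)

for key, row in clustered:
    print("clustered", key, "->", row)
for age, row_id in age_index:
    print("age_index", age, "->", row_id)
```

Note that the age index carries no row data of its own, only the primary key needed to find the row in the clustered index.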

Some readers may have a question here: if I do not specify a primary key when creating the table, what does the index structure look like? In InnoDB, if no primary key is defined, the first unique index whose columns are all NOT NULL is used instead. If there is no such index, InnoDB implicitly defines a hidden primary key to serve as the clustered index. So whether or not you set a primary key, InnoDB still organizes the data in the form shown in the figure above. Next we analyze how an index query proceeds.
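The fallback order for picking the clustered key can be sketched as a small decision function. This is only an illustration of the rule stated above; `choose_clustered_key`, the column names `email` and `phone`, and the placeholder string for the hidden key are all hypothetical.

```python
from typing import Optional, Sequence


def choose_clustered_key(primary_key: Optional[str],
                         unique_not_null_indexes: Sequence[str]) -> str:
    """Return the key a toy engine would cluster the table on."""
    if primary_key is not None:
        return primary_key
    if unique_not_null_indexes:
        # No primary key: fall back to the first unique NOT NULL index.
        return unique_not_null_indexes[0]
    # Neither exists: an implicit hidden key is used as the clustered index.
    return "<hidden row id>"


print(choose_clustered_key("id", ["email"]))
print(choose_clustered_key(None, ["email", "phone"]))
print(choose_clustered_key(None, []))
```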

3.2 Index Query Analysis

Suppose we execute the query select * from person where id = 6. Because the query filters directly on the primary key id, the primary key index is used. Since a primary key index entry holds the entire row, the engine only needs to search one index tree to get the result.

If the SQL statement filters on a non-primary key index instead:

select * from person where age = 18

This statement uses the ordinary index on age. The engine first searches the age index for entries equal to 18 and finds the record with id = 10, then searches the primary key index once more to fetch the requested data. The process of going from an ordinary index to the primary key index and then reading the row is called a table return. Because a table return requires one more index search, a primary key lookup is faster than an ordinary index lookup, so we should query by primary key where possible.
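To make the extra step visible, here is a minimal Python sketch that counts index searches for both kinds of lookup. It is a toy model with sorted lists and `bisect`, not how InnoDB is implemented; `find_by_id`, `find_by_age`, and the `stats` counter are names made up for the illustration.

```python
from bisect import bisect_left

rows = [(1, 15), (2, 17), (6, 20), (10, 18), (19, 21)]  # (id, age) from the text
pk_keys = sorted(row_id for row_id, _ in rows)
pk_rows = {row_id: {"id": row_id, "age": age} for row_id, age in rows}
age_index = sorted((age, row_id) for row_id, age in rows)


def find_by_id(row_id, stats):
    """One search of the primary key index returns the full row."""
    stats["tree_searches"] += 1
    pos = bisect_left(pk_keys, row_id)
    if pos < len(pk_keys) and pk_keys[pos] == row_id:
        return pk_rows[row_id]
    return None


def find_by_age(age, stats):
    """Search the age index for ids, then return to the primary key index."""
    stats["tree_searches"] += 1
    pos = bisect_left(age_index, (age,))
    result = []
    while pos < len(age_index) and age_index[pos][0] == age:
        result.append(find_by_id(age_index[pos][1], stats))  # table return
        pos += 1
    return result


stats = {"tree_searches": 0}
print(find_by_id(6, stats), stats)
stats = {"tree_searches": 0}
print(find_by_age(18, stats), stats)
```

In this model the lookup by id costs one search, while the lookup by age costs one search of the age index plus one more for each matching row.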

3.3 Covering Index

We usually create indexes based on the where condition of a query, but that is only the habitual practice. From the analysis above, we know that to get high query efficiency we should, first, use the primary key index and, second, avoid table returns, that is, get the desired data from the index itself whenever possible. If an index contains all the fields a query needs, we call it a "covering index".

So how do we create a covering index? One way is a joint index: when all the fields to be queried are contained in the fields of the joint index, the query is covered by the index.

Let's modify the table creation statement above to see how to achieve a covering index.

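CREATE TABLE `person` (
  `id` int(11) NOT NULL,
  `age` int(11) DEFAULT NULL,
  `name` varchar(20) DEFAULT NULL,
  `sex` varchar(1) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name_age` (`name`, `age`)  -- joint index on (name, age); the index name is assumed
) ENGINE=InnoDB;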

Above I created a joint index on name and age. The index structure is shown below:

[Figure: structure of the joint index on (name, age)]

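From the figure we can see that a joint index depends on the order of the fields used to create it. In this example, entries are sorted by name first, and entries with the same name are then sorted by age. After creating the table, how do we get the covering index effect? Some readers may already know how to write the SQL. It is as follows: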

select name,age from person where name = "Barry"

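Because the fields we need, name and age, can both be read directly from the index, there is no need to find the primary key id and then return to the table.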
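A small Python sketch can show when a joint index on (name, age) covers a query and when it does not. The rows below are hypothetical (only the name "Barry" appears in the text), and `select_by_name` is an illustrative helper, not a MySQL API.

```python
from bisect import bisect_left

# Hypothetical rows: (id, age, name, sex).
people = [
    (1, 30, "Alice", "F"),
    (2, 25, "Barry", "M"),
    (3, 41, "Barry", "M"),
    (4, 19, "Cindy", "F"),
]
table = {p[0]: {"id": p[0], "age": p[1], "name": p[2], "sex": p[3]} for p in people}

# Joint index entries are (name, age, id): sorted by name, then by age.
name_age_index = sorted((p[2], p[1], p[0]) for p in people)
INDEX_COLUMNS = {"name", "age", "id"}


def select_by_name(columns, name):
    """Return (rows, table_returns) for a query filtering on name."""
    covered = set(columns) <= INDEX_COLUMNS
    table_returns = 0
    out = []
    pos = bisect_left(name_age_index, (name,))
    while pos < len(name_age_index) and name_age_index[pos][0] == name:
        idx_name, idx_age, row_id = name_age_index[pos]
        if covered:
            row = {"name": idx_name, "age": idx_age, "id": row_id}
        else:
            row = table[row_id]  # table return for columns not in the index
            table_returns += 1
        out.append({c: row[c] for c in columns})
        pos += 1
    return out, table_returns


print(select_by_name(["name", "age"], "Barry"))
print(select_by_name(["name", "age", "sex"], "Barry"))
```

In this model, asking for name and age is answered from the index alone, while adding sex forces one table return per matching row.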

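At this point some readers will surely say: in that case, why not just build a joint index for every combination of fields I need to query? The answer is no. Indexes consume space, and maintaining them has a cost, which I will come back to in the advantages and disadvantages below. So is there another way to avoid table returns as much as possible? This is where MySQL's leftmost prefix principle comes in.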

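What is the leftmost prefix principle? In index matching, a query can use the leftmost N fields of a joint index, or the leftmost N characters of a string index. For example, to query the names starting with A in the figure above, the statement is: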

select name from person where name like 'A%'

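This query satisfies the leftmost prefix rule and can use the index. The engine locates the first name starting with A in the index and then scans forward until the condition is no longer met.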
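The "seek to the first match, then scan forward" behaviour of a string prefix query can be sketched in Python with a sorted list and `bisect`. The names in the list are hypothetical, `like_prefix` is an illustrative helper, and the comparison here is case-sensitive, which may differ from the collation of a real MySQL column.

```python
from bisect import bisect_left

# Hypothetical index on name, kept in sorted order.
names = sorted(["Barry", "Alice", "Cindy", "Adam", "Bob"])


def like_prefix(sorted_names, prefix):
    """Emulate name LIKE 'prefix%' on a sorted index."""
    pos = bisect_left(sorted_names, prefix)  # seek to the first candidate
    matches = []
    # Scan forward until an entry no longer starts with the prefix.
    while pos < len(sorted_names) and sorted_names[pos].startswith(prefix):
        matches.append(sorted_names[pos])
        pos += 1
    return matches


print(like_prefix(names, "A"))
print(like_prefix(names, "B"))
```

A pattern such as '%A' has no fixed leftmost characters, so in this model there would be no starting position to seek to and every entry would have to be checked.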

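So what do the leftmost N fields mean? With the index (name, age), name alone can be used as if it were a separate index. A query may use only part of the fields of a joint index, but they must follow the index order. For example, with the index (a, b, c), the leftmost prefix rule allows queries on a and on (a, b) to use the index.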

We can also use this rule to maintain fewer indexes. For example, if we need queries on a, on (a, b), and on (a, b, c), the single joint index (a, b, c) is enough.
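The column-order part of the rule can be expressed as a short Python function. This is a simplified sketch that only considers equality conditions and ignores optimizer features beyond the rule described here; `usable_prefix` is a hypothetical helper name.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the given conditions can use."""
    used = []
    for col in index_columns:
        if col not in equality_columns:
            break  # a gap ends the usable prefix
        used.append(col)
    return used


index = ["a", "b", "c"]
print(usable_prefix(index, {"a"}))
print(usable_prefix(index, {"a", "b"}))
print(usable_prefix(index, {"a", "b", "c"}))
print(usable_prefix(index, {"b", "c"}))
print(usable_prefix(index, {"a", "c"}))
```

Conditions on b and c alone use none of the index in this model, and conditions on a and c use only the a part, because b is missing in between.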

3.4 Index Condition Pushdown

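MySQL 5.6 introduced a new feature called "index condition pushdown", also known as index pushdown. So what is it? As the name suggests, conditions on fields contained in the index can be evaluated inside the index, filtering out records that do not satisfy them and reducing the number of table returns.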

Taking the data in the figure above as an example, the SQL is as follows:

select * from person where name like 'A%' and age = 19;

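Without index pushdown, the engine first uses the index to find all records whose name starts with A, reads their ids, returns to the table to fetch the row for each id, and only then checks age = 19 and returns the rows that satisfy it. Because two records start with A, this requires two table returns.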

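With index pushdown, InnoDB checks age = 19 directly inside the index and filters out the records that do not satisfy it. Only one record remains, so only one table return is needed, which improves performance.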
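The difference in table returns can be reproduced with a toy Python model. The data below is hypothetical, chosen so that two names start with A and only one of them has age 19, as in the description; `query` and its `pushdown` flag are illustrative and do not correspond to a MySQL API.

```python
from bisect import bisect_left

# Hypothetical rows: (id, name, age).
people = [(1, "Adam", 19), (2, "Alice", 23), (3, "Barry", 19), (4, "Cindy", 30)]
table = {i: {"id": i, "name": n, "age": a} for i, n, a in people}
name_age_index = sorted((n, a, i) for i, n, a in people)  # (name, age, id)


def query(prefix, age, pushdown):
    """Emulate: select * from person where name like 'prefix%' and age = ?"""
    table_returns = 0
    result = []
    pos = bisect_left(name_age_index, (prefix,))
    while pos < len(name_age_index) and name_age_index[pos][0].startswith(prefix):
        _, idx_age, row_id = name_age_index[pos]
        pos += 1
        if pushdown and idx_age != age:
            continue  # rejected inside the index, no table return needed
        row = table[row_id]  # table return
        table_returns += 1
        if row["age"] == age:
            result.append(row)
    return result, table_returns


print(query("A", 19, pushdown=False))
print(query("A", 19, pushdown=True))
```

Both calls return the same row, but the version without pushdown fetches two rows from the table and the version with pushdown fetches one.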

3.5 Advantages and disadvantages of indexes

Having said so much about indexes, let's talk about their advantages and disadvantages.

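Advantages:

Indexes reduce the amount of data the server needs to scan.
Indexes help the server avoid sorting and temporary tables.
Indexes turn random I/O into sequential I/O.

Disadvantages:

Indexes take up additional storage space.
Maintaining indexes has a cost: after data is inserted, the existing indexes must be kept in order, which affects database performance to some extent.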
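The maintenance cost can be illustrated with a toy table in Python, where every insert has to place a new entry at the right position in each index. `ToyTable` and its `index_writes` counter are hypothetical names used only for this sketch, and the sample rows are made up.

```python
from bisect import insort


class ToyTable:
    """A table whose secondary indexes are sorted lists of (value, id)."""

    def __init__(self, index_columns):
        self.rows = {}
        self.indexes = {col: [] for col in index_columns}
        self.index_writes = 0

    def insert(self, row):
        self.rows[row["id"]] = row
        for col, entries in self.indexes.items():
            # Each index must stay sorted, so every insert touches every index.
            insort(entries, (row[col], row["id"]))
            self.index_writes += 1


t = ToyTable(["age", "name"])
t.insert({"id": 2, "age": 17, "name": "Bob"})
t.insert({"id": 1, "age": 15, "name": "Cindy"})
print(t.indexes["age"], t.index_writes)
```

With two indexes, two inserts cause four index writes in this model, so each additional index adds work to every write as well as extra storage.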

5. Summary

In this post I covered the definition of an index and the common index types seen from different classification angles. I then focused on the index data structure under InnoDB. The difference between a primary key index and a non-primary key index is that a primary key lookup returns the data directly, while a non-primary key lookup first finds the primary key id and then fetches the data, a process called table return. Covering indexes reduce the number of table returns and thereby improve performance. Since MySQL 5.6, InnoDB supports index pushdown: when a joint index is used and a condition can be evaluated inside the index, rows that do not satisfy it are filtered there, which further reduces table returns.

6. References

"High Performance MySQL", 3rd Edition

"MySQL 45 Lectures" column



Statement:
This article is reproduced from oschina.net. If there is any infringement, please contact admin@php.cn to have it deleted.