
What is the role of MySQL index

王林
2023-05-27 23:52:39

    First create a database table:

    create table single_table(
        id int not null auto_increment,
        key1 varchar(100),
        key2 int,
        key3 varchar(100),
        key_part1 varchar(100),
        key_part2 varchar(100),
        key_part3 varchar(100),
        common_field varchar(100),
        primary key(id),          # clustered index
        key idx_key1(key1),       # secondary index
        unique key uk_key2(key2), # secondary index, and a unique one
        key idx_key3(key3),       # secondary index
        key idx_key_part(key_part1,key_part2,key_part3) # secondary index, also a joint (composite) index
    )Engine=InnoDB CHARSET=utf8;

    1. Indexes are used to reduce the number of records that need to be scanned

    The most basic execution plan for a query is to scan every record in the table and check whether each one meets the search conditions. A record that matches is sent to the client; otherwise it is skipped. This execution plan is called a full table scan.

    For the InnoDB storage engine, a full table scan starts at the first record of the first leaf node of the clustered index and follows the singly linked list of records forward until it reaches the last record of the last leaf node. If the B+ tree can be used to locate the records whose index column value equals a given value, far fewer records need to be scanned.

    Because the records in the leaf nodes of a B+ tree are sorted in ascending order of the index column values, scanning only the records in one or more intervals can significantly reduce the number of records that need to be scanned.

    For the query statement:

    select * from single_table where id>=2 and id<=100;

    This statement looks for all clustered index records whose id value lies in the interval [2,100]. Using the B+ tree of the clustered index, we can quickly locate the clustered index record with id=2 and then scan forward along the singly linked list of records until we reach a clustered index record whose id value falls outside [2,100]. Compared with scanning every clustered index record, this greatly reduces the number of records scanned and thus improves query efficiency.
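    The difference between the two plans can be sketched with a toy model in Python. This is not how InnoDB is implemented: a sorted list stands in for the leaf level of the clustered index, `bisect` stands in for the descent through the B+ tree, and the table size of 10,000 ids is an assumption made only for the illustration.

```python
import bisect

# Toy stand-in for the leaf level of the clustered index: ids in ascending order.
# (A real B+ tree keeps records in linked pages; a sorted list only mimics the ordering.)
ids = list(range(1, 10001))  # hypothetical table with ids 1..10000


def full_scan(ids, low, high):
    """Check every record against the condition low <= id <= high."""
    examined = 0
    matches = []
    for value in ids:
        examined += 1
        if low <= value <= high:
            matches.append(value)
    return matches, examined


def range_scan(ids, low, high):
    """Locate the first id >= low, then walk forward until an id exceeds high."""
    start = bisect.bisect_left(ids, low)  # plays the role of the B+ tree descent
    examined = 0
    matches = []
    for position in range(start, len(ids)):
        examined += 1
        if ids[position] > high:
            break  # first record outside [low, high]: stop scanning
        matches.append(ids[position])
    return matches, examined


full_matches, full_examined = full_scan(ids, 2, 100)
range_matches, range_examined = range_scan(ids, 2, 100)
print(full_examined, range_examined)
```

    Both functions return the same 99 ids, but the range scan only touches the records in the interval plus the one record that tells it to stop.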

    In fact, for a B+ tree index, whenever an index column is compared with a constant using =, <=>, in, not in, is null, is not null, >, <, >=, <=, between, != (also written <>) or like, a scan interval can be generated, which can improve query efficiency.
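    As a rough illustration of what a scan interval is, the sketch below maps a few of these operators onto slices of a sorted list. The helper `scan` and the sample values are hypothetical; the real optimizer works on index pages, not Python lists.

```python
import bisect

# Hypothetical values of an indexed column, kept in ascending order like index records.
values = [3, 8, 15, 21, 34, 55, 89]


def scan(values, low=None, high=None, low_inclusive=True, high_inclusive=True):
    """Return the part of `values` that falls inside one scan interval."""
    if low is None:
        start = 0
    elif low_inclusive:
        start = bisect.bisect_left(values, low)
    else:
        start = bisect.bisect_right(values, low)
    if high is None:
        end = len(values)
    elif high_inclusive:
        end = bisect.bisect_right(values, high)
    else:
        end = bisect.bisect_left(values, high)
    return values[start:end]


equal_15 = scan(values, 15, 15)                         # col = 15   -> [15, 15]
greater_21 = scan(values, low=21, low_inclusive=False)  # col > 21   -> (21, +inf)
at_most_15 = scan(values, high=15)                      # col <= 15  -> (-inf, 15]
# col in (8, 34, 40) -> three single-point intervals scanned one after another
in_list = [v for point in (8, 34, 40) for v in scan(values, point, point)]
print(equal_15, greater_21, at_most_15, in_list)
```

    Each condition touches only the slice it describes, and a condition such as in produces several small intervals rather than one.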

    2. Indexes are used for sorting

    When we write query statements, we often need to use the order by clause to sort the queried records according to certain rules. Under normal circumstances, we can only load records into memory, and then use some sorting algorithms to sort these records in memory. Sometimes the query result set may be too large to be sorted in memory. In this case, it is necessary to temporarily use disk space to store the intermediate results, and then return the sorted results to the client after the sorting operation is completed.

    In MySQL, this kind of sorting in memory or on disk is called filesort. If the order by clause uses index columns, however, the sorting step in memory or on disk may be skipped altogether.

    1. Analyze the following query statement:

    select * from single_table order by key_part1,key_part2,key_part3 limit 10;

    The result set of this query statement must be sorted by key_part1 first; records with the same key_part1 value are then sorted by key_part2, and records with the same key_part1 and key_part2 values are sorted by key_part3. The joint index idx_key_part that we created is sorted by exactly these rules. The following is a simplified diagram of the idx_key_part index:

    [Figure: simplified diagram of the idx_key_part index]

    So we can start from the first idx_key_part secondary index record and scan forward along the singly linked list of records to obtain 10 secondary index records. Because the select list is *, the complete user record is needed, so for each secondary index record we perform a table lookup (going back to the clustered index) and send the complete user record to the client. This saves the time that sorting 10,000 records would take.
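    The two ways of answering this query can be compared with a small Python model. Everything here is an assumption for illustration: the dict `table` stands in for the clustered index, the sorted list `idx_key_part` stands in for the secondary index, and the row contents are random. The id is added as a tie-breaker in both paths only so that the two results can be compared exactly.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical table contents: id -> full row (stands in for the clustered index).
table = {
    row_id: {
        "id": row_id,
        "key_part1": random.choice("abc"),
        "key_part2": random.choice("xyz"),
        "key_part3": random.choice("pqr"),
        "common_field": f"row-{row_id}",
    }
    for row_id in range(1, 10001)
}

# Stand-in for idx_key_part: entries ordered by (key_part1, key_part2, key_part3),
# each carrying the primary key so the full row can be fetched afterwards.
idx_key_part = sorted(
    (r["key_part1"], r["key_part2"], r["key_part3"], r["id"]) for r in table.values()
)


def top_n_via_index(n):
    """Read the first n index entries, then do one table lookup per entry."""
    return [table[row_id] for (_, _, _, row_id) in idx_key_part[:n]]


def top_n_via_sort(n):
    """No usable index: sort every row, then keep the first n."""
    ordered = sorted(
        table.values(),
        key=lambda r: (r["key_part1"], r["key_part2"], r["key_part3"], r["id"]),
    )
    return ordered[:n]


via_index = top_n_via_index(10)
via_sort = top_n_via_sort(10)
print(via_index == via_sort)
```

    The index path reads 10 entries and performs 10 lookups, while the sort path has to order all 10,000 rows before it can return the first 10. The cost of keeping the index sorted is paid when rows are written, not when the query runs.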

    Note that the query above includes a limit clause. Without a limit on the number of records fetched, a large number of secondary index records would each require a table lookup, which would hurt overall performance.

    2. Things to note when using a joint index for sorting

    When sorting with a joint index, note that the columns in the order by clause must be listed in the same order as the index columns; if they are given as order by key_part3, key_part2, key_part1, the B+ tree index cannot be used.

    Reversing the column order prevents the index from being used because the sort order of pages and records in the joint index is fixed: records are sorted by key_part1 first; records with the same key_part1 value are sorted by key_part2; records with the same key_part1 and key_part2 values are sorted by key_part3. An order by key_part3, key_part2, key_part1 clause instead requires sorting by key_part3 first, then by key_part2 among records with the same key_part3 value, and then by key_part1 among records with the same key_part3 and key_part2 values, which clearly conflicts with the index order.
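    The conflict is easy to see on a handful of rows. In the sketch below, the five tuples are made-up (key_part1, key_part2, key_part3) values and Python's tuple ordering plays the role of the index order.

```python
# Hypothetical (key_part1, key_part2, key_part3) values, stored in index order.
index_entries = sorted([
    ("a", "y", "r"),
    ("a", "x", "q"),
    ("b", "x", "p"),
    ("a", "x", "p"),
    ("b", "y", "q"),
])

# ORDER BY key_part1, key_part2, key_part3: the same comparison the index uses.
same_order = sorted(index_entries, key=lambda e: (e[0], e[1], e[2]))

# ORDER BY key_part3, key_part2, key_part1: compare on the reversed tuple.
reversed_order = sorted(index_entries, key=lambda e: (e[2], e[1], e[0]))

print(index_entries == same_order)      # the index order already satisfies the query
print(index_entries == reversed_order)  # the index order does not match, a sort is needed
```

    Reading the index from start to finish yields the first ordering for free, but the second ordering interleaves rows from different parts of the index.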

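    3. Cases where an index cannot be used for sorting:

    (1) Mixing ASC and DESC;

    When a joint index is used for sorting, all sort columns must use the same direction: either every column is sorted in ascending order or every column is sorted in descending order.

    (2) The sort columns include columns that do not belong to the same index;

    Sometimes the columns used for sorting do not belong to the same index. An index cannot be used for sorting in this case either, as in the following query statement: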

    select * from single_table order by key1,key2 limit 10;

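    In the idx_key1 secondary index, records are sorted only by the value of the key1 column; records with the same key1 value are not sorted by the value of the key2 column, so the idx_key1 index cannot be used to execute the query above.

    (3) The sort columns are columns of one joint index, but they are not contiguous in that index;

    (4) The sort columns do not appear in the order by clause as bare column names;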
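    Cases (1) and (3) can be checked on sample data with a short Python sketch. It only tests whether reading some made-up index entries forward or backward happens to satisfy a given ordering; the helpers `compare`, `is_ordered` and `index_order_satisfies` are hypothetical and do not reproduce the optimizer's actual decision logic.

```python
COLUMNS = ("key_part1", "key_part2", "key_part3")

# Hypothetical index entries in idx_key_part order, and the same entries read backward.
index_forward = sorted([
    ("a", "x", "p"),
    ("a", "x", "q"),
    ("a", "y", "p"),
    ("b", "x", "q"),
    ("b", "y", "p"),
])
index_backward = index_forward[::-1]


def compare(a, b, spec):
    """Compare two entries under spec, a list of (column_name, 'asc' or 'desc')."""
    for column, direction in spec:
        i = COLUMNS.index(column)
        if a[i] != b[i]:
            result = -1 if a[i] < b[i] else 1
            return result if direction == "asc" else -result
    return 0


def is_ordered(rows, spec):
    """True if every adjacent pair of rows respects the requested ordering."""
    return all(compare(a, b, spec) <= 0 for a, b in zip(rows, rows[1:]))


def index_order_satisfies(spec):
    """True if a forward or a backward read of the index already yields the ordering."""
    return is_ordered(index_forward, spec) or is_ordered(index_backward, spec)


all_asc = index_order_satisfies([(c, "asc") for c in COLUMNS])
all_desc = index_order_satisfies([(c, "desc") for c in COLUMNS])
mixed = index_order_satisfies([("key_part1", "asc"), ("key_part2", "desc")])
not_contiguous = index_order_satisfies([("key_part1", "asc"), ("key_part3", "asc")])
left_prefix = index_order_satisfies([("key_part1", "asc"), ("key_part2", "asc")])
print(all_asc, all_desc, mixed, not_contiguous, left_prefix)
```

    On this sample, a uniform direction or a leftmost prefix of the index columns is satisfied by one pass over the index, while mixing directions or skipping key_part2 is not.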

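    3. Indexes are used for grouping

    Sometimes, to make it easier to compute statistics over a table, its records are grouped by certain columns, as in the following grouping query statement: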

    select key_part1,key_part2,key_part3,count(*) from single_table group by key_part1,key_part2,key_part3;

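    This query statement is equivalent to performing three grouping operations:

    • First, the records are grouped by key_part1 value, so that all records with the same key_part1 value form one group;

    • Within each group of records sharing a key_part1 value, the records are grouped again by key_part2 value, so that records with the same key_part2 value go into a smaller group; in effect, each large group is subdivided into many small groups.

    • The small groups produced in the previous step are then divided by key_part3 value into even smaller groups. Overall, the records are first split into large groups, each large group is split into several small groups, and each small group is finally split into still smaller groups.

    The query statement above counts the number of records in each of the smallest groups.

    Without the idx_key_part index, a temporary table has to be created for the statistics; while the records of the clustered index are scanned, intermediate results are written into this temporary table. Once all records have been scanned, the contents of the temporary table are sent to the client as the result set.

    With the idx_key_part index, and because the grouping order happens to match the order of the index columns in idx_key_part, the idx_key_part secondary index can be used directly for grouping, with no need for a temporary table.

    Much like sorting with a B+ tree index, the order of the grouping columns must also match the order of the index columns; it is also possible to group by only the leftmost contiguous columns of the index.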
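    The two grouping strategies can be mimicked in Python. This is only a sketch on made-up rows: a dict plays the temporary table, and `itertools.groupby`, which merges adjacent equal keys, plays the single pass over entries that are already in index order.

```python
from itertools import groupby

# Hypothetical (key_part1, key_part2, key_part3) values in arbitrary (table) order.
rows = [
    ("a", "x", "p"),
    ("b", "y", "q"),
    ("a", "x", "p"),
    ("a", "y", "p"),
    ("b", "y", "q"),
    ("a", "x", "p"),
]

# The same values in idx_key_part order.
index_entries = sorted(rows)


def count_with_temp_table(rows):
    """Rows arrive in arbitrary order: accumulate counts in a dict (the 'temporary table')."""
    counts = {}
    for row in rows:
        counts[row] = counts.get(row, 0) + 1
    return counts


def count_streaming(entries, prefix_len=3):
    """Entries arrive in index order, so equal keys are adjacent: one pass, no lookup table."""
    return [
        (key, sum(1 for _ in group))
        for key, group in groupby(entries, key=lambda e: e[:prefix_len])
    ]


temp_table_counts = count_with_temp_table(rows)
streamed_counts = count_streaming(index_entries)
# Grouping by the leftmost column only (group by key_part1) also works on index order.
by_first_column = count_streaming(index_entries, prefix_len=1)
print(streamed_counts, by_first_column)
```

    The streaming version emits each group as soon as the key changes, which works only because the index keeps equal keys next to each other.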


    Statement:
    This article is reproduced from yisu.com. In case of infringement, please contact admin@php.cn to have it removed.