15 MySQL index failure scenarios you should know (to help you avoid pitfalls quickly)

青灯夜游 (reposted), 2022-03-01 11:17:41

This article summarizes 15 MySQL index failure scenarios to help you steer clear of common pitfalls. I hope it helps!

Whether you are a seasoned expert or a newcomer to the industry, sooner or later you will run into the problem of MySQL not using an index. The typical symptom: an index has been added to a column, but it does not take effect.

A few days ago I ran into a slightly unusual case: the same SQL statement used the index with some parameter values but not with others. Why is that?

Besides, whether for interviews or for day-to-day work, it is worth knowing the common situations in which a MySQL index fails.

To make them easier to learn and remember, this article summarizes 15 common situations in which the index is not used and demonstrates each with an example. It is worth saving for future reference.

Database and index preparation

Create table structure

To verify index usage case by case, we first prepare a table, t_user:

CREATE TABLE `t_user` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'ID',
  `id_no` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL COMMENT 'ID number',
  `username` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL COMMENT 'Username',
  `age` int(11) DEFAULT NULL COMMENT 'Age',
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
  PRIMARY KEY (`id`),
  KEY `union_idx` (`id_no`,`username`,`age`),
  KEY `create_time_idx` (`create_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

The table above has three indexes:

  • id: the primary key of the table;
  • union_idx: a composite index on id_no, username and age;
  • create_time_idx: an ordinary index on create_time.

Initialization data

The initialization data comes in two parts: basic data and batch-imported data.

The basic data consists of 4 rows. The creation time of the 4th row is in the future, which is used later to verify a special scenario:

INSERT INTO `t_user` (`id`, `id_no`, `username`, `age`, `create_time`) VALUES (null, '1001', 'Tom1', 11, '2022-02-27 09:04:23');
INSERT INTO `t_user` (`id`, `id_no`, `username`, `age`, `create_time`) VALUES (null, '1002', 'Tom2', 12, '2022-02-26 09:04:23');
INSERT INTO `t_user` (`id`, `id_no`, `username`, `age`, `create_time`) VALUES (null, '1003', 'Tom3', 13, '2022-02-25 09:04:23');
INSERT INTO `t_user` (`id`, `id_no`, `username`, `age`, `create_time`) VALUES (null, '1004', 'Tom4', 14, '2023-02-25 09:04:23');

In addition to the basic data, there is a stored procedure, together with the SQL that calls it, for inserting data in batches. It is used to verify scenarios that need more data:

-- Drop the old stored procedure if it exists
DROP PROCEDURE IF EXISTS `insert_t_user`;

-- Create the stored procedure
DELIMITER $

CREATE PROCEDURE insert_t_user(IN limit_num INT)
BEGIN
    DECLARE i INT DEFAULT 10;
    DECLARE id_no VARCHAR(18);
    DECLARE username VARCHAR(32);
    DECLARE age TINYINT DEFAULT 1;
    -- Insert one row for each i from 10 up to limit_num - 1
    WHILE i < limit_num DO
        SET id_no = CONCAT('NO', i);
        SET username = CONCAT('Tom', i);
        -- Random age of 10 or 11
        SET age = FLOOR(10 + RAND() * 2);
        INSERT INTO `t_user` VALUES (NULL, id_no, username, age, NOW());
        SET i = i + 1;
    END WHILE;
END $

-- Restore the default delimiter
DELIMITER ;

-- Call the stored procedure
call insert_t_user(100);

You do not have to create and run the stored procedure right away; you can do so later, when it is needed.

Database version and execution plan

View the current database version:

select version();
8.0.18

The database version I tested on is 8.0.18. The examples below can, of course, also be verified on other versions.

To view the execution plan of a SQL statement, we generally use the explain keyword and judge index usage from its output.

Execution example:

explain select * from t_user where id = 1;

The execution result shows that this statement uses the primary key index (PRIMARY) and that key_len is 4.

key_len is the number of bytes of the index that the query uses. From this value you can judge how the index is being used, which matters most for composite indexes, where it tells you how much of the index is actually used.

With the data and background above in place, let's go through the specific index failure scenarios.

1 The composite index does not satisfy the leftmost matching principle

A composite index follows the leftmost matching principle (also called the leftmost-prefix rule): as the name suggests, the leftmost column of the index is matched first. For that reason, when you create a composite index, put the column that appears most often in where clauses on the far left.

For a query to use the composite index, the leftmost column must appear in the query conditions.

In the example, the composite index union_idx is defined as:

KEY `union_idx` (`id_no`,`username`,`age`)

The leftmost column is id_no. Normally, as long as id_no appears in the query conditions, the query uses the composite index.

Example 1:

explain select * from t_user where id_no = '1002';

The explain result shows that this statement uses the union_idx index.

Here is a brief walk-through of how key_len is calculated in this case:

  • id_no is of type varchar(18) and its character set is utf8mb4, in which one character can take up to 4 bytes. So far, key_len = 18 * 4 = 72;
  • varchar is a variable-length type, so 2 more bytes are needed to store the length. Now key_len = 72 + 2 = 74;
  • The column allows NULL (DEFAULT NULL), so 1 more byte is needed. Now key_len = 74 + 1 = 75.

That is the key_len calculation for one case. Later examples do not repeat the derivation; it is enough to know the components and the principle, and you can look up the details yourself.
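The same arithmetic can be applied to the other columns of union_idx. The following is a minimal sketch that simply lets MySQL do the sums; the column aliases are made up for illustration, and the figures assume the table definition above (utf8mb4, all three columns nullable):

```sql
-- key_len contribution of each column in union_idx, following the rules above:
--   varchar(N), utf8mb4: N * 4 bytes + 2 length bytes, + 1 byte if the column allows NULL
--   int:                 4 bytes,                      + 1 byte if the column allows NULL
select
    18 * 4 + 2 + 1 as id_no_key_len,     -- varchar(18), nullable
    32 * 4 + 2 + 1 as username_key_len,  -- varchar(32), nullable
    4 + 1          as age_key_len;       -- int, nullable
```

Under these assumptions the three values are 75, 131 and 5, so a lookup that uses id_no and username would be expected to report a key_len of 75 + 131 = 206, and one that uses all three columns 211.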

Example 2:

explain select * from t_user where id_no = '1002' and username = 'Tom2';

The query clearly still uses union_idx. Applying the key_len analysis above, we can infer that the lookup uses not only the id_no column but also the username column.

Example 3:

explain select * from t_user where id_no = '1002' and age = 12;

The union_idx index is used, but as in Example 1 only the id_no column is used.

There is also the case where all three columns appear in the query conditions, which is not shown here. All of the above are positive examples that use the index, that is, examples that satisfy the leftmost matching principle. Next, let's look at negative examples that do not.

Negative example:

explain select * from t_user where username = 'Tom2' and age = 12;

This time the explain result shows that no index is used; in other words, the index has failed.

Likewise, any combination of conditions that leaves out the leftmost column also fails to use the index:

explain select * from t_user where age = 12;
explain select * from t_user where username = 'Tom2';

So the first index failure scenario is: with a composite index, the query conditions do not satisfy the leftmost matching principle.

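2 Using select *

The ORM mapping chapter of the Alibaba Development Manual contains a [Mandatory] rule:

[Mandatory] In table queries, never use * as the list of columns to query; the columns that are needed must be written out explicitly. Notes: 1) It increases the parsing cost for the query analyzer. 2) Adding or removing columns can easily become inconsistent with the resultMap configuration. 3) Unused columns increase network consumption, especially columns of type text.

The manual says nothing about indexes, but a side benefit of banning select * is that in some cases the query can use a covering index.

For example, with the composite index above, if the query condition is age or username and select * is used, the index is definitely not used.

But if you want to look up id_no, username and age by username (all three are index columns), naming the result columns explicitly allows a covering index to be used:

explain select id_no, username, age from t_user where username = 'Tom2';
explain select id_no, username, age from t_user where age = 12;

According to the explain result, the index is used whether the condition is username or age, and key_len shows that all columns of the index are used.

The second index failure scenario: with a composite index, name the query columns explicitly wherever possible so that the query can use a covering index.

This case is really an optimization item: if the business scenario allows it, try to get the SQL statement to use the index. As for the rule in the Alibaba Development Manual, the two merely happen to coincide; the rule itself was not written with this index behavior in mind.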

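3 The index column takes part in a calculation

Let's go straight to the example:

explain select * from t_user where id + 1 = 2 ;

The explain result shows that, even though the id column has an index, the calculation prevents the index from being used normally.

This is not only an index problem; it also adds to the computational load on the database. Taking the statement above as an example, the database has to scan the whole table to read every id value, perform the calculation on each one, and then compare the result with the parameter value. If every execution goes through these steps, the performance cost is easy to imagine.

The recommended approach is to compute the expected value in application memory first, or to do the calculation on the parameter side, that is, on the right-hand side of the condition.

The example above can be optimized as follows:

-- Calculated in memory: the id to query is 1
explain select * from t_user where id = 1 ;
-- Calculated on the parameter side
explain select * from t_user where id = 2 - 1 ;

The third index failure scenario: the index column takes part in a calculation, which leads to a full table scan and index failure.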

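4 A function is applied to the index column

Example:

explain select * from t_user where SUBSTR(id_no,1,3) = '100';

In this example a function (SUBSTR, which extracts a substring) is applied to the index column, and the explain result shows that the index fails.

The reason is the same as in the third scenario: the database first has to scan the whole table, and only after fetching the data can it extract the substring and compare, so the index cannot be used. This also comes with a performance cost.

The example only uses SUBSTR, but similar functions such as CONCAT behave the same way. For the solution, refer to the third scenario: consider doing the processing in memory, or otherwise reducing the amount of content processing done by the database.

The fourth index failure scenario: the index column is processed by a function, which leads to a full table scan and index failure.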
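When the function only picks a range out of the column, the condition can often be rewritten so that the bare column is compared with constants. The sketch below uses a hypothetical date filter on create_time (the DATE() query is not one of the tests in this article and is only an illustration), rewritten as a half-open range that create_time_idx can serve. Whether the optimizer actually picks the index still depends on the data, so verify with explain:

```sql
-- Function applied to the indexed column: create_time_idx cannot be used for a range lookup
explain select * from t_user where DATE(create_time) = '2022-02-27';

-- Equivalent half-open range on the bare column: create_time_idx becomes usable
explain select * from t_user
where create_time >= '2022-02-27 00:00:00'
  and create_time <  '2022-02-28 00:00:00';
```

The half-open form (>= start of day, < start of next day) avoids having to guess the last representable time of the day.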

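5 Incorrect use of like

Example:

explain select * from t_user where id_no like '%00%';

like is used very often, but used carelessly it frequently prevents the index from being used, as the explain result for this example shows. The common forms of like are:

  • Form 1: like '%abc';
  • Form 2: like 'abc%';
  • Form 3: like '%abc%';

Forms 1 and 3 cannot use the index because the wildcard appears at the start of the pattern. The reason is easy to understand: an index is like a table of contents whose entries are sorted from left to right. With a wildcard on the left of the pattern there is nothing to match against that ordering, so it is only natural that the index fails.

The fifth index failure scenario: in a fuzzy query (like), the wildcard is at the start of the pattern.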
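For comparison, here is a sketch of form 2, where the wildcard appears only at the end. The pattern '100%' is chosen to match the basic data inserted earlier. Because the fixed prefix comes first, the lookup can follow the sort order of id_no, the leftmost column of union_idx; as always, the optimizer has the final say, so check with explain:

```sql
-- Wildcard only at the end: a range scan on union_idx is possible
explain select * from t_user where id_no like '100%';
```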

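6 Implicit type conversion

Example:

explain select * from t_user where id_no = 1002;

The id_no column is of type varchar, but the SQL statement compares it with an int value, and the explain result shows a full table scan.

The index fails because varchar and int are two different types. To compare them, MySQL converts the id_no value of every row to a number, which has the same effect as applying a function to the index column.

The solution is to wrap the parameter 1002 in single or double quotes.

The sixth index failure scenario: the parameter type does not match the column type, which triggers an implicit type conversion and index failure.

There is one special case. If the column is of type int and the value in the condition is wrapped in single or double quotes, MySQL converts the parameter to int despite the quotes:

explain select * from t_user where id = '2';

The statement above still uses the index.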

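7 Using OR

OR is one of the most frequently used keywords, but used carelessly it also leads to index failure.

Example:

explain select * from t_user where id = 2 or username = 'Tom2';

Does the explain result surprise you? The id column clearly has an index, yet because of the or keyword the index fails.

Look at it from another angle: a query on username alone is obviously a full table scan. Since the whole table is being scanned anyway, going through the index again for the id condition would be wasted effort. So when you use the or keyword, remember that the columns in both conditions must be indexed, otherwise the index fails.

However, if the two sides of or use ">" and "<" at the same time, the index also fails:

explain select * from t_user where id  > 1 or id  < 80;

The seventh index failure scenario: the query condition uses the or keyword and one of the columns has no index, which makes the index fail for the whole statement; the index also fails when the two sides of or are ">" and "<" range conditions.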
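One way to act on the advice that both sides of the or should be indexed is sketched below. It adds a single-column index on username; the index name username_idx is an assumption made for illustration and is not part of the original table. With both columns indexed the optimizer is able to consider an index merge, although on a small table it may still prefer a full scan, so check the result with explain:

```sql
-- Hypothetical extra index so that the username side of the or is indexed too
alter table t_user add index username_idx (username);

-- Both or branches now have a usable index
explain select * from t_user where id = 2 or username = 'Tom2';

-- Alternative: split the or into two queries that can each use their own index
explain
select * from t_user where id = 2
union
select * from t_user where username = 'Tom2';
```

If you try this, drop the index again afterwards (alter table t_user drop index username_idx;), otherwise the examples in this article that rely on username having no index of its own will behave differently.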

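8 Comparing two columns

If two columns are both indexed but the query condition compares one column with the other, the index fails.

Here is a somewhat contrived example that compares id with age (in a real scenario the two columns would hold comparable data; here we make do with the existing table structure):

explain select * from t_user where id > age;

According to the explain result, even though id has an index, and age could be given one as well, the index still fails when the two columns are compared.

The eighth index failure scenario: two columns are compared with each other; even if both columns are indexed, the index fails.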

9 Not-equal comparison

Example:

explain select * from t_user where id_no <> '1002';

When the query condition is a string, using "<>" or "!=" may prevent the index from being used, but not always.

explain select * from t_user where create_time != '2022-02-27 09:56:42';

In the statement above, '2022-02-27 09:56:42' is a time at which the stored procedure generated a large number of rows within the same second. Running it shows that the index is used when the result set is a small share of the table and is not used when the share is large. The outcome depends on the ratio of the result set to the whole table.

Note that if the not-equal comparison is made on id, the index is used normally:

explain select * from t_user where id != 2;

The ninth index failure scenario: take care when using not-equal comparisons in query conditions; with an ordinary index, the index fails when the result set makes up a large share of the table.

10 is not null

Example:

explain select * from t_user where id_no is not null;

The tenth index failure scenario: a query condition that uses is null uses the index normally, while one that uses is not null does not.

11 not in and not exists

The range queries most often used in day-to-day work include in, exists, not in, not exists, between and, and so on.

explain select * from t_user where id in (2,3);

explain select * from t_user where id_no in ('1001','1002');

explain select * from t_user u1 where exists (select 1 from t_user u2 where u2.id  = 2 and u2.id = u1.id);

explain select * from t_user where id_no between '1002' and '1003';

All four statements above use the index normally, so their explain results are not shown. The interesting cases are the ones that do not use the index:

explain select * from t_user where id_no not in('1002' , '1003');

So not in does not use the index? Let's try again with the primary key as the condition column:

explain select * from t_user where id not in (2,3);

With the primary key, the index is used normally.

The eleventh index failure scenario: when the query condition uses not in, the index is used if the column is the primary key, but fails if the column has an ordinary index.

Now let's look at not exists:

explain select * from t_user u1 where not exists (select 1 from t_user u2 where u2.id  = 2 and u2.id = u1.id);

When the query condition uses not exists, the index is not used.

The twelfth index failure scenario: the query condition uses not exists, and the index fails.

12 order by leads to index failure

Example:

explain select * from t_user order by id_no ;

The index failure in this case is easy to understand: after all, the data of the whole table has to be sorted.

So does adding the limit keyword make the query use the index?

explain select * from t_user order by id_no limit 10;

The index is still not used. I have seen claims online that a query uses the index normally when the order by columns satisfy the leftmost matching principle, but that did not happen in the 8.0.18 version tested here. So take particular care when using order by together with limit. Whether the index is used depends not only on the database version but also on how the MySQL optimizer handles the query.

There is one special case: order by on the primary key uses the index normally.

explain select * from t_user order by id desc;

As the explain result shows, order by on the primary key does use the index.

In addition, I tested the following SQL statements:

explain select id from t_user order by age;
explain select id , username from t_user order by age;
explain select id_no from t_user order by id_no;

All three statements use the index. In other words, covering-index queries can also use the index normally.

Now combine id and id_no in the order by:

explain select * from t_user order by id,id_no desc;
explain select * from t_user order by id,id_no desc limit 10;
explain select * from t_user order by id_no desc,username desc;

The statements above did not use the index.

The thirteenth index failure scenario: when a query involves order by, limit and similar clauses, whether the index is used is fairly complicated and depends on the MySQL version. With an ordinary index, the index is usually not used if there is no limit. With order by on several index columns, the index may not be used. In other cases, verify with explain before relying on the index.

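13 Different parameters lead to index failure

If you have not yet run the stored procedure created at the beginning, run it now, and then execute the following SQL:

explain select * from t_user where create_time > '2023-02-24 09:04:23';

The time here is in the future, which ensures that the query finds data.

The explain result shows that the index is used normally.

Next, change the date in the query condition:

explain select * from t_user where create_time > '2022-02-27 09:04:23';

This time the explain result shows a full table scan. This is the strange phenomenon mentioned at the beginning.

Why does the same query, with nothing changed but the parameter value, use the index in one case and not in the other?

The answer is simple: the index fails because the DBMS determines that a full table scan is more efficient than using the index, so it gives up the index.

In other words, when MySQL estimates that the index scan would touch more than roughly 10%-30% of the rows in the table, the optimizer may give up the index and switch to a full table scan. This figure is a rule of thumb; the optimizer decides by estimated cost rather than by a fixed threshold. In some cases, even forcing the statement to use the index does not help.

Similar problems can occur with other range queries (for example >, <, >=, <= and in).

The fourteenth index failure scenario: when the query condition is a range query such as greater than, less than or in, the optimizer may give up the index and perform a full table scan, depending on what proportion of the table the result set covers.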
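To see the optimizer's choice more directly, the sketch below compares the default plan with one that carries an index hint. FORCE INDEX is standard MySQL syntax, but treat it here as a diagnostic rather than a fix: the hint only tells the optimizer to regard a table scan as very expensive, and the forced plan can turn out slower than the full scan it replaces:

```sql
-- Default plan: the optimizer may choose a full table scan for this range
explain select * from t_user where create_time > '2022-02-27 09:04:23';

-- Same query with an index hint on create_time_idx
explain select * from t_user force index (create_time_idx)
where create_time > '2022-02-27 09:04:23';
```

Comparing the rows estimate in the two plans gives a rough idea of why the optimizer preferred the scan.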

14 Others

There are, of course, other rules that determine whether an index is used. They also depend on the type of index, for example whether it is a B-tree index or a bitmap index, so they are not covered in detail here.

These remaining cases can be grouped together as the fifteenth index failure scenario: other strategies of the MySQL optimizer. For example, when the optimizer judges that a full table scan is faster than using the index, it gives up the index.

In general you do not need to pay much attention to these cases. When a problem shows up, investigate that specific query.

Summary

This article has summarized 15 common index failure scenarios. Most of them behave the same everywhere, but a small number vary with the MySQL version. It is therefore worth saving this article and comparing it against what you see in practice. When you are unsure, run explain to verify.

Statement: This article is reproduced from juejin.cn.