Home  >  Article  >  Database  >  The relationship between MySQL not exists and indexes

The relationship between MySQL not exists and indexes

coldplay.xixi
coldplay.xixiOriginal
2020-09-01 17:30:242566browse

The relationship between MySQL not exists and indexes

In some business scenarios, the NOT EXISTS statement will be used to ensure that the returned data does not exist in a specific collection. Some colleagues will find that the performance of NOT EXISTS in some scenarios is poor, and even some online There is a rumor that "NOT EXISTS does not use the index". How can we optimize the NOT EXISTS statement?

Take today’s optimized SQL as an example. The SQL before optimization is:

SELECT count(1) FROM t_monitor m WHERE NOT exists (  SELECT 1   FROM t_alarm_realtime AS a   WHERE a.resource_id=m.resource_id   AND a.resource_type=m.resource_type   AND a.monitor_name=m.monitor_name)

We use the LEFT JOIN method to optimize. The SQL after optimization is:

SELECT count(1) FROM t_monitor m LEFT JOIN t_alarm_realtime AS a    ON a.resource_id=m.resource_id   AND a.resource_type=m.resource_type   AND a.monitor_name=m.monitor_name WHERE a.resource_id is NULL

Optimization Effect:

The execution time before optimization is more than 29 seconds, and after optimization it is 1.2 seconds, and the optimization is improved by 25 times.

NOT EXISTS really not indexed?

View the execution plans of two SQLs!

Execution plan using NOT EXIST method:

The relationship between MySQL not exists and indexes

##Execution plan using LEFT JOIN method:

The relationship between MySQL not exists and indexes

From the execution plan, both tables use index , the difference is that NOT EXISTS uses the "DEPENDENT SUBQUERY" method , while LEFT JOIN uses ordinary table association.

Recommend reading:Why can indexes improve query speed?

Use the

Profiling method provided by MySQL to view the execution process of the two methods.

Execution process using NOT EXIST mode:

The relationship between MySQL not exists and indexes

The execution process of using LEFT JOIN method:

The relationship between MySQL not exists and indexes

From the execution process, the main consumption of LEFT JOIN method is Sending data One item (1.2s), while the NOT EXISTS method mainly consumes two items: executing and Sending data, which is limited by the fact that Profiling only stores 100 rows of records.

From Profiling, we can only see 47 combinations of "executing and Sending data" (each combination is about 50us). From the execution plan, we can see that the amount of data in the outer t_monitor is 578436 rows, ignoring statistics. If the information is inaccurate, using the NOT EXISTS method should produce 578436 combinations of "executing and Sending data", with a total consumption time of =50μs*578436=28921800us=28.92s.

It can be inferred from the above execution process:

The execution performance of NOT EXISTS method depends heavily on the number of execution times of NOT EXISTS subquery, that is, The amount of data in the outer query result set.

  1. When the data volume N of the outer query result set is small, the execution performance is better. If N=10, the execution time is 50μs*10= 500us=0.005s, plus some additional consumption, the execution result can also be within the range of 0.01 seconds or 10 milliseconds. This response time should be acceptable to most applications.

  2. When the data volume N of the outer Chengxun result set is large or even tens of millions of data, the query performance of NOT EXISTS will become very bad. It may even consume a large amount of server IO and CPU resources, affecting the normal operation of other businesses.

In addition to the above problems, during the optimization process it was discovered that the resource_id column that should store the same data is defined differently in the two tables, one table is VARCHAR and the other table is BIGINT, The field type of the external result set is different from the field type in the NOT EXIST word table, resulting in the index being unable to be used in the NOT EXISTS subquery, resulting in poor subquery performance and ultimately affecting the execution performance of the entire query.

Jingdong Mall has also seen a large number of similar cases. Some tables use VARCHAR to store order numbers, while other tables use BIGINT to store them. When managing the two tables, the performance is extremely poor. I hope my R&D colleagues will take this as a warning. Follow the public account Java Technology Stack and reply to m36 to get a copy of MySQL R&D military regulations.

Related learning recommendations: mysql video tutorial

The above is the detailed content of The relationship between MySQL not exists and indexes. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn