How does MySQL optimize IN subqueries?

PHPz
2023-04-21 11:23:52

In day-to-day development we often use an IN subquery, which matches a column against the set of values returned by another query and makes filtering convenient. With large data volumes, however, this pattern can cause serious performance problems. This article describes several ways to optimize IN subqueries in MySQL.

1. Avoid using IN subqueries

In real projects we often see queries written like this:

SELECT *
FROM table1
WHERE col1 IN (SELECT col1 FROM table2 WHERE condition);  -- condition: placeholder for the filter on table2

This is the simplest form of IN subquery: it selects the col1 values from the rows of table2 that satisfy the condition, compares them with col1 in the outer table, and returns the matching rows. Written this way, the query can become a bottleneck. Depending on the version and the plan the optimizer chooses, MySQL may materialize the subquery's result set in a temporary table in memory (spilling to disk when it is large) and probe it for every IN check, or, in versions before 5.6, re-execute the subquery for each outer row. Either way, a large subquery result set can mean heavy I/O and high memory use.

Therefore, in practice, consider rewriting IN subqueries as joins where possible.
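
Before rewriting anything, it can help to check how the server actually plans the query. The following is a minimal sketch, not a definitive recipe. It assumes hypothetical tables table1 and table2 (the outer table is called table1 here because TABLE is a reserved word in MySQL) and an illustrative status column standing in for the condition placeholder; none of these definitions come from the original example.

```sql
-- Hypothetical schema, for illustration only.
DROP TABLE IF EXISTS table1, table2;
CREATE TABLE table1 (id INT PRIMARY KEY, col1 INT);
CREATE TABLE table2 (id INT PRIMARY KEY, col1 INT, status VARCHAR(10));

-- Ask the optimizer for its plan instead of guessing.
EXPLAIN
SELECT *
FROM table1
WHERE col1 IN (SELECT col1 FROM table2 WHERE status = 'active');
```

In the output, a select_type of DEPENDENT SUBQUERY typically means the subquery is re-evaluated per outer row, while MATERIALIZED, or a plan in which both tables appear as SIMPLE, suggests the optimizer has already turned the subquery into a materialized lookup or a semijoin.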

2. Use JOIN to replace the IN subquery

The filtering logic stays the same as in the subquery version; the IN subquery is simply rewritten as a join, which often executes much more efficiently, particularly on older MySQL versions. The two tables are joined on col1 and the condition moves to the WHERE clause, as shown below. Note that a join returns one row per matching table2 row, so it is equivalent to the IN form only when col1 is unique among the table2 rows that satisfy the condition; otherwise the duplicates must be removed, for example with DISTINCT.

SELECT table1.*
FROM table1
JOIN table2 ON table1.col1 = table2.col1
WHERE table2.condition;  -- condition: placeholder for the filter on table2

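Compared with the IN subquery, the join lets the optimizer combine the two tables directly, choosing the join order and indexes itself, instead of building and probing a separate subquery result set. This can cut memory and disk reads considerably.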
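
One thing to watch when switching from IN to JOIN is duplicate rows. The sketch below uses hypothetical tables and data (the table definitions and values are assumptions made for illustration) to show that the two forms differ when table2.col1 is not unique, and one way to restore the IN semantics.

```sql
-- Hypothetical data: table2 contains col1 = 1 twice.
DROP TABLE IF EXISTS table1, table2;
CREATE TABLE table1 (id INT PRIMARY KEY, col1 INT);
CREATE TABLE table2 (id INT PRIMARY KEY, col1 INT);
INSERT INTO table1 VALUES (1, 1), (2, 2);
INSERT INTO table2 VALUES (10, 1), (11, 1);

-- IN returns the table1 row (1, 1) once.
SELECT * FROM table1 WHERE col1 IN (SELECT col1 FROM table2);

-- A plain JOIN returns that row twice, once per matching table2 row.
SELECT table1.* FROM table1 JOIN table2 ON table1.col1 = table2.col1;

-- Deduplicating table2 first gives the same result as IN.
SELECT table1.*
FROM table1
JOIN (SELECT DISTINCT col1 FROM table2) AS t2 ON table1.col1 = t2.col1;
```

If table2.col1 is covered by a primary key or unique index, the plain join is already safe and the derived table is unnecessary.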

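3. Use EXISTS instead of the IN subquery

Rewriting the IN subquery as a correlated EXISTS subquery is closely related to the join approach. An EXISTS check only has to establish whether at least one matching row exists, so it can stop at the first match regardless of how large the subquery's result set is, and unlike a plain join it never duplicates rows of the outer table. The following is an example of the EXISTS syntax: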

SELECT *
FROM table1
WHERE EXISTS (SELECT 1 FROM table2 WHERE table1.col1 = table2.col1 AND table2.condition);

Replacing IN with EXISTS can improve efficiency considerably and save I/O and memory, particularly when table2.col1 is indexed so that each existence check is a quick index lookup.

4. Use indexes to optimize the IN statement

If an index can be used when the IN subquery executes, query efficiency improves greatly. The most commonly used kinds of MySQL index are the primary key index, the unique index, and the ordinary (non-unique) index. A suitable index lets MySQL avoid a full table scan.

CREATE INDEX idx_col1 ON table1 (col1);

When the table is large, this index greatly improves query efficiency and reduces the performance problems caused by IN subqueries. An index on col1 of table2 helps in the same way on the subquery side.
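
As a sketch of how both sides of the comparison might be indexed, the example below again assumes hypothetical tables and an illustrative status column in place of the condition placeholder; the index names are made up. Whether the optimizer actually uses these indexes depends on the data, so the plan should be checked with EXPLAIN.

```sql
-- Hypothetical schema, for illustration only.
DROP TABLE IF EXISTS table1, table2;
CREATE TABLE table1 (id INT PRIMARY KEY, col1 INT);
CREATE TABLE table2 (id INT PRIMARY KEY, col1 INT, status VARCHAR(10));

-- Index the column that is compared on the outer side ...
CREATE INDEX idx_t1_col1 ON table1 (col1);
-- ... and, on the subquery side, the filter column followed by the matched column.
CREATE INDEX idx_t2_status_col1 ON table2 (status, col1);

-- The key column of the EXPLAIN output shows which index, if any, was chosen.
EXPLAIN
SELECT *
FROM table1
WHERE col1 IN (SELECT col1 FROM table2 WHERE status = 'active');
```

The composite index on (status, col1) is a design choice for this assumed filter: it lets the subquery be answered from the index alone, without reading the table2 rows.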

5. Use LIMIT and EXISTS to optimize the IN subquery

If the result set of the IN subquery is very large, we can combine EXISTS with LIMIT to query it in pages while avoiding a full table scan, which improves query efficiency.

SELECT *
FROM table1
WHERE EXISTS (SELECT 1 FROM table2 WHERE table1.col1 = table2.col1 AND table2.condition LIMIT 1000, 20);

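This statement checks, for each row of the outer table, whether table2 contains a matching row. Note that the LIMIT here sits inside the subquery: it applies to the rows the subquery produces for each outer row (skip 1000, then take at most 20), not to the final result set. To return 20 rows of the outer query starting after the first 1000, the LIMIT clause has to be placed on the outer SELECT, ideally together with an ORDER BY so that the pages are stable.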
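
If the aim is to page through the matching rows, note that a LIMIT inside EXISTS constrains only the subquery, so one option is to put the LIMIT on the outer query instead. This is a minimal sketch under assumptions: the id primary key and the status column are hypothetical and stand in for whatever ordering column and condition the real tables have.

```sql
-- Hypothetical schema, for illustration only.
DROP TABLE IF EXISTS table1, table2;
CREATE TABLE table1 (id INT PRIMARY KEY, col1 INT);
CREATE TABLE table2 (id INT PRIMARY KEY, col1 INT, status VARCHAR(10));

-- Page over the outer rows that have at least one match in table2.
SELECT *
FROM table1
WHERE EXISTS (SELECT 1
              FROM table2
              WHERE table2.col1 = table1.col1
                AND table2.status = 'active')
ORDER BY table1.id      -- a deterministic order keeps pages from overlapping
LIMIT 1000, 20;         -- skip the first 1000 matching rows, return the next 20
```

With large offsets the server still has to walk past the skipped rows, so this reduces the amount of data returned rather than the amount examined.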

6. Use memory appropriately to optimize the IN statement

If the IN subquery returns only a small number of values, we can replace it with a user-defined variable assigned with SET. The variable holds the subquery's values in memory for the session, so later queries can match against it without running the subquery again, which can also improve performance.

SET @col1 = (SELECT GROUP_CONCAT(DISTINCT col1) FROM table2 WHERE condition);
SELECT *
FROM table1
WHERE FIND_IN_SET(table1.col1, @col1);

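The first statement selects the matching col1 values and uses GROUP_CONCAT to join them into a comma-separated string, which is stored in @col1. The second query then uses FIND_IN_SET to match against that in-memory string. Two limits apply: GROUP_CONCAT output is truncated at group_concat_max_len (1024 by default), and FIND_IN_SET cannot use an index on col1, so this approach suits only small value lists.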
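
Because GROUP_CONCAT silently truncates its result at group_concat_max_len, it is worth raising that limit for the session when the list may be long. The sketch below uses hypothetical tables and data; the limit value is chosen arbitrarily for illustration.

```sql
-- Hypothetical data, for illustration only.
DROP TABLE IF EXISTS table1, table2;
CREATE TABLE table1 (id INT PRIMARY KEY, col1 INT);
CREATE TABLE table2 (id INT PRIMARY KEY, col1 INT);
INSERT INTO table1 VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO table2 VALUES (10, 1), (11, 3), (12, 3);

-- Raise the GROUP_CONCAT length limit for this session only.
SET SESSION group_concat_max_len = 1000000;

-- Build the comma-separated list of distinct col1 values once.
SET @col1 = (SELECT GROUP_CONCAT(DISTINCT col1) FROM table2);

-- Match against the in-memory list; returns the rows with col1 = 1 and col1 = 3.
SELECT * FROM table1 WHERE FIND_IN_SET(table1.col1, @col1);
```

FIND_IN_SET treats the comma as its separator, so this only works when the values themselves contain no commas.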

7. Summary

When using IN subqueries, be sure to avoid full table scans, especially with large amounts of data; otherwise serious performance problems can follow. Rewriting with JOIN or EXISTS, adding suitable indexes, using LIMIT appropriately, and caching small value lists in memory can all improve the performance of IN subqueries. In real projects, choose the approach that fits the specific situation and verify the result against the actual execution plan.

