使用的 MySQL 官方示例表 sakila.film。
索引情况如下所示,均为 BTREE 索引:
需要对film 根据标题 title 排序后取某一页的数据,直接查询
EXPLAIN SELECT film_id,description FROM film ORDER BY title LIMIT 50,5;
可以看到,优化器执行了全表扫描,我想通过按照索引分页后回表的方式改写 SQL,所以重写 SQL 如下:
#按照索引分页后回表查询
EXPLAIN SELECT a.`film_id`,a.`description` FROM film a
INNER JOIN (SELECT film_id FROM film ORDER BY title LIMIT 50,5) b ON a.`film_id`=b.`film_id`;
但是在 id 为 2 的查询过程中,虽然 type 为 index,但是实际上仍然扫描了全表这是为什么呢?
怪我咯2017-04-17 13:46:57
In fact, performance has improved since you rewritten it as INNER JOIN
.
The cost of the first query is:
mysql> select film_id, description from film order by title limit 50,5;
mysql> show status like 'Last_query_cost';
+-----------------+------------+
| Variable_name | Value |
+-----------------+------------+
| Last_query_cost | 209.799000 |
+-----------------+------------+
The cost of the second query is:
mysql> select a.film_id, a.description from film a inner join (select film_id from film c order by title limit 50,5) b on a.film_id = b.film_id;
mysql> show status like 'Last_query_cost'; +-----------------+----------+
| Variable_name | Value |
+-----------------+----------+
| Last_query_cost | 0.000000 |
+-----------------+----------+
So your goal has been achieved.
The reason for the improvement is that after using INNER JOIN
, the covering index comes into play. The entire matching process only requires film_id and title, and they both have indexes. description
The field will only be read after matching, so a lot of IO is saved.
Let’s briefly analyze the second query.
The JOIN algorithm implemented by MySQL is essentially a simple nested loop. The outermost loop always traverses linearly and does not use indexes (this is a temporary table and there is no index). If the inner loop has an index, it will Use indexes. Therefore, the size of the outer loop directly determines the performance of JOIN.
Judging from the results of the explain, MySQL places the table SELECT
obtained by the b
clause in the outer loop. This table only has 55 rows of records, so MySQL decides to start with it first, and the access type is ALL
, which means scanning the entire table, that is, 55 rows of records.
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 55
Extra: NULL
The second level of loop is based on the table a
. Here we only need to find film_id
, using the primary key index, which is very fast. Remember, the third level of looping occurs only after matching.
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: a
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: b.film_id
rows: 1
Extra: NULL
The third level loop is based on the table SELECT
in the film
clause, matching title
, and also uses an index, which is very fast.
*************************** 3. row ***************************
id: 2
select_type: DERIVED
table: c
type: index
possible_keys: NULL
key: idx_title
key_len: 767
ref: NULL
rows: 989
Extra: Using index
You ask why the entire table is still scanned? In fact, it shouldn't be. If you want to scan the entire table, the access type should be ALL instead of index. So don't worry.
Your optimized writing method happened to appear in Chapter 6 of High Performance MySQL, I happened to see it.
阿神2017-04-17 13:46:57
The usage of INNER JOIN is to take out the tuples that meet the requirements, and you must scan the entire table.
During the query with id 2. . . Why didn’t you add and a.film_id=2;