I have a model called CacheSync and mysql shows it has an index:
mysql> show indexes from cache_syncs; +-------------+------------+---------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-------------+------------+---------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | cache_syncs | 0 | PRIMARY | 1 | id | A | 90878 | NULL | NULL | | BTREE | | | | cache_syncs | 1 | index_cache_syncs_on_created_at | 1 | created_at | A | 18175 | NULL | NULL | YES | BTREE | | | +-------------+------------+---------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 2 rows in set (0.01 sec)
But when I go to explain it, it says it's not using an index:
CacheSync.where("created_at < ?", (Time.now - 1.hour).to_time).explain => EXPLAIN for: SELECT `cache_syncs`.* FROM `cache_syncs` WHERE (created_at < '2022-06-13 19:37:23.316439') +----+-------------+-------------+------+---------------------------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------+------+---------------------------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | cache_syncs | ALL | index_cache_syncs_on_created_at | NULL | NULL | NULL | 93651 | Using where | +----+-------------+-------------+------+---------------------------------+------+---------+------+-------+-------------+ 1 row in set (0.00 sec)
Why not use index?
thanks for your help, Kevin
P粉4211197782024-03-22 10:04:01
In my experience, if the optimizer estimates that your condition matches more than 20% of the table, it will fall back to a table scan. It guesses that reading all the rows from the clustered index is faster than looking up the values in the secondary index, and then doing another lookup to get the corresponding rows from the table.
The 20% threshold is not any official feature, it's just something I observed. It is not configurable in current versions of MySQL.
You can use index hints to convince it that a table scan is too expensive:
SELECT ... FROM mytable FORCE INDEX (index_cache_syncs_on_created_at) WHERE ...
It will perform a table scan only if the index you specify is not related to the conditions in the query.
See https://dev.mysql.com/ doc/refman/8.0/en/index-hints.html for more information about index hints.
I'm not a Rails developer, but this old answer shows a way to pass index hint syntax to Rails: https:// stackoverflow.com/a/13904227/20860 I don't know if this This is still the current practice.