search

Home  >  Q&A  >  body text

mysql - sql optimization problem, between is better than in?

Saw the information on the Internet and said:

in 和 not in 也要慎用,否则会导致全表扫描,如:

select id from t where num in(1,2,3)

对于连续的数值,能用 between 就不要用 in 了:

select id from t where num between 1 and 3
phpcn_u1582phpcn_u15822752 days ago830

reply all(4)I'll reply

  • 阿神

    阿神2017-05-18 10:58:50

    Continuous values ​​are of course between, which reduces parsing, and if the range of in exceeds a certain number, the entire table will be deleted by default. 9 or more or less is forgotten

    Additional answer to the comments below:
    In the full list, it depends on the situation. The above answer is just a rough summary from memory. The friend downstairs is more serious. I will explain here again. This is not a number but a number. It's a ratio, about 25%-35%. If you want to ask how much it is, I'm sorry because my level is limited and I can't confirm it without reading the source code. And the ratio of about 30 does not mean that a full table scan must be performed, because mysql also has an index scan, which means that if the selected content can be found in your index, of course it will not scan the entire table, such as the following In the examples, select id from ttt where id in (..); and select * from ttt where id in (...); the previous one is definitely a primary key scan, even if you enter all the id values, it will still be a primary key scan, and The following situation is the situation of this percentage. Please refer to the example below for details. Any corrections are welcome ^_^
    . Then I also want to talk about the situation between. Why is it good? In addition to reducing parsing for continuous access on the index segment, there are also One situation is that when retrieving data through disk addressing, part of the data near the first value will be read by default (there is such a probability algorithm that when a piece of data is retrieved, the data near it is also very large. Probability will be used) So there is a situation where redundant data is taken out at one time to avoid multiple addressing. At this time, using the continuous value between between is suitable

    mysql> select * from ttt;
    +----+-------------------+
    | id | name              |
    +----+-------------------+
    |  1 | I17021234001      |
    |  2 | IC17031234002     |
    |  3 | C17041234003      |
    |  4 | IAsEw1234001      |
    |  5 | I17021234001A2    |
    |  6 | IC17031234002A2   |
    |  7 | C17041234003A2    |
    |  8 | IAsEw1234001A2    |
    |  9 | I17021234001A2    |
    | 10 | IC17031234002A2   |
    | 11 | C17041234003A2    |
    | 12 | IAsEw1234001A2    |
    | 13 | I17021234001A2A2  |
    | 14 | IC17031234002A2A2 |
    | 15 | C17041234003A2A2  |
    | 16 | IAsEw1234001A2A2  |
    | 17 | I17021234001A2    |
    | 18 | IC17031234002A2   |
    | 19 | C17041234003A2    |
    | 20 | IAsEw1234001A2    |
    +----+-------------------+
    20 rows in set (0.00 sec)
    
    mysql> show create table ttt;
    +-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Table | Create Table                                                                                                                                                             |
    +-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | ttt   | CREATE TABLE `ttt` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` char(32) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=32 DEFAULT CHARSET=utf8 |
    +-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> explain select * from ttt where id in (1,2,3,4,5,6);
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    |  1 | SIMPLE      | ttt   | ALL  | PRIMARY       | NULL | NULL    | NULL |   20 | Using where |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    1 row in set (0.00 sec)
    
    mysql> explain select * from ttt where id in (1,2,3,4,5);
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    | id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    |  1 | SIMPLE      | ttt   | range | PRIMARY       | PRIMARY | 4       | NULL |    5 | Using where |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    1 row in set (0.00 sec)
    
    mysql> explain select * from ttt where id in (1,2,3);
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    | id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    |  1 | SIMPLE      | ttt   | range | PRIMARY       | PRIMARY | 4       | NULL |    3 | Using where |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    1 row in set (0.00 sec)
    
    mysql> explain select * from ttt where id in (1,2,3,4,5,6,7);
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    |  1 | SIMPLE      | ttt   | ALL  | PRIMARY       | NULL | NULL    | NULL |   20 | Using where |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    1 row in set (0.00 sec)
    
    mysql> explain select id from ttt where id in (1,2,3,4,5,6,7);
    +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
    | id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
    |  1 | SIMPLE      | ttt   | index | PRIMARY       | PRIMARY | 4       | NULL |   20 | Using where; Using index |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
    1 row in set (0.00 sec)
    
    mysql> explain select name from ttt where id in (1,2,3,4,5,6,7);
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    |  1 | SIMPLE      | ttt   | ALL  | PRIMARY       | NULL | NULL    | NULL |   20 | Using where |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    1 row in set (0.00 sec)

    reply
    0
  • 阿神

    阿神2017-05-18 10:58:50

    According to the storage structure of the B-tree index in the database, the physical address pointing to the data is stored in the leaf node, and this physical address is ordered when there is a clustered index.

    如果是连续数值,between在找到第一个匹配值后,则直接从该地址往后搜索,直到最后一个元素为止。这样就不会对后续值进行索引扫描,因此速度快了。
    
    对于in操作,不大清楚,但是估计应该会对全索引进行扫描吧。

    reply
    0
  • PHP中文网

    PHP中文网2017-05-18 10:58:50

    EXPLAIN mysql 语句 Take a look at the output

    reply
    0
  • 淡淡烟草味

    淡淡烟草味2017-05-18 10:58:50

    When using between, you only need to match the upper and lower bounds, so it will be faster; each in must be read again, which will cause a full table scan.

    reply
    0
  • Cancelreply