Home  >  Article  >  Database  >  The difference between mysql merge union merge sort_union

The difference between mysql merge union merge sort_union

伊谢尔伦
伊谢尔伦Original
2016-11-21 15:20:241394browse

Seeing the Index Merge Optimization in the MYSQL manual, I can’t help but have some ideas, so I recorded them as follows

First, let’s explain the difference between the two methods:
Both methods use different secondary indexes in a table. Note that they are single surface.
merge union: When using or, if the secondary index contains all key parts, then you can get the key value or ROWID of the sorted clustered index, then a simple union is enough to remove duplicates, no additional sorting is required.
            Source code interface quick_ror_union_select class
merge sort_union: The difference from the above is that it does not include all the key parts of the secondary index, so you must first obtain the sorted clustered index key value or ROWID before you can union the clustered index key value or ROWID Operation
                  Source code interface quick_index_merge_select
Reference manual: 9.2.1.4 Index Merge Optimization
In general, as long as mysql cannot determine that the primary key is sorted, additional sorting operations are required.


If we have a certain understanding of the merge sort algorithm, we can see that such processing is necessary.
We know that all subsets that need to be merged need to be sorted when merging. Here is a simple merge Illustration of the algorithm:

The difference between mysql merge union merge sort_union

If we regard 1 2 5 9 and 3 4 7 8 as primary keys, then they need to be sorted to complete the final merge. Of course, the upper sorting operation can be merged or other sorting can be used Method, as long as the sorting is done well, another point is that merge
sorting friends who are familiar with data structures should know that it is also a good way to sort external disks.

To understand here we need to have an understanding of the arrangement of the combined index in the INNODB B+ tree page block:
For example: seq int, id1 int, id2 int seq is the primary key, ID1, DI2 is a combined B+ index
Then we insert the value

values(1,1,2)
values(2,1,3)
values(3,1,2)

Obviously, the order of leaf nodes in the combined index is as follows:



1       2       3
id1:1  id1:1  id1:1
id2:2  id2:2  id2:3
seq:1  seq:3  seq:2

That is, first sort by id1, then sort by id2, and finally sort by the primary key seq.

Then you can see that the order of the final primary key is 1 3 2 It is not ordered. Obviously, such a
result set cannot be used as a merged result set, so we need to sort it. This is why
sort_union sort comes from.

Then let’s demonstrate the different
scripts of the two execution plans:

create table testmer
(seq int,id1 int,id2 int,id3 int,id4 int,primary key(seq),key(id1,id2),key(id3,id4));
insert into testmer values(1,1,2,4,4);
insert into testmer values(2,1,3,4,5);
insert into testmer values(3,1,2,4,4);
insert into testmer values(4,2,4,5,6);
insert into testmer values(5,2,6,5,8);
insert into testmer values(6,2,10,5,3);
insert into testmer values(7,4,5,8,10);
insert into testmer values(8,0,1,3,4);
mysql> select * from testmer;
+-----+------+------+------+------+
| seq | id1  | id2  | id3  | id4  |
+-----+------+------+------+------+
|   1 |    1 |    2 |    4 |    4 |
|   2 |    1 |    3 |    4 |    5 |
|   3 |    1 |    2 |    4 |    4 |
|   4 |    2 |    4 |    5 |    6 |
|   5 |    2 |    6 |    5 |    8 |
|   6 |    2 |   10 |    5 |    3 |
|   7 |    4 |    5 |    8 |   10 |
|   8 |    0 |    1 |    3 |    4 |
+-----+------+------+------+------+
Using sort_union:
mysql> explain  select * from testmer force index(id1,id3) where id1=1 or id3=4;
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+----------------------------------------+
| id | select_type | table   | partitions | type        | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                  |
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+----------------------------------------+
|  1 | SIMPLE      | testmer | NULL       | index_merge | id1,id3       | id1,id3 | 5,5     | NULL |    6 |   100.00 | Using sort_union(id1,id3); Using where |
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+----------------------------------------+
1 row in set, 1 warning (5.07 sec)

Obviously we only need to sort if we only look at key(id1,id2), because the arrangement is as follows:

1 2 3
id1:1 id1:1 id1:1
id2:2 id2:2 id2:3
seq:1 seq:3 seq:2

If we bring all the secondary index KEY_PART

mysql> explain  select * from testmer force index(id1,id3) where id1=1 and id2=2 or id3=4 and id4=1;
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+-----------------------------------+
| id | select_type | table   | partitions | type        | possible_keys | key     | key_len | ref  | rows | filtered | Extra                             |
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+-----------------------------------+
|  1 | SIMPLE      | testmer | NULL       | index_merge | id1,id3       | id1,id3 | 10,10   | NULL |    2 |   100.00 | Using union(id1,id3); Using where |
+----+-------------+---------+------------+-------------+---------------+---------+---------+------+------+----------+-----------------------------------+

Of course there is no need to sort here, we look at id1=1 and id2 =2 (the same is true for id3=4 and id4=1)

The arrangement is as follows:

1         2      
id1:1   id1:1   
id2:2   id2:2 
seq:1  seq:3

That is to say, if KEY_PART contains complete, then the primary key will be naturally sorted.


In fact, I ran it in the DEBUG environment, breakpoint Hit Unique::unique_add

(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000ebd333 in main(int, char**) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/main.cc:25
        breakpoint already hit 1 time
6       breakpoint     keep y   0x000000000145de13 in Unique::unique_add(void*) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/uniques.h:52
        breakpoint already hit 2 times

When executing select * from testmer force index(id1,id3) where id1=1 and id2=1 or id3=4 and id4=1;

Unique::unique_add is not triggered, nor There is just no sorting operation.

Finally, let’s explain the merge_sort sorting interface of the source code
QUICK_INDEX_MERGE_SELECT::read_keys_and_merge()
Call
Unique::unique_add
(Use balanced binary trees, the difference between balanced binary trees and non-red-black trees, refer to:
http://blog.itpub. net/7728585/viewspace-2127419/
)

The following is the annotation of the source code read_keys_and_merge():

/*
  Perform key scans for all used indexes (except CPK), get rowids and merge 
  them into an ordered non-recurrent sequence of rowids.
  
  The merge/duplicate removal is performed using Unique class. We put all
  rowids into Unique, get the sorted sequence and destroy the Unique.
  
  If table has a clustered primary key that covers all rows (TRUE for bdb
  and innodb currently) and one of the index_merge scans is a scan on PK,
  then rows that will be retrieved by PK scan are not put into Unique and 
  primary key scan is not performed here, it is performed later separately.
  RETURN
    0     OK
    other error
*/

The following is the stack information when I was using gdb:


(gdb) bt
#0  tree_insert (tree=0x7fffd801c768, key=0x7fffd801ada0, key_size=0, custom_arg=0x7fffd80103d0) at /root/mysql5.7.14/percona-server-5.7.14-7/mysys/tree.c:207
#1  0x000000000145df19 in Unique::unique_add (this=0x7fffd801c260, ptr=0x7fffd801ada0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/uniques.h:56
#2  0x000000000178e6a8 in QUICK_INDEX_MERGE_SELECT::read_keys_and_merge (this=0x7fffd89083f0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:10700
#3  0x0000000001778c73 in QUICK_INDEX_MERGE_SELECT::reset (this=0x7fffd89083f0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/opt_range.cc:1601
#4  0x000000000155e529 in join_init_read_record (tab=0x7fffd8906e20) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:2471
#5  0x000000000155b6a1 in sub_select (join=0x7fffd8905b08, qep_tab=0x7fffd8906e20, end_of_records=false)
    at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:1271
#6  0x000000000155b026 in do_select (join=0x7fffd8905b08) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:944
#7  0x0000000001558efc in JOIN::exec (this=0x7fffd8905b08) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_executor.cc:199
#8  0x00000000015f91c6 in handle_query (thd=0x7fffd8000df0, lex=0x7fffd80033d0, result=0x7fffd8007a60, added_options=0, removed_options=0)
    at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_select.cc:184
#9  0x00000000015ac025 in execute_sqlcom_select (thd=0x7fffd8000df0, all_tables=0x7fffd8006e98) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5391
#10 0x00000000015a4640 in mysql_execute_command (thd=0x7fffd8000df0, first_level=true) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:2889
#11 0x00000000015acff6 in mysql_parse (thd=0x7fffd8000df0, parser_state=0x7ffff0fd6600) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:5836
#12 0x00000000015a0eb5 in dispatch_command (thd=0x7fffd8000df0, com_data=0x7ffff0fd6d70, command=COM_QUERY)
    at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1447
#13 0x000000000159fce6 in do_command (thd=0x7fffd8000df0) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/sql_parse.cc:1010
#14 0x00000000016e1c08 in handle_connection (arg=0x3c1c880) at /root/mysql5.7.14/percona-server-5.7.14-7/sql/conn_handler/connection_handler_per_thread.cc:312
#15 0x0000000001d71ed0 in pfs_spawn_thread (arg=0x3bec1b0) at /root/mysql5.7.14/percona-server-5.7.14-7/storage/perfschema/pfs.cc:2188
#16 0x0000003ca62079d1 in start_thread () from /lib64/libpthread.so.0
#17 0x0000003ca5ee8b6d in clone () from /lib64/libc.so.6

Attached are the two methods of function interface calling:

merge sort_union:


T@3: | | | | | | | | | | >QUICK_INDEX_MERGE_SELECT::QUICK_INDEX_MERGE_SELECT
T@3: | | | | | | | | | | <QUICK_INDEX_MERGE_SELECT::QUICK_INDEX_MERGE_SELECT 1589
T@3: | | | | | | | | | | >QUICK_INDEX_MERGE_SELECT::init
T@3: | | | | | | | | | | <QUICK_INDEX_MERGE_SELECT::init 1595
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::reset
T@3: | | | | | | | | | >QUICK_INDEX_MERGE_SELECT::read_keys_and_merge
T@3: | | | | | | | | | <QUICK_INDEX_MERGE_SELECT::read_keys_and_merge 10716
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::reset 1602
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::get_next
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::get_next 10753
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::get_next
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::get_next 10753
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::get_next
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::get_next 10753
T@3: | | | | | | | | >QUICK_INDEX_MERGE_SELECT::get_next
T@3: | | | | | | | | <QUICK_INDEX_MERGE_SELECT::get_next 10753
T@3: | | | | | | | >QUICK_INDEX_MERGE_SELECT::~QUICK_INDEX_MERGE_SELECT
T@3: | | | | | | | <QUICK_INDEX_MERGE_SELECT::~QUICK_INDEX_MERGE_SELECT 1635

merge union:



T@3: | | | | | | | | | | >QUICK_ROR_UNION_SELECT::init
T@3: | | | | | | | | | | <QUICK_ROR_UNION_SELECT::init 1942
T@3: | | | | | | | | >QUICK_ROR_UNION_SELECT::reset
T@3: | | | | | | | | <QUICK_ROR_UNION_SELECT::reset 2004
T@3: | | | | | | | | >QUICK_ROR_UNION_SELECT::get_next
T@3: | | | | | | | | <QUICK_ROR_UNION_SELECT::get_next 10948
T@3: | | | | | | | | >QUICK_ROR_UNION_SELECT::get_next
T@3: | | | | | | | | <QUICK_ROR_UNION_SELECT::get_next 10948
T@3: | | | | | | | | >QUICK_ROR_UNION_SELECT::get_next
T@3: | | | | | | | | <QUICK_ROR_UNION_SELECT::get_next 10913
T@3: | | | | | | | >QUICK_ROR_UNION_SELECT::~QUICK_ROR_UNION_SELECT
T@3: | | | | | | | <QUICK_ROR_UNION_SELECT::~QUICK_ROR_UNION_SELECT 2021

You can see the calling path and check the source code calling situation. I just want to prove that sorting is indeed done, and then see what sorting method is used.

This article only represents my personal opinion , prompt if there is an error. Thanks!

The above are the different contents of mysql merge union merge sort_union. For more related content, please pay attention to the PHP Chinese website (www.php.cn)!



Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn