MySQL optimization and indexing methods

PHPz
2023-06-02 13:58:21

A brief introduction to indexes

The essence of an index:

  • The essence of an index, in MySQL or in any other relational database, comes down to one sentence: trading space for time.

The role of an index:

  • An index is a data structure (stored on disk) that a relational database uses to speed up the retrieval of rows from a table.

Classification of indexes

Classification by data structure:

  • Hash index

    • Equality matching is highly efficient

    • Does not support range searches

  • Tree index

    • Binary search tree: searched by recursive binary search, with smaller keys to the left and larger keys to the right

    • Balanced binary tree: a binary search tree kept balanced, mainly through left and right rotations

    • Disadvantage 1: too many I/O operations, because the tree is tall

    • Disadvantage 2: low I/O utilization, because each node holds only one key, so each read brings in very little useful data

  • Multi-way balanced search tree (B-tree)

    • Feature: greatly reduces the height of the tree

  • B+ tree

    • Feature: uses a left-closed comparison when choosing which child to descend into

    • The root and branch nodes have no data area; only the leaf nodes contain one. In other words, even when the search key matches a key in the root or in a branch node, the search does not stop there, because there is no data to return; it continues down to a leaf node.

When we search for the key 13, we may match it in both the root node and a branch node, but the search always continues until it reaches a leaf node.
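The rule that a B+ tree search always ends at a leaf can be sketched in a few lines of Python. This is a minimal illustration, not MySQL's implementation: the `Node` class, the two-level tree, and the sample keys and rows are all made up for the example. Child selection follows the left-closed rule, so a key equal to a separator goes to the right-hand child.

```python
from bisect import bisect_right

class Node:
    """Hypothetical B+ tree node: internal nodes hold keys and children, leaves hold keys and rows."""
    def __init__(self, keys, children=None, rows=None):
        self.keys = keys
        self.children = children  # set on internal nodes only
        self.rows = rows          # set on leaf nodes only

    @property
    def is_leaf(self):
        return self.children is None

def search(root, key):
    """Return (row, levels_visited); the loop only stops once a leaf is reached."""
    node, levels = root, 1
    while not node.is_leaf:
        # Left-closed intervals [k_i, k_{i+1}): a key equal to a separator goes right.
        node = node.children[bisect_right(node.keys, key)]
        levels += 1
    if key in node.keys:
        return node.rows[node.keys.index(key)], levels
    return None, levels

# Made-up sample tree: the root contains 13, yet the row for 13 lives in a leaf.
leaf1 = Node([1, 5, 9], rows=["row1", "row5", "row9"])
leaf2 = Node([13, 17], rows=["row13", "row17"])
leaf3 = Node([21, 25], rows=["row21", "row25"])
root = Node([13, 21], children=[leaf1, leaf2, leaf3])
```

Calling `search(root, 13)` matches 13 in the root but still descends one more level before returning the row.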

Comparison of binary tree, balanced binary tree, and B-tree:

The picture shows what happens with an auto-increment primary key:

A plain binary search tree is clearly unsuitable as a relational database index: with ever-increasing keys it degenerates into a linked list, so a lookup is no better than a full table scan.

A balanced binary tree avoids that degeneration, but the tree is still tall and thin, which leads to the two problems mentioned above: too many I/O operations and low I/O utilization.

The B-tree clearly solves both problems, so the following explains why MySQL nevertheless uses the B+ tree, and what enhancements the B+ tree makes.
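The height argument can be made concrete with a small experiment. The sketch below is an illustration under stated assumptions, not a measurement of MySQL: `bst_height` and `min_levels` are hypothetical helpers, the 1,000 keys stand in for an auto-increment column, and the fan-out of 100 keys per node is an arbitrary choice.

```python
def bst_height(keys):
    """Insert keys into an unbalanced binary search tree and return its height."""
    root, height = None, 0          # each node is a list: [key, left, right]
    for key in keys:
        if root is None:
            root, height = [key, None, None], 1
            continue
        node, depth = root, 1
        while True:
            side = 1 if key < node[0] else 2
            if node[side] is None:
                node[side] = [key, None, None]
                height = max(height, depth + 1)
                break
            node, depth = node[side], depth + 1
    return height

def min_levels(n, fanout):
    """Smallest number of levels h with fanout**h >= n (a rough lower bound on tree height)."""
    levels, capacity = 0, 1
    while capacity < n:
        capacity *= fanout
        levels += 1
    return levels

auto_increment_keys = range(1, 1001)
chain = bst_height(auto_increment_keys)   # every key goes to the right: one long chain
balanced = min_levels(1000, 2)            # roughly what a balanced binary tree needs
multiway = min_levels(1000, 100)          # assumed fan-out of 100 keys per node
```

With these assumptions the plain binary tree is as tall as the number of rows, the balanced binary tree needs about ten levels, and the multi-way tree needs two.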

Comparison between B-tree and B+ tree:

Optimizations of the B+ tree over the B-tree:

I/O efficiency is higher. Every node of a B-tree keeps a data area, whereas the non-leaf nodes of a B+ tree do not, so each node that is read holds more keys. Suppose a lookup has to traverse three levels: the B+ tree lookup clearly consumes less I/O.

Range searches are more efficient. As shown in the picture, the leaf nodes of a B+ tree form a natural linked list, so a range search only needs to find the first matching leaf and then follow the chain.

Index-based data scanning is more efficient.
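The range-search advantage comes from the chain of leaves, which a short Python sketch can illustrate. The `Leaf` class, the `link` and `range_scan` helpers, and the sample keys are assumptions made for this example. For brevity the scan starts at the first leaf, whereas a real B+ tree would first descend from the root to the leaf containing the lower bound.

```python
class Leaf:
    """Hypothetical B+ tree leaf: sorted keys plus a pointer to the next leaf."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None

def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]

def range_scan(start_leaf, low, high):
    """Collect keys in [low, high] by walking the leaf chain; stops at the first key above high."""
    result, leaf = [], start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result

first = link([Leaf([1, 5, 9]), Leaf([13, 17]), Leaf([21, 25])])
```

`range_scan(first, 9, 21)` crosses three leaves without ever going back up to a parent node.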

Classification of index types

Index types can be divided into two categories:

  • Primary key index

  • Secondary (auxiliary) index

    • Unique index

    • Composite index

    • Normal index

    • Covering index

Although the primary key index generally performs best, SQL optimization work usually consists of improving and adding secondary indexes.

How the B+ tree is implemented at the storage engine level

  • We create two tables, test_innodb (using InnoDB as the storage engine) and test_myisam (using MyISAM as the storage engine). The following figure shows the files on disk for the two tables. The two storage engines implement the B+ tree on disk in completely different ways.

B+ tree implementation in MyISAM:

  • *.frm is the table definition file; the definitions of this table's id and name fields, for example, are stored here

  • *.MYD (D = data) stores the data

  • *.MYI (I = index) stores the indexes

  • For example, when the following SQL statement is executed, MyISAM first finds 103 in test_myisam.MYI, obtains the address 0x194281 from it, and then reads the row at that address in test_myisam.MYD and returns it.

  • SELECT id, name FROM test_myisam WHERE id = 103

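  • If id is the primary key index of test_myisam and name is also indexed, test_myisam.MYI contains two B+ trees at the same level. In MyISAM, therefore, there is no hierarchy between the primary key index and secondary indexes; they are peers. Because of this mechanism, a query in MyISAM may use several indexes, which is generally not the case in InnoDB.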
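The MyISAM lookup path described above can be modeled with plain dictionaries. This is only a sketch: `myd`, `myi_id`, and `myi_name` are hypothetical stand-ins for the .MYD file and the two B+ trees in the .MYI file, and apart from the address 0x194281 and the id 103 from the text, the rows and addresses are invented.

```python
# Hypothetical stand-in for test_myisam.MYD: address -> stored row.
myd = {
    0x194281: {"id": 103, "name": "zhangsan"},
    0x194300: {"id": 104, "name": "lisi"},
}

# Hypothetical stand-ins for the two peer B+ trees in test_myisam.MYI.
# Both map an indexed value straight to a row address.
myi_id = {103: 0x194281, 104: 0x194300}
myi_name = {"zhangsan": 0x194281, "lisi": 0x194300}

def lookup(index, value):
    """Step 1: find the address in the index. Step 2: read the row at that address."""
    address = index.get(value)
    return None if address is None else myd[address]
```

Both `lookup(myi_id, 103)` and `lookup(myi_name, "zhangsan")` take the same two steps, which is what it means for the two indexes to be peers.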

B+ tree implementation in InnoDB:

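  • Unlike MyISAM, InnoDB has no separate MYD file for storing data. The data is stored directly in the data area attached to each key in the leaf nodes of the primary key index, which holds the complete record of every row, keyed by the id column.

  • The relationship between the primary key index and secondary indexes in InnoDB

When we execute the following SQL statement, InnoDB first searches the secondary index, finds the primary key stored with the matching entry (101 in this example), and then goes back to the table (a second scan): it looks up that primary key in the primary key index and returns the row.

SELECT id, name FROM test_innodb WHERE name = 'zhangsan'

This raises a question: why doesn't InnoDB record the disk address directly in the secondary index, as MyISAM does, instead of taking the extra step of going back to scan the primary key index? The question is answered in the interview questions below; just remember that this is where it comes from.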
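The two-step InnoDB lookup can be sketched in the same dictionary style. The structures below are hypothetical: `primary_index` stands in for the clustered index whose leaves hold whole rows, `name_index` for a secondary index on name, and the `age` column and the second row are invented for illustration.

```python
# Hypothetical clustered (primary key) index: the leaf entry holds the whole row.
primary_index = {
    101: {"id": 101, "name": "zhangsan", "age": 20},
    102: {"id": 102, "name": "lisi", "age": 25},
}

# Hypothetical secondary index on name: the leaf entry holds only the primary key.
name_index = {"zhangsan": 101, "lisi": 102}

def select_by_name(name):
    """Two lookups: secondary index -> primary key, then primary key index -> row."""
    pk = name_index.get(name)      # first scan: the secondary index
    if pk is None:
        return None
    return primary_index[pk]       # second scan: back to the table
```

Unlike the MyISAM case, the secondary index never sees a disk address; it only knows the primary key.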

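Related interview questions

  • Why does MySQL choose the B+ tree as its index structure?

This needs no further discussion; the text above should have made it clear.

  • How does the B+ tree implementation differ between MyISAM and InnoDB?

To summarize: MyISAM stores a table in three kinds of files. The .frm file is the table definition file, the .MYD (D = data) file stores the data, and the .MYI (I = index) file stores the indexes. In MyISAM the primary key index and secondary indexes are peers, and a query may use several indexes. InnoDB is the opposite: there is a strict hierarchy between the primary key index and the secondary indexes, and a statement generally uses either a single index or none.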

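  • How do you determine whether an SQL statement uses an index?

You can tell from the execution plan: put explain or desc in front of the SQL statement.

set global optimizer_trace='enabled=on' turns on optimizer tracing; the plan chosen for each query statement is then recorded in the OPTIMIZER_TRACE table of the information_schema database.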

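  • Why is an auto-increment column the best choice for the primary key index?

With an auto-increment column, new rows only ever extend the right-hand side of the index tree, so the tree changes comparatively little on each insert.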
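A rough way to see the difference is to count how many inserts land somewhere other than the right-hand end of a sorted structure. The sketch below uses a sorted Python list as a crude stand-in for the ordered leaf level of the index; `count_non_tail_inserts`, the 1,000 keys, and the shuffle seed are assumptions for illustration only.

```python
import random
from bisect import bisect_left

def count_non_tail_inserts(keys):
    """Insert keys into a sorted list and count inserts that do not land at the right end."""
    ordered, non_tail = [], 0
    for key in keys:
        pos = bisect_left(ordered, key)
        if pos != len(ordered):
            non_tail += 1          # lands in the middle: existing entries have to make room
        ordered.insert(pos, key)
    return non_tail

sequential = list(range(1, 1001))          # auto-increment style keys
shuffled = sequential[:]
random.Random(42).shuffle(shuffled)        # the same keys arriving in random order

sequential_disruptions = count_non_tail_inserts(sequential)
random_disruptions = count_non_tail_inserts(shuffled)
```

Sequential keys never disturb existing entries, while randomly ordered keys do so on most inserts.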

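  • Why is it not recommended to index columns that change frequently?

The reason is the same as for the previous question: when an indexed column changes frequently, the index tree has to change frequently as well.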

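  • Why is it not recommended to index columns with many duplicate values?

The reason is selectivity, that is, how spread out the values are. For example, take a table with one million rows in which one field represents gender, 0 for male and 1 for female. If this field is indexed, the index tree will contain a huge number of duplicate values. The indexes we usually build are meant to drive the query, in other words to cut down the range of data to be searched as much as possible, and an index on such a column clearly does not serve that purpose.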
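Selectivity can be expressed as the ratio of distinct values to total rows. The sketch below computes it for the gender example; the `selectivity` helper and the generated data are assumptions, with an evenly split gender column and a unique id column chosen purely for contrast.

```python
def selectivity(values):
    """Ratio of distinct values to total rows: near 1 narrows a search well, near 0 barely at all."""
    return len(set(values)) / len(values)

rows = 1_000_000
gender = [i % 2 for i in range(rows)]   # 0 = male, 1 = female, evenly split by assumption
ids = list(range(rows))                 # unique per row

gender_selectivity = selectivity(gender)
id_selectivity = selectivity(ids)
rows_matching_one_gender = gender.count(0)
```

A lookup on the gender index still leaves half a million candidate rows, whereas a lookup on id leaves exactly one.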

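  • What is a composite index?

A composite index is an index that covers multiple columns; it is a single index, not several.

Furthermore, a single-column index is a special case of a composite index.

A composite index should be created according to the leftmost-prefix principle (column order priority: most frequently used column > higher selectivity > smaller storage size).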
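The leftmost-prefix principle follows from how the entries of a composite index are sorted. In the sketch below a sorted list of tuples stands in for a composite index on two columns; the column names `last_name` and `age`, the sample entries, and both helper functions are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical composite index on (last_name, age): entries are sorted by the full tuple.
index = sorted([
    ("li", 25), ("li", 31), ("wang", 19), ("zhang", 20), ("zhang", 25), ("zhao", 25),
])

def seek_by_last_name(last_name):
    """Leftmost column given: binary search finds one contiguous slice."""
    lo = bisect_left(index, (last_name,))
    hi = bisect_right(index, (last_name, float("inf")))
    return index[lo:hi]

def scan_by_age(age):
    """Leftmost column missing: matches are scattered, so every entry has to be examined."""
    return [entry for entry in index if entry[1] == age]
```

Entries for one `last_name` sit next to each other, but entries for one `age` are spread across the whole index, which is why a condition on the second column alone cannot seek.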

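  • What is a covering index?

When the columns requested by a query can be returned directly from the index entries, the index is called a covering index. Put plainly, no back-to-table lookup is needed; the required data can be read straight from the secondary index.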
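Whether an index covers a query depends only on which columns the query asks for. The sketch below reports whether a back-to-table lookup was needed; the dictionaries, the `age` column, and `select_columns` are hypothetical, and the secondary index is assumed to contain the indexed column plus the primary key.

```python
# Hypothetical clustered index and secondary index on name.
primary_index = {101: {"id": 101, "name": "zhangsan", "age": 20}}
name_index = {"zhangsan": 101}   # secondary index entry: name -> primary key

def select_columns(name, columns):
    """Return (row, went_back_to_table) for a lookup by name."""
    pk = name_index.get(name)
    if pk is None:
        return None, False
    entry = {"name": name, "id": pk}          # everything the secondary index itself contains
    if set(columns) <= set(entry):
        return {c: entry[c] for c in columns}, False   # covered: no second lookup
    row = primary_index[pk]                             # not covered: back to the table
    return {c: row[c] for c in columns}, True
```

Asking for `id` and `name` is answered from the secondary index alone; adding `age` forces the second lookup.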

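  • What is the ICP mechanism?

ICP stands for index condition pushdown. Put simply, when a statement filters on several WHERE conditions, a single index is used both to locate the data and to evaluate the filter conditions; its benefit is that it reduces I/O operations.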
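The saving from index condition pushdown can be illustrated by counting row fetches. The sketch below assumes a composite index on (name, age) and a condition of the form name LIKE 'prefix%' AND age = value; the index entries, the table contents, and the `query` function are invented for the example and do not reflect MySQL internals.

```python
# Hypothetical composite index on (name, age): each entry is (name, age, primary_key).
index = [("zhangsan", 20, 1), ("zhangsan", 35, 2), ("zhangwei", 20, 3), ("zhaoliu", 20, 4)]
table = {1: "row1", 2: "row2", 3: "row3", 4: "row4"}

def query(prefix, age, pushdown):
    """WHERE name LIKE 'prefix%' AND age = ?; returns (rows, number_of_row_fetches)."""
    rows, fetches = [], 0
    for name, entry_age, pk in index:
        if not name.startswith(prefix):
            continue
        if pushdown and entry_age != age:
            continue                  # rejected using the index entry alone, no row fetch
        fetches += 1                  # the full row is fetched from the table
        if entry_age == age:          # without pushdown, age is only checked after the fetch
            rows.append(table[pk])
    return rows, fetches

without_icp = query("zhang", 20, pushdown=False)
with_icp = query("zhang", 20, pushdown=True)
```

Both calls return the same rows, but the pushdown variant fetches fewer rows from the table.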

  • "An InnoDB table cannot be without a primary key": is this statement right or wrong, and why?

The statement is right, but there are three cases:

  • When you explicitly specify a field as the primary key, that field is used as the clustered index.

  • When no primary key is explicitly specified, there are two situations:

  • InnoDB looks for the first unique key (UK) whose columns are all NOT NULL and organizes the clustered index around it.

  • If neither a primary key nor such a unique key exists, InnoDB uses its hidden 6-byte row ID as the clustered index.
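The three cases amount to a simple fallback order, which the following sketch spells out. `choose_clustered_index` is a hypothetical helper, the index names are invented, and the NOT NULL requirement on the unique key is stated here as an assumption about InnoDB's behavior rather than something taken from the text above.

```python
def choose_clustered_index(primary_key, unique_indexes):
    """Pick the clustered index key by trying the three cases in order.

    unique_indexes is a list of (index_name, all_columns_not_null) pairs in definition order.
    """
    if primary_key is not None:
        return primary_key                # case 1: explicit primary key
    for name, all_not_null in unique_indexes:
        if all_not_null:
            return name                   # case 2: first usable unique key
    return "hidden_row_id"                # case 3: hidden 6-byte row ID
```

For example, a table with no primary key and two unique keys, of which only the second is NOT NULL, would be clustered on the second one.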

  • What is a back-to-table lookup?

In InnoDB, when the columns requested by a query on a secondary index cannot be obtained directly from that index, a second lookup in the primary key index is required. This operation is called going back to the table.

  • Why does the leaf-node data area of a secondary index in InnoDB record the primary key value instead of a disk address, as MyISAM does?

The reason is actually simple: the structure of the primary key index changes frequently, since inserts and deletes can both affect it. If the secondary indexes recorded disk addresses, then with, say, 10 secondary indexes, every change to the primary key index structure would have to be propagated to each of them one by one. Recording the primary key value avoids this, because that value stays the same even when the row's location changes.
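The cost of recording addresses can be shown by relocating a few rows and counting how many secondary index entries go stale. Everything in the sketch is hypothetical: the addresses, the names, and the `build_secondary` helper are invented, and a single secondary index is used where the text imagines ten.

```python
# Hypothetical table: primary key -> current disk address of the row.
addresses = {101: 0x10, 102: 0x20, 103: 0x30}
names = {101: "zhangsan", 102: "lisi", 103: "wangwu"}

def build_secondary(names_by_pk, store_address):
    """Secondary index entry: name -> address (MyISAM style) or name -> primary key (InnoDB style)."""
    return {name: (addresses[pk] if store_address else pk) for pk, name in names_by_pk.items()}

by_address = build_secondary(names, store_address=True)
by_pk = build_secondary(names, store_address=False)

# A change in the primary key index (for example a page split) relocates two rows.
addresses[102], addresses[103] = 0x50, 0x60

stale_address_entries = sum(1 for pk, name in names.items() if by_address[name] != addresses[pk])
stale_pk_entries = sum(1 for pk, name in names.items() if by_pk[name] != pk)
```

The address-based index has one stale entry per moved row, and that count would be multiplied by the number of secondary indexes; the primary-key-based index needs no changes at all.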

Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to have it deleted.