Detailed introduction to MySQL indexing and query optimization

不言

Jan 07, 2019 am 11:04 AM

This article brings you a detailed introduction to MySQL indexing and query optimization. It has certain reference value. Friends in need can refer to it. I hope it will be helpful to you.

The article "MySQL Query Analysis" described how to use the MySQL slow query log and the explain command to locate performance bottlenecks. Once the slow SQL statements have been found, they need to be analyzed and optimized. This article discusses how MySQL indexes work and covers commonly used SQL query optimizations.

A simple comparison test

In the previous case, the c2c_zwdb.t_file_count table has only an auto-increment id, and the FFileName field is not indexed. The execution plan of the query on FFileName is as follows:

[Figure: EXPLAIN output for the query before the index is added]

In the above figure, type=ALL, key=NULL, rows=33777. This SQL does not use an index and performs a very inefficient full table scan. If joins and other conditions are added, the database will consume a large amount of memory and slow down the front-end program.

At this time add an index to the FFileName field:

alter table c2c_zwdb.t_file_count add index index_title(FFileName);

Execute the same query again and the contrast is obvious:

[Figure: EXPLAIN output for the same query after the index is added]

In this figure, type=ref, key=index_title (the index name), rows=1. This SQL uses the index index_title and performs an index lookup (ref) instead of a full table scan. Only one row is scanned through the index.

Compared with the unindexed case, query efficiency improves dramatically after the index is added.

MySQL Index

It can be seen from the above comparison test that index is the key to fast search. The establishment of MySQL index is very important for the efficient operation of MySQL. For a small amount of data, the impact of not having a suitable index is not great, but as the amount of data increases, the performance will drop sharply. If multiple columns are indexed (combined index), the order of the columns is very important, and MySQL can only perform effective searches on the leftmost prefix of the index.

The following introduces several common MySQL index types.

Indexes are divided into single column indexes and combined indexes. A single-column index means that an index only contains a single column. A table can have multiple single-column indexes, but this is not a combined index. Combined index, that is, an index contains multiple columns.

1. MySQL index type

(1) Primary key index PRIMARY KEY

It is a special unique index that does not allow null values. The primary key index is usually created together with the table.

[Figure: CREATE TABLE statement that defines a primary key]

Of course, you can also use the ALTER command. Remember: a table can only have one primary key.

(2) Unique index UNIQUE

The value of the unique index column must be unique, but null values are allowed. In the case of a composite index, the combination of column values must be unique. You can specify it when creating the table, or you can modify the table structure, such as:

ALTER TABLE table_name ADD UNIQUE (column)

(3) Ordinary index INDEX

This is the most basic index, it has no restrictions. You can specify it when creating the table, or you can modify the table structure, such as:

ALTER TABLE table_name ADD INDEX index_name (column)

(4) Combined index INDEX

Combined index, that is, an index contains multiple columns. You can specify it when creating the table, or you can modify the table structure, such as:

ALTER TABLE table_name ADD INDEX index_name(column1, column2 , column3)

(5) Full-text index FULLTEXT

Full-text indexing (also called full-text search) is a key technology used by search engines today. It uses techniques such as word segmentation to analyze the frequency and importance of keywords in the text, and then filters out the search results we want according to its ranking rules.

You can specify it when creating the table, or you can modify the table structure, such as:

ALTER TABLE table_name ADD FULLTEXT (column)

2. Index structure and principle

MySQL commonly uses a B-tree (more precisely, a B+ tree in InnoDB) as its index structure, but the implementation differs between clustered and non-clustered indexes. This article will not discuss that difference for now.

B-tree introduction

The B-tree picture below appears in many places. I chose it here because it explains the index search process very well.

[Figure: a B-tree made of disk blocks, each holding data items and pointers]

The figure above shows a B-tree. Each light blue block is called a disk block. Every disk block contains several data items (shown in dark blue) and pointers (shown in yellow). For example, disk block 1 contains data items 17 and 35 and pointers P1, P2, and P3. P1 points to the disk block with values less than 17, P2 to the disk block with values between 17 and 35, and P3 to the disk block with values greater than 35.

The real data is stored in the leaf nodes, namely 3, 5, 9, 10, 13, 15, 28, 29, 36, 60, 75, 79, 90, 99. Non-leaf nodes do not store real data, only data items that guide the search direction. For example, 17 and 35 do not actually exist in the data table.

Search process

In the figure above, to find data item 29, disk block 1 is first loaded from disk into memory, which costs one IO. A binary search in memory determines that 29 lies between 17 and 35, so the P2 pointer of disk block 1 is selected. The time spent in memory is negligible compared with a disk IO. Following the disk address in the P2 pointer of disk block 1, disk block 3 is loaded into memory, which is the second IO. 29 lies between 26 and 30, so the P2 pointer of disk block 3 is selected and disk block 8 is loaded into memory, which is the third IO. A binary search in memory then finds 29 and the query ends, after three IOs in total.

In practice, a 3-level B-tree can represent millions of rows. If a lookup among millions of rows needs only three IOs, the performance gain is huge. Without an index, every data item may cost one IO, so millions of IOs would be needed in total, which is obviously very expensive.
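The search process can be sketched in a few lines of Python. This is only an illustration, not how MySQL stores its pages: the keys of disk blocks 1 and 3 and the leaf values come from the text, while the other separators (8, 12, 65, 87), the empty pointers, and the names Block, leaf, and search are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

class Block:
    """One disk block: sorted keys plus child pointers (None for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children

def leaf(*keys):
    return Block(list(keys))

# Assumed layout; only 17/35, 26/30 and the leaf values are given in the text.
root = Block([17, 35], [
    Block([8, 12], [leaf(3, 5), leaf(9, 10), leaf(13, 15)]),
    Block([26, 30], [None, leaf(28, 29), None]),
    Block([65, 87], [leaf(36, 60), leaf(75, 79), leaf(90, 99)]),
])

def search(node, target):
    """Return (found, blocks_read); every block visited counts as one IO."""
    reads = 0
    while node is not None:
        reads += 1
        if node.children is None:
            # Leaf block: binary search for the real data item.
            pos = bisect_left(node.keys, target)
            found = pos < len(node.keys) and node.keys[pos] == target
            return found, reads
        # Inner block: binary search picks the pointer to follow.
        node = node.children[bisect_right(node.keys, target)]
    return False, reads

print(search(root, 29))  # (True, 3)
```

Looking up 29 reads three blocks, matching the three IOs counted above. Looking up 17 fails, because 17 only guides the search and is not stored in any leaf.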

Properties

(1) The index field should be as small as possible.

From the B-tree search process above, and from the fact that the real data is stored in the leaf nodes, we can see that the number of IOs depends on the height h of the B-tree.

Assume that the table holds N rows and that each disk block holds m data items. Then the tree height is h = log_{m+1}(N). For a fixed N, the larger m is, the smaller h is.

Also, m = size of a disk block / size of a data item. The size of a disk block is the size of a data page, which is fixed. The less space each data item occupies, the more data items fit in a block, and the lower the tree. This is why each data item, that is, the index field, should be as small as possible. For example, int occupies 4 bytes, half of the 8 bytes that bigint needs.
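To get a feel for the formula, the height can be computed for a few values of m. The sketch below is a rough model under the text's assumptions (every block holds exactly m items and has m + 1 children); the function name tree_height and the sample numbers are made up for illustration.

```python
def tree_height(n_rows, items_per_block):
    """Smallest h with (m + 1) ** h >= N, i.e. log base (m + 1) of N, rounded up."""
    fanout = items_per_block + 1
    height = 1
    capacity = fanout
    while capacity < n_rows:
        capacity *= fanout
        height += 1
    return height

# One million rows: a wider block (smaller data items) gives a flatter tree.
print(tree_height(1_000_000, 100))  # 3
print(tree_height(1_000_000, 10))   # 6
```

With 100 items per block, a million rows fit in three levels, which agrees with the remark that a 3-level B-tree can represent millions of rows. With only 10 items per block, the same data needs six levels, so each lookup costs twice as many IOs.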

(2) The leftmost matching characteristic of the index.

When the data items of the B-tree are composite, such as (name, age, sex), the B-tree is built by comparing the columns from left to right. When a search for (Zhang San, 20, F) arrives, the B-tree first compares name to decide the next search direction. If the names are equal, it compares age and then sex, and finally reaches the matching data. When a search without name arrives, such as (20, F), the B-tree does not know which node to check next, because name is the first comparison factor used to build the tree, and the search must start from name to know where to go. When a search such as (Zhang San, F) arrives, the B-tree can use name to choose the search direction, but the next field, age, is missing. It can only find all the data whose name is Zhang San and then filter for the rows whose sex is F. This very important property is the leftmost matching characteristic of the index.
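The same behavior shows up with a sorted list of tuples, which can stand in for a composite index on (name, age, sex). This is an analogy, not MySQL's implementation; the sample rows and the helper prefix_lookup are invented for the example.

```python
from bisect import bisect_left

# A composite "index": entries sorted by name, then age, then sex.
index = sorted([
    ("Li Si", 25, "M"),
    ("Wang Wu", 20, "F"),
    ("Zhang San", 20, "F"),
    ("Zhang San", 20, "M"),
    ("Zhang San", 30, "F"),
])

def prefix_lookup(index, prefix):
    """Find entries whose leading columns equal `prefix` using binary search."""
    start = bisect_left(index, prefix)
    matches = []
    for entry in index[start:]:
        if entry[:len(prefix)] != prefix:
            break  # sorted order: nothing further can match
        matches.append(entry)
    return matches

# Full key: the sort order leads straight to the entry.
exact = prefix_lookup(index, ("Zhang San", 20, "F"))

# (name, sex) with age missing: seek on name, then filter on sex.
by_name_then_sex = [e for e in prefix_lookup(index, ("Zhang San",)) if e[2] == "F"]

# (age, sex) with name missing: the sort order does not help, so scan everything.
without_name = [e for e in index if e[1:] == (20, "F")]
```

Only the first two searches can jump into the sorted data. The third has to look at every entry, which is what a full scan means for a query that skips the leftmost column.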

Several major principles for building an index

(1) Leftmost prefix matching principle

For a multi-column index, matching always starts from the first field of the index and continues to the right, and no field in the middle can be skipped. For example, with a multi-column index (name, age, sex), the name field is matched first, then age, then sex. MySQL keeps matching to the right until it encounters a range condition (>, < and similar range operators), after which it stops matching.

Generally, when creating a multi-column index, the most frequently used column in the where clause is placed on the far left.

The following example compares a statement that violates the leftmost prefix matching principle with one that complies with it.

Example: Table c2c_db.t_credit_detail has an index (Flistid,Fbank_listid)

[Figure: the index (Flistid, Fbank_listid) on table t_credit_detail]

SQL statements that do not comply with the leftmost prefix matching principle:

select * from t_credit_detail where Fbank_listid='201108010000199'\G

This SQL statement uses the second index field Fbank_listid directly and skips the first index field Flistid, which does not comply with the leftmost prefix matching principle. Use the explain command to view the execution plan of the statement, as shown below:

[Figure: EXPLAIN output for the query on Fbank_listid only]

As can be seen from the above figure, the SQL does not use the index and performs an inefficient full table scan.

SQL statement that conforms to the leftmost prefix matching principle:

select * from t_credit_detail where Flistid='2000000608201108010831508721' and Fbank_listid='201108010000199'\G

This SQL uses the first field of the index, Flistid, and then the second field, Fbank_listid. Nothing is skipped in the middle, so it conforms to the leftmost prefix matching principle. Use the explain command to view the execution plan of the statement, as shown below:

[Figure: EXPLAIN output for the query on Flistid and Fbank_listid]

As can be seen from the above figure, the SQL uses the index and scans only one row.

The comparison shows that a statement that complies with the leftmost prefix matching principle is far more efficient than one that does not: the full table scan becomes an index lookup that touches a single row.

(2) Try to select columns with high differentiation as indexes.

For example, we will choose the student number as the index, but not the gender.
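One common way to put a number on "differentiation" is the ratio of distinct values to total rows: the closer to 1, the fewer rows each index lookup has to examine. The sketch below uses made-up sample data, and the helper name selectivity is an assumption for the example.

```python
def selectivity(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    return len(set(values)) / len(values)

student_numbers = ["S001", "S002", "S003", "S004"]
genders = ["F", "M", "F", "M"]

print(selectivity(student_numbers))  # 1.0
print(selectivity(genders))          # 0.5
```

On a real table the second ratio shrinks as rows are added, since gender has only a handful of distinct values no matter how large the table grows.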

(3) = and in can be out of order

For example, with an index on (a, b, c), the conditions a = 1 and b = 2 and c = 3 can be written in any order. The MySQL query optimizer will rearrange them into a form that the index can use.

(4) Index columns cannot participate in calculations, keep the columns "clean"

For example: Flistid+1 > '2000000608201108010831508721'. The reason is simple. If the index column takes part in a calculation, the expression has to be evaluated for every index entry before it can be compared, and the cost is obviously too high.
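The cost difference can be illustrated with a sorted list standing in for the index. This is only a sketch with made-up integer keys (the article's example uses string ids), and the function names are hypothetical.

```python
from bisect import bisect_right

keys = list(range(1, 21))  # sorted index entries 1..20 (made-up data)

def scan_with_expression(keys, bound):
    """key + 1 > bound: the expression is evaluated for every entry."""
    evaluated = 0
    hits = []
    for k in keys:
        evaluated += 1
        if k + 1 > bound:
            hits.append(k)
    return hits, evaluated

def seek_with_clean_column(keys, bound):
    """Rewritten as key > bound - 1: one binary search finds the start."""
    start = bisect_right(keys, bound - 1)
    return keys[start:]

hits, evaluated = scan_with_expression(keys, 15)
clean = seek_with_clean_column(keys, 15)
```

Both forms return the same entries, but the first evaluates the expression for all 20 keys, while the second moves the arithmetic to the constant side and keeps the column "clean" for a direct seek.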

(5) Expand the index as much as possible, do not create a new index.

For example, there is already an index of a in the table, and now you want to add an index of (a, b), then you only need to modify the original index.

Disadvantages of index

Although indexes can improve query efficiency, indexes also have their own shortcomings.

Additional overhead of index:

(1) Space: The index needs to occupy space;

(2) Time: Querying the index takes time;

(3) Maintenance: The index needs to be maintained (when data changes);

Cases where an index is not recommended:

(1) Table with a small amount of data

(2) Space is tight

Commonly used optimization summary

There are many optimization statements, and there are many things that need to be paid attention to. Here are a few points based on the usual situation:

1. An index exists but is not used (avoid these cases)

(1) When the parameters of Like start with a wildcard character

Try to avoid Like parameters that start with a wildcard character. Otherwise the database engine gives up using the index and performs a full table scan.

SQL statement starting with a wildcard, for example: select * from t_credit_detail where Flistid like '%0'\G

[Figure: EXPLAIN output for the like '%0' query]

This is a full table scan that does not use the index and is not recommended.

SQL statement that does not start with a wildcard, for example: select * from t_credit_detail where Flistid like '2%'\G

[Figure: EXPLAIN output for the like '2%' query]

Clearly, this query uses the index as a range search and is much more efficient than the statement starting with a wildcard character.

(2) When the where condition does not comply with the leftmost prefix principle

Examples have been given in the content of the leftmost prefix matching principle.

(3) Using the != or <> operators

Try to avoid the != and <> operators. Otherwise the database engine will usually give up using the index and perform a full table scan. It is more efficient to use > or <.

select * from t_credit_detail where Flistid != '2000000608201108010831508721'\G

[Figure: EXPLAIN output for the != query]

(4) Index columns participate in the calculation

Try to avoid expression operations on fields in the where clause, which cause the engine to give up using the index and perform a full table scan.

select * from t_credit_detail where Flistid+1 > '2000000608201108010831508722'\G

[Figure: EXPLAIN output for the query with a calculation on the index column]

(5) Testing a field for null

Try to avoid null checks on fields in the where clause. Otherwise the engine may give up using the index and perform a full table scan, for example:

Inefficient: select * from t_credit_detail where Flistid is null;

You can set a default value of 0 on Flistid, ensure that the Flistid column contains no null values, and then query like this:

Efficient: select * from t_credit_detail where Flistid = 0;

(6) Use or to connect conditions

Try to avoid connecting conditions with or in the where clause. Otherwise the engine may give up using the index and perform a full table scan, for example:

Inefficient: select * from t_credit_detail where Flistid = '2000000608201108010831508721' or Flistid = '10000200001';

You can replace the or query above with the following query:

Efficient: select * from t_credit_detail where Flistid = '2000000608201108010831508721' union all select * from t_credit_detail where Flistid = '10000200001';

2. Avoid select *

During parsing, '*' is expanded into all the column names in turn. This is done by querying the data dictionary, which means it takes more time.

So, you should develop a good habit of taking whatever you need.

3. Order by statement optimization

Any non-indexed items or calculated expressions in the Order by statement will slow the query down.

Methods:

1. Rewrite the order by statement so that it uses an index;

2. Create another index for the columns used;

3. Absolutely avoid using expressions in the order by clause.

4. GROUP BY statement optimization

Improve the efficiency of the GROUP BY statement by filtering out unnecessary records before the GROUP BY.

Inefficient:

SELECT JOB, AVG(SAL)

FROM EMP

GROUP by JOB

HAVING JOB = 'PRESIDENT'

OR JOB = 'MANAGER'

Efficient:

SELECT JOB , AVG(SAL)

FROM EMP

WHERE JOB = 'PRESIDENT'

OR JOB = 'MANAGER'

GROUP by JOB
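The effect of filtering before grouping can be mimicked in plain Python. The sketch below is an analogy with made-up EMP rows, not a MySQL benchmark; the helper avg_by_job and the sample salaries are assumptions for the example.

```python
rows = [
    ("PRESIDENT", 5000), ("MANAGER", 3000), ("MANAGER", 2500),
    ("CLERK", 1000), ("CLERK", 1200), ("ANALYST", 3100),
]
wanted = {"PRESIDENT", "MANAGER"}

def avg_by_job(rows):
    """Group salaries by job and return the average per job."""
    groups = {}
    for job, sal in rows:
        groups.setdefault(job, []).append(sal)
    return {job: sum(sals) / len(sals) for job, sals in groups.items()}

# HAVING style: group all six rows, then discard the unwanted groups.
having_style = {job: avg for job, avg in avg_by_job(rows).items() if job in wanted}

# WHERE style: discard unwanted rows first, then group only what is left.
filtered = [row for row in rows if row[0] in wanted]
where_style = avg_by_job(filtered)
```

Both approaches produce the same averages, but the WHERE style groups three rows instead of six, and the gap widens as the table grows.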

5. Use exists instead of in

In many cases it is a good choice to use exists instead of in. For example:

select num from a where num in(select num from b)

can be replaced with the following statement:

select num from a where exists(select 1 from b where num=a.num)

6. Use varchar/nvarchar instead of char/nchar

Use varchar/nvarchar instead of char/nchar as much as possible. First, variable-length fields take less storage space. Second, for queries, searching within a smaller field is obviously more efficient.

7. If you can use DISTINCT, you don’t need GROUP BY

SELECT OrderID FROM Details WHERE UnitPrice > 10 GROUP BY OrderID

can be changed to:

SELECT DISTINCT OrderID FROM Details WHERE UnitPrice > 10

8. If you can use UNION ALL, don’t use UNION

UNION ALL does not perform the SELECT DISTINCT step, which saves a lot of unnecessary resources.

9. Use columns of the same type when joining tables, and index them

If the application has many JOIN queries, make sure that the join fields in both tables are indexed. This lets MySQL optimize the join internally.

Moreover, the fields used for the join should be of the same type. For example, if you join a DECIMAL field with an INT field, MySQL cannot use their indexes. String fields also need to have the same character set. (The character sets of the two tables may be different.)

Statement

This article is reproduced from segmentfault. If there is any infringement, please contact admin@php.cn to have it removed.
