Mysql Tutorial

Cry..I thought I knew MySQL indexes well

coldplay.xixi

Nov 04, 2020 pm 05:24 PM

mysql index

mysql video tutorial column introduces the real index.

Cry..I thought I knew MySQL indexes well

Related free learning recommendations: mysql video tutorial

1. What is an index?

In a relational database, an index is a separate, physical storage structure that sorts the values of one or more columns in a database table. It is a collection of the values of one or several columns in a table. Collection and corresponding list of logical pointers to the data pages in the table that physically identify these values. The index is equivalent to the table of contents of a book. You can quickly find the required content based on the page numbers in the table of contents.

When there are a large number of records in the table, if you want to query the table, the first way to search for information is to search the whole table, which is to take out all the records one by one, compare them one by one with the query conditions, and then return Records that meet the conditions will consume a lot of database system time and cause a lot of disk I/O operations; the second is to create an index in the table, then find the index value that meets the query conditions in the index, and finally save it in the index The ROWID (equivalent to the page number) in the table can quickly find the corresponding record in the table.

The index data structure used by the InnoDB storage engine after MySQL5.5 mainly uses: B Tree; this article will take you to talk about the past and present life of B Tree;

**Mark**

：

B Tree can be used for , >=, BETWEEN, IN , and LIKE usage indexes that do not start with a wildcard. (After MySQL 5.5)

These facts may subvert some of your perceptions, such as in other articles or books you have read. All of the above are "range queries" and are not indexed!

That’s right, before 5.5, the optimizer would not choose to search through the index. The optimizer believed that the rows retrieved in this way were more than those of the full table scan, because it would have to go back to the table to check again, which may involve The number of I/O rows is larger and is abandoned by the optimizer.

After optimization of the algorithm (B Tree), it supports scanning of some range types (taking advantage of the orderliness of the B Tree data structure). This approach also violates the leftmost prefix principle, resulting in the condition after the range query being unable to use the joint index, which we will explain in detail later.

2. Advantages and Disadvantages of Index

1. Advantages

The index greatly reduces the amount of data that the server needs to scan
The index can help The server avoids sorting and temporary tables
Indexes can turn random I/O into sequential I/O

2. Disadvantages

Although indexes are greatly improved It improves the query speed, but it also reduces the speed of updating the table, such as INSERT, UPDATE and DELETE on the table. Because when updating the table, MySQL not only needs to save the data, but also save the index file.
Creating index files will occupy disk space. Generally speaking, this problem is not serious, but if you create multiple combination indexes on a large table and insert a large amount of data, the index file size will expand rapidly.
If a data column contains a lot of repeated content, indexing it will not have much practical effect.
For very small tables, a simple full table scan is more efficient in most cases;

Therefore, only the most frequently queried and most frequently sorted data columns should be indexed. (The total number of indexes in the same data table in MySQL is limited to 16)

　One of the meanings of the existence of a database is to solve data storage and fast search. So where does the data in the database exist? That's right, it's a disk. What are the advantages of a disk? Cheap! What about the disadvantages? Slower than memory access.

So do you know the data structures mainly used by MySQL indexes?

B tree! you blurt out.

What kind of data structure is the B-tree? Why did MySQL choose B-tree for index?

In fact, the final choice of B-tree went through a long process of evolution:

#Binary sorting tree → Binary balanced tree → B-Tree（B tree） → B Tree（B tree）

A friend asked me “What is the difference between B-tree and B-tree?” ? To generalize here, MySQL data structures only have B-Tree (B tree) and B Tree (B tree). Most of them are just different pronunciations. "B-Tree" is generally called B-tree. You can also call it B-tree~ ~

Also, the red-black tree mentioned by a friend is a storage structure in a programming language, not MySQL; for example, Java's HashMap uses a linked list plus a red-black tree.

Okay, today I will take you through the process of evolving into a B-tree.

3. The past and present life of B Tree index

1. Binary sorting tree

Before understanding the B-tree, let’s briefly talk about the binary sorting tree. For a node, its left child The value of the child node of the tree must be less than itself, and the value of the child node of its right subtree must be greater than itself. If all nodes meet this condition, then it is a binary sorted tree. (Here you can string together the knowledge points of binary search)

The picture above is a binary sorting tree. You can try to use its characteristics to experience the process of finding 9:

9 is smaller than 10, go to its left subtree (node 3) to find
9 is larger than 3, go to the right subtree of node 3 (node 4) to find
9 is larger than 4, go to the right subtree of node 4 (node 9) to find
Node 9 is equal to 9, the search is successful

Compared 4 times in total, then you Have you ever thought about how to optimize the above structure?

2. AVL tree (self-balancing binary search tree)

The above picture is an AVL tree. The number and value of nodes are equal to the binary sorting tree. Exactly the same

Let’s look at the process of finding 9:

9 is larger than 4, go to its right subtree to find
9 is smaller than 10, go Its left subtree searches for
nodes 9 and 9, which are equal, and the search is successful

. It has been compared a total of 3 times, and the same amount of data is once less than the binary sorting tree. Why? ? Because the height of the AVL tree is smaller than that of the binary sorting tree, the higher the height, the more the number of comparisons. Don’t underestimate this optimization. If it is 2 million pieces of data, the number of comparisons will be significantly different.

You can imagine a balanced binary tree with 1 million nodes and a tree height of 20. A query may need to access 20 data blocks. In the era of mechanical hard disks, it took about 10 ms of seek time to randomly read a data block from the disk. In other words, for a table with 1 million rows, if a binary tree is used to store it, it may take 20 10 ms to access a single row. This query is really slow!

3. B-tree (Balanced Tree) Multi-way balanced search tree Multi-fork

B-tree is a multi-way self-balancing search tree. It is similar to an ordinary binary tree, but the B tree allows Each node has more child nodes. The schematic diagram of the B-tree is as follows:

Characteristics of the B-tree:

All key values are distributed throughout the tree
Any keyword appears and only appears in one node
The search may end at a non-leaf node
A search is performed within the full set of keywords, and the performance is close to the binary search algorithm

In order to improve efficiency, the number of disk I/Os should be reduced as much as possible. In the actual process, the disk is not read strictly on demand every time, but is read ahead every time.

After the disk reads the required data, it will read some more data into the memory in order. The theoretical basis for this is the locality principle noted in computer science:

Since disk sequential reading is very efficient (no addressing time required, only little rotation time), for programs with locality, read-ahead can improve I/O efficiency. The length of read-ahead Generally, it is an integral multiple of page.
MySQL (using the InnoDB engine by default) manages records in pages, and the default size of each page is 16K (can be modified).

B-Tree uses the computer disk read-ahead mechanism:

Every time a new node is created, a page of space is applied for, so each time a node is searched, only one I/ O;Because in actual applications, the node depth will be very small, so the search efficiency is very high.

So how is the final version of the B-tree made?

4. B Tree (B tree is a variant of B tree and also a multi-path search tree)

You can also see from the picture , The difference between B-tree and B-tree is:

All keywords are stored in leaf nodes, non-leaf nodes do not store real data, so leaf nodes can be quickly located .
Adds a chain pointer to all leaf nodes, which means that all values are stored in order, and the distance from each leaf page to the root is the same, which is very suitable for searching range data .

** Therefore, B Tree can use indexes for , >=, BETWEEN, IN, and LIKE that does not start with a wildcard character. **

Advantages of B tree:

The number of comparisons is balanced, reducing the number of I/Os, improving the search speed, and making the search more stable.

The disk read and write cost of B tree is lower
The query efficiency of B tree is more stable

What you need to know is that every time you create a table, the system will Automatically create an ID-based clustered index (the above-mentioned B-tree) for you to store all data; every time you add an index, the database will create an additional index for you (the above-mentioned B-tree), and the number of fields selected by the index is each The number of node storage data indexes. Note that this index does not store all data.

4. Why does MySQL choose B-tree instead of B-tree for index?

B tree is more suitable for external storage (usually disk storage). Since internal nodes (non-leaf nodes) do not store data, one node can store more internal nodes, and each node can be indexed The range is larger and more precise. In other words, the amount of information in a single disk I/O using B-tree is larger than that of B-tree, and the I/O efficiency is higher.
Mysql is a relational database, and an index column is often accessed according to the interval. Link pointers are established in order between the leaf nodes of the B-tree, which enhances interval access, so the B-tree has Interval range queries are friendly. The key and data of each node of the B-tree are together, so interval search cannot be performed.

5. Programmers, index knowledge points you should know

1. Table return query

For example, if you create name, age index name_age_index, query the data

select * from table where name ='陈哈哈' and age = 26;
1复制代码

is used because there are only name and age in the additional index. Therefore, after hitting the index, the database must go back to the clustered index to find other data. This is table return. This is also the one you memorize: use less The reason for select *.

2. Index coverage

It will be better understood when combined with the return table. For example, in the above name_age_index index, there is a query

select name， age from table where name ='陈哈哈' and age = 26;
1复制代码

At this time, the selected field name, age is in the index name_age_index can be obtained, so there is no need to return to the table, index coverage is satisfied, and the data in the index is returned directly, which is highly efficient. It is the preferred optimization method for DBA students when optimizing.

3. The leftmost prefix principle

The node storage index order of the B tree is stored from left to right. When matching, it is natural to satisfy the left-to-right matching; usually when we create When jointly indexing, that is, indexing multiple fields. I believe students who have established indexes will find that both Oracle and MySQL will let us choose the order of the index. For example, we want to index the three fields a, b, and c. To create a joint index, we can choose the priority we want, a, b, c, or b, a, c or c, a, b, etc. in order. Why does the database let us choose the order of fields? Aren’t they all joint indexes of three fields? This leads to the principle of the leftmost prefix of database indexes.

In our development, we often encounter the problem that a joint index is built for this field, but the index is not used when SQL queries this field. For example, the index abc_index: (a, b, c) is a joint index of the three fields a, b, c. When the following sql is executed, the index abc_index cannot be hit;

select * from table where c = '1';

select * from table where b ='1' and c ='2';
123复制代码

The following three situations will cause the index to be lost. :

select * from table where a = '1';

select * from table where a = '1' and b = '2';

select * from table where a = '1' and b = '2'  and c='3';
12345复制代码

Do you have any clues from the above two examples?

Yes, the index abc_index: (a,b,c) will only be used in three types of queries: (a), (a,b), and (a,b,c). In fact, there is a bit of ambiguity here. In fact, (a,c) will also be used, but only the a field index will be used, and the c field will not be used.

In addition, there is a special case. In the following type, only a and b will be indexed, and c will not be indexed.

select * from table where a = '1' and b > '2'  and c='3';
1复制代码

For a sql statement of the above type, after a and b are indexed, c is already out of order, so c cannot be indexed, and the optimizer will think that it is not as good as the full index. Table scan c field comes quickly.

**Leftmost prefix: As the name suggests, it means leftmost priority. In the above example, we created a_b_c multi-column index, which is equivalent to creating (a) single column index and (a, b) combined index. And (a,b,c) combination index. **

Therefore, when creating a multi-column index, the most frequently used column in the where clause should be placed on the far left according to business needs.

4. Index pushdown optimization

Or index name_age_index, the following sql

select * from table where name like '陈%' and age > 26；
1复制代码

This statement has two execution possibilities:

Hit The name_age_index joint index queries all data that satisfies the name starting with "陈", and then returns to the table to query all satisfying rows.
Hit the name_age_index joint index, query all data whose name starts with "Chen", and then filter out the index with age>20, and then return to the table to query the entire row of data.

Obviously, the second method returns fewer rows to the table and the number of I/Os will also be reduced. This is index pushdown. So not all likes will not hit the index.

6. Precautions when using indexes

1. The index will not include columns with null values

As long as the columns contain null values, they will not be included. In the index, as long as one column in the composite index contains a null value, then this column is invalid for this composite index. Therefore, we recommend not to let the default value of a field be null when designing the database.

2、使用短索引

对串列进行索引，如果可能应该指定一个前缀长度。例如，如果有一个char(255)的列，如果在前10个或20个字符内，多数值是惟一的，那么就不要对整个列进行索引。短索引不仅可以提高查询速度而且可以节省磁盘空间和I/O操作。

3、索引列排序

查询只使用一个索引，因此如果where子句中已经使用了索引的话，那么order by中的列是不会使用索引的。因此数据库默认排序可以符合要求的情况下不要使用排序操作；尽量不要包含多个列的排序，如果需要最好给这些列创建复合索引。

4、like语句操作

一般情况下不推荐使用like操作，如果非使用不可，如何使用也是一个问题。like “%陈%” 不会使用索引而like “陈%”可以使用索引。

5、不要在列上进行运算

这将导致索引失效而进行全表扫描，例如

SELECT * FROM table_name WHERE YEAR(column_name)<h2 id="不使用not-in和操作">6、不使用not in和操作</h2><p>这不属于支持的范围查询条件，不会使用索引。</p><h1 id="我的体会">我的体会</h1><p>  曾经，我一度以为我很懂MySQL。</p><p>  刚入职那年，我还是个孩子，记得第一个需求是做个统计接口，查询近两小时每隔5分钟为一时间段的网站访问量，JSONArray中一共返回24个值，当时菜啊，写了个接口循环二十四遍，发送24条SQL去查（捂脸），由于那个接口，被技术经理嘲讽~~表示他写的SQL比我吃的米都多。虽然我们山东人基本不吃米饭，但我还是羞愧不已。。<br>然后经理通过调用一个dateTime函数分组查询处理一下，就ok了，效率是我的几十倍吧。从那时起，我就定下目标，深入MySQL学习，万一日后有机会嘲讽回去？</p><p>  筒子们，MySQL路漫漫，其修远兮。永远不要眼高手低，一起加油，希望本文能对你有所帮助。</p>

The above is the detailed content of Cry..I thought I knew MySQL indexes well. For more information, please follow other related articles on the PHP Chinese website!

Statement

This article is reproduced at:juejin. If there is any infringement, please contact admin@php.cn delete

mysql索引失效的几种情况Feb 21, 2024 pm 04:23 PM

常见情况：1、使用函数或运算；2、隐式类型转换；3、使用不等于（!=或<>）；4、使用LIKE操作符，并以通配符开头；5、OR条件；6、NULL值；7、索引选择性低；8、复合索引的最左前缀原则；9、优化器决策；10、FORCE INDEX和IGNORE INDEX。

mysql索引什么情况下会失效Aug 09, 2023 pm 03:38 PM

mysql索引在不使用索引列进行查询、数据类型不匹配、前缀索引的使用不当、使用函数或表达式进行查询、索引列的顺序不正确、数据更新频繁和索引过多或过少情况下会失效。1、不使用索引列进行查询，为了避免这种情况，应该在查询中使用适当的索引列；2、数据类型不匹配，在设计表结构时，应该确保索引列和查询的数据类型匹配；3、前缀索引的使用不当，可使用前缀索引。

MySQL索引左前缀匹配规则Feb 24, 2024 am 10:42 AM

MySQL索引最左原则原理及代码示例在MySQL中，索引是提高查询效率的重要手段之一。其中，索引最左原则是我们在使用索引优化查询的过程中需要遵循的一个重要原则。本文将围绕MySQL索引最左原则的原理进行介绍，并给出一些具体的代码示例。一、索引最左原则的原理索引最左原则是指在一个索引中，如果查询条件是由多个列组成的，那么只有按照索引中的最左侧列进行查询，才能充

如何合理使用MySQL索引，优化数据库性能？技术同学须知的设计规约！Sep 10, 2023 pm 03:16 PM

如何合理使用MySQL索引，优化数据库性能？技术同学须知的设计规约！引言：在当今互联网时代，数据量不断增长，数据库性能优化成为了一个非常重要的课题。而MySQL作为最流行的关系型数据库之一，索引的合理使用对于提升数据库性能至关重要。本文将介绍如何合理使用MySQL索引，优化数据库性能，并为技术同学提供一些设计规约。一、为什么要使用索引？索引是一种数据结构，用

mysql索引的分类有哪几种Apr 22, 2024 pm 07:12 PM

MySQL 索引分为以下类型：1. 普通索引：匹配值、范围或前缀；2. 唯一索引：确保值唯一；3. 主键索引：主键列的唯一索引；4. 外键索引：指向另一表主键；5. 全文索引：全文搜索；6. 哈希索引：相等匹配搜索；7. 空间索引：地理空间搜索；8. 复合索引：基于多个列的搜索。

PHP与MySQL索引的数据更新和索引维护的性能优化策略及其对性能的影响Oct 15, 2023 pm 12:15 PM

PHP与MySQL索引的数据更新和索引维护的性能优化策略及其对性能的影响摘要：在PHP与MySQL的开发中，索引是优化数据库查询性能的重要工具。本文将介绍索引的基本原理和使用方法，并探讨索引对数据更新和维护的性能影响。同时，本文还提供了一些性能优化策略和具体的代码示例，帮助开发者更好地理解和应用索引。索引的基本原理和使用方法在MySQL中，索引是一种特殊的数

如何在MySQL中创建唯一索引来确保数据唯一性Mar 15, 2024 pm 12:45 PM

标题：MySQL中创建唯一索引来确保数据唯一性的方法及代码示例在数据库设计中，确保数据的唯一性是非常重要的，可以通过在MySQL中创建唯一索引来实现。唯一索引可以保证表中某列（或列组合）的数值是唯一的，如果尝试插入重复值，MySQL会阻止这种操作并报错。本文将介绍如何在MySQL中创建唯一索引，同时提供具体的代码示例。什么是唯一索引唯一索引是一种索引类型，它

MySQL索引是什么Aug 31, 2023 pm 05:43 PM

MySQL索引是一种用于提高数据库查询性能的数据结构。它是在数据库表中的一个或多个列上创建的，以帮助数据库系统快速定位和检索数据。索引可以类比为书籍的目录，它们提供了快速访问数据的方式，而不需要扫描整个表，通过合理地创建索引，可以加快查询速度，提高数据库的性能。

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)

2 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

Repo: How To Revive Teammates

1 months agoBy尊渡假赌尊渡假赌尊渡假赌

Hello Kitty Island Adventure: How To Get Giant Seeds

4 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

How Long Does It Take To Beat Split Fiction?

3 weeks agoByDDD

R.E.P.O. Save File Location: Where Is It & How to Protect It?

3 weeks agoByDDD

Hot Tools

mPDF

mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),