In-depth understanding of MySQL index structure

WBOY

Mar 30, 2022 pm 06:13 PM

This article looks at the index structures used in MySQL. What does an index actually look like, and why is an indexed lookup so fast? Let's take a look below; I hope it will be helpful to everyone.

Database storage unit

First of all, for the sake of persistence, an index has to be stored on disk, so querying through an index incurs disk I/O. An index should therefore be designed to minimize the number of lookups, and with it the time spent on I/O.

Another very important principle: the basic unit in which the database manages storage space is the page, and one page holds multiple row records.

The computer system applies read-ahead optimization to disk I/O: when an I/O is performed, adjacent data is read along with the data at the requested disk address. In the in-memory buffer pool, the data read by each I/O forms one page. InnoDB's default page size is 16 KB.

64 consecutive pages form an extent (1 MB at the default page size), one or more extents form a segment, and one or more segments form a tablespace. InnoDB has two tablespace types. With a shared tablespace, multiple tables share one tablespace. With an independent tablespace, the data and indexes of each table are stored in a tablespace of their own.

A data page consists of 7 parts (the original diagram is from Geek Time's "MySQL Must Know"), which can be roughly divided into the following three categories:

General file part, used to verify that the page was transferred completely
- File Header: general information about the page. The FIL_PAGE_PREV and FIL_PAGE_NEXT fields in the file header point to the previous and next data pages, forming a doubly linked list.
- Page Header: records the status information of the page
- File Trailer: verifies whether the page is complete
Record part, used to store data records
- Infimum/Supremum (minimum and maximum records): virtual row records representing the smallest and largest record of the data page.
- User Records and Free Space: used to store the row record content
Index part, used to improve record retrieval efficiency
- Page Directory: stores the relative positions of the user records

For details, please refer to Taobao’s database kernel monthly report

Index data structure

Naturally, the common data structures from search algorithms come to mind, such as binary search trees and balanced binary trees. In fact, InnoDB's indexes are implemented with B+ trees. Let's take a look at why this index structure was chosen.

Limitations of binary trees

Let's briefly review the definition of a binary search tree. To find a key, compare it with the root node: if the key is greater, continue in the right subtree; if it is smaller, continue in the left subtree; repeat until the key is found. When the tree is balanced, the time complexity is O(log n). For example, inserting the sequence [4,2,6,1,3,5,7] produces a balanced tree of depth 3, with 4 at the root, 2 and 6 as its children, and 1, 3, 5, 7 as the leaves.

However, in some special cases the depth of a binary search tree becomes very large. For example, inserting [1,2,3,4,5,6,7] in order produces a tree in which every node has only a right child, so the tree degenerates into a linked list.

In that situation, the worst case takes 7 comparisons to find the desired result, and the query time becomes O(n).
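The two insertion orders above can be compared with a small sketch. This is a plain, unbalanced binary search tree written for illustration only; the names `BSTNode`, `build`, `height` and `nodes_visited` are hypothetical helpers, not anything from MySQL.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def build(keys):
    """Insert keys one by one into an unbalanced binary search tree."""
    root = None
    for key in keys:
        if root is None:
            root = BSTNode(key)
            continue
        node = root
        while True:
            if key < node.key:
                if node.left is None:
                    node.left = BSTNode(key)
                    break
                node = node.left
            else:
                if node.right is None:
                    node.right = BSTNode(key)
                    break
                node = node.right
    return root


def height(node):
    """Number of nodes on the longest path from this node down to a leaf."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def nodes_visited(root, key):
    """Count how many nodes are examined while searching for key."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return visited


balanced = build([4, 2, 6, 1, 3, 5, 7])
degenerate = build([1, 2, 3, 4, 5, 6, 7])
print(height(balanced), nodes_visited(balanced, 7))      # 3 3
print(height(degenerate), nodes_visited(degenerate, 7))  # 7 7
```

The same seven keys give a tree of height 3 or height 7 depending only on insertion order, which is why a plain binary search tree cannot guarantee O(log n) lookups.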

To address this, there is the balanced binary search tree (AVL tree), in which the heights of the left and right subtrees of every node differ by at most 1. Its search time complexity is O(log n), which already makes it a good search tree. But in a database with tens of millions of rows the tree is still very deep (a binary tree holding 10 million keys needs at least 24 levels), so it is still not the ideal structure.

B-tree

So what if we go from a binary tree to an N-ary tree? It is easy to see that an N-ary tree can greatly reduce the depth of the tree. In fact, a 4-level tree structure can already hold tens of terabytes of data.
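A rough back-of-the-envelope calculation shows where a figure of that size can come from. Only the 16 KB page size comes from this article; the key size, pointer size and average row size below are assumptions chosen for illustration, and the sketch ignores page headers and fill factor, so real capacities are lower.

```python
PAGE_SIZE = 16 * 1024   # InnoDB default page size (from the article)
KEY_BYTES = 8           # assumption: an 8-byte integer key
POINTER_BYTES = 6       # assumption: size of a child-page pointer
ROW_BYTES = 1024        # assumption: average row size of 1 KB


def fanout():
    """Approximate number of (key, pointer) pairs per internal page."""
    return PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)


def rows_per_leaf():
    """Approximate number of rows that fit in one leaf page."""
    return PAGE_SIZE // ROW_BYTES


def max_rows(levels):
    """Rows addressable by a tree with the given number of levels."""
    return fanout() ** (levels - 1) * rows_per_leaf()


for levels in (2, 3, 4):
    rows = max_rows(levels)
    tib = rows * ROW_BYTES / 2 ** 40
    print(f"{levels} levels: {rows:,} rows, about {tib:.2f} TiB")
```

Under these assumptions a 3-level tree addresses about 21.9 million rows and a 4-level tree about 25.6 billion rows, which is roughly 24 TiB of 1 KB rows.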

The B-tree (balanced tree) is such an N-ary tree. The name is also written "B- tree", where the hyphen is not a minus sign. A B-tree satisfies the following definition. Let k be the degree of the B-tree, that is, the maximum number of child nodes a node can have:

- Each disk block (node) contains at most k - 1 keys and k pointers to child nodes.
- Leaf nodes contain only keys and have no child pointers.
- The keys within each node are arranged in ascending order. For each key, all keys in its left subtree are smaller than it, and all keys in its right subtree are greater than it.
- All leaf nodes are on the same level.

As mentioned above, each I/O reads ahead one disk block of data, which is one page in size, so the content of one disk block stands for one I/O. The original article illustrates the B-tree structure with a figure (source: Geek Time "SQL Must Know"), which the description below refers to.

A B-tree is also ordered. Because a node always has one more child pointer than it has keys, the keys partition the key ranges covered by the child subtrees. In the figure's example each node has 2 keys and 3 child nodes. Take disk block 2 (8, 12): the keys of its first child, 3 and 5, are less than its first key 8; the keys of its second child, 9 and 10, lie between 8 and 12; and the keys of its third child, 13 and 15, are greater than its second key 12.

Suppose we now want to find 9. The steps are as follows:

- Compare with the root node, disk block 1 (17, 35): 9 is less than 17, so follow pointer P1 to disk block 2.
- Compare with disk block 2 (8, 12): 9 lies between the two keys, so follow pointer P2 to disk block 6.
- Compare with disk block 6 (9, 10): 9 is found.

As you can see, although many comparisons are performed, thanks to read-ahead the comparisons within a disk block happen in memory and cost no disk I/O. The whole lookup needs only 3 I/Os, which is already a good result.
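The lookup above can be sketched in a few lines, counting one page read per node visited. Disk blocks 1, 2 and the three children of block 2 use the keys given in the text; the other two subtrees of the root are not described there, so their keys are made-up placeholders. `BTreeNode` and `btree_search` are hypothetical names for this illustration.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted keys stored in this disk block
        self.children = children or []   # empty list for leaf nodes


def btree_search(root, key):
    """Return (found, pages_read), counting one page read per node visited."""
    pages_read = 0
    node = root
    while node is not None:
        pages_read += 1
        # Binary search inside the block happens in memory, not on disk.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True, pages_read
        if not node.children:
            return False, pages_read
        node = node.children[i]
    return False, pages_read


# Keys taken from the text.
block2 = BTreeNode([8, 12], [BTreeNode([3, 5]),
                             BTreeNode([9, 10]),
                             BTreeNode([13, 15])])
# Placeholder subtrees: keys invented for illustration only.
block3 = BTreeNode([26, 30], [BTreeNode([20, 22]),
                              BTreeNode([28, 29]),
                              BTreeNode([31, 33])])
block4 = BTreeNode([65, 87], [BTreeNode([36, 60]),
                              BTreeNode([75, 79]),
                              BTreeNode([90, 99])])
root = BTreeNode([17, 35], [block2, block3, block4])

print(btree_search(root, 9))    # (True, 3)
print(btree_search(root, 17))   # (True, 1)
print(btree_search(root, 11))   # (False, 3)
```

Note that in a B-tree a key stored in an internal node, such as 17, is found after a single page read, so lookup cost varies with where the key lives.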

B+ tree index

The B+ tree is a further improvement on the B-tree. The differences between the B+ tree and the B-tree are as follows:

- In a B+ tree, for each key in a parent node, all keys in its left subtree are less than it, and all keys in its right subtree are greater than or equal to it.
- Non-leaf nodes are used only for indexing and store no data records.
- The keys of a parent node also appear in its child nodes, where they are the maximum (or minimum) key of the child.
- All keys appear in the leaf nodes, and the leaf nodes form an ordered linked list, sorted in ascending order.

In the example from the original figure (source: Geek Time "SQL Must Know"), the keys of a parent node are the minimum keys of its child nodes. Suppose we want to find the key 16. The search steps are as follows:

- Compare with the root node, disk 1 (1, 18, 35): 16 lies between 1 and 18, so take pointer P1 to disk 2.
- Search disk 2 (1, 8, 14): 16 is greater than 14, so take pointer P3 to disk 7.
- Search disk 7 (14, 16, 17): 16 is found.
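A minimal sketch of this lookup, using the "parent key is the minimum of the child" convention from the example. Disks 1, 2 and 7 use the keys from the text; every other node is a made-up placeholder so that the tree is complete. `Internal`, `Leaf` and `bplus_lookup` are hypothetical names.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, keys):
        self.keys = keys


class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # keys[i] is the smallest key under children[i]
        self.children = children


def bplus_lookup(root, key):
    """Return (found, pages_read). The search always descends to a leaf."""
    pages_read = 1
    node = root
    while isinstance(node, Internal):
        # Pick the last child whose minimum key is <= key.
        i = max(bisect_right(node.keys, key) - 1, 0)
        node = node.children[i]
        pages_read += 1
    return key in node.keys, pages_read


disk7 = Leaf([14, 16, 17])                        # from the text
disk2 = Internal([1, 8, 14],                      # from the text
                 [Leaf([1, 3, 5]),                # placeholder leaf
                  Leaf([8, 9, 11]),               # placeholder leaf
                  disk7])
disk3 = Internal([18], [Leaf([18, 20, 30])])      # placeholder subtree
disk4 = Internal([35], [Leaf([35, 40, 50])])      # placeholder subtree
disk1 = Internal([1, 18, 35], [disk2, disk3, disk4])

print(bplus_lookup(disk1, 16))  # (True, 3)
print(bplus_lookup(disk1, 1))   # (True, 3)
print(bplus_lookup(disk1, 2))   # (False, 3)
```

Unlike the B-tree case, key 1 appears in the root but the lookup still goes down to a leaf, so every search in this tree reads the same number of pages.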

Advantages of the B+ tree:

- Internal nodes store no data, so each internal node holds far more keys than a B-tree node of the same size. The tree is shorter, fewer I/Os are needed, and the page read by each I/O carries more index entries.
- Range queries are supported: just traverse the ordered linked list formed by the leaf nodes.
- All data is stored in the leaf nodes, so every lookup descends the same number of levels and query performance is more stable.
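The range-query advantage can be illustrated with just the leaf level. In this sketch the leaves are linked in ascending order and a scan walks the list until it passes the upper bound. The leaf contents are illustrative, and `LeafPage` and `range_scan` are hypothetical names; a real engine would first descend the tree to find the starting leaf instead of being handed it.

```python
class LeafPage:
    def __init__(self, keys):
        self.keys = keys     # sorted keys stored in this leaf
        self.next = None     # pointer to the next leaf in key order


def link(leaves):
    """Chain the leaves into a singly linked list and return the first one."""
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return leaves[0]


def range_scan(start_leaf, low, high):
    """Collect all keys with low <= key <= high, starting from start_leaf."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result    # keys are sorted, so nothing later can match
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


first = link([LeafPage([1, 3, 5]),
              LeafPage([8, 9, 11]),
              LeafPage([14, 16, 17]),
              LeafPage([18, 20, 30])])

print(range_scan(first, 9, 18))   # [9, 11, 14, 16, 17, 18]
```

The scan never goes back up to an internal node: once the first matching leaf is known, the remaining pages are reached through the `next` pointers alone.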

Hash index

The default index structure of MySQL's MEMORY storage engine is the hash index. A hash function uses a specific algorithm (such as MD5, SHA-1 or SHA-2) to convert input of arbitrary length into output of fixed length. The same input always produces the same output, although different inputs can collide on the same output. This article will not give an in-depth introduction to hash functions. For details, please refer to Baidu Encyclopedia.

A hash lookup takes O(1) time on average, which is very efficient. Python's dict, Go's map and Java's HashMap are all implemented with hash tables, and key-value databases such as Redis are built on hashing as well.

For exact-match lookups, hash indexes are more efficient than B+ tree indexes, but hash indexes have some limitations and are therefore not the most mainstream index structure:

- Because the data a hash index points to is unordered, a hash index cannot serve range queries, nor does it support ORDER BY sorting.
- Because hashing is an exact match, fuzzy queries cannot use it.
- A hash index does not support the leftmost-prefix rule of composite indexes. A composite index takes effect only when all of its columns are matched, because the hash value is computed over the indexed columns combined rather than for each column separately.
- If the indexed column has many duplicate values, a large number of hash collisions occur and queries become very time-consuming.
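The composite-index limitation can be illustrated with a Python dict standing in for a hash index. This is only an analogy, not how MySQL stores a hash index; the sample rows and the helper names `build_hash_index`, `lookup_exact` and `lookup_by_last_name` are made up for the sketch.

```python
from collections import defaultdict

# Hypothetical rows: (row_id, last_name, first_name)
rows = [
    (1, "Smith", "Anna"),
    (2, "Smith", "Bob"),
    (3, "Jones", "Anna"),
]


def build_hash_index(rows):
    """Index on (last_name, first_name): both columns are hashed together."""
    index = defaultdict(list)
    for row_id, last_name, first_name in rows:
        index[(last_name, first_name)].append(row_id)
    return index


def lookup_exact(index, last_name, first_name):
    """Full-key match: a single hash probe."""
    return index.get((last_name, first_name), [])


def lookup_by_last_name(rows, last_name):
    """Prefix match: the hash index cannot help, so every row is scanned."""
    return [row_id for row_id, last, _ in rows if last == last_name]


index = build_hash_index(rows)
print(lookup_exact(index, "Smith", "Bob"))   # [2]
print(index.get(("Smith",), []))             # [] because the prefix alone hashes differently
print(lookup_by_last_name(rows, "Smith"))    # [1, 2]
```

Because the hash of `("Smith", "Bob")` says nothing about the hash of `("Smith",)`, a query on the leading column alone has to fall back to scanning, whereas an ordered tree index keeps all "Smith" entries next to each other.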

For these reasons, MySQL's InnoDB engine does not support user-defined hash indexes, but its in-memory structures include an adaptive hash index feature. When an index value is used very frequently, InnoDB automatically builds a hash index on top of the B+ tree index to improve query performance.

The adaptive hash index can be understood as a kind of "index of indexes". It stores the addresses of pages in the B+ tree index so that the corresponding leaf node can be located quickly. Whether it is enabled can be checked through the innodb_adaptive_hash_index variable.

Statement

This article is reproduced from CSDN. If there is any infringement, please contact admin@php.cn to have it removed.
