Finally understand that MySQL index needs to use B+tree, and it's so fast
mysql tutorial column introduces the B tree of understanding the index.
Free recommendation: mysql tutorial(Video)
Preface
When you encounter a slow SQL
and need to optimize it, what optimization method can you think of immediately?
Most people’s first reaction may be Add index. In most cases, index can add a query to a SQL
statement. The efficiency is improved by several orders of magnitude.
Essence of index: A data structure used to quickly find records.
Commonly used indexesData structure:
- Binary tree
- Red-black tree
- Hash table
-
B-tree
(B tree is not called B minus tree) B tree
Data structure GraphicalWebsite: https://www.cs.usfca.edu/~galles/visualization/Algorithms.html
Index query
Everyone knowsselect * from t where col = 88
If such a SQL
statement is searched without using the index, the normal search is full table scan: search row by row starting from the first row of the table , comparing the value of the col
field in each row with 88, this is obviously very inefficient.
And if you use an index, the query process is completely different (assuming that a balanced binary tree data structure is used to store our index columns )
The storage structure of the binary tree at this time (Key - Value): Key is the data of the index field, and Value is the disk file address of the row where the index is located.
When 88 is finally found, you can take out the disk file address corresponding to its Value, and then go directly to the disk to find this line of data. The speed at this time It will be much faster than a full table scan.
Butactually MySQL
The bottom layer does not use binary tree to store index data, it uses B tree (B tree) .
Why not use a binary tree
Assuming that an ordinary binary tree is used to record the id
index column, we must maintain the binary tree index field while inserting a row of records.
At this time, when I want to find the data with id = 7
, the search process is as follows:
At this time, we searched for the row id = 7
7 times, which is not much different from our full table scan. Obviously, the binary tree is actually a data structure that is not suitable as an index for this increasing data column.
Why not use Hash table
Hash table: a data structure for fast search, the time complexity of search is O(1)
Hash function: convert a Any type of key can be converted into an int type subscript
Assuming that the Hash table is used to record the id
index column, we insert a row of records at the same time Maintain Hash table index fields.
At this time, the tree node with id = 7
was searched only 1 times, which is very efficient.
But the index of MySQL
stilldoes not use the Hash table that can be accurately positioned. Because it does not apply to range queries .
The red-black tree is a specialized AVL tree (balanced binary tree), which is maintained through specific operations during insertion and deletion operations. Balance of binary search trees; If a binary search tree is a red-black tree, then any of its subtrees must be a red-black tree.
Assuming that the red-black tree record id
index column is used at this time, we must maintain the red-black tree index field while inserting a row of records.
During the insertion process, you will find that it is different from ordinary binary trees in that when the height difference between the left and right subtrees of a tree is > 1, it will spin Operation to keep the tree balanced.
At this time, the tree node with id = 7
was searched only 3 times, which is faster than the so-called ordinary binary tree.
But MySQL
’s index stilldoes not use which is excellent in precise positioning and range queryred-black tree.
Because when MySQL
the amount of data is large, the size of the index will also be very large, and the memory may not be able to store it, so related reading and writing need to be done from the disk. If the tree level is too large If it is high, the number of disk reads and writes (I/O interactions) will be greater, and the performance will be worse.
B-tree
The only shortcoming of the red-black tree is that the height of the tree is uncontrollable, so now our entry point is the tree the height of.
Currently, a node is only allocated to store 1 element. If we want to control the height, we can allocate a larger space to a node and let it store multiple elements horizontally , at this time the height is controllable. Through such a transformation process, it became
B-tree
.
B-tree
is an absolutely balanced multi-way tree. There are two concepts in its structure: Degree: the number of child nodes (subtrees) that a node has. (Some places use
to explainis a balanced m-way search tree. It may be an empty tree, or it may meet the following characteristics:B-tree, explain it here)
)Order: the maximum number of child nodes of a node . (Usually represented by
mKeyword: data index.
An m-order
B-tree
##⌈
-
⌈#⌈
All leaf nodes are located on the same layer. - The meaning of the name (off topic, relax)
The following is excerpted from Wikipedia
B-tree
in 1972 while working at Boeing Research Labs, but they did not explain that B stood for What meaning (if any).Douglas Comer explains: Neither author ever explained the original meaning of B-tree
. We might feel that balanced, broad or bushy might be appropriate. Others suggested the letter B stood for Boeing. Due to his sponsorship, however, it seems more appropriate to think ofB-tree as a Bayer tree.
B-tree in his paper titled "CS144C classroom lecture about disk storage and B-trees" published in May 1980 Name interpretation, suggesting that B may mean Boeing or Bayer's name.
Search
The search for B-tree is actually very similar to a binary tree:
B-tree
has k keywords and (k 1) branches.The search in the binary tree only considers whether to go left or right, while
B-tree
B-tree
- First find the node. Since
B-tree
is usually stored on the disk, this step requires disk IO operation; - Find the key word, when a node is found, the node is read into the memory and then the keyword is found through sequential or binary search. If the keyword is not found, you need to judge the size to find a suitable branch to continue searching.
Operation process
Now you need to find the element: 88
First time: Disk IO
The second time: Disk IO
The third time: Disk IO
Then there is a memory comparison, which is compared with 70 and 88 respectively, and finally 88 found.
From the search process, we found that B-tree
The number of comparisons and the number of disk IOs are actually not much different from those of binary trees. It seems that there is no difference. What advantages.
But if you take a closer look, you will find that the comparison is completed in memory, does not involve disk IO, and the time consumption is negligible.
In addition, B-tree
can store many keywords (the number is determined by the order) in one node, and the same number of keywordsThe nodes generated in B-tree
are far less than the nodes in the binary tree, and the difference in the number of nodes is equivalent to the number of disk IOs. After reaching a certain number, the performance difference becomes apparent.
Insertion
When B-tree
wants to insert a keyword, it directly finds the leaf node and performs the operation.
- Find the leaf node to be inserted according to the keyword to be inserted;
- Because the maximum number (order) of child nodes of a node is m , so it is necessary to determine whether the number of keywords in the current node is less than (m - 1).
- Yes: Insert directly
- No: Node split occurs. The node is divided into left and right parts based on the middle keyword of the node, and the middle keyword is placed Just go to the parent node.
Operation process
For example, we now need to insert elements in the B-tree
with Max Degree (order) of 3: 72
-
Find the leaf node to be inserted
Node split: It should be with [70,88] On the same disk block, but when a node has 3 keywords, it may have 4 child nodes, which exceeds the maximum degree 3 of the limit we defined, so split## must be performed at this time #: Divide the node into two using the middle keyword as the boundary, generate a new node, and move the middle keyword up to the parent node.
Tip: When there are two middle keywords, the left keyword is usually used Move up the split.
DeleteThe deletion operation will be more troublesome than search and insertion, because the keyword to be deleted may or may not be on the leaf node, and deletion may also causeB-tree is unbalanced, and operations such as merging and rotation are required to maintain the balance of the entire tree.
The above is the detailed content of Finally understand that MySQL index needs to use B+tree, and it's so fast. For more information, please follow other related articles on the PHP Chinese website!

InnoDBBufferPool reduces disk I/O by caching data and indexing pages, improving database performance. Its working principle includes: 1. Data reading: Read data from BufferPool; 2. Data writing: After modifying the data, write to BufferPool and refresh it to disk regularly; 3. Cache management: Use the LRU algorithm to manage cache pages; 4. Reading mechanism: Load adjacent data pages in advance. By sizing the BufferPool and using multiple instances, database performance can be optimized.

Compared with other programming languages, MySQL is mainly used to store and manage data, while other languages such as Python, Java, and C are used for logical processing and application development. MySQL is known for its high performance, scalability and cross-platform support, suitable for data management needs, while other languages have advantages in their respective fields such as data analytics, enterprise applications, and system programming.

MySQL is worth learning because it is a powerful open source database management system suitable for data storage, management and analysis. 1) MySQL is a relational database that uses SQL to operate data and is suitable for structured data management. 2) The SQL language is the key to interacting with MySQL and supports CRUD operations. 3) The working principle of MySQL includes client/server architecture, storage engine and query optimizer. 4) Basic usage includes creating databases and tables, and advanced usage involves joining tables using JOIN. 5) Common errors include syntax errors and permission issues, and debugging skills include checking syntax and using EXPLAIN commands. 6) Performance optimization involves the use of indexes, optimization of SQL statements and regular maintenance of databases.

MySQL is suitable for beginners to learn database skills. 1. Install MySQL server and client tools. 2. Understand basic SQL queries, such as SELECT. 3. Master data operations: create tables, insert, update, and delete data. 4. Learn advanced skills: subquery and window functions. 5. Debugging and optimization: Check syntax, use indexes, avoid SELECT*, and use LIMIT.

MySQL efficiently manages structured data through table structure and SQL query, and implements inter-table relationships through foreign keys. 1. Define the data format and type when creating a table. 2. Use foreign keys to establish relationships between tables. 3. Improve performance through indexing and query optimization. 4. Regularly backup and monitor databases to ensure data security and performance optimization.

MySQL is an open source relational database management system that is widely used in Web development. Its key features include: 1. Supports multiple storage engines, such as InnoDB and MyISAM, suitable for different scenarios; 2. Provides master-slave replication functions to facilitate load balancing and data backup; 3. Improve query efficiency through query optimization and index use.

SQL is used to interact with MySQL database to realize data addition, deletion, modification, inspection and database design. 1) SQL performs data operations through SELECT, INSERT, UPDATE, DELETE statements; 2) Use CREATE, ALTER, DROP statements for database design and management; 3) Complex queries and data analysis are implemented through SQL to improve business decision-making efficiency.

The basic operations of MySQL include creating databases, tables, and using SQL to perform CRUD operations on data. 1. Create a database: CREATEDATABASEmy_first_db; 2. Create a table: CREATETABLEbooks(idINTAUTO_INCREMENTPRIMARYKEY, titleVARCHAR(100)NOTNULL, authorVARCHAR(100)NOTNULL, published_yearINT); 3. Insert data: INSERTINTObooks(title, author, published_year)VA


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

MantisBT
Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.

Dreamweaver Mac version
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

PhpStorm Mac version
The latest (2018.2.1) professional PHP integrated development tool

WebStorm Mac version
Useful JavaScript development tools