Super detailed explanation of the MySQL storage engine InnoDB
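To see which storage engine your database uses by default, run the following command (in MySQL 5.7 and later the variable is named default_storage_engine; the older storage_engine variable has been removed):
SHOW VARIABLES LIKE 'default_storage_engine';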
1. InnoDB storage engine
1. InnoDB is the preferred engine for transactional databases
It supports transaction-safe (ACID-compliant) tables.
The ACID properties of a transaction are atomicity, consistency, isolation, and durability:
a. Atomicity: The statements in a transaction are either all executed or not executed at all. If an error occurs partway through the transaction, the database is rolled back to the state it was in when the transaction started.
Implementation: Atomicity relies mainly on the undo mechanism of the MySQL log system (the redo log serves durability). A transaction is a set of SQL statements such as inserts, updates, deletes, and queries. Each modifying statement leaves a record behind: for example, after a DELETE statement is executed, an undo record is saved that describes what was changed and what the previous value was. If something goes wrong, these records are applied in reverse order to return the data to its original state.
b. Consistency: The integrity constraints of the database hold both before the transaction starts and after it ends. (For example, if A transfers money to B, it is impossible for A's balance to be debited without B receiving the money.)
c. Isolation: Concurrent transactions that access the same data do not interfere with each other; each transaction behaves as if it had the data to itself.
If isolation is not taken into account, several problems can occur:
i. Dirty read: A transaction reads data that another transaction has changed but not yet committed. For example, while one transaction is modifying a value several times and has not committed any of those changes, a concurrent transaction reads the value, so the two transactions see inconsistent data. (The uncommitted dirty data of another transaction is read.)
ii. Non-repeatable read: Several queries for the same data within one transaction return different values, because another transaction modified and committed the data between the queries. (Data committed by another transaction is read; the affected data is a single data item.)
iii. Phantom read: A phenomenon that occurs when transactions do not execute independently. For example, transaction T1 changes a data item in all rows of a table from "1" to "2". Meanwhile transaction T2 inserts a row whose value for that data item is still "1" and commits it. If the user running T1 then looks at the data just modified, there is still one row that has not been changed. That row was in fact added by T2, but to the user it looks like a hallucination. (Data committed by another transaction is read; the affected data is a set of rows as a whole.)
d. Durability: Once a transaction has been committed, all of its updates are saved permanently in the database and can no longer be rolled back.
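The rollback idea described under atomicity can be illustrated with a small sketch. This is not how InnoDB is implemented; it is a toy model, assuming a table represented as a Python dict and a hypothetical ToyTransaction class that records the old value before every change and replays those records in reverse on rollback.

```python
_MISSING = object()  # marker meaning "the key did not exist before the change"

class ToyTransaction:
    """Hypothetical, heavily simplified transaction with an undo log."""

    def __init__(self, table):
        self.table = table
        self.undo = []  # list of (key, previous value) records

    def set(self, key, value):
        # Record the old value first, then apply the change.
        self.undo.append((key, self.table.get(key, _MISSING)))
        self.table[key] = value

    def delete(self, key):
        self.undo.append((key, self.table.get(key, _MISSING)))
        self.table.pop(key, None)

    def rollback(self):
        # Undo the changes in reverse order of execution.
        while self.undo:
            key, old = self.undo.pop()
            if old is _MISSING:
                self.table.pop(key, None)
            else:
                self.table[key] = old

    def commit(self):
        # After commit the changes can no longer be rolled back.
        self.undo.clear()

accounts = {"A": 100, "B": 50}
tx = ToyTransaction(accounts)
tx.set("A", 70)      # debit A
tx.set("C", 1)       # create a new key
tx.delete("B")       # remove an existing key
tx.rollback()        # something went wrong: restore the starting state
print(accounts)
```

After the rollback the dict holds the same contents as before the transaction started, which is the "all or nothing" behavior that atomicity asks for.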
2. InnoDB is the default storage engine of MySQL
The default isolation level is REPEATABLE READ (RR). Under RR, InnoDB goes one step further: multi-version concurrency control (MVCC) solves the non-repeatable read problem, and gap locks (a form of lock-based concurrency control) address the phantom read problem. As a result, InnoDB's RR isolation level comes close to the effect of the SERIALIZABLE level while retaining better concurrency.
MySQL provides four isolation levels:
a. Serializable: avoids dirty reads, non-repeatable reads, and phantom reads;
b. Repeatable read: avoids dirty reads and non-repeatable reads;
c. Read committed: avoids dirty reads;
d. Read uncommitted: the lowest level; none of the above is guaranteed.
From a to d the isolation level goes from high to low; the higher the level, the lower the execution efficiency.
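To make the difference between read committed and repeatable read more concrete, here is a rough sketch of snapshot reads over multiple versions. It is only an illustration under simplifying assumptions: the VersionedValue class and the integer timestamps are hypothetical, whereas the real MVCC machinery in InnoDB works with transaction IDs, undo records, and read views.

```python
class VersionedValue:
    """Hypothetical value that keeps every committed version."""

    def __init__(self):
        self.versions = []  # (commit timestamp, value), in commit order

    def commit(self, ts, value):
        self.versions.append((ts, value))

    def read_as_of(self, ts):
        # Return the newest version committed at or before the timestamp.
        visible = [value for commit_ts, value in self.versions if commit_ts <= ts]
        return visible[-1] if visible else None

balance = VersionedValue()
balance.commit(1, 100)

snapshot_ts = 1                               # reader starts, snapshot taken
first_read = balance.read_as_of(snapshot_ts)

balance.commit(2, 200)                        # another transaction commits

# Repeatable read: keep using the snapshot taken at the start.
second_read_rr = balance.read_as_of(snapshot_ts)
# Read committed: each read sees the newest committed version.
second_read_rc = balance.read_as_of(2)

print(first_read, second_read_rr, second_read_rc)
```

In this model the repeatable-read reader gets the same value on both reads, while the read-committed reader sees the change committed in between, which is exactly the non-repeatable read described earlier.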
3.InnoDB supports row-level locks.
Row-level locks can support concurrency to the greatest extent. Row-level locks are implemented by the storage engine layer.
Lock: The main function of the lock is to manage concurrent access to shared resources and to achieve transaction isolation.
Type: shared lock (read lock), exclusive lock (write lock)
Granularity of MySQL locks:
Table-level locks (low overhead, low concurrency), usually implemented at the server layer
Row-level locks (high overhead, high concurrency), implemented only at the storage engine layer
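The relationship between shared and exclusive locks can be summarized as a compatibility table. The sketch below is an illustration only; the names COMPATIBLE and can_grant are made up for this example and do not correspond to any MySQL API.

```python
# (lock already held, lock requested) -> can both exist on the same row?
COMPATIBLE = {
    ("S", "S"): True,   # several readers may share a row
    ("S", "X"): False,  # a writer must wait for readers
    ("X", "S"): False,  # readers must wait for a writer
    ("X", "X"): False,  # only one writer at a time
}

def can_grant(requested, held):
    """Return True if the requested lock is compatible with every held lock."""
    return all(COMPATIBLE[(h, requested)] for h in held)

print(can_grant("S", ["S", "S"]))  # another reader joins existing readers
print(can_grant("X", ["S"]))       # a writer arrives while a reader holds the row
print(can_grant("X", []))          # nobody holds the row
```

Only the shared/shared combination is compatible, which is why read locks allow concurrency while a write lock excludes everyone else.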
4. InnoDB is designed for maximum performance for processing huge amounts of data.
Its CPU efficiency may be unmatched by any disk-based relational database engine
5. InnoDB storage engine is fully integrated with the MySQL server
The InnoDB storage engine maintains its own buffer pool for caching data and indexes in main memory. InnoDB stores its tables and indexes in a logical tablespace, which can consist of several files (or raw disk partitions).
6. InnoDB supports foreign key integrity constraints
When data is stored in a table, the rows of each table are stored in primary key order. If no primary key is explicitly specified in the table definition (and the table has no suitable unique NOT NULL index), InnoDB generates a 6-byte ROWID for each row and uses it as the primary key.
7. InnoDB is used in many large database sites that require high performance
8. InnoDB does not store the number of rows in a table (for example, for SELECT COUNT(*) FROM table, InnoDB has to scan the table to count the rows). When the entire table is cleared with DELETE, InnoDB deletes the rows one by one, which is very slow.
InnoDB does not create a directory. When InnoDB is used, MySQL creates in the MySQL data directory a 10MB auto-extending data file named ibdata1 and two 5MB log files named ib_logfile0 and ib_logfile1.
2. The underlying implementation of the InnoDB engine
An InnoDB table has two storage files, with the suffixes .frm and .ibd: the .frm file is the table definition file (before MySQL 8.0), and the .ibd file is the table data file.
1. The InnoDB engine uses the B+Tree structure as its index structure
B-Tree (balanced multi-path search tree): a balanced search tree designed for external storage devices such as disks
When the system reads data from the disk to the memory, the basic unit is the disk block. Data located in the same disk block will be read out at once, rather than on demand.
The InnoDB storage engine uses the page as its unit for reading data. The page is its smallest unit of disk management, and the default page size is 16 KB.
A disk block in the system is usually smaller than that, so whenever InnoDB requests disk space it uses several disk blocks with consecutive addresses to make up a 16 KB page.
InnoDB uses the page as the basic unit when reading disk data into memory. When querying data, if every entry in a page helps locate the data record, the number of disk I/Os is reduced and query efficiency improves.
The B-Tree structure allows the system to find the disk block containing the data efficiently.
Each node in a B-Tree can hold a large number of keys and branches, depending on the actual situation. Consider the following example.
Each node occupies one disk block. A node holds two keys in ascending order and three pointers to the root nodes of its subtrees; each pointer stores the address of the disk block where the child node is located.
Take the root node as an example. Its keys are 17 and 35. The subtree that pointer P1 points to holds values less than 17, the subtree that P2 points to holds values between 17 and 35, and the subtree that P3 points to holds values greater than 35.
The process of searching for key 29 goes as follows:
a. Locate disk block 1 from the root node and read it into memory. [First disk I/O]
b. Key 29 falls in the interval (17, 35), so take pointer P2 of disk block 1.
c. Follow pointer P2 to disk block 3 and read it into memory. [Second disk I/O]
d. Key 29 falls in the interval (26, 30), so take pointer P2 of disk block 3.
e. Follow pointer P2 to disk block 8 and read it into memory. [Third disk I/O]
f. Find key 29 in the key list of disk block 8.
MySQL's InnoDB storage engine is designed to keep the root node resident in memory and aims to keep the depth of the tree at no more than 3, so that a lookup needs no more than three disk I/Os.
Analyzing the process above, it takes three disk I/O operations and three in-memory searches. Because the keys of a node form an ordered list in memory, binary search can be used to speed up the in-memory step; the three disk I/O operations are the decisive factor in the efficiency of the whole B-Tree search.
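The search steps above can be replayed with a small simulation. This is a sketch under assumptions: the DISK dict stands in for disk blocks, only the keys of blocks 1, 3 and 8 come from the walk-through, and the contents of the other blocks are made up so that the example is complete. Each block read counts as one disk I/O.

```python
from bisect import bisect_left

# Toy "disk": block id -> (sorted keys, child block ids or None for a leaf).
# Blocks 1, 3 and 8 follow the walk-through; all other keys are hypothetical.
DISK = {
    1: ([17, 35], [2, 3, 4]),
    2: ([8, 12], None),
    3: ([26, 30], [7, 8, 9]),
    4: ([65, 87], None),
    7: ([18, 25], None),
    8: ([28, 29], None),
    9: ([31, 34], None),
}

def search(key, root=1):
    """Return (found, blocks read); every block read models one disk I/O."""
    reads = []
    block = root
    while block is not None:
        keys, children = DISK[block]
        reads.append(block)
        i = bisect_left(keys, key)  # binary search inside the loaded node
        if i < len(keys) and keys[i] == key:
            return True, reads
        # Descend into the i-th child, or stop if this node is a leaf.
        block = children[i] if children else None
    return False, reads

print(search(29))
print(search(20))
```

Searching for 29 reads blocks 1, 3 and 8 in that order, matching steps a to f. The binary search inside a node is cheap compared with the block reads, which is why the number of levels dominates the cost.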
B+Tree
B+Tree is an optimization of B-Tree that makes it better suited to implementing index structures on external storage. In a B-Tree, every node holds both keys and data, and the storage space of each page is limited. If the data is large, the number of keys that fit in each node (that is, one page) becomes very small. When a large amount of data is stored, the B-Tree also becomes deeper, which increases the number of disk I/Os per query and hurts query efficiency.
In a B+Tree, all data records are stored in key order on leaf nodes at the same level, and non-leaf nodes store only keys. This greatly increases the number of keys each node can hold and reduces the height of the B+Tree.
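A back-of-the-envelope calculation shows why keeping only keys in non-leaf nodes keeps the tree shallow. The numbers below are assumptions chosen for illustration (an 8-byte key, a 6-byte child pointer, 1 KB rows), and the estimate ignores page headers and free space, so real capacities are somewhat lower.

```python
PAGE_SIZE = 16 * 1024  # default InnoDB page size in bytes

def bplus_tree_capacity(height, key_bytes=8, pointer_bytes=6, row_bytes=1024):
    """Approximate number of rows a tree with `height` levels can address."""
    fanout = PAGE_SIZE // (key_bytes + pointer_bytes)  # entries per non-leaf page
    rows_per_leaf = PAGE_SIZE // row_bytes             # records per leaf page
    return fanout ** (height - 1) * rows_per_leaf

for height in (1, 2, 3):
    print(height, bplus_tree_capacity(height))
```

Under these assumptions a non-leaf page holds 1170 entries and a leaf page holds 16 rows, so a three-level tree can address roughly 21.9 million rows.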
A B+Tree usually has two head pointers: one points to the root node and the other points to the leaf node with the smallest key. All leaf nodes (that is, the data nodes) are linked together in a chain.
Therefore two kinds of search can be performed on a B+Tree: a range or paging search over the primary key, which walks the leaf chain, and a random search that starts from the root node.
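The benefit of linked leaves for range searches can be sketched as follows. This is a simplified model with made-up data: a single root holding separator keys, three leaves, and a hypothetical range_scan function. The search descends once to find the starting leaf and then only follows the links between leaves.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys):
        self.keys = keys   # sorted keys stored in this leaf
        self.next = None   # link to the next leaf in key order

# Three linked leaves and one root with separator keys (toy data).
leaves = [Leaf([1, 4, 7]), Leaf([10, 13, 16]), Leaf([19, 22, 25])]
for left, right in zip(leaves, leaves[1:]):
    left.next = right
separators = [10, 19]  # keys < 10 -> leaf 0, 10..18 -> leaf 1, >= 19 -> leaf 2

def range_scan(lo, hi):
    """Return all keys k with lo <= k <= hi."""
    # Random search from the root: pick the leaf that may contain lo.
    leaf = leaves[bisect_right(separators, lo)]
    result = []
    # Range search: walk the leaf chain without going back to the root.
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return result
            if k >= lo:
                result.append(k)
        leaf = leaf.next
    return result

print(range_scan(5, 20))
```

The scan crosses all three leaves but visits the root only once, which is what makes range and paging queries over the primary key efficient.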
B+Tree in InnoDB
InnoDB stores its data organized by an index on the ID (the primary key).
A table that uses the InnoDB engine has two storage files: one is the definition file and the other is the data file.
InnoDB builds an index on the ID with a B+Tree and stores the records themselves in the leaf nodes.
If the indexed field is not the primary key ID, InnoDB builds a separate index on that field whose leaf nodes store the primary key of each record; the record itself is then found through the primary key index.
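The two-step lookup through a non-primary-key index can be sketched with plain dicts. The dicts only stand in for the two B+Trees, and the table contents and the names clustered, index_by_name and find_by_name are hypothetical.

```python
# Primary key index: id -> full record (the record lives in the leaf).
clustered = {
    1: {"id": 1, "name": "alice", "city": "Paris"},
    2: {"id": 2, "name": "bob", "city": "Oslo"},
}

# Index on a non-primary-key field: name -> primary key only.
index_by_name = {"alice": 1, "bob": 2}

def find_by_name(name):
    """Look up a record by name in two steps."""
    pk = index_by_name.get(name)   # step 1: the field index yields the primary key
    if pk is None:
        return None
    return clustered[pk]           # step 2: the primary key index yields the record

print(find_by_name("bob"))
```

A query on the non-primary-key field therefore needs two index lookups, while a query on the primary key needs only one.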