Completely master the indexing skills of MySQL (summary sharing)
This article covers MySQL indexes, including the logical architecture of MySQL and the execution order of SQL statements. I hope it will be helpful to you.
The storage engine architecture of MySQL separates query processing from data storage/retrieval. The following is the logical architecture diagram of MySQL:
Each client connection corresponds to a thread on the server. A thread pool is maintained on the server to avoid creating and destroying a thread for each connection. When a client connects to a MySQL server, the server authenticates it. Authentication can be done through username and password, or through SSL certificate. After the login authentication is passed, the server will also verify whether the client has the authority to execute a certain query.
The service layer compiles SQL and optimizes it (for example, by adjusting the order in which tables are read and selecting appropriate indexes). For SELECT statements, before parsing the query, the server first checks the query cache. If the corresponding result is found there, it is returned directly without query parsing or optimization. Note that the query cache was removed in MySQL 8.0. Stored procedures, triggers, views, etc. are all implemented in this layer.
The storage engine is responsible for storing data in MySQL, extracting data, starting a transaction, etc. The storage engine communicates with the upper layer through APIs. These APIs shield the differences between different storage engines, making these differences transparent to the upper layer query process. The storage engine will not parse SQL.
MyISAM: Each MyISAM table is stored in three files on disk: a table definition file, a data file, and an index file. Each file name begins with the name of the table, and the extension indicates the file type. The .frm file stores the table definition. The data file extension is .MYD (MYData). The index file extension is .MYI (MYIndex).
InnoDB: All tables are stored in the same data file (or multiple files, or independent tablespace files). The size of an InnoDB table is limited only by the operating system's maximum file size, which is 2GB on some older systems.
MyISAM: MyISAM supports three storage formats: static tables (the default, but note that trailing spaces in the data are removed), dynamic tables, and compressed tables. If a table will not be modified after it is created and its data is imported, you can use a compressed table to greatly reduce disk space usage.
InnoDB: Requires more memory and storage. It establishes its own dedicated buffer pool in main memory for caching data and indexes.
MyISAM: Data is stored in the form of files, so it is very convenient for cross-platform data transfer. You can perform operations on a table individually during backup and recovery.
InnoDB: Free solutions include copying data files, backing up binlog, or using mysqldump, which is relatively painful when the data volume reaches dozens of gigabytes.
MyISAM: The emphasis is on performance. Each query is atomic and executes faster than on InnoDB, but MyISAM does not provide transaction support.
InnoDB: Provides transaction support, foreign keys, and other advanced database functions. Its tables are transaction-safe (ACID compliant), with commit, rollback, and crash recovery capabilities.
MyISAM: The AUTO_INCREMENT column must be indexed, and it can be part of a joint index with other fields. In a composite index, the AUTO_INCREMENT column does not need to be the first column. It can be incremented within groups defined by the preceding columns.
InnoDB: The AUTO_INCREMENT column must be indexed, and if it is part of a composite index, it must be the first column of that index.
MyISAM: Only table-level locks are supported. When users operate on MyISAM tables, SELECT, UPDATE, DELETE, and INSERT statements all lock the table automatically. If the locked table satisfies the conditions for concurrent inserts, new data can still be inserted at the end of the table.
InnoDB: Support for transactions and row-level locks is InnoDB's most important feature. Row locks greatly improve the performance of multi-user concurrent operations. However, InnoDB sets row locks on index records. If the WHERE condition cannot use an index, every scanned row is locked, which in effect locks the entire table.
MyISAM: Supports FULLTEXT full-text indexes.
InnoDB: Supports FULLTEXT full-text indexes since MySQL 5.6. Earlier versions did not support them, and the Sphinx plug-in was commonly used for full-text indexing instead.
MyISAM: Allows tables without any indexes or primary keys. Indexes store the addresses of the rows.
InnoDB: If no primary key or NOT NULL unique index is defined, a hidden 6-byte primary key (invisible to the user) is generated automatically. The row data is part of the primary index, and each secondary index stores the value of the primary key.
MyISAM: Saves the total number of rows in the table. If you run select count(*) from table; the value is returned directly.
InnoDB: The total number of rows in the table is not saved. If you run select count(*) from table; it traverses the entire table, which is expensive. However, once a WHERE condition is added, MyISAM and InnoDB process the count in the same way.
MyISAM: If you execute a large number of SELECTs, MyISAM is a better choice.
InnoDB: If your data performs a large number of INSERTs or UPDATEs, you should use an InnoDB table for performance reasons.
MyISAM: Foreign keys are not supported.
InnoDB: Foreign keys are supported.
SQL usually needs optimization for these reasons: poor performance, excessively long execution time, excessively long wait time, inefficient join queries, and index failure.
(1) Writing order
select distinct ... from ... join ... on ... where ... group by ... having ... order by ... limit ...
(2) Parsing order
from ... on ... join ... where ... group by ... having ... select distinct ... order by ... limit ...
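The parsing order can be illustrated with a small in-memory model. This is only a sketch: the `users` and `depts` rows and the function `run_query` are made up for illustration, DISTINCT is left out for brevity, and the steps mimic the logical order rather than what the MySQL optimizer physically does. It corresponds roughly to `SELECT d.dept_name, MAX(u.age) AS max_age FROM users u JOIN depts d ON u.dept_id = d.id WHERE u.age >= 20 GROUP BY d.dept_name HAVING COUNT(*) >= 2 ORDER BY max_age DESC LIMIT 1`.

```python
# Hypothetical sample data, not taken from the article.
users = [
    {"id": 1, "name": "ann", "dept_id": 10, "age": 30},
    {"id": 2, "name": "bob", "dept_id": 10, "age": 41},
    {"id": 3, "name": "cid", "dept_id": 20, "age": 25},
    {"id": 4, "name": "dee", "dept_id": 20, "age": 52},
    {"id": 5, "name": "eve", "dept_id": 30, "age": 19},
]
depts = [
    {"id": 10, "dept_name": "dev"},
    {"id": 20, "dept_name": "ops"},
    {"id": 30, "dept_name": "qa"},
]


def run_query():
    # 1. FROM ... JOIN ... ON: build the joined rows first.
    joined = [
        {**u, "dept_name": d["dept_name"]}
        for u in users
        for d in depts
        if u["dept_id"] == d["id"]
    ]
    # 2. WHERE: filter individual rows (SELECT aliases do not exist yet).
    filtered = [r for r in joined if r["age"] >= 20]
    # 3. GROUP BY: bucket the remaining rows by department name.
    groups = {}
    for r in filtered:
        groups.setdefault(r["dept_name"], []).append(r)
    # 4. HAVING: filter whole groups.
    kept = {k: v for k, v in groups.items() if len(v) >= 2}
    # 5. SELECT: compute the output columns and the max_age alias.
    selected = [
        {"dept_name": k, "max_age": max(r["age"] for r in v)}
        for k, v in kept.items()
    ]
    # 6. ORDER BY: the alias is available here because SELECT already ran.
    ordered = sorted(selected, key=lambda r: r["max_age"], reverse=True)
    # 7. LIMIT: cut the result last.
    return ordered[:1]


print(run_query())
```

Because WHERE runs before SELECT in this order, a column alias defined in the SELECT list is not visible in the WHERE clause.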
The index is equivalent to the table of contents of the book.
The data structure of the index is a B+ tree.
(1) Improve query efficiency (reduce IO usage)
(2) Reduce CPU usage
For example, for a query with order by age desc, the B+ tree index is already sorted, so if the query uses the index there is no need to sort the rows again.
(1) The index itself is large and can be stored in memory or on the hard disk, usually on the hard disk.
(2) Indexes are not suitable in all situations, such as ① a small amount of data ② frequently changing fields ③ rarely used fields
(3) Indexes reduce the efficiency of inserts, deletes, and updates
(1) Single value index
(2) Unique index
(3) Composite index (also called joint index)
(4) Primary key index
Note: The main difference between a unique index and a primary key index is that a primary key column cannot be NULL, while a unique index allows NULL values.
alter table user add INDEX `user_index_username_password` (`username`,`password`)
The underlying data structure of a MySQL index is the B+Tree.
B+Tree is an optimization of B-Tree that makes it more suitable for implementing index structures on external storage. The InnoDB storage engine uses B+Tree to implement its index structure.
Each node in the B-Tree structure contains not only the key but also the data value. The storage space of each page is limited, so if the data values are large, the number of keys that fit in each node (that is, one page) is very small. When the amount of stored data is large, the B-Tree also becomes deeper, which increases the number of disk I/Os per query and reduces query efficiency. In a B+Tree, all data records are stored on leaf nodes at the same level, in key order, and non-leaf nodes store only key information. This greatly increases the number of keys each node can hold and reduces the height of the B+Tree.
B+Tree has several differences compared to B-Tree:
Non-leaf nodes only store key value information.
There is a link pointer between all leaf nodes.
Data records are stored in leaf nodes.
Now apply this optimization to the B-Tree from the previous section. Since the non-leaf nodes of a B+Tree store only key information, assuming that each disk block can store 4 keys and their pointers, the tree takes the B+Tree structure shown in the figure below:
A B+Tree usually has two head pointers: one points to the root node, and the other points to the leaf node with the smallest key. All leaf nodes (that is, the data nodes) are linked together in a chain. Therefore, two kinds of search can be performed on a B+Tree: a range or paging search on the primary key that walks the leaf chain, and a random search that starts from the root node.
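The range search over linked leaves can be sketched as follows. This is a simplified model, not InnoDB's implementation: `Leaf`, `build_leaves`, and `range_scan` are illustrative names, the leaves hold only keys, and the root-to-leaf descent is replaced by a binary search over the first key of each leaf.

```python
from bisect import bisect_right


class Leaf:
    """A leaf node holding sorted keys and a link to the next leaf."""

    def __init__(self, keys):
        self.keys = keys
        self.next = None


def build_leaves(sorted_keys, per_leaf=4):
    # Split the sorted keys into leaves and chain them left to right.
    leaves = [
        Leaf(sorted_keys[i:i + per_leaf])
        for i in range(0, len(sorted_keys), per_leaf)
    ]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(leaves, low, high):
    # Stand-in for the descent from the root: find the leaf that may hold `low`.
    first_keys = [leaf.keys[0] for leaf in leaves]
    start = max(bisect_right(first_keys, low) - 1, 0)
    leaf = leaves[start]
    result = []
    # Follow the leaf chain until a key exceeds `high`.
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


leaves = build_leaves(list(range(1, 23)))  # 22 keys, 4 per leaf
print(range_scan(leaves, 7, 13))
```

Only one lookup is needed to locate the starting leaf. After that, the scan follows the `next` links and never returns to the upper levels of the tree.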
The example above contains only 22 data records, so the advantages of B+Tree may not be obvious. Here is a calculation:
The page size in the InnoDB storage engine is 16KB. The primary key of a typical table is INT (4 bytes) or BIGINT (8 bytes), and a pointer is generally 4 or 8 bytes, which means that one page (one node in the B+Tree) stores approximately 16KB / (8B + 8B) = 1K key values (because this is an estimate, K is taken as 10^3 to simplify the calculation). In other words, a B+Tree index with a depth of 3 can maintain 10^3 * 10^3 * 10^3 = 1 billion records.
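The estimate can be reproduced with a few lines of arithmetic. This is a sketch under the article's assumptions (16KB pages, 8-byte keys, 8-byte pointers, every level treated as holding keys only). The names `keys_per_page` and `max_records` are illustrative and are not MySQL functions.

```python
PAGE_SIZE = 16 * 1024  # InnoDB page size in bytes
KEY_SIZE = 8           # BIGINT primary key
POINTER_SIZE = 8       # pointer size assumed in the estimate


def keys_per_page(page_size=PAGE_SIZE, key_size=KEY_SIZE, pointer_size=POINTER_SIZE):
    # Each entry in a node is one key plus one pointer.
    return page_size // (key_size + pointer_size)


def max_records(depth, fanout):
    # Every level multiplies the number of reachable entries by the fan-out.
    return fanout ** depth


fanout = keys_per_page()
print(fanout)                  # exact fan-out under these assumptions
print(max_records(3, fanout))  # depth-3 capacity with the exact fan-out
print(max_records(3, 10**3))   # depth-3 capacity with K rounded to 10^3
```

The exact fan-out is 1024, so the rounded figure of 10^3 per level slightly understates the capacity of a full tree.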
In actual situations, each node may not be completely full, so in a database the height of a B+Tree is generally between 2 and 4 levels. MySQL's InnoDB storage engine is designed so that the root node stays resident in memory, which means that only 1 to 3 disk I/O operations are needed to find the row record for a given key.
The B+Tree indexes in the database can be divided into clustered indexes and secondary indexes. The B+Tree example above is implemented in the database as a clustered index: the leaf nodes of the clustered index's B+Tree store the row data of the entire table. A secondary index differs from a clustered index in that its leaf nodes do not contain the full row data. Instead, they store the clustered index key of the corresponding row, that is, the primary key. When querying data through a secondary index, the InnoDB storage engine traverses the secondary index to find the primary key, and then uses the primary key to find the complete row record in the clustered index.
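The two-step lookup through a secondary index can be modeled with two dictionaries. This is a deliberately simplified sketch: real indexes are B+Trees rather than hash maps, and the table contents and the names `clustered_index`, `username_index`, and `find_by_username` are hypothetical.

```python
# Clustered index: primary key -> full row (the leaves hold the row data).
clustered_index = {
    1: {"id": 1, "username": "ann", "age": 30},
    2: {"id": 2, "username": "bob", "age": 41},
    3: {"id": 3, "username": "cid", "age": 25},
}

# Secondary index on username: indexed value -> primary key only.
username_index = {"ann": 1, "bob": 2, "cid": 3}


def find_by_username(username):
    # Step 1: search the secondary index to get the primary key.
    primary_key = username_index.get(username)
    if primary_key is None:
        return None
    # Step 2: use the primary key to fetch the full row from the clustered index.
    return clustered_index[primary_key]


print(find_by_username("bob"))
```

If a query needs only columns that are already stored in the secondary index, the second step can be skipped, which is why such queries are cheaper.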
(1) A query that uses all the index keys of the joint index can use the joint index
(2) A query that uses all the index keys of the joint index but connects them with OR cannot use the joint index
(3) A query that uses only the leftmost field of the joint index can use the joint index
(4) A query that uses only the other fields of the joint index cannot use the joint index
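The four rules above can be expressed as a small helper. This is only a model of the leftmost-prefix rule for equality conditions, not the MySQL optimizer, which may still choose other plans such as a full index scan or an index merge. The function name `usable_prefix` is hypothetical, and the column names come from the `user_index_username_password` index created earlier.

```python
def usable_prefix(index_columns, where_columns, connector="AND"):
    """Return the leading index columns that the conditions can use."""
    # Rule 2: conditions joined with OR cannot use the joint index.
    if connector.upper() == "OR":
        return []
    prefix = []
    # Walk the index from the left and stop at the first missing column.
    for column in index_columns:
        if column not in where_columns:
            break
        prefix.append(column)
    return prefix


index = ["username", "password"]
print(usable_prefix(index, {"username", "password"}))        # rule 1
print(usable_prefix(index, {"username", "password"}, "OR"))  # rule 2
print(usable_prefix(index, {"username"}))                    # rule 3
print(usable_prefix(index, {"password"}))                    # rule 4
```

An empty result means the joint index cannot be used for the lookup. A non-empty result shows how many leading columns of the index take part in it.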
EXPLAIN simulates how the optimizer executes a SQL statement, which lets you analyze how the query uses indexes.
(1) User table
(2) Department table
(3) Untriggered index
(4) Triggered index
(5) Result analysis
The table that appears in the first row of the EXPLAIN output is the driving table.
Sorting on a column of the driving table can use the index directly, while sorting on a column of a non-driving table cannot use the index.
(1) id: SELECT identifier. This is the query sequence number of SELECT.
(2) select_type: SELECT type:
SIMPLE: Simple SELECT (does not use UNION or subquery)
PRIMARY: The outermost SELECT
UNION: The second or subsequent SELECT statement in UNION
DEPENDENT UNION: The second or subsequent SELECT statement in a UNION, dependent on the outer query
UNION RESULT: The result of a UNION
SUBQUERY: The first SELECT in a subquery
DEPENDENT SUBQUERY: The first SELECT in a subquery, dependent on the outer query
DERIVED: SELECT of a derived table (subquery in the FROM clause)
(3) table: table name
(4) type: join type
system: The table has only one row (=system table). This is a special case of the const join type.
const: The table has at most one matching row, which will be read at the beginning of the query. Because there is only one row, the column values in this row can be treated as constants by the rest of the optimizer. const is used when comparing all parts of a PRIMARY KEY or UNIQUE index with a constant value.
eq_ref: For each combination of rows from the previous table, read one row from this table. This is probably the best join type, besides const types. It is used when all parts of an index are used in the join and the index is UNIQUE or PRIMARY KEY. eq_ref can be used on indexed columns compared using the = operator. The comparison value can be a constant or an expression that uses a column from a table that was read before this table.
ref: For each combination of rows from the previous table, all rows with matching index values will be read from this table. Use ref if the join uses only the leftmost prefix of the key, or if the key is not UNIQUE or PRIMARY KEY (in other words, if the join cannot select a single row based on the key). This join type is good if you are using keys that only match a small number of rows. ref can be used on indexed columns using the = or <=> operators.
ref_or_null: This join type is like ref, but MySQL additionally searches for rows that contain NULL values. This join type optimization is often used in resolving subqueries.
index_merge: This join type indicates that the index merge optimization method is used. In this case, the key column contains the list of indexes used, and key_len contains the longest key element of the index used.
unique_subquery: This type replaces ref for IN subqueries of the following form: value IN (SELECT primary_key FROM single_table WHERE some_expr). unique_subquery is an index lookup function that completely replaces the subquery and is more efficient.
index_subquery: This join type is similar to unique_subquery. IN subqueries can be replaced, but only for non-unique indexes in subqueries of the following form: value IN (SELECT key_column FROM single_table WHERE some_expr)
range: Only rows in a given range are retrieved, using an index to select the rows. The key column shows which index was used. key_len contains the longest key element of the index used. The ref column is NULL for this type. range can be used when a key column is compared with a constant using the =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, or IN operators.