Home >Database >Mysql Tutorial >Implementation principle of Mysql index
Mysql Index Discussion
In MySQL, the index is a storage engine level concept. Different storage engines implement indexes in different ways. This article mainly discusses the index implementation of the two storage engines MyISAM and InnoDB. Way.
MyISAM index implementation
The MyISAM engine uses B+Tree as the index structure, and the data field of the leaf node stores the address of the data record. The following figure is the schematic diagram of the MyISAM index:
Assume that the table has three columns in total. Assuming that we use Col1 as the primary key, Figure 8 is the primary index of a MyISAM table ( Primary key) indicates. It can be seen that MyISAM's index file only saves the address of the data record. In MyISAM, there is no structural difference between the primary index and the secondary index (Secondary key), except that the primary index requires the key to be unique, while the key of the secondary index can be repeated. If we create an auxiliary index on Col2, the structure of this index is as shown below:
is also a B+Tree, and the data field stores the address of the data record. . Therefore, the index retrieval algorithm in MyISAM is to first search the index according to the B+Tree search algorithm. If the specified Key exists, the value of its data field is taken out, and then the value of the data field is used as the address to read the corresponding data record.
MyISAM's indexing method is also called "non-clustered". The reason why it is called so is to distinguish it from InnoDB's clustered index.
InnoDB index implementation
Although InnoDB also uses B+Tree as the index structure, the specific implementation method is completely different from MyISAM.
The first major difference is that InnoDB's data files themselves are index files. As we know from the above, the MyISAM index file and data file are separated, and the index file only saves the address of the data record. In InnoDB, the table data file itself is an index structure organized by B+Tree, and the leaf node data field of this tree saves complete data records. The key of this index is the primary key of the data table, so the InnoDB table data file itself is the primary index.
Figure 10 is a schematic diagram of the InnoDB main index (also a data file). You can see that the leaf nodes contain complete data records. This kind of index is called a clustered index. Because InnoDB's data files themselves are aggregated by primary key, InnoDB requires that the table must have a primary key (MyISAM may not have one). If it is not explicitly specified, the MySQL system will automatically select a column that can uniquely identify the data record as the primary key. If it does not exist, For this type of column, MySQL automatically generates an implicit field as the primary key for the InnoDB table. The length of this field is 6 bytes and the type is long.
The second difference from the MyISAM index is that InnoDB's auxiliary index data field stores the value of the corresponding record's primary key instead of the address. In other words, all secondary indexes in InnoDB reference the primary key as the data field. For example, Figure 11 shows an auxiliary index defined on Col3:
Here, the ASCII code of English characters is used as the comparison criterion. The clustered index implementation makes searching by primary key very efficient, but auxiliary index search requires retrieving the index twice: first, retrieve the auxiliary index to obtain the primary key, and then use the primary key to retrieve the records in the primary index.
Understanding the index implementation methods of different storage engines is very helpful for the correct use and optimization of indexes. For example, after knowing the index implementation of InnoDB, it is easy to understand why it is not recommended to use overly long fields as primary keys, because all auxiliary indexes All refer to the primary index. A long primary index will make the auxiliary index too large. For another example, using non-monotonic fields as primary keys is not a good idea in InnoDB because the InnoDB data file itself is a B+Tree. Non-monotonic primary keys will cause the data file to maintain the characteristics of the B+Tree when inserting new records. Frequent split adjustments are very inefficient, and using auto-increment fields as primary keys is a good choice.
The above is the content of the implementation principle of Mysql index. For more related content, please pay attention to the PHP Chinese website (www.php.cn)!