Key Features of the InnoDB Storage Engine

WBOY

Jun 07, 2016, 04:12 PM



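1. Insert Buffer
A. Insert Buffer
The name suggests that the insert buffer is part of the buffer pool, but that is not the case. The InnoDB buffer pool does hold Insert Buffer information, but the Insert Buffer, like data pages, is also a component of the physical pages. In InnoDB, rows are usually inserted in ascending primary key order, so inserts into the clustered index (Primary Key) are generally sequential and need no random disk reads. Not every primary key is sequential, however. With a key such as a UUID, inserts into the clustered index are as random as inserts into a secondary index. This is why the primary key of a table is usually defined as a non-null auto-increment ID.
Insert or update operations on a nonclustered index are not applied to the index page every time. InnoDB first checks whether the target nonclustered index page is in the buffer pool. If it is, the record is inserted directly; if not, the record is first placed in the Insert Buffer. The Insert Buffer is then merged with the leaf nodes of the secondary index at a certain frequency and under certain conditions. A merge can usually combine multiple inserts into a single operation (because they fall in the same index page), which greatly improves insert performance on nonclustered indexes. The Insert Buffer is used only when both of the following conditions hold: 1. the index is a secondary index; 2. the index is not unique. For a unique index, the database has to look up the index page to verify that the inserted record is unique, which again causes random reads and defeats the purpose of the Insert Buffer. Insert buffer information can be viewed with the command show engine innodb status. Under write-intensive workloads, however, the insert buffer can take up too much of the buffer pool; by default it may occupy up to 1/2 of the buffer pool, which can affect other operations. Percona has released patches to address this, and the limit can be set through the ibuf_pool_size_per_max_size parameter. See the Percona website for details.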
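The buffering rule described above can be illustrated with a toy model. This is only a sketch of the idea, not InnoDB's implementation: the class name, the dictionaries, and the read counter are all hypothetical, and a real merge works on B+ tree pages rather than Python lists.

```python
from collections import defaultdict


class ToyInsertBuffer:
    """Toy model: buffer secondary-index inserts for pages that are not cached."""

    def __init__(self, cached_pages):
        self.cached_pages = set(cached_pages)  # pages currently in the buffer pool
        self.pages = defaultdict(list)         # page_no -> keys stored in the index page
        self.buffered = defaultdict(list)      # page_no -> keys waiting in the insert buffer
        self.random_page_reads = 0             # simulated random disk reads

    def insert(self, page_no, key, unique=False):
        if page_no in self.cached_pages:
            self.pages[page_no].append(key)    # page is cached: insert directly
        elif unique:
            # A unique index must read the page to check for duplicates,
            # so the insert cannot be buffered.
            self._read_page(page_no)
            self.pages[page_no].append(key)
        else:
            self.buffered[page_no].append(key)  # defer the insert

    def merge(self, page_no):
        """Apply every buffered key for one page with a single page read."""
        pending = self.buffered.pop(page_no, [])
        if pending:
            self._read_page(page_no)
            self.pages[page_no].extend(pending)
        return len(pending)

    def _read_page(self, page_no):
        if page_no not in self.cached_pages:
            self.random_page_reads += 1
            self.cached_pages.add(page_no)


ibuf = ToyInsertBuffer(cached_pages=[1])
for key in ("a", "b", "c"):
    ibuf.insert(page_no=7, key=key)    # page 7 is not cached, so all three are buffered
merged = ibuf.merge(7)
print(merged, ibuf.random_page_reads)  # prints "3 1": three inserts, one page read
```

Without buffering, each of the three inserts could have required its own random read; with it, one read serves all of them.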
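B. Change Buffer

InnoDB introduced the Change Buffer in version 1.0.x. It buffers the DML operations insert, delete, and update. The three kinds of buffer are the Insert Buffer, the Delete Buffer, and the Purge Buffer. Like the Insert Buffer, the Change Buffer applies only to non-unique secondary indexes. An update to a record may be split into two steps: 1. marking the record as deleted; 2. actually deleting the record. The Delete Buffer corresponds to the first step of an update and the Purge Buffer to the second. The parameter innodb_change_buffering selects which kinds of buffering are enabled. Its possible values are inserts, deletes, purges, changes, all, and none. changes enables inserts and deletes, all enables everything, and none disables everything. The default is all. Starting with InnoDB 1.2.x, the parameter innodb_change_buffer_max_size (a percentage) limits the maximum amount of memory the Change Buffer may use.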

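The show engine innodb status output lists merged operations and discarded operations, and under each it gives the number of operations of each type in the Change Buffer. insert denotes the Insert Buffer, delete mark the Delete Buffer, and delete the Purge Buffer. discarded operations counts the cases where the table had already been dropped by the time the Change Buffer was merged, so the records no longer needed to be merged into the secondary index.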
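The meaning of each innodb_change_buffering value can be written down as a small lookup table. This is a sketch based only on the description above; the operation labels ("insert", "delete-mark", "purge") and the helper functions are names chosen for this illustration, not InnoDB identifiers.

```python
# Which operations each innodb_change_buffering value buffers, per the text.
# The operation labels are illustrative names, not server identifiers.
CHANGE_BUFFERING = {
    "none": set(),
    "inserts": {"insert"},
    "deletes": {"delete-mark"},
    "purges": {"purge"},
    "changes": {"insert", "delete-mark"},
    "all": {"insert", "delete-mark", "purge"},
}


def is_buffered(setting, operation):
    """Return True if the given setting buffers the given operation."""
    return operation in CHANGE_BUFFERING[setting]


def update_steps_buffered(setting):
    """An update is a delete-mark followed by a purge; report each step."""
    return is_buffered(setting, "delete-mark"), is_buffered(setting, "purge")


print(update_steps_buffered("changes"))  # (True, False)
print(update_steps_buffered("all"))      # (True, True)
```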
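C. Internal Implementation of the Insert Buffer
In versions before MySQL 4.1, every table had its own insert buffer B+ tree. Current versions have a single global insert buffer B+ tree that buffers inserts for the non-unique secondary indexes of all tables. This B+ tree is stored in the shared tablespace. As a result, trying to restore a table from its file-per-table ibd file alone often causes check table to fail, because data belonging to the table's secondary indexes may still be in the Insert Buffer. After restoring from the ibd file, repair table must also be run to rebuild the table's secondary indexes.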

The Insert Buffer is a B+ tree, so it consists of leaf nodes and non-leaf nodes. The non-leaf nodes store the search key (key value) used for lookups, which is made up of three fields: space, marker, and offset.

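The search key occupies 9 bytes in total. space (4 bytes) is the tablespace id of the table that the record is to be inserted into (in InnoDB every table has a unique space id, and the table can be looked up from its space id). marker occupies 1 byte and exists for compatibility with older versions of the Insert Buffer. offset is the offset of the page and occupies 4 bytes.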
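The 9-byte layout can be checked with Python's struct module. This is a minimal sketch: only the field sizes come from the text, while the big-endian, unpadded encoding and the function name are assumptions made for the illustration.

```python
import struct

# space (4 bytes), marker (1 byte), offset (4 bytes).
# Assumption: big-endian and unpadded; the text only gives the field sizes.
SEARCH_KEY = struct.Struct(">IBI")


def make_search_key(space_id, page_offset, marker=0):
    """Pack (space, marker, offset) into a 9-byte key."""
    return SEARCH_KEY.pack(space_id, marker, page_offset)


key = make_search_key(space_id=5, page_offset=300)
print(len(key))                # 9
print(SEARCH_KEY.unpack(key))  # (5, 0, 300)
```

With this encoding and a fixed marker, comparing the packed bytes orders keys by (space, offset), which is the ordering a B+ tree over these keys needs.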

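When a secondary index record is to be inserted into page (space, offset) and that page is not in the buffer pool, InnoDB first builds a search key according to the rule above, then searches the Insert Buffer B+ tree, and finally inserts the record into a leaf node of the Insert Buffer B+ tree. A record is not inserted into a leaf node as is. It is first constructed from the fields space, marker, offset, and metadata, followed by the fields of the secondary index record.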

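The space, marker, and offset fields have the same meaning as in non-leaf nodes. metadata occupies 4 bytes. Within it, IBUF_REC_OFFSET_COUNT is a 2-byte integer used to order records by when they entered the Insert Buffer. The actual fields of the inserted record start at the fifth column of the Insert Buffer leaf node. Compared with the original record, each record in the Insert Buffer B+ tree therefore carries 13 bytes of extra overhead (4 + 1 + 4 + 4).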
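The 13-byte figure follows directly from the four leading fields. The short sketch below redoes that arithmetic; the dictionary and function names are hypothetical, and the 20-byte record size is just an example value.

```python
# Sizes in bytes of the fields that precede the index record in a leaf entry.
FIELD_SIZES = {"space": 4, "marker": 1, "offset": 4, "metadata": 4}


def buffered_record_size(index_record_size):
    """Size of a leaf entry: the four leading fields plus the index record itself."""
    return sum(FIELD_SIZES.values()) + index_record_size


overhead = buffered_record_size(0)
print(overhead)                  # 13
print(buffered_record_size(20))  # 33 for a hypothetical 20-byte index record
```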

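Once the Insert Buffer is enabled, records for a secondary index page (space, page_no) may be held in the Insert Buffer B+ tree. To guarantee that every Merge Insert Buffer succeeds, a special page is needed to track the free space of each secondary index page (space, page_no). This page type is called the Insert Buffer Bitmap. Each Insert Buffer Bitmap page tracks 16384 secondary index pages (256 extents) and is always the second page of those 16384 pages. Each secondary index page occupies 4 bits in the Insert Buffer Bitmap page.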
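The placement rule above (one bitmap page per 16384 pages, located at the second page of each group, 4 bits per tracked page) gives a simple way to locate the bitmap entry for any page. The sketch below only illustrates that arithmetic; the function names are hypothetical and the 64-pages-per-extent figure is inferred from "16384 pages = 256 extents".

```python
PAGES_PER_BITMAP = 16384  # pages tracked by one bitmap page (from the text)
BITS_PER_PAGE = 4         # bits each tracked page occupies (from the text)


def bitmap_page_for(page_no):
    """Page number of the Insert Buffer Bitmap page that tracks page_no."""
    group_start = (page_no // PAGES_PER_BITMAP) * PAGES_PER_BITMAP
    return group_start + 1  # the bitmap page is the second page of its group


def bit_offset_in_bitmap(page_no):
    """Bit position of page_no's 4-bit entry inside its bitmap page."""
    return (page_no % PAGES_PER_BITMAP) * BITS_PER_PAGE


print(bitmap_page_for(100))                     # 1
print(bitmap_page_for(20000))                   # 16385
print(bit_offset_in_bitmap(20000))              # 14464
print(PAGES_PER_BITMAP * BITS_PER_PAGE // 8)    # 8192 bytes of bitmap per bitmap page
```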

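D. Merge Insert Buffer
In summary, a Merge Insert Buffer operation can happen in the following situations:
1. A secondary index page is read into the buffer pool;
2. The Insert Buffer Bitmap page detects that the secondary index page has no free space left;
3. The Master Thread runs.
The first case occurs when a secondary index page is read into the buffer pool, for example while a SELECT query is executing. InnoDB checks the Insert Buffer Bitmap page to confirm whether the secondary index page has records stored in the Insert Buffer B+ tree. If it does, the records for that page in the Insert Buffer B+ tree are inserted into the secondary index page.
In the second case, the Insert Buffer Bitmap page tracks the free space of each secondary index page and ensures that at least 1/32 of the page remains free. If an insert into a secondary index would leave less than 1/32 of the page free, a merge is forced: the secondary index page is read, and both the records for that page in the Insert Buffer B+ tree and the record being inserted are written into the secondary index page.
In the third case, the Master Thread performs a Merge Insert Buffer every second or every 10 seconds. The difference between the two is the number of pages merged each time. A merge handles more than one page; the number of secondary index pages actually merged is determined as a percentage of srv_innodb_io_capacity. In the Insert Buffer B+ tree, secondary index pages are already sorted by (space, offset), so pages could be chosen in that order. InnoDB does not choose Insert Buffer pages this way, however. It picks a page of the Insert Buffer B+ tree at random and reads the space in that page together with the required number of pages that follow. If the table involved has already been dropped when the merge runs, the records buffered in the Insert/Change Buffer can simply be discarded.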
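The 1/32 rule in the second case can be expressed as a single comparison. This is a sketch under the assumption of a 16K page; the function name and the byte counts in the example are hypothetical, and InnoDB tracks free space through the bitmap bits rather than exact byte counts.

```python
PAGE_SIZE = 16 * 1024  # assumption: 16K pages


def needs_forced_merge(free_bytes, record_bytes, page_size=PAGE_SIZE):
    """True if inserting the record would leave less than 1/32 of the page free."""
    return free_bytes - record_bytes < page_size // 32


print(PAGE_SIZE // 32)                # 512 bytes must stay free
print(needs_forced_merge(600, 100))   # True: only 500 bytes would remain
print(needs_forced_merge(1000, 100))  # False: 900 bytes would remain
```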

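2. Doublewrite

The Insert Buffer improves InnoDB's performance, while doublewrite gives InnoDB data pages their reliability. When the database crashes, InnoDB may be in the middle of writing a page to a table, with only part of it written (for example, only the first 4K of a 16K page). This is called a partial page write. You might think of using the redo log to recover, and that is one approach. But the redo log records physical operations on a page, such as writing the record 'aaaa' at offset 800. If the page itself is already corrupted, replaying the redo log on it is meaningless. So before the redo log is applied, a copy of the page is needed: when a write fails, the page is first restored from its copy and then the redo log is applied. This is doublewrite.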

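Doublewrite has two parts: the doublewrite buffer in memory, 2M in size, and 128 contiguous pages (2 extents) in the shared tablespace on disk, also 2M in size. When dirty pages in the buffer pool are flushed, they are not written to disk directly. They are first copied into the in-memory doublewrite buffer with memcpy. The doublewrite buffer is then written sequentially to the shared tablespace on disk in two passes of 1M each, and fsync is called immediately afterwards to sync the disk and avoid the problems of buffered writes. Once the doublewrite pages have been written, the pages in the doublewrite buffer are written to their individual tablespace files; these writes are scattered. Doublewrite activity can be viewed with the command show global status like '%innodb_dblwr%';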
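The two-step write and the recovery path can be sketched with a toy model. Everything here is an assumption made for illustration: pages are Python bytes with a CRC32 stored in the last 4 bytes, the doublewrite area and the data file are dictionaries, and the crash is simulated by writing only the first 4K of a page. It is not how InnoDB lays out or checksums pages.

```python
import struct
import zlib

PAGE_SIZE = 16 * 1024  # assumption: 16K pages


def make_page(payload):
    """Build a page whose last 4 bytes hold a CRC32 of the rest (toy format)."""
    body = payload.ljust(PAGE_SIZE - 4, b"\0")
    return body + struct.pack(">I", zlib.crc32(body))


def is_intact(page):
    """Check the stored checksum against the page body."""
    return zlib.crc32(page[:-4]) == struct.unpack(">I", page[-4:])[0]


def flush(dirty, dblwr, data_file, torn_page=None):
    """Step 1: copy dirty pages to the doublewrite area. Step 2: write them in place."""
    dblwr.clear()
    dblwr.update(dirty)  # sequential write followed by fsync in the real engine
    for page_no, page in dirty.items():
        if page_no == torn_page:
            # Simulated crash: only the first 4K of the new page reaches the file.
            data_file[page_no] = page[:4096] + data_file[page_no][4096:]
            return
        data_file[page_no] = page


def recover(dblwr, data_file):
    """Restore every damaged page from its intact doublewrite copy."""
    restored = []
    for page_no, copy in dblwr.items():
        if not is_intact(data_file[page_no]) and is_intact(copy):
            data_file[page_no] = copy
            restored.append(page_no)
    return restored


data_file = {1: make_page(b"old"), 2: make_page(b"old")}
dirty = {1: make_page(b"new"), 2: make_page(b"new")}
dblwr = {}
flush(dirty, dblwr, data_file, torn_page=2)
torn = not is_intact(data_file[2])
restored = recover(dblwr, data_file)
print(torn, restored)  # True [2]
```

After recover returns, page 2 is a clean copy again, which is the state the redo log needs before it can be applied. With 16K pages, the 128 pages mentioned above come to 128 × 16K = 2M.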

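In the output, doublewrite wrote 1413988 pages in total, but the actual number of writes was 111623. If the ratio innodb_dblwr_pages_written:innodb_dblwr_writes is below 64:1, the write pressure on the system is not very high. The variable innodb_buffer_pool_pages_flushed shows the number of pages flushed from the buffer pool to disk. As the description above implies, the safest way to measure data write volume in a production environment is to use innodb_dblwr_pages_written. The parameter innodb_doublewrite controls whether doublewrite is enabled, and skip_innodb_doublewrite can also be used to disable it.
Note: some file systems, such as ZFS, provide their own protection against partial page writes. In that case doublewrite does not need to be enabled.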
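The pages-per-write ratio for the figures quoted above can be worked out directly. The function name is hypothetical; the two counters are the values reported in the text.

```python
def pages_per_write(pages_written, writes):
    """Average number of pages written per doublewrite operation."""
    return pages_written / writes


ratio = pages_per_write(1413988, 111623)
print(round(ratio, 2))  # 12.67
print(ratio < 64)       # True: well below 64:1
```

A ratio of roughly 12.67:1 is far below 64:1, which by the rule above points to modest write pressure.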
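3. Adaptive Hash Index
Hashing is a very fast lookup method, with O(1) time complexity in the general case. The number of lookups in a B+ tree depends on the height of the tree. In production, a B+ tree is usually 3 to 4 levels high, so a lookup takes 3 to 4 queries. InnoDB monitors the queries against the index pages of each table. If it observes that building a hash index would improve speed, it builds one; this is called the Adaptive Hash Index (AHI). The AHI is built from the B+ tree pages in the buffer pool, so it is built very quickly and there is no need to build a hash index over the whole table. InnoDB automatically builds hash indexes for certain hot pages based on access frequency and access pattern.

The AHI has one requirement: consecutive accesses to the page must use the same access pattern (query condition). For example, a composite index (a,b) can be accessed in the following ways: 1. WHERE a=xxx; 2. WHERE a=xxx AND b=xxx. If the two kinds of query alternate, InnoDB does not build an AHI for the page. The AHI also has these requirements: a. the pattern has been used for 100 accesses; b. the page has been accessed N times through the pattern, where N = records in the page / 16. According to the official documentation, with the AHI enabled, read and write speed can double, and join performance on secondary indexes can improve fivefold. The design idea is a self-tuning database that needs no manual adjustment by a DBA.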
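The conditions above can be modeled with a per-page counter that resets whenever the access pattern changes. This is a heavily simplified sketch, not InnoDB's heuristic: the class and method names are hypothetical, a single counter stands in for both the "100 accesses" and the "N accesses to the page" conditions, and patterns are represented as tuples of column names.

```python
class ToyPageStats:
    """Toy model of the access-pattern conditions for building an AHI entry."""

    def __init__(self, records_in_page):
        self.records_in_page = records_in_page
        self.pattern = None  # last access pattern seen on this page
        self.hits = 0        # consecutive accesses using that pattern

    def access(self, pattern):
        if pattern == self.pattern:
            self.hits += 1
        else:
            # A different pattern breaks the run of identical accesses.
            self.pattern = pattern
            self.hits = 1

    def should_build_hash_index(self):
        # a. at least 100 accesses with the pattern;
        # b. at least N accesses to the page, N = records in page / 16.
        return self.hits >= 100 and self.hits >= self.records_in_page / 16


steady = ToyPageStats(records_in_page=160)
for _ in range(100):
    steady.access(("a",))             # WHERE a=xxx, repeated

alternating = ToyPageStats(records_in_page=160)
for i in range(200):
    # Alternate WHERE a=xxx with WHERE a=xxx AND b=xxx.
    alternating.access(("a",) if i % 2 == 0 else ("a", "b"))

print(steady.should_build_hash_index())       # True
print(alternating.should_build_hash_index())  # False
```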
