Home  >  Article  >  Database  >  MySQL multi-version concurrency control MVCC instance analysis

MySQL multi-version concurrency control MVCC instance analysis

PHPz
PHPzforward
2023-06-03 11:51:02811browse

    1. What is MVCC

    MVCC (Multiversion Concurrency Control), multi-version concurrency control. As the name suggests, MVCC implements concurrency control of the database through multiple version management of data rows. This technology ensures consistent read operations under InnoDB's transaction isolation level. In other words, it is to query some rows that are being updated by another transaction, and you can see the values ​​before they were updated, so that you do not have to wait for another transaction to release the lock when doing the query. There is no formal standard for MVCC. The implementation of MVCC may be different in different DBMS, and it is not universally used (you can refer to the relevant DBMS documentation). Here we explain the implementation mechanism of MVCC in InnoDB (other MySQL storage engines do not support it)

    2 Snapshot reading and current reading

    The implementation of MVCC in MySQL InnoDB is mainly to improve database concurrency Performance, use a better way to handle

    read-write conflicts

    , so that even if there are read-write conflicts, no locking, non-blocking concurrent reading can be achieved , and this read refers to the snapshot read, not the current read. The current read is actually a locking operation and an implementation of pessimistic locking. The essence of MVCC is a way of using optimistic locking thinking. 2.1 Snapshot read

    Snapshot read is also called consistent read, and it reads snapshot data.

    Simple SELECT without locking is a snapshot read

    , is a non-blocking read without locking; for example:

    select * from player where ...
    The reason why snapshot reading occurs , is based on the consideration of improving concurrency performance. The implementation of snapshot reading is based on MVCC, which avoids locking operations and reduces overhead in many cases.

    Since it is based on multiple versions, the snapshot read may not necessarily read the latest version of the data, but may be the previous historical version.

    The premise of snapshot reading is that the isolation level is not the serial level. Snapshot reading at the serial level will degenerate into current reading.

    2.2 Current Reading

    The current reading reads the latest version of the record (the latest data, not the historical version of the data). When reading, it must also ensure that other concurrent transactions cannot modify the current record. , the read records will be locked. A locked SELECT, or addition, deletion, or modification of data will result in current reading.

    For example:

    MySQL multi-version concurrency control MVCC instance analysis3. Review

    3.1 Let’s talk about the isolation level again

    We know that transactions have With 4 isolation levels, there may be three concurrency problems:

    MySQL multi-version concurrency control MVCC instance analysisIn MysQL, the default isolation level is repeatable read, which can solve dirty reads and non-repeatable reads. If we look at the problem of reading only from the perspective of definition, it does not solve the problem of phantom reading. If we want to solve the phantom read problem, we need to use serialization, that is, to increase the isolation level to the highest level, but this will significantly reduce the transaction concurrency capability of the database.

    MVCC can not use the lock mechanism, but solve the problem of non-repeatable reads and phantom reads through optimistic locking! It can replace row-level locks in most cases and reduce system overhead.

    MySQL multi-version concurrency control MVCC instance analysis3.2 Hidden fields, Undo Log version chain

    Let’s review the undo log version chain. For tables using the InnoDB storage engine, its aggregation Cluster index records contain two necessary hidden columns.

    trx_id: Every time a transaction changes a clustered index record, the transaction ID of the transaction will be assigned to the trx_id hidden column. roll_pointer: Every time a clustered index record is modified, the old version will be written to the undo log. Then this hidden column is equivalent to a pointer, through which the information before the record can be found is modified.

    MySQL multi-version concurrency control MVCC instance analysis

    MySQL multi-version concurrency control MVCC instance analysis

    MySQL multi-version concurrency control MVCC instance analysis

    ##4. MVCC Implementation Principle of ReadViewMySQL multi-version concurrency control MVCC instance analysis

    The implementation of MVCC depends on:

    Hidden fields, Undo Log, Read View

    4.1 What is ReadView

    In the MVCC mechanism, multiple transactions pair Updating the same row record will generate multiple historical snapshots, and these historical snapshots are saved in the Undo Log. If a transaction wants to query this row record, which version of the row record needs to be read? At this time, ReadView needs to be used, which helps us solve the row visibility problem.

    When a transaction uses the MVCC mechanism to perform a snapshot read operation, a read view will be generated. This view is ReadView. When a transaction is started, a current snapshot of the database system will be generated. InnoDB constructs an array for each transaction to record and maintain the ID of the current active transactions in the system ("active" refers to started but not yet submitted. ).

    4.2 Design Idea

    Use READ UNCOMMITTED isolation level transactions. Since you can read the records modified by uncommitted transactions, you can directly read the latest records of the records. The version is just fine.

    Use SERIALIZABLE isolation level transactions, InnoDB stipulates to use locking to access records.

    Use READ COMMITTED Transactions with isolation level REPEATABLE READ must ensure that they read the records modified by has been submitted by the transaction. If another transaction has modified the record but has not yet submitted it, it cannot directly When reading the latest version of records, the core problem is to determine which version in the version chain is visible to the current transaction. This is the main problem to be solved by ReadView.

    This ReadView mainly contains 4 The more important contents are as follows:

    MySQL multi-version concurrency control MVCC instance analysis

    4.3 Rules of ReadView

    With this ReadView, when accessing a certain record, Just follow the steps below to determine whether a certain version of the record is visible.

    • If the trx_id attribute value of the accessed version is the same as the creator_trx_id value in ReadView, it means The current transaction is accessing its own modified records, so this version can be accessed by the current transaction.

    • If the value of the trx_id attribute of the accessed version is less than the up_limit_id# in ReadView ## value, indicating that the transaction that generated this version has been committed before the current transaction generates ReadView, so this version can be accessed by the current transaction.

    • If the trx_id attribute value of the accessed version is greater than or Equal to the

      low_limit_id value in ReadView, indicating that the transaction that generated this version was opened after the current transaction generated ReadView, so this version cannot be accessed by the current transaction.

    • If the trx_id attribute value of the accessed version is between ReadView's

      up_limit_id and low_limit_id, then you need to determine whether the trx_id attribute value is in the trx_ids list. If it is, it means that the transaction that generated this version when the ReadView was created is still active, and this version cannot be accessed. If it is not there, it means that the transaction that generated this version when the ReadView was created has been committed, and this version can be accessed. 4.4 MVCC overall operation process

    After understanding these concepts, let’s take a look at how the system finds it through MVCC when querying a record:

    • 1. First get the version number of the transaction itself, which is the transaction ID;

    • 2. GetReadView;

    • 3. Query the obtained data and then compare it with the transaction version number in ReadView;

    • 4. If it does not meet the Readview rules, you need to obtain the historical snapshot from the Undo Log;

    • 5. Finally, data that conforms to the rules is returned.

    If a certain version of data is not visible to the current transaction, then follow the version chain to find the next version of data, continue to follow the above steps to determine the visibility, and so on. , up to the last version in the version chain. If the most recent version is not visible, the record is not visible to the transaction and is not included in the query results.

    In InnoDB, MVCC reads data through Undo Log Read View. Undo Log saves historical snapshots, and Read View rules help us determine whether the current version of data is visible.

    When the isolation level is Read Committed, each select query in a transaction will re-obtain a Read View.

    MySQL multi-version concurrency control MVCC instance analysis

    When the isolation level is rereadable, non-repeatable reads are avoided. This is because a transaction only obtains a Read View once when it is selected for the first time. All subsequent selects will reuse this Read View,

    is as follows:

    MySQL multi-version concurrency control MVCC instance analysis

    5. Example

    Assume that there is only one record inserted into the student table by the

    transaction whose transaction id is 8:

    MySQL multi-version concurrency control MVCC instance analysis

    MVCC can only work under two isolation levels: READ COMMITTED and REPEATABLE READ. Next, let’s take a look at the difference in the so-called timing of generating Readview between

    READ COMMITTED and REPEATABLE READ.

    5.1 READ COMMITTED

    Under isolation level:

    READ COMMITTED: A ReadView is generated before each data reading

    There are currently two transactions with transaction IDs 10 and 20 being executed:

    MySQL multi-version concurrency control MVCC instance analysis

    Description: Transaction execution During the process, only when the record is actually modified for the first time (such as using INSERT, DELETE, UPDATE statements), a separate transaction ID will be assigned, and this transaction ID will be incremented. That's why we updated some records in other tables in transaction 2 in order to let it assign transaction ids.

    At this moment, the version linked list obtained by the record with id 1 in the student table is as follows:

    MySQL multi-version concurrency control MVCC instance analysis

    Assume now A transaction using the READ COMMITED isolation level begins execution:

    MySQL multi-version concurrency control MVCC instance analysis

    The execution process of this SELECT1 is as follows:

    Step 1: When executing the SELECT statement, a ReadView will be generated first. The content of the trx_ids list of ReadView is [10, 20], up_limit_id is 10, low_limit_id is 21, creator_trx_id is 0.
    Step 2: Select visible records from the version chain. As can be seen from the figure, the content of the column name of the latest version is '王五', and the # of this version The value of ##trx_id is 10, which is in the trx_ids list, so it does not meet the visibility requirements and jumps to the next version according to roll_pointer. Step 3: The content of the column
    name of the next version is '李思', and the trx_id value of this version is also 10 is also in the trx_ids list, so it does not meet the requirements and continues to jump to the next version. Step 4: The content of the column
    name of the next version is ‘Zhang San', and the trx_id value of this version is 8, is smaller than the up_limit_id value 10 in ReadView, so this version meets the requirements. The version returned to the user is the record whose column name is "Zhang San".

    After that, we submit the transaction

    id to 10:

    MySQL multi-version concurrency control MVCC instance analysis

    and then go to

    Transaction id is 20 Update the record in table student where id is 1:

    MySQL multi-version concurrency control MVCC instance analysis

    At this moment, the version chain of the record with id 1 in the student table looks like this:

    MySQL multi-version concurrency control MVCC instance analysis

    Then continue to search for the record with id 1 in the transaction that just used the READ COMMITTED isolation level,

    is as follows:

    MySQL multi-version concurrency control MVCC instance analysis

    Execution of this SELECT2 The process is as follows:

    Step 1: When executing the SELECT statement, a ReadView will be generated separately. The content of the trx_ids list of the ReadView is [20], and the up_limit_id is 20. low_limit_id is 21, creator_trx_id is 0.
    Step 2:Select visible records from the version chain. As can be seen from the figure, the content of the column name of the latest version is "Song Ba", and the tr_id value of this version is 20, is in the trx_ids list, so it does not meet the visibility requirements and jumps to the next version according to roll.pointer.
    Step 3:The content of the column name of the next version is "Qian Qi". The trx_id value of this version is 20, which is also in the trx_ids list, so it does not meet the requirements. Continue to jump. to the next version. Step 4: The content of the column name of the next version is "王五". The trx_id value of this version is 10, which is less than the up_limit.id value 20 in ReadView, so this version meets the requirements, and finally returns The version given to users is the record whose column name is "王五".
    By analogy, if the record with transaction ID 20 is also submitted later, when the record with ID value 1 in table student is queried again in a transaction using READ CONMMITTED isolation level, the result obtained is "Song Ba" ;, we will not analyze the specific process.

    5.2 REPEATABLE READ

    Under the isolation level:

    For transactions using the REPEATABLE READ isolation level, the query statement will only be executed for the first time When a ReadView is generated, subsequent queries will not be generated repeatedly.

    For example, there are two transactions with transaction IDs 10 and 20 being executed in the system:

    MySQL multi-version concurrency control MVCC instance analysis

    此刻,表student中id为1的记录得到的版本链表如下所示:

    MySQL multi-version concurrency control MVCC instance analysis

    假设现在有一个使用REPEATABLE READ隔离级别的事务开始执行:

    MySQL multi-version concurrency control MVCC instance analysis

    此时执行过程与read committed相同

    MySQL multi-version concurrency control MVCC instance analysis

    MySQL multi-version concurrency control MVCC instance analysis

    然后再到刚才使用REPEATABLE READ隔离级别的事务中继续查找id为1的记录,如下:

    MySQL multi-version concurrency control MVCC instance analysis

    这个SELECT2的执行过程如下:

    步骤1:因为当前事务的隔离级别为REPEATABLE READ,而之前在执行SELECT1时已经生成过ReadView了,所以此时直接复用之前的ReadView,之前的ReadView的trx_ids列表的内容就是[10,20],up_limit_id为10, low_limit_id为21 , creator_trx_id为0。
    步骤2:然后从版本链中挑选可见的记录,从图中可以看出,最新版本的列name的内容是’宋八’trx_id值为20,在trx_ids列表内,所以不符合可见性要求,根据roll_pointer跳到下一个版本。
    步骤3:下一个版本的列name的内容是’钱七’,该版本的trx_id值为20,也在trx_ids列表内合要求,继续跳到下一个版本。
    步骤4:下一个版本的列name的内容是’王五’,该版本的trx_id值为10,而trx_ids列表中是包含值为10的事务id的,所以该版本也不符合要求,同理下一个列name的内容是’李四’的版本也不符合要求。继续跳到下个版本。
    步聚5∶下一个版本的列name的内容是’张三’,该版本的trx_id值为80,小于Readview中的up_limit_id值10,所以这个版本是符合要求的,最后返回给用户的版本就是这条列c为‘张三’的记录。
    两次SELECT查询得到的结果是重复的,记录的列c值都是’张三’,这就是可重复读的含义。如果我们之后再把事务id为20的记录提交了,然后再到刚才使用REPEATABLE READ隔离级刷的事务中继续查找这个id为1的记录,得到的结果还是’张三’,具体执行过程大家可以自己分析一下。

    5.3 如何解决幻读

    假设现在表student中只有一条数据,数据内容中,主键id=1,隐藏的trx_id=10,它的undo log如下图所示。

    MySQL multi-version concurrency control MVCC instance analysis

    假设现在有事务A和事务B并发执行,事务A的事务id为20,事务B的事务id为30。
    步骤1:事务A开始第一次查询数据,查询的SQL语句如下。

    select * from student where id > 1;

    在开始查询之前,MySQL会为事务A产生一个ReadView,此时ReadView的内容如下: trx_ids=[20, 30 ] ,up_limit_id=20 , low_limit_id=31 , creator_trx_id=20。

    因为表student只有一条符合条件 where id>=1 的数据,所以会被查询出来。然后根据ReadView机制,发现该行数据的trx_id=10,小于事务A的ReadView里up_limit_id,这表示这条数据是事务A开启之前,其他事务就已经提交了的数据,因此事务A可以读取到。

    结论:事务A的第一次查询,能读取到一条数据,id=1。

    步骤2∶接着事务B(trx_id=30),往表student中新插入两条数据,并提交事务。

    insert into student(id,name) values(2,'李四');
    insert into student(id,name) values(3,'王五');

    此时表student中就有三条数据了,对应的undo如下图所示:

    MySQL multi-version concurrency control MVCC instance analysis

    Step 3: Then transaction A starts the second query. According to the rules of repeatable read isolation level, transaction A will not regenerate ReadView at this time. At this time, the three pieces of data in the student table all meet the condition of where id>=1, so they will be found first. Then according to the ReadView mechanism, it is judged whether each piece of data can be seen by transaction A.
    1) First of all, the data with id=1, as mentioned before, can be seen by transaction A.
    2) Then there is the data with id=2, and its trx_id=30. At this time, transaction A finds that this value is between up_limit_id and low_limit_id, so it is necessary to determine whether 30 is in the trx_ids array. Since transaction A's trx_ids=[20,30], in the array, this means that the data with id=2 was submitted by other transactions started at the same time as transaction A, so this data cannot be seen by transaction A. .
    3) Similarly, the trx_id of this data with id=3 is also 30, so it cannot be seen by transaction A.

    MySQL multi-version concurrency control MVCC instance analysis

    Conclusion:The second query of transaction A can only query the data with id=1. This is the same as the result of the first query of transaction A, so there is no phantom reading phenomenon. Therefore, under the repeatable isolation level of MySQL, there is no phantom reading problem.

    The above is the detailed content of MySQL multi-version concurrency control MVCC instance analysis. For more information, please follow other related articles on the PHP Chinese website!

    Statement:
    This article is reproduced at:yisu.com. If there is any infringement, please contact admin@php.cn delete