Home  >  Article  >  Database  >  An article briefly analyzing how MySQL solves the phantom reading problem

An article briefly analyzing how MySQL solves the phantom reading problem

青灯夜游
青灯夜游forward
2023-02-06 20:14:162051browse

How does MySQL solve the phantom read problem? The following article will let you talk about this issue. Let’s read the article with questions!

An article briefly analyzing how MySQL solves the phantom reading problem

##  Among the high-frequency interview questions of Jin Busan and Yin Busi, MySQL’s transaction characteristics, isolation level and other issues are also one of the very classic eight-part essays. Faced with this kind of Question, it is estimated that most friends are familiar with it:

Transaction characteristics (ACID): Atomicity (Atomicity),Isolation(Isolation), Consistency(Consistency) and Persistence

Isolation Level: READ UNCOMMITTEDREAD UNCOMMITTED), READ COMMITTEDREAD COMMITTED), Repeatable Read (REPEATABLE READ), Serializable (SERIALIZABLE)

And each kind of isolation The problems caused by the level are:

  • READ UNCOMMITTEDUnder the isolation level, dirty reads, non-repeatable reads and ## may occur. #phantom readproblem
  • READ COMMITTED
  • Under the isolation level, non-repeatable read and phantom read problems may occur, but they are not possible Dirty readproblem
  • REPEATABLE READ
  • Under the isolation level, phantom readproblem may occur, but dirty read# cannot occur ## and Non-repeatable readproblemsSERIALIZABLE
  • Under the isolation level, various problems cannot occur
  • For MySQL InnoDB The default isolation level supported by the storage engine is
  • REPEATABLE-READ (repeatable)
. From the definition of the four isolation levels of the SQL standard above,

REPEATABLE-READ (repeatable) It is impossible to prevent phantom reads, but we all know that the MySQL InnoDB storage engine solves the problem of phantom reads, so how does it solve it? 1. Row format

 Before entering the topic, we first have a general understanding of what the row format is. This will help us understand the following MVCC. The row format is The way row records in the table are stored on disk, Innodb

storage engine has a total of 4 different types of row formats:

compact, redundant, dynamic , compress; Although there are many line formats, they are basically the same in principle, as follows, the compact line format: As can be seen from the figure, A complete record can actually be divided into two parts: recorded additional information and An article briefly analyzing how MySQL solves the phantom reading problemrecorded real data. recorded additional information are respectively changes. Long field length list , NULL value list and record header information , and recorded real data In addition to our own defined columns, MySQL will Add some default columns to each record. These default columns are also called hidden columns. The specific columns are as follows:

Column nameLengthDescriptionrow_id6 bytes Row ID, uniquely identifies one Recordtransaction_id6 bytesTransaction IDroll_pointer7 bytesRollback pointer

We don’t need to worry about the value of the hidden column. InnoDB The storage engine will generate it for us. Let’s draw it in more detail. compactThe row format is as follows:

An article briefly analyzing how MySQL solves the phantom reading problem

  • transaction_id: Transaction id. When a transaction modifies a row record, the transaction id of this transaction will be assigned to this column.
  • roll_pointer: Every time a row is modified When the record is changed, the old version of the data will be written to the undolog log, and then roll_pointer will point to the undolog, so this column is equivalent to a pointer, through which you can find Modify the previous information

2. MVCC detailed explanation

2.1 Version chain

Assume there is a record as follows: An article briefly analyzing how MySQL solves the phantom reading problemThe transaction id inserted into the record is 80, and the roll_pointer pointer is NULL (for ease of understanding, readers can understand that the pointer is NULL, in fact roll_pointer The first bit marks the type of undo log it points to. If the value of this bit is 1, it means that the undo log type it points to is insert undo)

Assume the next twoTransaction IDThe transactions with 100 and 200 respectively perform UPDATE operations on this record:

 -- 事务id=100
 update person set grade =20 where id =1;
 update person set grade =40 where id =1;
 -- 事务id=200
 update person set grade =70 where id =1;

Each time When the record is modified, an undo log will be recorded. Each undo log also has a roll_pointer attribute (INSERT corresponding to the operation) ##undo log does not have this attribute, because the record does not have an earlier version), you can connect these undo log into a linked list, so the current situation is like the picture below Same:

 Every time the record is updated, the old value will be placed in an An article briefly analyzing how MySQL solves the phantom reading problemundo log, even if it is an old version of the record, as the number of updates increases increase, all versions will be connected into a linked list by the roll_pointer attribute. We call this linked list version chain. The head node of the version chain is the latest value of the current record. In addition, each version also contains the corresponding transaction id

#2.2 ReadView

# Four isolation levels for the database: 1)

read uncommitted; 2) read committed; 3) REPEATABLE READ; 4)SERIALIZABLE; for example, READ UNCOMMITTED, just read the latest data of the version chain each time; SERIALIZABLE, mainly controlled by locking; and read committed and REPEATABLE READ They all read things that have been submitted, so for these two isolation levels, the core issue is which things in the version chain are visible to the current thing; in order to solve this problem, MySQL proposed the read view concept, which contains four core Concept:

  • m_ids: When generating read view, the active thing id collection
  • min_trx_id:# The minimum value of ##m_ids, that is, the minimum value of active things when generating read view
  • max_trx_id
  • : Indicates that when generating read view, the system The next thing id value
  • creator_trx_id
  • should be assigned: the thing id that creates read view, which is the current thing id.
  • With this
ReadView

, when accessing a record, you only need to follow the steps below to determine whether a certain version of the record is visible:An article briefly analyzing how MySQL solves the phantom reading problem

When the recorded thing id is equal to
    creator_trx_id
  • , it means that the current thing is accessing the record it modified, so this version is visible If the accessed version thing id is less than
  • min_trx_id
  • , it means that when read view is created, the thing has been submitted, and this version is readable to the current thingIf the accessed version of the thing If the id is greater than or equal to
  • max_trx_id
  • , it means that when the read view is created, the thing id that generates the version record is not opened until the Read view is generated. Therefore, this version cannot be read by the current thing If the accessed version thing
  • transaction_id
  • is in the m_ids collection, it means that Read view is generated time, the transaction is still active and has not been submitted, then the version cannot be accessed; if not, it means that the transaction that generated the version when ReadView was created has been submitted and can be accessed
  • Note: The thing id for reading things is 0

In MySQL, a very big difference between READ COMMITTED and REPEATABLE READ isolation levels is that they generate ReadView at different times:

  • READ COMMITTED —— Generate a ReadView
  • REPEATABLE READ —— before reading data every time When reading data once, a ReadView

is generated. Let’s use detailed examples to illustrate the difference between the two:

#②BEGIN;BEGIN;③update person set grade =20 where id = 1;④##⑤SELECT * FROM person WHERE id = 1; ⑨##In time ④, due to transaction In time ⑥, due to transaction
Time number
trx 100 trx 200
BEGIN;





##update person set grade =40 where id =1;



##⑥
##COMMIT;

##update person set grade =70 where id =1;

SELECT * FROM person WHERE id = 1;


COMMIT;

?
COMMIT;

trx 100
the transaction was executed Submit, the version chain recorded in the id=1 line is as follows:
trx 200

, the transaction commit was executed, the version chain recorded in the id=1 line As follows:

An article briefly analyzing how MySQL solves the phantom reading problemAt time ⑤, transaction trx 100 will first generate a

ReadView## when executing the

selectAn article briefly analyzing how MySQL solves the phantom reading problem statement. #, the content of the

m_ids

list of ReadView is [100, 200], min_trx_id is 100,max_trx_id is 201, creator_trx_id is 0, at this time, select the visible record from the version chain, and traverse the version chain from top to bottom: Because grade=40, the value of trx_id is 100, which is in m_ids, so the record is not visible. Similarly, the record with grade=20 is also invisible. Continue traversing down, grade=20, trx_id value is 80, which is less than min_trx_id value 100# in ReadView ##, so this version meets the requirements, and records with level 10 are returned to the user. In time ⑧, if the isolation level of the transaction is READ COMMITTED, a separate ReadView will be generated, the ## of the ReadView The content of the #m_ids list is

[200]

, min_trx_id is 200, max_trx_id is 201 , creator_trx_id is 0. At this time, select the visible record from the version chain, and the version chain is traversed from top to bottom: because grade=70, the value of trx_id is 200, in m_ids, so the record is not visible, continue to traverse, grade=40, trx_id value is 100, It is less than the min_trx_id value 200 in ReadView, so this version meets the requirements, and a record with level 40 is returned to the user. In time ⑧, if the isolation level of the transaction is REPEATABLE READ, in time ⑧, a ReadView will not be generated separately, but the one of time 5 will be used. ReadView, so the level returned to the user is 10. The result of the two selects is the same. This is the meaning of repeatable reading.

3. Summary By analyzing the detailed explanation of MVCC, it can be concluded that based on MVCC, under the RR isolation level, it is very easy to solve phantom reading Problem, but we know that select for update generates current reads and is no longer snapshot reads. In this case, how does MySQL solve the

phantom read

problem? Based on time issues (it does take a lot of time to organize and draw pictures), I will give the conclusion first, and then analyze how MySQL solves the phantom reading problem under the current reading situation:

Current reading: Use Next-Key Lock (gap lock) to lock to ensure that phantom reading does not occurHow gap lock is used in the current reading situation If you want to solve the problem of phantom reading, interested friends can add a follow and like [Related recommendations:

mysql video tutorial
    ]

The above is the detailed content of An article briefly analyzing how MySQL solves the phantom reading problem. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:juejin.cn. If there is any infringement, please contact admin@php.cn delete