
What are the four transaction isolation levels of MySQL InnoDB, and what are dirty reads, non-repeatable reads, and phantom reads?

一个新手 · Original · 2017-09-19 09:59:42

1. MySQL InnoDB transaction isolation levels: dirty reads, non-repeatable reads, and phantom reads

MySQL InnoDB supports four transaction isolation levels; the default is repeatable read (REPEATABLE READ). A quick way to check and change the level is shown after the list below.

· 1). Read uncommitted (READ UNCOMMITTED). Another transaction has modified data but not yet committed it, and a SELECT in this transaction reads that uncommitted data (dirty read). This is the lowest isolation level and offers the highest concurrency.

· 2). Read committed (READ COMMITTED). This transaction reads the latest data committed by other transactions. The problem is that, within the same transaction, running the same SELECT twice can return different results. Non-repeatable reads and phantom reads can still occur (only the rows being read are locked).

· 3). Repeatable read (REPEATABLE READ). Within the same transaction, SELECT returns the state as of the start of the transaction, so repeated SELECTs return consistent results. However, phantom reads can still occur (explained later); all rows read are locked.

· 4). Serializable (SERIALIZABLE). Read operations implicitly acquire shared locks, which guarantees mutual exclusion between transactions (effectively a table lock).
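For reference, here is a minimal way to check and change the isolation level in MySQL 5.6 (the variable is named tx_isolation in this version; these are standard MySQL statements, shown only as a sketch):

-- check the global and session isolation levels (MySQL 5.6 variable name)
SELECT @@global.tx_isolation, @@tx_isolation;

-- change the isolation level for the current session only
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;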


The four levels increase in strictness, and each one solves a problem left by the previous level.

· 1). Dirty read. Another transaction has modified data but not yet committed it, and a SELECT in this transaction reads that uncommitted data.

· 2). Non-repeatable read. Once dirty reads are solved, you can still find that, while one transaction is executing, another transaction commits new data, so two reads of the same data within the first transaction return inconsistent results.

· 3). Phantom read. Once non-repeatable reads are solved, queries within the same transaction are guaranteed to reflect the state at the start of the transaction (consistency). However, if another transaction commits new rows in the meantime, this transaction will be "surprised" to discover those new rows when it performs an update, as if the data read earlier were a phantom.

Specifically:

## 1). Dirty read

First, distinguish dirty pages from dirty data.

Dirty pages are pages in the memory buffer pool that have been modified but not yet flushed to disk; the changes have, however, already been written to the redo log. Reading and modifying pages in the buffer pool is normal and improves efficiency, and the flush process synchronizes them to disk later. Dirty data means that a transaction has modified row records in the buffer pool but has not yet committed. If uncommitted row data in the buffer pool is read at this point, that is a dirty read, which violates transaction isolation. In other words, a dirty read happens when one transaction is modifying data and, before the modification is committed, another transaction reads and uses that data.
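As an aside, dirty pages (as opposed to dirty data) can be observed through InnoDB's standard buffer-pool status counters; a quick check, for the curious:

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';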

## 2). Non-repeatable read

This refers to reading the same data multiple times within one transaction. Before this transaction ends, another transaction also accesses the same data and commits a modification. The two reads in the first transaction, one before and one after that commit, may then return different data. Because the data read twice within one transaction differs, this is called a non-repeatable read. For example, an editor reads the same document twice, but between the two reads the author rewrites it; when the editor reads the document the second time, it has changed, so the read is not repeatable. This problem can be avoided if the editor can only read the document after the author has finished writing it.

## 3). Phantom read

This refers to a phenomenon that occurs when transactions do not execute in isolation. For example, the first transaction modifies the data in a table, and the modification covers all rows in the table. At the same time, a second transaction inserts a new row into that table. The user running the first transaction then finds that the table still contains an unmodified row, as if experiencing a hallucination. For example, an editor changes a document submitted by an author, but when production merges the changes into the master copy, it turns out the author has added new, unedited material to the document. This problem can be avoided if no one can add new material until the editor and the production department have finished working on the original document.

2. Isolation level experiments

The following experiments were run on the blogger's MySQL Server 5.6.

First create a table, as follows:

USE test;
CREATE TABLE `t` (
  `a` int(11) NOT NULL PRIMARY KEY
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
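The experiments below assume the table already contains the rows 1, 2, 3. The original article does not show the seeding step; presumably something like the following was run first (hypothetical seed data):

INSERT INTO t(a) VALUES (1), (2), (3);
COMMIT;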


2.1. Experiment 1: demonstrating dirty reads and repeatable reads

Five sessions run concurrently: Transaction A (READ-UNCOMMITTED), Transaction B (READ-COMMITTED), Transaction C-1 (REPEATABLE-READ), Transaction C-2 (REPEATABLE-READ), and Transaction D (SERIALIZABLE).

1) All sessions: set autocommit = 0; start transaction;
2) Transaction A: insert into t(a) values(4);   (not yet committed)
3) Each session runs select * from t;
   - A: 1,2,3,4 (dirty read: it reads data from an uncommitted transaction)
   - B: 1,2,3 (dirty read solved)
   - C-1: 1,2,3
   - C-2: 1,2,3
   - D: 1,2,3
4) Transaction A: commit;
5) Each session runs select * from t; again
   - A: 1,2,3,4
   - B: 1,2,3,4 (not in the same transaction as its earlier read, so it reads the latest committed data and can see 4)
   - C-1: 1,2,3 (repeatable read: still in the same transaction, so it only reads the data as of the start of that transaction)
   - C-2: 1,2,3,4
   - D: 1,2,3,4
6) commit; (the next statement starts a new transaction, so the latest committed data can be read)
   select * from t; now returns 1,2,3,4 in every session.

READ-UNCOMMITTED produces dirty reads, is rarely suitable for real scenarios, and is therefore basically never used.

2.2. Experiment 2: testing READ-COMMITTED and REPEATABLE-READ

Three sessions: Transaction A, Transaction B (READ-COMMITTED), and Transaction C (REPEATABLE-READ).

1) All sessions: set autocommit = 0; start transaction;
2) Transaction A: insert into t(a) values(4);
3) Transactions B and C: select * from t; — both return 1,2,3
4) Transaction A: commit;
5) Transaction B: select * from t; returns 1,2,3,4 (READ-COMMITTED only guarantees that the latest committed data is read)
   Transaction C: select * from t; returns 1,2,3 (repeatable read: still in the same transaction, so it only reads the data as of start transaction)
6) Transaction C: commit; (the next statement starts a new transaction, so the latest committed data can be read)
   Transaction C: select * from t; returns 1,2,3,4

REPEATABLE-READ guarantees that the data read within a transaction is repeatable: after the first read, the same SELECT in the same transaction returns the same result, even if other transactions have committed new data in the meantime.
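The tables above do not show how each session obtained its isolation level. Here is a minimal setup sketch for reproducing Experiment 2, assuming one mysql client per session (the isolation-level statements are standard MySQL; everything else mirrors the steps above):

-- session B
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET autocommit = 0;
START TRANSACTION;
SELECT * FROM t;   -- 1,2,3

-- session C
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET autocommit = 0;
START TRANSACTION;
SELECT * FROM t;   -- 1,2,3

-- session A
INSERT INTO t(a) VALUES (4);
COMMIT;

-- session B sees the newly committed row
SELECT * FROM t;   -- 1,2,3,4

-- session C still sees its snapshot
SELECT * FROM t;   -- 1,2,3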

Of course, data visibility is defined with respect to other transactions; within the same transaction, the latest data written by that transaction itself is always visible. For example:

start transaction;
insert into t(a) values(4);
select * from t;
1,2,3,4;
insert into t(a) values(5);
select * from t;
1,2,3,4,5;


2.3. Experiment 3: testing the impact of a SERIALIZABLE transaction on other transactions


Five sessions: Transaction A (SERIALIZABLE), Transaction B (READ-UNCOMMITTED), Transaction C (READ-COMMITTED), Transaction D (REPEATABLE-READ), and Transaction E (SERIALIZABLE).

1) All sessions: set autocommit = 0; start transaction;
2) Transaction A: select a from t union all select sleep(1000) from dual;
3) Transactions B, C, D, and E each run: insert into t(a) values(5);
4) All four inserts eventually fail with:
   ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

 

SERIALIZABLE forces serialized execution, so all the other transactions have to wait until transaction A ends before they can proceed. The sleep function is used deliberately here so that transactions B, C, D, and E all wait for the lock held by transaction A. Because the sleep lasts 1000 seconds while innodb_lock_wait_timeout is 120 seconds, the HY000 error is raised once the 120 seconds are up.

SERIALIZABLE is a very strict serialized execution mode. Whether reading or writing, it affects every other transaction that touches the same table; it behaves like a strict table-level read-write exclusive lock and gives up the advantages of the InnoDB engine. It is rarely used in practice.
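For reference, the lock wait timeout mentioned above can be inspected and adjusted per session with standard InnoDB variables (the 120-second value is the blogger's setting; the stock default is 50 seconds):

SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';
SET SESSION innodb_lock_wait_timeout = 120;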


2.4. Experiment 4: Phantom Read

Some articles claim that InnoDB's repeatable read avoids "phantom reads". That statement is not accurate. Let's run an experiment (for all the experiments below, pay attention to the storage engine and the isolation level):

CREATE TABLE `t_bitfly` (
  `id` bigint(20) NOT NULL default '0',
  `value` varchar(32) default NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

select @@global.tx_isolation, @@tx_isolation;
+-----------------------+-----------------+
| @@global.tx_isolation | @@tx_isolation  |
+-----------------------+-----------------+
| REPEATABLE-READ       | REPEATABLE-READ |
+-----------------------+-----------------+

Experiment 4-1:


Session A: start transaction;
Session B: start transaction;

Session A: SELECT * FROM t_bitfly;
  -> Empty set

Session B: INSERT INTO t_bitfly VALUES (1, 'a');
Session B: COMMIT;

Session A: SELECT * FROM t_bitfly;
  -> Empty set

Session A: INSERT INTO t_bitfly VALUES (1, 'a');
  -> ERROR 1062 (23000): Duplicate entry '1' for key 1
  (you just told me clearly that there is no such record!)

This is a phantom read: the transaction believed the table contained no such row, yet the data already existed, and the conflict only surfaced on insert.

Experiment 4-2:


Session A: start transaction;
Session B: start transaction;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session B: INSERT INTO t_bitfly VALUES (2, 'b');

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session B: COMMIT;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session A: UPDATE t_bitfly SET value='z';
  -> Rows matched: 2  Changed: 2  Warnings: 0
  (where did the extra row come from?)

Session A: SELECT * FROM t_bitfly;
  -> 2 rows: (1, 'z'), (2, 'z')

In this transaction, a row is read for the first time; after an UPDATE is performed, the data committed by the other transaction appears. This can also be regarded as a kind of phantom read.

Attached explanation


So why does the InnoDB documentation state that phantom reads can be avoided?

http://dev.mysql.com/doc/refman/5.0/en/innodb-record-level-locks.html

By default, InnoDB operates in REPEATABLE READ transaction isolation level and with the innodb_locks_unsafe_for_binlog system variable disabled. In this case, InnoDB uses next-key locks for searches and index scans, which prevents phantom rows (see Section 13.6.8.5, "Avoiding the Phantom Problem Using Next-Key Locking").

A first reading of this is: at the repeatable read isolation level, with innodb_locks_unsafe_for_binlog disabled, InnoDB uses next-key locks for searches and index scans, which avoids phantom rows.

The key question is whether InnoDB adds next-key locks to ordinary queries by default, or whether the application has to take the locks itself. Reading only the sentence above, you might conclude that InnoDB locks ordinary queries as well; but if that were the case, what would be the difference from SERIALIZABLE?

There is another paragraph in the MySQL manual:

13.2.8.5. Avoiding the Phantom Problem Using Next-Key Locking (http://dev.mysql.com/doc/refman/5.0/en/innodb-next-key-locking.html)

To prevent phantoms, InnoDB uses an algorithm called next-key locking that combines index-row locking with gap locking.

You can use next-key locking to implement a uniqueness check in your application: If you read your data in share mode and do not see a duplicate for a row you are going to insert, then you can safely insert your row and know that the next-key lock set on the successor of your row during the read prevents anyone meanwhile inserting a duplicate for your row. Thus, the next-key locking enables you to "lock" the nonexistence of something in your table.

My understanding is that InnoDB provides next-key locks, but the application has to request the locking itself. The manual gives an example:

SELECT * FROM child WHERE id > 100 FOR UPDATE;

With this, InnoDB locks the rows whose id is greater than 100 (say the child table has a row with id 102), as well as the gap between 100 and 102 and the gap above 102.

You can use SHOW ENGINE INNODB STATUS to check whether locks have been placed on the table.
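Here is a sketch of the uniqueness check described in the manual quote above, using the child table from its example (the id value 101 and the single-column insert are purely illustrative):

START TRANSACTION;
-- the share-mode read takes next-key locks on the scanned range, so other
-- sessions cannot insert a duplicate into that range until this transaction ends
SELECT * FROM child WHERE id = 101 LOCK IN SHARE MODE;
-- if the previous read returned nothing, it is now safe to insert the row
INSERT INTO child (id) VALUES (101);
COMMIT;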


Now look at one more experiment. Note that id is the primary key field of table t_bitfly.

Experiment 4-3:



Session A: start transaction;
Session B: start transaction;

Session A: SELECT * FROM t_bitfly WHERE id<=1 FOR UPDATE;
  -> 1 row: (1, 'a')

Session B: INSERT INTO t_bitfly VALUES (2, 'b');
  -> Query OK, 1 row affected

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session B: INSERT INTO t_bitfly VALUES (0, '0');
  -> (waiting for lock ... then timeout)
  ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session A: COMMIT;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

 

You can see that the lock taken with id<=1 only locks the range id<=1: the record with id 2 can be inserted successfully, while the insert of the record with id 0 has to wait for the lock to be released.

Attached note:

The MySQL manual gives a detailed explanation of locking under repeatable read:

http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html#isolevel_repeatable-read

For locking reads (SELECT with FOR UPDATE or LOCK IN SHARE MODE), UPDATE, and DELETE statements, locking depends on whether the statement uses a unique index with a unique search condition, or a range-type search condition. For a unique index with a unique search condition, InnoDB locks only the index record found, not the gap before it. For other search conditions, InnoDB locks the index range scanned, using gap locks or next-key (gap plus index-record) locks to block insertions by other sessions into the gaps covered by the range.
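To make the quoted rule concrete, here is a sketch against the same t_bitfly table (it mirrors Experiment 4-3 above; the comments describe standard InnoDB behavior under REPEATABLE READ):

-- unique index, unique search condition: only the index record id=1 is locked,
-- not the gap before it, so other sessions can still insert id=0 or id=2
SELECT * FROM t_bitfly WHERE id = 1 FOR UPDATE;

-- range search condition: the scanned range is locked with next-key/gap locks,
-- so inserts by other sessions into id <= 1 will block (as seen in Experiment 4-3)
SELECT * FROM t_bitfly WHERE id <= 1 FOR UPDATE;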

Consistent read versus committed read: first look at the experiment.

Experiment 4-4:


Session A: start transaction;
Session B: start transaction;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session B: INSERT INTO t_bitfly VALUES (2, 'b');
Session B: COMMIT;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session A: SELECT * FROM t_bitfly LOCK IN SHARE MODE;
  -> 2 rows: (1, 'a'), (2, 'b')

Session A: SELECT * FROM t_bitfly FOR UPDATE;
  -> 2 rows: (1, 'a'), (2, 'b')

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')
 

Note: an ordinary read returns the consistent (snapshot) result, while a locking read returns the latest committed data.

By their nature, repeatable read and committed read are contradictory. Within the same transaction, if repeatable reads are guaranteed, the commits of other transactions are not visible, which violates committed read; if committed reads are guaranteed, then two reads of the same data can return different results, which violates repeatable read.

It can be said that InnoDB provides a mechanism for both: under the default repeatable read isolation level, you can use a locking read to query the latest committed data.

http://dev.mysql.com/doc/refman/5.0/en/innodb-consistent-read.html

If you want to see the "freshest" state of the database, you should use either the READ COMMITTED isolation level or a locking read:

SELECT * FROM t_bitfly LOCK IN SHARE MODE;

------

3. Summary

Conclusion: the default isolation level of MySQL InnoDB transactions is repeatable read, which does not guarantee the absence of phantom reads; the application must use locking reads to ensure that. The mechanism behind those locking reads is next-key locking.
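As a closing illustration of that conclusion, here is a sketch of how the Experiment 4-2 scenario could be made safe by taking the locks at read time instead of discovering the phantom at update time (one possible pattern, using the t_bitfly table from above):

START TRANSACTION;
-- the locking read takes next-key locks on the scanned range, so other
-- sessions can no longer insert rows into that range until we finish
SELECT * FROM t_bitfly FOR UPDATE;
-- the update now matches exactly the rows that were just read and locked
UPDATE t_bitfly SET value = 'z';
COMMIT;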
