
What are the four transaction isolation levels of MySQL InnoDB, and what are dirty reads, non-repeatable reads, and phantom reads?

一个新手 · Original · 2017-09-19 09:59:42

1. MySQL InnoDB transaction isolation levels: dirty reads, non-repeatable reads, and phantom reads

MySQL InnoDB supports four transaction isolation levels; the default is repeatable read (REPEATABLE READ). A quick way to check and change the level is shown after the list below.

· 1). Read uncommitted (READ UNCOMMITTED). Another transaction has modified data but not yet committed it, and a SELECT in this transaction reads that uncommitted data (dirty read). This is the lowest isolation level and offers the highest concurrency.

· 2). Read committed (READ COMMITTED). This transaction reads the latest data committed by other transactions. The problem is that, within the same transaction, running the same SELECT twice can return different results. Non-repeatable reads and phantom reads can still occur (only the rows being read are locked).

· 3). Repeatable read (REPEATABLE READ). Within the same transaction, SELECT returns the state as of the start of the transaction, so repeated SELECTs return consistent results. However, phantom reads can still occur (explained later); all rows read are locked.

· 4). Serializable (SERIALIZABLE). Read operations implicitly acquire shared locks, which guarantees mutual exclusion between transactions (effectively a table lock).
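For reference, here is a minimal way to check and change the isolation level in MySQL 5.6 (the variable is named tx_isolation in this version; these are standard MySQL statements, shown only as a sketch):

-- check the global and session isolation levels (MySQL 5.6 variable name)
SELECT @@global.tx_isolation, @@tx_isolation;

-- change the isolation level for the current session only
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;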


The four levels increase in strictness, and each one solves a problem left by the previous level.

· 1). Dirty read. Another transaction has modified data but not yet committed it, and a SELECT in this transaction reads that uncommitted data.

· 2). Non-repeatable read. Once dirty reads are solved, you can still find that, while one transaction is executing, another transaction commits new data, so two reads of the same data within the first transaction return inconsistent results.

· 3). Phantom read. Once non-repeatable reads are solved, queries within the same transaction are guaranteed to reflect the state at the start of the transaction (consistency). However, if another transaction commits new rows in the meantime, this transaction will be "surprised" to discover those new rows when it performs an update, as if the data read earlier were a phantom.

Specifically:

## 1). Dirty read

First, distinguish dirty pages from dirty data.

Dirty pages are pages in the memory buffer pool that have been modified but not yet flushed to disk; the changes have, however, already been written to the redo log. Reading and modifying pages in the buffer pool is normal and improves efficiency, and the flush process synchronizes them to disk later. Dirty data means that a transaction has modified row records in the buffer pool but has not yet committed. If uncommitted row data in the buffer pool is read at this point, that is a dirty read, which violates transaction isolation. In other words, a dirty read happens when one transaction is modifying data and, before the modification is committed, another transaction reads and uses that data.
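As an aside, dirty pages (as opposed to dirty data) can be observed through InnoDB's standard buffer-pool status counters; a quick check, for the curious:

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';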

## 2). Non-repeatable read

This refers to reading the same data multiple times within one transaction. Before this transaction ends, another transaction also accesses the same data and commits a modification. The two reads in the first transaction, one before and one after that commit, may then return different data. Because the data read twice within one transaction differs, this is called a non-repeatable read. For example, an editor reads the same document twice, but between the two reads the author rewrites it; when the editor reads the document the second time, it has changed, so the read is not repeatable. This problem can be avoided if the editor can only read the document after the author has finished writing it.

## 3). Phantom read

This refers to a phenomenon that occurs when transactions do not execute in isolation. For example, the first transaction modifies the data in a table, and the modification covers all rows in the table. At the same time, a second transaction inserts a new row into that table. The user running the first transaction then finds that the table still contains an unmodified row, as if experiencing a hallucination. For example, an editor changes a document submitted by an author, but when production merges the changes into the master copy, it turns out the author has added new, unedited material to the document. This problem can be avoided if no one can add new material until the editor and the production department have finished working on the original document.

2. Isolation level experiments

The following experiments were run on the blogger's MySQL Server 5.6.

First create a table, as follows:

USE test;
CREATE TABLE `t` (
  `a` int(11) NOT NULL PRIMARY KEY
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
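The experiments below assume the table already contains the rows 1, 2, 3. The original article does not show the seeding step; presumably something like the following was run first (hypothetical seed data):

INSERT INTO t(a) VALUES (1), (2), (3);
COMMIT;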


2.1. Experiment 1: demonstrating dirty reads and repeatable reads

Five sessions run concurrently: Transaction A (READ-UNCOMMITTED), Transaction B (READ-COMMITTED), Transaction C-1 (REPEATABLE-READ), Transaction C-2 (REPEATABLE-READ), and Transaction D (SERIALIZABLE).

1) All sessions: set autocommit = 0; start transaction;
2) Transaction A: insert into t(a) values(4);   (not yet committed)
3) Each session runs select * from t;
   - A: 1,2,3,4 (dirty read: it reads data from an uncommitted transaction)
   - B: 1,2,3 (dirty read solved)
   - C-1: 1,2,3
   - C-2: 1,2,3
   - D: 1,2,3
4) Transaction A: commit;
5) Each session runs select * from t; again
   - A: 1,2,3,4
   - B: 1,2,3,4 (not in the same transaction as its earlier read, so it reads the latest committed data and can see 4)
   - C-1: 1,2,3 (repeatable read: still in the same transaction, so it only reads the data as of the start of that transaction)
   - C-2: 1,2,3,4
   - D: 1,2,3,4
6) commit; (the next statement starts a new transaction, so the latest committed data can be read)
   select * from t; now returns 1,2,3,4 in every session.

READ-UNCOMMITTED produces dirty reads, is rarely suitable for real scenarios, and is therefore basically never used.

2.2. Experiment 2: testing READ-COMMITTED and REPEATABLE-READ

Three sessions: Transaction A, Transaction B (READ-COMMITTED), and Transaction C (REPEATABLE-READ).

1) All sessions: set autocommit = 0; start transaction;
2) Transaction A: insert into t(a) values(4);
3) Transactions B and C: select * from t; — both return 1,2,3
4) Transaction A: commit;
5) Transaction B: select * from t; returns 1,2,3,4 (READ-COMMITTED only guarantees that the latest committed data is read)
   Transaction C: select * from t; returns 1,2,3 (repeatable read: still in the same transaction, so it only reads the data as of start transaction)
6) Transaction C: commit; (the next statement starts a new transaction, so the latest committed data can be read)
   Transaction C: select * from t; returns 1,2,3,4

REPEATABLE-READ guarantees that the data read within a transaction is repeatable: after the first read, the same SELECT in the same transaction returns the same result, even if other transactions have committed new data in the meantime.
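The tables above do not show how each session obtained its isolation level. Here is a minimal setup sketch for reproducing Experiment 2, assuming one mysql client per session (the isolation-level statements are standard MySQL; everything else mirrors the steps above):

-- session B
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET autocommit = 0;
START TRANSACTION;
SELECT * FROM t;   -- 1,2,3

-- session C
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET autocommit = 0;
START TRANSACTION;
SELECT * FROM t;   -- 1,2,3

-- session A
INSERT INTO t(a) VALUES (4);
COMMIT;

-- session B sees the newly committed row
SELECT * FROM t;   -- 1,2,3,4

-- session C still sees its snapshot
SELECT * FROM t;   -- 1,2,3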

Of course, data visibility is defined with respect to other transactions; within the same transaction, the latest data written by that transaction itself is always visible. For example:

start transaction;
insert into t(a) values(4);
select * from t;
1,2,3,4;
insert into t(a) values(5);
select * from t;
1,2,3,4,5;


2.3. Experiment 3: testing the impact of a SERIALIZABLE transaction on other transactions


Five sessions: Transaction A (SERIALIZABLE), Transaction B (READ-UNCOMMITTED), Transaction C (READ-COMMITTED), Transaction D (REPEATABLE-READ), and Transaction E (SERIALIZABLE).

1) All sessions: set autocommit = 0; start transaction;
2) Transaction A: select a from t union all select sleep(1000) from dual;
3) Transactions B, C, D, and E each run: insert into t(a) values(5);
4) All four inserts eventually fail with:
   ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

 

SERIALIZABLE forces serialized execution, so all the other transactions have to wait until transaction A ends before they can proceed. The sleep function is used deliberately here so that transactions B, C, D, and E all wait for the lock held by transaction A. Because the sleep lasts 1000 seconds while innodb_lock_wait_timeout is 120 seconds, the HY000 error is raised once the 120 seconds are up.

SERIALIZABLE is a very strict serialized execution mode. Whether reading or writing, it affects every other transaction that touches the same table; it behaves like a strict table-level read-write exclusive lock and gives up the advantages of the InnoDB engine. It is rarely used in practice.
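For reference, the lock wait timeout mentioned above can be inspected and adjusted per session with standard InnoDB variables (the 120-second value is the blogger's setting; the stock default is 50 seconds):

SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';
SET SESSION innodb_lock_wait_timeout = 120;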


2.4. Experiment 4: Phantom Read

Some articles claim that InnoDB's repeatable read avoids "phantom reads". That statement is not accurate. Let's run an experiment (for all the experiments below, pay attention to the storage engine and the isolation level):

CREATE TABLE `t_bitfly` (
  `id` bigint(20) NOT NULL default '0',
  `value` varchar(32) default NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

select @@global.tx_isolation, @@tx_isolation;
+-----------------------+-----------------+
| @@global.tx_isolation | @@tx_isolation  |
+-----------------------+-----------------+
| REPEATABLE-READ       | REPEATABLE-READ |
+-----------------------+-----------------+

Experiment 4-1:


Session A: start transaction;
Session B: start transaction;

Session A: SELECT * FROM t_bitfly;
  -> Empty set

Session B: INSERT INTO t_bitfly VALUES (1, 'a');
Session B: COMMIT;

Session A: SELECT * FROM t_bitfly;
  -> Empty set

Session A: INSERT INTO t_bitfly VALUES (1, 'a');
  -> ERROR 1062 (23000): Duplicate entry '1' for key 1
  (you just told me clearly that there is no such record!)

This is a phantom read: the transaction believed the table contained no such row, yet the data already existed, and the conflict only surfaced on insert.

Experiment 4-2:


Session A: start transaction;
Session B: start transaction;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session B: INSERT INTO t_bitfly VALUES (2, 'b');

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session B: COMMIT;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session A: UPDATE t_bitfly SET value='z';
  -> Rows matched: 2  Changed: 2  Warnings: 0
  (where did the extra row come from?)

Session A: SELECT * FROM t_bitfly;
  -> 2 rows: (1, 'z'), (2, 'z')

In this transaction, a row is read for the first time; after an UPDATE is performed, the data committed by the other transaction appears. This can also be regarded as a kind of phantom read.

Attached explanation


So why does the InnoDB documentation state that phantom reads can be avoided?

http://dev.mysql.com/doc/refman/5.0/en/innodb-record-level-locks.html

By default, InnoDB operates in REPEATABLE READ transaction isolation level and with the innodb_locks_unsafe_for_binlog system variable disabled. In this case, InnoDB uses next-key locks for searches and index scans, which prevents phantom rows (see Section 13.6.8.5, "Avoiding the Phantom Problem Using Next-Key Locking").

A first reading of this is: at the repeatable read isolation level, with innodb_locks_unsafe_for_binlog disabled, InnoDB uses next-key locks for searches and index scans, which avoids phantom rows.

The key question is whether InnoDB adds next-key locks to ordinary queries by default, or whether the application has to take the locks itself. Reading only the sentence above, you might conclude that InnoDB locks ordinary queries as well; but if that were the case, what would be the difference from SERIALIZABLE?

There is another paragraph in the MySQL manual:

13.2.8.5. Avoiding the Phantom Problem Using Next-Key Locking (http://dev.mysql.com/doc/refman/5.0/en/innodb-next-key-locking.html)

To prevent phantoms, InnoDB uses an algorithm called next-key locking that combines index-row locking with gap locking.

You can use next-key locking to implement a uniqueness check in your application: If you read your data in share mode and do not see a duplicate for a row you are going to insert, then you can safely insert your row and know that the next-key lock set on the successor of your row during the read prevents anyone meanwhile inserting a duplicate for your row. Thus, the next-key locking enables you to "lock" the nonexistence of something in your table.

My understanding is that InnoDB provides next-key locks, but the application has to request the locking itself. The manual gives an example:

SELECT * FROM child WHERE id > 100 FOR UPDATE;

With this, InnoDB locks the rows whose id is greater than 100 (say the child table has a row with id 102), as well as the gap between 100 and 102 and the gap above 102.

You can use SHOW ENGINE INNODB STATUS to check whether locks have been placed on the table.
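Here is a sketch of the uniqueness check described in the manual quote above, using the child table from its example (the id value 101 and the single-column insert are purely illustrative):

START TRANSACTION;
-- the share-mode read takes next-key locks on the scanned range, so other
-- sessions cannot insert a duplicate into that range until this transaction ends
SELECT * FROM child WHERE id = 101 LOCK IN SHARE MODE;
-- if the previous read returned nothing, it is now safe to insert the row
INSERT INTO child (id) VALUES (101);
COMMIT;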


Now look at one more experiment. Note that id is the primary key field of table t_bitfly.

Experiment 4-3:



Session A: start transaction;
Session B: start transaction;

Session A: SELECT * FROM t_bitfly WHERE id<=1 FOR UPDATE;
  -> 1 row: (1, 'a')

Session B: INSERT INTO t_bitfly VALUES (2, 'b');
  -> Query OK, 1 row affected

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session B: INSERT INTO t_bitfly VALUES (0, '0');
  -> (waiting for lock ... then timeout)
  ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session A: COMMIT;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

 

You can see that the lock taken with id<=1 only locks the range id<=1: the record with id 2 can be inserted successfully, while the insert of the record with id 0 has to wait for the lock to be released.

Attached note:

The MySQL manual gives a detailed explanation of locking under repeatable read:

http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html#isolevel_repeatable-read

For locking reads (SELECT with FOR UPDATE or LOCK IN SHARE MODE), UPDATE, and DELETE statements, locking depends on whether the statement uses a unique index with a unique search condition, or a range-type search condition. For a unique index with a unique search condition, InnoDB locks only the index record found, not the gap before it. For other search conditions, InnoDB locks the index range scanned, using gap locks or next-key (gap plus index-record) locks to block insertions by other sessions into the gaps covered by the range.
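To make the quoted rule concrete, here is a sketch against the same t_bitfly table (it mirrors Experiment 4-3 above; the comments describe standard InnoDB behavior under REPEATABLE READ):

-- unique index, unique search condition: only the index record id=1 is locked,
-- not the gap before it, so other sessions can still insert id=0 or id=2
SELECT * FROM t_bitfly WHERE id = 1 FOR UPDATE;

-- range search condition: the scanned range is locked with next-key/gap locks,
-- so inserts by other sessions into id <= 1 will block (as seen in Experiment 4-3)
SELECT * FROM t_bitfly WHERE id <= 1 FOR UPDATE;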

Consistent read versus committed read: first look at the experiment.

Experiment 4-4:


Session A: start transaction;
Session B: start transaction;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session B: INSERT INTO t_bitfly VALUES (2, 'b');
Session B: COMMIT;

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')

Session A: SELECT * FROM t_bitfly LOCK IN SHARE MODE;
  -> 2 rows: (1, 'a'), (2, 'b')

Session A: SELECT * FROM t_bitfly FOR UPDATE;
  -> 2 rows: (1, 'a'), (2, 'b')

Session A: SELECT * FROM t_bitfly;
  -> 1 row: (1, 'a')
 

Note: an ordinary read returns the consistent (snapshot) result, while a locking read returns the latest committed data.

By their nature, repeatable read and committed read are contradictory. Within the same transaction, if repeatable reads are guaranteed, the commits of other transactions are not visible, which violates committed read; if committed reads are guaranteed, then two reads of the same data can return different results, which violates repeatable read.

It can be said that InnoDB provides a mechanism for both: under the default repeatable read isolation level, you can use a locking read to query the latest committed data.

http://dev.mysql.com/doc/refman/5.0/en/innodb-consistent-read.html

If you want to see the "freshest" state of the database, you should use either the READ COMMITTED isolation level or a locking read:

SELECT * FROM t_bitfly LOCK IN SHARE MODE;

------

3. Summary

Conclusion: the default isolation level of MySQL InnoDB transactions is repeatable read, which does not guarantee the absence of phantom reads; the application must use locking reads to ensure that. The mechanism behind those locking reads is next-key locking.
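As a closing illustration of that conclusion, here is a sketch of how the Experiment 4-2 scenario could be made safe by taking the locks at read time instead of discovering the phantom at update time (one possible pattern, using the t_bitfly table from above):

START TRANSACTION;
-- the locking read takes next-key locks on the scanned range, so other
-- sessions can no longer insert rows into that range until we finish
SELECT * FROM t_bitfly FOR UPDATE;
-- the update now matches exactly the rows that were just read and locked
UPDATE t_bitfly SET value = 'z';
COMMIT;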
