Home  >  Article  >  Database  >  A brief understanding of MySQL locks, transactions, and MVCC

A brief understanding of MySQL locks, transactions, and MVCC

coldplay.xixi
coldplay.xixiforward
2020-11-04 17:31:122326browse

mysql tutorial This column introduces a simple understanding of MySQL locks, transactions, and MVCC.

A brief understanding of MySQL locks, transactions, and MVCC

More related free learning recommendations: mysql tutorial(video)

When a single SQL statement is executed, will it be submitted as a transaction?

The following content is excerpted from "High-Performance MySQL" (3rd Edition)

"

MySQL uses automatic submission by default ( AUTOCOMMIT) mode. That is, if you do not explicitly start a transaction, each query is treated as a transaction to perform a commit operation. In the current connection, you can enable or disable the automatic commit mode by setting the AUTOCOMMIT variable

How does MySQL implement ACID for transactions?

Transactions have the four major characteristics of ACID. So how does MySQL implement these four attributes of transactions?

  • Atomicity Either all succeed or all fail. MySQL achieves atomicity by recording undo_log. undo_log is rollback log. Undo_log is written to the disk before the real SQL is executed, and then the database data is operated. If an exception or rollback occurs, you can perform reverse operations based on undo_log to restore the data as it was before the transaction was executed.

  • Persistence Once a transaction is committed normally, its impact on the database should be permanent. Even if the system crashes at this time, the modified data will not be lost. InnoDB is the storage engine of MySQL, and data is stored on the disk. However, if disk IO is required every time to read and write data, the efficiency will be very low. To this end, InnoDB provides a cache (Buffer Pool) as a buffer for accessing the database: when reading data from the database, it will first be read from the Buffer Pool. If it is not in the Buffer Pool, it will be read from the disk and put into the Buffer. Pool; when writing data to the database, it will be written to the Buffer Pool first, and the modified data in the Buffer Pool will be regularly refreshed to the disk.

    Such a design also brings corresponding problems: If the data is submitted and the data is still in the buffer pool (the disk has not been flushed yet), what should I do if MySQL goes down or loses power? Will data be lost?

    The answer is no, MySQL ensures durability through the redo_log mechanism. redo_log is redo log. Simply put, when the data is modified, in addition to modifying the data in the Buffer Pool, the operation will also be recorded in the redo_log; when the transaction is submitted, the fsync interface will be called to flush the redo_log. plate. If MySQL goes down, you can read the data in redo_log and recover the database when it restarts.

  • Isolation

    Isolation is the most complex one in ACID, which involves the concept of isolation level, there are four in total

    • Read uncommitted
    • Read committed
    • Repeatable read
    • Serializable

    Simply put, the isolation level stipulates: modification of data in a transaction, which transactions are visible and which are not visible. Isolation is to manage the access sequence of multiple concurrent read and write requests.

    MySQL We will discuss the specific implementation of isolation later.

  • Consistency

    Achieve consistency through rollback, recovery and isolation in concurrent environments.

Problems that may be caused by transaction concurrency

Through the previous question, I Knowing that a single DDL execution will be automatically submitted as a transaction, whether it is the concurrency of multiple SQLs or the concurrency of multiple manually organized transactions containing multiple SQLs, it will lead to transaction concurrency problems.

Specifically:

  • Dirty write (data submitted by one transaction overwrites uncommitted data by another transaction)
  • Dirty read (one transaction reads uncommitted data from another transaction)
  • Non-repeatable read (the key point is that the data read multiple times within a transaction is different between update and delete)
  • Phantom reading (the key point is that the number of records read multiple times in an insert transaction is different)

We mentioned the isolation level of the transaction above , all isolation levels of MySQL can guarantee that dirty writes will not occur, so the only problems left are dirty reads, non-repeatable reads and phantom reads.

Let’s take a closer look at how each isolation level solves or does not solve the above problems:

Read uncommitted

Uncommitted read. This level does not add any locks during the reading process. It only locks during the write request. Therefore, the write operation modifies the data during the reading process. Will cause dirty reads. Non-repeatable reads and phantom reads will naturally occur.

Read committed

Committed read, like uncommitted read, is not locked for reading and locked for writing. . The difference is that the MVCC mechanism is used to avoid the problem of dirty reads. There are also problems of non-repeatable reads and phantom reads. We will talk about MVCC in detail later.

Repeatable read

MySQL default isolation level, at this level MySQL uses two methods to solve the problem

  1. Read-write lock A read lock is added when reading in parallel, and reading and reading share the lock. As long as there is a write request, a write lock is added, so that reading and writing are serial. Locking is performed when reading data, and other transactions cannot modify the data. Therefore, non-repeatable reads will not occur. Locks are also required when modifying and deleting data. Other transactions cannot read the data, so dirty reads will not occur. The first method is what we often call "pessimistic lock". The data is locked during the entire transaction process, which is relatively conservative and has a relatively large performance overhead.
  2. MVCC (discussed later)

In addition, Next-Key lock is also used to solve the problem of phantom reading to a certain extent. We’ll talk about this later.

Serializable

Under this isolation level, transactions are executed serially. If autocommit is disabled, InnoDB implicitly converts all ordinary SELECT statements to SELECT ... LOCK IN SHARE MODE. That is, a read shared lock is implicitly added to the read operation, thereby avoiding the problems of dirty reads, non-repeatable reads, and phantom reads.

MVCC

Multiversion concurrency control (MCC or MVCC), is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages ​​to implement transactional memory (MCC or MVCC) is a concurrency control method that is usually used by database management systems to provide concurrent access to the database and to implement transaction storage in programming languages.

Simply put, it is a method used by the database to control concurrency. Each database may have a different implementation of MVCC.

Taking our commonly used MySQL as an example, MySQL's InnoDB engine implements MVCC.

What problems can MVCC solve?

From the above definition, we can see that MVCC mainly solves transaction concurrency issues. Data consistency issues

How InnoDB implements MVCC

The following picture is from "High Performance MySQL" (3rd Edition)

This book is well written and translated well. My initial systematic understanding of MySQL was also due to reading this book. However, I personally think there are some problems with the description of how MVCC is implemented. of.

Let’s see where the problem is

  • First, let’s take a look at the official documentation of MySQL. I compared 5.1, 5.6, and 5.7. The description of this part of MVCC in two versions of document[1] is almost the same.

According to the document, it is obvious that three hidden columns are added to each piece of data:

  • ## The #6-byte DB_TRX_ID field indicates the transaction ID of the most recent insertion or update of the record.
  • The 7-byte DB_ROLL_PTR field points to the undo log record of the rollback segment of the record.
  • The 6-byte DB_ROW_ID will be automatically incremented when new data is inserted. When there is no user primary key on the table, InnoDB will automatically generate a clustered index, including the DB_ROW_ID field.
Here I add a MySQL internal structure diagram including rollback segment

Version chain

Before we I have talked about the concept of undo_log. Each undo log has a roll_pointer attribute. Then all versions will be connected into a linked list by the roll_pointer attribute. We call this linked list a version chain. The head node of the version chain is the latest value of the current record. .

ReadView

By hiding columns and version chains, MySQL can restore data to a specified version; but which version to restore to specifically needs to be determined based on ReadView. The so-called ReadView means that a transaction (recorded as transaction A) takes a snapshot of the entire transaction system (trx_sys) at a certain moment. When a read operation is performed later, the transaction ID in the read data will be compared with the trx_sys snapshot, so that Determine whether the data is visible to the ReadView, that is, whether it is visible to transaction A.

So far we have discovered that MVCC is implemented based on hidden fields, undo_log chains and ReadView.

MVCC in Read committed

We talked about the solution using MVCC in Read committed isolation level earlier Dirty read problem. Here I refer to two articles:

  • https://cloud.tencent.com/developer/article/1150633
  • https:/ /cloud.tencent.com/developer/article/1150630

InnoDB will only look for data rows whose version is earlier than the current transaction version (that is, the version number of the row is less than or equal to the transaction version system version number), this ensures that the rows read by the data either already exist before the transaction starts, or have been inserted or modified by the transaction itself. Therefore no dirty reads will occur.

Read committed The occurrence of non-repeatable reads under the isolation level is due to the generation mechanism of read view. At the Read committed level, data that has been committed before the current statement is executed is visible. During each statement execution, the read view is closed and the current read view is re-created. In this way, the transaction interval of the read view can be created based on the current global transaction list. Simply put, under the Read committed isolation level, MVCC generates a snapshot version for each select, so each select will read different versions of data, so non-repeatable reads will occur.

MVCC in Repeatable read

##Repeatable read isolation level solves the problem of non-repeatable read in a transaction Multiple reads will not produce different results, ensuring repeatable reads. In the previous article, we said that Repeatable read has two implementation methods, one is the pessimistic locking method, and the opposite MVCC is the optimistic locking method.

Repeatable read The fundamental reason why the isolation level can solve the problem of non-repeatable reading is actually that the generation mechanism of read view is different from that of Read committed.

  • Read committed: As long as the data has been submitted before the current statement is executed, it is visible.
  • Repeatable read: As long as the data has been submitted before the current transaction is executed, it is visible.

Unlike Read committed, under the isolation level of Repeatable read, when a transaction is created, the current global read view is generated and maintained until the end of the transaction. This enables repeatable reads.

Phantom read and Next-Key lock

Current read and snapshot Reading

Through the MVCC mechanism, although the data can be read repeatedly, the data we read may be historical data, untimely data, and not the current data of the database. data! For this way of reading historical data, we call it snapshot read (snapshot read), and the way of reading the current version of the database data is called current read (current read) Reference[3]

  • Snapshot reading: select
    • select * from table….;
  • Current read: Special read operations, insert/update/delete operations, belong to current read, process the current data, and need to be locked.
    • select * from table where ? lock in share mode;
    • select * from table where ? for update;
    • insert;
    • update ;
    • delete;

##Solution to phantom reading

In order to solve the phantom reading problem in

current reading, MySQL transaction uses next-key lock.

Repeatable read avoids phantom reading through the next-key lock mechanism.

The InnoDB storage engine has three row lock algorithms, which are:

  • Record Lock: Lock on a single record
  • Gap Lock: Gap Lock, locks a range, but does not include the record
  • Next-Key Lock: Gap Lock Record Lock

next-key lock is a type of row lock, and its implementation is equivalent to record lock (record lock) gap lock (gap lock); its characteristic is that it not only locks the record itself (the function of record lock), but also locks a range ( gap lock function).

When InnoDB scans the index record, it will first add a row lock (Record Lock) to the index record, and then add a gap lock (Gap Lock) to the gap on both sides of the index record. After adding the gap lock, other transactions cannot modify or insert records in this gap.

When the queried index contains unique attributes, Next-Key Lock will be optimized and downgraded to Record Lock, which only locks the index itself, not the range.

The above is the detailed content of A brief understanding of MySQL locks, transactions, and MVCC. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:juejin.im. If there is any infringement, please contact admin@php.cn delete