Home  >  Article  >  Database  >  Comprehensive understanding of transactions in MySql_MySQL

Comprehensive understanding of transactions in MySql_MySQL

WBOY
WBOYOriginal
2016-08-20 08:48:08910browse

I have been working on order-based projects recently, using transactions. Our database uses MySql, and the storage engine uses innoDB, which has good support for transactions. In this article, let’s take a look at the knowledge related to affairs.

Why do we need affairs?

Transactions are widely used in various scenarios such as order systems and banking systems. If there is the following scenario: User A and User B are depositors of the bank. Now A wants to transfer 500 yuan to B. Then you need to do the following things:

1. Check A’s account balance > 500 yuan;

2. Deduct 500 yuan from account A;

3. Add 500 yuan to account B;

After the normal process, 500 was deducted from account A and 500 was added to account B. Everyone was happy. What if the system fails after money is deducted from account A? A lost 500 in vain, and B did not receive the 500 that should have belonged to him. In the above case, there is a prerequisite hidden: A deducting money and B adding money, either succeed at the same time or fail at the same time. That's what business requires.

What is the transaction?

Instead of defining a transaction, let’s talk about the characteristics of the transaction. As we all know, transactions need to meet the four ACID properties.

1. A(atomicity) atomicity. The execution of a transaction is considered as an indivisible minimum unit. The operations in the transaction must either be executed successfully or all failed and rolled back. You cannot execute only part of them.

2. C(consistency) consistency. The execution of a transaction should not violate the integrity constraints of the database. If the system crashes after the second operation in the above example is executed, it is guaranteed that the total money of A and B will not change.

3. I(isolation) isolation. Generally speaking, the behavior of transactions should not affect each other. However, in actual situations, the degree of interaction between transactions is affected by the isolation level. Details will be given later in the article.

4. D(durability) persistence. After the transaction is committed, the committed transaction needs to be persisted to disk. Even if the system crashes, the submitted data should not be lost.

Four isolation levels of transactions

As mentioned in the previous article, the isolation of transactions is affected by the isolation level. So what is the isolation level of a transaction? A transaction's isolation level can be thought of as the degree of "selfishness" of a transaction, which defines the visibility between transactions. Isolation levels are divided into the following types:

1.READ UNCOMMITTED (uncommitted reading). Under the RU isolation level, the modifications made to the data by transaction A are visible to transaction B even if they are not committed. This problem is called dirty reading. This is an isolation level with a lower degree of isolation. It can cause many problems in practical applications, so it is generally not commonly used.

2.READ COMMITTED. Under the isolation level of RC, there will be no dirty read problem. The modifications made by transaction A to the data will be visible to transaction B after submission. For example, when transaction B is opened, it reads data 1, then transaction A is opened, changes the data to 2, submits, and B reads the data again, it will Read the latest data 2. Under the isolation level of RC, the problem of non-repeatable reads will occur. This isolation level is the default isolation level for many databases.

3.REPEATABLE READ(repeatable read). Under the isolation level of RR, there will be no non-repeatable read problem. After the modifications made to the data by transaction A are submitted, they will not be visible to transactions started before transaction A. For example, when transaction B is opened, data 1 is read. Then transaction A is opened, changes the data to 2, and commits. B reads the data again, and still can only read 1. Under the isolation level of RR, the problem of phantom reading will occur. The meaning of phantom reading is that when a transaction reads a value in a certain range, and another transaction inserts a new record in this range, then the previous transaction reads the value in this range again and will read Newly inserted data. Mysql's default isolation level is RR, but mysql's innoDB engine gap lock successfully solves the problem of phantom reads.

4.SERIALIZABLE (serializable). Serializable is the highest isolation level. This isolation level forces all things to be executed serially. At this isolation level, every row of data read is locked, which will lead to a large number of lock acquisition problems and the worst performance.

To help understand the four isolation levels, here is an example. As shown in Figure 1, transaction A and transaction B are opened one after another, and data 1 is updated multiple times. Four villains start transactions at different times. What values ​​​​can they see in data 1?

Picture 1

The first villain may read anything between 1-20. Because under the isolation level of uncommitted read, data modifications by other transactions are also visible to the current transaction. The second villain may read 1, 10 and 20. He can only read data that has been submitted by other transactions. The data read by the third villain depends on the time when its own transaction is started. The value read when the transaction is started will be the value read before the transaction is committed. The fourth villain can only read data when it is turned on between A end and B start. However, the data cannot be read during the execution of transaction A and transaction B. Because the fourth villain needs to be locked when reading data, during the execution of transactions A and B, the write lock of the data will be occupied, causing the fourth villain to wait for the lock.

Figure 2 lists the problems faced by different isolation levels.

Picture 2

Obviously, the higher the isolation level, the greater the resource consumption (locks) it brings, so its concurrency performance is lower. To be precise, under the serializable isolation level, there is no concurrency.

Picture 3

Transactions in MySql

The implementation of transactions is based on the database storage engine. Different storage engines have different levels of support for transactions. The storage engines that support transactions in mysql include innoDB and NDB. innoDB is the default storage engine of mysql. The default isolation level is RR, and it goes one step further under the isolation level of RR. It solves the non-repeatable read problem through multi-version concurrency control (MVCC, Multiversion Concurrency Control), plus gap lock ( That is, concurrency control) solves the phantom reading problem. Therefore, innoDB's RR isolation level actually achieves the effect of serialization level and retains relatively good concurrency performance. The isolation of transactions is achieved through locks, while the atomicity, consistency and durability of transactions are achieved through transaction logs. When it comes to transaction logs, what I have to say is redo and undo.

1.redo log

In the innoDB storage engine, transaction logs are implemented through redo logs and the log buffer (InnoDB Log Buffer) of the innoDB storage engine. When a transaction is started, the operations in the transaction will first be written to the log buffer of the storage engine. Before the transaction is committed, these buffered logs need to be flushed to disk in advance for persistence. This is what DBAs often call "log first" "(Write-Ahead Logging). After the transaction is committed, the data files mapped in the Buffer Pool will be slowly refreshed to the disk. At this time, if the database crashes or is down, when the system is restarted for recovery, the database can be restored to a state before the crash based on the logs recorded in the redo log. Unfinished transactions can continue to be submitted or rolled back, depending on the recovery strategy.

When the system starts, a continuous storage space has been allocated for the redo log.

The redo log is recorded in a sequential append method and improves performance through sequential IO. All transactions share the storage space of the redo log, and their redo logs are recorded together alternately in the order of statement execution. Here is a simple example:

Record 1:

Record 2:

Record 3:

Record 4:

Record 5:

2.undo log

undo log mainly serves transaction rollback. During the transaction execution process, in addition to recording redo log, a certain amount of undo log will also be recorded. The undo log records the status of the data before each operation. If a rollback is required during transaction execution, the rollback operation can be performed based on the undo log. The rollback of a single transaction will only roll back the operations performed by the current transaction and will not affect the operations performed by other transactions.

The following is the simplified process of undo+redo transaction

Suppose there are 2 values, A and B respectively, with values ​​1 and 2

1. start transaction;

2. Record A=1 to undo log;

3. update A = 3;

4. Record A=3 to redo log;

5. Record B=2 to undo log;

6. update B = 4;

7. Record B = 4 to redo log;

8. Refresh redo log to disk

9. commit

If the system goes down at any step 1-8 and the transaction is not committed, the transaction will not have any impact on the data on the disk. If it goes down between 8 and 9, you can choose to roll back after recovery, or you can choose to continue to complete the transaction submission, because the redo log has been persisted at this time. If the system crashes after 9 and the changed data in the memory map is not flushed back to the disk, then after the system is restored, the data can be flushed back to the disk according to the redo log.

So, redo log actually guarantees the durability and consistency of transactions, while undo log guarantees the atomicity of transactions.

Distributed transactions

There are many ways to implement distributed transactions. You can use the native transaction support provided by innoDB, or you can use message queues to achieve the ultimate consistency of distributed transactions. Here we mainly talk about innoDB’s support for distributed transactions.

As shown in the picture, mysql's distributed transaction model. The model is divided into three parts: application (AP), resource manager (RM), and transaction manager (TM).

The application defines the boundaries of the transaction and specifies what transactions need to be done;

The resource manager provides methods to access transactions. Usually a database is a resource manager;

The transaction manager coordinates and participates in various transactions in the global transaction.

Distributed transactions adopt a two-phase commit method. In the first phase, all transaction nodes start preparing and tell the transaction manager that they are ready. The second-stage transaction manager tells each node whether to commit or rollback. If a node fails, all global nodes need to be rolled back to ensure the atomicity of the transaction.

Summary

When do you need to use transactions? I think that as long as the business needs to meet ACID scenarios, transaction support is needed. Especially in order systems and banking systems, transactions are indispensable. This article mainly introduces the characteristics of transactions and mysql innoDB's support for transactions. There is far more knowledge related to affairs than what is stated in the article. This article is only an introduction. I hope readers will forgive me for any shortcomings.

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn