
A comprehensive guide to MySQL logs

WBOY
2022-10-07 09:00:29

This article introduces MySQL's logs. MySQL's log system is the key to ensuring that data is not lost no matter when the server crashes. Let's take a look at it together; I hope it will be helpful to everyone.


MySQL's logging system is the key to its guarantee that data is not lost no matter when the server crashes.

As we all know, MySQL is a persistent database: all data is persisted to disk so that it is not lost.

MySQL's guarantee that data is not lost is reflected in two aspects:

  • The data can be restored to its state at any point in time.

  • Whether the crash happens before or after a transaction commits, no data is lost.

If the crash happens during a transaction, the data is restored to the state before the transaction started.

If the crash happens after the transaction commits, the committed data is not lost.

MySQL achieves these two guarantees through three logs: undo log, redo log and binlog. They are introduced one by one below.

undo log (rollback log)

The undo log is MySQL's rollback log; it stores old versions of data.

Main functions

  • Stores old versions of data

  • Works with the Read View and the hidden fields of each row to implement MySQL's snapshot reads

  • Used to roll back to the version before the transaction started when the transaction fails

Types of undo log

There are two types of undo log:

  • For an insert, the undo log records the primary key of the newly added record. On rollback, the record with that primary key is simply deleted.

  • For an update or delete, the undo log records the old data of the modified record.

Each row of data in MySQL carries two hidden fields: the id of the transaction that last modified the row, and a rollback pointer. When a row is modified, the old version of the row is saved in the undo log, and the rollback pointer of the newly generated row version points to that old version.

  • To avoid concurrency problems when the undo log pointer is modified, MySQL takes an exclusive lock on it before the modification, to ensure that the undo log is written correctly.
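The version chain described above can be pictured with a small sketch. This is only a toy model under stated assumptions: the names RowVersion, Table, trx_id and roll_ptr are made up for illustration and are not InnoDB's internal structures, and locking is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    """One version of a row. Field names are illustrative, not InnoDB's."""
    value: dict                       # column values of this version
    trx_id: int                       # transaction that produced this version
    roll_ptr: Optional["RowVersion"]  # older version kept in the undo log


class Table:
    def __init__(self):
        self.rows = {}  # primary key -> newest RowVersion

    def insert(self, pk, value, trx_id):
        # Insert undo: only the primary key is needed to undo the insert.
        self.rows[pk] = RowVersion(value, trx_id, roll_ptr=None)

    def update(self, pk, value, trx_id):
        # Update undo: the old version stays reachable via the rollback pointer.
        old = self.rows[pk]
        self.rows[pk] = RowVersion(value, trx_id, roll_ptr=old)

    def rollback(self, pk):
        # Follow the rollback pointer; with no older version, undo the insert.
        old = self.rows[pk].roll_ptr
        if old is None:
            del self.rows[pk]
        else:
            self.rows[pk] = old
```

Rolling back an update restores the version the rollback pointer refers to, while rolling back an insert deletes the row by its primary key, which matches the two undo log types above.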


When the undo log is deleted

The undo log ensures that an uncommitted transaction can be rolled back smoothly to the state before it started. Once the transaction has committed and no snapshot read still needs the old versions, the undo log has served its purpose and needs to be deleted.

Undo logs are deleted by the purge thread in MySQL. The purge thread periodically checks the deleted_bit flag in the undo log; this flag is set to true after the transaction commits. When the purge thread finds a record whose flag is true, it deletes it.

redo log

The redo log is MySQL's physical log; it records what modification was made to which data page.

The role of the redo log

  • Records the data modifications made by transactions. A record says roughly: "an update was made at offset z of page y of table x".

  • Means MySQL does not have to wait for the data pages to be persisted to disk when a transaction commits; it only has to persist the redo log.

  • The amount of redo log that has not yet been cleared reflects the number of dirty pages that have not yet been flushed to disk.

Why a commit persists the redo log instead of the data

Persisting data pages to disk is random IO, so MySQL caches modified data in memory and waits for a suitable moment to write it to disk in one go, reducing IO. However, data cached in memory can be lost, so MySQL persists the redo log.

The redo log is written sequentially, which is more efficient than random writes, and it records the changes made to the data. As long as the redo log exists, the data can be restored after MySQL restarts.

In InnoDB, the redo log has a fixed size and behaves like a circular queue. Each write appends at write pos, which moves forward; when dirty pages are persisted, the checkpoint moves forward as well.


The reason for this design is that the redo log exists only to prevent cached dirty-page data from being lost when MySQL crashes.

Once data has been persisted to disk, the part of the redo log that covers it is no longer needed, so its space can be freed to record new changes.
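As a rough illustration of the write pos and checkpoint idea, here is a minimal sketch of a fixed-size circular log. It is an assumption-laden toy: the RedoLog class and its methods are hypothetical, capacity is counted in records rather than bytes, and real InnoDB redo log files are far more involved.

```python
class RedoLog:
    """Toy fixed-size circular log; sizes are in records, not bytes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_pos = 0    # total records ever written
        self.checkpoint = 0   # records whose dirty pages are already on disk

    def free_slots(self):
        return self.capacity - (self.write_pos - self.checkpoint)

    def append(self, record):
        if self.free_slots() == 0:
            # write pos has caught up with the checkpoint: dirty pages must
            # be flushed (advancing the checkpoint) before more can be written.
            raise BufferError("redo log full, flush dirty pages first")
        self.slots[self.write_pos % self.capacity] = record
        self.write_pos += 1

    def advance_checkpoint(self, n):
        # Called after n records' worth of dirty pages were persisted.
        self.checkpoint = min(self.checkpoint + n, self.write_pos)

    def pending(self):
        # Records that would have to be replayed after a crash.
        return [self.slots[i % self.capacity]
                for i in range(self.checkpoint, self.write_pos)]
```

The records between checkpoint and write pos are exactly those whose dirty pages are not yet on disk, so they are the only ones a restart would need to replay.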

The difference between undo log and redo log

The undo log records the old state of the data during transaction execution, while the redo log records the state after the data is updated.

The redo log guarantees the durability and consistency of transactions, while the undo log guarantees their atomicity.

binlog (archive log)

The binlog is a log implemented at the MySQL server layer and is common to all storage engines.

Function

The binlog records the logic of the original statements and is written in append-only fashion, so it can be used to restore the database to its state at any point in time.

This is why the binlog is called the archive log.

The binlog is also what MySQL relies on for master-slave replication: the slave copies the binlog from the master and replays it to synchronize with the master's data state.
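To make "restore the state at any point in time" concrete, here is a minimal sketch of replaying an append-only log up to a chosen moment. The event format (timestamp, operation, key, value) and the replay function are invented for illustration; the real binlog uses its own event formats, and a real recovery normally starts from a backup rather than an empty state.

```python
def replay(binlog, until):
    """Rebuild state by replaying events with timestamp <= until."""
    state = {}
    for ts, op, key, value in binlog:   # events are in append order
        if ts > until:
            break
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state


# Append-only list of (timestamp, operation, key, value) events.
binlog = [
    (1, "set", "x", 1),
    (2, "set", "y", 2),
    (3, "set", "x", 5),
    (4, "delete", "y", None),
]
```

Because events are only ever appended, stopping the replay at a different timestamp yields the state as of that moment; a slave that replays everything ends up in the same state as the master.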

WAL (Write-Ahead Logging) mechanism

Definition

Write the log to disk first, and write the data to disk later. A write in MySQL does not go to the data files immediately: the log is written first, making sure both the redo log and the binlog are persisted to disk, and a background thread later picks a suitable moment to persist the data itself.

Why the log has to be written to disk first

Flushing dirty pages is random IO, so it is much slower than sequential writes such as those to the redo log and binlog. MySQL therefore modifies the data in memory first and persists it to disk asynchronously at a later time.

While the dirty pages have not yet been flushed, the redo log and binlog guarantee the durability of the data and prevent the in-memory changes from being lost on a power failure or restart.

When the buffer is full of dirty pages, they have to be written to disk before they can be evicted. Why not simply evict them and rebuild them from the redo log the next time they are used?

For performance reasons: if every page read from disk into memory had to be checked against the redo log and updated, it would be very inefficient.

By flushing dirty pages before evicting them, MySQL guarantees that a data page found in memory is always the latest version and can be returned directly.

If a page is not in memory, reading it from disk is guaranteed to give the latest correct data, with no need to compare it against the redo log.

The writing process of binlog and redo log: the basic guarantee of the WAL mechanism

Both the binlog and the redo log split log writing into three steps: writing to a cache, write, and sync.

While a transaction executes, its binlog and redo log records are written to their respective caches, so that they can be written out in one go when the transaction commits.

When the transaction commits, write is performed first: it hands the data to the operating system's page cache. At this point the data has not actually reached the file on disk, but it is in the operating system's safekeeping. If the MySQL process crashes now, the written data is not lost; kernel threads of the operating system will write the cached data to disk.

  • If the operating system itself crashes, however, this data is lost.

Finally, MySQL explicitly calls sync to persist the data in the page cache to disk. Once that call completes, the data has been persisted successfully.
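The three steps can be sketched with ordinary file calls. This is a simplified illustration, not how MySQL is implemented: commit_log, log_cache and the record contents are hypothetical, and only os.write and os.fsync stand in for the write and sync steps.

```python
import os
import tempfile


def commit_log(path, cache, do_sync):
    """Write buffered records to path; optionally fsync them."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # Step 2 (write): hand the bytes to the OS page cache.
        os.write(fd, b"".join(cache))
        if do_sync:
            # Step 3 (sync): force the page cache contents to disk.
            os.fsync(fd)
    finally:
        os.close(fd)
    cache.clear()


# Step 1: records accumulate in an in-process cache during the transaction.
log_cache = [b"page 3 offset 16: set 42\n", b"page 7 offset 0: set 1\n"]
log_path = os.path.join(tempfile.mkdtemp(), "toy_redo.log")
commit_log(log_path, log_cache, do_sync=True)
```

With do_sync=False the records would survive a crash of the process but not of the operating system, which is the distinction the parameters in the next part control.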

For the final write and sync steps, MySQL provides parameters that control the write strategy.

The redo log is controlled through innodb_flush_log_at_trx_commit:

  • When set to 0, the redo log is only left in the redo log cache each time a transaction commits. The risk of loss is the greatest.

  • When set to 1, the redo log is persisted directly to disk each time a transaction commits. The risk of loss is the smallest, but IO usage is high.

  • When set to 2, the redo log is only written to the page cache each time a transaction commits. IO usage is in between; the most IO-intensive step, writing to disk, is left to the operating system.
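The three settings can be summarized as a small lookup. The functions below are a sketch that only restates the list above; redo_commit_steps and survives are hypothetical helper names, and background flushing that happens independently of commits is ignored.

```python
def redo_commit_steps(flush_log_at_trx_commit):
    """Steps performed on the redo log at commit time for each setting."""
    steps = {
        0: [],                  # stays in the redo log cache
        1: ["write", "fsync"],  # persisted to disk
        2: ["write"],           # handed to the OS page cache only
    }
    if flush_log_at_trx_commit not in steps:
        raise ValueError("expected 0, 1 or 2")
    return steps[flush_log_at_trx_commit]


def survives(flush_log_at_trx_commit, crash):
    """Does a just-committed transaction's redo log survive the crash?

    crash is "mysql" (process crash) or "os" (operating system crash).
    """
    steps = redo_commit_steps(flush_log_at_trx_commit)
    if crash == "mysql":
        return "write" in steps
    return "fsync" in steps
```

For example, survives(2, "mysql") is True while survives(2, "os") is False, which is why setting 2 sits between the other two in both safety and IO cost.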

The binlog is controlled through the sync_binlog parameter:

  • When sync_binlog=0, only write is performed each time a transaction commits; fsync is not.

  • When sync_binlog=1, fsync is performed each time a transaction commits.

  • When sync_binlog=N (N>1), write is performed each time a transaction commits, but fsync is performed only after N transactions have accumulated.
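The counting behaviour of sync_binlog can be sketched as follows. BinlogSyncPolicy is a hypothetical class written only to mirror the three cases above; it performs no real IO.

```python
class BinlogSyncPolicy:
    """Counts commits and decides when to fsync, following sync_binlog."""

    def __init__(self, sync_binlog):
        self.sync_binlog = sync_binlog
        self.unsynced = 0   # commits written but not yet fsynced

    def on_commit(self):
        """Return the steps performed for this commit."""
        steps = ["write"]            # write happens on every commit
        self.unsynced += 1
        if self.sync_binlog > 0 and self.unsynced >= self.sync_binlog:
            steps.append("fsync")
            self.unsynced = 0
        return steps
```

With sync_binlog=N, up to N-1 committed transactions may exist only in the page cache at any moment, so that is the amount of binlog that could be lost if the operating system crashes.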

Two-phase log commit

What is two-phase log commit


Committing the redo log is split into two phases, prepare and commit, and the binlog is written between the two.

When a transaction commits, the redo log is written first and enters the prepare state. Only after the binlog has been written is the redo log record changed to the commit state.

Why two-phase log commit is needed

It is related to the rollback mechanism of the InnoDB engine: InnoDB's redo log cannot be rolled back once the transaction has been committed. If the binlog write failed after the redo log was committed, the two logs would be inconsistent.

If the database then restarted abnormally, it would be unclear which log to trust when recovering the data. This is why two-phase log commit is needed.

Suppose the database crashes at time A. Because the binlog has not been written and the redo log has not been committed at that point, the transaction is rolled back after the restart, and the two logs remain consistent.

If the crash happens during time period B, recovery first checks whether the redo log contains a commit flag for the transaction. If it does, the transaction is fine and is committed directly.

  • If the redo log has no commit flag for the transaction, the binlog is checked.

  • If the binlog record is complete and has a commit flag, the transaction is committed and the commit flag is added to the redo log. If the binlog record is incomplete, the transaction is rolled back.
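The recovery rule above boils down to a short decision function. The sketch below simply encodes those cases; recover and its argument names are hypothetical, and it looks at a single transaction in isolation.

```python
def recover(redo_state, binlog_complete):
    """Decide what to do with one transaction after a crash.

    redo_state: "none", "prepare" or "commit" (state found in the redo log)
    binlog_complete: True if the binlog holds a complete record with a
                     commit flag for the transaction
    """
    if redo_state == "commit":
        return "commit"          # redo log already carries the commit flag
    if redo_state == "prepare" and binlog_complete:
        return "commit"          # binlog is the reference: finish the commit
    return "rollback"            # binlog missing or incomplete
```

A transaction that was only prepared is committed or rolled back purely according to whether its binlog record is complete, so the two logs always end up agreeing.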

This shows that when a crash occurs during two-phase log commit, recovery takes the binlog as the reference. The reason is that master-slave replication is based on the binlog.

If the integrity of both logs had to be checked, switching to the slave after the master goes down would take longer. With the binlog as the reference, when the master goes down the slave's data can be brought up to date from the binlog alone, without checking the integrity of the redo log.

In addition, the binlog is the common log of the MySQL server layer, which is another reason it is chosen as the reference.

Disadvantages of two-phase log commit

  • High number of disk IOs

When the logs are committed, both the redo log and the binlog need their own flush operation, so the number of IOs is high.

  • Fierce lock contention

To keep the order of log records consistent with the order in which transactions commit when multiple transactions commit together, locks are used to guarantee the relative order of log writes.

As a result, performance deteriorates under high concurrency.

Group commit mechanism

The role of the group commit mechanism

When several transactions are committing at the same time, their logs are merged and written together, reducing disk IO operations.

Implementation of the group commit mechanism

Group commit splits the commit process into three phases, maintains a queue for each phase, and uses locks to guarantee the order in which transactions are written.

  • Splitting the lock across three phases reduces lock granularity: the whole commit process of a transaction no longer has to be locked.

When a phase's queue is empty, the first transaction to enter it becomes the leader of the transactions that follow, and leads them through the operations of that phase.

Phase 1:

flush phase: multiple transactions write their binlog from the cache to the file in the order in which they entered (without flushing to disk).

The first transaction to enter the flush phase acts as the leader for the transactions that follow.

The leader performs write and fsync on the redo log for all transactions in the group, that is, it writes the redo log to disk and completes the prepare phase of the redo log.

If MySQL crashes in this phase, this group of transactions is rolled back after the restart.

Phase 2:

sync phase: fsync is performed on the binlog file (the binlogs of multiple transactions are merged and flushed to disk together).

After the binlog has been written to the binlog file in the flush phase, MySQL waits for a while before flushing to disk. The purpose is to gather the binlogs of more transactions and flush them together, reducing overhead.

The wait is bounded by a time limit and a maximum number of transactions; as soon as either condition is met, the binlog is flushed to disk immediately.

The sync phase is mainly responsible for the group commit of the binlog. If MySQL crashes in this phase, the transactions can still be committed after the restart by using the redo log records that were already flushed.

  • Because the binlog write has already completed at this point, the transactions can go on to commit based on the redo log.

Phase 3:

commit phase: the InnoDB commit operation is performed for each transaction.
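The waiting rule of the sync phase (flush when either the time limit or the transaction limit is reached) can be sketched as a grouping function. This is a simplified model under assumptions: group_for_sync, max_wait and max_count are hypothetical names, time is an abstract number, and the limits appear to correspond to the MySQL parameters binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count.

```python
def group_for_sync(arrivals, max_wait, max_count):
    """Group transactions so that each group costs one fsync.

    arrivals: sorted times at which transactions finish the flush phase
    max_wait: how long the leader waits for followers (same time unit)
    max_count: flush immediately once this many transactions are waiting
    Returns a list of groups; each group costs a single fsync.
    """
    groups = []
    current = []
    for t in arrivals:
        # The first transaction in an empty group is its leader; the group
        # closes once the leader's wait has expired.
        if current and t - current[0] > max_wait:
            groups.append(current)
            current = []
        current.append(t)
        if len(current) >= max_count:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups
```

For the arrivals [0, 1, 7, 8] with max_wait=5 and max_count=10 this gives two groups, so four transactions are made durable with two fsync calls instead of four.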


Statement:
This article is reproduced from juejin.im. If there is any infringement, please contact admin@php.cn for removal.