Home >Database >Mysql Tutorial >How to implement parallel replication in MySQL Replication

How to implement parallel replication in MySQL Replication

王林forward: 2023-05-26 12:23:441879browse

Traditional single-threaded replication instructions

As we all know, before MySQL version 5.6, there were two threads on the slave node of the master-slave replication, respectively. I/O thread and SQL thread.

#The I/O thread is responsible for receiving events from the binary log and writing them to the Relay Log.
The SQL thread reads the Relay Log and plays it back in the database.

The above method may occasionally cause delays, so what are the situations that may cause delays between master and slave nodes?

1 .The main database performs large transactions (such as large table structure changes).
2. Large-scale changes to the main database (such as: large-scale insert, update, delete operations).
3. In ROW synchronization mode, the primary key of the large table in the main database is not frequently updated.
4. The database parameter configuration is unreasonable and there is a bottleneck in slave node performance (for example: the slave node transaction log setting is too small, resulting in frequent disk flushing).
5. The network environment is unstable, and there are delays and reconnections in reading the binlog from the node IO thread.
6. The master and slave hardware configurations are different, and the hardware resource usage of the slave node reaches the upper limit. (For example: master node SSD disk, slave node SAS disk)

You can roughly classify the above reasons for delay.

1. Hardware issues (including disk IO, network IO, etc.)
2. Configuration issues.
3. Database design issues.
4. The master database changes in large quantities, and the slave node SQL single-thread processing is not timely enough.

Summary

After analyzing the above reasons, we can see that in order to reduce the master-slave delay, in addition to improving the hardware conditions, the DBA also needs to pay attention to the database design. and configuration issues. Finally, it is necessary to improve the concurrent processing capabilities of the slave nodes. Changing from single-threaded playback to multi-threaded parallel playback is a better method. The key point is how to resolve data conflicts and fault recovery locations under the premise of multi-thread recovery. Click to confirm the question.

MySQL5.6 parallel replication based on library level

When there are multiple databases in the instance, multiple threads can be started, and each thread corresponds to one database. . In this mode, the slave node will start multiple threads. Threads are divided into two categories Coordinator and WorkThread.

Thread division of labor execution logic

CoordinatorThe thread is responsible for determining whether the transaction can be executed in parallel. If it can be executed in parallel, Distribute the transaction to the WorkThread thread for execution. If it is judged that it cannot be executed, such as DDL, cross-library operation, etc., wait until all worker threads are executed. Then it is executed by Coordinator.

Key configuration information

slave-parallel-type=DATABASE

Proposal shortcomings

This parallel replication model will only have a high degree of parallelism when there are multiple DBs in the instance and the DB transactions are relatively busy. However, in daily maintenance, the transaction processing of a single instance is relatively concentrated. on a DB. Most delays are observed based on the presence of hotspot tables. It would be a good idea to provide table-based parallelism.

MySQL5.7 Parallel Replication Based on Group Submission

Group Submission Instructions

To put it simply, under the double 1 setting, after the transaction is submitted That is, the operation of disk brushing is changed to merge multiple transactions into a group of transactions and then perform unified disk brushing. This processing reduces the pressure of disk IO. For detailed information, please refer to 老叶茶馆Instructions for group submission tweetshttps://mp.weixin.qq.com/s/rcPkrutiLc93aTblEZ7sFg

一Submission of group transactions at the same time means that there is no conflict in the transactions within the group, so the transactions in the group can be executed concurrently on the slave node. The problem is how to distinguish whether the transactions are in the same group, so two new ones appear in the binlog Parameter information last_committed and sequence_number

##How to determine whether a transaction is in a group?

Parsing the binlog, we can find that there are two more parameter information,
last_committed and sequence_number, among which last_committed is duplicated.

sequence_number # This value refers to the sequence number of transaction submission, which increases monotonically.
last_committed # This value has two meanings. 1. The same value means that these transactions are in the same group, 2. This value also represents the previous The maximum number for a set of transactions.

[root@mgr2 GreatSQL]# mysqlbinlog mysql-bin.0000002 | grep last_committed
GTID last_committed=0 sequence_number=1
GTID last_committed=0 sequence_number=2
GTID last_committed=2 sequence_number=3
GTID last_committed=2 sequence_number=4
GTID last_committed=2 sequence_number=5
GTID last_committed=2 sequence_number=6
GTID last_committed=6 sequence_number=7
GTID last_committed=6 sequence_number=8

Database configuration

slave-parallel-type=LOGICAL_CLOCK

Defects in the solution

基于组提交的同步有个不足点，就是当主节点的事务繁忙度较低的时候，导致时间段内组提交fsync刷盘的事务量较少，于是导致从库回放的并行度并不高，甚至可能一组里面只有一个事务，这样从节点的多线程就基本用不到，可以通过设置下面两个参数，让主节点延迟提交。

binlog_group_commit_sync_delay # 等待延迟提交的时间，binlog提交后等待一段时间再 fsync。让每个 group 的事务更多，人为提高并行度。
binlog_group_commit_sync_no_delay_count # 待提交的最大事务数，如果等待时间没到，而事务数达到了，就立即 fsync。达到期望的并行度后立即提交，尽量缩小等待延迟。

MySQL8.0基于writeset的并行复制

writeset 基于事务结果冲突进行判断事务是否可以进行并行回放的方法，他由binlog-transaction-dependency-tracking参数进行控制，默认采用WRITESET方法。

关键参数查看

Command-Line Format	--binlog-transaction-dependency-tracking=value
System Variable	binlog_transaction_dependency_tracking
Scope	Global
Dynamic	Yes
SET_VAR Hint Applies	No
Type	Enumeration
Default Value	COMMIT_ORDER
Valid Values	COMMIT_ORDER WRITESET WRITESET_SESSION

参数配置项说明

COMMIT_ORDER # 使用 5.7 Group commit 的方式决定事务依赖。
WRITESET # 使用写集合的方式决定事务依赖。
WRITESET_SESSION # 使用写集合，但是同一个session中的事务不会有相同的last_committed。

writeset 是一个HASH类型的数组，里面记录着事务的更新信息，通过transaction_write_set_extraction判断当前事务更新的记录与历史事务更新的记录是否存在冲突，判断过后再采取对应处理方法。writeset储存的最大存储值由binlog-transaction-dependency-history-size控制。

需要注意的是，当设置成WRITESET或WRITESET_SESSION的时候，事务提交是无序状态的，可以通过设置 slave_preserve_commit_order=1 强制按顺序提交。

binlog_transaction_dependency_history_size

设定一个上限，限制在内存中缓存之前事务修改的行信息时所使用的行哈希数。一旦达到这个哈希数，就会清除历史记录。

Command-Line Format	--binlog-transaction-dependency-history-size=#
System Variable	binlog_transaction_dependency_history_size
Scope	Global
Dynamic	Yes
SET_VAR Hint Applies	No
Type	Integer
Default Value	25000
Minimum Value	1
Minimum Value	1000000

transaction_write_set_extraction

该模式支持三种算法，默认采用XXHASH64，当从节点配置writeset复制的时候，该配置不能配置为OFF。在MySQL 8.0.26中，该参数已被标记为已弃用，将来将被删除。

Command-Line Format	--transaction-write-set-extraction[=value]
Deprecated	8.0.26
System Variable	binlog_transaction_dependency_history_size
Scope	Global, Session
Dynamic	Yes
SET_VAR Hint Applies	No
Type	Enumeration
Default Value	XXHASH64
Valid Values	OFF MURMUR32 XXHASH64

数据库配置

slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 8
binlog_transaction_dependency_tracking = WRITESET
slave_preserve_commit_order = 1

The above is the detailed content of How to implement parallel replication in MySQL Replication. For more information, please follow other related articles on the PHP Chinese website!

Statement：

This article is reproduced at:yisu.com. If there is any infringement, please contact admin@php.cn delete

Previous article：How to use mysql command to import and export sql files under LinuxNext article：How to use mysql command to import and export sql files under Linux

See more