MySQL Semi-Synchronous Replication: How It Works and a Troubleshooting Case

WBOY

May 27, 2016 pm 01:46 PM

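As the architecture diagram shows, MySQL semi-synchronous replication and asynchronous replication differ as follows. With asynchronous replication, once the MySQL master server has sent its binary log out through the replication thread, it returns the result to the client immediately, regardless of whether the slave has received that binary log. With semi-synchronous replication, the master sends its binlog to the slave and returns the result to the client only after confirming that the slave has received it. Comparing the two: asynchronous replication gives users fast responses but cannot guarantee that the binary log actually reached a slave; semi-synchronous replication answers client requests slightly more slowly, but it guarantees that the binary log arrives intact on a slave.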

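1. Background

By default, our production MySQL replication is asynchronous. In extreme cases there is therefore some probability that the slave holds less data than the master at failover time, so after a switchover we use tools to roll back and backfill data to make sure nothing is lost. Semi-synchronous replication requires that every transaction executed on the master be received successfully by at least one slave before it counts as complete, which keeps master and slave strongly consistent. To get that consistency and reduce data loss, we tried enabling MySQL's semi-synchronous (semi-sync) replication in production. In practice, semi-sync ran normally on most instances, but a few instances could never enable it and kept running as ordinary replication. Stranger still, of two instances on the same host, one could enable it and the other could not. The root cause turned out to be simple, but tracking it down took some effort. The rest of this article describes the whole investigation.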

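2. How semi-synchronous replication works

MySQL keeps master and slave consistent through the binlog. The master executes a transaction locally and returns to the user once the binlog has been flushed to disk; the slave pulls the master's binlog to replay the master's operations. By default there is no strict synchronization between the two, so there is some probability that the slave's data differs from the master's. Semi-sync was introduced to keep master and slave data consistent at all times. Compared with asynchronous replication, it requires every transaction to be received by at least one slave before the result is returned to the user. The mechanism is simple: after executing locally, the master waits for a response message from the slave, which carries the latest binlog coordinate (file, pos) the slave has received. Only after that message arrives does the master return to the user, and only then is the transaction truly complete. On the master instance, a dedicated thread (ack_receiver) receives the slaves' response messages and notifies the waiting threads which log position the slave has received, so that they can continue. For the implementation details, see a separate article, "mysql半同步(semi-sync)源码实现" (MySQL semi-sync source code implementation).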
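The acknowledgement check described above comes down to comparing binlog coordinates. The following is a minimal sketch in C, not code from the MySQL source: the `binlog_pos` struct and both function names are hypothetical, and it assumes binlog file names share a prefix and a zero-padded numeric suffix, so that `strcmp` orders them the same way the log does.

```c
#include <string.h>

/* Hypothetical type for illustration: a binlog coordinate (file, pos). */
struct binlog_pos {
    const char *file;        /* e.g. "mysql-bin.000006" */
    unsigned long long pos;  /* byte offset within that file */
};

/* Returns <0, 0 or >0 when a is before, equal to, or after b.
 * Assumes zero-padded file suffixes, so lexical order equals log order. */
int binlog_pos_cmp(const struct binlog_pos *a, const struct binlog_pos *b)
{
    int c = strcmp(a->file, b->file);
    if (c != 0)
        return c;
    if (a->pos < b->pos)
        return -1;
    if (a->pos > b->pos)
        return 1;
    return 0;
}

/* A waiting transaction may return to the client once the slave has
 * acknowledged a position at or beyond the transaction's commit position. */
int ack_covers(const struct binlog_pos *acked, const struct binlog_pos *commit)
{
    return binlog_pos_cmp(acked, commit) >= 0;
}
```

An acknowledgement for a later file, or for the same file at an equal or larger offset, releases the waiting transaction; anything earlier leaves it waiting until the timeout.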

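3. Problem analysis

With the principle of semi-synchronous replication covered, let's look at the actual problem. After the semi-sync switch was turned on for both master and slave, the status variable "Rpl_semi_sync_master_status" on the problem instance stayed OFF, meaning replication kept running in ordinary (asynchronous) mode.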

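(1) Changing the rpl_semi_sync_master_timeout parameter

Among the semi-sync parameters, rpl_semi_sync_master_timeout controls how long the master waits for a slave's response message. If the wait exceeds this value, the master assumes the slave has not received the log (the slave may be down, or it may be running slowly and lagging far behind the master) and switches to ordinary replication so that transactions on the master do not wait for long. In our production configuration this value defaults to 50 ms. The first guess was that the value was simply too small, so we raised it to 10 s, but the problem remained.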

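(2) Turning on trace logging

The simplest and crudest way to investigate is to log everything and see which step goes wrong. The master and slave have the parameters rpl_semi_sync_master_trace_level and rpl_semi_sync_slave_trace_level respectively, which control what semi-sync replication writes to the log. We set both to 80 (64 + 16), which records detailed log messages as well as function entry and exit.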
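The value 80 is a bitmask. As a small illustration, the sketch below encodes the trace-level bits as the MySQL manual documents them for these two variables (1 general, 16 detail, 32 net wait, 64 function); the enum and function names are made up for this example and are not MySQL identifiers.

```c
/* Trace-level bits as documented for rpl_semi_sync_master_trace_level and
 * rpl_semi_sync_slave_trace_level. Names here are illustrative only. */
enum semisync_trace {
    TRACE_GENERAL  = 1,   /* general level */
    TRACE_DETAIL   = 16,  /* detail level: more verbose information */
    TRACE_NET_WAIT = 32,  /* net wait level: network wait information */
    TRACE_FUNCTION = 64   /* function level: function entry and exit */
};

/* The combination used in this investigation: detail + function tracing. */
unsigned long trace_level_for_debugging(void)
{
    return TRACE_DETAIL | TRACE_FUNCTION;   /* 16 + 64 */
}

/* Returns non-zero if the given flag is switched on in level. */
int trace_enabled(unsigned long level, enum semisync_trace flag)
{
    return (level & (unsigned long)flag) != 0;
}
```

With this encoding, a level of 80 turns on the detail and function bits and leaves the net-wait bit off, which matches the "---> ... enter" style lines in the slave log.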

master:

2016-01-04 18:00:30 13212 [Note] ReplSemiSyncMaster::updateSyncHeader: server(-1721062019), (mysql-bin.000006, 500717950) sync(1), repl(1)
2016-01-04 18:00:40 13212 [Warning] Timeout waiting for reply of binlog (file: mysql-bin.000006, pos: 500717950), semi-sync up to file , position 0.
2016-01-04 18:00:40 13212 [Note] Semi-sync replication switched OFF.

slave:

2016-01-04 18:00:30 38932 [Note] ---> ReplSemiSyncSlave::slaveReply enter
2016-01-04 18:00:30 38932 [Note] ReplSemiSyncSlave::slaveReply: reply (mysql-bin.000006, 500717950)
2016-01-04 18:00:30 38932 [Note]

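The master log shows that at 2016-01-04 18:00:30 the master set the semi-sync flag and started waiting for the slave's response. After waiting 10 s without a response it declared a timeout, turned semi-sync mode off and switched to ordinary mode. The slave log, however, shows that at 2016-01-04 18:00:30 the slave had already sent (mysql-bin.000006, 500717950) back to the master, meaning it had received that log. So the master had marked the event as semi-sync, the slave received it and replied, and the master really did wait 10 s, yet it never received the reply and fell back to ordinary replication. The question now becomes: why did the master not receive it?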

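(3) The select function

As mentioned earlier, the master instance has a dedicated thread (ack_receiver) that receives response packets. It watches the sockets with the select function; when a slave's response message shows up, it reads the message and notifies the worker thread that it can continue. Could the problem lie in select itself? Because select is a system call, we had not suspected it, but having followed the trail this far, it had to be checked. Several important macro definitions and declarations relate to select. They live mainly in three files: /usr/include/bits/typesizes.h, /usr/include/bits/select.h and /usr/include/sys/select.h.

FD_ZERO(fd_set *fdset): clears fdset so that it refers to no file descriptors.
FD_SET(int fd, fd_set *fdset): adds file descriptor fd to fdset.
FD_CLR(int fd, fd_set *fdset): removes file descriptor fd from fdset.
FD_ISSET(int fd, fd_set *fdset): tests whether fd is set in fdset; after select returns, a non-zero result means fd is ready for reading or writing.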

#define __FD_SETSIZE 1024
typedef long int __fd_mask; // 8 bytes
#define __NFDBITS (8 * (int) sizeof (__fd_mask)) // 64 bits
typedef struct
{
  __fd_mask __fds_bits[__FD_SETSIZE / __NFDBITS]; // 1024 / 64 = 16 long ints
} fd_set;
#define __FDMASK(d) ((__fd_mask) 1 << ((d) % __NFDBITS)) // fd % 64 = N: bit N is set to 1
#define __FDELT(d) ((d) / __NFDBITS) // index of the long int that holds the bit
#define __FDS_BITS(set) ((set)->__fds_bits)
#define __FD_SET(d, set) (__FDS_BITS (set)[__FDELT (d)] |= __FDMASK (d))
#define __FD_CLR(d, set) (__FDS_BITS (set)[__FDELT (d)] &= ~__FDMASK (d))
#define __FD_ISSET(d, set) \
((__FDS_BITS (set)[__FDELT (d)] & __FDMASK (d)) != 0)

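FD_SET marks the descriptors we want to watch. The information is stored in the fd_set bit array, whose element count is __FD_SETSIZE / 64; with __FD_SETSIZE = 1024 the whole array is just 16 long ints. Each descriptor takes one bit, so there are 1024 bits and room for descriptors 0 through 1023. Suppose the descriptor value is 138: 138 / 64 = 2 and 138 % 64 = 10, so this descriptor is recorded by setting bit 10 of the long int at index 2. What happens when the descriptor value is 1024 or more? Doesn't that overflow? I went through the code carefully and found no bounds check at all: a descriptor value of 1024 or more always lands outside the array. Because select walks the bits of the array to decide whether each descriptor is readable or writable, a descriptor beyond that range is never examined, so the master never learns that the slave has sent its response packet.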
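The arithmetic above can be checked with a few helper functions. This is a sketch under the same assumption as the article (a 64-bit long int); the helper names are hypothetical and are not part of glibc.

```c
#include <sys/select.h>

/* Assumes a 64-bit long int, as on the platform discussed in the article.
 * All names below are illustrative helpers, not glibc identifiers. */
enum { BITS_PER_WORD = 64 };

/* Index of the long int inside fd_set that would hold fd's bit. */
int fd_word_index(int fd) { return fd / BITS_PER_WORD; }

/* Position of fd's bit inside that long int. */
int fd_bit_index(int fd) { return fd % BITS_PER_WORD; }

/* Number of words in an fd_set that holds FD_SETSIZE bits. */
int fd_set_words(void) { return FD_SETSIZE / BITS_PER_WORD; }

/* 1 if fd can be stored in an fd_set without going past the array. */
int fd_fits(int fd) { return fd >= 0 && fd < FD_SETSIZE; }
```

For descriptor 138 the helpers give word 2 and bit 10, as in the text. For descriptor 2344, the value seen later in this investigation, they give word 36 and bit 40, far past the last valid index (15 when FD_SETSIZE is 1024), and `fd_fits` reports that it does not fit.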

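(4) Verification

So far this is only theoretical analysis. If the descriptor in a running problem instance really is above 1024, the cause is confirmed.

1. Get the mysqld process ID (mysql-pid)

ps aux | grep mysqld | grep <port>

2. Attach gdb to that process

gdb -p <mysql-pid>

3. Find the ack_receiver thread and switch to it

info threads
thread <thread_id>

4. Print the socket's value; here the fd was 2344.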

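(5) How to fix it

As shown above, because __FD_SETSIZE is normally defined as 1024, select can watch at most 1024 descriptors, and each descriptor value must itself be below 1024. The first option is to raise that constant, but it is fixed at compile time in the system headers, so this means rebuilding against a modified definition rather than changing a runtime setting. Moreover, because of how select works, every call has to walk the bits of the set to decide whether a message has arrived on each descriptor, so a very large value would make it very inefficient. select is a fairly old I/O multiplexing mechanism. The more modern poll and epoll offer similar functionality, are more capable, and do not impose a limit on either the total number of descriptors or the largest descriptor value. Plenty of material on select, poll and epoll is available online, so they are not discussed further here.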
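For comparison, the same single-descriptor wait can be written with poll. This is a hedged sketch of the alternative mentioned above, not the fix that was applied to MySQL; the function name `wait_readable_poll` is hypothetical.

```c
#include <poll.h>

/* Waits up to timeout_ms for fd to become readable using poll().
 * Returns 1 if readable, 0 on timeout (or when only an error/hangup
 * condition is reported), -1 on error. Unlike select(), the numeric
 * value of fd is not bounded by FD_SETSIZE. */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;   /* interested in "data available to read" */
    pfd.revents = 0;

    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Because poll takes an array of descriptors rather than a fixed-size bitmap, a socket numbered 2344 is handled exactly like one numbered 138.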

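(6) The official version

I checked the latest 5.7 source code in Oracle's official git repository. This part is implemented with select there as well, so the same problem exists. Because descriptor numbers are reused, an instance with few connections, or with few long-lived connections, rarely reaches fd values of 1024 or more, so the bug does not show up easily. The problem is nevertheless general.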

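(7) A follow-up question

After the cause was found, another question bothered me for a while. Three parts of the MySQL server do this kind of listening: first, the select on the listening port; second, the thread pool's epoll; third, the semi-sync select. The slave's binlog dump thread is an ordinary worker thread, and a worker thread's socket is watched by epoll. That would mean the binlog dump socket is watched by both the semi-sync select and the thread pool's epoll. Wouldn't that cause chaos? After reading the code more carefully, I found that the thread pool's epoll uses EPOLLONESHOT mode: after each message is received the descriptor is no longer monitored and has to be registered again, so the same descriptor is never watched by both mechanisms at the same time.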
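The one-shot behaviour can be seen in a few lines of Linux-specific C. The sketch below is illustrative only and is not the thread pool's code; the three function names are hypothetical.

```c
#include <sys/epoll.h>

/* Registers fd for a single "readable" notification. */
int oneshot_add(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Re-arms fd after its one-shot event has been delivered. */
int oneshot_rearm(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Returns the number of ready descriptors: 0 on timeout, -1 on error. */
int oneshot_wait(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    return epoll_wait(epfd, &ev, 1, timeout_ms);
}
```

After `oneshot_add`, the first `oneshot_wait` reports the descriptor once data arrives. Further waits report nothing, even if the data is still unread, until `oneshot_rearm` is called. During that window another mechanism, such as the semi-sync select, can watch the socket without epoll also reporting it.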

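That concludes the investigation. The conclusion is fairly simple, but pinning the problem down did take some effort. Since select is a widely used I/O multiplexing mechanism, anyone who relies on the select function should keep this limit in mind.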
