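Part 1: Setting up the master-master replication environment
1.1 Lab environment
A single MySQL instance was installed on each of the two machines beforehand.
IP: 10.192.203.201 10.192.203.202
Both instances listen on port 3307.
The two instances must use the same port; otherwise clients connecting through the VIP at the end could not reach both nodes on one port.
1.2 Procedure
1.2.1 Edit the configuration files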
Edit master1:
Add the following under [mysqld]:
server-id = 1
relay-log = /data/server/mysql_3307/binlog/ZabbixServer-relay-bin
relay-log-index = /data/server/mysql_3307/binlog/ZabbixServer-relay-bin.index
auto-increment-offset = 1
auto-increment-increment = 2
log-slave-updates = true
Edit master2:
Add the following under [mysqld]:
server-id = 3
relay-log = /data/server/mysql/binlog/single-relay-bin
relay-log-index = /data/server/mysql/binlog/single-relay-bin.index
auto-increment-offset = 2
auto-increment-increment = 2
log-slave-updates = true
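The auto-increment-offset and auto-increment-increment settings prevent primary-key conflicts on INSERT: with an increment of 2, master1 (offset 1) generates 1, 3, 5, ... and master2 (offset 2) generates 2, 4, 6, ..., so the two servers never hand out the same auto-increment value.
Remember to restart mysql after editing the files.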
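To see why the two auto-increment settings keep the masters from colliding, the sequences can be simulated. This is a minimal sketch, not MySQL code: gen_ids is a hypothetical helper that assumes each server hands out offset + n * increment for n = 0, 1, 2, ..., which is how the settings behave when the offset does not exceed the increment.

```shell
#!/bin/bash
# gen_ids OFFSET INCREMENT COUNT
# Hypothetical helper: prints the first COUNT auto-increment values a server
# would generate with the given auto-increment-offset and auto-increment-increment.
gen_ids() {
    local offset="$1" increment="$2" count="$3"
    local i=0 ids=""
    while [ "$i" -lt "$count" ]; do
        ids="$ids${ids:+ }$((offset + i * increment))"
        i=$((i + 1))
    done
    echo "$ids"
}

echo "master1: $(gen_ids 1 2 5)"   # master1: 1 3 5 7 9
echo "master2: $(gen_ids 2 2 5)"   # master2: 2 4 6 8 10
```

The two lists never overlap, which is the property the configuration relies on.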
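Run the following on both MySQL servers to create the replication account:
GRANT REPLICATION SLAVE ON *.* TO 'RepUser'@'%' IDENTIFIED BY 'beijing';
Both servers are newly built and receive no other writes, so each one only needs to record its own current binary log file and event position, which the other server then uses as its replication starting point. Otherwise, back up the master first and restore the backup on the other server so that the data is consistent, and only then point it at the master.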
Master1:
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      302 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
Master2:
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      120 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
# Point Master1 at Master2 (run on Master1, using Master2's file and position)
CHANGE MASTER TO MASTER_USER='RepUser', MASTER_HOST='10.192.203.202', MASTER_PASSWORD='beijing', MASTER_PORT=3307, MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=120;
# Point Master2 at Master1 (run on Master2, using Master1's file and position)
CHANGE MASTER TO MASTER_USER='RepUser', MASTER_HOST='10.192.203.201', MASTER_PASSWORD='beijing', MASTER_PORT=3307, MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=302;
# Start replication on both servers
start slave;
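Copying the file name and position by hand is easy to get wrong, so the statement can be assembled from the status output instead. A minimal sketch: build_change_master is a hypothetical helper, and it assumes the status line comes from the mysql client in batch mode without column names (the -B and -N options), where the fields are tab-separated with File first and Position second.

```shell
#!/bin/bash
# build_change_master HOST PORT USER PASSWORD STATUS_LINE
# Hypothetical helper: turns one tab-separated "show master status" row
# (File<TAB>Position<TAB>...) into a CHANGE MASTER TO statement.
build_change_master() {
    local host="$1" port="$2" user="$3" password="$4" status_line="$5"
    local file pos
    file=$(printf '%s\n' "$status_line" | cut -f1)
    pos=$(printf '%s\n' "$status_line" | cut -f2)
    # Refuse to emit SQL if the position is not a plain number.
    case "$pos" in
        ''|*[!0-9]*) echo "unexpected position: '$pos'" >&2; return 1 ;;
    esac
    printf "CHANGE MASTER TO MASTER_USER='%s', MASTER_HOST='%s', MASTER_PASSWORD='%s', MASTER_PORT=%s, MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$user" "$host" "$password" "$port" "$file" "$pos"
}

# Example with Master2's coordinates from above; on a live system the last
# argument could come from something like:
#   mysql -h 10.192.203.202 -P 3307 -u root -p -N -B -e 'show master status'
build_change_master 10.192.203.202 3307 RepUser beijing "$(printf 'mysql-bin.000001\t120')"
```

The printed statement is the one to run on Master1; swapping the host and the status row gives the statement for Master2.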
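In the output of show slave status on each server, make sure both threads are running:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Testing that the two sides actually stay in sync is omitted here.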
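The two-thread check above can be automated. A minimal sketch: replication_ok is a hypothetical helper that reads the vertical (\G) output of show slave status on standard input and succeeds only if both fields are exactly Yes.

```shell
#!/bin/bash
# replication_ok: reads "show slave status\G" text on stdin and returns 0
# only when Slave_IO_Running and Slave_SQL_Running are both "Yes".
replication_ok() {
    local status io sql
    status=$(cat)
    io=$(printf '%s\n' "$status" | awk -F': *' '/^ *Slave_IO_Running:/ {print $2}')
    sql=$(printf '%s\n' "$status" | awk -F': *' '/^ *Slave_SQL_Running:/ {print $2}')
    [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]
}

# Example with canned output; on a live system the input could come from
#   mysql -u root -p -e 'show slave status\G'
sample="             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes"
if printf '%s\n' "$sample" | replication_ok; then
    echo "both replication threads are running"
fi
```

A value such as Connecting in Slave_IO_Running counts as a failure here, which is usually what you want from a health check.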
Each host has two Ethernet cards: one for regular network traffic and one for the heartbeat link.
This experiment was done in Oracle VirtualBox virtual machines, so an additional internal-network NIC was added to each VM for the heartbeat test; see: http://blog.csdn.net/yabingshi_tech/article/details/51445006
Do the following on both machines:
yum install PyXML cluster-glue cluster-glue-libs resource-agents -y
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/heartbeat-3.0.4-2.el6.x86_64.rpm
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/heartbeat-libs-3.0.4-2.el6.x86_64.rpm
rpm -ivh heartbeat-*
Copy the configuration files:
cp /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/haresources /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/ha.cf /etc/ha.d/
vi /etc/ha.d/authkeys
# With a direct crossover (twisted-pair) cable between the two machines, the following is enough:
auth 1
1 crc
# Save and exit, then:
chmod 600 /etc/ha.d/authkeys
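The crc method only detects corrupted packets and provides no authentication, which is why it is reserved for a dedicated cable. heartbeat's authkeys file also accepts md5 and sha1 with a shared key. The following is a sketch of generating the file content for either case; make_authkeys is a hypothetical helper and the key shown is a placeholder, not a value from this setup.

```shell
#!/bin/bash
# make_authkeys METHOD [KEY]
# Hypothetical helper: prints authkeys content for method crc, md5 or sha1.
# md5 and sha1 require a shared secret; crc takes none.
make_authkeys() {
    local method="$1" key="${2:-}"
    case "$method" in
        crc)
            printf 'auth 1\n1 crc\n' ;;
        md5|sha1)
            [ -n "$key" ] || { echo "method $method needs a key" >&2; return 1; }
            printf 'auth 1\n1 %s %s\n' "$method" "$key" ;;
        *)
            echo "unknown method: $method" >&2; return 1 ;;
    esac
}

# Placeholder key for illustration only; use your own secret.
make_authkeys sha1 'ExampleSharedSecret'
```

The output would be written to /etc/ha.d/authkeys on both nodes, with the same chmod 600 as above.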
vi /etc/ha.d/haresources
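# This file must be identical on every host.
Add:
PC IPaddr::10.192.203.203
# Note: PC is the hostname of your master, and IPaddr is followed by your VIP address.
heartbeat can also manage other resources or services: put the service's start script (for example mysqld) in /etc/ha.d/resource.d/ or /etc/init.d/ and append the script name to the line in /etc/ha.d/haresources, so that the script is started along with heartbeat.
For example: PC IPaddr::10.192.203.203 mysql
However, stopping heartbeat would then also stop mysql, so I did not add it here.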
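The haresources line format described above (preferred node, then the VIP resource, then optional service scripts) can be expressed as a tiny generator. This is only an illustrative sketch; haresources_entry is a hypothetical helper name.

```shell
#!/bin/bash
# haresources_entry NODE VIP [SERVICE...]
# Hypothetical helper: prints one haresources line in the form
#   NODE IPaddr::VIP [SERVICE ...]
haresources_entry() {
    local node="$1" vip="$2"
    shift 2
    local line="$node IPaddr::$vip"
    local svc
    for svc in "$@"; do
        line="$line $svc"
    done
    echo "$line"
}

haresources_entry PC 10.192.203.203          # PC IPaddr::10.192.203.203
haresources_entry PC 10.192.203.203 mysql    # PC IPaddr::10.192.203.203 mysql
```

Generating the line once and copying the same file to both hosts is one way to keep the two copies identical.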
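The file is the same on the primary and the standby except for the ucast eth1 10.0.0.2 line, which on each node must point at the other node's heartbeat address.
vi /etc/ha.d/ha.cf
Add:
logfile /var/log/ha_log/ha-log.log ## location of the HA log file; create the directory by hand if it does not exist
bcast eth1 ## use eth1 for heartbeat monitoring
ucast eth1 10.0.0.2 ## the peer's heartbeat address, reached over the heartbeat NIC
keepalive 2 ## heartbeat (monitoring) interval: 2 seconds
warntime 10
deadtime 30
initdead 120
hopfudge 1
udpport 694 ## use UDP port 694 for heartbeat monitoring
auto_failback off
node PC ## node 1; must match the output of uname -n
node slave2 ## node 2
ping 10.192.203.254 ## ping the gateway to check that connectivity is normal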
mkdir -p /var/log/ha_log
chmod 777 /var/log/ha_log/
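Since only the ucast line differs between the nodes, the file can be rendered from one template to avoid drift. A minimal sketch, assuming the directives listed above; render_ha_cf is a hypothetical helper that takes the peer's heartbeat address as its only argument.

```shell
#!/bin/bash
# render_ha_cf PEER_HEARTBEAT_IP
# Hypothetical helper: prints the ha.cf used in this article, with the
# ucast line pointing at the given peer address.
render_ha_cf() {
    local peer_ip="$1"
    cat <<EOF
logfile /var/log/ha_log/ha-log.log
bcast eth1
ucast eth1 ${peer_ip}
keepalive 2
warntime 10
deadtime 30
initdead 120
hopfudge 1
udpport 694
auto_failback off
node PC
node slave2
ping 10.192.203.254
EOF
}

# On PC (heartbeat address 10.0.0.1) the peer is 10.0.0.2:
render_ha_cf 10.0.0.2
# On slave2 the peer is 10.0.0.1, e.g.:
#   render_ha_cf 10.0.0.1 > /etc/ha.d/ha.cf
```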
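Define an LVS virtual service on the VIP with round-robin scheduling, then add both MySQL instances as real servers in masquerading (NAT) mode:
ipvsadm -A -t 10.192.203.203:3307 -s rr
ipvsadm -a -t 10.192.203.203:3307 -r 10.192.203.201:3307 -m
ipvsadm -a -t 10.192.203.203:3307 -r 10.192.203.202:3307 -m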
[root@PC download]# ipvsadm --list
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP bogon:opsession-prxy rr
-> bogon:opsession-prxy Local 1 0 0
-> bogon:opsession-prxy Masq 1 0 0
[root@slave2 download]# ipvsadm --list
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.192.203.203:opsession-prx rr
-> 10.192.203.201:opsession-prx Masq 1 0 0
-> 10.192.203.202:opsession-prx Local 1 0 0
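The three ipvsadm commands follow one pattern: one -A for the virtual service, then one -a per real server. A sketch that prints them for any list of real servers, as a dry run; lvs_commands is a hypothetical helper and only echoes the commands, it does not execute ipvsadm.

```shell
#!/bin/bash
# lvs_commands VIP PORT REAL_SERVER...
# Hypothetical helper: prints (does not run) the ipvsadm commands that
# create a round-robin virtual service and add each real server in NAT mode.
lvs_commands() {
    local vip="$1" port="$2"
    shift 2
    echo "ipvsadm -A -t ${vip}:${port} -s rr"
    local rs
    for rs in "$@"; do
        echo "ipvsadm -a -t ${vip}:${port} -r ${rs}:${port} -m"
    done
}

lvs_commands 10.192.203.203 3307 10.192.203.201 10.192.203.202
```

In the listings above, port 3307 is shown by its service name (opsession-prxy) and some addresses are shown as host names; ipvsadm -L -n prints numeric addresses and ports instead, which is easier to compare against the commands.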
heartbeat uses UDP port 694 for heartbeat monitoring by default. If the system uses iptables as its firewall, remember to open this port.
vi /etc/sysconfig/iptables
Add: -A INPUT -p udp --dport 694 -j ACCEPT
service iptables restart
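Where the rule goes in the file matters, because iptables evaluates rules in order: an ACCEPT placed after a catch-all REJECT never matches. The sketch below inserts the rule before the first INPUT REJECT line, or before COMMIT if there is none. It assumes a rules file in iptables-save format that contains only the filter table, as a stock CentOS 6 file does; add_heartbeat_rule is a hypothetical helper that works as a stdin-to-stdout filter.

```shell
#!/bin/bash
# add_heartbeat_rule: reads iptables-save style rules on stdin and prints
# them with the UDP 694 ACCEPT rule inserted before the first INPUT REJECT
# rule (or before COMMIT). If an identical rule was already seen earlier in
# the input, nothing is added. Assumes a single (filter) table.
add_heartbeat_rule() {
    awk -v rule='-A INPUT -p udp --dport 694 -j ACCEPT' '
        $0 == rule { seen = 1 }
        (/^-A INPUT .*-j REJECT/ || /^COMMIT$/) && !seen && !inserted {
            print rule
            inserted = 1
        }
        { print }
    '
}

# Example on a small sample rather than the live file:
printf '%s\n' \
    '*filter' \
    '-A INPUT -p tcp --dport 22 -j ACCEPT' \
    '-A INPUT -j REJECT --reject-with icmp-host-prohibited' \
    'COMMIT' | add_heartbeat_rule
```

On a real system the output would be reviewed and written back to /etc/sysconfig/iptables before running service iptables restart.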
Start HA: service heartbeat start
Start heartbeat on both the primary and the standby:
[root@PC init.d]# service heartbeat start
Starting High-Availability services:INFO: Resource is stopped
Done.
[root@PC ha_log]# service heartbeat status
heartbeat OK [pid 17943 et al] is running on pc [pc]...
[root@slave2 ha_log]# service heartbeat status
heartbeat OK [pid 6536 et al] is running on slave2 [slave2]...
The virtual IP is now visible on the primary:
[root@PC ha_log]# ip addr
1: lo: <LOOPBACK> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
    inet 10.192.203.201/24 brd 10.192.203.255 scope global eth0
    inet 10.192.203.203/24 brd 10.192.203.255 scope global secondary eth0
    inet6 fe80::a00:27ff:fe04:516/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.1/24 brd 10.0.0.255 scope global eth1
    inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
The relevant messages can be found in the log file under /var/log/ha_log or in /var/log/messages.
[root@PC network-scripts]# tail -f /var/log/messages
May 19 01:34:59 PC ResourceManager(default)[17985]: info: Running /etc/ha.d/resource.d/IPaddr 10.192.203.203 start
May 19 01:35:00 PC IPaddr(IPaddr_10.192.203.203)[18103]: INFO: Adding inet address 10.192.203.203/24 with broadcast address 10.192.203.255 to device eth0
May 19 01:35:00 PC IPaddr(IPaddr_10.192.203.203)[18103]: INFO: Bringing device eth0 up
May 19 01:35:00 PC IPaddr(IPaddr_10.192.203.203)[18103]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.192.203.203 eth0 10.192.203.203 auto not_used not_used
May 19 01:35:00 PC /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[18089]: INFO: Success
May 19 01:35:00 PC ResourceManager(default)[17985]: info: Running /etc/init.d/mysql start
May 19 01:35:03 PC heartbeat: [17972]: info: local HA resource acquisition completed (standby).
May 19 01:35:03 PC heartbeat: [17943]: info: Standby resource acquisition done [foreign].
May 19 01:35:03 PC heartbeat: [17943]: info: Initial resource acquisition complete (auto_failback)
May 19 01:35:03 PC heartbeat: [17943]: info: remote resource transition completed.
Test:
Stop heartbeat on the primary, 201:
[root@PC ha_log]# service heartbeat stop
Stopping High-Availability services: Done.
Check the log:
May 19 01:46:57 PC heartbeat: [18561]: info: Giving up all HA resources.
May 19 01:46:58 PC ResourceManager(default)[18574]: info: Releasing resource group: pc IPaddr::10.192.203.203 mysql
May 19 01:46:58 PC ResourceManager(default)[18574]: info: Running /etc/init.d/mysql stop
May 19 01:46:59 PC ResourceManager(default)[18574]: info: Running /etc/ha.d/resource.d/IPaddr 10.192.203.203 stop
May 19 01:46:59 PC IPaddr(IPaddr_10.192.203.203)[18652]: INFO: IP status = ok, IP_CIP=
May 19 01:46:59 PC /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[18638]: INFO: Success
May 19 01:46:59 PC heartbeat: [18561]: info: All HA resources relinquished.
May 19 01:47:00 PC heartbeat: [17943]: WARN: 1 lost packet(s) for [slave2] [2777:2779]
May 19 01:47:00 PC heartbeat: [17943]: info: No pkts missing from slave2!
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBWRITE process 17949 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBREAD process 17950 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBWRITE process 17951 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBREAD process 17952 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBFIFO process 17946 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBWRITE process 17947 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBREAD process 17948 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: Core process 17951 exited. 7 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17946 exited. 6 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17947 exited. 5 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17948 exited. 4 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17949 exited. 3 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17950 exited. 2 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17952 exited. 1 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: pc Heartbeat shutdown complete.
The log on 202 (slave2):
harc(default)[8578]: 2016/05/19_01:47:00 info: Running /etc/ha.d//rc.d/status status
mach_down(default)[8595]: 2016/05/19_01:47:00 info: Taking over resource group IPaddr::10.192.203.203
ResourceManager(default)[8622]: 2016/05/19_01:47:00 info: Acquiring resource group: pc IPaddr::10.192.203.203 mysql
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[8650]: 2016/05/19_01:47:01 INFO: Resource is stopped
ResourceManager(default)[8622]: 2016/05/19_01:47:01 info: Running /etc/ha.d/resource.d/IPaddr 10.192.203.203 start
IPaddr(IPaddr_10.192.203.203)[8746]: 2016/05/19_01:47:01 INFO: Adding inet address 10.192.203.203/24 with broadcast address 10.192.203.255 to device eth0
IPaddr(IPaddr_10.192.203.203)[8746]: 2016/05/19_01:47:01 INFO: Bringing device eth0 up
IPaddr(IPaddr_10.192.203.203)[8746]: 2016/05/19_01:47:01 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.192.203.203 eth0 10.192.203.203 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[8732]: 2016/05/19_01:47:01 INFO: Success
ResourceManager(default)[8622]: 2016/05/19_01:47:02 info: Running /etc/init.d/mysql start
mach_down(default)[8595]: 2016/05/19_01:47:05 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down(default)[8595]: 2016/05/19_01:47:05 info: mach_down takeover complete for node pc.
May 19 01:47:05 slave2 heartbeat: [6536]: info: mach_down takeover complete.
May 19 01:47:31 slave2 heartbeat: [6536]: WARN: node pc: is dead
May 19 01:47:31 slave2 heartbeat: [6536]: info: Dead node pc gave up resources.
May 19 01:47:31 slave2 heartbeat: [6536]: info: Link pc:eth1 dead.
This shows that 202 took over successfully.
On 202 the VIP can be seen to have moved over:
[root@slave2 ha_log]# ip addr
1: lo: <LOOPBACK> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
    inet 10.192.203.202/24 brd 10.192.203.255 scope global eth0
    inet 10.192.203.203/24 brd 10.192.203.255 scope global secondary eth0
    inet6 fe80::a00:27ff:fe04:516/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.2/24 brd 10.0.0.255 scope global eth1
    inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link
       valid_lft forever preferred_lft forever
201 no longer holds the VIP:
[root@PC ha_log]# ip addr
1: lo: <LOOPBACK> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
    inet 10.192.203.201/24 brd 10.192.203.255 scope global eth0
    inet6 fe80::a00:27ff:fe04:516/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.1/24 brd 10.0.0.255 scope global eth1
    inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
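Instead of reading the ip addr output by eye on each node, the check for who holds the VIP can be scripted. A minimal sketch: has_vip is a hypothetical helper that reads ip addr output on standard input and assumes the address is printed as inet followed by the address and a /prefix, as in the listings in this article.

```shell
#!/bin/bash
# has_vip VIP: reads "ip addr" output on stdin and returns 0 if the given
# IPv4 address is configured on any interface in that output.
has_vip() {
    local vip="$1"
    # -F: fixed string; the trailing slash keeps 10.192.203.20 from
    # matching 10.192.203.203.
    grep -qF "inet ${vip}/"
}

# Example with canned lines; on a live node:
#   ip addr show eth0 | has_vip 10.192.203.203 && echo "this node holds the VIP"
sample="    inet 10.192.203.202/24 brd 10.192.203.255 scope global eth0
    inet 10.192.203.203/24 brd 10.192.203.255 scope global secondary eth0"
if printf '%s\n' "$sample" | has_vip 10.192.203.203; then
    echo "VIP present"
fi
```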
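heartbeat only monitors the heartbeat, that is, whether the machine itself is down; it does not monitor the MySQL service. So we also need a script that checks MySQL and, if the mysql service is down, stops heartbeat to trigger a failover (the same principle as nginx + keepalived). The script is as follows:
vi /root/check_mysql.sh
#!/bin/bash
# Check that the local MySQL instance accepts a login; if it does not,
# stop heartbeat so that the other node takes over the VIP.
MYSQL=/usr/local/mysql/bin/mysql
MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=system@123

$MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p"$MYSQL_PASSWORD" -e "show status;" >/dev/null 2>&1
# $mysqlclient --host=$host --port=$port --user=$user --password=$password -e "show databases;" > /dev/null 2>&1
if [ $? -eq 0 ]
then
    echo "$MYSQL_HOST mysql login successfully"
    exit 0
else
    # echo "$MYSQL_HOST mysql login failed"
    /etc/init.d/heartbeat stop
    exit 2
fi
Email notification still needs to be added to this script.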
chmod +x /root/check_mysql.sh
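The check script fails over on a single failed login, which may be too eager if MySQL is only briefly busy. One possible refinement, offered as a sketch rather than part of the original setup, is to retry a few times before giving up; succeeds_within is a hypothetical helper, and the attempt count and delay are arbitrary example values.

```shell
#!/bin/bash
# succeeds_within ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND up to ATTEMPTS times, sleeping DELAY
# seconds between tries; returns 0 as soon as one try succeeds, else 1.
succeeds_within() {
    local attempts="$1" delay="$2"
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" >/dev/null 2>&1 && return 0
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Harmless demonstration with built-in commands:
succeeds_within 3 0 true  && echo "check passed"
succeeds_within 2 0 false || echo "check failed after retries"

# In check_mysql.sh this could wrap the login test, for example:
#   succeeds_within 3 5 "$MYSQL" -h "$MYSQL_HOST" -u "$MYSQL_USER" \
#       -p"$MYSQL_PASSWORD" -e "show status;" || /etc/init.d/heartbeat stop
```

Keep the total retry time below the cron interval so that consecutive runs do not overlap.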
Set it up as a cron job that runs the check once a minute:
*/1 * * * * /root/check_mysql.sh >> /root/check_mysql.log
Stop mysql on the current primary and verify that the VIP moves to the standby.
This article drew on the following articles:
http://www.linuxidc.com/Linux/2011-11/46764.htm
http://www.codesky.net/article/201111/173710.html
http://blog.chinaunix.net/uid-20639775-id-3337481.html
http://www.oschina.net/question/163914_31896
https://www.linuxzen.com/heartbeatshi-xian-mysqlshuang-ji-gao-ke-yong.html
http://www.it165.net/admin/html/201308/1702.html
http://blog.csdn.net/wyzxg/article/details/7741116