Failover with the MySQL Utilities – Part 1: mysqlrpladmin

WBOY

Jun 01, 2016, 01:07 PM

MySQL Utilities are a set of tools provided by Oracle to perform many kinds of administrative tasks. When GTID replication is enabled, two tools can be used for slave promotion: mysqlrpladmin and mysqlfailover. We will review mysqlrpladmin (version 1.4.3) in this post.

Summary

mysqlrpladmin can perform manual failover/switchover when GTID replication is enabled.
You need to have your servers configured with --master-info-repository=TABLE, or to add the --rpl-user option, for the tool to work properly.
The check for errant transactions is failing in the current GA version (1.4.3), so be extra careful when using it, or watch bug #73110 to see when a fix is committed.
There are some limitations, for instance the inability to pre-configure the list of slaves in a configuration file or the inability to check that the tool will work well without actually doing a failover or switchover.

Failover vs switchover

mysqlrpladmin can help you promote a slave to be the new master when the master goes down, and then automate the replication reconfiguration that follows the promotion. There are two separate scenarios: unplanned promotion (failover) and planned promotion (switchover). The distinction is more than vocabulary: it changes the way you have to run the tool.

Setup for this test

To test the tool, our setup will be a master with 2 slaves, all using GTID replication. mysqlrpladmin can show us the current replication topology with the health command:

$ mysqlrpladmin --master=root@localhost:13001 --discover-slaves-login=root health
# Discovering slaves for master at localhost:13001
# Discovering slave at localhost:13002
# Found slave: localhost:13002
# Discovering slave at localhost:13003
# Found slave: localhost:13003
# Checking privileges.
#
# Replication Topology Health:
+------------+--------+---------+--------+------------+---------+
| host       | port   | role    | state  | gtid_mode  | health  |
+------------+--------+---------+--------+------------+---------+
| localhost  | 13001  | MASTER  | UP     | ON         | OK      |
| localhost  | 13002  | SLAVE   | UP     | ON         | OK      |
| localhost  | 13003  | SLAVE   | UP     | ON         | OK      |
+------------+--------+---------+--------+------------+---------+
# ...done.

As you can see, we have to specify how to connect to the master (no surprise) but instead of listing all the slaves, we can let the tool discover them.

Simple failover scenario

What will the tool do when performing failover? Essentially we will give it the list of slaves and the list of candidates and it will:

Run a few sanity checks
Elect a candidate to be the new master
Make the candidate as up-to-date as possible by making it a slave of all the other slaves
Configure replication on all the other slaves to make them replicate from the new master

After killing the master with kill -9, let's try a failover:

$ mysqlrpladmin --slaves=root:@localhost:13002,root:@localhost:13003 --candidates=root@localhost:13002 failover

This time, the master is down so the tool has no way to automatically discover the slaves. Thus we have to specify them with the --slaves option.

However we get an error:

# Checking privileges.
# Checking privileges on candidates.
ERROR: You must specify either the --rpl-user or set all slaves to use --master-info-repository=TABLE.

The error message is clear, but it would have been nice to get this information when running the health command (maybe as a warning instead of an error). That would allow you to check beforehand that the tool can run smoothly, rather than discovering in the middle of an emergency that you have to read the documentation to find which option is missing.
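Until the tool offers such a check, a pre-flight test can be scripted. The sketch below is only one possible approach, under stated assumptions: it asks a slave for its master_info_repository setting through the regular mysql client. The check_repo function, the MYSQL_BIN variable and the passwordless root login over TCP are choices made for this example; they are not part of MySQL Utilities.

```shell
#!/bin/bash
# Pre-flight sketch: report whether a slave stores its master info in a table.
# Assumptions: mysql client on the PATH, passwordless root login over TCP.
MYSQL_BIN="${MYSQL_BIN:-mysql}"

# check_repo PORT
# Returns 0 when master_info_repository is TABLE, 1 otherwise,
# and 2 when the server cannot be queried.
check_repo() {
    local port="$1" value
    value=$("$MYSQL_BIN" --protocol=tcp -uroot -hlocalhost -P"$port" \
        -NBe 'SELECT @@global.master_info_repository') || return 2
    if [ "$value" = "TABLE" ]; then
        echo "port $port: OK"
    else
        echo "port $port: master_info_repository=$value, use --rpl-user"
        return 1
    fi
}
```

With the test setup of this post, it could be called as: for port in 13002 13003; do check_repo "$port"; done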

Let’s choose to specify the replication user:

$ mysqlrpladmin --slaves=root:@localhost:13002,root:@localhost:13003 --candidates=root@localhost:13002 --rpl-user=repl:repl failover
# Checking privileges.
# Checking privileges on candidates.
# Performing failover.
# Candidate slave localhost:13002 will become the new master.
# Checking slaves status (before failover).
# Preparing candidate for failover.
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
# Switching slaves to new master.
# Disconnecting new master as slave.
# Starting slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Failover complete.
#
# Replication Topology Health:
+------------+--------+---------+--------+------------+---------+
| host       | port   | role    | state  | gtid_mode  | health  |
+------------+--------+---------+--------+------------+---------+
| localhost  | 13002  | MASTER  | UP     | ON         | OK      |
| localhost  | 13003  | SLAVE   | UP     | ON         | OK      |
+------------+--------+---------+--------+------------+---------+
# ...done.

Simple switchover scenario

Let’s now restart the old master and configure it as a slave of the new master (by the way, this can be done with mysqlreplicate, another tool from the MySQL Utilities). If we want to promote the old master, we can run:

$ mysqlrpladmin --master=root@localhost:13002 --new-master=root@localhost:13001 --discover-slaves-login=root --demote-master --rpl-user=repl:repl --quiet switchover
# Discovering slave at localhost:13001
# Found slave: localhost:13001
# Discovering slave at localhost:13003
# Found slave: localhost:13003
+------------+--------+---------+--------+------------+---------+
| host       | port   | role    | state  | gtid_mode  | health  |
+------------+--------+---------+--------+------------+---------+
| localhost  | 13001  | MASTER  | UP     | ON         | OK      |
| localhost  | 13002  | SLAVE   | UP     | ON         | OK      |
| localhost  | 13003  | SLAVE   | UP     | ON         | OK      |
+------------+--------+---------+--------+------------+---------+

Notice that the master is available in this case, so we can use the --discover-slaves-login option. Also notice that we can tune the verbosity of the tool by using --quiet or --verbose, or even log the output to a file with --log.

We also used --demote-master to make the old master a slave of the new master. Without this option, the old master will be isolated from the other nodes.

Extension points

In general doing switchover/failover at the database level is one thing but informing the other components of the application that something has changed is most often necessary for the application to keep on working correctly.

This is where the extension points are handy: you can execute a script before switchover/failover with --exec-before and after switchover/failover with --exec-after.

For instance with these simple scripts:

# cat /usr/local/bin/check_before
#!/bin/bash
/usr/local/mysql5619/bin/mysql -uroot -S /tmp/node1.sock -Ee 'SHOW SLAVE STATUS' > /tmp/before
# cat /usr/local/bin/check_after
#!/bin/bash
/usr/local/mysql5619/bin/mysql -uroot -S /tmp/node1.sock -Ee 'SHOW SLAVE STATUS' > /tmp/after

We can execute:

$ mysqlrpladmin --master=root@localhost:13001 --new-master=root@localhost:13002 --discover-slaves-login=root --demote-master --rpl-user=repl:repl --quiet --exec-before=/usr/local/bin/check_before --exec-after=/usr/local/bin/check_after switchover

Looking at /tmp/before and /tmp/after, we can see that our scripts have been executed:

# cat /tmp/before
# cat /tmp/after
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: localhost
                  Master_User: repl
                  Master_Port: 13002
[...]

If the external script does not seem to work, using --verbose can be useful to diagnose the issue.

What about errant transactions?

We already mentioned that errant transactions can create lots of issues when a new master is promoted in a cluster running GTIDs. So the question is: how does mysqlrpladmin behave when there is an errant transaction?

Let’s create an errant transaction:

# On localhost:13003
mysql> CREATE DATABASE test2;
mysql> FLUSH LOGS;
mysql> SHOW BINARY LOGS;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000001 |     69309 |
| mysql-bin.000002 |   1237667 |
| mysql-bin.000003 |       617 |
| mysql-bin.000004 |       231 |
+------------------+-----------+
mysql> PURGE BINARY LOGS TO 'mysql-bin.000004';

Now let’s try to promote localhost:13003 as the new master:

$ mysqlrpladmin --master=root@localhost:13001 --new-master=root@localhost:13003 --discover-slaves-login=root --demote-master --rpl-user=repl:repl --quiet switchover
[...]
+------------+--------+---------+--------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| host       | port   | role    | state  | gtid_mode  | health |
+------------+--------+---------+--------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| localhost  | 13003  | MASTER  | UP     | ON         | OK |
| localhost  | 13001  | SLAVE   | UP     | ON         | IO thread is not running., Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Slave has 1 transactions behind master. |
| localhost  | 13002  | SLAVE   | UP     | ON         | IO thread is not running., Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Slave has 1 transactions behind master. |
+------------+--------+---------+--------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Oops! The documentation suggests that errant transactions are checked, and a quick look at the code confirms that this is the intent, but the check does not work. This is a major issue, as you cannot run failover/switchover reliably with GTID replication if errant transactions are not correctly detected. So it has been reported.
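In the meantime, the check can be done by hand before running the tool. The following is a minimal sketch, not what mysqlrpladmin does internally: it compares gtid_executed on a slave and on the master with the GTID_SUBTRACT() function available in MySQL 5.6. The function names, the MYSQL_BIN variable and the passwordless root login over TCP are assumptions made for this example.

```shell
#!/bin/bash
# Sketch: list GTIDs executed on a slave but unknown to the master.
# Assumptions: mysql client on the PATH, passwordless root login over TCP.
MYSQL_BIN="${MYSQL_BIN:-mysql}"

# Run one statement on the server listening on the given TCP port.
run_sql() {
    local port="$1" sql="$2"
    "$MYSQL_BIN" --protocol=tcp -uroot -hlocalhost -P"$port" -NBe "$sql"
}

# Print gtid_executed on a single line (MySQL puts a newline after each UUID).
executed_gtids() {
    run_sql "$1" "SELECT REPLACE(@@global.gtid_executed, '\n', '')"
}

# errant_gtids MASTER_PORT SLAVE_PORT
# Prints the errant GTID set of the slave; empty output means none was found.
errant_gtids() {
    local master_port="$1" slave_port="$2" master_set slave_set
    # Read the slave first: anything it replicates later cannot be reported
    # as errant by mistake, because the master set is read afterwards.
    slave_set=$(executed_gtids "$slave_port") || return 2
    master_set=$(executed_gtids "$master_port") || return 2
    run_sql "$slave_port" \
        "SELECT REPLACE(GTID_SUBTRACT('$slave_set', '$master_set'), '\n', '')"
}
```

For the scenario above, errant_gtids 13001 13003 should print the GTID of the CREATE DATABASE test2 statement. This only works while the master is reachable, so it fits switchover; for failover the slaves would have to be compared with each other instead.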

Some limitations

Apart from the missing errant transaction check, I also noticed a few limitations:

You cannot use a configuration file listing all the slaves. This becomes tedious once you have a large number of slaves. In such a case, you should write a wrapper script around mysqlrpladmin to generate the right command for you.
The slave election process is either automatic or it relies on the order of the servers given in the --candidates option. This is not very sophisticated.
It would be useful to have a --dry-run mode which would validate that everything is configured correctly but without actually failing/switching over. This is something MHA does for instance.
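As an illustration of the wrapper idea from the first limitation, here is a minimal sketch that builds the --slaves option from a plain text file. The file format (one user:password@host:port entry per line, with # comments allowed) and the function names are hypothetical; only the mysqlrpladmin options already used in this post appear in the generated command.

```shell
#!/bin/bash
# Sketch: generate a mysqlrpladmin failover command from a list of slaves.
# The slaves file is a hypothetical format: one connection string per line.

# Join the non-empty, non-comment lines of the file with commas.
slaves_from_file() {
    local file="$1"
    grep -Ev '^[[:space:]]*(#|$)' "$file" | paste -sd, -
}

# build_failover_cmd SLAVES_FILE CANDIDATE RPL_USER
# Prints the command instead of running it, so it can be reviewed first.
build_failover_cmd() {
    local file="$1" candidate="$2" rpl_user="$3"
    printf 'mysqlrpladmin --slaves=%s --candidates=%s --rpl-user=%s failover\n' \
        "$(slaves_from_file "$file")" "$candidate" "$rpl_user"
}
```

Printing the command rather than executing it is a deliberate choice here: it gives you a chance to read it before an actual failover, which partly compensates for the missing dry-run mode.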

Conclusion

mysqlrpladmin is a very good tool to help you perform manual failover/switchover in a cluster using GTID replication. The main caveat at this point is the failing check for errant transactions, which requires a lot of care before executing the tool.
