首页  >  文章  >  数据库  >  使用 Tu_MySQL 的 Percona Toolkit 插件检查 MySQL 从机延迟

使用 Tu_MySQL 的 Percona Toolkit 插件检查 MySQL 从机延迟

WBOY
WBOY原创
2016-05-31 08:46:302039浏览

Continuent Tungsten Replicator 不久前,我对 pt-online-schema-change 的插件接口进行了一些更改,它允许编写自定义复制检查。在添加此功能时,我还向 pt-table-checksum 添加了 --plugin 选项。这是在 Percona Toolkit 2.2.8 中发布的。

通过这些添加,我花了一些时间编写了一个插件,允许 Percona Toolkit 工具使用 Tungsten Replicator 来检查从机延迟,您可以在 https://github.com/grypyrg/percona- 找到代码toolkit-plugin-tungsten-replicator

该插件使用 perl JSON::XS 模块 ( perl-JSON-XS rpm 包,http://search.cpan.org/dist/ JSON-XS/XS.pm ),确保它可用,否则插件将无法工作。

我们需要使用 --recursion-method=dsns 因为 Percona Toolkit 工具无法自动找到连接到主数据库的 tungsten 复制器从属。 (我确实在启动板上添加了一个蓝图,以使其成为可能 https://blueprints.launchpad.net/percona-toolkit/spec/plugin-custom-recursion-method )

dsns递归方法从您指定的数据库表中获取从属列表:

CREATE TABLE `percona`.`dsns` (`id` int(11) NOT NULL AUTO_INCREMENT,`parent_id` int(11) DEFAULT NULL,`dsn` varchar(255) NOT NULL,PRIMARY KEY (`id`));

CREATE TABLE ` percona ` . ` dsns ` (

   ` id ` int ( 11 ) NOT NULL AUTO_INCREMENT ,

   ` parent_id ` int ( 11 ) DEFAULT NULL ,

   ` dsn ` varchar ( 255 ) NOT NULL ,

   PRIMARY KEY ( ` id ` )

) ;

CREATE TABLE ` percona ` 。 ` dsns ` (

   ` id ` int ( 11 ) NOT NULL AUTO_INCRMENT ,

   `parent_id ` int ( 11 ) 默认为 NULL ,
node1 mysql> select * from percona.dsns;+----+-----------+---------+| id | parent_id | dsn |+----+-----------+---------+|2 |NULL | h=node3 |+----+-----------+---------+

node1 mysql > select * from percona .dsns ;

-- -- -- -- -- -- -- - -- -- -- -- -

| id | parent_id | dsn      |

-- -- -- -- -- -- -- - -- -- -- -- -

|    2 |        NULL | h = node3 |

-- -- -- -- -- -- -- - -- -- -- -- -

   ` dsn ` varchar ( 255 ) NOT NULL ,

   主键 ( ` id ` )

) ;
## CONFIGURATION# trepctl command to runmy $trepctl="/opt/tungsten/installs/cookbook/tungsten/tungsten-replicator/bin/trepctl";# what tungsten replicator service to checkmy $service="bravo";# what user does tungsten replicator use to perform the writes?# See Binlog Format for more informationmy $tungstenusername = 'tungsten';

## CONFIGURATION

# trepctl command to run

my $trepctl = "/opt/tungsten/installs/cookbook/tungsten/tungsten-replicator/bin/trepctl" ;

# what tungsten replicator service to check

my $service = "bravo" ;

# what user does tungsten replicator use to perform the writes?

# See Binlog Format for more information

my $tungstenusername = 'tungsten' ;

这里有一个从属

node3

正在从主服务器复制:

node1 mysql>;从 percona .dsns 选择 * ;
# pt-table-checksum-u checksum--no-check-binlog-format--recursion-method=dsn=D=percona,t=dsns--plugin=/vagrant/pt-plugin-tungsten_replicator.pl--databases app--check-interval=5--max-lag=10Created plugin from /vagrant/pt-plugin-tungsten_replicator.pl.PLUGIN get_slave_lag: Using Tungsten Replicator to check replication lagTungsten Replicator status of host node3 is OFFLINE:NORMAL, waitingTungsten Replicator status of host node3 is OFFLINE:NORMAL, waitingReplica node3 is stopped.Waiting.Tungsten Replicator status of host node3 is OFFLINE:NORMAL, waitingReplica lag is 125 seconds on node3.Waiting.Replica lag is 119 seconds on node3.Waiting.Checksumming app.large_table:22% 00:12 remainTS ERRORSDIFFS ROWSCHUNKS SKIPPEDTIME TABLE07-03T10:49:54002097152 7 0 213.238 app.large_table
-- -- -- -- -- -- -- -- - -- -- -- -- -- -

# pt-table-checksum

         - u checksum

         -- no - check - binlog - format

         -- recursion - method = dsn = D = percona , t = dsns

         -- plugin = / vagrant / pt - plugin - tungsten_replicator .pl   

         -- databases app

         -- check - interval = 5

         -- max - lag = 10

Created plugin from / vagrant / pt - plugin - tungsten_replicator .pl .

PLUGIN get_slave_lag : Using Tungsten Replicator to check replication lag

Tungsten Replicator status of host node3 is OFFLINE : NORMAL , waiting

Tungsten Replicator status of host node3 is OFFLINE : NORMAL , waiting

Replica node3 is stopped .    Waiting .

Tungsten Replicator status of host node3 is OFFLINE : NORMAL , waiting

Replica lag is 125 seconds on node3 .    Waiting .

Replica lag is 119 seconds on node3 .    Waiting .

Checksumming app .large_table :    22 % 00 : 12 remain

             TS ERRORS   DIFFS     ROWS   CHUNKS SKIPPED     TIME TABLE

07 - 03T10 : 49 : 54        0        0    2097152        7        0 213.238 app .large_table

|编号 |父 ID | DSN      | -- -- -- -- -- -- -- -- - -- -- -- -- -- - |    2 |        空| h = 节点3 | -- -- -- -- -- -- -- -- - -- -- -- -- -- -
目前,使用 Percona Toolkit 无法为插件指定额外选项,因此仍然需要手动编辑 perl 文件来配置它。因此,在运行校验和之前,我们需要配置插件:
## 配置 # trepctl 命令运行 my $trepctl = "/opt/tungsten/installs /cookbook/tungsten/tungsten-replicator/bin/trepctl" ; # 要检查什么钨复制器服务 my $service = "bravo" ; # tungstenreplicator 使用什么用户来执行写入? # 有关更多信息,请参阅 Binlog 格式 my $tungstenusername = 'tungsten' ;
运行校验和 在这里,我使用 pt-table-checksum 对表进行了校验和。在校验和过程中,我使从节点离线并再次将其重新上线:
# pt-table-checksum          - u checksum          -- 无 - 检查 - binlog - 格式          -- 递归 - method = dsn = D = percona , t = dsns          -- plugin = / vagrant / pt - 插件 - tungsten_replicator .pl             -- 数据库应用          -- 检查 - 间隔 = 5          -- max - lag = 10 已创建来自 /vagrant/pt-plugin-tungsten_replicator.pl 的插件。 PLUGIN get_slave_lag :使用 Tungsten Replicator 检查复制延迟 主机节点 3 的 Tungsten Replicator 状态为离线:正常,等待 主机节点 3 的 Tungsten Replicator 状态为离线:正常,等待 副本节点 3 已停止。    等待 。 主机节点 3 的 Tungsten Replicator 状态为 OFFLINE:NORMAL,正在等待 节点 3 上的副本延迟为 125 秒。    等待 。 Node3 上的副本延迟为 119 秒。    等待 。 校验和应用程序 .large_table :    22 % 00 : 12 剩余              TS 错误   差异     行   跳过块     时间表 07 - 03T10 : 49 : 54        0        0    2097152        7        0 213.238应用程序 .large_table

I recommend to change the check-interval higher than the default 1 second as running trepctl takes a while. This could slow down the process quite a lot.

Making Schema Changes

The plugin also works with pt-online-schema-change :

# pt-online-schema-change-u schemachange--recursion-method=dsn=D=percona,t=dsns--plugin=/vagrant/pt-plugin-tungsten_replicator.pl--check-interval=5--max-lag=10--alter "add index (column1) "--execute D=app,t=large_tableCreated plugin from /vagrant/pt-plugin-tungsten_replicator.pl.Found 1 slaves:node3Will check slave lag on:node3PLUGIN get_slave_lag: Using Tungsten Replicator to check replication lagOperation, tries, wait:copy_rows, 10, 0.25create_triggers, 10, 1drop_triggers, 10, 1swap_tables, 10, 1update_foreign_keys, 10, 1Altering `app`.`large_table`...Creating new table...Created new table app._large_table_new OK.Waiting forever for new table `app`.`_large_table_new` to replicate to node3...Altering new table...Altered `app`.`_large_table_new` OK.2014-07-03T13:02:33 Creating triggers...2014-07-03T13:02:33 Created triggers OK.2014-07-03T13:02:33 Copying approximately 8774670 rows...Copying `app`.`large_table`:26% 01:21 remainCopying `app`.`large_table`:50% 00:59 remainReplica lag is 12 seconds on node3.Waiting.Replica lag is 12 seconds on node3.Waiting.Copying `app`.`large_table`:53% 02:22 remainCopying `app`.`large_table`:82% 00:39 remain2014-07-03T13:06:06 Copied rows OK.2014-07-03T13:06:06 Swapping tables...2014-07-03T13:06:06 Swapped original and new tables OK.2014-07-03T13:06:06 Dropping old table...2014-07-03T13:06:06 Dropped old table `app`.`_large_table_old` OK.2014-07-03T13:06:06 Dropping triggers...2014-07-03T13:06:06 Dropped triggers OK.Successfully altered `app`.`large_table`.

# pt-online-schema-change

         - u schemachange

         -- recursion - method = dsn = D = percona , t = dsns

         -- plugin = / vagrant / pt - plugin - tungsten_replicator .pl

         -- check - interval = 5

         -- max - lag = 10

         -- alter "add index (column1) "

         -- execute D = app , t = large_table

Created plugin from / vagrant / pt - plugin - tungsten_replicator .pl .

Found 1 slaves :

   node3

Will check slave lag on :

   node3

PLUGIN get_slave_lag : Using Tungsten Replicator to check replication lag

Operation , tries , wait :

   copy_rows , 10 , 0.25

   create_triggers , 10 , 1

   drop_triggers , 10 , 1

   swap_tables , 10 , 1

   update_foreign_keys , 10 , 1

Altering ` app ` . ` large_table ` . . .

Creating new table . . .

Created new table app ._large_table_new OK .

Waiting forever for new table ` app ` . ` _large_table_new ` to replicate to node3 . . .

Altering new table . . .

Altered ` app ` . ` _large_table_new ` OK .

2014 - 07 - 03T13 : 02 : 33 Creating triggers . . .

2014 - 07 - 03T13 : 02 : 33 Created triggers OK .

2014 - 07 - 03T13 : 02 : 33 Copying approximately 8774670 rows . . .

Copying ` app ` . ` large_table ` :    26 % 01 : 21 remain

Copying ` app ` . ` large_table ` :    50 % 00 : 59 remain

Replica lag is 12 seconds on node3 .    Waiting .

Replica lag is 12 seconds on node3 .    Waiting .

Copying ` app ` . ` large_table ` :    53 % 02 : 22 remain

Copying ` app ` . ` large_table ` :    82 % 00 : 39 remain

2014 - 07 - 03T13 : 06 : 06 Copied rows OK .

2014 - 07 - 03T13 : 06 : 06 Swapping tables . . .

2014 - 07 - 03T13 : 06 : 06 Swapped original and new tables OK .

2014 - 07 - 03T13 : 06 : 06 Dropping old table . . .

2014 - 07 - 03T13 : 06 : 06 Dropped old table ` app ` . ` _large_table_old ` OK .

2014 - 07 - 03T13 : 06 : 06 Dropping triggers . . .

2014 - 07 - 03T13 : 06 : 06 Dropped triggers OK .

Successfully altered ` app ` . ` large_table ` .

As you can see, there was some slave lag during the schema changes.

Binlog Format & pt-online-schema-change

pt-online-schema-change uses triggers in order to do the schema changes. Tungsten Replicator has some limitations with different binary log formats and triggers ( https://code.google.com/p/tungsten-replicator/wiki/TRCAdministration#Triggers_and_Row_Replication ).

In Tungsten Replicator, ROW based binlog events will be converted to SQL statements, which causes triggers to be executed on the slave as well, this does not happen with traditional replication.

Different settings:

  • STATEMENT based binary logging works by default
  • ROW based binary logging works, the plugin recreates the triggers and uses the technique documented at https://code.google.com/p/tungsten-replicator/wiki/TRCAdministration#Triggers_and_Row_Replication
  • MIXED binary logging does not work, as there is currently no way to determine whether an event was written to the binary log in statement or row based format, so it’s not possible to know if triggers should be run or not. The tool will exit and and error will be returned:
    Error creating --plugin: The master it's binlog_format=MIXED,pt-online-schema change does not work well withTungsten Replicator and binlog_format=MIXED.

    Error creating -- plugin : The master it ' s binlog_format = MIXED ,

    pt - online - schema change does not work well with

    Tungsten Replicator and binlog_format = MIXED .

binlog_format 可以在每个会话的基础上覆盖,确保使用 pt-online-schema-change 时不会发生这种情况。

Continent 网站上的文档已经提到了如何将数据与 pt-table-checksum 进行比较。

我相信这个插件是一个很好的补充。 Percona Toolkit 中监控复制延迟的功能现在可以与 Tungsten Replicator 一起使用,因此您可以控制使用这些工具时可以容忍的复制延迟程度。

声明:
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn