Field expansion of large tables under MySQL cascade replication
1. Background
A customer’s business has a table with about 400 million rows. Due to business expansion, the open_id in the table varchar(50) needs to be expanded to varchar(500).
During the change, try to minimize the impact on the main library (it is best not to have any impact -> In the end, we strived for a 4-hour window period).
2. Database table information
Environment: Mysql 8.0.22
1 Master 1 slave replication based on Gtid
1. The first question, this is a big picture Table? Yes, please look at the ibd file of this table. 280G count has not returned for a long time. Use the standby database to check and confirm the number of rows > 400 million.
以下语句也可以查看: show table status from dbname like 'tablename'\G # Rows 的值不准,有时误差有2倍 SELECT a.table_schema,a.table_name,concat(round(sum(DATA_LENGTH/1024/1024)+sum(INDEX_LENGTH/1024/1024),2) ,'MB')total_size,concat(round(sum(DATA_LENGTH/1024/1024),2),'MB') AS data_size,concat(round(sum(INDEX_LENGTH/1024/1024),2),'MB') AS index_size FROM information_schema.TABLES a WHERE a.table_schema = 'dbname' AND a.table_name = 'tablename'; #看下此表的数据量Since it is Big table, what method should we use to make changes?
3. Solution selectionM in the following represents the main database, S1 is slave 1, and S2 is slave 2
Advantages | Disadvantages | Feasibility | |
---|---|---|---|
Native, when using intermediate temporary table | ALGORITHM=COPY, DML will be blocked. Recommended version>MySQL5.7 | 5 stars | |
Use binlog playback thread instead of trigger | Third-party tools, which lead to longer execution time depending on different parameters | 4 stars | |
Good version compatibility, use triggers to keep the main and secondary tables consistent | Third-party tools, and there are many restrictions on use | 3 stars | ##M-S1-S2 |
Cascade replication, manual operation | 1 star |
Based on the actual situation assessment, the demand on the business side this time is
This table has business traffic 24 hours a day, and does not accept business unavailability for more than 4 hours
OnlineDDL method, when ALGORITHM=COPY, DML will be blocked during the period (read-only). Finally, during the rename operation of the main and secondary tables (not readable and writable), until DDL is completed (the time required is uncertain).
It is recommended to connect to the slave library in Gh-ost mode for conversion. This mode has the least impact on the main library, and the flow control can be set through parameters. The fatal weakness of this tool is that the change takes too long. For a table with 400 million pieces of data, it took 70 hours in the test environment. Finally, we also need to issue a switching command and manually delete the intermediate table *_del. The advantage of this solution is that there is a slave database to ensure data security, so if you use a 1 master 2 slave architecture, it is more recommended. Pt-osc and Gh-ost both belong to third parties. Pt-osc's operation of large tables and OnlineDDL have a common disadvantage, which is that the cost of rollback on failure is very high. If it is a lower version such as MySQL Finally we chose the DBA’s favorite (xin ku) method, which is performed under M-S1-S2 cascade replication. 4. How to operate Create a new S1 slave library and build M-S1-S2 cascade replication- Use OnlineDDL to perform field expansion on S2 (the advantage is that the master of M-S1 is never affected during the period)
- After the expansion is completed, wait for delayed synchronization of M-S1-S2 (reduce S2 Data difference with M, and perform data verification)
- Remove S1 and establish a master-slave relationship between M-S2 (allowing S2 to continue synchronizing M's data)
- Backup S2 and restore S1, establish M-S2-S1 cascade replication
- Stop the application and wait for the master-slave data to be consistent (the advantage is the synchronization time of different data amounts Very short)
- Finally S2 becomes the main library and S1 is the slave library (the application needs to modify the front-end connection information)
- Apply for regression verification
- The above content may seem complicated, but it is essentially backup and recovery. Readers can consider this as an alternative. Share the specific steps?
环境装备:开启Gtid,注意M,S1 binlog保存时长,磁盘剩余空间大于待变更表的2倍 show global variables like 'binlog_expire_logs_seconds'; # 默认604800 set global binlog_expire_logs_seconds=1209600; # 主库和级联主库都需要设置 1.搭建 1主2从的级联复制,M -> S1 -> S2 ,安装MySQL注意本次环境lower_case_table_names = 0 2.在S2 上做字段扩容。 预估 10个小时 `参数设置:` set global slave_type_conversions='ALL_NON_LOSSY'; # 防止复制报错SQL_Errno: 13146,属于字段类型长度不一致无法回放 set global interactive_timeout=144000;set global wait_timeout =144000; `磁盘IO参数设置:` set global innodb_buffer_pool_size=32*1024*1024*1024;# 增加buffer_pool 防止Error1206The total number of locks exceeds the lock table size 资源不足 set global sync_binlog=20000;set global innodb_flush_log_at_trx_commit=2; set global innodb_io_capacity=600000;set global innodb_io_capacity_max=1200000; # innodb_io_capacity需要设置两次 show variables like '%innodb_io%'; # 验证以上设置 screen 下执行: time mysql -S /data/mysql/3306/data/mysqld.sock -p'' dbname -NBe "ALTER TABLE tablename MODIFY COLUMN open_id VARCHAR(500) NULL DEFAULT NULL COMMENT 'Id' COLLATE 'utf8mb4_bin';" 查看DDL进度: SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM performance_schema.events_stages_current; 3.扩容完成后,等待延迟同步M-S1-S2 数据同步至主从一致,对比主从Gtid 4.移除S1,建立M-S2的主从关系 S1 (可选) stop slave; reset slave all; systemctl stop mysql_3306 S2 stop slave; reset slave all; # MASTER_HOST='M主机IP' CHANGE MASTER TO MASTER_HOST='', MASTER_USER='', MASTER_PASSWORD=', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_CONNECT_RETRY=10; start slave; (flush privileges;# 验证数据可正常同步) 5.备份S2恢复S1,建立M-S2-S1级联复制 物理备份S2,重做S2->S1 级联主从 rm -rf binlog/* rm -rf redolog/* xtrabackup --defaults-file=/data/mysql/3306/my.cnf.3306 --move-back --target-dir=/data/actionsky/xtrabackup_recovery/data chown -R mysql. data/ chown -R mysql. binlog/* chown -R mysql. redolog/* systemctl start mysql_3306 set global gtid_purged=''; reset slave all; # MASTER_HOST='S2主机IP' ,已扩容变更完的主机 CHANGE MASTER TO MASTER_HOST='', MASTER_USER='', MASTER_PASSWORD='', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_CONNECT_RETRY=10; `MySQL8.0版本需要在上面语句中添加 GET_MASTER_PUBLIC_KEY=1; #防止 Last_IO_Errno: 2061 message: Authentication plugin 'caching_sha2_password' reported error: Authentication requires secure connection.` start slave; 6.应用停服,等待主从数据一致 主库停服+可设置read_only+flush privileges,对比主从Gtid 7.最终S2成为主库,S1为从库 应用更改配置连接新主库。 S2上: stop slave;reset slave all; set global read_only=0;set global super_read_only=0; `show master status\G 观察是否有新事务写入` 收尾:还原第2步的参数设置。 set global interactive_timeout=28800;set global wait_timeout =28800; set global innodb_buffer_pool_size=8*1024*1024*1024; set global slave_type_conversions=''; set global sync_binlog=1;set global innodb_flush_log_at_trx_commit=1; set global innodb_io_capacity=2000;set global innodb_io_capacity_max=4000;Supplementary scenario: Test based on disk IO capability
Modify directly on the main database without traffic Case:Scenario 1, the disk is a physical machine of NVME, and it takes about 5 hours to process 400 million data (disk performance 1G/s).
Scenario 2, the virtual machine whose disk is a mechanical disk, this amount of data takes about 40 hours (disk performance 100M/s).
The above is the detailed content of How to expand the fields of large tables under MySQL cascade replication. For more information, please follow other related articles on the PHP Chinese website!

本篇文章给大家带来了关于mysql的相关知识,其中主要介绍了关于架构原理的相关内容,MySQL Server架构自顶向下大致可以分网络连接层、服务层、存储引擎层和系统文件层,下面一起来看一下,希望对大家有帮助。

在mysql中,可以利用char()和REPLACE()函数来替换换行符;REPLACE()函数可以用新字符串替换列中的换行符,而换行符可使用“char(13)”来表示,语法为“replace(字段名,char(13),'新字符串') ”。

方法:1、利用right函数,语法为“update 表名 set 指定字段 = right(指定字段, length(指定字段)-1)...”;2、利用substring函数,语法为“select substring(指定字段,2)..”。

mysql的msi与zip版本的区别:1、zip包含的安装程序是一种主动安装,而msi包含的是被installer所用的安装文件以提交请求的方式安装;2、zip是一种数据压缩和文档存储的文件格式,msi是微软格式的安装包。

转换方法:1、利用cast函数,语法“select * from 表名 order by cast(字段名 as SIGNED)”;2、利用“select * from 表名 order by CONVERT(字段名,SIGNED)”语句。

本篇文章给大家带来了关于mysql的相关知识,其中主要介绍了关于MySQL复制技术的相关问题,包括了异步复制、半同步复制等等内容,下面一起来看一下,希望对大家有帮助。

本篇文章给大家带来了关于mysql的相关知识,其中主要介绍了mysql高级篇的一些问题,包括了索引是什么、索引底层实现等等问题,下面一起来看一下,希望对大家有帮助。

在mysql中,可以利用REGEXP运算符判断数据是否是数字类型,语法为“String REGEXP '[^0-9.]'”;该运算符是正则表达式的缩写,若数据字符中含有数字时,返回的结果是true,反之返回的结果是false。


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

mPDF
mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

Notepad++7.3.1
Easy-to-use and free code editor

MinGW - Minimalist GNU for Windows
This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

Atom editor mac version download
The most popular open source editor

SublimeText3 Linux new version
SublimeText3 Linux latest version
