Mysql master-slave basic principles, main forms and master-slave synchronization delay principle (read-write separation) lead to master-slave data inconsistency problems and solutions
1. The difference between master and slave databases
The slave database (Slave) is the backup of the master database. When the master database (Master) changes, the slave database must be updated. These database software Update cycles can be designed. This is a means to improve information security. The master and slave database servers are not in the same geographical location, so the database can be saved when an accident occurs.
(1) Master-slave division of labor
The Master is responsible for the load of write operations, which means that all write operations are performed on the Master, while read operations are Allocate to Slave. This can greatly improve reading efficiency. In general Internet applications, after some data surveys, it is concluded that the read/write ratio is about 10:1, which means that a large number of data operations are concentrated on read operations, which is why we have multiple Slave s reason. But why separate reading and writing? R&D personnel who are familiar with DB all know that write operations involve locking issues, whether it is row locks, table locks or block locks, which relatively reduce system execution efficiency. Our separation is to concentrate write operations on one node, while read operations are performed on other N nodes. This effectively improves the reading efficiency and ensures the high availability of the system.
(2) Basic process
1) Mysql’s master-slave synchronization means that when data changes in the master (main database), it will be synchronized to the slave (slave database) in real time. .
2) Master-slave replication can horizontally expand the database’s load capacity, fault tolerance, high availability, and data backup.
3) Whether it is delete, update, insert, or create functions or stored procedures, they are all on the master. When the master has operations, the slave will quickly receive these operations and perform synchronization.
(3) Purpose and conditions
1), mysql master-slave replication purpose
●Real-time disaster recovery, used for failover
●Separation of reading and writing, providing query services
●Backup to avoid affecting the business
2), master-slave deployment necessary conditions:
●Enable binlog logs in the main database (set the log-bin parameter )
●The master-slave server-id is different
●The slave server can connect to the master library
2. The granularity, principle and form of master-slave synchronization:
(1). Three main implementation granularities
Detailed master-slave synchronization mainly has three forms: statement, row, mixed
1), statement: yes Write the sql statements for database operations into the binlog
2), row: Write the changes in each piece of data into the binlog.
3), mixed: a mixture of statement and row. Mysql decides when to write binlog in statement format and when to write binlog in row format.
(2), main implementation principles, specific operations, schematic diagram
1), operations on the master machine:
When the data on the master changes, the event changes will be written to the bin-log in order. When the slave is connected to the master, the master machine will start the binlog dump thread for the slave. When the master's binlog changes, the bin-log dump thread will notify the slave and send the corresponding binlog content to the slave.
2), operate on the slave machine:
When master-slave synchronization is turned on, two threads will be created on the slave: I\O thread. This thread is connected to the master machine, and the binlog dump thread on the master machine will send the contents of the binlog to the I\O thread. After receiving the binlog content, the I/O thread writes the content to the local relay log; sql thread. This thread reads the ralay log written by the I/O thread. And according to relay log. And perform corresponding operations on the slave database according to the contents of the relay log.
3) The schematic diagram of MySQL master-slave replication is as follows:
The slave library generates two threads, one I/O thread and one SQL Thread;
i/o thread requests the binlog of the main library and writes the obtained binlog log to the relay log (relay log) file;
The main library will generate a log dump thread to provide the slave with The library i/o thread transmits binlog;
SQL thread will read the log in the relay log file and parse it into specific operations to achieve consistent master-slave operations and consistent final data;
(2), master-slave form
mysql master-slave replication is flexible
● One master and one slave
● Master-master replication
● One master and multiple slaves ---Extension System read performance, because the read is read from the slave library;
● Multiple masters and one slave---supported since 5.7
● Cascade replication---
3. Problems, causes and solutions such as delay in master-slave synchronization:
(1). Delay problem in mysql database slave synchronization
1) Related parameters:
First execute show slave satus on the server; you can see many synchronized parameters:
Master_Log_File: The name of the master server binary log file that the I/O thread in SLAVE is currently reading.
Read_Master_Log_Pos: In the current master server binary log, the position that the I/O thread in SLAVE has read.
Relay_log_file: The name of the SQL thread is currently reading and executing the name of the relay log file
Relay_log_pos: In the current relay log, the location of the SQL thread read and executes
Relay_master_log_file: The name of the master binary log file that contains the most recent events
Slave_IO_Running: . The time gap between the server SQL thread and the slave server I/O thread, in seconds.
The slave synchronization delay occurs● The show slave status display parameter Seconds_Behind_Master is not 0, and this value may be very large
● The show slave status display parameter Relay_Master_Log_File and Master_Log_File show that the bin-log numbers are very different, indicating that The bin-log is not synchronized in time on the slave database, so the recently executed bin-log and the bin-log read by the current IO thread are very different
● There are a large number of mysql-relay-log logs in the data directory of the mysql slave database , the log will be automatically deleted by the system after the synchronization is completed. There are a large number of logs, indicating that the master-slave synchronization delay is very severe
1), MySQL database master-slave synchronization delay principle mysql master-slave synchronization principle: the main library writes binlog sequentially for write operations, and the single thread from the slave library goes to the main library to read and write sequentially "Operation binlog", fetch the binlog from the library and execute it locally (randomly written) to ensure that the master-slave data is logically consistent. The master-slave replication of mysql is a single-threaded operation. The main library generates binlog for all DDL and DML. The binlog is written sequentially, so it is very efficient. The slave's Slave_IO_Running thread goes to the main library to get the log, which is more efficient. Next step, question Here, the slave's Slave_SQL_Running thread implements the DDL and DML operations of the main library on the slave. The IO operations of DML and DDL are random, not sequential, and the cost is much higher. Other queries on the slave may also cause lock contention. Since Slave_SQL_Running is also single-threaded, a DDL card master will take 10 minutes to execute. Then all subsequent DDL will wait for this DDL to be executed before continuing, which leads to delays. Some friends will ask: "The same DDL on the main library also needs to be executed for 10 minutes. Why is the slave delayed?" The answer is that the master can run concurrently, but the Slave_SQL_Running thread cannot.
2) How does the master-slave synchronization delay in the MySQL database occur? When the TPS concurrency of the main library is high and the number of DDL generated exceeds what one SQL thread of the slave can bear, delays will occur. Of course, there may also be lock waits caused by the slave's large query statements. Primary reason: The database is under too much pressure to read and write in business, the CPU computing load is heavy, the network card load is heavy, and the hard disk random IO is too high. Secondary reasons: The performance impact of reading and writing binlog, and network transmission delay.
1), Architecture aspect
1. The implementation of the business persistence layer adopts a sub-database architecture, and the mysql service can be expanded in parallel to spread the pressure.
2. Separate reading and writing in a single library, one master and multiple slaves, master writes and slaves read, to spread the pressure. In this way, the pressure of the slave library is higher than that of the main library, protecting the main library.
3. The service infrastructure adds memcache or redis cache layer between the business and mysql. Reduce the read pressure of mysql.
4. MySQL for different businesses is physically placed on different machines to spread the pressure.
5. Use better hardware equipment than the main database as slave summary. MySQL has less pressure and the delay will naturally become smaller.
2) Hardware aspects1. Use a good server, for example, 4u has significantly better performance than 2u, and 2u has significantly better performance than 1u.
2. Use SSD or disk array or san for storage to improve random write performance.
3. The master and slave must be under the same switch and in a 10G environment.
Summary, if the hardware is strong, the delay will naturally become smaller. In short, the solution to minimizing latency is money and time.
3), mysql master-slave synchronization acceleration1, sync_binlog is set to 0 on the slave side
2, –logs-slave-updates slave Updates received by the server from the master server are not logged in its binary log.
3. Directly disable the binlog on the slave side
4. On the slave side, if the storage engine used is innodb, innodb_flush_log_at_trx_commit =2
4), from the file system Optimization of own attribute angleThe master side modifies the etime attribute of files in Linux and Unix file systems. Since the OS will write back the time when the read operation occurs to the disk every time a file is read, this is not possible for database files with frequent read operations. If necessary, it will only increase the burden on the disk system and affect I/O performance. You can organize the operating system to write atime information by setting the mount attribute of the file system. The operation on Linux is: open /etc/fstab, add the noatime parameter /dev/sdb1 /data reiserfs noatime 1 2 and then remount the file system #mount -oremount /data 5), Synchronization parameter adjustment The main library is written, which has higher data security, such as sync_binlog=1, innodb_flush_log_at_trx_commit = 1 and other settings are needed Slave does not need such high data security. You can set sync_binlog to 0 or turn off binlog. Innodb_flushlog can also be set to 0 to improve the execution efficiency of SQL. 1, sync_binlog=1 oMySQL provides a sync_binlog parameter to The binlog of the control database is flushed to the disk. By default, sync_binlog=0, which means that MySQL does not control the refresh of the binlog, and the file system itself controls the refresh of its cache. The performance at this time is the best, but the risk is also the highest. Once the system crashes, all binlog information in binlog_cache will be lost. If sync_binlog>0, it means that every sync_binlog transaction is submitted, MySQL calls the refresh operation of the file system to flush the cache. The safest setting is sync_binlog=1, which means that every time a transaction is submitted, MySQL will flush the binlog. It is the safest setting but has the greatest performance loss. In this case, the system may lose one transaction's data only if the operating system of the host where the database is located is damaged or suddenly loses power. However, although binlog is sequential IO, setting sync_binlog=1 and multiple transactions are submitted at the same time will also greatly affect MySQL and IO performance. Although it can be alleviated through group commit patches, too high a refresh frequency will also have a great impact on IO. For systems with high concurrent transactions, the write performance gap between systems with "sync_binlog" set to 0 and set to 1 may be as high as 5 times or more. Therefore, the sync_binlog set by many MySQL DBAs is not the safest 1, but 2 or 0. This sacrifices a certain amount of consistency to achieve higher concurrency and performance. By default, the binlog is not synchronized with the hard disk every time it is written. So if the operating system or machine (not just the MySQL server) crashes, it is possible that the last statement in the binlog is lost. To prevent this, you can use the sync_binlog global variable (1 is the safest value, but also the slowest) to synchronize binlog with the hard disk after every N binlog writes. Even if sync_binlog is set to 1, when a crash occurs, there may be inconsistencies between the table content and the binlog content. 2. innodb_flush_log_at_trx_commit (this is very useful) complaining that Innodb is 100 times slower than MyISAM? Then you probably forgot to adjust this value. The default value of 1 means that every transaction commit or instruction outside the transaction needs to write the log to the hard disk (flush), which is very time-consuming. Especially when using battery backed up cache. Setting it to 2 is fine for many applications, especially those transferred from MyISAM tables. It means not writing to the hard disk but writing to the system cache. The logs are still flushed to disk every second, so you generally won't lose more than 1-2 seconds of updates. Setting it to 0 will be faster, but the security is poor. Even if MySQL hangs up, transaction data may be lost. The value 2 will only cause data loss when the entire operating system hangs. 3. The ls(1) command can be used to list the atime, ctime and mtime of the file. atime The access time of the file ctime changes when the file is read or executed The create time of the file changes as the contents of the inode change when writing the file, changing the owner, permissions or link settings mtime of the file modified time changes as the file content changes when writing the file ls -lc filename lists the ctimels of the file -lu filename lists the atimels of the file -l filename lists the mtimestat of the file filename lists atime, mtime, ctimeatime are not necessarily in The file is modified after accessing because: when using the ext3 file system, if the noatime parameter is used during mount, the atime information will not be updated. These three time stamps are placed in the inode. If mtime and atime are modified, the inode will definitely change. Since the inode is changed, the ctime will also be changed. The reason why noatime is used in the mount option is that the file system does not want to do it. Too many modifications to improve reading performance (4), MySql database synchronization other problems and solutions 1) Problems with mysql master-slave replication: ● After the main database is down, data may be lost ● The slave database has only one sql Thread, the main database is under great write pressure, and replication is likely to be delayed 2). Solutions: ● Semi-synchronous replication---solve the problem of data loss ● Parallel replication---solve the problem of replication delay from the database 3), semi-sync replication mysql semi-sync (semi-synchronous replication) semi-synchronous replication: ● 5.5 is integrated into mysql, exists in the form of a plug-in, and needs to be installed separately ● Ensure that the binlog is at least 1 after the transaction is submitted Transfer to a slave library ● There is no guarantee that the slave library has finished applying the binlog of this transaction ● Performance will be reduced to a certain extent, and the response time will be longer ● Network abnormality or slave library downtime will cause the main library to be stuck until timeout or slave library recovery 4) , Master-slave replication--Asynchronous replication principle, semi-synchronous replication and parallel replication principle comparison a. Asynchronous replication principle: b. Semi-synchronous replication Principle: After the transaction writes the binlog in the main library, it needs to return an accepted message from the library before returning it to the client; 5.5 is integrated into mysql and exists in the form of a plug-in, which needs to be installed separately to ensure the transaction After submission, the binlog is transmitted to at least one slave library. There is no guarantee that the binlog performance of the slave library application to complete this transaction will be reduced to a certain extent due to network abnormalities or slave library downtime, and the master library will be stuck until it times out or the slave library recovers c. Parallel replication mysql parallel replication ● New in Community Edition 5.6 ● Parallel refers to multi-thread apply binlog from the library ● Binlog is applied in parallel at the library level, and data changes in the same library are still serial (parallel replication in version 5.7 is based on Transaction group) set set global slave_parallel_workers=10; set the number of sql threads to 10 Principle: multi-thread apply binlog from the library. In community 5.6, a new library level parallel application binlog is added. Data changes in the same library are still serial. The 5.7 version of parallel replication is based on transaction groups For more MySQL-related technical articles, please visit the MySQL Tutorial column to learn!
The above is the detailed content of Reasons and solutions for MySQL master-slave synchronization delay. For more information, please follow other related articles on the PHP Chinese website!