Analyze the usage scenarios of MySQL master-slave synchronous and asynchronous replication
This investigation was prompted by two open questions we had run into earlier:
a. One of our production systems often drops from semi-synchronous replication to asynchronous replication and then switches back to semi-synchronous almost immediately. The exact cause was unknown; we had a guess, but could not confirm it.
b. Some time ago, a business requirement led us to test asynchronous replication across data centers. Under a very high MySQL write QPS, we found that many binlog events had not yet been sent to the slave: the master generated binlog faster than it was transmitted to the slave. Because this gap persisted, the amount of unsent binlog kept growing for as long as the master stayed under heavy write load. Yet the network traffic at the time was only about 18 MB/s (one master, one slave). A gigabit network can normally carry about 100 MB/s, so why did binlog transmission between master and slave only reach about 18 MB/s? Was it a network problem, or something else?
Master-slave replication principle
Dump thread and IO thread
Once the master-slave replication relationship is established, the master runs a dump thread that sends the binlog generated on the master. The slave runs an IO thread that receives, over the network, the binlog sent by the dump thread and writes it to the relay log. This is the basic mechanism of master-slave replication. Because replication is asynchronous, the master does not wait for an ACK from the slave during transmission.
This is where the question arises. Since transmission is asynchronous, if we think of it simply as sending the binlog file straight over the network, it should be very fast. In our test environment, however, binlog was transmitted at only about 18 MB/s, below the roughly 22 MB/s at which it was generated. Why only this speed, and why was the network bandwidth not fully used?
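To make the size of the problem concrete, the following is a minimal sketch of how the unsent binlog grows when generation outpaces transmission. The rates are the approximate figures quoted above; the function name and the assumption that both rates stay constant are illustrative only.

```python
def backlog_after(seconds, produce_mb_s=22.0, send_mb_s=18.0):
    """Return the unsent binlog volume (MB) after `seconds` of sustained load.

    Assumes constant rates, which is a simplification of a real workload.
    """
    # The backlog only grows while production outpaces sending.
    gap = max(produce_mb_s - send_mb_s, 0.0)
    return gap * seconds


for minutes in (1, 10, 60):
    print(f"{minutes:>3} min -> {backlog_after(minutes * 60):.0f} MB unsent")
```

With these figures the gap is 4 MB/s, so roughly 240 MB of binlog is left unsent for every minute the load persists.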
Log delivery details
In this master-slave setup there is one dump thread on the master and one IO thread on the slave, so binlog is not sent or received by multiple threads concurrently. Understanding how the binlog dump thread works is enough to understand all the details.
Parsing a binlog file shows that one transaction can contain multiple events. Below is what the binlog records for the simplest possible transaction:
# at 33580
#170531 17:22:53 server id 153443358 end_log_pos 33645 CRC32 0x4ea17869 GTID last_committed=125 sequence_number=126
SET @@SESSION.GTID_NEXT= 'e1028e43-4123-11e7-a3c2-005056aa17e6:198'/*!*/;
# at 33645
#170531 17:22:53 server id 153443358 end_log_pos 33717 CRC32 0x66820e00 Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1496222573/*!*/;
BEGIN
/*!*/;
# at 33717
#170531 17:22:53 server id 153443358 end_log_pos 33770 CRC32 0x22ddf25e Table_map: `test`.`xcytest` mapped to number 222
# at 33770
#170531 17:22:53 server id 153443358 end_log_pos 33817 CRC32 0x61051ea0 Write_rows: table id 222 flags: STMT_END_F
BINLOG '
bYsuWRMeXCUJNQAAAOqDAAAAAN4AAAAAAAEABHRlc3QAB3hjeXRlc3QAAgMPAlgCAl7y3SI=
bYsuWR4eXCUJLwAAABmEAAAAAN4AAAAAAAEAAgAC//x9AAAABQBzZGZhc6AeBWE=
'/*!*/;
### INSERT INTO `test`.`xcytest`
### SET
### @1=125 /* INT meta=0 nullable=0 is_null=0 */
### @2='sdfas' /* VARSTRING(600) meta=600 nullable=1 is_null=0 */
# at 33817
#170531 17:22:53 server id 153443358 end_log_pos 33848 CRC32 0x630805b4 Xid = 303
COMMIT/*!*/;
Each segment that begins with "# at xxxxx" is one event.
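As a rough illustration of how many events even a one-row transaction produces, the sketch below lists the events in mysqlbinlog text output. It embeds only the "# at" and header lines from the sample above; the function name and regular expressions are assumptions written for this particular output layout (with CRC32 checksums), not a general-purpose parser.

```python
import re

# Header lines copied from the sample transaction above.
SAMPLE = """\
# at 33580
#170531 17:22:53 server id 153443358 end_log_pos 33645 CRC32 0x4ea17869 GTID last_committed=125 sequence_number=126
# at 33645
#170531 17:22:53 server id 153443358 end_log_pos 33717 CRC32 0x66820e00 Query thread_id=4 exec_time=0 error_code=0
# at 33717
#170531 17:22:53 server id 153443358 end_log_pos 33770 CRC32 0x22ddf25e Table_map: `test`.`xcytest` mapped to number 222
# at 33770
#170531 17:22:53 server id 153443358 end_log_pos 33817 CRC32 0x61051ea0 Write_rows: table id 222 flags: STMT_END_F
# at 33817
#170531 17:22:53 server id 153443358 end_log_pos 33848 CRC32 0x630805b4 Xid = 303
"""

START = re.compile(r"^# at (\d+)$")
HEADER = re.compile(
    r"^#\d{6}\s+\S+\s+server id \d+\s+end_log_pos (\d+)\s+CRC32 0x[0-9a-f]+\s+(\w+)"
)


def list_events(text):
    """Return (event_type, start_pos, size_in_bytes) for each event found."""
    events = []
    start = None
    for line in text.splitlines():
        m = START.match(line)
        if m:
            start = int(m.group(1))  # remember where this event begins
            continue
        m = HEADER.match(line)
        if m and start is not None:
            end = int(m.group(1))
            events.append((m.group(2), start, end - start))
            start = None
    return events


for name, pos, size in list_events(SAMPLE):
    print(f"{name:<10} at {pos} ({size} bytes)")
```

For the sample, this finds five events (GTID, Query, Table_map, Write_rows, Xid) totalling 268 bytes, all for a single-row insert.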
The function Binlog_sender::send_events is the one that sends the events in the binlog.
Function parameters:
end_pos: the end position of the binlog file as currently read.
log_cache: holds information about the log being transmitted, including the binlog file and the position up to which it has already been sent.
Function logic analysis:
If the current send position log_pos is less than the end position end_pos, some binlog has not been sent yet, and the function enters a loop.
Loop body:
a. Call read_event to fetch one event.
b. Log_event_type event_type= (Log_event_type)event_ptr[EVENT_TYPE_OFFSET];
This statement reads the event type, which is then checked with check_event_type(event_type, log_file, log_pos). If the check fails, 1 is returned immediately to the caller.
c. log_pos= my_b_tell(log_cache); This updates log_pos, moving the binlog read cursor forward to the current position.
d. Call send_packet() to send the event.
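The loop described above can be sketched in Python as follows. This is a toy model, not the MySQL implementation: it borrows the function names from the description, builds events with the 19-byte common header of the binlog format (type code at byte 4, event length at byte 9), and replaces check_event_type with a simple membership test against a hypothetical set of accepted type codes.

```python
import io
import struct

EVENT_TYPE_OFFSET = 4   # byte offset of the type code in the common header
EVENT_LEN_OFFSET = 9    # byte offset of the 4-byte event length
HEADER_LEN = 19         # size of the common event header
# Hypothetical whitelist standing in for check_event_type:
# Query, Xid, Table_map, Write_rows, Gtid.
KNOWN_TYPES = {2, 16, 19, 30, 33}


def make_event(type_code, body):
    """Build a toy event: common header followed by an opaque body."""
    size = HEADER_LEN + len(body)
    # timestamp, type, server_id, event size, next position, flags
    header = struct.pack("<IBIIIH", 0, type_code, 1, size, 0, 0)
    return header + body


def read_event(log):
    """Read exactly one event from the log, or None if the header is truncated."""
    header = log.read(HEADER_LEN)
    if len(header) < HEADER_LEN:
        return None
    (size,) = struct.unpack_from("<I", header, EVENT_LEN_OFFSET)
    return header + log.read(size - HEADER_LEN)


def send_events(log, end_pos, send_packet):
    """Send events one at a time until end_pos; return 0 on success, 1 on error."""
    log_pos = log.tell()
    while log_pos < end_pos:
        event = read_event(log)                 # step a
        if event is None:
            return 1
        event_type = event[EVENT_TYPE_OFFSET]   # step b
        if event_type not in KNOWN_TYPES:
            return 1
        log_pos = log.tell()                    # step c
        send_packet(event)                      # step d: one packet per event
    return 0


data = b"".join(make_event(t, b"x" * 40) for t in (33, 2, 19, 30, 16))
packets = []
status = send_events(io.BytesIO(data), len(data), packets.append)
print(status, len(packets))
```

The point of the sketch is the shape of the loop: every event costs one read, one type check, and one send call, regardless of how much binlog is waiting behind it.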
So no matter how much binlog is still waiting to be synchronized to the slave, the master sends it at event granularity, one event at a time, and checks the type of each event before sending it. Because the data goes out in small packets, network throughput stays low.
This only shows up under certain conditions, though. In our test environment the database was handling more than 50,000 write QPS at the time, so a very large number of events had to be sent. Even with asynchronous replication, the single dump thread could not keep up with the log being generated.
In short, when write QPS is very high, binlog may be generated faster than it can be sent.
Summary
Now let us return to the problem seen in production: "replication often drops from semi-synchronous to asynchronous and then quickly returns to semi-synchronous." The cause is that this database backs an analytics system, which sometimes runs batch updates and batch imports. The binlog format is set to row, so a transaction that changes many rows produces many row events. Sending the binlog for such a transaction takes a long time and cannot finish within 1 second (the semi-sync timeout is set to 1 second), so replication falls back from semi-synchronous to asynchronous.
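The fallback described here can be illustrated with a very small model. Everything numeric in it is hypothetical: the per-event send cost of 20 microseconds and the event counts are made up for illustration, and the model ignores network latency and the slave's ACK time. Only the 1-second timeout comes from the text.

```python
def commit_mode(num_events, per_event_s, timeout_s=1.0):
    """Classify one transaction as 'semi-sync' or 'async'.

    Assumes the time to deliver a transaction is simply
    num_events * per_event_s, and that the master falls back to
    asynchronous mode when that time exceeds the timeout.
    """
    send_time = num_events * per_event_s
    return "async" if send_time > timeout_s else "semi-sync"


# Events per transaction; the third entry stands for a large batch import.
workload = [5, 5, 200_000, 5]
modes = [commit_mode(n, per_event_s=20e-6) for n in workload]
print(modes)
```

In this model only the batch transaction exceeds the timeout, and the small transactions that follow are back in semi-synchronous mode, which matches the brief flapping seen in production. On a real server, the Rpl_semi_sync_master_status status variable shows the current mode, and the timeout itself is configured through rpl_semi_sync_master_timeout, which is expressed in milliseconds.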