This article brings you relevant knowledge about MySql, including mainly introducing the master-slave synchronization of MySQL and its working principle, etc. Friends who are interested can take a look at it together. Welcome to collect and learn!
1 Introduction
Hello everyone, Mysql is the most commonly used database for everyone. The following is to share the knowledge points of mysql master-slave synchronization for everyone to consolidate. Mysql basic knowledge, if there are any mistakes, please correct me.
2 Overview of MySql master-slave synchronization
MySQL master-slave synchronization, namely MySQL Replication, can synchronize data from one database server to multiple database servers. The MySQL database comes with a master-slave synchronization function. After configuration, it can realize master-slave synchronization in various schemes based on database and table structures.
Redis is a high-performance in-memory database, but it is not the protagonist today; MySQL is a relational database based on disk files. Compared with Redis, the reading speed will be slower, but it is powerful. Can be used to store persistent data. In actual work, we often use Redis as a cache in conjunction with MySQL. When there is a data access request, it will first be searched from the cache. If it exists, it will be taken out directly. If it does not exist, the database will be accessed again. This improves the performance. The efficiency of reading also reduces the access pressure on the back-end database. Using a cache architecture like Redis is a very important part of a high-concurrency architecture.
As the business volume continues to grow, the pressure on the database will continue to increase. Frequent changes in the cache are also strongly dependent on the data query results, resulting in low data query efficiency. High load, too many connections, etc. For e-commerce scenarios, there are often many typical scenarios of more reading and less writing. We can use a master-slave architecture for MySQL and separate reading and writing, so that the master server (Master) handles write requests and the slave server (Slave) handles read requests. This can further improve the concurrent processing capabilities of the database. As shown in the picture below:
In the above picture, you can see that we have added two slave libraries. These two slave libraries can withstand a large number of read requests together and share the main load. library pressure. The slave database will continuously synchronize data from the master database through master-slave replication to ensure that the data in the slave database is consistent with the data in the master database.
Next, let’s take a look at the functions of master-slave synchronization and how master-slave synchronization is implemented.
3 The role of master-slave synchronization
Generally speaking, not all systems need to design a master-slave architecture for the database, because the architecture itself has a certain cost Yes, if our purpose is to improve the efficiency of high concurrent access to the database, then we should first optimize the SQL statements and indexes to give full play to the maximum performance of the database; secondly, adopt a caching strategy, such as using caching tools such as Redis and MongoDB, through which The advantage of high performance is to store data in an in-memory database to improve reading efficiency. Finally, the database adopts a master-slave architecture to separate reading and writing. The cost of system use and maintenance gradually increases according to the upgrade of the architecture.
Getting back to the subject, master-slave synchronization can not only improve the throughput of the database, but also has the following three aspects:
3.1 Read and write separation
We can synchronize data through master-slave replication, and then improve the concurrent processing capabilities of the database through read-write separation. To put it simply, our data is placed in multiple databases, one of which is the Master database, and the rest are Slave databases. When the data in the main database changes, the data will be automatically synchronized to the slave database, and our program can read data from the slave database, that is, using a read-write separation method. E-commerce applications are often "read more and write less", and the separation of reading and writing can achieve higher concurrent access. Originally, all read and write pressures were borne by one server. Now, multiple servers jointly handle read requests, reducing the pressure on the main database. In addition, load balancing can be performed on the slave servers, so that different read requests can be evenly distributed to different slave servers according to policies, making reading smoother. Another reason for smooth reading is to reduce the impact of lock tables. For example, if we let the main library be responsible for writing, when a write lock occurs in the main library, it will not affect the query operation of the slave library.
3.2 Data Backup
Master-slave synchronization is also equivalent to a data hot backup mechanism, which is backed up under the normal operation of the main database without affecting the provision of data services.
3.3 High Availability
Data backup is actually a redundant mechanism. Through this redundancy method, the high availability of the database can be exchanged. When the server fails, , downtime and other unavailable situations, failover can be quickly performed and the slave database can act as the master database to ensure the normal operation of the service. You can learn about the high availability SLA indicators of the e-commerce system database.
4 The principle of master-slave synchronization
Speaking of the principle of master-slave synchronization, we need to understand an important log file in the database, which is the Binlog binary file, which records events that update the database. In fact, the principle of master-slave synchronization is based on Binlog. Data is synchronized.
In the process of master-slave replication, the operation will be based on three threads. One is the binlog dump thread, located on the master node. The other two threads are the I/O thread and the SQL thread. They are respectively Located on the slave node, as shown below:
##Combined with the above pictures, let’s understand the core process of master-slave replication:5 How to solve the data consistency problem of master-slave synchronization
As you can imagine, if the data we want to operate are all stored in the same database, then the When the data is updated, a write lock can be added to the record, so that data inconsistency will not occur when reading. But at this time, the role of the slave library is to back up data, without separation of reading and writing, so as to share the pressure of the main library. Therefore, we also need to find a way to solve the problem of data inconsistency in master-slave synchronization when separating reading and writing, that is, to solve the problem of data replication between master and slave. If the slave follows data consistency To divide from weak to strong, there are the following three replication methods.5.1 Fully synchronous replication
First of all, fully synchronous replication means that after the master database completes a transaction, all slave databases must also complete the transaction. Only then can the processing results be returned to the client; therefore, although the data consistency of fully synchronous replication is guaranteed, the master library needs to wait for all slave libraries to complete a transaction, and the performance is relatively low. As shown below:5.2 Asynchronous replication
Asynchronous replication means that when the main library submits things, the binlog dump thread will be notified to send The binlog log is sent to the slave library. Once the binlog dump thread sends the binlog log to the slave library, there is no need to wait for the slave library to complete the transaction synchronously. The master library will return the processing results to the client. Because the main library only needs to complete the transaction by itself, it can return the processing results to the client without caring about whether the slave library has completed the transaction. This may lead to short-term master-slave data inconsistencies, such as If you query the new data just inserted in the master database immediately from the slave database, you may not be able to query it.Furthermore, if the main database crashes after submitting the transaction, the binlog may not have time to be synchronized to the slave database. At this time, if the master-slave node is switched for recovery failure, data loss will occur. Therefore, although asynchronous replication has high performance, it is the weakest in terms of data consistency.
Mysql master-slave replication adopts asynchronous replication by default.
After MySQL version 5.5, semi-synchronous replication is supported. The principle is that after the client submits COMMIT, the result is not returned directly to the client, but rather waits for at least one slave library to receive the Binlog and write it to the relay log before returning it to the client. The advantage of this is that it improves the consistency of the data. Of course, compared with asynchronous replication, it increases the delay of at least one more network connection and reduces the efficiency of writing to the main database.
In the MySQL5.7 version, a rpl_semi_sync_master_wait_for_slave_count parameter has also been added. We can set the number of slave libraries that need to respond. The default is 1, which means that as long as one slave library responds, it can return to the client. If you increase this parameter, you can improve the strength of data consistency, but it will also increase the time the master database waits for the slave database to respond.
However, semi-synchronous replication also has the following problems:
When the main library successfully submits the transaction and is in the process of waiting for confirmation from the slave library, at this time, the slave library has not had time to return the processing results to the client, but because the main library storage engine has already submitted the transaction, , so other clients can read data from the main library.
However, if the main library suddenly hangs up in the next second, and the next request comes in at this time, because the main library hangs up, the request can only be switched to the slave library, because the slave library has not yet received the request from the master library. The database has synchronized the data, so of course this data cannot be read from the database. Compared with the result of reading the data in the previous second, the phenomenon of phantom reading is caused.
Enhanced semi-synchronous replication is an improvement made to semi-synchronous replication in versions after mysql 5.7.2. The principle is almost the same, and it mainly solves the problem of magic. Reading question.
After the master library is configured with the parameter rpl_semi_sync_master_wait_point = AFTER_SYNC, before the storage engine commits the transaction, the master library must receive confirmation of completion of data synchronization from the slave library before submitting the transaction, thereby solving the phantom read problem. . Refer to the picture below:
6 Summary
Through the above content, we understand the master-slave synchronization of the Mysql database. If your The goal is only high concurrency of the database, so you can first consider optimization from aspects such as SQL optimization, indexing, and Redis cache data, and then consider whether to adopt a master-slave architecture.
In the configuration of the master-slave architecture, if we want to adopt a read-write separation strategy, we can write our own program or implement it through third-party middleware.
The advantage of writing your own program is that it is more independent. We can judge which queries to execute on the slave database. For high real-time requirements, we can also consider which queries can be executed on the main database. At the same time, the program directly connects to the database, reducing the middleware layer and reducing some performance losses.
The method of using middleware has obvious advantages, it is powerful and easy to use. However, there will be some performance loss due to the addition of a middleware layer between the client and the database. At the same time, the price of commercial middleware is relatively high, and there is a certain learning cost. In addition, we can also consider using some excellent open source tools, such as MaxScale. It is MySQL data middleware developed by MariaDB. For example, in the figure below, MaxScale is used as the database proxy, and read-write separation is completed through routing and forwarding. At the same time, we can also use the MHA tool as a strongly consistent master-slave switching tool to complete the high-availability architecture of MySQL.
Recommended study: "MySQL Video Tutorial"
The above is the detailed content of A thorough understanding of MySql master-slave synchronization in one article. For more information, please follow other related articles on the PHP Chinese website!