Since MySQL data is already operated on in the business layer, it is natural to synchronize other data sources there as well. A common approach is to write the synchronization code in ORM hooks.
The disadvantage of this method is that as the number of services grows, the synchronization code becomes scattered across them and hard to update and iterate on. For example, an incompatible migration of an ES index may force changes everywhere, because one small change affects the whole system.
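As a rough illustration of the hook approach, the sketch below uses a toy model class with after-save and after-delete hooks rather than any real ORM. `Model`, `FakeSearchIndex`, and the hook lists are all hypothetical names; `FakeSearchIndex` is only an in-memory stand-in for an Elasticsearch client.

```python
from typing import Callable, Dict, List


class FakeSearchIndex:
    """In-memory stand-in for an Elasticsearch index (hypothetical)."""

    def __init__(self) -> None:
        self.docs: Dict[int, dict] = {}

    def index(self, doc_id: int, body: dict) -> None:
        self.docs[doc_id] = dict(body)

    def delete(self, doc_id: int) -> None:
        self.docs.pop(doc_id, None)


class Model:
    """Toy ORM base class exposing after-save and after-delete hooks."""

    after_save_hooks: List[Callable[["Model"], None]] = []
    after_delete_hooks: List[Callable[["Model"], None]] = []

    def __init__(self, id: int, **fields) -> None:
        self.id = id
        self.fields = fields

    def save(self) -> None:
        # A real ORM would write the row to MySQL here.
        for hook in self.after_save_hooks:
            hook(self)

    def delete(self) -> None:
        # A real ORM would delete the MySQL row here.
        for hook in self.after_delete_hooks:
            hook(self)


search_index = FakeSearchIndex()
Model.after_save_hooks.append(lambda m: search_index.index(m.id, m.fields))
Model.after_delete_hooks.append(lambda m: search_index.delete(m.id))

article = Model(1, title="hello")
article.save()  # the hook mirrors the write into the search index
```

Every service that owns a model has to register hooks like these, which is where the scattering described above comes from.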
When the application architecture evolves into microservices, each service may no longer call MySQL directly but go through a layer of middleware instead. In that case, the middleware can synchronize other data sources while it operates on MySQL.
This method requires adapting the middleware, which adds some complexity.
Add a special field to the MySQL table structure, such as updated_at (the time the data was last updated). A scheduled task then queries on this field to find the rows that have actually changed, which gives incremental updates of the data.
You can use the open source Logstash to complete this method.
Of course, the disadvantage is also obvious: deletions cannot be synchronized, because a deleted row is no longer there for the scheduled query to find.
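The polling idea can be sketched as follows. This is a minimal illustration under stated assumptions, not how Logstash is implemented: `sqlite3` stands in for MySQL so the snippet is self-contained, a plain dict stands in for the ES index, and the `articles` table and `sync_changed_rows` function are hypothetical names.

```python
import sqlite3
from typing import Dict


def sync_changed_rows(conn: sqlite3.Connection, index: Dict[int, dict], last_sync: int) -> int:
    """Copy rows changed after `last_sync` into `index`; return the new watermark."""
    rows = conn.execute(
        "SELECT id, title, updated_at FROM articles "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_sync,),
    ).fetchall()
    for row_id, title, updated_at in rows:
        index[row_id] = {"title": title}
        last_sync = max(last_sync, updated_at)
    return last_sync


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, updated_at INTEGER)")
conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", [(1, "a", 100), (2, "b", 110)])

index: Dict[int, dict] = {}
watermark = sync_changed_rows(conn, index, 0)  # first run copies both rows

conn.execute("UPDATE articles SET title = 'a2', updated_at = 120 WHERE id = 1")
conn.execute("DELETE FROM articles WHERE id = 2")
watermark = sync_changed_rows(conn, index, watermark)  # picks up the update, not the delete
```

After the second run, row 1 is updated in the index, but the deleted row 2 is still there, which is exactly the limitation noted above.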
A well-known example is canal. It disguises itself as a MySQL slave and parses MySQL's binary log to learn about data changes.
This is a relatively mature solution in the industry.
This method requires you to set MySQL's binlog-format to ROW mode.
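The consumer side of such a pipeline can be sketched as below. The event dictionaries are a hypothetical, simplified shape and not canal's actual message format, and a plain dict again stands in for the ES index.

```python
from typing import Dict, Iterable


def apply_row_events(events: Iterable[dict], index: Dict[int, dict]) -> None:
    """Apply row-level change events to an in-memory stand-in for an ES index."""
    for event in events:
        kind = event["type"]
        if kind in ("insert", "update"):
            row = event["row"]  # full row image after the change
            index[row["id"]] = {k: v for k, v in row.items() if k != "id"}
        elif kind == "delete":
            index.pop(event["row"]["id"], None)
        else:
            raise ValueError(f"unknown event type: {kind}")


index: Dict[int, dict] = {}
apply_row_events(
    [
        {"type": "insert", "row": {"id": 1, "title": "a"}},
        {"type": "insert", "row": {"id": 2, "title": "b"}},
        {"type": "update", "row": {"id": 1, "title": "a2"}},
        {"type": "delete", "row": {"id": 2, "title": "b"}},
    ],
    index,
)
```

Because row-level events carry the affected row itself, deletions can be mirrored as well, unlike with the scheduled query on an update-time field.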
MySQL's binlog has three formats:
ROW mode: the binlog records data changes row by row;
STATEMENT mode: the binlog records SQL statements;
MIXED mode: a mix of the two, where a record may be an SQL statement or a ROW-mode row change.
In some cases, your MySQL binlog may not be set to ROW mode. We can still parse the binlog in the same way to complete the synchronization, but what comes out of the parsing is then the original SQL statement or a ROW-mode row change. We need to parse these SQL statements or row changes according to the business, for example with regular-expression matching or an AST (abstract syntax tree), and then synchronize the data based on the parsed results.
The limitations of this method are also obvious. First, the SQL parsing has to be adapted to your business. Second, batch update scenarios may be difficult to handle. Of course, if your data is only ever modified or deleted by primary key, this method works better.
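For the simple primary-key case, regular-expression matching might look like the sketch below. It assumes the primary key column is named `id` and only recognizes single-table UPDATE and DELETE statements filtered by `id = <number>`; `parse_pk_statement` is a hypothetical helper, and this is not a general SQL parser.

```python
import re
from typing import Optional

# Matches only the simplest forms: UPDATE or DELETE on one table,
# with a WHERE clause that is exactly `id = <number>`.
_BY_PK = re.compile(
    r"^\s*(?P<op>UPDATE|DELETE\s+FROM)\s+`?(?P<table>\w+)`?\s+"
    r".*?\bWHERE\s+`?id`?\s*=\s*(?P<id>\d+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_pk_statement(sql: str) -> Optional[dict]:
    """Return the operation, table, and primary key, or None if unsupported."""
    m = _BY_PK.match(sql)
    if m is None:
        return None  # batch updates and other forms are deliberately rejected
    op = "update" if m.group("op").upper() == "UPDATE" else "delete"
    return {"op": op, "table": m.group("table"), "id": int(m.group("id"))}
```

For an update, the statement only identifies which row changed, so a sync job would typically re-read that row from MySQL before indexing it. Statements the pattern does not recognize, such as an UPDATE filtered by a non-key column, return `None` and need separate handling.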