What are the ways to synchronize MySQL data to Elasticsearch?

王林
2023-05-30 20:49:04

1. Business layer synchronization

Because writes to MySQL already happen in the business layer, it is natural to synchronize other data stores there as well. A common approach is to put the synchronization code in the ORM's hooks.

The drawback of this method is that as the number of services grows, the synchronization code becomes scattered and hard to iterate on. For example, an incompatible migration of an ES index can ripple through every service that writes to it.
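The hook approach can be sketched as follows. This is a minimal illustration, not a real ORM: the `Model` class, its hook lists, and the in-memory `es_index` dictionary are all hypothetical stand-ins for an ORM model, its lifecycle hooks, and an Elasticsearch index.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an Elasticsearch index: document id -> document.
es_index: Dict[int, dict] = {}


class Model:
    """Tiny stand-in for an ORM model with after-save / after-delete hooks."""

    _table: Dict[int, dict] = {}  # stand-in for the MySQL table
    _after_save: List[Callable[[dict], None]] = []
    _after_delete: List[Callable[[int], None]] = []

    @classmethod
    def save(cls, row: dict) -> None:
        cls._table[row["id"]] = dict(row)  # the "MySQL" write
        for hook in cls._after_save:       # then run every registered hook
            hook(row)

    @classmethod
    def delete(cls, row_id: int) -> None:
        cls._table.pop(row_id, None)
        for hook in cls._after_delete:
            hook(row_id)


def sync_to_es(row: dict) -> None:
    """Hook: mirror the saved row into the search index."""
    es_index[row["id"]] = {"title": row["title"]}


def remove_from_es(row_id: int) -> None:
    """Hook: drop the document when the row is deleted."""
    es_index.pop(row_id, None)


Model._after_save.append(sync_to_es)
Model._after_delete.append(remove_from_es)

Model.save({"id": 1, "title": "hello"})
Model.save({"id": 2, "title": "world"})
Model.delete(2)
print(es_index)  # {1: {'title': 'hello'}}
```

Every service that owns a model needs hooks like these, which is where the scattering described above comes from.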

2. Middleware synchronization

When the application architecture evolves into microservices, each service may no longer call MySQL directly but go through a layer of middleware instead. In that case, the middleware can write to MySQL and synchronize the other data stores at the same time.

This method requires adapting the middleware, which adds some complexity.
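As a rough sketch of this idea, the class below plays the role of the data-access middleware. The `DataMiddleware` class, the `es_sink` function, and the in-memory dictionaries are hypothetical names chosen for illustration; a real middleware would talk to MySQL and Elasticsearch over the network and would need error handling.

```python
from typing import Callable, Dict, List, Tuple

# A sink receives (operation, table, row) for every write.
Sink = Callable[[str, str, dict], None]


class DataMiddleware:
    """Hypothetical data-access layer shared by all services."""

    def __init__(self) -> None:
        self.mysql: Dict[Tuple[str, int], dict] = {}  # stand-in for MySQL
        self.sinks: List[Sink] = []

    def upsert(self, table: str, row: dict) -> None:
        self.mysql[(table, row["id"])] = dict(row)
        self._fan_out("upsert", table, row)

    def delete(self, table: str, row_id: int) -> None:
        self.mysql.pop((table, row_id), None)
        self._fan_out("delete", table, {"id": row_id})

    def _fan_out(self, op: str, table: str, row: dict) -> None:
        # Forward the same change to every other data store.
        for sink in self.sinks:
            sink(op, table, row)


es_docs: Dict[str, dict] = {}  # stand-in for Elasticsearch


def es_sink(op: str, table: str, row: dict) -> None:
    doc_id = f"{table}:{row['id']}"
    if op == "delete":
        es_docs.pop(doc_id, None)
    else:
        es_docs[doc_id] = dict(row)


middleware = DataMiddleware()
middleware.sinks.append(es_sink)  # registered once, in one place

# Two different services write through the same middleware.
middleware.upsert("orders", {"id": 1, "status": "paid"})
middleware.upsert("users", {"id": 1, "name": "alice"})
middleware.delete("orders", 1)
print(es_docs)  # {'users:1': {'id': 1, 'name': 'alice'}}
```

The synchronization logic now lives in one component instead of in each service.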

3. Scheduled-task synchronization based on the updated_at field

Add a dedicated field to the MySQL table structure, such as updated_at (the time the row was last updated). A scheduled task then queries for rows whose updated_at has changed since the previous run, which gives incremental updates.

The open source Logstash can be used to implement this method.

The disadvantage is also obvious: deletions cannot be synchronized, because a deleted row no longer appears in the query result.
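The polling loop and its blind spot for deletions can be illustrated with a small sketch. It uses the standard-library `sqlite3` module as a stand-in for MySQL and a dictionary as a stand-in for the ES index; the `article` table, the integer timestamps, and the `sync_once` function are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO article VALUES (?, ?, ?)", [(1, "a", 100), (2, "b", 110)]
)

es_index = {}  # stand-in for the Elasticsearch index


def sync_once(conn, es_index, last_seen):
    """Copy rows changed after `last_seen` and return the new checkpoint."""
    rows = conn.execute(
        "SELECT id, title, updated_at FROM article "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    for row_id, title, updated_at in rows:
        es_index[row_id] = {"title": title}
        last_seen = max(last_seen, updated_at)
    return last_seen


checkpoint = sync_once(conn, es_index, 0)  # first run copies both rows

conn.execute("UPDATE article SET title = 'a2', updated_at = 120 WHERE id = 1")
conn.execute("DELETE FROM article WHERE id = 2")

checkpoint = sync_once(conn, es_index, checkpoint)  # second run
print(es_index)  # row 1 is refreshed, but the deleted row 2 is still indexed
```

After the second run the index still contains document 2: the query only returns rows that exist, so the delete is never seen.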

4. Binlog-parsing synchronization

The best-known example is canal.

The tool poses as a MySQL slave (replica) and parses MySQL's binary log to learn about data changes.

This is a relatively mature solution in the industry.

This method requires MySQL's binlog-format to be set to ROW mode.
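Once row changes have been parsed out of the binlog, they still have to be turned into Elasticsearch writes. The sketch below shows one way to do that with the line-oriented body of the Elasticsearch bulk API. The event dictionaries (`type` and `row` keys) are a hypothetical shape for this example and are not canal's actual message format.

```python
import json


def to_bulk_lines(events, index_name):
    """Convert row-change events into Elasticsearch bulk API lines.

    Each event is assumed to look like
    {"type": "INSERT" | "UPDATE" | "DELETE", "row": {...}},
    where "row" holds the row after the change (or the deleted row).
    """
    lines = []
    for event in events:
        meta = {"_index": index_name, "_id": str(event["row"]["id"])}
        if event["type"] in ("INSERT", "UPDATE"):
            # "index" creates or fully replaces the document.
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(event["row"]))
        elif event["type"] == "DELETE":
            # "delete" has an action line only, no document line.
            lines.append(json.dumps({"delete": meta}))
    return lines


events = [
    {"type": "INSERT", "row": {"id": 1, "title": "hello"}},
    {"type": "UPDATE", "row": {"id": 1, "title": "hello again"}},
    {"type": "DELETE", "row": {"id": 2}},
]
bulk_lines = to_bulk_lines(events, "article")
bulk_body = "\n".join(bulk_lines) + "\n"  # the bulk body must end with a newline
print(bulk_body)
```

Unlike the updated_at approach, deletions arrive as explicit events here, so they can be forwarded to the index.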

5. Binlog parsing with mixed / statement format

MySQL’s binlog has three formats:

  • ROW mode: the binlog records data changes row by row;

  • STATEMENT mode: the binlog records the SQL statements themselves;

  • MIXED mode: the two are combined, so a record may be either an SQL statement or a ROW-style row change.

In some cases, your MySQL binlog may not be set to ROW mode. You can still parse the binlog to complete the synchronization, but what you get back is either the original SQL statement or a ROW-style row change. These records then have to be interpreted according to your business logic, for example with regular expression matching or an AST (abstract syntax tree), and the data is synchronized based on the parsed result.

The limitations of this method are also obvious. First, the SQL parsing has to be adapted to your business. Second, batch update scenarios can be difficult to handle. If your data is only ever modified or deleted by primary key, this method works better.
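The regular-expression variant can be sketched as below. It only recognizes UPDATE and DELETE statements whose WHERE clause is a single comparison on a primary key column named `id`, which is an assumption of this example, and it returns `None` for everything else, including batch updates. The function name and the returned dictionary shape are hypothetical.

```python
import re

# Matches: UPDATE <table> SET ... WHERE id = <number>
UPDATE_RE = re.compile(
    r"^\s*UPDATE\s+`?(?P<table>\w+)`?\s+SET\s+.+?\s+WHERE\s+`?id`?\s*=\s*(?P<id>\d+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
# Matches: DELETE FROM <table> WHERE id = <number>
DELETE_RE = re.compile(
    r"^\s*DELETE\s+FROM\s+`?(?P<table>\w+)`?\s+WHERE\s+`?id`?\s*=\s*(?P<id>\d+)\s*;?\s*$",
    re.IGNORECASE,
)


def parse_statement(sql):
    """Return the table and primary key touched by a simple statement.

    Returns None when the statement is not a single-row change by
    primary key, so the caller can fall back to another strategy.
    """
    match = DELETE_RE.match(sql)
    if match:
        return {"op": "delete", "table": match.group("table"), "id": int(match.group("id"))}
    match = UPDATE_RE.match(sql)
    if match:
        return {"op": "update", "table": match.group("table"), "id": int(match.group("id"))}
    return None


print(parse_statement("UPDATE `article` SET title = 'x' WHERE id = 5;"))
print(parse_statement("delete from article where id=7"))
print(parse_statement("UPDATE article SET views = views + 1 WHERE status = 'draft'"))
```

The sketch deliberately extracts only the table and primary key, not the new values: an assignment such as `views = views + 1` cannot be evaluated from the statement alone, so one option is to re-read the row from MySQL by primary key and reindex it.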

Statement:
This article is reproduced from yisu.com.