
Detailed introduction to MySQL’s fast “flashback” based on offline binlog

黄舟
2017-03-18 14:37:12

Yesterday a customer suddenly reported that they had deleted a large amount of data by mistake. The CTO pulled me straight into a discussion group and said we would help them recover it. They had dug this hole themselves, and the original plan was to have their developers restore the data from the business logs. It then turned out that those logs only recorded information such as the primary keys of the deleted rows, and because the rows had been physically deleted, nothing could be rebuilt from that.

I looked at the logs on the server and found that several of them contained output recording the accidentally deleted rows. Alibaba RDS can clone an instance and restore it to a point in time before the deletion, but picking out tens of thousands of scattered IDs from the clone is difficult, and the associated data in several other tables would have to be restored as well, which is troublesome.

That made me think of MySQL flashback. I had read several articles on it before, and had even come close to writing a tool myself to parse the binlog and reverse it into rollback SQL. I never found the time, and now I needed it urgently, so I quickly looked online for a ready-made solution.

Fast flashback for MySQL (including Alibaba RDS) is the antidote to accidental database operations: it returns the data to its state before the mistake. Even Oracle, however, only supports flashback within a short time window.

The existing open source MySQL flashback implementations all work by parsing the binlog and generating reverse SQL (the binlog must be in ROW format):

  1. For a delete operation, generate an insert (DELETE_ROWS_EVENT)
  2. For an update operation, swap the before and after values in the binlog (UPDATE_ROWS_EVENT)
  3. For an insert operation, generate a delete (WRITE_ROWS_EVENT)
  4. For multiple events, generate the SQL in reverse order

Both of these implementations rely on the python-mysql-replication package: they simulate a slave of the original instance, run show binary logs to list the binlogs, send a request to synchronize the binlog, and then parse each event. However, Alibaba Cloud RDS purges its binlogs soon after they have been synchronized to the slave. If you want to restore part of yesterday's data, neither tool can get the binlog. In other words, the flashback window is limited.
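The reversal rules above can be sketched in a few lines of Python. This is only an illustration of the principle, not the code of any of the tools mentioned: the event dictionaries, the `reverse_event` and `flashback` helpers, and the simplified literal quoting are all assumptions made for this example (the row keys `values`, `before_values`, and `after_values` are loosely modeled on how row events are commonly represented).

```python
from typing import Any, Dict, List


def sql_literal(value: Any) -> str:
    """Render a Python value as a SQL literal (simplified: no binary or date types)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def where_clause(row: Dict[str, Any]) -> str:
    """Build a WHERE clause that matches every column of the given row image."""
    parts = []
    for col, val in row.items():
        if val is None:
            parts.append(f"`{col}` IS NULL")
        else:
            parts.append(f"`{col}` = {sql_literal(val)}")
    return " AND ".join(parts)


def reverse_event(event: Dict[str, Any]) -> List[str]:
    """Return the statements that undo one row event (hypothetical event shape)."""
    table = f"`{event['schema']}`.`{event['table']}`"
    out = []
    for row in event["rows"]:
        if event["type"] == "DELETE_ROWS_EVENT":
            # A delete is undone by re-inserting the deleted row image.
            cols = ", ".join(f"`{c}`" for c in row["values"])
            vals = ", ".join(sql_literal(v) for v in row["values"].values())
            out.append(f"INSERT INTO {table} ({cols}) VALUES ({vals});")
        elif event["type"] == "WRITE_ROWS_EVENT":
            # An insert is undone by deleting the inserted row.
            out.append(f"DELETE FROM {table} WHERE {where_clause(row['values'])} LIMIT 1;")
        elif event["type"] == "UPDATE_ROWS_EVENT":
            # An update is undone by swapping the before and after images.
            assignments = ", ".join(
                f"`{c}` = {sql_literal(v)}" for c, v in row["before_values"].items()
            )
            out.append(
                f"UPDATE {table} SET {assignments} "
                f"WHERE {where_clause(row['after_values'])} LIMIT 1;"
            )
        else:
            raise ValueError(f"unsupported event type: {event['type']}")
    return out


def flashback(events: List[Dict[str, Any]]) -> List[str]:
    """Undo the events last-to-first, so later changes are rolled back first."""
    sql = []
    for event in reversed(events):
        sql.extend(reversed(reverse_event(event)))
    return sql
```

Passing a list of events in binlog order to `flashback` returns the rollback statements in the order they should be applied. A real tool also has to handle column types, character sets, and tables without primary keys, which this sketch ignores.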

There are also simpler implementations that parse the physical binlog files directly and produce the rollback, such as binlog-rollback.pl. I tried it, but it was too slow.

To keep the speed while still using one of the more mature flashback tools, we can do the following:

  1. Use a self-built mysqld instance, and copy the purged binlogs into that instance's directory
  2. In the self-built instance, create the tables (structure only) that need to be restored in advance, because the tool connects to the instance to obtain metadata from information_schema.columns
  3. When copying, you can replace the instance's own binlog files so that the file names stay continuous
  4. You may need to modify mysql-bin.index so that mysqld still recognizes the file names
  5. Restart the mysql instance, run show binary logs, and check that the copied files are in the list
  6. Then use any of the tools above to simulate a slave, specify a binlog file, start time, and end time, and get the rollback SQL
  7. Finally, filter out the SQL you need according to the business logic
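The copy and index steps can be sketched as follows. This is a minimal illustration under several assumptions: the helper name `install_binlogs` is hypothetical, the binlog base name is taken to be `mysql-bin`, the index file is assumed to list one `./`-prefixed file name per line, and the files keep their original names rather than being renumbered. It should only be run while the self-built mysqld is stopped.

```python
import shutil
from pathlib import Path
from typing import List


def install_binlogs(src_dir: Path, datadir: Path, basename: str = "mysql-bin") -> List[str]:
    """Copy offline binlog files into datadir and rewrite the binlog index.

    Returns the names of the files that were copied. Assumes mysqld is stopped.
    """
    copied = []
    # The pattern matches numbered binlogs such as mysql-bin.000019,
    # but not mysql-bin.index (the suffix must start with a digit).
    for src in sorted(src_dir.glob(f"{basename}.[0-9]*")):
        shutil.copy2(src, datadir / src.name)  # overwrites a file of the same name
        copied.append(src.name)

    # Rebuild the index from whatever numbered binlogs are now in datadir.
    names = sorted(p.name for p in datadir.glob(f"{basename}.[0-9]*"))
    index_file = datadir / f"{basename}.index"
    index_file.write_text("".join(f"./{name}\n" for name in names))
    return copied
```

After running something like this, the copied files still need to be readable by the user that runs mysqld, and a gap in the numbering may or may not matter for the tool in use, so checking show binary logs after the restart remains the real test.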
In short, the idea is to use another MySQL instance to serve the binlog events. A few reminders:

  1. Do not let the version gap between the two instances be too large
  2. Pay attention to file permissions
  3. If GTID is enabled on the original instance, it must also be enabled on the self-built instance

Example:
python mysqlbinlog_back.py --host="localhost" --username="ecuser" --password="ecuser" --port=3306 \
--schema=dbname --tables="t_xx1,t_xx2,t_xx3" -S "mysql-bin.000019" -E "2017-03-02 13:00:00" -N "2017-03-02 14:09:00" -I -U

===log will also  write to .//mysqlbinlog_flashback.log===
parameter={'start_binlog_file': 'mysql-bin.000019', 'stream': None, 'keep_data': True,
 'file': {'data_create': None, 'flashback': None, 'data': None}, 'add_schema_name': False, 'start_time': None, 
 'keep_current_data': False, 'start_to_timestamp': 1488430800,
 'mysql_setting': {'passwd': 'ecuser', 'host': 'localhost', 'charset': 'utf8', 'port': 3306, 'user': 'ecuser'},
 'table_name': 't_xx1,t_xx2,t_xx3', 'skip_delete': False, 'schema': 'dbname', 'stat': {'flash_sql': {}},
 'table_name_array': ['t_xx1', 't_xx2', 't_xx3'],
 'one_binlog_file': False, 'output_file_path': './log', 'start_position': 4, 'skip_update': True,
 'dump_event': False, 'end_to_timestamp': 1488434940, 'skip_insert': True, 'schema_array': ['dbname']
}
scan 10000 events ....from binlogfile=mysql-bin.000019,timestamp=2017-03-02T11:42:14
scan 20000 events ....from binlogfile=mysql-bin.000019,timestamp=2017-03-02T11:42:29
...
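The `start_to_timestamp` and `end_to_timestamp` values in the parameter dump are the -E and -N arguments expressed as Unix timestamps. The sketch below reproduces them, assuming the times are interpreted in UTC+8; how the tool itself performs the conversion is not shown in the output, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server's local time zone is UTC+8.
CST = timezone(timedelta(hours=8))
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_epoch(text: str, tz: timezone = CST) -> int:
    """Convert a 'YYYY-MM-DD HH:MM:SS' string in the given zone to a Unix timestamp."""
    return int(datetime.strptime(text, TIME_FORMAT).replace(tzinfo=tz).timestamp())


def from_epoch(ts: int, tz: timezone = CST) -> str:
    """Convert a Unix timestamp back to a 'YYYY-MM-DD HH:MM:SS' string in the given zone."""
    return datetime.fromtimestamp(ts, tz).strftime(TIME_FORMAT)


print(to_epoch("2017-03-02 13:00:00"))  # 1488430800, matches start_to_timestamp
print(to_epoch("2017-03-02 14:09:00"))  # 1488434940, matches end_to_timestamp
```

This is a quick way to double-check that the window passed on the command line is the one intended, particularly when the machine running the tool is in a different time zone from the database server.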

Tips:

When the binlog is in ROW format, DML changes are recorded as pairs of events: a Table_map event followed by a row event (Row_log). The table_id in Table_map does not affect which instance the event can be applied on. This id can be thought of as a logical mechanism for recording the table structure version: when the table definition is not found in table_definition_cache, the id is incremented by 1 and assigned to the table that is about to be recorded in the binlog.


Notes on using mysqlbinlog_back.py:

  • Be sure to specify the schema name, the table names, the starting binlog file name, the start time, and the end time. This speeds up the scan.
  • Depending on what needs to be recovered, use -I, -U, and -D to choose which operation types are processed. In the example above, passing -I -U set skip_insert and skip_update to True in the parameter dump, so only delete events were rolled back.
  • If only some of the tables are restored (a partial flashback), associated tables cannot be restored correctly. For example, the deleted rows can be restored, but rows that the business logic updated as a result of the delete cannot, unless a complete flashback is performed.
  • Table columns of enum type are not supported, such as the f_do_type column of t_xx3. As a workaround, change the enum definition to int on the self-built instance.
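For the enum workaround, the table definition has to be adjusted before it is created on the self-built instance. The following is a rough sketch that rewrites enum columns to int in the text of a CREATE TABLE statement. The helper name `enum_columns_to_int` is hypothetical, the regular expressions assume one column definition per line (as in SHOW CREATE TABLE output), and string defaults, character sets, and collations on the affected line are simply dropped because they are not valid for an int column.

```python
import re

# A single-quoted SQL string, allowing backslash escapes and doubled quotes.
_QUOTED = r"'(?:[^'\\]|\\.|'')*'"

# enum('a','b',...) with optional whitespace between the members.
_ENUM = re.compile(
    r"\benum\s*\(\s*" + _QUOTED + r"(?:\s*,\s*" + _QUOTED + r")*\s*\)",
    re.IGNORECASE,
)

# Clauses that only make sense for string-like columns.
_STRING_ONLY = re.compile(
    r"\s+(?:CHARACTER\s+SET\s+\w+|COLLATE\s+\w+|DEFAULT\s+" + _QUOTED + r")",
    re.IGNORECASE,
)


def enum_columns_to_int(ddl: str) -> str:
    """Replace enum column types with int, line by line, in a CREATE TABLE statement."""
    out = []
    for line in ddl.splitlines():
        if _ENUM.search(line):
            line = _ENUM.sub("int", line)
            # Drop string defaults and charset clauses left over from the enum.
            line = _STRING_ONLY.sub("", line)
        out.append(line)
    return "\n".join(out)
```

The rewritten statement is only meant for the self-built instance that serves the binlog. The original table keeps its enum definition, and the generated rollback SQL should be checked against it before being applied.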
