Home  >  Article  >  Backend Development  >  Data table migration data consistency verification

Data table migration data consistency verification

WBOY
WBOYOriginal
2016-08-08 09:25:251143browse

 It is still very useful to do some necessary verification when migrating the database. For example, whether the number of data items before and after migration is consistent, and whether the data is consistent. What should I do at this time? It is easy to verify the number of items, but what if I verify whether the data is consistent? For important data, of course, every piece of data must be error-free. Random sampling verification is definitely not possible. If it is missed, it will be troublesome, and the two tables are no longer on the same server. What to do at this time? There is a way:

  1. Select a few important fields from the table, such as A, B, and C, and use these fields as a comparison ruler.
  2. Export these three fields of each piece of data from the original table to a file f1.
  3. Transfer these three fields of each piece of data from the destination table to file f2.
  4. Compare each piece of data in files f1 and f2 to see if they are the same.
  5. Draw a conclusion

 The above method was thought up at the same time, and it is not bad, but I think there is room for improvement:

  • First of all, not all fields may still appear different in non-main fields.
  • The overall efficiency is relatively low

My thoughts are this:

  1. Splice every n pieces of data in the table (connect directly, the value of n depends on the size of each piece of data).
  2. Calculate the md5 values ​​of these n pieces of data and add them to file f1 until all data values ​​are obtained.
  3. The same goes for the destination table, which is recorded in file f2.
  4. Compare the md5 values ​​of files f1 and f2. If they are consistent, ok, success.
  5. If they are inconsistent, compare each md5 value from top to bottom and find the mth inconsistent value.
  6. Concluded that the inconsistent data is between m*(n-1)+1 ~ m*n, and you can choose positioning again.

 The advantage of the second method is that the output file will be reduced within a certain range, making it easy to compare. However, it also has disadvantages. It cannot directly locate the location of different data through key fields like the first method.

The following is the specific code implementation of the second method’s effect sum:

<?php
/**
 * 使用方法: 
 * php -f mysql_diff.php 	yes 			dir 						10
 * 		               		是否计算条数	是否计算输出d5并保存到文件	合并数据的级别
 *
 */
if(php_sapi_name() != 'cli')
{
	die("请在CLI模式下运行");
}

array_shift($argv);
if(empty($argv))
{
	die("at letase contain one info");
}

$is_count = array_shift($argv);
$is_md5 = empty($argv) ? false : array_shift($argv);
$conbine_num = empty($argv) ? 1 : intval(array_shift($argv));
if($is_md5 && !is_dir($is_md5) && !mkdir($is_md5, 777, true))
{
	die("error info : md5 info must be input to a file");
}
$dbinfos = array(
	'host' => 'localhost',
	'port' => '3306',
	'user' => 'root',
	'pswd' => '123456',
	'charset' => 'utf8',
	'tables' => array(
		'lagou.pos',
		'lagou.pos_innodb',
	),
);

//验证格式
if(!$link = mysql_connect($dbinfos['host'].":".$dbinfos['port'],$dbinfos['user'], $dbinfos['pswd']))
{
	die("connect to [{$host}@{$port}] failed!!");
}

if(!mysql_query("set names {$dbinfos['charset']}"))
{
	die("set charset error : ".mysql_error());
}

foreach ($dbinfos['tables'] as $table) 
{
	if($is_count)
	{
		$sql = "select count(*) as nums from {$table}";

		$ret = mysql_query($sql);
		if(!$ret)
		{
			die("error : ".mysql_error());
		}
		$ret = mysql_fetch_array($ret, MYSQL_ASSOC);
		echo "{$table} : {$ret['nums']}\n";
	}
	
	
	if($is_md5)
	{
		$path = $is_md5.DIRECTORY_SEPARATOR.$table;
		$sql = "select * from {$table}";
		$ret = mysql_query($sql);
		$flag = 0;
		$fields = '';
		while ($_ret = mysql_fetch_array($ret, MYSQL_NUM)) {
			$flag++;
			while($_ret)
			{
				$fields .= array_pop($_ret);
			}
			if($flag % $conbine_num == 0)
			{
				file_put_contents($path, md5($fields)."\n", FILE_APPEND);
				$fields = '';
			}
		}
		if($flag % $conbine_num != 0 && $flag > 0)
		{
			file_put_contents($path, md5($fields)."\n", FILE_APPEND);
		}
		echo "save to file info : ".realpath($path)."\n";
	}
}

 

The above introduces the data consistency verification of data table migration, including aspects of the content. I hope it will be helpful to friends who are interested in PHP tutorials.

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn