Home > Article > Backend Development > C# Excel data duplication check and Table data duplication check
When importing Excel data recently, you need to check whether the data is duplicated:
1. Do you need to check whether the Excel data itself is duplicated?
2. Is the data in Excel duplicated with the data in the database?
1. The ways to check whether the data in Excel are repeated are:
1. Use the select statement to filter the data in the Table (omitted here, you can refer to Part 2).
2. Use a for loop to check manually. The code is as follows:
##
#region 记录Excel中的重复列 /// <summary> /// 记录Excel中的重复列 /// </summary> /// <param name="dt">需要获取重复列的表</param> /// <returns>提示重复信息</returns> private string GetDistinctTable(DataTable dt) { //DataTable dtClone = dt;这种方式是错误的,因为这种情况,修改dtClone的同时dt也会被修改。 DataTable dtClone = dt.Clone(); ; string vsSubAcctNo = string.Empty; string vsAcctNo = string.Empty; string repeatExcel = string.Empty; string vsTransDate = string.Empty; for (int i = dtClone.Rows.Count - 1; i >= 0; i--) { vsSubAcctNo = dtClone.Rows[i][4].ToString().Trim(); vsAcctNo = dtClone.Rows[i][1].ToString().Trim(); vsTransDate = dtClone.Rows[i][8].ToString().Trim(); dtClone.Rows[i].Delete(); dtClone.AcceptChanges(); for (int j = dtClone.Rows.Count - 1; j >= 0; j--) { if (vsSubAcctNo == dtClone.Rows[j][4].ToString().Trim() && vsAcctNo == dtClone.Rows[j][1].ToString().Trim() && vsTransDate == dtClone.Rows[j][8].ToString().Trim()) { //如果重复了,进行记录 repeatExcel += "第" + (i + 1).ToString() + "行\r\n"; break; } } } return repeatExcel; } #endregionSmall note:
clone It should be modified to copy
2. The ways to check whether the data in Excel is duplicated with the data in the database are: 1. Traverse the Table and check each piece of data in the database for duplication. This method is suitable for situations where the data in the Table is relatively small (within 100) and the comparison table in the database is very large. Because this method requires connecting to the database and executing queries every time a piece of data is compared, which is very time-consuming.
After a general test, there are 2,000 pieces of data in Excel. Just querying in the database consumes 7 minutes and 40 seconds, or 4601000 milliseconds (ms). About one piece of data takes 2300.5 milliseconds.
In fact, the usage method 2 imports 2,000 items, which is less time-consuming than method 1, which imports 100 items.
2. Get the table data to be compared in the database into a dataset, traverse the Table and check for duplication of each piece of data in the dataset. The code is as follows:
strTemp = "AcctNo='" + obZH.ToString() + "' and TransDate='" + obRQ.ToString() + "' and SubAcctNo='" + obDFZH.ToString() + "'"; rowsTemp = dsTemp.Tables[0].Select(strTemp); if (rowsTemp.Length>0) { //如果重复了,进行记录 repeatDj += "第" + v.ToString() + "行\r\n"; }