Home > Article > Backend Development > Quickly find differences in data in two spreadsheets
I have just come into contact with python recently. I am looking for some small tasks to practice my skills. I hope that I can continue to exercise my problem-solving ability in practice.
There will be such a scenario in the company: the contents of a spreadsheet are used by two, three or more departments. These employees will update some of their own departments from time to time while maintaining these tables. Data, over time, everyone's data will start to fight, which is very unfavorable for management. How to quickly find the differences in data in two or more spreadsheets?
Solution:
1. Excel’s own method (if interested, use Baidu)
2. Write a small script in python
1 #!/usr/bin/env python 2 # -*- coding: utf-8 -*- 3 4 #导入模块 openpyxl 5 import openpyxl 6 from openpyxl.styles import PatternFill 7 from openpyxl.styles import colors 8 from openpyxl.styles import Font, Color 9 10 #读取excel文件11 #括号中的字符串为你要比较的两个excel的路径,注意用“/”12 wb_a = openpyxl.load_workbook('d:/BAKFILE/d046532/Desktop/check excel/test.xlsx')13 wb_b = openpyxl.load_workbook('d:/BAKFILE/d046532/Desktop/check excel/test2.xlsx')14 #定义一个方法来获取表格中某一列的内容,返回一个列表15 #在这里,我的表格中:IP是具有唯一性的,所以我用它来区分数据的差异,而IP这一列在我的表格中是第“G”列16 def getIP(wb):17 sheet = wb.get_active_sheet()18 ip = []19 for cellobj in sheet['G']:20 ip.append(cellobj.value)21 22 return ip23 #获得ip列表24 ip_a = getIP(wb_a)25 ip_b = getIP(wb_b)26 #将两个列表转换成集合27 aa = set(ip_a)28 bb = set(ip_b)29 #找出两个列表的不同行,并转换成列表30 difference = list(aa ^ bb)31 #打印出列表中的元素32 #到这一步,两个表格中不同的数据已经被找出来了33 for i in difference:34 print (i)35 36 #将不同行高亮显示37 print ("开始第一张表" + "----" *10)38 a = wb_a.get_active_sheet()['G']39 for cellobj in a:40 if cellobj.value in difference:41 print (cellobj.value)42 cellobj.font = Font(color=colors.BLACK, italic=True ,bold = True)43 cellobj.fill = PatternFill("solid", fgColor="DDDDDD")44 print ("开始第二张表" + "----" *10)45 b = wb_b.get_active_sheet()['G']46 for cellobj in b:47 if cellobj.value in difference:48 print (cellobj.value)49 cellobj.font = Font(color=colors.BLACK, italic=True ,bold = True)50 cellobj.fill = PatternFill("solid", fgColor="DDDDDD")51 52 wb_a.save('d:/BAKFILE/d046532/Desktop/a.xlsx')53 wb_b.save('d:/BAKFILE/d046532/Desktop/b.xlsx')
In this way, two copies of excel will be saved, and the cell fill color and font color will be used to mark the different data differences in the two tables in this copy
Unresolved :
1. How to supplement these different data into a table to form a complete set of tables
2. How to optimize the streamlined code
The above is the detailed content of Quickly find differences in data in two spreadsheets. For more information, please follow other related articles on the PHP Chinese website!