
Deduplicating a table column with pandas and converting the result back into a table

不言
2018-04-18 15:47:28

This article shares a pandas method for removing duplicate values from a table column and converting the result back into a table. It should serve as a useful reference.

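When processing data in Python, pandas DataFrame objects and the built-in set type are often used together: a set discards duplicate values, and a DataFrame turns the result back into a table.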

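import pandas as pd

train = pd.read_csv('XXX.csv')  # read the file
train = train['item_id']  # select the column to deduplicate
train = set(train)  # deduplicate
data = pd.DataFrame(list(train), columns=['item_id'])  # a set is unordered, so it must be converted to a list before it can become a DataFrame
data.to_csv('xxx.csv', index=False)  # save the table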

Remember to import pandas first (import pandas as pd).
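Because a set is unordered, the approach above does not guarantee the row order of the output. As an alternative that is not part of the original article, the following minimal sketch uses the built-in pandas method drop_duplicates, which keeps the first occurrence of each value and preserves the original row order. The sample data and the price column are hypothetical stand-ins for the contents of 'XXX.csv', and an in-memory buffer replaces the real files so the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the article's 'XXX.csv'
csv_text = "item_id,price\n3,10\n1,20\n3,30\n2,40\n1,50\n"
train = pd.read_csv(io.StringIO(csv_text))

# Select the column as a one-column DataFrame, then keep only the
# first occurrence of each item_id; the original row order is preserved
data = train[["item_id"]].drop_duplicates().reset_index(drop=True)

# Write the result without the index column, as in the original snippet
out = io.StringIO()
data.to_csv(out, index=False)

print(data["item_id"].tolist())  # [3, 1, 2]
```

With real files, the StringIO objects would be replaced by the input and output paths. The set-based version is fine when order does not matter; drop_duplicates may be preferable when the order of first appearance should be kept.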

