Pandas method to filter data based on a combination of several columns
This article shows how to use pandas to filter rows based on the combined values of several columns, and how to do the reverse and drop the matching rows.
The input is a large CSV file, input.csv, with many rows. For example, I want to keep only the rows in which the three columns "设计井别" (design well type), "投产井别" (production well type) and "目前井别" (current well type) are all 11.
The filter conditions can of course be adjusted freely to suit your needs. The code is as follows:
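# -*- coding: utf-8 -*-
"""
Created on Wed Nov 29 10:46:31 2017

@author: wq
"""
import pandas as pd

# input.csv is the large file with many rows.
# encoding='gbk' is needed because the file contains Chinese text;
# without it the text may come out garbled.
df1 = pd.read_csv(u'input.csv', encoding='gbk')

# The filter conditions here can be changed as needed.
# Note: the comparison is against the string '11'; if pandas parses these
# columns as integers, compare against 11 instead.
outfile = df1[(df1[u'设计井别'] == '11') &
              (df1[u'投产井别'] == '11') &
              (df1[u'目前井别'] == '11')]
outfile.to_csv('outfile.csv', index=False, encoding='gbk')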
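When the same value has to match in several columns, the chained `&` conditions can also be written as one comparison over a list of columns. The following is only a sketch under assumptions: the sample rows and the well names `A1` to `A4` are made up for illustration and are not from the original file, and `dtype` is passed so that the three columns are read as strings, which keeps the comparison with `"11"` meaningful.

```python
import io

import pandas as pd

# Hypothetical sample standing in for input.csv (not the original data).
csv_text = """汉字井号,设计井别,投产井别,目前井别
A1,11,11,11
A2,11,12,11
A3,21,11,11
A4,11,11,11
"""

cols = ["设计井别", "投产井别", "目前井别"]

# Read the three columns as strings; otherwise pandas would infer integers
# and a comparison with the string "11" would match nothing.
df = pd.read_csv(io.StringIO(csv_text), dtype={c: str for c in cols})

# True only for rows where every listed column equals "11".
mask = df[cols].eq("11").all(axis=1)
matched = df[mask]

print(matched["汉字井号"].tolist())  # ['A1', 'A4']
```

Using `.any(axis=1)` instead of `.all(axis=1)` would keep rows where at least one of the columns equals `"11"`.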
Sometimes we have the opposite requirement: delete the rows in which the "设计井别" (design well type), "投产井别" (production well type) and "目前井别" (current well type) columns are all 11, and keep everything else.
The code is as follows:
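import pandas as pd

# input.csv is the large file with many rows; outfile.csv holds the rows
# selected by the filter in the previous step.
# encoding='gbk' is needed because the files contain Chinese text;
# without it the text may come out garbled.
df1 = pd.read_csv(u'input.csv', encoding='gbk')
df2 = pd.read_csv(u'outfile.csv', encoding='gbk')

# Keep the rows of df1 whose well name (汉字井号) does not appear in df2.
# This relies on 汉字井号 identifying each row uniquely: any row that shares
# a well name with a filtered row is dropped as well.
index = ~df1[u'汉字井号'].isin(df2[u'汉字井号'])
df4 = df1[index]
df4.to_csv('outfile1.csv', index=False, encoding='gbk')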
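If the filter condition is still at hand, the complement can also be taken directly by negating the boolean mask with `~`, without going through an intermediate file or a key column. This is a minimal sketch, not the original author's approach; the sample rows and well names are hypothetical and the values are assumed to be stored as strings.

```python
import pandas as pd

# Hypothetical sample standing in for input.csv (not the original data).
df = pd.DataFrame({
    "汉字井号": ["A1", "A2", "A3", "A4"],
    "设计井别": ["11", "11", "21", "11"],
    "投产井别": ["11", "12", "11", "11"],
    "目前井别": ["11", "11", "11", "11"],
})

cols = ["设计井别", "投产井别", "目前井别"]

# Rows where all three columns equal "11".
mask = df[cols].eq("11").all(axis=1)

# ~mask selects every row that does NOT satisfy the condition.
remaining = df[~mask]

print(remaining["汉字井号"].tolist())  # ['A2', 'A3']
```

Because the rows are selected by position in the mask rather than by well name, this variant behaves the same whether or not the well names are unique.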