How to filter data in pandas

百草
Original, 2023-11-22 10:36:26

Filtering data with Pandas typically follows these steps: 1. import the Pandas library; 2. read the data; 3. filter the data; 4. sort the data; 5. group and aggregate the data. Each step is described in detail below.

Environment used for this tutorial: Windows 10 on a DELL G3 computer.

Pandas is a popular Python data analysis library with powerful features for filtering, processing, and analyzing data. The steps below walk through a common workflow for filtering data with Pandas:

1. Import the Pandas library

First, make sure the Pandas library is installed. If it is not installed, you can use the following command to install it:

pip install pandas

Then, import the Pandas library:

import pandas as pd

2. Read data

Use the read_csv() function in the Pandas library to read CSV files, the read_excel() function to read Excel files, and so on. For example, to read a CSV file named data.csv:

df = pd.read_csv('data.csv')
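Before filtering, it usually helps to check what was actually loaded. The following is a minimal sketch, not part of the original tutorial: it reads a small in-memory CSV in place of data.csv, and the column names (last_name, age, gender) are assumptions chosen to match the examples in the later steps.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv.
csv_text = """last_name,age,gender
张,25,female
Li,17,male
Wang,32,female
Chen,18,male
"""

# read_csv accepts a file path or any file-like object, so StringIO works here.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # column types inferred by pandas
print(df.head())  # first few rows
```

Checking the inferred types first is worthwhile because a comparison such as age >= 18 only behaves as expected when age was parsed as a number.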

3. Filter data

Pandas provides a variety of methods to filter data. The following are several common methods:

(1) Filter based on conditions

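Use the loc attribute with a boolean mask, combining conditions with the logical operators & (and), | (or), and ~ (not). Wrap each condition in parentheses, because these operators bind more tightly than comparisons such as >= and ==. (iloc selects by integer position and does not accept a boolean Series as a mask.) For example, to select rows where age is at least 18 and gender is female: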

df.loc[(df['age'] >= 18) & (df['gender'] == 'female')]
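To make the behavior of &, | and ~ concrete, here is a small sketch on a made-up DataFrame. The data and the variable names are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Hypothetical sample data using the same column names as the example above.
df = pd.DataFrame({
    "last_name": ["张", "Li", "Wang", "Chen"],
    "age": [25, 17, 32, 18],
    "gender": ["female", "male", "female", "male"],
})

# Each comparison produces a boolean Series with one value per row.
is_adult = df["age"] >= 18
is_female = df["gender"] == "female"

# & keeps rows where both masks are True.
adult_women = df.loc[is_adult & is_female]

# ~ negates a mask: every row that is NOT an adult woman.
others = df.loc[~(is_adult & is_female)]

# | keeps rows where at least one mask is True.
adult_or_female = df.loc[is_adult | is_female]

print(adult_women)
```

Naming the masks first tends to be easier to read than one long bracketed expression, and it removes the need for nested parentheses.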

(2) Filter based on labels

Use the loc attribute to select rows that match a specific label or value. For example, to select rows whose last_name column holds the surname "Zhang" (张):

df.loc[df['last_name'] == '张']
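The line above compares column values. As a hedged sketch of two related options, the example below uses isin() to match several values at once, and set_index() followed by loc to look rows up by index label. The sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "last_name": ["张", "Li", "Wang", "Chen"],
    "age": [25, 17, 32, 18],
    "gender": ["female", "male", "female", "male"],
})

# Value-based: keep rows whose surname is one of several values.
selected = df.loc[df["last_name"].isin(["张", "Wang"])]

# Label-based: make the surname the index, then select by label with loc.
by_name = df.set_index("last_name")
zhang_rows = by_name.loc[["张"]]      # a list of labels returns a DataFrame
zhang_age = by_name.loc["张", "age"]  # row label plus column label returns one value

print(selected)
print(zhang_rows)
print(zhang_age)
```

Index-based lookup is convenient when the same key is queried repeatedly. The scalar lookup in the last step assumes the index labels are unique.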

(3) Filter by range

Use the loc attribute to filter data within a specific range. For example, filter data between the ages of 18 and 30:

df.loc[(df['age'] >= 18) & (df['age'] <= 30)]
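For a closed range like this, Series.between() is a more compact alternative. The sketch below uses made-up ages and shows that, with its default settings, between() includes both endpoints and so matches the two-comparison form.

```python
import pandas as pd

# Hypothetical sample data; only the age column matters here.
df = pd.DataFrame({"age": [25, 17, 32, 18, 30]})

# between() includes both endpoints by default, so 18 and 30 are kept.
in_range = df.loc[df["age"].between(18, 30)]

# Equivalent mask written with two comparisons.
manual = df.loc[(df["age"] >= 18) & (df["age"] <= 30)]

print(in_range)
```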

(4) Filter by multiple conditions

Use the query method to filter data that meets multiple conditions. For example, to filter data whose age is greater than or equal to 18 years old and whose gender is female:

df.query('age >= 18 & gender == "female"')
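Inside a query string, Python variables can be referenced with an @ prefix, and the keywords and / or can be used in place of & / |. The sketch below illustrates this on made-up data; the helper function adult_women and its min_age parameter are hypothetical names introduced for this example.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "last_name": ["张", "Li", "Wang", "Chen"],
    "age": [25, 17, 32, 18],
    "gender": ["female", "male", "female", "male"],
})


def adult_women(frame, min_age=18):
    """Return the female rows with age >= min_age, using query()."""
    # @min_age refers to the local Python variable, not to a column.
    return frame.query('age >= @min_age and gender == "female"')


result = adult_women(df)

# The same filter written with loc and a boolean mask, for comparison.
same = df.loc[(df["age"] >= 18) & (df["gender"] == "female")]

print(result)
```

query() is often easier to read when there are many conditions, while the loc form is the safer choice when column names contain spaces or other characters that are not valid in Python identifiers.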

4. Sort data

Use the sort_values() method to sort the data. For example, sort by age in ascending order:

df.sort_values('age', ascending=True)
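sort_values() returns a new DataFrame and leaves the original untouched, and it can sort on several columns with a separate direction for each. A minimal sketch on made-up data:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "last_name": ["张", "Li", "Wang", "Chen"],
    "age": [25, 17, 32, 18],
    "gender": ["female", "male", "female", "male"],
})

# Sort by gender ascending, then by age descending within each gender.
sorted_df = df.sort_values(["gender", "age"], ascending=[True, False])

# Sorting keeps the original index labels; reset_index(drop=True) renumbers them.
youngest_first = df.sort_values("age").reset_index(drop=True)

print(sorted_df)
print(youngest_first)
```

Assign the result to a variable (as here) to keep the sorted data, since df itself is not modified.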

5. Group and aggregate data

Use the groupby() method to group the data, then apply an aggregate function (such as sum(), mean(), or count()) to compute a value for each group. For example, to calculate the average age for each gender group:

df.groupby('gender')['age'].mean()
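Several aggregates can be computed in one pass with agg(). The sketch below uses made-up data; the output column names mean_age and people are hypothetical labels chosen for this example.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "last_name": ["张", "Li", "Wang", "Chen"],
    "age": [25, 17, 32, 18],
    "gender": ["female", "male", "female", "male"],
})

# Select the numeric column first, then compute several aggregates per group.
summary = df.groupby("gender")["age"].agg(["mean", "count", "max"])

# Named aggregation: choose the output column names explicitly.
named = df.groupby("gender").agg(
    mean_age=("age", "mean"),
    people=("age", "count"),
)

print(summary)
print(named)
```

Selecting the column before aggregating avoids applying a numeric function such as mean() to text columns like last_name.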
