How to Select Pandas DataFrame Rows Based on Complex Multi-Column Criteria?
Selecting with complex criteria from pandas.DataFrame
In data analysis with Pandas, we often need to select specific rows or columns based on complex criteria. This example shows how to select values from column 'A' when the corresponding values in 'B' and 'C' meet certain conditions.
To understand the problem, let's create an example DataFrame:
import pandas as pd
from random import randint

df = pd.DataFrame({'A': [randint(1, 9) for x in range(10)],
                   'B': [randint(1, 9)*10 for x in range(10)],
                   'C': [randint(1, 9)*100 for x in range(10)]})
Now, we want to select the values from 'A' whose corresponding 'B' value is greater than 50 and whose 'C' value is not equal to 900.
To solve this, we can use Pandas' column operations:
df["B"] > 50
This creates a boolean Series object where each element corresponds to a row in 'B' and indicates whether the respective 'B' value is greater than 50.
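To make the boolean Series concrete, here is a minimal sketch. It swaps the article's random data for a small DataFrame with fixed values (an assumption made only so the result is reproducible), and the name mask_b is illustrative.

```python
import pandas as pd

# Fixed values instead of randint(), assumed here for reproducibility.
df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [20, 60, 90, 50],
    "C": [100, 900, 300, 400],
})

# Element-wise comparison: one boolean per row, aligned on df's index.
mask_b = df["B"] > 50

print(mask_b.tolist())  # [False, True, True, False]
print(mask_b.dtype)     # bool
```

Note that the last row has B equal to 50, so the strict comparison yields False for it.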
(df["B"] > 50) & (df["C"] != 900)
This combines the two boolean Series element-wise with the & operator, creating a new Series where each element indicates whether the corresponding row meets both conditions. The parentheses around each comparison are required, because & binds more tightly than > and != in Python.
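The combination step can be sketched as follows, again on a small fixed DataFrame assumed for reproducibility. The names mask_b, mask_c, and both are illustrative, not part of the original example.

```python
import pandas as pd

# Fixed values instead of randint(), assumed here for reproducibility.
df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [20, 60, 90, 50],
    "C": [100, 900, 300, 400],
})

mask_b = df["B"] > 50     # [False, True, True, False]
mask_c = df["C"] != 900   # [True, False, True, True]

# Element-wise AND: True only where both conditions hold for the same row.
both = mask_b & mask_c

print(both.tolist())  # [False, False, True, False]
```

Building each condition as a named mask first is a stylistic choice; it gives the same result as the one-line expression and can be easier to read when more conditions are added.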
Using this Series as an index, we can extract the 'A' values:
df["A"][(df["B"] > 50) & (df["C"] != 900)]
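Alternatively, using .loc syntax, which selects the rows and the column in a single indexing step and is the form to prefer when assigning to the selection: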
df.loc[(df["B"] > 50) & (df["C"] != 900), "A"]
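As a rough check that the two forms agree, the sketch below runs both on a small fixed DataFrame (the values are an assumption chosen for reproducibility; the names mask, selected, and chained are illustrative).

```python
import pandas as pd

# Fixed values instead of randint(), assumed here for reproducibility.
df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [20, 60, 90, 50],
    "C": [100, 900, 300, 400],
})

mask = (df["B"] > 50) & (df["C"] != 900)

selected = df.loc[mask, "A"]  # rows and column chosen in one step
chained = df["A"][mask]       # column first, then boolean indexing

print(selected.tolist())         # [3]
print(selected.index.tolist())   # [2]
print(selected.equals(chained))  # True
```

Only the row with B = 90 and C = 300 passes both conditions, and the result keeps that row's original index label (2).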
This approach returns the 'A' values of exactly those rows that meet the specified criteria, as a Series that keeps the original row labels, ready for further analysis or operations on the selected data.