Home >Backend Development >Python Tutorial >How to Get Rows with the Maximum Count in Grouped Pandas DataFrames?
Problem:
Retrieve all rows from a pandas DataFrame that have the maximum value for the "count" column, grouped by the "Sp" and "Mt" columns.
Examples:
Example 1:
Input DataFrame:
Sp | Mt | Value | count |
---|---|---|---|
MM1 | S1 | a | 3 |
MM1 | S1 | n | 2 |
MM1 | S3 | cb | 5 |
MM2 | S3 | mk | 8 |
MM2 | S4 | bg | 10 |
MM2 | S4 | dgd | 1 |
MM4 | S2 | rd | 2 |
MM4 | S2 | cb | 2 |
MM4 | S2 | uyi | 7 |
Expected Output:
Sp | Mt | Value | count |
---|---|---|---|
MM1 | S1 | a | 3 |
MM1 | S3 | cb | 5 |
MM2 | S3 | mk | 8 |
MM2 | S4 | bg | 10 |
MM4 | S2 | uyi | 7 |
Example 2:
Input DataFrame:
Sp | Mt | Value | count |
---|---|---|---|
MM2 | S4 | bg | 10 |
MM2 | S4 | dgd | 1 |
MM4 | S2 | rd | 2 |
MM4 | S2 | cb | 8 |
MM4 | S2 | uyi | 8 |
Expected Output:
Sp | Mt | Value | count |
---|---|---|---|
MM2 | S4 | bg | 10 |
MM4 | S2 | cb | 8 |
MM4 | S2 | uyi | 8 |
Solution:
import pandas as pd df.groupby(['Sp', 'Mt'])['count'].max()
This returns a Series with the maximum count for each unique combination of "Sp" and "Mt".
idx = df.groupby(['Sp', 'Mt'])['count'].transform(max) == df['count']
This creates a Boolean mask where True indicates rows with the maximum count within their group.
df[idx]
This filters the DataFrame to include only rows where the idx mask is True, resulting in the desired output.
Note: Multiple rows with equal maximum count will be returned if they occur in the same group.
The above is the detailed content of How to Get Rows with the Maximum Count in Grouped Pandas DataFrames?. For more information, please follow other related articles on the PHP Chinese website!