Get a Frequency Count Based on Multiple Dataframe Columns
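To count how often each combination of values occurs across several columns of a dataframe, group by those columns and call size(), which returns the number of rows in each group. The count() function also works on a groupby, but it counts the non-null values in each remaining column rather than the rows themselves. Let's demonstrate this with an example dataframe: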
import pandas as pd

# Sample dataframe
data = {'Group': ['Short', 'Short', 'Moderate', 'Moderate', 'Tall'],
        'Size': ['Small', 'Small', 'Medium', 'Small', 'Large']}
df = pd.DataFrame(data)
Option 1: Using groupby and size
dfg = df.groupby(['Group', 'Size']).size()
print(dfg)
Output:
Group     Size
Moderate  Medium    1
          Small     1
Short     Small     2
Tall      Large     1
dtype: int64
Option 2: Using groupby, size, and reset_index
dfg = df.groupby(['Group', 'Size']).size().reset_index(name='Time')
print(dfg)
Output:
      Group    Size  Time
0  Moderate  Medium     1
1  Moderate   Small     1
2     Short   Small     2
3      Tall   Large     1
Option 3: Using groupby, size, and as_index
dfg = df.groupby(['Group', 'Size'], as_index=False).size()
print(dfg)
Output:
      Group    Size  size
0  Moderate  Medium     1
1  Moderate   Small     1
2     Short   Small     2
3      Tall   Large     1
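With as_index=False, recent pandas versions (1.1 and later, as far as we know) name the count column size rather than Time. If you want Option 3 to produce the same column name as Option 2, one possible approach is to rename the column afterwards. The sketch below assumes such a pandas version and reuses the sample data from above:

```python
import pandas as pd

# Same sample dataframe as in the article
data = {'Group': ['Short', 'Short', 'Moderate', 'Moderate', 'Tall'],
        'Size': ['Small', 'Small', 'Medium', 'Small', 'Large']}
df = pd.DataFrame(data)

# as_index=False keeps Group and Size as regular columns;
# the count column is called 'size' by default, so rename it to 'Time'
dfg = (
    df.groupby(['Group', 'Size'], as_index=False)
      .size()
      .rename(columns={'size': 'Time'})
)
print(dfg)
```

The result has the same layout as the Option 2 output, so which form to use is mostly a matter of taste.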
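Option 1 returns a Series indexed by the Group and Size combinations that appear in the original dataframe, with the frequency of each combination as its values. Options 2 and 3 return a dataframe with Group and Size as regular columns plus a count column: Time in Option 2 (set through the name argument of reset_index) and size in Option 3 (the default name in pandas 1.1 and later).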
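The introduction mentions both size and count, and the two are not interchangeable. As a minimal sketch, the example below adds a hypothetical Score column (not part of the original data) containing one missing value to show the difference: size counts the rows in each group, while count counts the non-null values in each remaining column.

```python
import numpy as np
import pandas as pd

# Sample data plus a hypothetical 'Score' column with one missing value
df = pd.DataFrame({
    'Group': ['Short', 'Short', 'Moderate', 'Moderate', 'Tall'],
    'Size': ['Small', 'Small', 'Medium', 'Small', 'Large'],
    'Score': [1.0, np.nan, 3.0, 4.0, 5.0],
})

grouped = df.groupby(['Group', 'Size'])

# size() counts rows per group, including rows with missing values
row_counts = grouped.size()

# count() counts non-null values per remaining column (here: Score)
non_null_counts = grouped.count()

print(row_counts)
print(non_null_counts)
```

For the (Short, Small) group, size reports 2 rows while count reports 1 non-null Score. For a plain frequency count of row combinations, size is therefore usually the safer choice.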