How to Count the Frequency of Identical Rows in a Pandas DataFrame?

Barbara Streisand
2024-10-25 08:01:02

Get a Frequency Count Based on Multiple DataFrame Columns

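To count how often each combination of values occurs across several columns (that is, how often identical rows appear), we can group the DataFrame by those columns with pandas' groupby method and take the size of each group. Consider the following example: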

import pandas as pd

data = {'Group': ['Short', 'Short', 'Moderate', 'Moderate', 'Tall'], 'Size': ['Small', 'Small', 'Medium', 'Small', 'Large']}
df = pd.DataFrame(data)

We can calculate the frequency count in three ways:

Option 1:

dfg = df.groupby(by=["Group", "Size"]).size()

This produces a Series indexed by the (Group, Size) pairs, with the number of matching rows as its values:

Group     Size
Moderate  Medium    1
          Small     1
Short     Small     2
Tall      Large     1
dtype: int64
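
Because the result of Option 1 is a Series with a two-level index, individual counts can be looked up by their (Group, Size) pair. The following is a minimal sketch of that lookup, reusing the example data from above; the variable names counts, short_small and moderate_counts are only illustrative.

```python
import pandas as pd

data = {
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
}
df = pd.DataFrame(data)

# Option 1: one count per (Group, Size) combination, as a Series.
counts = df.groupby(by=["Group", "Size"]).size()

# Look up a single combination by its index tuple.
short_small = counts.loc[("Short", "Small")]
print(short_small)  # 2

# Select all sizes counted under one Group value.
moderate_counts = counts.loc["Moderate"]
print(moderate_counts)
```

Selecting by the first index level alone, as in the last step, returns a smaller Series indexed only by Size.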

Option 2:

dfg = df.groupby(by=["Group", "Size"]).size().reset_index(name="Time")

Here reset_index(name="Time") turns the index levels back into ordinary columns and stores the counts in a column named "Time":

      Group    Size  Time
0  Moderate  Medium     1
1  Moderate   Small     1
2     Short   Small     2
3      Tall   Large     1
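
Once the counts sit in an ordinary column, the usual DataFrame tools apply to them. As a rough sketch, assuming the goal is to find combinations that occur more than once and to rank all combinations by frequency (the names repeated and ranked are illustrative):

```python
import pandas as pd

data = {
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
}
df = pd.DataFrame(data)

# Option 2: counts stored in a regular column named "Time".
dfg = df.groupby(by=["Group", "Size"]).size().reset_index(name="Time")

# Keep only the combinations that appear more than once.
repeated = dfg[dfg["Time"] > 1]
print(repeated)

# Order the combinations from most to least frequent.
ranked = dfg.sort_values("Time", ascending=False)
print(ranked)
```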

Option 3:

dfg = df.groupby(by=["Group", "Size"], as_index=False).size()

In pandas 1.1 and later, this also produces a DataFrame with the same rows as Option 2, except that the count column is named "size" rather than "Time".
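
The sketch below assumes pandas 1.1 or later, where Option 3 names its count column "size". It renames that column so the result lines up with Option 2, and then cross-checks the counts against DataFrame.value_counts, which is a different technique from the groupby approach described above and is shown only for comparison. The variable names are illustrative.

```python
import pandas as pd

data = {
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
}
df = pd.DataFrame(data)
cols = ["Group", "Size"]

# Option 2 for reference.
opt2 = df.groupby(by=cols).size().reset_index(name="Time")

# Option 3, with the default "size" column renamed to match Option 2.
opt3 = (
    df.groupby(by=cols, as_index=False)
    .size()
    .rename(columns={"size": "Time"})
)
print(opt3)

# Alternative: value_counts counts unique rows over the chosen columns.
# Its result is ordered by count (most frequent first), not by group key.
vc = df.value_counts(subset=cols).reset_index(name="Time")
print(vc)
```

The value_counts result holds the same counts but in a different row order, so sort both tables the same way before comparing them row by row.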
