How to retrieve the first row of each group in a Pandas DataFrame based on multiple columns?

DDD
2024-11-17 09:59:03

Retrieve the First Row of Each Group in a Pandas DataFrame

Question:

How can you efficiently extract the first row of each group from a Pandas DataFrame, where the grouping is defined by multiple columns?

Answer:

To retrieve the first row of each group in a Pandas DataFrame based on multiple columns:

  1. Group the Data: Pass the list of grouping columns to the groupby() method. The names 'id' and 'value' below stand for whichever columns define your groups:

    df_grouped = df.groupby(['id', 'value'])

  2. Apply an Aggregation Function: Call first() on the grouped object. For each group it returns the first non-null value of every column that is not a grouping key, so the DataFrame needs at least one such column:

    df_first_rows = df_grouped.first()

  3. Reset the Index (Optional): first() moves the grouping keys into the index. To get 'id' and 'value' back as ordinary columns, call reset_index():

    df_first_rows = df_first_rows.reset_index()
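
The three steps can be sketched end to end on a small frame. The column names 'region', 'product', and 'sales' and the data are made up for illustration, and the frame deliberately includes a column that is not a grouping key so that first() has something to return.

```python
import pandas as pd

# Hypothetical data: two key columns plus one value column.
sales = pd.DataFrame({
    'region':  ['east', 'east', 'east', 'west', 'west'],
    'product': ['a', 'a', 'b', 'a', 'a'],
    'sales':   [10, 20, 30, 40, 50],
})

# Group by both key columns, take the first value per group,
# then turn the keys back into ordinary columns.
first_rows = sales.groupby(['region', 'product']).first().reset_index()
print(first_rows)
```

Each distinct ('region', 'product') pair contributes one row, and groupby() sorts the result by the keys unless sort=False is passed.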

Example:

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
                   'value': ["first", "second", "second", "first",
                             "second", "first", "third", "fourth",
                             "fifth", "second", "fifth", "first",
                             "first", "second", "third", "fourth", "fifth"]})

Applying the steps above:

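# df has only the 'id' and 'value' columns, so group by 'id' alone;
# grouping by both would leave no column for first() to return.
df_grouped = df.groupby('id')
df_first_rows = df_grouped.first()
df_first_rows = df_first_rows.reset_index()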

print(df_first_rows)

Output:

   id   value
0   1   first
1   2   first
2   3   first
3   4   second
4   5   first
5   6   first
6   7   fourth

The result has one row per 'id', holding the first 'value' that appears for that id in df.
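
Because first() picks the first non-null value column by column, its result can differ from the literal first row when a group contains missing values. The sketch below contrasts it with two alternatives that keep the first row as stored, head(1) and drop_duplicates(). The frame df_nan and its columns are hypothetical and exist only to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row of group 1 has a missing 'score'.
df_nan = pd.DataFrame({
    'id':    [1, 1, 2],
    'name':  ['a', 'b', 'c'],
    'score': [np.nan, 5.0, 7.0],
})

# first(): first non-null value per column, so group 1 combines
# name 'a' from its first row with score 5.0 from its second row.
by_first = df_nan.groupby('id').first().reset_index()

# head(1): the first row of each group exactly as stored, NaN included.
by_head = df_nan.groupby('id').head(1).reset_index(drop=True)

# drop_duplicates keeps the first occurrence of each key by default.
by_dedup = df_nan.drop_duplicates(subset='id').reset_index(drop=True)

print(by_first)
print(by_head)
```

If the data has no missing values, all three approaches return the same rows, so the choice mainly matters when NaN can appear in the first row of a group.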
