How to Sort Data Within Groups in Pandas DataFrames?

Susan Sarandon
2024-10-20 17:27:02

Sorting Within Groups in pandas

When working with pandas dataframes, it is often necessary to group data by specific columns and then perform additional operations within those groups. One common requirement is to sort the rows inside each group by some criterion, for example to find the largest values per group.

To achieve this, the groupby method can be combined with the sort_values method. As an example, consider a dataframe df that has the columns count, job, and source.

In [167]: df

Out[167]:
   count     job source
0      2   sales      A
1      4   sales      B
2      6   sales      C
3      3   sales      D
4      7   sales      E
5      5  market      A
6      3  market      B
7      2  market      C
8      4  market      D
9      1  market      E
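
To follow along, the example frame can be rebuilt from the values printed above. This is a minimal sketch: the construction style is an assumption, and only the column names and values come from the output shown.

```python
import pandas as pd

# Rebuild the example frame from the values printed above.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})
print(df)
```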

Suppose you want to group the data by job and source, sum count within each group, and then sort the results by count in descending order. The aggregation step looks like this:

In [168]: df.groupby(['job','source']).agg({'count':'sum'})
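
As a self-contained sketch of this aggregation step, the snippet below rebuilds the example frame and inspects the result. The variable name agg is chosen here for illustration, and the string alias 'sum' is used because recent pandas versions warn when the Python builtin sum is passed to agg.

```python
import pandas as pd

df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# Sum `count` within each (job, source) pair; the keys become a MultiIndex.
agg = df.groupby(["job", "source"]).agg({"count": "sum"})
print(agg)

# The rows are ordered by the group keys, not by the aggregated values.
print(list(agg.index.names))
```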

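The aggregation creates a new dataframe that contains the summed count values for each (job, source) group, ordered by the group keys rather than by count. Because each job and source pair occurs only once in this example, the sums equal the original count values, so the sort can be shown on df itself with the sort_values method: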

In [34]: df.sort_values(['job','count'],ascending=False)

This sorts the dataframe by job and then by count, both in descending order, so the sales rows come before the market rows. The resulting dataframe looks like this:

Out[34]: 
   count     job source
4      7   sales      E
2      6   sales      C
1      4   sales      B
3      3   sales      D
0      2   sales      A
5      5  market      A
8      4  market      D
6      3  market      B
7      2  market      C
9      1  market      E
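
The same sort can be applied to the aggregated frame, and each sort key can have its own direction. The sketch below is one way to do that, assuming the names totals and ordered (both chosen here for illustration): as_index=False keeps job and source as ordinary columns, and ascending takes one flag per key.

```python
import pandas as pd

df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# as_index=False keeps the group keys as regular columns.
totals = df.groupby(["job", "source"], as_index=False).agg({"count": "sum"})

# job ascending (market before sales), count descending within each job.
ordered = totals.sort_values(["job", "count"], ascending=[True, False])
print(ordered)
```

Passing a list to ascending is useful when the groups should appear in alphabetical order while the values inside each group run from largest to smallest.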

To take the top three rows of each group, you can use the head function:

In [35]: df.sort_values(['job','count'],ascending=False).groupby('job').head(3)

Because the rows are already sorted, head(3) returns the three rows with the largest count in each job group, and the result keeps the sorted order:

Out[35]: 
   count     job source
4      7   sales      E
2      6   sales      C
1      4   sales      B
5      5  market      A
8      4  market      D
6      3  market      B
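
An alternative that skips the explicit sort is to ask each group for its largest values with nlargest. This is a sketch of a different technique from the one shown above, not a replacement for it; the names top and rows are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# Three largest `count` values per job; the result is indexed by
# (job, original row label).
top = df.groupby("job")["count"].nlargest(3)
print(top)

# Use the original row labels (last index level) to recover the full rows.
rows = df.loc[top.index.get_level_values(-1)]
print(rows)
```

Here the groups come out in key order (market, then sales), whereas the sort_values and head approach keeps whatever order the sort produced.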
