Multiple Aggregations on the Same Column with Pandas GroupBy.agg()
In pandas, GroupBy.agg() aggregates grouped data by applying one or more functions to each column. Applying several different functions to the same column can seem to require calling agg() once per function, but a single call is enough.
Traditional (Incorrect) Approach:
The intuitively straightforward approach would be:
df.groupby("dummy").agg({ "returns": f1, "returns": f2 })
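Unfortunately, this does not work. A Python dict literal cannot hold the same key twice: no error is raised, but the later entry silently replaces the earlier one, so only f2 is applied.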
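A quick check in plain Python shows what a dict literal with a repeated key actually holds. The bodies of f1 and f2 here are hypothetical placeholders; only their identity matters for the check.

```python
# Hypothetical stand-ins for the f1 and f2 placeholders.
def f1(x):
    return x.mean()

def f2(x):
    return x.sum()

# The same key appears twice in the literal.
spec = {"returns": f1, "returns": f2}

print(len(spec))               # 1
print(spec["returns"] is f2)   # True: the later entry replaced the earlier one
```

Because the dictionary reaches agg() with a single entry, pandas never sees f1 at all.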
Solution:
agg() accepts a dictionary that maps a column name to either a single function or a list of functions, so pass all the functions for the column as one list:
df.groupby("dummy").agg({ "returns": [f1, f2] })
This returns a DataFrame whose columns form a MultiIndex: the column name on the top level and one entry per aggregation, named after the function, on the second level.
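As a minimal sketch of the list form with custom functions, the snippet below defines two hypothetical aggregations for f1 and f2 (a range and a mean absolute value, chosen only for illustration) and runs them on a small made-up frame.

```python
import pandas as pd

# Hypothetical aggregations standing in for f1 and f2.
def f1(x):
    return x.max() - x.min()   # range of the group

def f2(x):
    return x.abs().mean()      # mean absolute value of the group

# Small illustrative frame; the values are made up.
df = pd.DataFrame({
    "dummy": [1, 1, 2, 2],
    "returns": [0.1, -0.3, 0.2, 0.4],
})

result = df.groupby("dummy").agg({"returns": [f1, f2]})

# Second-level column labels come from each function's __name__.
print(result.columns.tolist())   # [('returns', 'f1'), ('returns', 'f2')]
```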
Example:
Consider the following DataFrame:
import datetime as dt
import numpy as np
import pandas as pd

# Seed NumPy directly; the pd.np alias is gone from current pandas releases.
np.random.seed(0)

df = pd.DataFrame({
    "date": [dt.date(2012, x, 1) for x in range(1, 11)],
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})
To apply both mean and sum to the "returns" column:
df.groupby("dummy").agg({ "returns": ["mean", "sum"] })
This will produce:
        returns
           mean       sum
dummy
1      0.036901  0.369012
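The result has two-level column labels, so individual aggregates are selected with a tuple. The sketch below rebuilds the same result and shows one way to pick a column or flatten the labels; the "returns_mean" style of name is just a convention chosen here, not something pandas produces on its own.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

result = df.groupby("dummy").agg({"returns": ["mean", "sum"]})

# Select one aggregate with a (column, function) tuple.
mean_col = result[("returns", "mean")]

# Flatten the two-level labels into single strings.
flat = result.copy()
flat.columns = ["_".join(col) for col in flat.columns]

print(flat.columns.tolist())   # ['returns_mean', 'returns_sum']
```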