How to Create a New Column in Pandas DataFrame Using Groupby().sum() and .transform()?
Creating a New Column from the Output of Pandas Groupby().sum()
When working with data in Python, it's often necessary to perform calculations and create new columns in a DataFrame based on existing values. In this example, we're looking to create a new column (Data4) containing the sum of Data3 for each Date.
Using .transform()
To achieve this, we can call the .transform() method on the grouped Data3 column. .transform() applies a function to each group and returns a Series with the same index and length as the original DataFrame, so every row receives the result computed for its own group. Because of that alignment, the result can be assigned directly as a new column.
df['Data4'] = df['Data3'].groupby(df['Date']).transform('sum')
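To make the difference from a plain .sum() concrete, here is a minimal sketch on a small made-up frame. The name toy and its values are illustrative assumptions, not part of the example in this article. It shows that .sum() returns one value per group, while .transform('sum') returns one value per original row.

```python
import pandas as pd

# Hypothetical toy data, used only to contrast the two calls.
toy = pd.DataFrame({
    'Date': ['d1', 'd2', 'd1', 'd2'],
    'Data3': [1, 2, 10, 20],
})

grouped = toy['Data3'].groupby(toy['Date'])

# .sum() collapses each group to a single row, indexed by the group key.
per_group = grouped.sum()

# .transform('sum') broadcasts each group's total back onto every row,
# keeping the original index so it can be assigned as a column.
per_row = grouped.transform('sum')

print(per_group)
print(per_row)
```

Since per_group is indexed by Date rather than by row, assigning it directly to a column would not line up with the rows; per_row keeps the original index, so it does.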
Consider the following example DataFrame:
import pandas as pd

df = pd.DataFrame({
    'Date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05',
             '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'],
    'Sym': ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww'],
    'Data2': [11, 8, 10, 15, 110, 60, 100, 40],
    'Data3': [5, 8, 6, 1, 50, 100, 60, 120]
})
Using .transform(), we calculate the sum of Data3 for each Date and assign it to the new column Data4:
df['Data4'] = df['Data3'].groupby(df['Date']).transform('sum')
The resulting DataFrame will have the desired Data4 column:
         Date   Sym  Data2  Data3  Data4
0  2015-05-08  aapl     11      5     55
1  2015-05-07  aapl      8      8    108
2  2015-05-06  aapl     10      6     66
3  2015-05-05  aapl     15      1    121
4  2015-05-08  aaww    110     50     55
5  2015-05-07  aaww     60    100    108
6  2015-05-06  aaww    100     60     66
7  2015-05-05  aaww     40    120    121
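As a cross-check, the sketch below rebuilds the example and compares Data4 against two other ways of getting the same numbers: grouping the DataFrame by column name, and aggregating first and then mapping the totals back by Date. The variable names alt, totals and mapped are illustrative choices, and neither alternative is claimed to be the preferred one.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05',
             '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'],
    'Sym': ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww'],
    'Data2': [11, 8, 10, 15, 110, 60, 100, 40],
    'Data3': [5, 8, 6, 1, 50, 100, 60, 120],
})

# The approach from the article.
df['Data4'] = df['Data3'].groupby(df['Date']).transform('sum')

# Same computation, grouping the frame by column name and then selecting Data3.
alt = df.groupby('Date')['Data3'].transform('sum')

# Two-step alternative: one total per Date, then look each row's Date up.
totals = df.groupby('Date')['Data3'].sum()
mapped = df['Date'].map(totals)

# Both comparisons are expected to print True.
print((df['Data4'] == alt).all())
print((df['Data4'] == mapped).all())
```

The two-step version makes explicit what .transform('sum') does in a single call: compute one total per group, then broadcast it back to the rows of that group.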