
How to Create a New Column in Pandas DataFrame Using Groupby().sum() and .transform()?

Mary-Kate Olsen
2024-12-24 04:43:13


Creating a New Column from the Output of Pandas Groupby().sum()

A common pandas task is to add a column whose values are computed per group from an existing column. In this example, we want a new column, Data4, that holds for every row the sum of Data3 across all rows sharing that row's Date.

Using .transform()

To achieve this, we can call .transform() on the Data3 column grouped by Date. Unlike a plain .sum(), which returns one value per group, .transform() broadcasts each group's result back to every row of that group and returns a Series with the same index as the original DataFrame. This lets us assign the result directly as a new column.

df['Data4'] = df['Data3'].groupby(df['Date']).transform('sum')
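To make the difference between aggregating and transforming concrete, here is a minimal sketch on a tiny three-row frame. The frame toy and its values are hypothetical and exist only to show the two result shapes; they are not part of the example data used in this article.

```python
import pandas as pd

# Hypothetical three-row frame, used only to contrast the two result shapes.
toy = pd.DataFrame({
    'Date': ['d1', 'd2', 'd1'],
    'Data3': [1, 10, 2],
})

grouped = toy['Data3'].groupby(toy['Date'])

# Aggregation: one value per group, indexed by the group key (Date).
per_group = grouped.sum()

# Transformation: one value per original row, indexed like `toy`.
per_row = grouped.transform('sum')

print(per_group.tolist())  # [3, 10]
print(per_row.tolist())    # [3, 10, 3]
```

Because per_row shares its index with toy, it can be assigned as a column without any merge, whereas per_group would first have to be mapped back onto the rows.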

Consider the following example DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05',
             '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'],
    'Sym': ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww'],
    'Data2': [11, 8, 10, 15, 110, 60, 100, 40],
    'Data3': [5, 8, 6, 1, 50, 100, 60, 120]
})

Using .transform(), we calculate the sum of Data3 for each Date and assign it to the new column Data4:

df['Data4'] = df['Data3'].groupby(df['Date']).transform('sum')

The resulting DataFrame will have the desired Data4 column:

         Date   Sym  Data2  Data3  Data4
0  2015-05-08  aapl     11      5     55
1  2015-05-07  aapl      8      8    108
2  2015-05-06  aapl     10      6     66
3  2015-05-05  aapl     15      1    121
4  2015-05-08  aaww    110     50     55
5  2015-05-07  aaww     60    100    108
6  2015-05-06  aaww    100     60     66
7  2015-05-05  aaww     40    120    121
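As a cross-check, the sketch below rebuilds the example DataFrame and compares the form used in this article with two other spellings that should give the same values: grouping the frame by column name, and a manual route that aggregates first and then maps the totals back onto the rows. The variable names a, b, c, and totals are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05',
             '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'],
    'Sym': ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww'],
    'Data2': [11, 8, 10, 15, 110, 60, 100, 40],
    'Data3': [5, 8, 6, 1, 50, 100, 60, 120]
})

# Form used in this article: group the Data3 Series by the Date Series.
a = df['Data3'].groupby(df['Date']).transform('sum')

# Equivalent spelling: group the frame by column name, then select Data3.
b = df.groupby('Date')['Data3'].transform('sum')

# Manual route: one total per Date, then map the totals back onto each row.
totals = df.groupby('Date')['Data3'].sum()
c = df['Date'].map(totals)

print(a.tolist())                # [55, 108, 66, 121, 55, 108, 66, 121]
print(a.equals(b))               # True
print(a.tolist() == c.tolist())  # True
```

The manual route shows what .transform('sum') saves us from writing: the aggregation and the alignment back to the original rows happen in a single call.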

