Home >Backend Development >Python Tutorial >Conditional merge with pandas

Conditional merge with pandas

WBOY
WBOYforward
2024-02-22 13:07:091243browse

Conditional merge with pandas

Question content

I have a pandas dataframe as shown below, which details additional calls to a region:

commsdate area day0 incremental day1 incremental day2 incremental
01/01/24 sales 43 36 29
01/01/24 service 85 74 66
02/01/24 sales 56 42 31
02/01/24 service 73 62 49
03/01/24 sales 48 32 twenty four
03/01/24 service 67 58 46

I am trying to calculate the number of calls received by date, so a sales call received on January 1st will be day0_incremental (43) of that date and January 2nd will be day0 of January 2 plus 1 Day1 on January 1 (36) 56) and January 3 will be day0 on January 3 plus day1 on January 2 plus day2 on January 1 (48 42 29), resulting in the following data frame:

CallDate Sales Service
01/01/24 43 85
02/01/24 92 147
03/01/24 119 195
04/01/24 63 107
05/01/24 twenty four 46

I have successfully created a shell of the data frame for the second table with no values ​​under the range column but don't know what to do next:

df['commsdate'] = pd.to_datetime(df['commsdate'], format='%d/%m/%y')
areaunique = df['area'].unique().tolist()
from datetime import timedelta
calldate = pd.date_range(start=min(df['commsdate']), end=max(df['commsdate'])+timedelta(days=6), freq='d')

data = {area: [] for area in areaunique}

dfnew = pd.dataframe(data)

dfnew['calldate'] = calldate

dfnew = dfnew.melt(id_vars=['calldate'], var_name='area')

dfnew = dfnew.pivot(index='calldate', columns='area', values='value')

dfnew = dfnew.reset_index()

dfnew = dfnew[['calldate'] + areaunique]

I've started writing a for loop, but I've only gotten this far:

for i in range(1,len(areaunique)+1):
    dfnew.columns(i) =

Correct answer


You can dialpivot,shiftandadd

df['commsdate'] = pd.to_datetime(df['commsdate'], dayfirst=true)
tmp = df.pivot(index='commsdate', columns='area')

out = (tmp['day0 incremental']
       .add(tmp['day1 incremental'].shift(freq='1d'), fill_value=0)
       .add(tmp['day2 incremental'].shift(freq='2d'), fill_value=0)
       .reset_index().rename_axis(columns=none)
      )

Alternatively, programmatically use functools.reduce using numbers extracted from the dayx … string:

from functools import reduce
import re

reg = re.compile(r'day(\d+)')

df['commsdate'] = pd.to_datetime(df['commsdate'], dayfirst=true)
tmp = df.pivot(index='commsdate', columns='area')

out = reduce(lambda a,b: a.add(b, fill_value=0),
             (tmp[d].shift(freq=f'{reg.search(d).group(1)}d') for d in
              tmp.columns.get_level_values(0).unique())
            ).reset_index().rename_axis(columns=none)

Output:

CommsDate  Sales  Service
0 2024-01-01   43.0     85.0
1 2024-01-02   92.0    147.0
2 2024-01-03  119.0    195.0
3 2024-01-04   63.0    107.0
4 2024-01-05   24.0     46.0

The above is the detailed content of Conditional merge with pandas. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:stackoverflow.com. If there is any infringement, please contact admin@php.cn delete