Conditional merge with pandas

Reposted by WBOY on 2024-02-22 13:07:09

Question content

I have a pandas dataframe as shown below, which details additional calls to a region:

commsdate	area	day0 incremental	day1 incremental	day2 incremental
01/01/24	sales	43	36	29
01/01/24	service	85	74	66
02/01/24	sales	56	42	31
02/01/24	service	73	62	49
03/01/24	sales	48	32	24
03/01/24	service	67	58	46

I am trying to calculate the number of calls received by date. Sales calls received on 1 January are that date's day0 incremental (43). Calls on 2 January are day0 of 2 January plus day1 of 1 January (56 + 36 = 92). Calls on 3 January are day0 of 3 January plus day1 of 2 January plus day2 of 1 January (48 + 42 + 29 = 119). This should result in the following data frame:

CallDate	Sales	Service
01/01/24	43	85
02/01/24	92	147
03/01/24	119	195
04/01/24	63	107
05/01/24	24	46
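The rule described above can be checked by hand before reaching for pandas. The following is only a sketch of the arithmetic in plain Python, not the asker's code: the `rows` list and the `totals` dictionary are illustrative names, and the values are copied from the first table.

```python
from collections import defaultdict
from datetime import date, timedelta

# Rows from the first table: (comms date, area, [day0, day1, day2] incrementals).
rows = [
    (date(2024, 1, 1), "sales",   [43, 36, 29]),
    (date(2024, 1, 1), "service", [85, 74, 66]),
    (date(2024, 1, 2), "sales",   [56, 42, 31]),
    (date(2024, 1, 2), "service", [73, 62, 49]),
    (date(2024, 1, 3), "sales",   [48, 32, 24]),
    (date(2024, 1, 3), "service", [67, 58, 46]),
]

# A dayN incremental lands N days after its comms date, so each value is
# credited to (comms date + N days) for its area.
totals = defaultdict(int)
for comms_date, area, incrementals in rows:
    for offset, calls in enumerate(incrementals):
        totals[(comms_date + timedelta(days=offset), area)] += calls

for (call_date, area), calls in sorted(totals.items()):
    print(call_date.strftime("%d/%m/%y"), area, calls)
```

Summing this way gives the same figures as the expected table, including the two extra dates (4 and 5 January) that only receive day1 and day2 contributions.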

I have managed to create a shell of the data frame for the second table, with no values under the area columns, but I don't know what to do next:

import pandas as pd
from datetime import timedelta

df['commsdate'] = pd.to_datetime(df['commsdate'], format='%d/%m/%y')
areaunique = df['area'].unique().tolist()
# Call dates run from the first comms date to six days after the last one
calldate = pd.date_range(start=min(df['commsdate']), end=max(df['commsdate'])+timedelta(days=6), freq='D')

# Empty frame with one column per area, then one row per call date
data = {area: [] for area in areaunique}
dfnew = pd.DataFrame(data)
dfnew['calldate'] = calldate

# Round-trip through melt and pivot, then restore the column order
dfnew = dfnew.melt(id_vars=['calldate'], var_name='area')
dfnew = dfnew.pivot(index='calldate', columns='area', values='value')
dfnew = dfnew.reset_index()
dfnew = dfnew[['calldate'] + areaunique]

I've started writing a for loop, but I've only gotten this far:

for i in range(1,len(areaunique)+1):
    dfnew.columns(i) =

Correct answer

You can pivot, shift and add:

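# Parse day-first dates (01/01/24 is 1 January 2024)
df['commsdate'] = pd.to_datetime(df['commsdate'], dayfirst=True)
# One row per comms date; columns become (measure, area) pairs
tmp = df.pivot(index='commsdate', columns='area')

# Move day1 and day2 forward by one and two days, then sum on the aligned dates
out = (tmp['day0 incremental']
       .add(tmp['day1 incremental'].shift(freq='1D'), fill_value=0)
       .add(tmp['day2 incremental'].shift(freq='2D'), fill_value=0)
       .reset_index().rename_axis(columns=None)
      )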
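The key step here is `shift(freq=...)`, which moves the index labels rather than the values. The sketch below illustrates the difference on a single hypothetical series (the sales day0 and day1 figures from the question; the variable names are made up for illustration), and why `add(..., fill_value=0)` is needed once the indexes no longer match.

```python
import pandas as pd

dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"])
day0 = pd.Series([43, 56, 48], index=dates)  # sales, day0 incremental
day1 = pd.Series([36, 42, 32], index=dates)  # sales, day1 incremental

# With freq, the dates move forward one day and every value is kept,
# so the series now covers 2 to 4 January.
shifted_by_freq = day1.shift(freq="1D")

# Without freq, the values slide down within the same dates:
# the first entry becomes NaN and the last value (32) is lost.
shifted_by_period = day1.shift(1)

# add() aligns on the union of both indexes; fill_value=0 stands in for
# the dates that exist on one side only (1 January and 4 January here).
total = day0.add(shifted_by_freq, fill_value=0)
print(total)
```

Only the `freq` form extends the date range past the last comms date, which is what produces the extra rows in the final result.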

Alternatively, do this programmatically with functools.reduce, using the number extracted from each dayX … column name as the shift:

from functools import reduce
import re

# Captures the day offset from names such as 'day1 incremental'
reg = re.compile(r'day(\d+)')

df['commsdate'] = pd.to_datetime(df['commsdate'], dayfirst=True)
tmp = df.pivot(index='commsdate', columns='area')

# Shift each measure by its own offset, then add them all together
out = reduce(lambda a,b: a.add(b, fill_value=0),
             (tmp[d].shift(freq=f'{reg.search(d).group(1)}D') for d in
              tmp.columns.get_level_values(0).unique())
            ).reset_index().rename_axis(columns=None)
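For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the question's table and applies the same reduce idea. Two things are assumptions rather than part of the answer: the DataFrame is typed in by hand from the question, and the offset is passed as `shift(periods=k, freq='D')` instead of being formatted into a frequency string; the `offsets` dictionary is an illustrative helper.

```python
from functools import reduce
import re

import pandas as pd

# Data typed in from the question's first table.
df = pd.DataFrame({
    "commsdate": ["01/01/24", "01/01/24", "02/01/24",
                  "02/01/24", "03/01/24", "03/01/24"],
    "area": ["sales", "service"] * 3,
    "day0 incremental": [43, 85, 56, 73, 48, 67],
    "day1 incremental": [36, 74, 42, 62, 32, 58],
    "day2 incremental": [29, 66, 31, 49, 24, 46],
})
df["commsdate"] = pd.to_datetime(df["commsdate"], format="%d/%m/%y")
tmp = df.pivot(index="commsdate", columns="area")

# Map each measure name to the number of days it is delayed by.
reg = re.compile(r"day(\d+)")
offsets = {name: int(reg.search(name).group(1))
           for name in tmp.columns.get_level_values(0).unique()}

# Shift every measure forward by its offset, then sum them pairwise.
shifted = [tmp[name].shift(periods=k, freq="D") for name, k in offsets.items()]
out = (reduce(lambda a, b: a.add(b, fill_value=0), shifted)
       .reset_index()
       .rename_axis(columns=None))
print(out)
```

Because the offsets come from the column names, the same code should cope with additional dayN columns without changes, provided they follow the same naming pattern.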

Output:

   commsdate  sales  service
0 2024-01-01   43.0     85.0
1 2024-01-02   92.0    147.0
2 2024-01-03  119.0    195.0
3 2024-01-04   63.0    107.0
4 2024-01-05   24.0     46.0


Statement: This article is reproduced from stackoverflow.com.
