Create random datetime column conditional on another datetime column pandas

王林
2024-02-10 09:24:04

Question content

I have a pandas data frame df_sample:

columna columnb
a         aa
a         ab
b         ba
b         bb
b         bc

I have created a random column containing some date objects:

df_sample['contract_starts'] = np.random.choice(pd.date_range('2024-01-01', '2024-05-01'), len(df_sample))

This results in the following output:

columna columnb contract_starts
a         aa     2024-01-21
a         ab     2024-03-03
b         ba     2024-01-18
b         bb     2024-02-18
b         bc     2024-04-03
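
For anyone who wants to reproduce the setup, here is a minimal, self-contained sketch that rebuilds df_sample and the random contract_starts column. The seeded generator (np.random.default_rng(0)) and the integer-position sampling are assumptions added for reproducibility; the question itself uses np.random.choice without a seed, so the dates drawn here will differ from the ones shown above.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df_sample = pd.DataFrame({
    "columna": ["a", "a", "b", "b", "b"],
    "columnb": ["aa", "ab", "ba", "bb", "bc"],
})

# Seeded generator: an assumption, added only so reruns give the same dates.
rng = np.random.default_rng(0)

# Candidate dates: every day from 2024-01-01 to 2024-05-01 inclusive.
candidates = pd.date_range("2024-01-01", "2024-05-01")

# Draw one random position per row (with replacement) and look up the date.
positions = rng.integers(0, len(candidates), size=len(df_sample))
df_sample["contract_starts"] = candidates[positions]

print(df_sample)
```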

How can I create another datetime column, contract_noted, that also falls within a given range of values (e.g. until 2024-05-01) but does not exceed the contract_starts column? For example:

columna columnb contract_starts contract_noted
a         aa     2024-01-21      2024-01-20
a         ab     2024-03-03      2024-01-01
b         ba     2024-01-18      2024-01-13
b         bb     2024-02-18      2024-02-01
b         bc     2024-04-03      2024-03-28

Correct answer


You can subtract a random number of days from the contract_starts column using numpy.random.randint and to_timedelta:

# randint's upper bound is exclusive, so each row is shifted back by 1 to 29 days
df_sample['contract_noted'] = (df_sample['contract_starts'] -
                               pd.to_timedelta(np.random.randint(1, 30, len(df_sample)),
                                               unit='d'))

print(df_sample)
  columna columnb contract_starts contract_noted
0       a      aa      2024-04-18     2024-03-21
1       a      ab      2024-02-12     2024-01-22
2       b      ba      2024-02-21     2024-02-02
3       b      bb      2024-04-12     2024-03-29
4       b      bc      2024-02-10     2024-02-03
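
The fixed-window approach above guarantees that contract_noted is earlier than contract_starts, but it does not enforce any lower bound. The following sketch, which reuses the start dates from the question and assumes a seeded generator for reproducibility, checks both properties. The names rng and offsets are illustrative and do not appear in the answer.

```python
import numpy as np
import pandas as pd

# Start dates taken from the question's example output.
df_sample = pd.DataFrame({
    "contract_starts": pd.to_datetime(
        ["2024-01-21", "2024-03-03", "2024-01-18", "2024-02-18", "2024-04-03"]
    ),
})

rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility

# Integers in [1, 30): each row is shifted back by 1 to 29 days.
offsets = rng.integers(1, 30, size=len(df_sample))
df_sample["contract_noted"] = (
    df_sample["contract_starts"] - pd.to_timedelta(offsets, unit="D")
)

# Every noted date is strictly earlier than its start date.
print((df_sample["contract_noted"] < df_sample["contract_starts"]).all())

# Nothing prevents a noted date from falling before 2024-01-01, though.
print(df_sample["contract_noted"].min())
```

If the noted dates must stay on or after a particular date, the window has to depend on each row's contract_starts value rather than being a constant.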

If the generated dates must also fall between a start date (here 2024-01-01) and contract_starts, generate integers between 1 and the number of days from the start date to contract_starts:

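# whole days between the lower bound 2024-01-01 and each contract start
days = (df_sample['contract_starts'] - pd.Timestamp('2024-01-01')).dt.days
print(days)

# per-row upper bound (exclusive): subtract between 1 and days - 1 days
df_sample['contract_noted'] = (df_sample['contract_starts'] -
                               pd.to_timedelta(np.random.randint(1, days, len(df_sample)),
                                               unit='d'))
print(df_sample)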
  columna columnb contract_starts contract_noted
0       a      aa      2024-02-09     2024-01-09
1       a      ab      2024-04-26     2024-02-23
2       b      ba      2024-04-10     2024-04-06
3       b      bb      2024-01-31     2024-01-07
4       b      bc      2024-01-14     2024-01-08
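
One caveat with the per-row bound: np.random.randint requires low < high for every element, so the call above raises a ValueError if any contract_starts is 2024-01-01 or 2024-01-02 (days of 0 or 1). Because the upper bound is exclusive, it also never produces 2024-01-01 itself. Below is a hedged sketch of one way around both issues, assuming that contract_noted is allowed to equal contract_starts ("does not exceed"). It swaps in numpy's Generator.integers with endpoint=True, which is not what the answer uses; start, rng, and offsets are illustrative names.

```python
import numpy as np
import pandas as pd

start = pd.Timestamp("2024-01-01")  # lower bound used in the answer

# Includes the two edge cases: a start on the lower bound and one day after it.
df_sample = pd.DataFrame({
    "contract_starts": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-18", "2024-02-18", "2024-04-03"]
    ),
})

rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility

# Whole days between the lower bound and each start date (0 for 2024-01-01).
days = (df_sample["contract_starts"] - start).dt.days.to_numpy()

# endpoint=True makes the upper bound inclusive: each offset is in [0, days].
offsets = rng.integers(0, days, endpoint=True)

df_sample["contract_noted"] = (
    df_sample["contract_starts"] - pd.to_timedelta(offsets, unit="D")
)
print(df_sample)
```

With these bounds every contract_noted lies between 2024-01-01 and its row's contract_starts, inclusive. If the noted date must be strictly earlier, rows whose contract_starts equals the lower bound have no valid value and would need to be handled separately.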

Statement:
This article is reproduced from stackoverflow.com. If there is any infringement, please contact admin@php.cn to have it deleted.