I have the following DataFrame:
import pandas as pd
data = {'ID': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
'Time_order': ['2019-01-01 07:00:00', '2019-01-01 07:25:00', '2019-01-02 07:02:00', '2019-01-02 07:27:00', '2019-01-02 06:58:00', '2019-01-03 07:24:00', '2019-01-04 07:03:00', '2019-01-04 07:24:00', '2019-01-05 07:05:00', '2019-01-05 07:30:00', '2019-01-06 07:00:00', '2019-01-06 07:25:00', '2019-01-07 07:02:00', '2019-01-07 07:27:00', '2019-01-08 06:58:00', '2019-01-08 07:24:00', '2019-01-09 07:03:00', '2019-01-09 07:24:00', '2019-01-10 07:05:00', '2019-01-10 07:30:00',
'2019-01-11 17:00:00', '2019-01-11 17:25:00', '2019-01-12 07:02:00', '2019-01-12 07:27:00', '2019-01-13 06:58:00', '2019-01-13 07:24:00', '2019-01-14 07:03:00', '2019-01-14 07:24:00', '2019-01-15 07:05:00', '2019-01-15 07:30:00']}
df = pd.DataFrame(data)
df['Time_order'] = pd.to_datetime(df['Time_order'])
df['hour'] = df['Time_order'].dt.strftime('%H:%M:%S')
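I want to define a window 25 minutes long (time_period = 25 minutes) so that I can check whether any orders fall inside it. For example, for each day I would start at midnight with the window from 00:00:00 to 00:25:00 and count how many orders fall in it, then move forward 5 minutes to the window from 00:05:00 to 00:30:00, and keep scanning through the whole day until 23:59:00. What I want is the number of orders in each window, and then to pick the largest, so that the result tells me which time period has the peak number of orders.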
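To make the windows concrete, here is a minimal sketch that only enumerates them for a single day. The date is an arbitrary placeholder and the names `starts` and `windows` are hypothetical, not part of my real code.

```python
import pandas as pd

# Window length from the description above.
window = pd.Timedelta(minutes=25)

# Window start times for one arbitrary day: 00:00, 00:05, ..., 23:55.
starts = pd.date_range("2019-01-01 00:00", "2019-01-01 23:55", freq="5min")

# Each row is one 25-minute window; the last few windows run past midnight.
windows = pd.DataFrame({"start": starts, "end": starts + window})

print(windows.head(2))
```

That gives 12 window starts per hour, so 288 windows per day.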
I tried the following:
x = 12 * 24  # number of 5-minute steps in a day: 12 per hour times 24 hours
for i in range(x):
    df[f'each{i}_minutes_start'] = pd.to_datetime(df['Time_order']).dt.floor(f'{i}_min')
    df[f'each{i}_minutes_end'] = df[f'each{i}_minutes_start'] + pd.Timedelta(minutes=5)
    df['time_period'] = df[f'each{i}_minutes_start'].dt.strftime('%H:%M:%S') + '-' + pd.to_datetime(df[f'each{i}_minutes_end']).dt.strftime('%H:%M:%S')
At this point I got stuck and can't work out how to continue. Thanks in advance.
I think this will work:
df.set_index('Time_order').resample("5min").count().rolling(6)['ID'].sum()
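A minimal sketch of how the peak window could be read off from this kind of rolling count. It uses a small hypothetical sample (`orders`) rather than the full data from the question, and the variable names are illustrative only. Note that six 5-minute bins span 30 minutes; for the 25-minute window described in the question, five bins would be the matching width, which is what the sketch assumes.

```python
import pandas as pd

# Small hypothetical sample, not the full data from the question.
orders = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1],
    "Time_order": pd.to_datetime([
        "2019-01-01 07:00:00",
        "2019-01-01 07:12:00",
        "2019-01-01 07:24:00",
        "2019-01-01 09:00:00",
        "2019-01-02 07:02:00",
    ]),
})

# Count orders in consecutive 5-minute bins; empty bins get a count of 0.
per_bin = orders.set_index("Time_order")["ID"].resample("5min").count()

# A 25-minute window is 5 bins wide. min_periods=1 keeps the first few
# (incomplete) windows instead of turning them into NaN.
window_counts = per_bin.rolling(5, min_periods=1).sum()

# Each rolling value is labelled by the start of its last bin, so the
# window begins 20 minutes earlier and ends 5 minutes later.
peak_last_bin = window_counts.idxmax()
peak_start = peak_last_bin - pd.Timedelta(minutes=20)
peak_end = peak_last_bin + pd.Timedelta(minutes=5)
peak_count = int(window_counts.max())

print(peak_start, peak_end, peak_count)
```

This scans the whole timeline, so it reports the busiest 25-minute window on a specific date. If the goal is instead the busiest time of day across all dates, the counts would need to be grouped by time of day before taking the maximum.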