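I play a game several times a day and get a score each time. I want to reorganize the data hour by hour and set the missing hours to zero.
Here is the original data: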
import pandas as pd

df = pd.DataFrame({
    'Time': ['2017-01-01 08:45:00', '2017-01-01 09:11:00',
             '2017-01-01 11:40:00', '2017-01-01 14:05:00',
             '2017-01-01 21:00:00'],
    'Score': range(1, 6)})
It looks like this:
   Score                 Time
0      1  2017-01-01 08:45:00
1      2  2017-01-01 09:11:00
2      3  2017-01-01 11:40:00
3      4  2017-01-01 14:05:00
4      5  2017-01-01 21:00:00
How can I get a new DataFrame like this:
day Hour Score
2017-01-01 00:00:00 0
...
2017-01-01 08:00:00 1
2017-01-01 09:00:00 2
2017-01-01 10:00:00 0
2017-01-01 11:00:00 3
2017-01-01 12:00:00 0
2017-01-01 13:00:00 0
2017-01-01 14:00:00 4
2017-01-01 15:00:00 0
...
2017-01-01 21:00:00 5
...
2017-01-01 23:00:00 0
Thanks a lot!
Posted on 2017-07-11 03:53:42
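You can use resample with an aggregate function such as sum, then fillna and convert to int with astype. But first add rows for the first and last DateTime values of the day, so that the hourly range covers 00:00 through 23:00: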
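# Add placeholder rows (Score is NaN) for the first and last hour of the day
df.loc[-1, 'Time'] = '2017-01-01 00:00:00'
df.loc[-2, 'Time'] = '2017-01-01 23:00:00'
df['Time'] = pd.to_datetime(df['Time'])
# Sum scores per hourly bin, fill empty bins with 0, and cast back to int
# ('H' is the hourly alias; newer pandas releases prefer lowercase 'h')
df = df.resample('H', on='Time').sum().fillna(0).astype(int)
print(df)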
Score
Time
2017-01-01 00:00:00 0
2017-01-01 01:00:00 0
2017-01-01 02:00:00 0
2017-01-01 03:00:00 0
2017-01-01 04:00:00 0
2017-01-01 05:00:00 0
2017-01-01 06:00:00 0
2017-01-01 07:00:00 0
2017-01-01 08:00:00 1
2017-01-01 09:00:00 2
2017-01-01 10:00:00 0
2017-01-01 11:00:00 3
2017-01-01 12:00:00 0
2017-01-01 13:00:00 0
2017-01-01 14:00:00 4
2017-01-01 15:00:00 0
2017-01-01 16:00:00 0
2017-01-01 17:00:00 0
2017-01-01 18:00:00 0
2017-01-01 19:00:00 0
2017-01-01 20:00:00 0
2017-01-01 21:00:00 5
2017-01-01 22:00:00 0
2017-01-01 23:00:00 0
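As an alternative to adding placeholder rows, the hourly sums can be reindexed onto an explicit 24-hour range. This is only a minimal sketch, assuming all timestamps fall on a single day; the names `hourly`, `full_day` and `result` are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'Time': ['2017-01-01 08:45:00', '2017-01-01 09:11:00',
             '2017-01-01 11:40:00', '2017-01-01 14:05:00',
             '2017-01-01 21:00:00'],
    'Score': range(1, 6)})
df['Time'] = pd.to_datetime(df['Time'])

# Sum scores per hour; this only spans the first to the last observed hour.
hourly = df.resample('h', on='Time')['Score'].sum()

# Build all 24 hourly slots of the day (assumption: the data covers one day).
day = df['Time'].dt.normalize().min()
full_day = pd.date_range(day, periods=24, freq='h', name='Time')

# Reindex onto the full day; hours without any game get a score of 0.
result = hourly.reindex(full_day, fill_value=0).reset_index()
print(result)
```

Because `fill_value=0` is an integer, the Score column should stay an integer column here, so no fillna or astype step is needed.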
https://stackoverflow.com/questions/45024920