I now have a list of dates and values, but I don't know how to do calculations with the date structure.
It looks like this:
[[datetime.date(2018, 8, 10) 1076.2392505636847]
[datetime.date(2018, 8, 11) 3537.9781979862732]
[datetime.date(2018, 8, 12) 8637.536518161462]
[datetime.date(2018, 8, 13) 15660.768121458246]
[datetime.date(2018, 8, 14) 21087.477911830327]
[datetime.date(2018, 8, 15) 21087.477911830327]
[datetime.date(2018, 8, 16) 15660.768121458246]
[datetime.date(2018, 8, 17) 8637.536518161465]
[datetime.date(2018, 8, 18) 3537.9781979862732]
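[datetime.date(2018, 8, 19) 1076.2392505636856]]
I also know:
startdate = datetime.date(2018, 8, 10)
enddate = datetime.date(2018, 8, 19)
I want to create another list containing "year-month" data, i.e. the sum for each month. In this case that would be just the sum for '2018-8'. If the end date were 2020-8-19, the list would have length 25 (two years and one month).
Can you share some useful functions/methods I might use?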
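To make the target concrete, here is a minimal sketch of the shape described above: one entry per calendar month between the start and end dates, so a range from 2018-8 to 2020-8 yields 25 entries. The helper names `month_keys` and `monthly_totals` are hypothetical, the rows are illustrative rather than the data above, and filling months that have no rows with 0.0 is an assumption about the desired behaviour.

```python
import datetime
from collections import defaultdict


def month_keys(start, end):
    """Yield (year, month) for every calendar month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        # step to the next month, rolling over into January when needed
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def monthly_totals(rows, start, end):
    """Return [['YYYY-M', total], ...] with one entry per month in the range."""
    sums = defaultdict(float)
    for date, value in rows:
        sums[(date.year, date.month)] += value
    # assumption: months without any rows are reported as 0.0
    return [[f"{y}-{m}", sums.get((y, m), 0.0)] for y, m in month_keys(start, end)]


# illustrative rows only, not the data from the question
rows = [
    [datetime.date(2018, 8, 10), 1.5],
    [datetime.date(2018, 8, 11), 2.5],
    [datetime.date(2018, 10, 1), 4.0],
]
result = monthly_totals(rows, datetime.date(2018, 8, 10), datetime.date(2018, 10, 19))
print(result)  # [['2018-8', 4.0], ['2018-9', 0.0], ['2018-10', 4.0]]
```

Keying the sums by a `(year, month)` tuple keeps the comparison in `month_keys` a plain tuple comparison and leaves the string formatting to the final step.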
Answered 2018-08-15 17:12:56
collections.defaultdict
You can use collections.defaultdict for an O(n) solution that does not require sorting.
import datetime
from collections import defaultdict
L = [[datetime.date(2018, 8, 10), 1076.23], [datetime.date(2018, 8, 11), 3537.97],
     [datetime.date(2018, 8, 19), 1076.23], [datetime.date(2018, 9, 10), 5.23],
     [datetime.date(2018, 9, 11), 10.97], [datetime.date(2018, 10, 19), 15.23]]
# accumulate each value under its 'YYYY-MM' key in a single pass
d = defaultdict(int)
for date, val in L:
    d[date.strftime('%Y-%m')] += val
# defaultdict(int,
# {'2018-08': 5690.43,
# '2018-09': 16.20,
# '2018-10': 15.23})
res = list(map(list, d.items()))
print(res)
[['2018-08', 5690.43],
['2018-09', 16.20],
['2018-10', 15.23]]
Pandas
If you are happy to use a third-party library, you can use Pandas:
import pandas as pd
# construct dataframe from list of lists
df = pd.DataFrame(L, columns=['date', 'val'])
# convert to datetime
df['date'] = pd.to_datetime(df['date'])
# perform GroupBy operation over monthly frequency
# (pandas 2.2+ deprecates the 'M' alias in favour of 'ME' for month end)
res = df.set_index('date').groupby(pd.Grouper(freq='M'))['val'].sum().reset_index()
print(res)
        date       val
0 2018-08-31  5690.430
1 2018-09-30    16.200
2 2018-10-31    15.230
Answered 2018-08-15 17:16:14
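You can use min and max to find the start and end times. Then use itertools.groupby to group the entries by month and compute the sum of each group. This relies on the list being sorted by date, as it is here.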
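itertools.groupby only merges consecutive items that share a key, so input that is not in date order splits one month into several groups. A minimal sketch with made-up rows (the helpers `month_of` and `sums_by_month` are hypothetical names, not part of this answer) shows the difference that sorting first makes:

```python
import datetime
from itertools import groupby

# made-up rows, deliberately out of date order
rows = [
    [datetime.date(2018, 9, 1), 1.0],
    [datetime.date(2018, 8, 5), 2.0],
    [datetime.date(2018, 9, 7), 3.0],
]


def month_of(row):
    """Grouping key: (year, month) of the row's date."""
    return row[0].year, row[0].month


def sums_by_month(items):
    """Sum values per run of consecutive rows that share a month."""
    return [
        [key, sum(value for _, value in group)]
        for key, group in groupby(items, key=month_of)
    ]


unsorted_result = sums_by_month(rows)
sorted_result = sums_by_month(sorted(rows, key=month_of))
print(unsorted_result)  # [[(2018, 9), 1.0], [(2018, 8), 2.0], [(2018, 9), 3.0]]
print(sorted_result)    # [[(2018, 8), 2.0], [(2018, 9), 4.0]]
```

The code that follows in this answer works on a list that is already in date order, so it needs no explicit sort.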
import datetime
lst = [[datetime.date(2018, 8, 10), 1076.2392505636847],
[datetime.date(2018, 8, 11), 3537.9781979862732],
[datetime.date(2018, 8, 12), 8637.536518161462],
[datetime.date(2018, 8, 13), 15660.768121458246],
[datetime.date(2018, 8, 14), 21087.477911830327],
[datetime.date(2018, 8, 15), 21087.477911830327],
[datetime.date(2018, 8, 16), 15660.768121458246],
[datetime.date(2018, 8, 17), 8637.536518161465],
[datetime.date(2018, 8, 18), 3537.9781979862732],
[datetime.date(2018, 8, 19), 1076.2392505636856]]
starttime = min(lst)
endtime = max(lst)
from itertools import groupby
from operator import itemgetter
# key each row by the first day of its month, then sum the values in each group
res = [[k.strftime('%Y-%m'), sum(map(itemgetter(1), group))]
       for k, group in groupby(lst, lambda sl: sl[0].replace(day=1))]
print(starttime, endtime)
print(res)
Output
[datetime.date(2018, 8, 10), 1076.2392505636847] [datetime.date(2018, 8, 19), 1076.2392505636856]
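[['2018-08', 99999.99999999999]]
Answered 2018-08-15 17:27:13
With pandas it is more intuitive and easier to understand.
Load the data into a DataFrame:
import datetime
import pandas as pd
df = pd.DataFrame([[datetime.date(2018, 8, 10), 1076.2392505636847],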
[datetime.date(2018, 8, 11), 3537.9781979862732],
[datetime.date(2018, 8, 12), 8637.536518161462],
[datetime.date(2018, 8, 13), 15660.768121458246],
[datetime.date(2018, 8, 14), 21087.477911830327],
[datetime.date(2018, 8, 15), 21087.477911830327],
[datetime.date(2018, 8, 16), 15660.768121458246],
[datetime.date(2018, 8, 17), 8637.536518161465],
[datetime.date(2018, 8, 18), 3537.9781979862732],
[datetime.date(2019, 8, 19), 1076.2392505636856]],
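columns=['Date', 'amount'])
Convert the Date column to datetime:
df.Date = pd.to_datetime(df.Date)
Create an index by year and month (the levels are named so that groupby can refer to them):
df.index = pd.MultiIndex.from_arrays([df.Date.dt.year, df.Date.dt.month], names=['year', 'month'])
Total by year and month:
df.groupby(['year', 'month'])['amount'].sum()
https://stackoverflow.com/questions/51863147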