I want to implement the equivalent of Excel's SUMIFS in Python, but I'm not sure how to go about it.
import datetime
import pandas as pd

data = {'product_id': ['a', 'a', 'b', 'b'],
        'start': [datetime.date(2016, 1, 1), datetime.date(2016, 1, 1), datetime.date(2016, 1, 3), datetime.date(2016, 1, 1)],
        'end': [datetime.date(2016, 1, 4), datetime.date(2016, 1, 3), datetime.date(2016, 1, 4), datetime.date(2016, 1, 3)],
        'store': ['X', 'Y', 'X', 'Y']}
df = pd.DataFrame.from_dict(data)
sales = {'date': [datetime.date(2016, 1, 1), datetime.date(2016, 1, 2), datetime.date(2016, 1, 3), datetime.date(2016, 1, 4),
                  datetime.date(2016, 1, 1), datetime.date(2016, 1, 2), datetime.date(2016, 1, 3), datetime.date(2016, 1, 4),
                  datetime.date(2016, 1, 1), datetime.date(2016, 1, 2), datetime.date(2016, 1, 3), datetime.date(2016, 1, 4),
                  datetime.date(2016, 1, 1), datetime.date(2016, 1, 2), datetime.date(2016, 1, 3), datetime.date(2016, 1, 4)],
         'product': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
         'store': ['X', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y'],
         'sales': [20, 30, 59, 110, 30, 40, 80, 20, 80, 59, 110, 30, 40, 80, 20, 10]}
df2 = pd.DataFrame.from_dict(sales)
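I want to total the sales of each product per store, counting only the dates that fall inside the start/end range defined in df. In Excel I could use the SUMIFS function, but I'm not sure how to do the same in Python. Can anyone help? Thanks!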
Posted on 2021-02-13 09:21:50
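You can merge the date ranges onto the sales data, keep the records whose date falls between start and end (inclusive), and then run a groupby/sum on them.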
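# Attach each sale's start/end range by matching on store and product (note: this rebinds df)
df = df2.merge(df, left_on=['store','product'], right_on=['store', 'product_id'])
# Keep rows whose date lies within [start, end], then sum sales per store and product
df.loc[df['date'].between(df['start'], df['end'])].groupby(['store','product'])['sales'].sum().reset_index(name='total_sales')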
Output
  store product  total_sales
0     X       a          219
1     X       b          100
2     Y       a          249
3     Y       b          140
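For reuse, the same merge/filter/groupby steps could be wrapped in a function. The following is only a sketch under a few assumptions: the function name sum_sales_in_ranges and the variable names ranges and totals are made up for illustration, and the dates are stored as pandas datetime64 values (via pd.to_datetime and pd.date_range) rather than datetime.date objects, which is a substitution and not something the answer requires.

```python
import pandas as pd


def sum_sales_in_ranges(ranges, sales):
    """Hypothetical helper: total sales per (store, product) within each start/end range."""
    # Attach the start/end range to every sale with a matching store and product.
    merged = sales.merge(
        ranges,
        left_on=["store", "product"],
        right_on=["store", "product_id"],
    )
    # between() includes both endpoints by default, like >= and <= in SUMIFS.
    in_range = merged["date"].between(merged["start"], merged["end"])
    return (
        merged.loc[in_range]
        .groupby(["store", "product"])["sales"]
        .sum()
        .reset_index(name="total_sales")
    )


# Same data as in the question, with dates stored as datetime64 (an assumption).
ranges = pd.DataFrame({
    "product_id": ["a", "a", "b", "b"],
    "store": ["X", "Y", "X", "Y"],
    "start": pd.to_datetime(["2016-01-01", "2016-01-01", "2016-01-03", "2016-01-01"]),
    "end": pd.to_datetime(["2016-01-04", "2016-01-03", "2016-01-04", "2016-01-03"]),
})
sales = pd.DataFrame({
    "date": list(pd.date_range("2016-01-01", periods=4)) * 4,
    "product": ["a"] * 4 + ["b"] * 4 + ["a"] * 4 + ["b"] * 4,
    "store": ["X"] * 8 + ["Y"] * 8,
    "sales": [20, 30, 59, 110, 30, 40, 80, 20, 80, 59, 110, 30, 40, 80, 20, 10],
})

totals = sum_sales_in_ranges(ranges, sales)
print(totals)
```

Two things to keep in mind with this approach: the default inner merge drops any store/product pair that has no row in the ranges table, and if one pair has several overlapping ranges, sales inside the overlap are counted once per range.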
https://stackoverflow.com/questions/66187799