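I want to multiply a lookup table (demand), which is given for multiple commodities (here: Water, Elec) and area types (Com, Ind, Res), by a DataFrame (areas), which is a table of areas for these area types.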
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
'Elec':[8,9]}, index=['Com', 'Ind'])
Before:
areas
Com Ind
0 1 4
1 2 5
2 3 6
demand
Elec Water
Com 8 4
Ind 9 3
After:
area_demands
Com Ind
Elec Water Elec Water
0 8 4 36 12
1 16 8 45 15
2 24 12 54 18
My attempt
Verbose and incomplete; it does not work for an arbitrary number of commodities.
areas = pd.DataFrame({'area': areas.stack()})
areas.index.names = ['Edge', 'Type']
both = areas.reset_index(1).join(demand, on='Type')
both['Elec'] = both['Elec'] * both['area']
both['Water'] = both['Water'] * both['area']
del both['area']
# almost there; it must be late, I fail to make 'Type' a hierarchical column...
Almost there:
Type Elec Water
Edge
0 Com 8 4
0 Ind 36 12
1 Com 16 8
1 Ind 45 15
2 Com 24 12
2 Ind 54 18
How can I sensibly multiply the DataFrames areas and demand together?
Posted on 2013-09-02 18:42:17
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
'Elec':[8,9]}, index=['Com', 'Ind'])
def multiply_by_demand(series):
    # Scale each commodity's demand for this area type by the column of areas
    return demand.loc[series.name].apply(lambda x: x*series).stack()
df = areas.apply(multiply_by_demand).unstack(0)
print(df)
yields
Com Ind
Elec Water Elec Water
0 8 4 36 12
1 16 8 45 15
2 24 12 54 18
How it works:
First, look at what happens when we call areas.apply(foo). foo is passed the columns of areas one at a time:
def foo(series):
    print(series)
In [226]: areas.apply(foo)
0 1
1 2
2 3
Name: Com, dtype: int64
0 4
1 5
2 6
Name: Ind, dtype: int64
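As a small sketch of why this matters: each column arrives as a Series whose name attribute is the column label, which is what later lets the function look up the matching row of demand. The lambdas below are illustrative only and are not part of the original answer.

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})

# Each column is passed to the function as a Series; its .name is the label.
names = areas.apply(lambda column: column.name)

# Returning one scalar per column gives a Series indexed by column label.
totals = areas.apply(lambda column: column.sum())

print(names)
print(totals)
```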
So suppose series is one such column:
In [230]: series = areas['Com']
In [231]: series
Out[231]:
0 1
1 2
2 3
Name: Com, dtype: int64
We can multiply the matching row of demand by this Series like so:
In [229]: demand.loc['Com'].apply(lambda x: x*series)
Out[229]:
0 1 2
Elec 8 16 24
Water 4 8 12
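If it helps to see that step spelled out, the same intermediate table can probably be built by hand with a dict comprehension and a transpose. This is only an equivalent sketch, not what the answer's code does; the variable names commodity, factor and scaled are made up for illustration.

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3],
                       'Elec': [8, 9]}, index=['Com', 'Ind'])

series = areas['Com']

# One scaled copy of the area column per commodity, keyed by commodity name.
scaled = pd.DataFrame({commodity: factor * series
                       for commodity, factor in demand.loc['Com'].items()})

# Transpose so commodities are rows and edges are columns, as in the output above.
scaled = scaled.T
print(scaled)
```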
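This has half of the numbers we want, but not in the form we want them. Now, apply needs to return a Series, not a DataFrame. One way to turn a DataFrame into a Series is to use stack. Look at what happens if we stack this DataFrame. The columns become a new level of the index:
In [232]: demand.loc['Com'].apply(lambda x: x*areas['Com']).stack()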
Out[232]:
Elec 0 8
1 16
2 24
Water 0 4
1 8
2 12
dtype: int64
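To see stack in isolation, here is a minimal sketch on a hand-built copy of the table above (the name table is just for illustration). The row labels stay as the outer index level and the old column labels become the inner one.

```python
import pandas as pd

# Hand-built copy of the commodity-by-edge table shown earlier.
table = pd.DataFrame([[8, 16, 24], [4, 8, 12]],
                     index=['Elec', 'Water'], columns=[0, 1, 2])

# stack() folds the columns into an inner index level, giving a Series.
stacked = table.stack()
print(stacked)

# Individual values are addressed by (row label, former column label).
print(stacked.loc[('Elec', 1)])
```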
So, using this as the return value of multiply_by_demand, we get:
In [235]: areas.apply(multiply_by_demand)
Out[235]:
Com Ind
Elec 0 8 36
1 16 45
2 24 54
Water 0 4 12
1 8 15
2 12 18
Now, we want the first level of the index to become columns. This can be done with unstack:
In [236]: areas.apply(multiply_by_demand).unstack(0)
Out[236]:
Com Ind
Elec Water Elec Water
0 8 4 36 12
1 16 8 45 15
2 24 12 54 18
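The argument to unstack picks which index level moves into the columns. The sketch below rebuilds the intermediate frame by hand (the names intermediate, by_commodity and by_edge are illustrative) and compares level 0 with level 1.

```python
import pandas as pd

# Hand-built copy of the result of areas.apply(multiply_by_demand).
index = pd.MultiIndex.from_product([['Elec', 'Water'], [0, 1, 2]])
intermediate = pd.DataFrame({'Com': [8, 16, 24, 4, 8, 12],
                             'Ind': [36, 45, 54, 12, 15, 18]}, index=index)

# Level 0 (the commodity) moves under each area type; edges stay as rows.
by_commodity = intermediate.unstack(0)

# Level 1 (the edge) moves instead; commodities stay as rows.
by_edge = intermediate.unstack(1)

print(by_commodity)
print(by_edge)
```

Only unstack(0) gives the layout asked for in the question, with one row per edge.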
As requested in the comments, here is the pivot_table solution:
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
'Elec':[8,9]}, index=['Com', 'Ind'])
areas = pd.DataFrame({'area': areas.stack()})
areas.index.names = ['Edge', 'Type']
both = areas.reset_index(1).join(demand, on='Type')
both['Elec'] = both['Elec'] * both['area']
both['Water'] = both['Water'] * both['area']
both.reset_index(inplace=True)
both = both.pivot_table(values=['Elec', 'Water'], index='Edge', columns='Type')
both = both.reorder_levels([1,0], axis=1)
both = both.reindex(columns=both.columns[[0,2,1,3]])
print(both)
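The code above still names Elec and Water explicitly. A possible variant that should work for any number of commodities keeps the same long-format idea but multiplies all commodity columns at once. This is a sketch under the assumption that every column of areas has a matching row in demand; the names area, per_type, long_form and wide are invented here.

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3],
                       'Elec': [8, 9]}, index=['Com', 'Ind'])

# Long format: one area value per (Edge, Type) pair.
area = areas.stack()
area.index.names = ['Edge', 'Type']

# Look up the demand row for each pair's Type and align it to the long index.
per_type = demand.loc[area.index.get_level_values('Type')]
per_type.index = area.index

# Multiply every commodity column by the area in a single step.
long_form = per_type.mul(area, axis=0)

# Move Type into the columns, make it the outer level, and sort the labels.
wide = long_form.unstack('Type').swaplevel(axis=1).sort_index(axis=1)
print(wide)
```

Because no commodity name appears in the code, adding a column to demand should simply add a column under each area type.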
https://stackoverflow.com/questions/18578686