Pandas: join with outer product

Stack Overflow user

Asked 2013-09-02 18:05:45

Answers: 1 | Views: 1.2K | Followers: 0 | Votes: 6

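I want to multiply a lookup table (demand), given for multiple commodities (here: Water, Elec) and area types (Com, Ind, Res), with a DataFrame (areas) that is a table of areas for these area types.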

import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3], 
                       'Elec':[8,9]}, index=['Com', 'Ind'])

Before:

areas
   Com  Ind
0    1    4
1    2    5
2    3    6

demand
     Elec  Water
Com     8      4
Ind     9      3

After:

area_demands                  
     Com          Ind         
     Elec  Water  Elec  Water 
0       8      4    36     12 
1      16      8    45     15 
2      24     12    54     18

My attempt

Verbose and incomplete; it does not work for an arbitrary number of commodities.

areas = pd.DataFrame({'area': areas.stack()})
areas.index.names = ['Edge', 'Type']
both = areas.reset_index(1).join(demand, on='Type')
both['Elec'] = both['Elec'] * both['area']
both['Water'] = both['Water'] * both['area']
del both['area']
# almost there; it must be late, I fail to make 'Type' a hierarchical column...

Almost there:

     Type  Elec  Water
Edge
0     Com     8      4
0     Ind    36     12
1     Com    16      8
1     Ind    45     15
2     Com    24     12
2     Ind    54     18

How can I join/multiply the DataFrames areas and demand together in a decent way?

pandas

python

1 Answer

Stack Overflow user

Accepted answer

Answered 2013-09-02 18:42:17

import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3], 
                       'Elec':[8,9]}, index=['Com', 'Ind'])

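def multiply_by_demand(series):
    # series is one column of areas; series.name is its area type ('Com' or 'Ind').
    # .loc replaces the .ix indexer, which was removed in pandas 1.0.
    return demand.loc[series.name].apply(lambda x: x*series).stack()
# Move the commodity level of the row index into the columns.
df = areas.apply(multiply_by_demand).unstack(0)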
print(df)

yields

    Com          Ind       
   Elec  Water  Elec  Water
0     8      4    36     12
1    16      8    45     15
2    24     12    54     18

How it works:

First, look at what happens when we call areas.apply(foo). foo is passed the columns of areas, one at a time:

def foo(series):
    print(series)

In [226]: areas.apply(foo)
0    1
1    2
2    3
Name: Com, dtype: int64
0    4
1    5
2    6
Name: Ind, dtype: int64
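
The column-by-column behaviour shown above can also be checked without printing. The following is a minimal sketch; the `seen` dict and the `record` function are hypothetical names introduced only for this illustration.

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})

seen = {}  # hypothetical helper: column label -> values received

def record(series):
    # Each call receives one column of `areas` as a Series;
    # series.name holds that column's label.
    seen[series.name] = series.tolist()
    return series.sum()

totals = areas.apply(record)
print(seen)
print(totals)
```

Because `record` returns a scalar, `apply` collects the results into a Series indexed by column label. Returning a Series instead, as multiply_by_demand does, produces a DataFrame.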

So suppose series is one such column:

In [230]: series = areas['Com']

In [231]: series
Out[231]: 
0    1
1    2
2    3
Name: Com, dtype: int64

We can then multiply the demand by this Series like so:

In [229]: demand.loc['Com'].apply(lambda x: x*series)
Out[229]: 
       0   1   2
Elec   8  16  24
Water  4   8  12

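This has half the numbers we want, but not in the form we want. Now, apply needs to return a Series, not a DataFrame. One way to turn a DataFrame into a Series is to use stack. Look at what happens if we stack this DataFrame. The columns become a new level of the index:

In [232]: demand.loc['Com'].apply(lambda x: x*areas['Com']).stack()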
Out[232]: 
Elec   0     8
       1    16
       2    24
Water  0     4
       1     8
       2    12
dtype: int64
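
To see the effect of stack in isolation, here is a small sketch that builds the intermediate table by hand (the name `scaled` is an assumption made for this example) and inspects the stacked result.

```python
import pandas as pd

# The intermediate result from the step above, written out by hand.
scaled = pd.DataFrame([[8, 16, 24], [4, 8, 12]],
                      index=['Elec', 'Water'], columns=[0, 1, 2])

# stack() moves the column labels (0, 1, 2) into a new inner index level,
# so the DataFrame becomes a Series with a two-level MultiIndex.
stacked = scaled.stack()

print(stacked.index.nlevels)
print(stacked[('Water', 2)])
```

Each value is now addressed by a (commodity, row) pair, which is what lets apply line up the results for the Com and Ind columns side by side.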

So, using this as the return value of multiply_by_demand, we get:

In [235]: areas.apply(multiply_by_demand)
Out[235]: 
         Com  Ind
Elec  0    8   36
      1   16   45
      2   24   54
Water 0    4   12
      1    8   15
      2   12   18

Now we want the first level of the index to become columns. This can be done with unstack:

In [236]: areas.apply(multiply_by_demand).unstack(0)
Out[236]: 
    Com          Ind       
   Elec  Water  Elec  Water
0     8      4    36     12
1    16      8    45     15
2    24     12    54     18
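
As a cross-check on these numbers, the same table can be built another way. This is not the method used in this answer but an alternative sketch: for each area type it takes the outer product of the area column and the matching demand row with `numpy.outer`, then concatenates the pieces side by side. The names `parts` and `result` are assumptions made for this example.

```python
import numpy as np
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3],
                       'Elec': [8, 9]}, index=['Com', 'Ind'])

# One block per area type: rows follow areas.index, columns follow the commodities.
parts = {
    area_type: pd.DataFrame(np.outer(areas[area_type], demand.loc[area_type]),
                            index=areas.index, columns=demand.columns)
    for area_type in areas.columns
}

# The dict keys become the outer level of the column MultiIndex.
result = pd.concat(parts, axis=1)
print(result)
```

Nothing here names a commodity explicitly, so this sketch should carry over to any number of columns in demand. The column order within each area type follows demand.columns and may differ from the output above.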

As requested in the comments, here is the pivot_table solution:

import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3], 
                       'Elec':[8,9]}, index=['Com', 'Ind'])

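# Reshape areas to long form: one row per (Edge, Type) pair.
areas = pd.DataFrame({'area': areas.stack()})
areas.index.names = ['Edge', 'Type']
both = areas.reset_index(1).join(demand, on='Type')
both['Elec'] = both['Elec'] * both['area']
both['Water'] = both['Water'] * both['area']
both.reset_index(inplace=True)
# index= and columns= replace the rows= and cols= keywords of early pandas releases.
both = both.pivot_table(values=['Elec', 'Water'], index='Edge', columns='Type')
# Put the area type on the outer column level, then group the columns by type.
both = both.reorder_levels([1,0], axis=1)
both = both.reindex(columns=both.columns[[0,2,1,3]])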
print(both)
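
The question asked for something that works with an arbitrary number of commodities, and the listing above still names Elec and Water explicitly. A possible generalisation, offered as a sketch rather than as part of the original answer, multiplies all commodity columns at once with `DataFrame.mul`. The names `commodities`, `tidy` and `wide` are assumptions made for this example.

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3],
                       'Elec': [8, 9]}, index=['Com', 'Ind'])

commodities = list(demand.columns)  # whatever commodities demand happens to hold

# Long form: one row per (Edge, Type) pair with its area.
tidy = areas.stack().rename('area').reset_index()
tidy.columns = ['Edge', 'Type', 'area']
tidy = tidy.join(demand, on='Type')

# Multiply every commodity column by the area, row by row.
tidy[commodities] = tidy[commodities].mul(tidy['area'], axis=0)

# Back to wide form, with the area type as the outer column level.
wide = tidy.pivot_table(values=commodities, index='Edge', columns='Type')
wide = wide.swaplevel(0, 1, axis=1).sort_index(axis=1)
print(wide)
```

Sorting the columns replaces the hard-coded `[[0,2,1,3]]` reordering, at the cost of always listing commodities alphabetically within each area type. As in the listing above, pivot_table aggregates with the mean by default, so the values come back as floats.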

Votes: 5

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/18578686
