Before marking this as a duplicate, note that I have already looked at the following: question1 question2 source3
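For each farmer, I am trying to calculate two things, based on the Ripe indicator (1 = ripe, 0 = not ripe):
1) the share of the farmer's ripe fruit that is fruit x: %(ripe fruit x)/(total ripe fruit)
2) the percentage of fruit x that is ripe: %(ripe fruit x)/(total fruit x)
Input: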
import pandas as pd

df = pd.DataFrame({'Farmer': ['Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Sallys','Tims','Tims','Tims','Tims'],
'Fruit':['Apple','Apple','Apple','Grape','Grape','Grape','Grape','Cherry','Cherry','Cherry','Cherry','Cherry','Cherry','Cherry','Cherry'],
'Type': ['Red','Yellow','Green','Red seedless','Red with seeds','Green','Purple','Montmorency','Morello','Bing','Rainer','Montmorency','Morello','Bing','Rainer'],
'Number':[2,6,2,1,1,6,2,3,1,3,3,3,1,3,3],
'Ripe':[1,1,0,1,0,1,1,0,0,0,1,0,0,0,1]})
df
Farmer Fruit Number Ripe Type
0 Sallys Apple 2 1 Red
1 Sallys Apple 6 1 Yellow
2 Sallys Apple 2 0 Green
3 Sallys Grape 1 1 Red seedless
4 Sallys Grape 1 0 Red with seeds
5 Sallys Grape 6 1 Green
6 Sallys Grape 2 1 Purple
7 Sallys Cherry 3 0 Montmorency
8 Sallys Cherry 1 0 Morello
9 Sallys Cherry 3 0 Bing
10 Sallys Cherry 3 1 Rainer
11 Tims Cherry 3 0 Montmorency
12 Tims Cherry 1 0 Morello
13 Tims Cherry 3 0 Bing
14 Tims Cherry 3 1 Rainer
Desired output:
Farmer Fruit %(ripe fruit x)/(total ripe fruit) %(ripe fruit x)/(total fruit x)
0 Sallys Apple 40 80
1 Sallys Grape 45 90
2 Sallys Cherry 15 30
3 Tims Cherry 100 30
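To make the two definitions concrete, here is a small hand check for Sallys' apples. The counts are read straight from the input table above; the variable names are only illustrative.

```python
# Worked check for Sallys' apples, using the counts from the input table.
ripe_apples = 2 + 6          # Red and Yellow apples have Ripe == 1
all_apples = 2 + 6 + 2       # Green apples have Ripe == 0
ripe_all_fruit = 8 + 9 + 3   # Sallys' ripe apples + ripe grapes + ripe cherries

# 1) share of Sallys' ripe fruit that is apples
share_of_ripe = 100 * ripe_apples / ripe_all_fruit
# 2) percentage of Sallys' apples that are ripe
ripe_rate = 100 * ripe_apples / all_apples

print(share_of_ripe, ripe_rate)  # 40.0 80.0
```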
Posted on 2018-10-17 15:40:09
First aggregate with sum and reshape with unstack, then divide by the relevant sums with div:
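# Sum Number per Farmer/Fruit/Ripe, then pivot Ripe (0/1) into columns
df1 = df.groupby(['Farmer','Fruit','Ripe'], sort=False)['Number'].sum().unstack()
# Ripe count of each fruit divided by the farmer's total ripe count
# (Series.sum(level=0) was removed in pandas 2.0, so group by the index level instead)
a = df1[1].div(df1[1].groupby(level=0).transform('sum')).mul(100)
# Ripe count of each fruit divided by that fruit's total count
b = df1[1].div(df1.sum(axis=1)).mul(100)
keys = ('%(ripe fruit x)/(total ripe fruit)','%(ripe fruit x)/(total fruit x)')
df2 = pd.concat([a,b], axis=1, keys=keys).reset_index()
print(df2)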
Farmer Fruit %(ripe fruit x)/(total ripe fruit) \
0 Sallys Apple 40.0
1 Sallys Grape 45.0
2 Sallys Cherry 15.0
3 Tims Cherry 100.0
%(ripe fruit x)/(total fruit x)
0 80.0
1 90.0
2 30.0
3 30.0
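As a variation on the answer above, the same percentages can be sketched without unstack by multiplying Number by Ripe and using groupby with transform. This is only one possible approach, not the answer's method; the column names RipeNumber, pct_of_ripe and pct_ripe are made up for illustration. One difference in behavior: a fruit with no ripe rows gets 0 here instead of NaN, although a farmer with no ripe fruit at all would still produce NaN in the first percentage.

```python
import pandas as pd

df = pd.DataFrame({
    'Farmer': ['Sallys'] * 11 + ['Tims'] * 4,
    'Fruit': ['Apple'] * 3 + ['Grape'] * 4 + ['Cherry'] * 8,
    'Number': [2, 6, 2, 1, 1, 6, 2, 3, 1, 3, 3, 3, 1, 3, 3],
    'Ripe': [1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1],
})

# Ripe count per row (0 for unripe rows); RipeNumber is a hypothetical helper column.
df['RipeNumber'] = df['Number'] * df['Ripe']

# Total and ripe counts per (Farmer, Fruit), in order of first appearance.
totals = (df.groupby(['Farmer', 'Fruit'], sort=False)[['Number', 'RipeNumber']]
            .sum()
            .reset_index())

# Each farmer's total ripe count, broadcast back onto every (Farmer, Fruit) row.
ripe_per_farmer = totals.groupby('Farmer')['RipeNumber'].transform('sum')

totals['pct_of_ripe'] = 100 * totals['RipeNumber'] / ripe_per_farmer
totals['pct_ripe'] = 100 * totals['RipeNumber'] / totals['Number']

result = totals[['Farmer', 'Fruit', 'pct_of_ripe', 'pct_ripe']]
print(result)
```

Because transform returns a Series aligned with totals, the division needs no index alignment across MultiIndex levels.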
https://stackoverflow.com/questions/52848336