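I have a sample DataFrame, shown below. I am using groupby in Python to group the rows in which the four columns col, alpha, lambda and n_fold are equal, then take the sum of the count column within each group and compute something like (score*count)/sum(count).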
df =
col fold alpha lambda score n_fold count
0 0.5 0 0 1 -0.424915241 1 3966
1 0.5 1 0 1 -1.669508557 1 10182
2 0.5 2 0 1 -0.157958626 1 17048
3 0.75 0 0 1 -0.459086614 1 3966
4 0.75 1 0 1 -1.830245577 1 10182
5 0.75 2 0 1 -0.173278918 1 17048
6 1 0 0 1 -0.442985033 2 3966
7 1 1 0 1 -1.886578419 2 10182
8 1 2 0 1 -0.18286539 2 17048
Expected output:
   col   alpha  lambda  fold  final
0  0.5   0      1       1     -0.685249027
1  0.75  0      1       1     -0.750428163
2  1     0      1       2     -0.772006323
I have tried the code below, but it does not work. Is there any way to solve this?
Code: df2 = (df.groupby(['sample', 'alpha', 'lambda', 'n_fold']).apply(lambda x: (x.score*x.count)/sum(count)).to_frame('final'))
Posted on 2018-07-25 09:00:59
IIUC, you want the count-weighted mean of score within each group:
df.groupby(['col', 'alpha', 'lambda', 'n_fold']).apply(lambda x: sum((x['score']*x['count']))/sum(x['count']))
Out[352]:
col alpha lambda n_fold
0.50 0 1 1 -0.685249
0.75 0 1 1 -0.750428
1.00 0 1 2 -0.772006
dtype: float64
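For reference, the same count-weighted mean can probably also be computed without apply. The sketch below rebuilds the sample frame from the question, sums score*count and count per group, and divides the two. The names keys, tmp, sums and the helper column weighted are made up for illustration. Note that the count column has to be accessed as df["count"], because attribute access (x.count) resolves to the DataFrame.count method rather than the column, which is one reason the attempt in the question fails.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "col": [0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 1.0, 1.0, 1.0],
    "fold": [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "alpha": [0] * 9,
    "lambda": [1] * 9,
    "score": [-0.424915241, -1.669508557, -0.157958626,
              -0.459086614, -1.830245577, -0.173278918,
              -0.442985033, -1.886578419, -0.18286539],
    "n_fold": [1, 1, 1, 1, 1, 1, 2, 2, 2],
    "count": [3966, 10182, 17048] * 3,
})

keys = ["col", "alpha", "lambda", "n_fold"]

# "weighted" is a hypothetical helper column holding score * count per row
tmp = df.assign(weighted=df["score"] * df["count"])

# Per-group sums of the weighted scores and of the counts
sums = tmp.groupby(keys)[["weighted", "count"]].sum()

# Weighted mean = sum(score * count) / sum(count), returned as a flat frame
final = (sums["weighted"] / sums["count"]).rename("final").reset_index()
print(final)
```

The reset_index() call turns the group keys back into ordinary columns, which is closer to the layout asked for in the question than the MultiIndex Series above.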
https://stackoverflow.com/questions/51509371