Q: Combining rows into columns - Pandas

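Stack Overflow user

Asked 2022-09-22 00:36:25

Answers: 2, Views: 64, Followers: 0, Votes: -2

Given the following data, I need to compute every possible ratio (coefficient) between pairs of rows within each ID group of the dataset: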

ID Country_code  V1   V2
1  US            0.4  1
1  GB            0.6  2
1  AU            0.4  3
2  US            0.5  2
2  CL            0.4  2

I need this as the output:

ID Country_code  coefV1   coefV2
1  US-GB         0.66     0.5
1  US-AU         1        0.33
1  GB-AU         1.5      0.66
2  US-CL         1.25     1

My idea was to expand the dataframe first, into something like:

ID Country_code  V1-1   V1-2   V2-1   V2-2
1  US-GB         0.4    0.6    1      2
1  US-AU         0.4    0.4    1      3
1  GB-AU         0.6    0.4    2      3
2  US-CL         0.5    0.4    2      2

But I could not manage that either.

Any ideas? Thanks!

Tags: python, pandas, numpy

2 Answers

Stack Overflow user

Accepted answer

Posted 2022-09-23 21:50:17

Although #bbis's answer does the job (kudos, and thanks!), I ended up doing the following:

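import itertools as it
import pandas as pd

def combinateRows(df):
    # Row labels of every unordered pair of rows in this group.
    a, b = map(list, zip(*it.combinations(df.index, 2)))
    # Place the first and second row of each pair side by side under the keys 'a' and 'b'.
    d = pd.concat([df.loc[a].reset_index(), df.loc[b].reset_index()], keys=['a', 'b'], axis=1)
    # The original snippet indexed by 'affiliate_id' (a column of the author's real data); 'Country_code' is assumed here to match the sample data.
    return d.set_index([('a', 'Country_code'), ('b', 'Country_code')]).rename_axis(['a', 'b'])

df = df.groupby('ID', as_index=False).apply(combinateRows)

# The columns are a MultiIndex, so select them with (key, name) tuples.
df['coefV1'] = df[('a', 'V1')] / df[('b', 'V1')]
df['coefV2'] = df[('a', 'V2')] / df[('b', 'V2')]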

Strongly inspired by "Pandas: all possible combinations of rows".

The advantage of this approach is that it avoids explicit loops over the rows.
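The key step in the function above is the `zip(*it.combinations(...))` line, which turns the list of row pairs into two parallel lists of row labels. A minimal sketch of what it produces, assuming the three rows of the ID 1 group carry the labels 0, 1 and 2 (the names `index` and `pairs` are only illustrative):

```python
import itertools as it

# Assumed row labels of the ID == 1 group in the sample data.
index = [0, 1, 2]

# Every unordered pair of labels, in row order.
pairs = list(it.combinations(index, 2))

# Transpose the pairs: `a` holds the first label of each pair, `b` the second.
a, b = map(list, zip(*pairs))

print(pairs)  # [(0, 1), (0, 2), (1, 2)]
print(a, b)   # [0, 0, 1] [1, 2, 2]
```

Selecting `df.loc[a]` and `df.loc[b]` then yields two frames whose rows line up pair by pair, which is why a single column-wise division is enough afterwards. Note that a group with only one row produces no pairs, so the unpacking into `a, b` would fail for it.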

Votes: 0

Stack Overflow user

Posted 2022-09-22 01:38:28

You can try the following:

import pandas as pd

df = pd.DataFrame({
    'ID': [1, 1, 1, 2, 2],
    'Country': ['US', 'GB', 'AU', 'US', 'CL'],
    'V1': [0.4, 0.6, 0.4, 0.5, 0.4],
    'V2': [1, 2, 3, 2, 2]
})

def f(df):
    dfs = []
    for c in ['V1', 'V2']:
        # Entry [i, j] is value[j] / value[i]: rows are denominators, columns are numerators.
        d = pd.DataFrame(df[c].values / df[c].values[:, None],
                         index=df['Country'],
                         columns=df['Country'])
        d.columns.name = 'Country2'
        # Long format: one row per (Country2, Country) pair, with the ratio in column 0.
        d = d.unstack().reset_index()
        # Keep each unordered pair once (alphabetical comparison); copy to avoid SettingWithCopyWarning.
        d = d[d['Country'] < d['Country2']].copy()
        d['County Pair'] = d['Country2'] + "/" + d['Country']
        d = d[['County Pair', 0]]
        d = d.set_index('County Pair')
        d.columns = ['Q' + c]
        dfs.append(d)
    return pd.concat(dfs, axis=1)

print(df.groupby(by='ID').apply(f))

This gives:

                     QV1       QV2
ID County Pair                    
1  US/GB        0.666667  0.500000
   US/AU        1.000000  0.333333
   GB/AU        1.500000  0.666667
2  US/CL        1.250000  1.000000
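The ratio table in this answer comes from NumPy broadcasting: dividing a vector by the same vector reshaped into a column gives every pairwise quotient at once. A minimal sketch on the V1 values of the ID 1 group; the names `v`, `ratios`, `first`, `second` and `coef` are illustrative, and picking the pairs with `np.triu_indices` is a swapped-in alternative to the alphabetical filter used above, not what the answer itself does:

```python
import numpy as np

# V1 values of the ID == 1 group, in row order US, GB, AU.
v = np.array([0.4, 0.6, 0.4])

# Broadcasting a (3,) vector against a (3, 1) column gives a (3, 3) matrix
# where ratios[i, j] == v[j] / v[i].
ratios = v / v[:, None]

# Positions (first, second) of every unordered pair, in row order.
first, second = np.triu_indices(len(v), k=1)

# Quotient "first / second" for each pair: US/GB, US/AU, GB/AU.
coef = ratios[second, first]
print(coef)
```

Selecting pairs by position keeps them in the order the rows appear, whereas the string comparison in the answer decides the direction of each quotient alphabetically; for the sample data both give the same pairs.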

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/73808201
