
Combining columns into strings after a groupby aggregation in a Pandas DataFrame

Stack Overflow user
Asked 2021-08-31 06:08:33
Answers: 2 | Views: 41 | Followers: 0 | Votes: 1

I have a DataFrame like this, produced by some statistical modelling experiments.

    import pandas as pd

    data = {
        "cat1": {
            (1, "class1", "metric1"): 0.9520103335380554,
            (1, "class1", "metric2"): 0.9596380591392517,
            (1, "class2", "metric1"): 0.9013115167617798,
            (1, "class2", "metric2"): 0.9917504191398621,
            (1, "class3", "metric1"): 0.9027230143547058,
            (1, "class3", "metric2"): 0.8536863327026367,
            (2, "class1", "metric1"): 0.8746241331100464,
            (2, "class1", "metric2"): 0.8844705820083618,
            (2, "class2", "metric1"): 0.7890198826789856,
            (2, "class2", "metric2"): 0.6964980363845825,
            (2, "class3", "metric1"): 0.9410034418106079,
            (2, "class3", "metric2"): 0.9601017236709595,
            (3, "class1", "metric1"): 0.9640659689903259,
            (3, "class1", "metric2"): 0.9766426682472229,
            (3, "class2", "metric1"): 0.893884003162384,
            (3, "class2", "metric2"): 0.9959416389465332,
            (3, "class3", "metric1"): 0.9533607363700867,
            (3, "class3", "metric2"): 0.9378591179847717,
        },
        "cat2": {
            (1, "class1", "metric1"): 0.9520103335380554,
            (1, "class1", "metric2"): 0.9596380591392517,
            (1, "class2", "metric1"): 0.9013115167617798,
            (1, "class2", "metric2"): 0.9917504191398621,
            (1, "class3", "metric1"): 0.9027230143547058,
            (1, "class3", "metric2"): 0.8536863327026367,
            (2, "class1", "metric1"): 0.8746241331100464,
            (2, "class1", "metric2"): 0.8844705820083618,
            (2, "class2", "metric1"): 0.7890198826789856,
            (2, "class2", "metric2"): 0.6964980363845825,
            (2, "class3", "metric1"): 0.9410034418106079,
            (2, "class3", "metric2"): 0.9601017236709595,
            (3, "class1", "metric1"): 0.9640659689903259,
            (3, "class1", "metric2"): 0.9766426682472229,
            (3, "class2", "metric1"): 0.893884003162384,
            (3, "class2", "metric2"): 0.9959416389465332,
            (3, "class3", "metric1"): 0.9533607363700867,
            (3, "class3", "metric2"): 0.9378591179847717,
        },
        "cat3": {
            (1, "class1", "metric1"): 0.8746241331100464,
            (1, "class1", "metric2"): 0.8844705820083618,
            (1, "class2", "metric1"): 0.7890198826789856,
            (1, "class2", "metric2"): 0.6964980363845825,
            (1, "class3", "metric1"): 0.9410034418106079,
            (1, "class3", "metric2"): 0.9601017236709595,
            (2, "class1", "metric1"): 0.9309893846511841,
            (2, "class1", "metric2"): 0.884644627571106,
            (2, "class2", "metric1"): 0.861851155757904,
            (2, "class2", "metric2"): 0.9180170893669128,
            (2, "class3", "metric1"): 0.8841384649276733,
            (2, "class3", "metric2"): 0.8577012419700623,
            (3, "class1", "metric1"): 0.8895564675331116,
            (3, "class1", "metric2"): 0.8351058959960938,
            (3, "class2", "metric1"): 0.832390308380127,
            (3, "class2", "metric2"): 0.8969333171844482,
            (3, "class3", "metric1"): 0.7883192300796509,
            (3, "class3", "metric2"): 0.8577012419700623,
        },
    }
    df = pd.DataFrame(data)
    df = df.rename_axis(("experiment", "class", "metric"))
    df.groupby(["class", "metric"]).agg(["mean", "std"])

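After grouping and aggregating across the experiments, how can I merge the second level of the column MultiIndex so that each cell becomes a single string: the rounded mean and std concatenated with some symbol inserted between them?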


2 Answers

Stack Overflow user

Accepted answer

Posted 2021-08-31 06:16:10

You can pass a custom aggregation function that builds the output with an f-string:

f = lambda x: f'{round(x.mean(), 2)} +/- {round(x.std(), 2)}'
df = df.groupby(["class", "metric"]).agg(f)
print(df)
                         cat1           cat2           cat3
class  metric                                              
class1 metric1  0.93 +/- 0.05  0.93 +/- 0.05   0.9 +/- 0.03
       metric2  0.94 +/- 0.05  0.94 +/- 0.05  0.87 +/- 0.03
class2 metric1  0.86 +/- 0.06  0.86 +/- 0.06  0.83 +/- 0.04
       metric2  0.89 +/- 0.17  0.89 +/- 0.17  0.84 +/- 0.12
class3 metric1  0.93 +/- 0.03  0.93 +/- 0.03  0.87 +/- 0.08
       metric2  0.92 +/- 0.06  0.92 +/- 0.06  0.89 +/- 0.06
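
Note that `round(x, 2)` drops trailing zeros, which is why one cell above reads `0.9` instead of `0.90`. If fixed-width numbers are wanted, a format spec inside the f-string is one option. The following is only a sketch: the `toy` frame and the `mean_pm_std` helper are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Tiny made-up frame with the same index layout as in the question.
idx = pd.MultiIndex.from_product(
    [[1, 2, 3], ["class1"], ["metric1"]],
    names=["experiment", "class", "metric"],
)
toy = pd.DataFrame({"cat1": [0.8, 0.9, 1.0]}, index=idx)


def mean_pm_std(x, digits=2):
    """Hypothetical helper: format a group's mean and std with fixed decimals."""
    return f"{x.mean():.{digits}f} +/- {x.std():.{digits}f}"


# agg calls the function once per group and per column.
out = toy.groupby(["class", "metric"]).agg(mean_pm_std)
print(out)
```

Here `:.2f` pads to two decimals, so the mean of the toy column is rendered as `0.90` rather than `0.9`.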

Alternatively, select each level with DataFrame.xs, round, convert to strings, and join the two parts with +/-:

df = df.groupby(["class", "metric"]).agg(["mean", "std"])

df = (df.xs('mean', axis=1, level=1).round(2).astype(str) + '+/-' + 
      df.xs('std', axis=1, level=1).round(2).astype(str))
print(df)
                       cat1         cat2         cat3
class  metric                                        
class1 metric1  0.93+/-0.05  0.93+/-0.05   0.9+/-0.03
       metric2  0.94+/-0.05  0.94+/-0.05  0.87+/-0.03
class2 metric1  0.86+/-0.06  0.86+/-0.06  0.83+/-0.04
       metric2  0.89+/-0.17  0.89+/-0.17  0.84+/-0.12
class3 metric1  0.93+/-0.03  0.93+/-0.03  0.87+/-0.08
       metric2  0.92+/-0.06  0.92+/-0.06  0.89+/-0.06
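
The `round(2).astype(str)` route also loses trailing zeros (`0.9+/-0.03` above). A possible variation, sketched here on made-up data, formats each selected block with `str.format` before concatenating. The `fmt` helper and the `toy` frame are assumptions for illustration only.

```python
import pandas as pd

# Made-up data with the question's index layout.
idx = pd.MultiIndex.from_product(
    [[1, 2, 3], ["class1"], ["metric1"]],
    names=["experiment", "class", "metric"],
)
toy = pd.DataFrame(
    {"cat1": [0.8, 0.9, 1.0], "cat2": [0.4, 0.5, 0.6]}, index=idx
)

stats = toy.groupby(["class", "metric"]).agg(["mean", "std"])


def fmt(frame, digits=2):
    """Hypothetical helper: turn every cell into a fixed-decimal string."""
    return frame.apply(lambda col: col.map(f"{{:.{digits}f}}".format))


# xs(..., axis=1, level=1) picks one statistic and drops that column level.
means = fmt(stats.xs("mean", axis=1, level=1))
stds = fmt(stats.xs("std", axis=1, level=1))
combined = means + " +/- " + stds
print(combined)
```

Because both blocks share the same index and columns after `xs`, the string addition lines up cell by cell.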
Votes: 4

Stack Overflow user

Posted 2021-08-31 06:15:37

You can use stack + apply + unstack:

(df.groupby(["class", "metric"])
   .agg(["mean", "std"])
   .stack(level=0)
   .apply(lambda r: f'{r["mean"]:.2f}±{r["std"]:.2f}', axis=1)
   .unstack(level=-1)
)

Output:

                     cat1       cat2       cat3
class  metric                                  
class1 metric1  0.93±0.05  0.93±0.05  0.90±0.03
       metric2  0.94±0.05  0.94±0.05  0.87±0.03
class2 metric1  0.86±0.06  0.86±0.06  0.83±0.04
       metric2  0.89±0.17  0.89±0.17  0.84±0.12
class3 metric1  0.93±0.03  0.93±0.03  0.87±0.08
       metric2  0.92±0.06  0.92±0.06  0.89±0.06
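
To see what each step of the chain does, it may help to split it into named intermediates. This is a sketch on made-up data (the `toy` frame and variable names are not from the answer). Recent pandas 2.x releases may emit a FutureWarning about the `stack` implementation, which does not affect the result here.

```python
import pandas as pd

# Made-up data with the question's index layout.
idx = pd.MultiIndex.from_product(
    [[1, 2, 3], ["class1"], ["metric1"]],
    names=["experiment", "class", "metric"],
)
toy = pd.DataFrame(
    {"cat1": [0.8, 0.9, 1.0], "cat2": [0.4, 0.5, 0.6]}, index=idx
)

# Columns are a two-level MultiIndex: (cat, statistic).
stats = toy.groupby(["class", "metric"]).agg(["mean", "std"])

# Move the cat level into the row index; columns are now just mean and std.
stacked = stats.stack(level=0)

# Build one string per (class, metric, cat) row.
labels = stacked.apply(lambda r: f'{r["mean"]:.2f}±{r["std"]:.2f}', axis=1)

# Pivot the cat level back into columns.
result = labels.unstack(level=-1)
print(result)
```

After `stack(level=0)` every row holds one mean/std pair, so the row-wise `apply` can read both values by name.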
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/68993659
