文章/答案/技术大牛

发布

社区首页 >问答首页 >如何将列数据与行数据进行匹配？

问如何将列数据与行数据进行匹配？
EN

Stack Overflow用户

提问于 2021-12-03 06:58:33

回答 2查看 42关注 0票数 0

我有一个体育数据集，内容如下：

season  team   tm   shk   dgs   brs   cts   cws  avg_pt_marg
2015    sharks shk  0-0   1-3   2-0   4-1   3-2    1.2
2015    dogs   dgs  3-1   0-0   2-1   1-1   2-0    3.4
2015    bears  brs  0-2   1-2   0-0   1-3   2-1    -0.2
2015    cats   cts  1-4   1-1   3-1   0-0   2-2    2.0
2015    cows   cws  2-3   0-2   1-2   2-2   0-0    -2.1
2014    sharks shk  0-0   1-3   2-0   4-1   3-2    0.7
2014    dogs   dgs  3-1   0-0   2-1   1-1   2-0    1.8
2014    bears  brs  0-2   1-2   0-0   1-3   2-1    -1.9
2014    cats   cts  1-4   1-1   3-1   0-0   2-2    2.3
2014    cows   cws  2-3   0-2   1-2   2-2   0-0    -3.0

我想为每一行(一支球队的一个赛季)添加一列，其中包含球队对手的平均积分差距。

计算方法是：(该赛季)与一支球队的比赛次数乘以该球队(该赛季)的积分差除以(该赛季)比赛的总次数。

例如，对于2015年的sharks,，一支球队的对手的平均积分差距将是((4 x 3.4)+(2 x -0.2)+(5 x 2.0)+(5 x -2.1)) / 16.

如何计算此列，然后将其添加到数据帧中？

如下所示：

season  team   tm   shk   dgs   brs   cts   cws  avg_pt_marg opponent_marg
2015    sharks shk  0-0   1-3   2-0   4-1   3-2    1.2
2015    dogs   dgs  3-1   0-0   2-1   1-1   2-0    3.4
2015    bears  brs  0-2   1-2   0-0   1-3   2-1    -0.2
2015    cats   cts  1-4   1-1   3-1   0-0   2-2    2.0
2015    cows   cws  2-3   0-2   1-2   2-2   0-0    -2.1
2014    sharks shk  0-0   1-3   2-0   4-1   3-2    0.7
2014    dogs   dgs  3-1   0-0   2-1   1-1   2-0    1.8
2014    bears  brs  0-2   1-2   0-0   1-3   2-1    -1.9
2014    cats   cts  1-4   1-1   3-1   0-0   2-2    2.3
2014    cows   cws  2-3   0-2   1-2   2-2   0-0    -3.0

dataframe

回答 2

Stack Overflow用户

发布于 2021-12-03 07:20:33

嗯，不是很漂亮，但是

do.call(
  rbind,
  by(df,list(df$season),function(x){
    tmp=sapply(
      1:nrow(x),
      function(i){
        unlist(
          lapply(
            strsplit(
              as.character(x[i,grepl("tm[0-9]+",colnames(x))]),
              "-"
            ),
            function(y){
              sum(as.numeric(y))
            }
          )
        )
      }
    )
    cbind(
      x,
      "opponent_marg"=colSums(tmp*x[,"avg_pt_marg"])/colSums(tmp)
    )
  })
)

结果是

        season  team  tm tm1 tm2 tm3 tm4 tm5 avg_pt_marg opponent_marg
2014.6    2014 team1 tm1 0-0 1-3 2-0 4-1 3-2         0.7    -0.0062500
2014.7    2014 team2 tm2 3-1 0-0 2-1 1-1 2-0         1.8    -0.3909091
2014.8    2014 team3 tm3 0-2 1-2 0-0 1-3 2-1        -1.9     0.5833333
2014.9    2014 team4 tm3 1-4 1-1 3-1 0-0 2-2         2.3    -0.8333333
2014.10   2014 team5 tm5 2-3 0-2 1-2 2-2 0-0        -3.0     0.7571429
2015.1    2015 team1 tm1 0-0 1-3 2-0 4-1 3-2         1.2     0.7937500
2015.2    2015 team2 tm2 3-1 0-0 2-1 1-1 2-0         3.4     0.3636364
2015.3    2015 team3 tm3 0-2 1-2 0-0 1-3 2-1        -0.2     1.1916667
2015.4    2015 team4 tm3 1-4 1-1 3-1 0-0 2-2         2.0     0.2400000
2015.5    2015 team5 tm5 2-3 0-2 1-2 2-2 0-0        -2.1     1.4428571

票数 0

Stack Overflow用户

发布于 2021-12-03 07:49:13

另一个不是很好的解决方案，但它是一个相当复杂的小任务-它有很多组件。我在这里使用的是data.table -如果您不熟悉它们，它们只是data.frame的一个增强版，提供了一些额外的功能

library(data.table)
setDT(dt1)

首先，将数据整形为更长的格式

# Reshape the data
dt2 <- dt1[, melt(.SD, id.vars=c("tm", "team", "season", "avg_pt_marg"))]

我还过滤掉了球队和对手比赛的案例。这一步还为比赛/游戏的数量创建了一个变量

# Filter out cases where team and opponent match
dt2 <- dt2[tm != variable,][,
  # Get number of games played
  `:=`("games_played" = as.numeric(tstrsplit(value, "-")[[1]])+
                        as.numeric(tstrsplit(value, "-")[[2]]))]

然后，最后一步为您提供所需的值：

# Get the team/season averages
dt3 <- dt2[, sum(avg_pt_marg*games_played)/sum(games_played), keyby=.(season, "tm" = variable)]

您可以将其与data.table连接合并

dt1 <- dt1[dt3, on=c("tm", "season")]

给予：

    season   team  tm shk dgs brs cts cws avg_pt_marg         V1
 1:   2014 sharks shk 0-0 1-3 2-0 4-1 3-2         0.7 -0.0062500
 2:   2014   dogs dgs 3-1 0-0 2-1 1-1 2-0         1.8 -0.3909091
 3:   2014  bears brs 0-2 1-2 0-0 1-3 2-1        -1.9  0.5833333
 4:   2014   cats cts 1-4 1-1 3-1 0-0 2-2         2.3 -0.8333333
 5:   2014   cows cws 2-3 0-2 1-2 2-2 0-0        -3.0  0.7571429
 6:   2015 sharks shk 0-0 1-3 2-0 4-1 3-2         1.2  0.7937500
 7:   2015   dogs dgs 3-1 0-0 2-1 1-1 2-0         3.4  0.3636364
 8:   2015  bears brs 0-2 1-2 0-0 1-3 2-1        -0.2  1.1916667
 9:   2015   cats cts 1-4 1-1 3-1 0-0 2-2         2.0  0.2400000
10:   2015   cows cws 2-3 0-2 1-2 2-2 0-0        -2.1  1.4428571

票数 0

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/70210549

复制

相似问题

问如何将列数据与行数据进行匹配？
EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何将列数据与行数据进行匹配？EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何将列数据与行数据进行匹配？
EN