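I have two data frames. For some rows of df1 there is a matching row in df2. Some columns of df1 should now be modified so that they contain the sum of their own value and the corresponding value in df2.
In the example below, the columns 'count1' and 'count2' should be summed, but not the column 'type'.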
df1 <- data.frame(id = c("one_a", "two_a", "three_a", "four_a"), type = c(8,7,6,5), count1 = c(1,2,1,NA), count2 = c(NA,0,1,0), id_df2 = c("one", "two", "three", "four"))
df2 <- data.frame(id = c("one", "two", "four"), type = c(8,7,5), count1 = c(0,1,1), count2 = c(0,0,1))
result <- data.frame(id = c("one_a", "two_a", "three_a", "four_a"), type = c(8,7,6,5), count1 = c(1,3,1,1), count2 = c(0,0,1,1))
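> df1
       id type count1 count2 id_df2
1   one_a    8      1     NA    one
2   two_a    7      2      0    two
3 three_a    6      1      1  three
4  four_a    5     NA      0   four
> df2
    id type count1 count2
1  one    8      0      0
2  two    7      1      0
3 four    5      1      1
> result
       id type count1 count2
1   one_a    8      1      0
2   two_a    7      3      0
3 three_a    6      1      1
4  four_a    5      1      1

There are similar questions, and I tried to find a solution by splitting the data frame and merging it back together afterwards. I am just wondering whether there is a more elegant way to do this. My original data set has about 300 columns, so I am looking for a scalable solution.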
Thanks in advance, chuckmorris

Posted on 2019-01-31 04:52:29
Slightly less elegant, but it still works:
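library(dplyr)

result_2 <- df2 %>%
  mutate(id = paste0(id, "_a")) %>%   # rebuild df1-style ids so matching rows share a key
  bind_rows(df1) %>%                  # stack both data frames on top of each other
  select(-id_df2) %>%
  replace(., is.na(.), 0) %>%         # treat NA as 0 so the sums are defined
  group_by(id) %>%
  summarise(count1 = sum(count1), count2 = sum(count2), type = max(type)) %>%
  mutate(id_df2 = as.factor(id)) %>%  # note: the id ends up in a column named id_df2
  select(c(id_df2, type, count1, count2), -id)

https://stackoverflow.com/questions/54444226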
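Since the question mentions roughly 300 columns, one possible way to avoid naming every column is to look up the partner row with match() and add the selected columns as matrices. The following is only a minimal base R sketch and not part of the answer above. It assumes that every column to be summed has a name starting with "count" and that NA should count as 0; the names sum_cols, idx and result_3 are made up for illustration.

```r
# Example data copied from the question
df1 <- data.frame(id = c("one_a", "two_a", "three_a", "four_a"),
                  type = c(8, 7, 6, 5),
                  count1 = c(1, 2, 1, NA),
                  count2 = c(NA, 0, 1, 0),
                  id_df2 = c("one", "two", "three", "four"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(id = c("one", "two", "four"),
                  type = c(8, 7, 5),
                  count1 = c(0, 1, 1),
                  count2 = c(0, 0, 1),
                  stringsAsFactors = FALSE)

# Assumption: the columns to sum are those whose name starts with "count"
sum_cols <- grep("^count", names(df1), value = TRUE)

# Row of df2 that belongs to each row of df1 (NA where there is no partner)
idx <- match(df1$id_df2, df2$id)

# Pull both sides out as numeric matrices with identical shape
a <- as.matrix(df1[sum_cols])
b <- as.matrix(df2[idx, sum_cols, drop = FALSE])

# Assumption: a missing value or a missing partner row counts as 0
a[is.na(a)] <- 0
b[is.na(b)] <- 0

# Keep df1's other columns untouched and overwrite only the summed ones
result_3 <- df1[setdiff(names(df1), "id_df2")]
result_3[sum_cols] <- a + b
```

For the example data this gives the same count1 and count2 values as the expected result, keeps the row order of df1, and leaves 'type' unchanged. With a wider data set, only the definition of sum_cols would need to change.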