
Count the frequency of all pairwise combinations by group

Stack Overflow user
Asked on 2019-04-09 13:31:22
Answers: 1, Views: 369, Followers: 0, Votes: 1

I want to count the frequency of all pairwise combinations of item, by group.

library(dplyr)

have <- data.frame(group=c("a", "a", "a", 
                           "b", "b", 
                           "c",
                           "d", "d",
                           "e", "e",
                           "f", "f", "f"),
                   item=c("apple", "banana", "black cherry",
                          "apple", "black cherry",
                          "orange",
                          "banana", "black cherry",
                          "banana", "black cherry",
                          "apple", "banana", "black cherry"))

have
#    group           item
# 1      a          apple
# 2      a         banana
# 3      a   black cherry
# 4      b          apple
# 5      b   black cherry
# 6      c         orange
# 7      d         banana
# 8      d   black cherry
# 9      e         banana
# 10     e   black cherry
# 11     f          apple
# 12     f         banana
# 13     f   black cherry

# almost what I want...
# cons: repeats pairs and does not include zeros
have %>% 
# https://stackoverflow.com/a/38335011/841405
  full_join(have, by="group") %>% 
  group_by(item.x, item.y) %>% 
  summarise(length(unique(group))) %>% 
  filter(item.x!=item.y) %>%
  mutate(item = paste(item.x, item.y, sep=", "))

#         item.x       item.y  `length(unique(group))`                item                
# 1 apple        banana                             2 apple, banana       
# 2 apple        black cherry                       3 apple, black cherry 
# 3 banana       apple                              2 banana, apple       
# 4 banana       black cherry                       4 banana, black cherry
# 5 black cherry apple                              3 black cherry, apple 
# 6 black cherry banana                             4 black cherry, banana

# what I really want

#         item.x       item.y  `length(unique(group))`                item                
# 1 apple        banana                             2 apple, banana       
# 2 apple        black cherry                       3 apple, black cherry 
# 3 apple        orange                             0 apple, orange
# 4 banana       black cherry                       4 banana, black cherry
# 5 banana       orange                             0 banana, orange
# 6 black cherry orange                             0 black cherry, orange

1 Answer

Stack Overflow user

Accepted answer

Posted on 2019-04-09 13:42:26

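The way I would do this is to use expand.grid to build every combination, then join it to what you have already computed, and then fill the unmatched rows with zeros. I also renamed your count column to n.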

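# Pair counts for the combinations that actually occur (both orderings).
have2 = have %>% 
  full_join(have, by="group") %>% 
  group_by(item.x, item.y) %>% 
  summarise(n = length(unique(group))) %>% 
  filter(item.x!=item.y) %>%
  mutate(item = paste(item.x, item.y, sep=", "))

# Every unordered pair once: expand.grid() returns factors, so comparing
# their integer codes keeps a single ordering of each pair.
combos = expand.grid(item.x = unique(have$item),
                    item.y = unique(have$item)) %>% 
  filter(as.numeric(item.x) < as.numeric(item.y)) %>% 
  mutate(item = paste(item.x, item.y, sep = ', ')) %>% 
  arrange(item.x, item.y) %>% 
  left_join(have2, by = c("item.x", "item.y", "item")) %>% 
  mutate(n = replace(n, is.na(n), 0))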
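
As a cross-check, and not part of the original answer, the same counts can probably be obtained in base R from a group-by-item incidence matrix. This is only a sketch: the names incidence, co_occurrence, pairs and result are illustrative, and the data is copied from the question.

```r
# Toy data copied from the question.
have <- data.frame(
  group = c("a", "a", "a", "b", "b", "c", "d", "d", "e", "e", "f", "f", "f"),
  item  = c("apple", "banana", "black cherry", "apple", "black cherry",
            "orange", "banana", "black cherry", "banana", "black cherry",
            "apple", "banana", "black cherry"),
  stringsAsFactors = FALSE
)

# Group-by-item incidence matrix: 1 if the group contains the item, else 0.
# The "> 0" step means a repeated (group, item) row is counted only once.
incidence <- 1 * (unclass(table(have$group, have$item)) > 0)

# t(incidence) %*% incidence: entry [i, j] is the number of groups that
# contain both item i and item j.
co_occurrence <- crossprod(incidence)

# Each unordered pair of items exactly once, one pair per row.
pairs <- t(combn(colnames(incidence), 2))

result <- data.frame(
  item.x = pairs[, 1],
  item.y = pairs[, 2],
  # Index the matrix by (row name, column name) pairs.
  n = co_occurrence[pairs],
  stringsAsFactors = FALSE
)
result$item <- paste(result$item.x, result$item.y, sep = ", ")

result
```

For the question's data this should give six rows, with the three pairs involving orange getting n = 0. Because the pairs come from combn() rather than from a self-join, no pair is repeated and no zero-filling step is needed.
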
Votes: 3
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/55594121
