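How can I create multiple (dummy) columns from a single column that holds multiple answers? I want this to be fully automatic, so that the distinct answers are detected for me. Something like this function: create_multiple_columns(df$posessions, sep = " ")
For illustration, it should automatically create the "hat", "table", and "pen" dummy columns, as in the last three columns below.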
df = data.frame(person = c(1, 2, 3, 4),
posessions = c("hat table pen", "hat", "table", "hat pen"),
hat = c(1,1,0,1),
table = c(1,0,1,0),
pen = c(1,0,0,1)
)
Posted on 2020-08-26 20:33:59
I would recommend a tidyverse approach. By reshaping the data you can get something close to what you want:
library(tidyverse)
#Data
df2 <- structure(list(person = c(1, 2, 3, 4), posessions = structure(c(3L,
1L, 4L, 2L), .Label = c("hat", "hat pen", "hat table pen", "table"
), class = "factor")), class = "data.frame", row.names = c(NA,
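-4L))
Code:
# Split into at most three words (extend 'into' if a row can hold more), then go long
df2 %>% separate(posessions, into = c('v1','v2','v3'), sep = ' ') %>%
  pivot_longer(cols = -1) %>% filter(!is.na(value)) %>%
  # Count each person/word pair, then spread the words back into columns
  group_by(person, value) %>% summarise(N = n()) %>%
  pivot_wider(names_from = value, values_from = N) %>%
  replace(is.na(.), 0)
Output: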
# A tibble: 4 x 4
# Groups:   person [4]
  person   hat   pen table
   <dbl> <int> <int> <int>
1      1     1     1     1
2      2     1     0     0
3      3     0     0     1
4      4     1     1     0
Posted on 2020-08-26 20:36:52
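Using data.table:
library(data.table)
library(tidyverse) # for %>%, str_split(), str_detect() and map()
df = data.table(
  person = c(1, 2, 3, 4),
  posessions = c("hat table pen", "hat", "table", "hat pen")
)
# Collect every distinct word across all rows
all_words <- df$posessions %>% str_split(" ") %>% unlist() %>% unique()
# Add one 0/1 column per word; note that str_detect() matches substrings, not whole words
df[, (all_words) := map(all_words, ~str_detect(posessions, .x) * 1L)]
Posted on 2020-08-26 20:39:54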
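You can split the strings with strsplit, take the unique words, and test for membership with %in%.
x <- strsplit(df$posessions, " ")  # one character vector of words per row
y <- unique(unlist(x))             # all distinct words
# For each row, test which words of y occur in it; the unary + turns TRUE/FALSE into 1/0
z <- +(do.call(rbind, lapply(x, "%in%", x=y)))
colnames(z) <- y
cbind(df[1:2], z)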
#  person    posessions hat table pen
#1      1 hat table pen   1     1   1
#2      2           hat   1     0   0
#3      3         table   0     1   0
#4      4       hat pen   1     0   1
https://stackoverflow.com/questions/63597594
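The question asks for a single call of the form create_multiple_columns(df$posessions, sep = " "). None of the answers defines that function, so the following is only a minimal sketch of how the strsplit / unique / %in% idea could be wrapped up in base R. The function body, the internal names (tokens, answers, flags) and the choice to treat sep as a literal string rather than a regular expression are assumptions made for illustration.

```r
# Hypothetical helper, named after the call shown in the question.
# Assumes x is a character (or factor) vector and sep is a literal separator.
create_multiple_columns <- function(x, sep = " ") {
  # Split every answer string into its individual words.
  tokens <- strsplit(as.character(x), sep, fixed = TRUE)
  # Detect the distinct answers automatically, in order of first appearance.
  answers <- unique(unlist(tokens))
  # For each row, flag which of the distinct answers it contains (1) or not (0).
  flags <- lapply(tokens, function(tk) as.integer(answers %in% tk))
  # Stack the per-row flags into a matrix with one column per answer.
  dummies <- matrix(unlist(flags), nrow = length(tokens), byrow = TRUE,
                    dimnames = list(NULL, answers))
  as.data.frame(dummies)
}

# Example with the data from the question.
df <- data.frame(person = c(1, 2, 3, 4),
                 posessions = c("hat table pen", "hat", "table", "hat pen"))
result <- cbind(df, create_multiple_columns(df$posessions, sep = " "))
print(result)
```

Because whole words are compared with %in%, a word such as "hat" would not be flagged for a row that only contains a longer word with "hat" inside it, which differs from a substring-based test.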