I have a data frame named data_logistic with three columns: sim, time, and infected.
I removed all duplicate values of infected with the following command:
library(dplyr)
data_logistic_small <-
data_logistic %>%
group_by(sim1.2) %>%
distinct(infected1)
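However, when I do this I lose the time column. Is there a way to keep all three columns while still using the distinct function to remove all duplicated infected values?
Here is a reproducible data frame. It is not the real data frame, but the problem is the same: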
structure(list(sim1.2 = c("sim1", "sim1", "sim1", "sim1", "sim2",
"sim2", "sim2", "sim2", "sim3", "sim3", "sim3", "sim3", "sim4",
"sim4", "sim4", "sim4", "sim5", "sim5", "sim5", "sim5", "sim6",
"sim6", "sim6", "sim6", "sim7", "sim7", "sim7", "sim7", "sim8",
"sim8", "sim8", "sim8", "sim9", "sim9", "sim9", "sim9", "sim10",
"sim10", "sim10", "sim10", "sim11", "sim11", "sim11", "sim11",
"sim12", "sim12", "sim12", "sim12", "sim13", "sim13", "sim13",
"sim13", "sim14", "sim14", "sim14", "sim14", "sim15", "sim15",
"sim15", "sim15", "sim16", "sim16", "sim16", "sim16", "sim17",
"sim17", "sim17", "sim17", "sim18", "sim18", "sim18", "sim18",
"sim19", "sim19", "sim19", "sim19", "sim20", "sim20", "sim20",
"sim20", "sim21", "sim21", "sim21", "sim21", "sim22", "sim22",
"sim22", "sim22", "sim23", "sim23", "sim23", "sim23", "sim24",
"sim24", "sim24", "sim24", "sim25", "sim25", "sim25", "sim25",
"sim26", "sim26", "sim26", "sim26", "sim27", "sim27", "sim27",
"sim27", "sim28", "sim28", "sim28", "sim28", "sim29", "sim29",
"sim29", "sim29", "sim30", "sim30", "sim30", "sim30", "sim31",
"sim31", "sim31", "sim31", "sim32", "sim32", "sim32", "sim32",
"sim33", "sim33", "sim33", "sim33", "sim34", "sim34", "sim34",
"sim34", "sim35", "sim35", "sim35", "sim35", "sim36", "sim36",
"sim36", "sim36"), time1 = c(10, 50, 100, 200, 10, 50, 100, 200,
10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50,
100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200,
10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50,
100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200,
10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50,
100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200,
10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50,
100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200,
10, 50, 100, 200, 10, 50, 100, 200, 10, 50, 100, 200, 10, 50,
100, 200, 10, 50, 100, 200, 10, 50, 100, 200), infected1 = c(200,
1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000,
200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000,
1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000,
1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200,
1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000,
200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000,
1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000,
1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200,
1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000,
200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000,
1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000,
1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000, 200,
1000, 1000, 1000, 200, 1000, 1000, 1000, 200, 1000, 1000, 1000
)), class = "data.frame", row.names = c(NA, -144L))
Posted on 2020-09-30 22:45:09
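A very simple solution: pass .keep_all = TRUE to distinct(), which keeps every other column and retains the first row for each distinct value (the column is named infected1 in the example data above):
%>% distinct(infected, .keep_all = TRUE)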
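As a minimal sketch of that suggestion, the snippet below applies it to a cut-down stand-in for the example data. The two-simulation data frame is an assumption made here for brevity (it reuses the column names sim1.2, time1, and infected1 from the question), and the trailing ungroup() is an optional addition, not part of the original answer.

```r
library(dplyr)

# Hypothetical, shortened version of the question's data:
# two simulations instead of 36, same column names and values.
data_logistic <- data.frame(
  sim1.2    = rep(c("sim1", "sim2"), each = 4),
  time1     = rep(c(10, 50, 100, 200), times = 2),
  infected1 = rep(c(200, 1000, 1000, 1000), times = 2)
)

data_logistic_small <- data_logistic %>%
  group_by(sim1.2) %>%
  # .keep_all = TRUE keeps time1; for each distinct infected1 value
  # within a simulation, the first matching row is retained.
  distinct(infected1, .keep_all = TRUE) %>%
  ungroup()

print(data_logistic_small)
```

Because the first occurrence is the one kept, each simulation retains the earliest time at which a given infected value appears, provided the rows are already sorted by time as they are here.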
https://stackoverflow.com/questions/64139565