I have data like this:
library(data.table)
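# id 1: weeks 1-100 are "A", weeks 101-260 are "B"
dt1 <- data.table(
  id = 1,
  week = 1:260
)
dt1[1:100, status := "A"][101:260, status := "B"]
# id 2: a 36-week "B" spell (weeks 35-70) surrounded by "A"
dt2 <- data.table(
  id = 2,
  week = 1:260
)
dt2[1:34, status := "A"][35:70, status := "B"][71:260, status := "A"]
# id 3: a short "B" spell (weeks 81-90), back to "A", then "B" from week 101 on
dt3 <- data.table(
  id = 3,
  week = 1:260
)
dt3[1:80, status := "A"][81:90, status := "B"][91:100, status := "A"][101:260, status := "B"]
# Stack the three ids into one long table
data <- rbind(dt1, dt2, dt3)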
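For each unique id, I want to find the index (week) before which a 52-week period has passed in which at least 75% of the weeks (39 weeks) have status == "B". The final product I want looks like this: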
data_want <- data.table(
id = c(1,3),
week = c(153, 133)
)
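For id 1, week 153 is 52 weeks after week 101 (the first row with status == "B"), and at least 75% of that period has status == "B"; and so on for the other ids.
Any suggestions?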
Posted on 2022-03-01 09:31:02
If I understand what you need (and ignoring your data_want), then I think this works:
data[, fr := frollmean(status == "B", 52, align = "right", fill = 0), by = .(id)]
data[, .SD[shift(fr >= 0.75, type = "lead"),][1,], by = id]
# id week status fr
# <num> <int> <char> <num>
# 1: 1 138 B 0.7307692
# 2: 2 NA <NA> NA
# 3: 3 128 B 0.7307692
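Because of the shift(..., type = "lead"), the query above returns the row just before the rolling share first reaches 0.75 (week 138 for id 1, where fr is still 0.73). If the week at which the threshold is first met is wanted instead, a minimal sketch could filter on fr directly. It assumes the same example data (rebuilt here with rep() so the block runs on its own) and the same fr definition; the name first_hit is hypothetical.

```r
library(data.table)

# Rebuild the question's example data: three ids, 260 weeks each.
data <- rbind(
  data.table(id = 1, week = 1:260, status = rep(c("A", "B"), c(100, 160))),
  data.table(id = 2, week = 1:260, status = rep(c("A", "B", "A"), c(34, 36, 190))),
  data.table(id = 3, week = 1:260, status = rep(c("A", "B", "A", "B"), c(80, 10, 10, 160)))
)

# Share of "B" weeks in the trailing 52-week window, computed per id.
data[, fr := frollmean(as.numeric(status == "B"), 52, align = "right", fill = 0), by = id]

# First week per id whose trailing window is at least 75% "B".
# Ids that never reach the threshold (id 2 here) simply drop out.
first_hit <- data[fr >= 0.75, .(week = week[1]), by = id]
first_hit
# id 1 -> week 139, id 3 -> week 129
```

Dropping the lead shifts each result forward by one week and omits id 2 rather than returning an NA row for it.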
Verification:
data[id == 1 & between(week, 130, 140),]
# id week status fr
# <num> <int> <char> <num>
# 1: 1 130 B 0.5769231
# 2: 1 131 B 0.5961538
# 3: 1 132 B 0.6153846
# 4: 1 133 B 0.6346154
# 5: 1 134 B 0.6538462
# 6: 1 135 B 0.6730769
# 7: 1 136 B 0.6923077
# 8: 1 137 B 0.7115385
# 9: 1 138 B 0.7307692
# 10: 1 139 B 0.7500000 <-- 52-week window had 75% "B"
# 11: 1 140 B 0.7692308
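The rolling-window result differs from the data_want in the question. One possible reading of data_want, based on the question's explanation for id 1, is: take the first week with status == "B", look at the 52 weeks starting there, and if at least 75% of them are "B", report the week 52 weeks after that first "B" week. The sketch below follows that reading; it is an assumption about the asker's intent, not something the question confirms, and the names want, start and span are hypothetical.

```r
library(data.table)

# Rebuild the question's example data: three ids, 260 weeks each.
data <- rbind(
  data.table(id = 1, week = 1:260, status = rep(c("A", "B"), c(100, 160))),
  data.table(id = 2, week = 1:260, status = rep(c("A", "B", "A"), c(34, 36, 190))),
  data.table(id = 3, week = 1:260, status = rep(c("A", "B", "A", "B"), c(80, 10, 10, 160)))
)

want <- data[, {
  start <- week[status == "B"][1]              # first week with status "B" (NA if none)
  span  <- week >= start & week < start + 52L  # the 52 weeks beginning at that week
  # Keep the id only if a full 52-week span exists and at least 75% of it is "B";
  # returning nothing (NULL) for a group drops that id from the result.
  if (!is.na(start) && sum(span) == 52 && mean(status[span] == "B") >= 0.75) {
    .(week = start + 52L)
  }
}, by = id]
want
# id 1 -> week 153, id 3 -> week 133
```

On the example data this reproduces the weeks in data_want: id 2 is excluded because only 36 of the 52 weeks starting at week 35 are "B".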
https://stackoverflow.com/questions/71305548