I have data in the following format:
DATE GROUP EVENT ELIGIBLE
2021-3-9 A 1 1
2021-3-1 A 0 0
2021-3-1 B 0 1
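2021-2-20 B 1 1

I want to group the data by the GROUP column and then add three new columns that hold, per group, the (events / eligible) ratio for each of the following time frames: the last three months, three to six months ago, and the last year.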
I calculated the overall percentage, without separate time frames, as follows:
grouped <- data %>%
  filter(ELIGIBLE == 1) %>%
  group_by(GROUP) %>%
  mutate(count_Eligible = sum(ELIGIBLE == 1)) %>%
  mutate(count_events = sum(EVENT == 1)) %>%
  mutate(Percentage = round(100 * count_events / count_Eligible, 2))

I would like to know the cleanest way to add the three percentages for the different time frames. So far, I have derived the cutoff dates for filtering with the following code:
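library(lubridate)  # months(n) as a period comes from lubridate, not base R
today <- Sys.Date()
# Note: subtracting months() gives NA when the target day does not exist
# (e.g. May 31 minus 3 months); lubridate's %m-% rolls back to month end instead.
three_month_lookback <- today - months(3)
six_month_lookback <- today - months(6)
one_year_lookback <- today - months(12)

Answered 2021-03-10 04:28:41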
We can write a function to do the calculation:
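library(dplyr)
library(purrr)
library(lubridate)  # needed for months() and today()

# Per-group counts and percentage for whatever rows are passed in.
# summarise() returns one row per GROUP; transmute() would repeat the
# totals on every input row and produce duplicates in the later join.
f1 <- function(data) {
  data %>%
    filter(ELIGIBLE == 1) %>%
    group_by(GROUP) %>%
    summarise(count_Eligible = sum(ELIGIBLE == 1),
              count_events = sum(EVENT == 1),
              Percentage = round(100 * count_events / count_Eligible, 2),
              .groups = "drop")
}

Then loop over the lookback periods, subset the data on the DATE column, and apply the function: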
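# First list: window start dates; second list: matching window end dates.
# .id = 'grp' labels the rows 1, 2, 3 in the order of the windows.
map2_dfr(list(three_month_lookback, six_month_lookback, one_year_lookback),
         list(today(), three_month_lookback, today()),
         ~ data %>%
             mutate(DATE = as.Date(DATE)) %>%
             filter(DATE >= .x, DATE <= .y) %>%
             f1(.),
         .id = 'grp'
)

If we need to combine the results column-wise: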
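# Same windows as above, but joined side by side on GROUP.
# Columns from the first two windows get .x/.y suffixes; the third keeps its names.
map2(list(three_month_lookback, six_month_lookback, one_year_lookback),
     list(today(), three_month_lookback, today()),
     ~ data %>%
         mutate(DATE = as.Date(DATE)) %>%
         filter(DATE >= .x, DATE <= .y) %>%
         f1(.)
) %>%
  reduce(full_join, by = "GROUP")

https://stackoverflow.com/questions/66554272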
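
For comparison, the three windowed percentages could also be computed in a single grouped summarise() instead of mapping and joining. The following is only a sketch under some assumptions that are not in the original post: the reference date is fixed to 2021-03-10 for reproducibility, the helper pct_in_window and the output column names are hypothetical, and each window excludes its start date and includes its end date so that the first two windows do not overlap.

```r
library(dplyr)
library(lubridate)

# Sample rows from the question
data <- data.frame(
  DATE     = c("2021-3-9", "2021-3-1", "2021-3-1", "2021-2-20"),
  GROUP    = c("A", "A", "B", "B"),
  EVENT    = c(1, 0, 0, 1),
  ELIGIBLE = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Assumption: fixed reference date instead of Sys.Date(), so results are stable
ref_date <- as.Date("2021-03-10")
three_month_lookback <- ref_date %m-% months(3)   # %m-% rolls back at month end
six_month_lookback   <- ref_date %m-% months(6)
one_year_lookback    <- ref_date %m-% months(12)

# Hypothetical helper: percentage of events among eligible rows whose date
# lies in the window (start, end], i.e. start excluded and end included.
pct_in_window <- function(event, eligible, date, start, end) {
  in_window <- eligible == 1 & date > start & date <= end
  round(100 * sum(event[in_window] == 1) / sum(in_window), 2)
}

result <- data %>%
  mutate(DATE = as.Date(DATE)) %>%
  group_by(GROUP) %>%
  summarise(
    pct_last_3m  = pct_in_window(EVENT, ELIGIBLE, DATE,
                                 three_month_lookback, ref_date),
    pct_3m_to_6m = pct_in_window(EVENT, ELIGIBLE, DATE,
                                 six_month_lookback, three_month_lookback),
    pct_last_12m = pct_in_window(EVENT, ELIGIBLE, DATE,
                                 one_year_lookback, ref_date),
    .groups = "drop"
  )

print(result)
```

A window with no eligible rows yields NaN (0 divided by 0) rather than 0. With the four sample rows and this reference date, that is the case for the three-to-six-month column; wrap the division in an explicit check if a different placeholder is preferred.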