Hi, I need to filter this data frame using the conditions below.
Here is the data frame in question:
# A tibble: 14 x 4
   user_id date    order_type  plan
     <dbl> <chr>   <chr>       <chr>
 1     123 2019-02 acquisition 3M
 2     123 2019-05 repeats     3M
 3     123 2019-08 repeats     3M
 4     124 2019-02 acquisition 1M
 5     124 2019-03 repeats     3M
 6     124 2019-06 repeats     3M
 7     125 2019-08 acquisition 1M
 8     125 2019-09 repeats     1M
 9     126 2019-07 acquisition 3M
10     126 2019-10 repeats     1M
11     126 2019-11 repeats     1M
12     127 2019-05 acquisition 3M
13     127 2019-08 repeats     3M
14     127 2019-11 repeats     3M

Reproducible example:
df <- tibble::tribble(
  ~user_id, ~date, ~order_type, ~plan,
  123, "2019-02", "acquisition", "3M",
  123, "2019-05", "repeats", "3M",
  123, "2019-08", "repeats", "3M",
  124, "2019-02", "acquisition", "1M",
  124, "2019-03", "repeats", "3M",
  124, "2019-06", "repeats", "3M",
  125, "2019-08", "acquisition", "1M",
  125, "2019-09", "repeats", "1M",
  126, "2019-07", "acquisition", "3M",
  126, "2019-10", "repeats", "1M",
  126, "2019-11", "repeats", "1M",
  127, "2019-05", "acquisition", "3M",
  127, "2019-08", "repeats", "3M",
  127, "2019-11", "repeats", "3M"
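)

I need to filter:

* rows for each user_id whose entry marked "acquisition" has the plan named "3M"
* for those user_ids, all subsequent orders that are also on the "3M" plan

Here is the expected result: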
# A tibble: 7 x 4
  user_id date    order_type  plan
    <dbl> <chr>   <chr>       <chr>
1     123 2019-02 acquisition 3M
2     123 2019-05 repeats     3M
3     123 2019-08 repeats     3M
4     126 2019-07 acquisition 3M
5     127 2019-05 acquisition 3M
6     127 2019-08 repeats     3M
7     127 2019-11 repeats     3M

Reproducible example:
df_filtered <- tibble::tribble(
  ~user_id, ~date, ~order_type, ~plan,
  123, "2019-02", "acquisition", "3M",
  123, "2019-05", "repeats", "3M",
  123, "2019-08", "repeats", "3M",
  126, "2019-07", "acquisition", "3M",
  127, "2019-05", "acquisition", "3M",
  127, "2019-08", "repeats", "3M",
  127, "2019-11", "repeats", "3M"
)

Posted on 2020-05-07 23:23:02

Here is a dplyr solution, though I am not sure how well it will scale to larger data:
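library(dplyr)

df %>%
  group_by(user_id) %>%
  # Flag every row of a user whose acquisition order was on the 3M plan.
  mutate(keep = case_when(any(plan == "3M" & order_type == "acquisition") ~ "Y", TRUE ~ "N")) %>%
  # Keep the flagged users and drop their 1M orders.
  filter(keep == "Y" & plan != "1M")

https://stackoverflow.com/questions/61660961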
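
The same grouped logic can probably be written without the helper `keep` column, by putting the `any()` test directly inside `filter()`. The following is only a sketch, not the original poster's code. It assumes each user's "acquisition" row is their first order, so that every other "3M" row counts as a subsequent order, and it tests `plan == "3M"` rather than `plan != "1M"`, which gives the same rows on this sample but would differ if other plans existed. The name `result` is made up for illustration.

```r
library(dplyr)

# Sample data copied from the question.
df <- tibble::tribble(
  ~user_id, ~date, ~order_type, ~plan,
  123, "2019-02", "acquisition", "3M",
  123, "2019-05", "repeats", "3M",
  123, "2019-08", "repeats", "3M",
  124, "2019-02", "acquisition", "1M",
  124, "2019-03", "repeats", "3M",
  124, "2019-06", "repeats", "3M",
  125, "2019-08", "acquisition", "1M",
  125, "2019-09", "repeats", "1M",
  126, "2019-07", "acquisition", "3M",
  126, "2019-10", "repeats", "1M",
  126, "2019-11", "repeats", "1M",
  127, "2019-05", "acquisition", "3M",
  127, "2019-08", "repeats", "3M",
  127, "2019-11", "repeats", "3M"
)

result <- df %>%
  group_by(user_id) %>%
  # First condition: the user's acquisition order was on the 3M plan
  # (a single TRUE/FALSE per group, recycled to every row of that user).
  # Second condition: the row itself is on the 3M plan.
  filter(any(order_type == "acquisition" & plan == "3M"), plan == "3M") %>%
  # Drop the grouping so later steps work on a plain tibble.
  ungroup()

print(result)
```

On the sample data this should keep the seven rows listed in the expected result, with no extra `keep` column to remove afterwards.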