Given this sample data:
df1 <- structure(list(practice = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), drug = c("123A456",
"123A567", "123A123", "123A567", "123A456", "123A123", "123A567",
"123A567", "998A125", "123A456", "998A125", "123A567", "123A456",
"998A125", "123A567", "123A567", "123A567", "998A125", "123A123",
"998A125", "123A123", "123A456", "998A125", "123A567", "998A125",
"123A456", "123A123", "998A125", "123A567", "123A567", "998A125",
"123A456", "123A123", "123A567", "123A567", "998A125", "123A456"
), items = c(1, 2, 3, 4, 5, 4, 6, 7, 8, 9, 5, 6, 7, 8, 9, 4,
5, 6, 3, 2, 3, 4, 5, 6, 7, 4, 3, 2, 3, 4, 5, 4, 3, 4, 5, 6, 4
), quantity = c(1, 2, 4, 5, 3, 2, 3, 5, 4, 5, 7, 9, 5, 3, 4,
6, 1, 2, 4, 5, 3, 2, 3, 5, 4, 5, 7, 9, 5, 3, 4, 6, 1, 2, 4, 5,
3)), .Names = c("practice", "drug", "items", "quantity"), row.names = c(NA,
-37L), spec = structure(list(cols = structure(list(practice = structure(list(), class = c("collector_integer",
"collector")), drug = structure(list(), class = c("collector_character",
"collector")), items = structure(list(), class = c("collector_integer",
"collector")), quantity = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("practice", "drug", "items", "quantity"
)), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"), class = c("tbl_df",
"tbl", "data.frame"))
I want to run several analyses on it. I think dplyr is the right tool, but I am struggling to combine the pieces.
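My data is a list of prescribed drugs, and I want to summarise a subset of them (defined by the first three digits of the drug code).
I can do parts of the analysis separately. For example, this gives the total number of items per practice: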
library(dplyr)
practice <- df1 %>%
  group_by(practice) %>%
  summarise(all.items = sum(items))
And this looks only at the drugs I am interested in:
drug123 <- df1 %>%
  filter(substr(drug, 1, 3) == "123")
ALL.drug123 <- aggregate(drug123$quantity, by = list(Category = drug123$practice), FUN = sum)
But how do I put it all together?
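I would like a data frame with the following columns:
practice (1, 2, 3 in the data provided)
drug123.items (number of items for drug 123)
drug123.quantity (quantity for drug 123)
all.items (number of items for all drugs)
all.quantity (quantity for all drugs)
Any ideas?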
Answered 2018-08-22 14:51:16
I think this is what you are looking for:
library(dplyr)
df1 %>%
  group_by(practice) %>%
  # Within each practice, count a row only if its drug code starts with "123"
  summarize(items_123 = sum(if_else(stringr::str_detect(drug, '^123'), items, 0)),
            quantity_123 = sum(if_else(stringr::str_detect(drug, '^123'), quantity, 0)),
            all_items = sum(items),
            all_quantity = sum(quantity))
# A tibble: 3 x 5
  practice items_123 quantity_123 all_items all_quantity
     <int>     <dbl>        <dbl>     <dbl>        <dbl>
1        1        54           44        75           58
2        2        44           42        66           65
3        3        24           19        35           28
https://stackoverflow.com/questions/51968384
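The conditional sums in the answer can also be written by subsetting each column with a logical vector before summing, which avoids both `if_else` and the stringr dependency. The following is only a minimal sketch of that variant, not the answerer's code: the `toy` data frame and the `result` name are hypothetical and introduced purely for illustration, and `startsWith()` is the base R prefix test (available since R 3.3.0).

```r
library(dplyr)

# Hypothetical toy data (not the df1 from the question), small enough to check by hand
toy <- data.frame(
  practice = c(1L, 1L, 1L, 2L, 2L, 2L),
  drug     = c("123A456", "998A125", "123A567", "123A123", "998A125", "998A125"),
  items    = c(1, 2, 3, 4, 5, 6),
  quantity = c(2, 4, 6, 8, 10, 12),
  stringsAsFactors = FALSE
)

result <- toy %>%
  group_by(practice) %>%
  summarise(
    # Keep only the rows of the current group whose code starts with "123", then sum
    items_123    = sum(items[startsWith(drug, "123")]),
    quantity_123 = sum(quantity[startsWith(drug, "123")]),
    # Totals over every drug in the group
    all_items    = sum(items),
    all_quantity = sum(quantity)
  )

print(result)
```

If a practice has no matching drugs, the subset is empty and `sum()` returns 0, so the result has the same shape as the `if_else` version.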