I am trying to calculate the mean rate of surgery cancellations for each surgical specialty. Here is my dataset:
> dput(dt)
structure(list(Service = c("C", "C", "El",
"Ga", "Ga", "Ge", "Ge",
"Gy", "Gy", "Gyny", "He",
"He",
binary = c("Cancelled", "Not Cancelled", "Cancelled", "Cancelled",
"Not Cancelled", "Cancelled", "Not Cancelled", "Cancelled",
"Not Cancelled", "Not Cancelled", "Cancelled", "Not Cancelled",
"Cancelled", "Not Cancelled", "Cancelled", "Not Cancelled",
"Cancelled", "Not Cancelled", "Cancelled", "Cancelled", "Not Cancelled",
"Cancelled", "Not Cancelled", "Cancelled", "Not Cancelled",
"Cancelled", "Not Cancelled", "Cancelled", "Not Cancelled",
"Cancelled", "Not Cancelled", "Cancelled", "Not Cancelled",
"Cancelled", "Not Cancelled"), n = c(338L, 38L, 10L, 14L,
6L, 69L, 12L, 31L, 11L, 1L, 3L, 1L, 39L, 6L, 3L, 1L, 113L,
9L, 2L, 74L, 15L, 1L, 1L, 3L, 12L, 1L, 1L, 2L, 2L, 3L,
13L, 0L, 12L, 5L, 4L)), row.names = c(NA, -35L), groups = rows = structure(list(1:2, 3L, 4:5, 6:7, 8:9,
10L, 11:12, 13:14, 15:16, 17:18, 19L, 20:21, 22:23, 24:25,
26:27, 28:29, 30:31, 32:33, 34:35), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -19L), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
I tried this:
dt.1 <- dt %>%
  group_by(Service) %>%
  summarise(`Cancelled` = mean(binary == "Cancelled")*100,
            `Not Cancelled` = mean(!binary == "Cancelled")*100)
But I get very strange values.
Answered on 2022-09-27 16:59:16
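For context on the strange values: the attempt in the question takes the mean over rows. Each Service has at most two rows (one per binary value), so that mean can only come out as 0, 50 or 100, whatever n says. Below is a minimal sketch on made-up data comparing that row-based share with one weighted by n. The toy data frame, its Service labels A and B, and the column names row_share and weighted_share are assumptions for illustration, not part of dt.

```r
library(dplyr)

# Hypothetical toy data in the same shape as dt: one row per
# Service/binary pair, with n holding the number of surgeries.
toy <- data.frame(
  Service = c("A", "A", "B", "B"),
  binary  = c("Cancelled", "Not Cancelled", "Cancelled", "Not Cancelled"),
  n       = c(90L, 10L, 3L, 1L)
)

res <- toy %>%
  group_by(Service) %>%
  summarise(
    # share of rows labelled "Cancelled": ignores n entirely
    row_share = 100 * mean(binary == "Cancelled"),
    # share of surgeries that were cancelled: weights each row by n
    weighted_share = 100 * sum(n[binary == "Cancelled"]) / sum(n),
    .groups = "drop"
  )

print(res)
```

With this toy data, row_share is 50 for both services, while weighted_share is 90 for A and 75 for B.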
We can use a logical condition to subset 'n':
library(dplyr)
dt %>%
  group_by(Service) %>%
  summarise(Perc_not_cancelled =
              100 * n[binary != 'Cancelled'][1]/sum(n),
            Perc_cancelled = 100 * n[binary == 'Cancelled'][1]/sum(n),
            .groups = 'drop')
Output:
# A tibble: 19 × 3
   Service                              Perc_not_cancelled Perc_cancelled
   <chr>                                             <dbl>          <dbl>
 1 Cardiac                                            10.1           89.9
 2 Electrophysiology                                    NA            100
 3 Gastroenterology                                     30             70
 4 General                                            14.8           85.2
 5 Gynecology                                         24.4           75.6
 6 Gynecology Oncology                                 100             NA
 7 Hepatology                                           25             75
 8 Interventional Radiology                           13.3           86.7
 9 Neuroradiology                                       25             75
10 Neurosurgery                                       7.38           92.6
11 Ortho Spine                                          NA            100
12 Orthopedics                                        16.9           83.1
13 Otolaryngology Head and Neck Surgery               7.69           92.3
14 Plastics                                             25             75
15 Pulmonary                                            50             50
16 Thoracic                                           8.33           91.7
17 Transplant                                         30.2           69.8
18 Urology                                            14.6           85.4
19 Vascular                                           6.56           93.4
Checking:
> 38/(38+338) * 100
[1] 10.10638
If we want to replace the NA values with 0:
library(tidyr)
dt %>%
  group_by(Service) %>%
  summarise(Perc_not_cancelled =
              100 * n[binary != 'Cancelled'][1]/sum(n),
            Perc_cancelled = 100 * n[binary == 'Cancelled'][1]/sum(n),
            .groups = 'drop') %>%
  # replace NA (category absent for a Service) with 0
  mutate(across(where(is.numeric), ~ replace_na(.x, 0)))
# A tibble: 19 × 3
   Service                              Perc_not_cancelled Perc_cancelled
   <chr>                                             <dbl>          <dbl>
 1 Cardiac                                            10.1           89.9
 2 Electrophysiology                                     0            100
 3 Gastroenterology                                     30             70
 4 General                                            14.8           85.2
 5 Gynecology                                         24.4           75.6
 6 Gynecology Oncology                                 100              0
 7 Hepatology                                           25             75
 8 Interventional Radiology                           13.3           86.7
 9 Neuroradiology                                       25             75
10 Neurosurgery                                       7.38           92.6
11 Ortho Spine                                           0            100
12 Orthopedics                                        16.9           83.1
13 Otolaryngology Head and Neck Surgery               7.69           92.3
14 Plastics                                             25             75
15 Pulmonary                                            50             50
16 Thoracic                                           8.33           91.7
17 Transplant                                         30.2           69.8
18 Urology                                            14.6           85.4
19 Vascular                                           6.56           93.4
Another option is pivot_wider:
dt %>%
  group_by(Service) %>%
  mutate(n = 100 * proportions(n)) %>%
  ungroup() %>%
  pivot_wider(names_from = "binary", values_from = n,
              values_fill = 0, names_glue = "Perc_{.name}")
Answered on 2022-09-27 18:07:16
In base R, you could do:
prop.table(xtabs(n~.,dt), 1) * 100
                                       binary
Service                                 Cancelled Not Cancelled
  Cardiac                               89.893617     10.106383
  Electrophysiology                    100.000000      0.000000
  Gastroenterology                      70.000000     30.000000
  General                               85.185185     14.814815
  Gynecology                            75.609756     24.390244
  Gynecology Oncology                    0.000000    100.000000
  Hepatology                            75.000000     25.000000
  Interventional Radiology              86.666667     13.333333
  Neuroradiology                        75.000000     25.000000
  Neurosurgery                          92.622951      7.377049
  Ortho Spine                          100.000000      0.000000
  Orthopedics                           83.146067     16.853933
  Otolaryngology Head and Neck Surgery  92.307692      7.692308
  Plastics                              75.000000     25.000000
  Pulmonary                             50.000000     50.000000
  Thoracic                              91.666667      8.333333
  Transplant                            69.767442     30.232558
  Urology                               85.365854     14.634146
  Vascular                              93.442623      6.557377
If you need to return a data.frame, just wrap the code above as:
as.data.frame.matrix(prop.table(xtabs(n~.,dt), 1)*100)
https://stackoverflow.com/questions/73871237