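My data is a record of study results. It contains two groups (group1 and group2) and three results (positive, neutral and negative). It is also a time series, so it has a date variable (day1, day2, ...).
Here is a sample of my data: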
Book1 %>% head(n=20)
# A tibble: 20 x 4
No Group Result date
<dbl> <dbl> <chr> <chr>
1 1 1 positive day1
2 2 1 neutral day1
3 3 1 neutral day1
4 4 2 negative day1
5 5 2 positive day1
6 6 2 neutral day1
7 7 1 neutral day1
8 8 1 negative day1
9 9 1 positive day1
10 10 2 neutral day1
11 11 1 neutral day2
12 12 1 negative day2
13 13 1 positive day2
14 14 2 neutral day2
15 15 2 neutral day2
16 16 2 negative day2
17 17 1 positive day2
18 18 1 positive day2
19 19 1 positive day2
20 20 2 positive day2
I want to draw a line chart comparing the result rates (positive, neutral and negative rates) between the two groups, so my code is:
Book1 %>%
ggplot(aes(x = Date, y = (..count..)/sum(..count..), fill = Group)) +
geom_line(stat = "count") +
facet_grid(Result~.)
However, I get many warning messages:
geom_path: Each group consists of only one
observation. Do you need to adjust the group
aesthetic?
The plot is also empty.
I don't know why I get this result, or what to do to get a correct line chart.
Posted on 2020-10-17 07:01:32
geom_line expects more than one observation per group. After summarising, each group has only one element. Using geom_col or geom_bar instead of geom_line would be useful here:
library(dplyr)
library(ggplot2)
Book1 %>%
group_by(date, Group = factor(Group), Result) %>%
summarise(value = n(), .groups = 'drop') %>%
mutate(perc = value/sum(value)) %>%
ggplot(aes(x = date, y = perc, fill = Group)) +
geom_col() +
facet_grid(~ Result)
Or, if we use ..count..:
Book1 %>%
mutate(Group = factor(Group)) %>%
ggplot(aes(x = date)) +
geom_bar(aes(y = (..count..)/sum(..count..), fill = Group)) +
facet_grid(~ Result)
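On ggplot2 3.3.0 or later, the same kind of bar chart can be written with after_stat(), which supersedes the ..count.. notation. This is only a sketch under that version assumption: the small data frame book_demo and the object name p are made up for illustration and stand in for Book1.

```r
library(ggplot2)

# Hypothetical stand-in for Book1 (illustration only, not the real data)
book_demo <- data.frame(
  Group  = factor(c(1, 1, 2, 2, 1, 2)),
  Result = c("positive", "neutral", "positive",
             "negative", "positive", "neutral"),
  date   = c("day1", "day1", "day1", "day2", "day2", "day2")
)

# after_stat(count) refers to the count computed by stat_count();
# dividing by sum(count) rescales the counts to proportions
p <- ggplot(book_demo, aes(x = date)) +
  geom_bar(aes(y = after_stat(count / sum(count)), fill = Group)) +
  facet_grid(~ Result)

print(p)
```

Swapping book_demo for Book1 (with Group converted to a factor first) should give the same layout as the ..count.. version, without the deprecation warning that newer ggplot2 releases emit for the dot-dot notation.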
data
Book1 <- structure(list(No = 1:20, Group = c(1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), Result = c("positive",
"neutral", "neutral", "negative", "positive", "neutral", "neutral",
"negative", "positive", "neutral", "neutral", "negative", "positive",
"neutral", "neutral", "negative", "positive", "positive", "positive",
"positive"), date = c("day1", "day1", "day1", "day1", "day1",
"day1", "day1", "day1", "day1", "day1", "day2", "day2", "day2",
"day2", "day2", "day2", "day2", "day2", "day2", "day2")),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"))
Posted on 2020-10-17 07:10:39
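A line chart is possible, depending on how you summarise the data before plotting. You can use code in this style to see the day-by-day evolution for each group: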
library(ggplot2)
library(dplyr)
#Code
df %>% group_by(Group,Result,date) %>%
summarise(N=n()) %>%
ggplot(aes(x=date,y=N,group=Result,color=Result))+
geom_line(size=1)+
geom_point(size=1)+
scale_y_continuous(limits = c(0,NA))+
facet_grid(.~Group)+
theme_bw()+
theme(axis.text = element_text(color='black',face='bold'),
axis.title = element_text(color='black',face='bold'),
strip.text = element_text(color='black',face='bold'),
legend.title = element_text(color='black',face='bold'),
legend.text = element_text(color='black',face='bold'))
This is an interesting approach: the lines show how the trend of each result changes over time.
Data used:
#Data
df <- structure(list(No = 1:20, Group = c(1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), Result = c("positive",
"neutral", "neutral", "negative", "positive", "neutral", "neutral",
"negative", "positive", "neutral", "neutral", "negative", "positive",
"neutral", "neutral", "negative", "positive", "positive", "positive",
"positive"), date = c("day1", "day1", "day1", "day1", "day1",
"day1", "day1", "day1", "day1", "day1", "day2", "day2", "day2",
"day2", "day2", "day2", "day2", "day2", "day2", "day2")), class = "data.frame", row.names = c(NA,
-20L))
Or, using percentages, you can try:
#Code 2
#Share of each result within a group and day: N/sum(N), not N/n()
df %>% group_by(Group,Result,date) %>%
summarise(N=n()) %>% ungroup() %>%
group_by(Group,date) %>% mutate(Perc=N/sum(N)) %>%
ggplot(aes(x=date,y=Perc,group=Result,color=Result))+
geom_line(size=1)+
geom_point(size=1)+
scale_y_continuous(limits = c(0,NA),labels = scales::percent)+
facet_grid(.~Group)+
theme_bw()+
theme(axis.text = element_text(color='black',face='bold'),
axis.title = element_text(color='black',face='bold'),
strip.text = element_text(color='black',face='bold'),
legend.title = element_text(color='black',face='bold'),
legend.text = element_text(color='black',face='bold'))
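For the layout asked for in the question (one panel per result, one line per group), a sketch along the same lines could look like the following. It rebuilds the same 20 rows as df so the block runs on its own, and it computes each rate within a group and day, so the three rates of a group sum to 1 on each day. The names rates and p_rates are made up here. Setting group = Group explicitly is what avoids the "Each group consists of only one observation" warning: with a discrete x axis, ggplot2 otherwise treats every x value as its own group.

```r
library(dplyr)
library(ggplot2)

# Same 20 rows as df above, rebuilt so this block is self-contained
df <- data.frame(
  No = 1:20,
  Group = c(1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2),
  Result = c("positive", "neutral", "neutral", "negative", "positive",
             "neutral", "neutral", "negative", "positive", "neutral",
             "neutral", "negative", "positive", "neutral", "neutral",
             "negative", "positive", "positive", "positive", "positive"),
  date = rep(c("day1", "day2"), each = 10)
)

# Count rows per day, group and result, then divide by the
# number of rows in that group on that day
rates <- df %>%
  mutate(Group = factor(Group)) %>%
  count(date, Group, Result, name = "N") %>%
  group_by(date, Group) %>%
  mutate(rate = N / sum(N)) %>%
  ungroup()

# group = Group connects the points of one group across days
p_rates <- ggplot(rates, aes(x = date, y = rate,
                             group = Group, colour = Group)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(labels = scales::percent) +
  facet_grid(Result ~ .)

print(p_rates)
```

If a result never occurs for some group on some day, that combination is simply missing from rates and the line skips it; tidyr::complete() could fill such gaps with zeros if that matters for the real data.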
https://stackoverflow.com/questions/64397381