Q: ggplot2: line chart of percentages for factor groups in a date-series dataset

Stack Overflow user

Asked 2020-10-17 06:52:12

Answers: 2, Views: 28, Followers: 0, Votes: 0

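My data is a record of study results. It contains two groups (group1 and group2) and three results (positive, neutral and negative). It is also a time-series dataset, so it has a date variable (day1, day2, ...).

Here is a sample of my data: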

Book1 %>% head(n=20)
# A tibble: 20 x 4
      No Group Result   date 
   <dbl> <dbl> <chr>    <chr>
 1     1     1 positive day1 
 2     2     1 neutral  day1 
 3     3     1 neutral  day1 
 4     4     2 negative day1 
 5     5     2 positive day1 
 6     6     2 neutral  day1 
 7     7     1 neutral  day1 
 8     8     1 negative day1 
 9     9     1 positive day1 
10    10     2 neutral  day1 
11    11     1 neutral  day2 
12    12     1 negative day2 
13    13     1 positive day2 
14    14     2 neutral  day2 
15    15     2 neutral  day2 
16    16     2 negative day2 
17    17     1 positive day2 
18    18     1 positive day2 
19    19     1 positive day2 
20    20     2 positive day2

I want to draw a line chart that compares the result rates (positive, neutral and negative) between the two groups, so my code is:

Book1 %>%
  ggplot(aes(x = Date, y = (..count..)/sum(..count..), fill = Group)) +
  geom_line(stat = "count") +
  facet_grid(Result~.)

However, I get many warning messages:

geom_path: Each group consists of only one
observation. Do you need to adjust the group
aesthetic?

The resulting plot is also empty.

I don't know why I get this result, or what to do to get a correct plot.

time-series

ggplot2

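2 Answers

Stack Overflow user

Answered 2020-10-17 07:01:32

geom_line expects more than one observation per group. After summarising, each group has only a single element, so geom_col or geom_bar is a better fit here than geom_line: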

library(dplyr)
library(ggplot2)
Book1 %>% 
     group_by(date, Group = factor(Group), Result) %>%
     summarise(value = n(), .groups = 'drop') %>% 
     mutate(perc = value/sum(value)) %>% 
     ggplot(aes(x = date, y = perc, fill = Group)) + 
            geom_col() +
            facet_grid(~ Result)

Or, if we use ..count..:

Book1 %>% 
     mutate(Group = factor(Group)) %>%
     ggplot(aes(x = date)) + 
        geom_bar(aes(y = (..count..)/sum(..count..), fill = Group)) +
        facet_grid(~ Result)
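
If a line chart is still wanted, one possible route is to set the group aesthetic explicitly. By default ggplot2 groups by the interaction of the discrete variables in the plot, so with a character x such as date every day becomes its own one-point group, which matches the warning in the question. The following is only a sketch under assumptions: the denominator is taken to be the number of observations in a given Group on a given date, and the names rates, rate and p are illustrative rather than part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Same 20 rows as the example data in this thread
Book1 <- data.frame(
  No = 1:20,
  Group = c(1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2),
  Result = c("positive", "neutral", "neutral", "negative", "positive",
             "neutral", "neutral", "negative", "positive", "neutral",
             "neutral", "negative", "positive", "neutral", "neutral",
             "negative", "positive", "positive", "positive", "positive"),
  date = rep(c("day1", "day2"), each = 10)
)

# Count rows per date/Group/Result, then divide by the Group total on that
# date (assumed denominator), so the three rates of a Group sum to 1 per day
rates <- Book1 %>%
  mutate(Group = factor(Group)) %>%
  count(date, Group, Result, name = "n") %>%
  group_by(date, Group) %>%
  mutate(rate = n / sum(n)) %>%
  ungroup()

# group = Group joins the points of each Group across the discrete x axis;
# colour (not fill) is the aesthetic that lines and points respond to
p <- ggplot(rates, aes(x = date, y = rate, colour = Group, group = Group)) +
  geom_line() +
  geom_point() +
  scale_y_continuous(labels = scales::percent) +
  facet_grid(Result ~ .)

print(p)
```

With only two days in the sample, each line is a single segment; the same code should extend to more days as long as every date, Group and Result combination occurs at least once.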

Data

Book1 <- structure(list(No = 1:20, Group = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), Result = c("positive", 
"neutral", "neutral", "negative", "positive", "neutral", "neutral", 
"negative", "positive", "neutral", "neutral", "negative", "positive", 
"neutral", "neutral", "negative", "positive", "positive", "positive", 
"positive"), date = c("day1", "day1", "day1", "day1", "day1", 
"day1", "day1", "day1", "day1", "day1", "day2", "day2", "day2", 
"day2", "day2", "day2", "day2", "day2", "day2", "day2")),
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20"))

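Votes: 0

Stack Overflow user

Answered 2020-10-17 07:10:39

A line chart is possible, depending on how you summarise the data. You can use code in this style to see the day-by-day evolution within each group: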

library(ggplot2)
library(dplyr)
#Code
df %>% group_by(Group,Result,date) %>%
  summarise(N=n()) %>%
  ggplot(aes(x=date,y=N,group=Result,color=Result))+
  geom_line(size=1)+
  geom_point(size=1)+
  scale_y_continuous(limits = c(0,NA))+
  facet_grid(.~Group)+
  theme_bw()+
  theme(axis.text = element_text(color='black',face='bold'),
        axis.title = element_text(color='black',face='bold'),
        strip.text = element_text(color='black',face='bold'),
        legend.title = element_text(color='black',face='bold'),
        legend.text = element_text(color='black',face='bold'))

This is an interesting approach because the lines show how the trend of each result changes over time.

Some of the data used:

#Data
df <- structure(list(No = 1:20, Group = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), Result = c("positive", 
"neutral", "neutral", "negative", "positive", "neutral", "neutral", 
"negative", "positive", "neutral", "neutral", "negative", "positive", 
"neutral", "neutral", "negative", "positive", "positive", "positive", 
"positive"), date = c("day1", "day1", "day1", "day1", "day1", 
"day1", "day1", "day1", "day1", "day1", "day2", "day2", "day2", 
"day2", "day2", "day2", "day2", "day2", "day2", "day2")), class = "data.frame", row.names = c(NA, 
-20L))

Or, using percentages, you can try:

#Code 2
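df %>% group_by(Group,Result,date) %>%
  summarise(N=n(), .groups='drop') %>%
  # share of each Result within a Group on a given date; sum(N) is the number of observations, n() would only count rows
  group_by(Group,date) %>% mutate(Perc=N/sum(N)) %>%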
  ggplot(aes(x=date,y=Perc,group=Result,color=Result))+
  geom_line(size=1)+
  geom_point(size=1)+
  scale_y_continuous(limits = c(0,NA),labels = scales::percent)+
  facet_grid(.~Group)+
  theme_bw()+
  theme(axis.text = element_text(color='black',face='bold'),
        axis.title = element_text(color='black',face='bold'),
        strip.text = element_text(color='black',face='bold'),
        legend.title = element_text(color='black',face='bold'),
        legend.text = element_text(color='black',face='bold'))
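
When computing a percentage column on already summarised counts, it can help to check which denominator is being used. Inside a grouped mutate(), n() returns the number of rows in the group, while sum(N) returns the number of underlying observations. The sketch below uses the Group 1 counts from the example data; the names counts, shares and check are illustrative, and the choice of "per Group and date" as the denominator is an assumption.

```r
library(dplyr)

# Counts for Group 1 in the example data, one row per date and Result
counts <- data.frame(
  Group  = c(1, 1, 1, 1, 1, 1),
  date   = c("day1", "day1", "day1", "day2", "day2", "day2"),
  Result = rep(c("negative", "neutral", "positive"), times = 2),
  N      = c(1, 3, 2, 1, 1, 4)
)

shares <- counts %>%
  group_by(Group, date) %>%
  mutate(
    rows_in_group = n(),      # rows in the group: one per Result level
    total         = sum(N),   # observations in the group
    Perc          = N / total # share of each Result in the group
  ) %>%
  ungroup()

# Sanity check: the shares within each Group and date should add up to 1
check <- shares %>%
  group_by(Group, date) %>%
  summarise(total_share = sum(Perc), .groups = "drop")

print(shares)
print(check)
```

Dividing by rows_in_group instead of total would not give shares that sum to 1, which is an easy way to spot a wrong denominator.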

Votes: 0

Original page content provided by Stack Overflow; the Chinese translation was supported by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/64397381
