Learning to plot from Nature Microbiology: giving each bar its own color palette in an R ggplot2 stacked bar chart, and ordering multiple legends

Author: 用户7010445
Published: 2023-08-23 10:46:09
Included in the column: 小明的数据分析笔记本

Paper

A high-quality genome compendium of the human gut microbiome of Inner Mongolians

https://www.nature.com/articles/s41564-022-01270-1

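The source data for most figures in the paper are available, so I will try to reproduce as many of the figures as I can.

In today's post we try to reproduce Figure 2b from the paper.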

[Figure: Figure 2b from the paper]

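The main points of today's post: the plot has four bars, each bar gets its own color palette, and with four legends we need to control the order in which they appear.

A partial screenshot of the data

[Screenshot: part of the data]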

Reading the data

library(readxl)
library(tidyverse)
df<-read_excel("data/20230305/41564_2022_1270_MOESM5_ESM.xlsx",
               sheet = "Fig2b")
head(df)

Counting frequencies for each column separately

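# Phylum: strip the "p__" prefix, count each phylum, keep five named
# phyla and lump the rest into "Others", then fix the stacking order
df %>% 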
  select(Phylum) %>% 
  mutate(Phylum=str_replace(Phylum,"p__","")) %>% 
  group_by(Phylum) %>% 
  summarise(phylum_counts=n()) %>% 
  ungroup() %>% 
  mutate(group01=case_when(
    Phylum == "Actinobacteriota" ~ "Actinobacteriota",
    Phylum == "Firmicutes_A" ~ "Firmicutes_A",
    Phylum == "Bacteroidota" ~ "Bacteroidota",
    Phylum == "Firmicutes" ~ "Firmicutes",
    Phylum == "Proteobacteria" ~ "Proteobacteria",
    TRUE ~ "Others"
  )) %>%
  group_by(group01) %>% 
  summarise(value=sum(phylum_counts)) %>% 
  ungroup() %>%
  mutate(group01=factor(group01,
                      levels = c("Others","Proteobacteria",
                                 "Firmicutes","Bacteroidota",
                                 "Firmicutes_A",
                                 "Actinobacteriota"))) -> df01


df01

[Screenshot: printed contents of df01]

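The code above has to be run four times, once each for the Phylum, Class, Order, and Family columns; the remaining three runs follow.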
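
Since the same pipeline is repeated for every column, it could be wrapped in a small function. The following is only a sketch of that idea, not the code used in the original post: the helper name `count_rank`, its arguments, the output column name `group`, and the toy data are all hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: count one taxonomic column, lump every taxon
# outside `keep` into "Others", and set factor levels for stacking.
count_rank <- function(data, col, prefix, keep) {
  data %>%
    transmute(taxon = str_replace(.data[[col]], prefix, "")) %>%
    mutate(group = if_else(taxon %in% keep, taxon, "Others")) %>%
    count(group, name = "value") %>%
    # "Others" first, then the kept taxa in reverse order,
    # mirroring the level order used in the post
    mutate(group = factor(group, levels = c("Others", rev(keep))))
}

# Toy data for illustration only (not the paper's data)
toy <- tibble(Phylum = c("p__Firmicutes_A", "p__Firmicutes_A",
                         "p__Bacteroidota", "p__Spirochaetota"))

res <- count_rank(toy, "Phylum", "p__",
                  keep = c("Firmicutes_A", "Bacteroidota"))
res
```

With the real data, a call such as `count_rank(df, "Phylum", "p__", c("Actinobacteriota", "Firmicutes_A", "Bacteroidota", "Firmicutes", "Proteobacteria"))` should give the same table as `df01`, except that the grouping column is named `group` rather than `group01`, so the plotting code would need `fill=group`.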

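The counts obtained here differ somewhat from those in the paper, and I have not yet worked out where the discrepancy comes from.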
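
One way to start narrowing down such a discrepancy is to look at the full, unlumped counts and at entries with no real label. This is only a guess at where to look, not a confirmed cause; the toy data below is hypothetical, and the bare prefix `"p__"` standing for an unclassified entry is an assumption.

```r
library(dplyr)

# Toy data for illustration only (not the paper's data)
toy <- tibble(Phylum = c("p__Firmicutes_A", "p__Firmicutes_A", NA, "p__"))

# Full frequency table, most common first, to compare with the paper
toy %>% count(Phylum, sort = TRUE)

# Entries that are missing or consist of the bare prefix only
n_missing <- sum(is.na(toy$Phylum) | toy$Phylum == "p__")
n_missing
```

In the pipeline used in this post, such entries fall through to the "Others" group, so a different treatment of them elsewhere would shift the proportions.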

# Class: same pattern, using the "c__" prefix
df %>% 
  select(Class) %>% 
  mutate(Class=str_replace(Class,"c__","")) %>% 
  group_by(Class) %>% 
  summarise(class_counts=n()) %>% 
  ungroup() %>% 
  mutate(group02=case_when(
    Class == "Negativicutes" ~ "Negativicutes",
    Class == "Clostridia" ~ "Clostridia",
    Class == "Bacteroidia" ~ "Bacteroidia",
    Class == "Bacilli" ~ "Bacilli",
    Class == "Gammaproteobacteria" ~ "Gammaproteobacteria",
    TRUE ~ "Others"
  )) %>%
  group_by(group02) %>% 
  summarise(value=sum(class_counts)) %>% 
  ungroup() %>% 
  mutate(group02=factor(group02,
                      levels = c("Others","Gammaproteobacteria",
                                 "Bacilli","Bacteroidia",
                                 "Clostridia",
                                 "Negativicutes"))) -> df02


# Order: same pattern, using the "o__" prefix
df %>% 
  select(Order) %>% 
  mutate(Order=str_replace(Order,"o__","")) %>% 
  group_by(Order) %>% 
  summarise(order_counts=n()) %>% 
  ungroup() %>%
  mutate(group03=case_when(
    Order == "Lachnospirales" ~ "Lachnospirales",
    Order == "Oscillospirales" ~ "Oscillospirales",
    Order == "Bacteroidales" ~ "Bacteroidales",
    Order == "Christensenellales" ~ "Christensenellales",
    Order == "Lactobacillales" ~ "Lactobacillales",
    TRUE ~ "Others"
  )) %>%
  group_by(group03) %>% 
  summarise(value=sum(order_counts)) %>% 
  ungroup() %>% 
  mutate(group03=factor(group03,
                      levels = c("Others","Lactobacillales",
                                 "Christensenellales","Bacteroidales",
                                 "Oscillospirales",
                                 "Lachnospirales"))) -> df03


# Family: same pattern, using the "f__" prefix
df %>% 
  select(Family) %>% 
  mutate(Family=str_replace(Family,"f__","")) %>% 
  group_by(Family) %>% 
  summarise(family_counts=n()) %>% 
  ungroup() %>%
  mutate(group04=case_when(
    Family == "Lachnospiraceae" ~ "Lachnospiraceae",
    Family == "Oscillospiraceae" ~ "Oscillospiraceae",
    Family == "Ruminococcaceae" ~ "Ruminococcaceae",
    Family == "Acutalibacteraceae" ~ "Acutalibacteraceae",
    Family == "Bacteroidaceae" ~ "Bacteroidaceae",
    TRUE ~ "Others"
  )) %>%
  group_by(group04) %>% 
  summarise(value=sum(family_counts)) %>% 
  ungroup() %>% 
  mutate(group04=factor(group04,
                      levels = c("Others","Bacteroidaceae",
                                 "Acutalibacteraceae",
                                 "Ruminococcaceae",
                                 "Oscillospiraceae",
                                 "Lachnospiraceae"))) -> df04

Plotting code

ggplot()+
  geom_bar(data=df01,
           aes(x=1,y=value,fill=group01),
           stat="identity",position = "fill")+
  scale_fill_manual(values = c("#827f88","#3288bd","#f36c44",
                               "#e4e569","#b9b9dd","#000000"),
                    breaks = rev(c("Others","Proteobacteria",
                                   "Firmicutes","Bacteroidota",
                                   "Firmicutes_A",
                                   "Actinobacteriota")),
                    name="Phylum",
                    guide=guide_legend(order=1))+
  ggnewscale::new_scale_fill()+
  geom_bar(data=df02,
           aes(x=2,y=value,fill=group02),
           stat="identity",position = "fill")+
  scale_fill_manual(values = c("#7ed0de","#5f50a1","#add8a4",
                               "#fddf8a","#8a95ab","#b57c82"),
                    breaks = rev(c("Others","Gammaproteobacteria",
                                   "Bacilli","Bacteroidia",
                                   "Clostridia",
                                   "Negativicutes")),
                    name="Class",
                    guide=guide_legend(order=2))+
  ggnewscale::new_scale_fill()+
  geom_bar(data=df03,
           aes(x=3,y=value,fill=group03),
           stat="identity",position = "fill")+
  scale_fill_manual(values = c("#134b5f","#9ba791","#eb9486",
                               "#adc0e3","#cc141d","#1d933a"),
                    breaks = rev(c("Others","Lactobacillales",
                                   "Christensenellales","Bacteroidales",
                                   "Oscillospirales",
                                   "Lachnospirales")),
                    name="Order",
                    guide=guide_legend(order=3))+
  ggnewscale::new_scale_fill()+
  geom_bar(data=df04,
           aes(x=4,y=value,fill=group04),
           stat="identity",position = "fill")+
  scale_fill_manual(values = c("#59a691","#505d75","#c9b014",
                               "#9d1b45","#ee8354","#bb7b53"),
                    breaks = rev(c("Others","Bacteroidaceae",
                                   "Acutalibacteraceae",
                                   "Ruminococcaceae",
                                   "Oscillospiraceae",
                                   "Lachnospiraceae")),
                    name="Family",
                    guide=guide_legend(order=4))+
  scale_x_continuous(breaks = c(1,2,3,4),
                     labels=c("Phylum","Class","Order","Family"))+
  theme_bw()+
  theme(panel.grid = element_blank(),
        legend.key.size = unit(3,'mm'))+
  labs(x=NULL,y="Proportion")

[Figure: the resulting plot]
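
The two mechanisms at work in the plotting code are `ggnewscale::new_scale_fill()`, which lets each bar carry its own fill scale, and `guide_legend(order = ...)`, which fixes the order of the legends. The sketch below reduces this to two bars with made-up data. It also uses a named `values` vector, which is a variation on the code in the post: as far as I know, recent ggplot2 versions pair an unnamed `values` vector with `breaks` in order, so naming the colors simply makes that pairing explicit. The data frames `d1` and `d2` and the taxa "A" and "B" are hypothetical.

```r
library(ggplot2)

# Toy data for illustration only (not the paper's data)
d1 <- data.frame(grp = factor(c("Others", "A"), levels = c("Others", "A")),
                 value = c(3, 7))
d2 <- data.frame(grp = factor(c("Others", "B"), levels = c("Others", "B")),
                 value = c(4, 6))

p <- ggplot() +
  # First bar with its own fill scale; its legend is drawn first
  geom_col(data = d1, aes(x = 1, y = value, fill = grp),
           position = "fill") +
  scale_fill_manual(values = c(A = "#000000", Others = "#827f88"),
                    breaks = c("A", "Others"),
                    name = "Phylum",
                    guide = guide_legend(order = 1)) +
  # Start a fresh fill scale for the next layer
  ggnewscale::new_scale_fill() +
  geom_col(data = d2, aes(x = 2, y = value, fill = grp),
           position = "fill") +
  scale_fill_manual(values = c(B = "#b57c82", Others = "#7ed0de"),
                    breaks = c("B", "Others"),
                    name = "Class",
                    guide = guide_legend(order = 2)) +
  scale_x_continuous(breaks = c(1, 2), labels = c("Phylum", "Class")) +
  labs(x = NULL, y = "Proportion")

p
```

The factor levels decide the stacking order inside each bar, while `breaks` decides the order of the keys inside each legend, which is why the post reverses the level vector when passing it to `breaks`.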

These posts are my own study notes and may well contain errors, so please read them critically.

Originally published on 2023-06-13 on the WeChat official account 小明的数据分析笔记本.
