The flying spider-monkey tree fern genome provides insights into fern evolution and arborescence
https://www.nature.com/articles/s41477-022-01146-6#Sec44
https://doi.org/10.6084/m9.figshare.19125641
Today's post reproduces Figure 3d from the paper.
library(readxl)
dat01<-read_excel("data/20220526/NaturePlantsFig3d.xlsx")
head(dat01)
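The figure in the paper shows Z-scores, whereas the supplied data are presumably FPKM or a similar expression measure, so the dataset has to be transformed first. The formula I use for the Z-score works gene by gene: take log2 of the values, then compute (FPKM - mean(FPKM))/sd(FPKM). Note that the code below standardises the values as provided and does not itself apply the log2 step.
I am not sure this transformation is correct. The open question is whether the mean and standard deviation should be computed over all the genes provided or separately for each gene; I went with the latter.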
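To make the two options concrete, here is a minimal base R sketch on a toy matrix. The values, the gene and sample names, and the pseudocount of 1 in the log2 step are all assumptions made for illustration; none of them come from the paper's data.

```r
# Toy expression matrix: 2 genes x 4 samples (made-up values, not from the paper)
fpkm <- matrix(c(1, 3, 7, 15,
                 100, 120, 90, 110),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("geneA", "geneB"), paste0("s", 1:4)))

# log2 transform; the pseudocount of 1 is an assumption that keeps zeros finite
log_fpkm <- log2(fpkm + 1)

# Per-gene Z-score: scale() standardises columns, so transpose,
# scale, and transpose back to standardise each row (gene)
z_per_gene <- t(scale(t(log_fpkm)))

# Global Z-score: a single mean and sd computed over every value in the matrix
z_global <- (log_fpkm - mean(log_fpkm)) / sd(log_fpkm)

print(round(z_per_gene, 3))
print(round(z_global, 3))
```

With the per-gene version every gene ends up centred on 0 with unit standard deviation, so the heatmap compares tissues within a gene. With the global version a highly expressed gene stays high in every tissue, which tends to hide tissue-specific patterns.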
library(tidyverse)
library(stringr)

# Per-gene Z-score: column 1 is Gene, columns 2:16 are the sample columns
dat01 %>%
  rowwise() %>%
  mutate(mean_value = mean(c_across(2:16)),
         sd_value = sd(c_across(2:16))) %>%
  mutate(across(2:16, ~(.x - mean_value) / sd_value)) %>%
  select(-c(mean_value, sd_value)) %>%
  ungroup() -> dat01.2
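# Reshape to long format, strip the replicate suffix (-1, -2, -3),
# then average the replicates for each gene and sample group
dat01.2 %>%
  reshape2::melt(id.vars = "Gene") %>%
  mutate(new_var = str_replace(variable, '-[123]', '')) %>%
  group_by(Gene, new_var) %>%
  summarise(mean_value = mean(value), .groups = "drop") -> dat01.3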
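The replicate-collapsing step can be checked on a small example. This base R sketch assumes the sample columns are named in a "<tissue>-<replicate>" layout such as Ph-1 or Le1-3, which is inferred from the regular expression and the factor levels used in this post; the Z-scores are made up.

```r
# Hypothetical sample names in the assumed "<tissue>-<replicate>" layout
samples <- c("Ph-1", "Ph-2", "Ph-3", "Le1-1", "Le1-2", "Le1-3")
z <- c(0.5, 1.0, 1.5, -1.0, -0.5, 0.0)  # made-up Z-scores for a single gene

# Anchoring the pattern with $ removes only the trailing replicate suffix,
# so the 1 in "Le1" is left untouched
tissue <- sub("-[123]$", "", samples)

# Mean Z-score per tissue
tissue_mean <- tapply(z, tissue, mean)
print(tissue_mean)
```

Anchoring with `$` is a small safety measure for names that contain a digit before the suffix; with the names assumed here the unanchored pattern gives the same result.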
library(ggplot2)
library(paletteer)

# Fix the order of the sample groups on the y axis
dat01.3$new_var <- factor(dat01.3$new_var,
                          levels = c("Ph", "Sb", "Xy", "Pi", "Le1"))

# Heatmap: one tile per gene and sample group, filled by the mean Z-score
ggplot(data = dat01.3,
       aes(x = Gene, y = new_var)) +
  geom_tile(aes(fill = mean_value),
            color = "white") +
  scale_fill_paletteer_c("ggthemes::Classic Red-Green",
                         direction = -1,
                         name = "Expression level (Z-score)",
                         limits = c(-2, 2)) +
  scale_y_discrete(position = "right") +
  labs(x = NULL, y = NULL) +
  theme_minimal() +
  theme(panel.grid = element_blank(),
        legend.position = "top",
        axis.text.x = element_text(angle = 60,
                                   hjust = 1,
                                   vjust = 1),
        plot.margin = unit(c(0.2, 0.2, 0.2, 1), 'cm')) +
  # Horizontal colour bar above the plot, title centred on top
  guides(fill = guide_colorbar(title.position = "top",
                               title.hjust = 0.5,
                               barwidth = 10,
                               barheight = 0.5,
                               ticks = FALSE))
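One thing to watch with limits = c(-2, 2): by default ggplot2 continuous scales censor values outside the limits to NA, so any tile whose mean Z-score falls outside that range is drawn in the NA colour rather than in the darkest red or green. If that is not wanted, one option is to clamp the values before plotting. The sketch below uses made-up numbers and base R only.

```r
# Made-up mean Z-scores, two of them outside the [-2, 2] colour range
z <- c(-3.1, -0.4, 0, 1.7, 2.6)

# Clamp to the range: values below -2 become -2, values above 2 become 2
z_clamped <- pmax(pmin(z, 2), -2)
print(z_clamped)
```

An alternative that leaves the data untouched would be to pass oob = scales::squish to the fill scale. This assumes scale_fill_paletteer_c forwards extra arguments to the underlying ggplot2 scale, as it does for name and limits in the code above.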
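A detailed walkthrough of the plotting code will be posted as a video on Bilibili; you are welcome to follow my Bilibili account of the same name, 小明的数据分析笔记本.
The example data can be downloaded from the paper, and the code can be copied from this post.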