在做单细胞转录组数据分析时,我们经常遇到特定(通路/功能)基因集合的表达活性分析
,目前比较常用的是GSVA
以及Seurat软件包的AddModuleScore[1]函数。这里我们介绍一种新的方法UCell[2],它具有以下特点:
UCell 是R包,目前存放在github[3],安装方式如下:
library(remotes)
remotes::install_github("carmonalab/UCell")
也可以从其github网址下载zip包到本地,通过以下命令安装
devtools::install_local("/path/UCell-master.zip")
详细使用教程见网址 https://carmonalab.github.io/UCell/
大多单细胞转录组数据用seurat软件进行处理,所以这里我们拿seurat处理的对象文件作为示例。
我们从Hao and Hao et al, bioRvix 2020[4]下载downsampled数据, pbmc_multimodal.downsampled20k.Tcell.seurat.RNA.rds[5]。
library(Seurat)
library(ggplot2)
library(UCell)
cuscolors <-c('#E5D2DD','#53A85F','#F1BB72',
'#F3B1A0','#D6E7A3','#57C3F3',
'#476D87','#E95C59','#E59CC4',
'#AB3282', '#23452F','#BD956A',
'#8C549C', '#585658')
rds <- readRDS("pbmc_multimodal.downsampled20k.Tcell.seurat.RNA.rds")
head(rds@meta.data, 2)
options(repr.plot.width=8, repr.plot.height=6)
DimPlot(object = rds, reduction = "wnn.umap", group.by = "celltype.l2", label = TRUE,
label.size = 3, repel = TRUE, cols=my36colors)
# 构建UCell需要输入的gene sets
markers <- list()
markers$Tcell_CD4 <- c("CD4", "CD40LG")
markers$Tcell_CD8 <- c("CD8A", "CD8B")
markers$Tcell_Treg <- c("FOXP3", "IL2RA")
markers$Tcell_MAIT <- c("KLRB1", "SLC4A10", "NCR3")
markers$Tcell_gd <- c("TRDC", "TRGC1", "TRGC2", "TRDV1")
markers$Tcell_NK <- c("FGFBP2", "SPON2", "KLRF1", "FCGR3A", "KLRD1", "TRDC")
# 主函数
rds <- AddModuleScore_UCell(rds, features = markers)
signature.names <- paste0(names(markers), "_UCell")
options(repr.plot.width=6, repr.plot.height=4)
VlnPlot(rds, features = signature.names,
group.by = "celltype.l1", stack=TRUE,
cols=my36colors) + NoLegend()
options(repr.plot.width=12, repr.plot.height=4)
FeaturePlot(rds, reduction = "wnn.umap", features = c("Tcell_CD4_UCell", "Tcell_NK_UCell"),
ncol = 3, order = T,
min.cutoff = "q03", max.cutoff = "q99",
cols = c("#F5F5F5", "#333399"), pt.size = 0.1)
markers3 <- list()
markers3$Tcell_gd_pos <- c("TRDC+", "TRGC1+", "TRGC2+", "TRDV1+","TRAC+","TRBC1+","TRBC2+")
markers3$Tcell_NK_pos <- c("FGFBP2+", "SPON2+", "KLRF1+", "FCGR3A+", "CD3E+","CD3G+")
rds <- AddModuleScore_UCell(rds, features = markers3)
options(repr.plot.width=12, repr.plot.height=4)
FeaturePlot(rds, reduction = "wnn.umap", features = c("Tcell_gd_pos_UCell", "Tcell_NK_pos_UCell"),
ncol = 3, order = T,
min.cutoff = "q03", max.cutoff = "q99",
cols = c("#F5F5F5", "#333399"), pt.size = 0.1)
可以在gene后加上+
或者-
来表示signatures genes set中的gene是上调还是下调的模式。
[1]
AddModuleScore: http://events.jianshu.io/p/827143ce66fa
[2]
UCell: Andreatta M, Carmona S J. UCell: robust and scalable single-cell gene signature scoring[J]. Computational and Structural Biotechnology Journal, 2021. https://doi.org/10.1016/j.csbj.2021.06.043
[3]
github: https://github.com/carmonalab/UCell
[4]
Hao and Hao et al, bioRvix 2020: https://www.biorxiv.org/content/10.1101/2020.10.12.335331v1
[5]
pbmc_multimodal.downsampled20k.Tcell.seurat.RNA.rds: https://drive.switch.ch/index.php/s/3kM5PQ0tQaG6d6A/download