Using our own liver immune cell dataset, we annotate the clusters with scMayoMap.
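rm(list = ls())  # start from a clean workspace
load('./scRNA_after_sctype.Rdata')  # expected to contain the Seurat object `sce` used below
library(Seurat)  # provides Idents(), FindAllMarkers(), DotPlot(), DimPlot()
pkgs <- c("ggplot2", "dplyr", "tidyr", "tibble", "reshape2")
sapply(pkgs, require, character.only = TRUE)
library(scMayoMap)
scMayoMapDatabase1 <- scMayoMapDatabase  # marker database shipped with the package
table(scMayoMapDatabase1$tissue)  # number of database rows per tissue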
#          adipose tissue                 bladder                   blood
#                     659                     456                     767
#                    bone             bone marrow                   brain
#                     587                    1059                    2902
#                  breast                  embryo                     eye
#                     593                    1250                     356
#  gastrointestinal tract                   heart                  kidney
#                    1195                     518                    1856
#                   liver                    lung           mammary gland
#                     986                    1681                     666
#                  muscle                   other                   ovary
#                     797                     539                     543
#                pancreas                placenta                prostate
#                    1033                     294                     144
#                    skin                  spleen                 stomach
#                     659                     864                     203
#                  testis                  thymus                   tooth
#                     587                      37                     132
#                  uterus
#                     295
These are the tissues covered by the reference annotation database; we select liver.
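Idents(sce) <- 'RNA_snn_res.0.8'  # use the resolution-0.8 clusters as identities
# Seurat selects the DE test via `test.use`; MAST requires the MAST package to be installed
seurat.markers <- FindAllMarkers(sce, test.use = 'MAST')
scMayoMap.obj <- scMayoMap(data = seurat.markers, database = scMayoMapDatabase, tissue = 'liver')
plt <- scMayoMap.plot(scMayoMap.object = scMayoMap.obj)
res_markers <- scMayoMap.obj$markers  # marker genes behind each cluster/cell-type pair
# Marker genes behind the 'B cell' call for cluster 0 (stored as one comma-separated string)
gns <- scMayoMap.obj$markers$genes[scMayoMap.obj$markers$cluster == 0 & scMayoMap.obj$markers$celltype == 'B cell']
gns <- strsplit(gns, ',')[[1]]
DotPlot(sce, features = gns)
res <- scMayoMap.obj$res  # per-cluster cell-type scores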
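The res_markers table is shown above. For every cluster, candidate cell types are ranked by score; for cluster 0, for example, the candidates in descending order of score are hepatocyte, T cell, and B cell.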
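As a rough illustration of ranking by score, the sketch below keeps the top-scoring cell type for each cluster. It runs on a toy data frame, not on real scMayoMap output: the column names `cluster`, `celltype`, and `score` are assumptions, and the score values are made up purely so the example runs.

```r
# Toy stand-in for a score table. Column names and score values are
# assumptions for illustration only, not real scMayoMap output.
toy_res <- data.frame(
  cluster  = c(0, 0, 0, 1, 1),
  celltype = c("Hepatocyte", "T cell", "B cell", "Kupffer cell", "Monocyte"),
  score    = c(0.30, 0.25, 0.10, 0.40, 0.35),
  stringsAsFactors = FALSE
)

# Sort by cluster, then by descending score, and keep the first row per cluster.
ordered_res <- toy_res[order(toy_res$cluster, -toy_res$score), ]
top_hit <- ordered_res[!duplicated(ordered_res$cluster), ]
rownames(top_hit) <- NULL
print(top_hit)
```

The top-scoring label is only a starting point; the final assignment used later in this post was still curated by hand.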
Extracting res and tidying it up gives the final result below:
Next, we merge the results back into sce:
sce  # print a summary of the Seurat object
# Lookup table with one row per cluster (0-21); the celltype column is filled in below
celltype <- data.frame(ClusterID = 0:21,
                       celltype = 0:21)
# Define the cell type of each cluster
celltype[celltype$ClusterID %in% c(9, 11, 16, 17, 20), 2] <- 'B cell'
celltype[celltype$ClusterID %in% c(1, 5, 6, 8, 10, 13), 2] <- 'Kupffer cell'
celltype[celltype$ClusterID %in% c(0, 3, 4), 2] <- 'T cell'
celltype[celltype$ClusterID %in% c(7), 2] <- 'Monocyte'
celltype[celltype$ClusterID %in% c(2, 18, 19), 2] <- 'Neutrophil'
celltype[celltype$ClusterID %in% c(21), 2] <- 'Epithelial cell'
celltype[celltype$ClusterID %in% c(12), 2] <- 'Endothelial cell'
celltype[celltype$ClusterID %in% c(14, 15), 2] <- 'Liver bud hepatic cell'
head(celltype)
celltype
table(celltype$celltype)
sce@meta.data$celltype <- "NA"  # placeholder string for cells whose cluster is not in the table
# Copy each cluster's label onto every cell belonging to that cluster
for (i in 1:nrow(celltype)) {
  sce@meta.data[which(sce@meta.data$RNA_snn_res.0.8 == celltype$ClusterID[i]), 'celltype'] <- celltype$celltype[i]
}
table(sce$celltype)
DimPlot(sce, group.by = 'celltype', label = TRUE)
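The loop above can also be written as a single named-vector lookup. The following is a minimal sketch on toy data, not the code used in this post: `clusters` stands in for the `RNA_snn_res.0.8` column, `cluster2type` holds only a few of the assignments, and the "Unassigned" fallback label is a choice made for this example.

```r
# Toy cluster labels, stored as a factor like Seurat's clustering columns.
clusters <- factor(c("0", "7", "12", "0", "21"))

# Named vector: names are cluster IDs, values are cell types.
cluster2type <- c("0" = "T cell", "7" = "Monocyte", "12" = "Endothelial cell")

# Index by character so the lookup goes by name, not by the factor's integer codes.
annot <- unname(cluster2type[as.character(clusters)])
annot[is.na(annot)] <- "Unassigned"  # clusters missing from the lookup
print(table(annot))
```

Converting the factor with `as.character()` first matters: indexing with the factor itself would use its internal integer codes and could silently assign the wrong labels.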
Next, we compare these labels with the automatic annotation from scType:
library(gplots)
balloonplot(table(sce$celltype,sce$customclassif))
The x-axis shows the scMayoMap annotation and the y-axis shows the scType annotation.
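To put numbers on this kind of comparison, the cross-table can be turned into row proportions. This is a sketch on made-up labels for six cells; on the real object the two vectors would be `sce$celltype` and `sce$customclassif`, and the label names used here (for example "Macrophage") are illustrative assumptions, not actual scType output.

```r
# Made-up labels for six cells, for illustration only.
mayo   <- c("T cell", "T cell", "Kupffer cell", "Kupffer cell", "B cell", "T cell")
sctype <- c("T cell", "NK cell", "Macrophage", "Macrophage", "B cell", "T cell")

tab <- table(scMayoMap = mayo, scType = sctype)

# Row proportions: for each scMayoMap label, the share of cells given each scType label.
row_prop <- prop.table(tab, margin = 1)
print(round(row_prop, 2))

# Most frequent scType label for each scMayoMap label.
best_match <- colnames(tab)[apply(tab, 1, which.max)]
names(best_match) <- rownames(tab)
print(best_match)
```

Rows dominated by a single column indicate labels on which the two tools largely agree, even when the names differ because one database lacks the cell type.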
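The comparison suggests that the accuracy of automatic annotation most likely depends on which cell types the reference database actually contains. scType includes NK cells but no Kupffer cells, whereas scMayoMap is the opposite and lacks NK cells, so the two tools disagree considerably on how some cells are assigned. Overall, though, the broad trends are similar.
The scMayoMap result is on the left and the scType result is on the right.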