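Hot off the press (October 2021): a review published in Computational and Structural Biotechnology Journal, "Automatic cell type identification methods for single-cell RNA sequencing", surveys the current tools for annotating single-cell subpopulations. Article link: https://www.sciencedirect.com/science/article/pii/S2001037021004499
The authors built a package that integrates all of these tools (AutomaticCellTypeIdentification) and sorts them into 3 categories: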
eagersupervised methods include ACTINN, CaSTLe, CHETAH, clustifyr, Garnett, Markercount, MARS, scClassifR, scHPL, SciBet, scID, scLearn, scmapcluster, scPred, scVI, Seurat, SingleCellNet and SingleR.
lazysupervised methods include CELLBLAST and scmapcell.
markersupervised methods include scTyper, Markercount, SCSA, DigitalCellSorter and SCINA.
That is quite a lot of work!
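To make the first two categories concrete, here is a minimal sketch. It assumes the category names carry the usual machine-learning meaning: an eager learner builds a model from the labeled reference before any query arrives, while a lazy learner keeps the reference as-is and searches it at query time. The two-gene expression vectors, the labels, and the function names are all hypothetical, and neither function reproduces the algorithm of any tool listed above.

```python
import math

# Toy labeled reference: (expression vector, cell type). Values are made up.
REFERENCE = [
    ((5.0, 0.1), "T cell"),
    ((4.5, 0.3), "T cell"),
    ((0.2, 6.0), "B cell"),
    ((0.4, 5.5), "B cell"),
]


def train_centroids(reference):
    """Eager step: summarise the reference into one centroid per label."""
    sums, counts = {}, {}
    for vec, label in reference:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(s / counts[label] for s in acc)
        for label, acc in sums.items()
    }


def predict_eager(centroids, query):
    """Assign the label of the closest pre-computed centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], query))


def predict_lazy(reference, query, k=3):
    """Lazy step: no training; vote among the k nearest reference cells."""
    nearest = sorted(reference, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


if __name__ == "__main__":
    centroids = train_centroids(REFERENCE)       # done once, up front
    print(predict_eager(centroids, (4.0, 0.5)))  # uses only the centroids
    print(predict_lazy(REFERENCE, (0.3, 5.0)))   # scans the whole reference
```

The practical trade-off is that the eager version pays its cost once at training time and answers queries from a compact model, while the lazy version needs the full reference available for every query.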
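That said, the way the review approaches benchmarking the tools and algorithms is worth learning from:
The article also points out that single-cell transcriptome datasets now usually span multiple samples, so two problems do remain ("Yet, for integrated datasets, there are still two issues to be solved."):
In practice, across the many tumor single-cell analysis projects I have worked on, I don't use these automated annotation tools. I annotate by eye, which takes some background knowledge. For example, it helps to memorize lists of the genes highly expressed in each cell subpopulation, like the following: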
# T Cells (CD3D, CD3E, CD8A),
# B cells (CD19, CD79A, MS4A1 [CD20]),
# Plasma cells (IGHG1, MZB1, SDC1, CD79A),
# Monocytes and macrophages (CD68, CD163, CD14),
# NK Cells (FGFBP2, FCGR3A, CX3CR1),
# Photoreceptor cells (RCVRN),
# Fibroblasts (FGF7, MME),
# Endothelial cells (PECAM1, VWF).
# Epithelial or tumor cells (EPCAM, KRT19, PROM1, ALDH1A1, CD24).
# Broad lineages: immune (CD45+, PTPRC), epithelial/cancer (EpCAM+, EPCAM),
# stromal (CD10+, MME, fibroblasts; or CD31+, PECAM1, endothelial)
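The manual lookup described above can be roughly approximated in code. The sketch below scores each cluster by the mean expression of each marker set and takes the best-scoring cell type. It is only an illustration of the idea: the per-cluster average expression values, the `min_score` cutoff, and the function names are hypothetical, and real annotation still needs a look at the actual plots.

```python
# Marker sets taken from the list above (FCGR3A is the HGNC symbol for CD16a).
MARKERS = {
    "T cells": ["CD3D", "CD3E", "CD8A"],
    "B cells": ["CD19", "CD79A", "MS4A1"],
    "Plasma cells": ["IGHG1", "MZB1", "SDC1", "CD79A"],
    "Monocytes/macrophages": ["CD68", "CD163", "CD14"],
    "NK cells": ["FGFBP2", "FCGR3A", "CX3CR1"],
    "Photoreceptor cells": ["RCVRN"],
    "Fibroblasts": ["FGF7", "MME"],
    "Endothelial cells": ["PECAM1", "VWF"],
    "Epithelial/tumor": ["EPCAM", "KRT19", "PROM1", "ALDH1A1", "CD24"],
}


def score_cluster(avg_expr, markers=MARKERS):
    """Mean expression per marker set; genes absent from avg_expr count as 0."""
    return {
        cell_type: sum(avg_expr.get(gene, 0.0) for gene in genes) / len(genes)
        for cell_type, genes in markers.items()
    }


def annotate(cluster_avgs, min_score=1.0):
    """Label each cluster with its best-scoring cell type, or 'unknown'."""
    labels = {}
    for cluster, avg_expr in cluster_avgs.items():
        scores = score_cluster(avg_expr)
        best = max(scores, key=scores.get)
        # min_score is an arbitrary illustrative cutoff, not a recommended value.
        labels[cluster] = best if scores[best] >= min_score else "unknown"
    return labels


# Hypothetical per-cluster average expression (e.g. log-normalized values).
CLUSTER_AVGS = {
    "0": {"CD3D": 3.2, "CD3E": 2.9, "CD8A": 1.5, "PTPRC": 2.0},
    "1": {"CD79A": 3.0, "MS4A1": 2.8, "CD19": 1.1},
    "2": {"EPCAM": 2.5, "KRT19": 3.5, "CD24": 2.0},
    "3": {"ACTB": 4.0},
}

if __name__ == "__main__":
    for cluster, label in annotate(CLUSTER_AVGS).items():
        print(cluster, label)
```

Because CD79A appears in both the B cell and plasma cell sets, averaging over the whole set is what separates the two here. A cluster expressing none of the markers falls back to "unknown" rather than being forced into a label.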
Finally, here are the GitHub links for each tool, copied from the list collected in the review:
| Name of method | Version | URL |
| --- | --- | --- |
| CELLBLAST | v0.3.8 | https://github.com/gao-lab/Cell_BLAST |
| CellFishing.jl | v0.3.2 | https://github.com/bicycle1885/CellFishing.jl |
| scmap-cell | v1.6.0 | https://github.com/hemberg-lab/scmap |
| ACTINN | master | https://github.com/mafeiyang/ACTINN |
| CaSTLe | v1.0.0.2 | https://github.com/yuvallb/CaSTLe |
| CHETAH | v1.2.0 | https://github.com/jdekanter/CHETAH |
| Garnett | v0.1.19 | https://github.com/cole-trapnell-lab/garnett |
| SciBet | v0.1.0 | https://github.com/zwj-tina/scibetR |
| scID | v2.1 | https://github.com/BatadaLab/scID |
| scLearn | v1.0 | https://github.com/bm2-lab/scLearn |
| scmap-cluster | v1.6.0 | https://github.com/hemberg-lab/scmap |
| scPred | v1.9.0 | https://github.com/powellgenomicslab/scPred |
| scVI | v0.4.1 | https://github.com/YosefLab/scvi-tools |
| Seurat | v3.2.2 | https://github.com/satijalab/seurat |
| SingleCellNet | v0.1.0 | https://github.com/pcahan1/singleCellNet |
| SingleR | v1.1.1 | https://github.com/dviraran/SingleR |
| CellAssign | v0.99.21 | https://github.com/Irrationone/cellassign |
| DigitalCellSorter | v1.1 | https://github.com/sdomanskyi/DigitalCellSorter |
| SCINA | v1.2.0 | https://github.com/jcao89757/SCINA |
| SCSA | master | https://github.com/bioinfo-ibms-pumc/SCSA |
| scTyper | v0.1.0 | https://github.com/omicsCore/scTyper |
| scHPL | v0.0.2 | https://github.com/lcmmichielsen/scHPL |
| MARS | master | https://github.com/snap-stanford/mars |
| clustifyr | v1.5.0 | https://github.com/rnabioco/clustifyr |
| scClassifR | v1.1.1 | https://github.com/grisslab/scClassifR |
| MarkerCount | master | https://github.com/combio-dku/MarkerCount/tree/master |
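Getting started with single-cell data processing takes some basic background knowledge; the 10 introductory lectures are also worth a look:
The most fundamental step is usually dimensionality reduction, clustering and grouping of cells. For that, see the earlier example: "Single-cell clustering and annotation that anyone can learn" (人人都能学会的单细胞聚类分群注释)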