Author | 柠檬不酸, editor at 单细胞天地
Determining the number of PCs as in the official Seurat tutorial (https://satijalab.org/seurat/v3.1/pbmc3k_tutorial.html)
library(Seurat)
load(file = 'Cluster_seurat.Rdata') # loads the object data.filt
seurat_data <- data.filt
# Explore heatmap of PCs
DimHeatmap(seurat_data, dims = 1:6, cells = 500, balanced = TRUE)
DimHeatmap(seurat_data, dims = 7:12, cells = 500, balanced = TRUE)
# Plot the elbow plot
ElbowPlot(object = seurat_data, ndims = 30)
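# JackStraw is slow because it relies on repeated random permutations;
# dims = 50 requires that the pca reduction holds at least 50 PCs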
seurat_data <- JackStraw(object = seurat_data, dims = 50)
seurat_data <- ScoreJackStraw(seurat_data, dims = 1:50)
JackStrawPlot(object = seurat_data, dims = 1:50)
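The three methods above give only a rough range for the number of PCs, and clustering results can differ considerably depending on the number chosen, so a more specific number is needed. The author proposes three criteria for setting the PC threshold: the cumulative contribution exceeds 90%, the contribution of the individual PC is below 5%, and the difference between consecutive PCs falls below 0.1%:
# Percent of the total standard deviation contributed by each PC
# (used here as a proxy for the percent of variation)
pct <- seurat_data[["pca"]]@stdev / sum(seurat_data[["pca"]]@stdev) * 100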
# Calculate cumulative percents for each PC
cumu <- cumsum(pct)
# First PC whose cumulative percent exceeds 90% and whose own percent is below 5%
co1 <- which(cumu > 90 & pct < 5)[1]
co1
# Difference in percent between each PC and the next one
co2 <- sort(which((pct[1:(length(pct) - 1)] - pct[2:length(pct)]) > 0.1), decreasing = TRUE)[1] + 1
# co2 is the PC right after the last drop of more than 0.1%
co2
# Minimum of the two calculations
pcs <- min(co1, co2)
pcs
# Create a data frame with the values
plot_df <- data.frame(pct = pct, cumu = cumu, rank = 1:length(pct))
# Elbow plot to visualize the cutoff; PCs beyond pcs are colored differently
library(ggplot2)
ggplot(plot_df, aes(cumu, pct, label = rank, color = rank > pcs)) +
  geom_text() +
  geom_vline(xintercept = 90, color = "grey") +
  geom_hline(yintercept = min(pct[pct > 5]), color = "grey") +
  theme_bw()
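The thresholds above can be wrapped in a small helper so they are easy to re-run on other objects. The following is a minimal sketch in base R, not part of Seurat or of the original workflow: the function name `choose_pcs`, its arguments, and the toy standard deviations are hypothetical, and the defaults simply mirror the 90%, 5%, and 0.1% cutoffs used above.

```r
# Hypothetical helper (not a Seurat function): applies the cutoffs used above
# to a vector of per-PC standard deviations.
choose_pcs <- function(stdev, cum_cut = 90, pct_cut = 5, diff_cut = 0.1) {
  pct <- stdev / sum(stdev) * 100
  cumu <- cumsum(pct)
  # First PC with cumulative percent above cum_cut and own percent below pct_cut
  co1 <- which(cumu > cum_cut & pct < pct_cut)[1]
  # Drop in percent from each PC to the next one
  drops <- -diff(pct)
  # PC right after the last drop larger than diff_cut
  co2 <- tail(which(drops > diff_cut), 1) + 1
  list(co1 = co1, co2 = co2, pcs = min(co1, co2))
}

# Toy standard deviations for 30 PCs, invented for illustration only
toy_stdev <- c(12, 9, 7, 5, 4, 3.5, 3.2, 3.0, rep(2, 22))
res <- choose_pcs(toy_stdev)
print(res)
```

With a real object, the input would be `seurat_data[["pca"]]@stdev`, as in the code above. The sketch does not guard against the case where no PC satisfies a criterion, so the result should be checked before it is passed on to clustering.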
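Inspect the highly variable genes associated with each PC. If the known marker genes of a rare cell type show up in a particular PC, we can keep all PCs from 1 up to that PC.
# Print the most variable genes driving each PC
print(x = seurat_data[["pca"]], dims = 1:25, nfeatures = 5)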
PC_ 1
Positive: NEIL1, LTB, KLF2, TP53INP1, CD27
Negative: TYMS, MKI67, PCLAF, RRM2, NUSAP1
PC_ 2
Positive: GZMA, ARL4C, PRF1, CST7, GZMM
Negative: SLC35E3, ID3, PRDX1, TOP2B, RPLP0
PC_ 3
Positive: HBA2, HBB, HBA1, AHSP, HBD
Negative: RPS18, RPL18A, RPS2, RPSA, RPL37A
PC_ 4
Positive: IGLL1, SLC35E3, PCDH9, CD38, F13A1
Negative: CCL17, HMBS, BLVRB, AQP1, CD36
PC_ 5
Positive: GYPC, RPS18, RPS2, C1QTNF4, RPL18A
Negative: MNDA, LYZ, S100A9, S100A8, FCN1
PC_ 6
Positive: PLK1, CDC20, CENPA, HMMR, CENPE
Negative: GINS2, MCM6, HELLS, MCM4, MCM3
PC_ 7
Positive: GYPC, C1QTNF4, LIMS1, NRIP1, S100A9
Negative: SPIB, TAGLN2, MS4A1, IGLC6, PTPRC
PC_ 8
Positive: FCGR3A, GZMB, SPON2, KLRF1, MYOM2
Negative: CCR7, CD3G, CD3D, IL7R, GPR183
PC_ 9
Positive: CCL17, LTB, TMEM154, CCND2, HSPA12B
Negative: ACTG1, LGALS1, IGLL1, CCDC81, TOP2B
PC_ 10
Positive: AHNAK, VIM, EMP1, LMNA, CD27
Negative: MT1X, CCL17, FTL, HSP90B1, NSMCE1
PC_ 11
Positive: NEIL1, LTB, FTH1, CFD, CST3
Negative: LCN2, RETN, S100A8, LTF, CAMP
PC_ 12
Positive: RPS12, RPLP1, RPL18A, EEF1B2, RPS5
Negative: HNRNPU, NCL, AHNAK, AC245060.5, EMP1
PC_ 13
Positive: CD3D, TRAC, CD3G, IGLC6, CD27
Negative: MARCH1, MS4A1, BANK1, ADAM28, LINC02397
PC_ 14
Positive: SCIMP, SRGN, GUSB, SHISA2, MARCH1
Negative: MS4A1, ZNF608, ENAM, CCND2, CCL17
PC_ 15
Positive: ATF5, HSPA5, PSAT1, PHGDH, MARCH1
Negative: NT5E, GIMAP4, TP53INP1, SHISA2, DBI
PC_ 16
Positive: ACSM3, IGLC6, SHISA2, REXO2, MT1X
Negative: CD82, GCHFR, PRDX1, UBASH3B, PTGDR
PC_ 17
Positive: MARCKSL1, FTH1, S100A1, CRIP2, EMP2
Negative: HSP90B1, HSPA5, UBASH3B, PPIB, FKBP5
PC_ 18
Positive: MARCH1, H3F3A, CALM2, ACTB, PRDX1
Negative: HSP90B1, ATF5, HSPA5, MT-ND6, CANX
PC_ 19
Positive: TRGC2, LGALS1, KLRG1, CCL5, PTMS
Negative: CCR7, TXK, FCER1G, CD7, TCF7
PC_ 20
Positive: PIM1, SOCS3, ADGRE5, RGCC, EPHA4
Negative: LRMP, BANK1, MS4A1, CLEC4E, NME1
PC_ 21
Positive: CCR7, CMTM2, S100A11, LRMP, TXK
Negative: TRGC2, RPS12, KLRG1, LCN6, RPS18
PC_ 22
Positive: CTGF, PMAIP1, FOS, KLF6, FOSB
Negative: FUT7, SLC9A3R2, LCN6, PPP1R14A, EMP3
PC_ 23
Positive: ATF5, PSAT1, HSP90B1, PHGDH, HSPA5
Negative: CTHRC1, NSMCE1, MAP1A, IGLL1, BTNL9
PC_ 24
Positive: SERINC2, LST1, NAMPT, MT1X, SLC25A37
Negative: SHISA2, DEPP1, GADD45A, PSTPIP2, CD33
PC_ 25
Positive: CDKN1C, RHOB, BATF3, CX3CR1, SERPINA1
Negative: FOS, ALDH2, MGST1, MPO, FOSB
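To check which PCs carry a known marker gene without reading the listing by eye, the top genes per PC can be searched programmatically. A minimal base R sketch follows; the helper name `pcs_with_marker` and the toy loadings matrix are hypothetical. With a real object, the genes-by-PCs matrix would presumably come from `Loadings(seurat_data, reduction = "pca")`.

```r
# Hypothetical helper: for each PC (column), check whether `marker` is among
# the `n` genes with the largest absolute loadings, and return the matching PCs.
pcs_with_marker <- function(loadings, marker, n = 5) {
  hits <- apply(loadings, 2, function(x) {
    top <- rownames(loadings)[order(abs(x), decreasing = TRUE)][seq_len(n)]
    marker %in% top
  })
  names(hits)[hits]
}

# Toy loadings (genes x PCs); values invented for illustration only
toy <- matrix(
  c( 0.9,  0.1,  0.0,
    -0.8,  0.2,  0.1,
     0.1,  0.7, -0.1,
     0.0,  0.1,  0.9,
     0.2, -0.6,  0.3),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("LTB", "MKI67", "GZMA", "HBB", "PRF1"),
                  c("PC_1", "PC_2", "PC_3"))
)
hbb_pcs <- pcs_with_marker(toy, "HBB", n = 2)
prf1_pcs <- pcs_with_marker(toy, "PRF1", n = 2)
print(hbb_pcs)
print(prf1_pcs)
```

The highest PC returned for a marker of interest gives a lower bound on how many PCs to keep if that cell type should remain resolvable.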