前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >Seurat软件学习8-不同细胞类型样本的分析流程

Seurat软件学习8-不同细胞类型样本的分析流程

原创
作者头像
小胡子刺猬的生信学习123
发布2022-12-12 17:23:08
6970
发布2022-12-12 17:23:08
举报

Seurat软件学习1-多个模型得数据进行整合:https://cloud.tencent.com/developer/article/2130078

Seurat软件学习2-scrna数据整合分析:https://cloud.tencent.com/developer/article/2131431

Seurat软件学习3-scrna数据整合分析注释数据集:https://cloud.tencent.com/developer/article/2133583

Seurat软件学习4-使用RPCA进行快速整合数据集:https://cloud.tencent.com/developer/article/2134684

Seurat软件学习5-scRNA-Seq和scATAC-Seq数据整合:https://cloud.tencent.com/developer/article/2136814

Seurat软件学习6-多模型参考映射的方法:https://cloud.tencent.com/developer/article/2144475

Seurat软件学习7-同胞多组学结合方法-WNN:https://cloud.tencent.com/developer/article/2152008

image.png
image.png

overview

本教程演示如何使用 mixscape 分析单细胞池 CRSIPR 内容。 我们引入新的 Seurat 函数用于: 计算每个细胞的干扰的特定特征。 识别并移除“逃脱”CRISPR 干扰的细胞。 可视化不同干扰之间的异同。

Loading required packages

代码语言:javascript
复制
# Load packages.
library(Seurat)
library(SeuratData)
library(ggplot2)
library(patchwork)
library(scales)
library(dplyr)
library(reshape2)

# Download dataset using SeuratData.
InstallData(ds = "thp1.eccite")

# Setup custom theme for plotting.
custom_theme <- theme(
  plot.title = element_text(size=16, hjust = 0.5), 
  legend.key.size = unit(0.7, "cm"), 
  legend.text = element_text(size = 14))

Loading Seurat object containing ECCITE-seq dataset

我们使用从 SeuratData 包中的由THP-1 细胞生成的 111 gRNA ECCITE-seq 数据集,该数据集由Papalexi 等人发表。

代码语言:javascript
复制
# Load object.
eccite <- LoadData(ds = "thp1.eccite")

# Normalize protein.
eccite <- NormalizeData(
  object = eccite, 
  assay = "ADT", 
  normalization.method = "CLR", 
  margin = 2)

RNA-based clustering is driven by confounding sources of variation

在这里,我们依照标准的 Seurat 工作流程,根据基因表达谱对细胞进行聚类。 我们希望得到出现特定现象的细胞类群,但我们发现类群主要由细胞周期阶段和复制 ID 影响的。 我们只观察到一个包含细胞表达 IFNgamma 通路 gRNA 的扰动特异性簇。

代码语言:javascript
复制
# Prepare RNA assay for dimensionality reduction: 
# Normalize data, find variable features and scale data.
DefaultAssay(object = eccite) <- 'RNA'
eccite <- NormalizeData(object = eccite) %>% FindVariableFeatures() %>% ScaleData()

# Run Principle Component Analysis (PCA) to reduce the dimensionality of the data.
eccite <- RunPCA(object = eccite)

# Run Uniform Manifold Approximation and Projection (UMAP) to visualize clustering in 2-D.
eccite <- RunUMAP(object = eccite, dims = 1:40)

# Generate plots to check if clustering is driven by biological replicate ID, 
# cell cycle phase or target gene class.
p1 <- DimPlot(
  object = eccite, 
  group.by = 'replicate', 
  label = F, 
  pt.size = 0.2, 
  reduction = "umap", cols = "Dark2", repel = T) +
  scale_color_brewer(palette = "Dark2") +
  ggtitle("Biological Replicate") +
  xlab("UMAP 1") +
  ylab("UMAP 2") +
  custom_theme

p2 <- DimPlot(
  object = eccite, 
  group.by = 'Phase', 
  label = F, pt.size = 0.2, 
  reduction = "umap", repel = T) + 
  ggtitle("Cell Cycle Phase") +
  ylab("UMAP 2") +
  xlab("UMAP 1") +
  custom_theme

p3 <- DimPlot(
  object = eccite, 
  group.by = 'crispr', 
  pt.size = 0.2, 
  reduction = "umap", 
  split.by = "crispr", 
  ncol = 1, 
  cols = c("grey39","goldenrod3")) + 
  ggtitle("Perturbation Status") +
  ylab("UMAP 2") +
  xlab("UMAP 1") +
  custom_theme

# Visualize plots.
((p1 / p2 + plot_layout(guides = 'auto')) | p3 )
image.png
image.png

Calculating local perturbation signatures mitigates confounding effects

为了计算局部扰动影响,我们将非目标最近邻 (NN) 的数量设置为 k=20,我们建议用户从以下范围中选择一个 k:20 < k < 30。一般来讲,并不推荐 将 k 设置为非常小或很大的数字,因为这很可能不会从数据集中消除技术差异。 使用 PRTB 签名对细胞进行聚类可消除所有技术差异并揭示一个额外的扰动特定聚类。

代码语言:javascript
复制
# Calculate perturbation signature (PRTB).
eccite<- CalcPerturbSig(
  object = eccite, 
  assay = "RNA", 
  slot = "data", 
  gd.class ="gene", 
  nt.cell.class = "NT", 
  reduction = "pca", 
  ndims = 40, 
  num.neighbors = 20, 
  split.by = "replicate", 
  new.assay.name = "PRTB")

# Prepare PRTB assay for dimensionality reduction: 
# Normalize data, find variable features and center data.
DefaultAssay(object = eccite) <- 'PRTB'

# Use variable features from RNA assay.
VariableFeatures(object = eccite) <- VariableFeatures(object = eccite[["RNA"]])
eccite <- ScaleData(object = eccite, do.scale = F, do.center = T)

# Run PCA to reduce the dimensionality of the data.
eccite <- RunPCA(object = eccite, reduction.key = 'prtbpca', reduction.name = 'prtbpca')

# Run UMAP to visualize clustering in 2-D.
eccite <- RunUMAP(
  object = eccite, 
  dims = 1:40, 
  reduction = 'prtbpca', 
  reduction.key = 'prtbumap', 
  reduction.name = 'prtbumap')

# Generate plots to check if clustering is driven by biological replicate ID, 
# cell cycle phase or target gene class.
q1 <- DimPlot(
  object = eccite, 
  group.by = 'replicate', 
  reduction = 'prtbumap', 
  pt.size = 0.2, cols = "Dark2", label = F, repel = T) +
  scale_color_brewer(palette = "Dark2") +
  ggtitle("Biological Replicate") +
  ylab("UMAP 2") +
  xlab("UMAP 1") +
  custom_theme

q2 <- DimPlot(
  object = eccite, 
  group.by = 'Phase', 
  reduction = 'prtbumap', 
  pt.size = 0.2, label = F, repel = T) +
  ggtitle("Cell Cycle Phase") +
  ylab("UMAP 2") +
  xlab("UMAP 1") + 
  custom_theme

q3 <- DimPlot(
  object = eccite,
  group.by = 'crispr',
  reduction = 'prtbumap', 
  split.by = "crispr", 
  ncol = 1, 
  pt.size = 0.2, 
  cols = c("grey39","goldenrod3")) +
  ggtitle("Perturbation Status") +
  ylab("UMAP 2") +
  xlab("UMAP 1") +
  custom_theme

# Visualize plots.
(q1 / q2 + plot_layout(guides = 'auto') | q3)
image.png
image.png

Mixscape identifies cells with no detectable perturbation

在这里,我们假设每个目标基因类别是两个高斯分布的混合,一个代表敲除 (KO) 细胞,另一个代表非扰动 (NP) 细胞。 我们进一步假设 NP 细胞的分布与表达非靶向 gRNA (NT) 的细胞的分布相同,并且我们尝试使用 mixtools 包中的函数 normalmixEM() 来估计 KO 细胞的分布。 接下来,我们计算一个细胞属于 KO 分布的后验概率,并将概率高于 0.5 的细胞分类为 KO。 应用这种方法,我们在 11 个目标基因类别中识别 KO,并检测每个类别中 gRNA 靶向效率的变化。

代码语言:javascript
复制
# Run mixscape.
eccite <- RunMixscape(
  object = eccite, 
  assay = "PRTB", 
  slot = "scale.data", 
  labels = "gene", 
  nt.class.name = "NT", 
  min.de.genes = 5, 
  iter.num = 10, 
  de.assay = "RNA", 
  verbose = F,
  prtb.type = "KO")

# Calculate percentage of KO cells for all target gene classes.
df <- prop.table(table(eccite$mixscape_class.global, eccite$NT),2)

df2 <- reshape2::melt(df)
df2$Var2 <- as.character(df2$Var2)
test <- df2[which(df2$Var1 == "KO"),]
test <- test[order(test$value, decreasing = T),]
new.levels <- test$Var2
df2$Var2 <- factor(df2$Var2, levels = new.levels )
df2$Var1 <- factor(df2$Var1, levels = c("NT", "NP", "KO"))
df2$gene <- sapply(as.character(df2$Var2), function(x) strsplit(x, split = "g")[[1]][1])
df2$guide_number <- sapply(as.character(df2$Var2), 
                           function(x) strsplit(x, split = "g")[[1]][2])
df3 <- df2[-c(which(df2$gene == "NT")),]

p1 <- ggplot(df3, aes(x = guide_number, y = value*100, fill= Var1)) +
  geom_bar(stat= "identity") +
  theme_classic()+
  scale_fill_manual(values = c("grey49", "grey79","coral1")) + 
  ylab("% of cells") +
  xlab("sgRNA")

p1 + theme(axis.text.x = element_text(size = 18, hjust = 1), 
           axis.text.y = element_text(size = 18), 
           axis.title = element_text(size = 16), 
           strip.text = element_text(size=16, face = "bold")) + 
  facet_wrap(vars(gene),ncol = 5, scales = "free") +
  labs(fill = "mixscape class") +theme(legend.title = element_text(size = 14),
          legend.text = element_text(size = 12))
image.png
image.png

Inspecting mixscape results

为了确保 mixscape 为细胞分配正确的扰动状态,我们可以使用下面的函数来查看目标基因类(例如 IFNGR2)内细胞的扰动分数分布和后验概率,并将其与 NT 细胞的那些进行比较。 此外,我们可以进行差异表达 (DE) 分析并显示只有 IFNGR2 KO 细胞的 IFNG 通路基因表达降低。 最后,作为一项独立检查,我们可以查看 NP 和 KO 细胞中已知为 PD-L1 调节因子的靶基因的 PD-L1 蛋白表达值。

代码语言:javascript
复制
# Explore the perturbation scores of cells.
PlotPerturbScore(object = eccite, 
                 target.gene.ident = "IFNGR2", 
                 mixscape.class = "mixscape_class", 
                 col = "coral2") +labs(fill = "mixscape class")
image.png
image.png
代码语言:javascript
复制
# Inspect the posterior probability values in NP and KO cells.
VlnPlot(eccite, "mixscape_class_p_ko", idents = c("NT", "IFNGR2 KO", "IFNGR2 NP")) +
  theme(axis.text.x = element_text(angle = 0, hjust = 0.5),axis.text = element_text(size = 16) ,plot.title = element_text(size = 20)) + 
  NoLegend() +
  ggtitle("mixscape posterior probabilities")
image.png
image.png
代码语言:javascript
复制
# Run DE analysis and visualize results on a heatmap ordering cells by their posterior 
# probability values.
Idents(object = eccite) <- "gene"
MixscapeHeatmap(object = eccite, 
                ident.1 = "NT", 
                ident.2 = "IFNGR2", 
                balanced = F, 
                assay = "RNA", 
                max.genes = 20, angle = 0, 
                group.by = "mixscape_class", 
                max.cells.group = 300, 
                size=6.5) + NoLegend() +theme(axis.text.y = element_text(size = 16))
image.png
image.png
代码语言:javascript
复制
# Show that only IFNG pathway KO cells have a reduction in PD-L1 protein expression.
VlnPlot(
  object = eccite, 
  features = "adt_PDL1", 
  idents = c("NT","JAK2","STAT1","IFNGR1","IFNGR2", "IRF1"), 
  group.by = "gene", 
  pt.size = 0.2, 
  sort = T, 
  split.by = "mixscape_class.global", 
  cols = c("coral3","grey79","grey39")) +
  ggtitle("PD-L1 protein") +
  theme(axis.text.x = element_text(angle = 0, hjust = 0.5), plot.title = element_text(size = 20), axis.text = element_text(size = 16))
image.png
image.png

Visualizing perturbation responses with Linear Discriminant Analysis (LDA)

我们使用 LDA 作为降维方法来可视化特定扰动的集群。 LDA 尝试使用基因表达和标签作为输入来最大化已知标签(mixscape 类)的可分离性。

代码语言:javascript
复制
# Remove non-perturbed cells and run LDA to reduce the dimensionality of the data.
Idents(eccite) <- "mixscape_class.global"
sub <- subset(eccite, idents = c("KO", "NT"))

# Run LDA.
sub <- MixscapeLDA(
  object = sub, 
  assay = "RNA", 
  pc.assay = "PRTB", 
  labels = "gene", 
  nt.label = "NT", 
  npcs = 10, 
  logfc.threshold = 0.25, 
  verbose = F)

# Use LDA results to run UMAP and visualize cells on 2-D. 
# Here, we note that the number of the dimensions to be used is equal to the number of 
# labels minus one (to account for NT cells).
sub <- RunUMAP(
  object = sub,
  dims = 1:11,
  reduction = 'lda',
  reduction.key = 'ldaumap',
  reduction.name = 'ldaumap')

# Visualize UMAP clustering results.
Idents(sub) <- "mixscape_class"
sub$mixscape_class <- as.factor(sub$mixscape_class)

# Set colors for each perturbation.
col = setNames(object = hue_pal()(12),nm = levels(sub$mixscape_class))
names(col) <- c(names(col)[1:7], "NT", names(col)[9:12])
col[8] <- "grey39"

p <- DimPlot(object = sub, 
             reduction = "ldaumap", 
             repel = T, 
             label.size = 5, 
             label = T, 
             cols = col) + NoLegend()

p2 <- p+ 
  scale_color_manual(values=col, drop=FALSE) + 
  ylab("UMAP 2") +
  xlab("UMAP 1") +
  custom_theme
p2
image.png
image.png

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • overview
  • Loading required packages
  • Loading Seurat object containing ECCITE-seq dataset
  • RNA-based clustering is driven by confounding sources of variation
  • Calculating local perturbation signatures mitigates confounding effects
  • Mixscape identifies cells with no detectable perturbation
  • Inspecting mixscape results
  • Visualizing perturbation responses with Linear Discriminant Analysis (LDA)
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档