首页
学习
活动
专区
圈层
工具
发布
30 篇文章
1
使用PHATE复现Science Immunology上文章的结果
2
你确定你研究的是成纤维细胞吗
3
读取loom格式的单细胞文件
4
velocyto的正确安装方法
5
Seurat4.0系列教程20:单细胞对象的格式转换
6
Seurat4.0系列教程8:细胞周期评分和回归分析
7
MACA: 一款自动注释细胞类型的工具
8
肺癌四阶段:AAH-AIS-MIA-IA的单细胞图谱
9
你认为是双细胞人家说是全新细胞亚群
10
copykat为什么没有infercnv直观呢
11
一大波神经元单细胞亚群相关的标志基因
12
单细胞转录组分析—追踪移植后造血干细胞的分化
13
单细胞转录组揭示肺腺癌特有的肿瘤微环境
14
小细胞肺癌(SCLC)病人的scRNA-seq数据分析
15
单细胞转录组分析COVID-19重症患者肺泡巨噬细胞亚型
16
CancerSCEM: 人类癌症单细胞表达图谱数据库
17
你真的需要如此多的单细胞亚群注释工具吗
18
使用PHATE进行单细胞高维数据的可视化
19
小鼠早期原肠化的转录异质性和细胞命运决定的scRNA-seq图谱
20
单细胞测序揭示PD-L1免疫治疗联合紫杉醇化疗在三阴性乳腺癌中的作用机制
21
单细胞转录组的细分亚群的降维聚类分群加上gsea或者gsva以及转录因子和拟时序流程(仅需8000元)
22
单细胞不同亚群和状态能区分吗
23
肿瘤相关成纤维细胞异质性
24
肿瘤样品的单细胞需要提取上皮细胞继续细分
25
乳腺癌患者抗PD1治疗期间肿瘤内变化的单细胞图谱
26
晚期非小细胞肺癌肿瘤异质性和微环境的单细胞分析
27
脑组织单细胞悬液制备流程
28
什么,你想要的单细胞亚群比例太少了?
29
让Single cell UMAP注释支棱起来
30
RNAvelocity4:velocyto.R的使用
清单首页生信文章详情

使用PHATE复现Science Immunology上文章的结果

在上篇文章中,我们初步探索了PHATE的使用方法,发现它在揭示一些连续分化过程中不同细胞状态之间的微小局部差异具有很好的效果,同时也能保留细胞全局的整体结构。在本节教程中,我们将复现演示近期发表在Science Immunology期刊上的一篇文章的结果,进一步学习PHATE的相关使用方法。

文章信息

文章题目:A reservoir of stem-like CD8+ T cells in the tumor-draining lymph node preserves the ongoing anti-tumor immune response 期刊:Science Immunology 日期:2021年9月2日 DOI:https://doi.org/10.1126/sciimmunol.abg7836

文章结果图:

本文复现图:

文章数据下载

文章处理后的基因表达矩阵文件存放在GEO数据库中,检索号为 GSE182509[1],这里只提供了pkl格式的表达矩阵,需要用python的pickle包进行读取,我已将其转换为TSV文件存放在我的百度云盘中,有需要的可以下载使用。

链接:https://pan.baidu.com/s/1IoSIYoEfTzZarLWXWvgvzg 提取码:gkd9

加载示例数据

代码语言:javascript
复制
# 导入所需的python包
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import phate
import scprep
import magic

# 加载示例数据
chronic = scprep.io.load_tsv("GSM5530565_Chronic_LCMV_processed.matrix.txt")
acute = scprep.io.load_tsv("GSM5530564_Acute_LCMV_processed.matrix.txt")
tumor8 = scprep.io.load_tsv("GSM5530560_Lung_week8_processed.matrix.txt")
dln8 = scprep.io.load_tsv("GSM5530561_LN_week8_processed.matrix.txt")
tumor17 = scprep.io.load_tsv("GSM5530562_Lung_week17_processed.matrix.txt")
dln17 = scprep.io.load_tsv("GSM5530563_LN_week17_processed.matrix.txt")

# 查看示例数据
chronic.head()
#                    Mrpl15 (ENSMUSG00000033845)  ...  CAAA01147332.1 (ENSMUSG00000095742)
#0                                                ...
#AAACCTGAGAGTAAGG-1                          0.0  ...                             0.000000
#AAACCTGGTGTGAAAT-1                          0.0  ...                             0.000000
#AAACGGGAGGCTCATT-1                          0.0  ...                             0.000000
#AAACGGGGTAGCTTGT-1                          0.0  ...                             0.000000
#AAACGGGGTCGAATCT-1                          0.0  ...                             0.552073
#[5 rows x 11595 columns]
chronic.shape
#(1185, 11595)
acute.shape
#(10768, 12960)
tumor8.shape
#(806, 11749)
dln8.shape
#(1742, 12116)
tumor17.shape
#(731, 11150)
dln17.shape
#(876, 11595)

数据质控预处理和数据归一化

由于下载的数据是预先已经进行质控过滤和归一化处理的,这里将不再进行处理。详细用法见上期 [使用PHATE进行单细胞高维数据的可视化]

使用PHATE进行低维嵌入降维可视化

代码语言:javascript
复制
### analysis for chronic sample ###
#Embedding Data Using PHATE
# Instantiating the PHATE estimator
phate_operator = phate.PHATE(n_jobs=-2)

Y_phate = phate_operator.fit_transform(chronic)
Y_phate
#array([[ -5.34563999,   6.11670333],
#       [ 29.8771147 ,   8.33029219],
#       [  9.44218856, -25.94568946],
#       ...,
#       [-10.86402299,  -1.15554   ],
#       [-14.65615727,  -1.03794057],
#       [-16.53150258,   4.50993521]])

细胞聚类分群

代码语言:javascript
复制
# cell clustering
clusters = phate.cluster.kmeans(phate_operator, n_clusters=8)
clusters
#array([0, 2, 1, ..., 4, 4, 4], dtype=int32)

# 聚类结果可视化
scprep.plot.scatter2d(Y_phate, c=clusters, figsize=(8,7), cmap="Spectral",
                      ticks=False, label_prefix="PHATE", title= "Chronic LCMV")
plt.savefig("plot_phate_chronic_2d_by_cluster.png")

使用MAGIC进行数据填充

代码语言:javascript
复制
#Creating the MAGIC operator
magic_op = magic.MAGIC()

# Running MAGIC for all genes
chronic_magic = magic_op.fit_transform(chronic, genes='all_genes')
chronic_magic.head()
#                    Mrpl15 (ENSMUSG00000033845)  ...  CAAA01147332.1 (ENSMUSG00000095742)
#0                                                ...
#AAACCTGAGAGTAAGG-1                     0.150924  ...                             0.076723
#AAACCTGGTGTGAAAT-1                     0.153466  ...                             0.067584
#AAACGGGAGGCTCATT-1                     0.146601  ...                             0.064838

# rename columns names
chronic_magic.columns = [i.split(" ")[0] for i in chronic_magic.columns.tolist()]

# marker基因可视化
markers = ["Sell", "Ccr7", "Tcf7", "Slamf6", "Xcl1", "Il7r", "Eomes", "Tbx21", "Gzmb", "Prf1", "Pdcd1", "Havcr2", "Cd101", "Cx3cr1", "Cxcr6"]

for marker in markers:
        # 2d plot
        scprep.plot.scatter2d(Y_phate, c=chronic_magic[marker], figsize=(8,7), cmap="Reds", ticks=False, label_prefix="PHATE", title=marker + " magic expression")
        plt.savefig("plot_chronic_magic_marker_2d_" + marker + ".png")

根据这些marker基因的表达情况,我们将不同的cluster进行细胞类型的注释,得到以下的细胞注释结果。

  • Tnaive:Sell, Ccr7
  • Tsl: Tcf7, Slamf6, Xcl1
  • Ttrans: Cx3cr1, Cxcr6
  • Tex: Pdcd1, Havcr2, Cd101

多样本合并分析

代码语言:javascript
复制
### analysis for combined five sample ###
# Merge all datasets and create a vector representing the time point of each sample
alldata = [chronic,tumor8,tumor17,dln8,dln17]

EBT_counts, sample_labels = scprep.utils.combine_batches(
    alldata,
    ["Chronic","Early Tumor","Late Tumor","Early LN","Late LN"],
    append_to_cell_names=True
)
del alldata # removes objects from memory

EBT_counts.head()
#                            Mrpl15 (ENSMUSG00000033845)  ...  CAAA01147332.1 (ENSMUSG00000095742)
#AAACCTGAGAGTAAGG-1_Chronic                          0.0  ...                             0.000000
#AAACCTGGTGTGAAAT-1_Chronic                          0.0  ...                             0.000000
#AAACGGGAGGCTCATT-1_Chronic                          0.0  ...                             0.000000
#AAACGGGGTAGCTTGT-1_Chronic                          0.0  ...                             0.000000
#AAACGGGGTCGAATCT-1_Chronic                          0.0  ...                             0.552073
#[5 rows x 10246 columns]

EBT_counts.shape
#(5340, 10246)

sample_labels
#AAACCTGAGAGTAAGG-1_Chronic    Chronic
#AAACCTGGTGTGAAAT-1_Chronic    Chronic
#AAACGGGAGGCTCATT-1_Chronic    Chronic
#AAACGGGGTAGCTTGT-1_Chronic    Chronic
#AAACGGGGTCGAATCT-1_Chronic    Chronic
#                               ...
#Name: sample_labels, Length: 5340, dtype: object

PHATE降维可视化

代码语言:javascript
复制
#Embedding Data Using PHATE
# Instantiating the PHATE estimator
phate_operator = phate.PHATE(n_jobs=-2)

Y_phate = phate_operator.fit_transform(EBT_counts)
Y_phate
#array([[ 79.05651647,  15.42592929],
#       [ 20.72815444,  25.65566379],
#       [ 90.57712893,  -4.77917562],
#       ...,
#       [-52.39011592,  39.20142516],
#       [-28.51731009,  -8.66499775],
#       [-46.00734805, -17.37265621]])

scprep.plot.scatter2d(Y_phate, c=sample_labels, figsize=(10,8), cmap="Spectral",
                      ticks=False, label_prefix="PHATE")
plt.savefig("plot_phate_2d_by_sample.png")
代码语言:javascript
复制
# 3D visualization
phate_operator.set_params(n_components=3)
Y_phate_3d = phate_operator.transform()
Y_phate_3d
#array([[ 77.85894712,  13.21732854, -11.29708591],
#       [ 17.23971363,  19.47621167, -27.54015408],
#       [ 87.69788489,   4.0231593 ,  13.16082805],
#       ...,
#       [-46.46850791,  42.68065699,  -2.93836416],
#       [-20.53357918,  -4.9821892 , -22.70265013],
#       [-36.18351133,  -9.56494547, -29.67206442]])

scprep.plot.scatter3d(Y_phate_3d, c=sample_labels, figsize=(8,6), cmap="Spectral",
                      ticks=False, label_prefix="PHATE")
plt.savefig("plot_phate_3d_by_sample.png")

细胞聚类分群

代码语言:javascript
复制
# cell clustering
clusters = phate.cluster.kmeans(phate_operator, n_clusters=12)
clusters
#array([8, 6, 1, ..., 2, 4, 4], dtype=int32)

# save meta data
meta = pd.merge(pd.DataFrame(sample_labels),pd.DataFrame(clusters,index=sample_labels.index,columns=["cluster"]),left_index=True,right_index=True)
meta.head()
#                           sample_labels  cluster
#AAACCTGAGAGTAAGG-1_Chronic       Chronic        8
#AAACCTGGTGTGAAAT-1_Chronic       Chronic        6
#AAACGGGAGGCTCATT-1_Chronic       Chronic        1
#AAACGGGGTAGCTTGT-1_Chronic       Chronic        8
#AAACGGGGTCGAATCT-1_Chronic       Chronic        8

meta.to_csv("metadata.csv")

scprep.plot.scatter3d(Y_phate_3d, c=clusters, figsize=(8,6), cmap="Spectral",
                      ticks=False, label_prefix="PHATE")
plt.savefig("plot_phate_3d_by_cluster.png")
代码语言:javascript
复制
scprep.plot.scatter2d(Y_phate, c=clusters, figsize=(8,6), cmap="Spectral",
                      ticks=False, label_prefix="PHATE")
plt.savefig("plot_phate_2d_by_cluster.png")
代码语言:javascript
复制
# to save as a gif:
scprep.plot.rotate_scatter3d(Y_phate_3d, c=sample_labels,
                             figsize=(8,6), cmap="Spectral",
                             ticks=False, label_prefix="PHATE", filename="phate.gif")

使用MAGIC进行数据填充

代码语言:javascript
复制
# rename columns names
EBT_counts.columns = [i.split(" ")[0] for i in EBT_counts.columns.tolist()]

# MAGIC imputation
#Creating the MAGIC operator
magic_op = magic.MAGIC()

#Running MAGIC with gene selection
#bmmsc_magic = magic_op.fit_transform(bmmsc_data, genes=["Mpo", "Klf1", "Ifitm1"])
#bmmsc_magic.head()

#Visualizing gene-gene relationships
#fig, (ax1, ax2) = plt.subplots(1,2, figsize=(16, 6))

#scprep.plot.scatter(x=bmmsc_data['Mpo'], y=bmmsc_data['Klf1'], c=bmmsc_data['Ifitm1'],  ax=ax1,
#                    xlabel='Mpo', ylabel='Klf1', legend_title="Ifitm1", title='Before MAGIC')

#scprep.plot.scatter(x=bmmsc_magic['Mpo'], y=bmmsc_magic['Klf1'], c=bmmsc_magic['Ifitm1'], ax=ax2, xlabel='Mpo', ylabel='Klf1', legend_title="Ifitm1", title='After MAGIC')
#plt.tight_layout()
#plt.show()

# Running MAGIC for all genes
EBT_counts_magic = magic_op.fit_transform(EBT_counts, genes='all_genes')
EBT_counts_magic.head()

markers = ["Sell", "Ccr7", "Tcf7", "Slamf6", "Xcl1", "Il7r", "Eomes", "Tbx21", "Gzmb", "Prf1", "Pdcd1", "Havcr2", "Cd101", "Cx3cr1", "Cxcr6"]

for marker in markers:
        # 2d plot
        scprep.plot.scatter2d(Y_phate, c=EBT_counts_magic[marker], figsize=(8,6), cmap="Reds", ticks=False, label_prefix="PHATE", title=marker + " magic expression")
        plt.savefig("plot_magic_marker_2d_" + marker + ".png")
        # 3d plot
        scprep.plot.scatter3d(Y_phate_3d, c=EBT_counts_magic[marker], figsize=(8,6), cmap="Reds", ticks=False, label_prefix="PHATE", title=marker + " magic expression")
        plt.savefig("plot_magic_marker_3d_" + marker + ".png")

基因差异表达分析

代码语言:javascript
复制
# Differential analysis
# DE by samples
de_samples = scprep.stats.differential_expression_by_cluster(EBT_counts_magic,clusters=sample_labels,direction="up")
de_samples
de_samples["Chronic"]

for key,value in de_samples.items():
        print(value.head(n=10))

# save DE data
de_samples["Chronic"].to_csv("DEs_Chronic.csv")
de_samples["Early Tumor"].to_csv("DEs_Early_Tumor.csv")
de_samples["Early LN"].to_csv("DEs_Early_LN.csv")

# DE by clusters
de_clusters = scprep.stats.differential_expression_by_cluster(EBT_counts_magic,clusters=clusters,direction="up")
de_clusters
de_clusters[0]

for key,value in de_clusters.items():
        print(value.head(n=10))

文中链接

[1]

GSE182509: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE182509

下一篇
举报
领券