生信交流与合作请关注公众号@生信探索
我把常用的函数写成了几个包,方便之后使用; bioquest 包括三个子包 tl、pl、st 分别是常用的工具包 DdatafFame 的处理、画图、字符串处理; genekit 包括基因名转换、格式转换、差异分析、提取 TCGA 数据等的函数; sckit 包括单细胞分析的一些函数; 可以在https://jihulab.com/BioQuest找到这些函数。
在读入和后续分析的时候会出现报错,解决方法如下,修改后再读入。
兼容性问题:修改 tissue_positions.csv 为 tissue_positions_list.csv,并且把 header 删除
cd ~/Project/ST/data/A/outs/spatial
cp tissue_positions.csv tissue_positions_list.csv
sed -i '1d' tissue_positions_list.csv
cd ~/Project/ST/data/P/outs/spatial
cp tissue_positions.csv tissue_positions_list.csv
sed -i '1d' tissue_positions_list.csv
from pathlib import Path
import re
import warnings
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scanpy as sc
import seaborn as sns
import squidpy as sq
import bioquest as bq
import sckit as sk
可以使用 scanpy 或 squidpy 读入,两者是兼容的
adata_dct = {}
for i in Path("data").glob("**/outs"):
_s = str(i).split('/')[1]
_a = sc.read_visium(i,library_id=_s)
_a.var_names_make_unique()
adata_dct[_s] = _a
adata = sc.concat(adata_dct,label="library_id",uns_merge="unique")
adata.var_names_make_unique()
adata.obs_names_make_unique()
adata
# AnnData object with n_obs × n_vars = 6049 × 32285
# obs: 'in_tissue', 'array_row', 'array_col', 'library_id'
# uns: 'spatial'
# obsm: 'spatial'
跟普通 scRNA-seq 没有太大差别
sk.qcMetrics(adata,batch_key="library_id",output_dir=OUTPUT_DIR,mitochondrion=True)
sk.qc_hist(adata,batch_key="library_id",n_genes_by_counts=(0,6000),total_counts=(0,60000),output_dir=OUTPUT_DIR)
顺序是 A、P、A、P
gene_bottom,gene_up = np.quantile(adata.obs.n_genes_by_counts.values, [.2,.98])
count_bottom,count_up = np.quantile(adata.obs.total_counts.values, [.2,.98])
afilter = {
"pct_counts_Mito":"x<30",
"n_genes_by_counts":f"{gene_bottom}<x<{gene_up}",
"total_counts":f"{count_bottom}<x<{count_up}"
}
adata = sk.subset(adata,afilter,inplace=False) # inplace还没实现,adata[]取子集后id会改变
adata = sk.normalise(adata,batch_key='library_id',flavor="Seurat_v3",n_top_genes=3000,output_dir=OUTPUT_DIR,pca_use_hvg=True,n_jobs=24, inplace=False)
sc.tl.leiden(adata, key_added="Cluster")
sc.tl.tsne(adata,n_jobs=24)
sc.pl.tsne(adata,color="Cluster",legend_loc="on data",show=False,legend_fontoutline=3);
sc.pl.umap(adata, color="Cluster",legend_loc="on data",show=False,legend_fontoutline=3);
fig, axes = plt.subplots(1, 2, figsize=(6, 3))
for x,y in zip(axes,sorted(adata.uns['spatial'].keys())):
sc.pl.spatial(adata[adata.obs.library_id==y],
frameon=False,
color="Cluster",
size=1.5,
library_id=y,
title=y,
ax=x,
show=False,
legend_loc="right margin" if y=='P' else None);
plt.subplots_adjust(wspace=0.1)
https://www.10xgenomics.com/cn/products/spatial-gene-expression
https://scanpy-tutorials.readthedocs.io/en/latest/spatial/basic-analysis.html
https://squidpy.readthedocs.io/en/stable/auto_tutorials/tutorial_visium_hne.html
https://www.jianshu.com/p/391b2ee2c22d
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。