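Hi everyone. Today we'll look at converting between AnnData objects in Python (Scanpy) and Seurat objects.
We often have several single-cell samples to process, and some authors upload them as h5 files, a format commonly used in Python. If we would rather analyze these samples with Seurat, what should we do? Suppose our directory contains several such samples, one h5 file per sample.
Note: because there are multiple samples, we read the h5 files in a loop in Python into a list, concatenate them into one large adata object, and finally export it.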
import os

import scanpy as sc

# Define the directory containing the h5 files
directory = '/home/data/t040413/20240125_kidney'

# List all h5 files in the directory
h5_files = [f for f in os.listdir(directory) if f.endswith('.h5')]
print(h5_files)

# Initialize an empty list to store the AnnData objects
adata_list = []

# Read each h5 file and append the AnnData object to the list
for file in h5_files:
    file_path = os.path.join(directory, file)
    adata = sc.read_10x_h5(file_path)
    adata.var_names_make_unique()

    # Extract the sample name from the file name
    sample_name = file.replace("_filtered_feature_bc_matrix.h5", "")
    print(sample_name)
    adata.obs['sample'] = sample_name  # Assign sample name to each cell

    # Basic filtering
    sc.pp.filter_cells(adata, min_genes=200)
    sc.pp.filter_genes(adata, min_cells=3)

    adata_list.append(adata)

# Concatenate all AnnData objects into one.
# batch_key only adds a label column; the real per-sample grouping was already stored in obs['sample'] above.
# Note: AnnData.concatenate is deprecated in recent anndata releases in favor of anndata.concat.
all_data = adata_list[0].concatenate(*adata_list[1:], batch_key='Sample')

# Print the concatenated data
print(all_data)
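The sample name in the loop above comes from stripping the `_filtered_feature_bc_matrix.h5` suffix from each file name. As a minimal sketch of that step in isolation, using only the standard library, the helper below collects the h5 files and their sample names before anything is loaded. The function name `list_samples`, the fallback to the bare file stem, and the duplicate check are illustrative assumptions, not part of the original workflow.

```python
from pathlib import Path

# Suffix used by the file names in this article
SUFFIX = "_filtered_feature_bc_matrix.h5"


def list_samples(directory):
    """Return {sample_name: path} for every .h5 file in `directory`, sorted by file name."""
    samples = {}
    for path in sorted(Path(directory).glob("*.h5")):
        name = path.name
        # Strip the suffix when present; otherwise fall back to the file stem (assumption)
        sample = name[: -len(SUFFIX)] if name.endswith(SUFFIX) else path.stem
        if sample in samples:
            raise ValueError(f"duplicate sample name: {sample}")
        samples[sample] = path
    return samples
```

Printing the returned dictionary before the read loop is a cheap way to confirm that every file maps to the sample label you expect to see later in the `sample` column.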
Export the adata data from Python
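import os

import scipy.io as sio

cellinfo = all_data.obs
geneinfo = all_data.var
# AnnData stores cells x genes; transpose to genes x cells, the layout Seurat expects
mtx = all_data.X.T

cellinfo.to_csv("cellinfo.csv")
geneinfo.to_csv("geneinfo.csv")
sio.mmwrite("sparse_matrix.mtx", mtx)

# Show where the files were written (the original notebook used the Jupyter shell magic `!pwd`)
print(os.getcwd())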
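Because the matrix file stores no gene or cell names, the row and column names are reattached later purely by position. A quick consistency check on the three exported files can catch a mismatch early. The following is a minimal sketch using only the standard library: it reads the size line of the Matrix Market file and compares it with the number of data rows in the two CSV files. The function names are hypothetical helpers introduced here for illustration.

```python
import csv


def mtx_shape(path):
    """Return (rows, cols) from the size line of a Matrix Market file."""
    with open(path, "r") as fh:
        for line in fh:
            # Header and comment lines start with '%'
            if line.startswith("%") or not line.strip():
                continue
            rows, cols = line.split()[:2]
            return int(rows), int(cols)
    raise ValueError(f"no size line found in {path}")


def csv_data_rows(path):
    """Count non-empty CSV records, excluding the header row."""
    with open(path, newline="") as fh:
        return sum(1 for row in csv.reader(fh) if row) - 1


def check_export(mtx_path, geneinfo_path, cellinfo_path):
    """True if the matrix is genes x cells and matches both tables."""
    n_genes, n_cells = mtx_shape(mtx_path)
    return (n_genes == csv_data_rows(geneinfo_path)
            and n_cells == csv_data_rows(cellinfo_path))
```

Calling `check_export("sparse_matrix.mtx", "geneinfo.csv", "cellinfo.csv")` in the export directory should return `True` when the transposed matrix lines up with both tables.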
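Read the exported files in R and build the Seurat object
library(Seurat)

# pandas writes the index under an empty header, which read.csv names "X"
cellinfo = read.csv("./cellinfo.csv", row.names = "X")
head(cellinfo)
geneinfo = read.csv("./geneinfo.csv", row.names = "X")
head(geneinfo)
# Keep only columns 2 and 3 of the gene table
geneinfo = geneinfo[, c(2, 3)]
head(geneinfo)

# counts = ReadMtx(mtx = "./sparse_matrix.mtx",
#                  cells = "./cellinfo.csv",
#                  features = "./geneinfo.csv")
counts = Matrix::readMM(file = "./sparse_matrix.mtx")
head(counts)[, 1:9]
dim(counts)

# Restore the gene and cell names, which the .mtx file does not store
rownames(counts) = rownames(geneinfo)
colnames(counts) = rownames(cellinfo)

kidney = CreateSeuratObject(counts = counts, project = "kidney", meta.data = cellinfo)
dim(kidney)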
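You can then happily run the downstream analysis with Seurat in R.
So far, this is the most general way I have found to convert an AnnData object into a Seurat object. Many R packages provide functions for converting between AnnData and Seurat objects, but they frequently throw errors, so this basic approach is often the more reliable choice.
From here, readers can try converting an h5ad file into a Seurat object on their own.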
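As a starting point for that exercise, here is a minimal sketch that wraps the same three-file export in one function, so it can be applied to any AnnData object, including one loaded from an h5ad file. The function name `export_for_seurat` and the optional `layer` argument are assumptions for illustration. Whether a given h5ad file keeps raw counts in `X` or in a layer (for example one named "counts") depends on how its author saved it, so check before exporting.

```python
import os

import scipy.io as sio
import scipy.sparse as sparse


def export_for_seurat(adata, out_dir, layer=None):
    """Write cellinfo.csv, geneinfo.csv and sparse_matrix.mtx for an AnnData object."""
    os.makedirs(out_dir, exist_ok=True)
    # Use X by default, or a named layer if the counts live there (assumption)
    matrix = adata.X if layer is None else adata.layers[layer]
    # AnnData is cells x genes; transpose to genes x cells.
    # csr_matrix() also converts a dense X, so the file uses the sparse coordinate format.
    mtx = sparse.csr_matrix(matrix).T
    adata.obs.to_csv(os.path.join(out_dir, "cellinfo.csv"))
    adata.var.to_csv(os.path.join(out_dir, "geneinfo.csv"))
    sio.mmwrite(os.path.join(out_dir, "sparse_matrix.mtx"), mtx)
    return mtx.shape

# Hypothetical usage (the file name is a placeholder):
# import anndata as ad
# export_for_seurat(ad.read_h5ad("my_data.h5ad"), "export")
```

The R side then stays the same as above: read the two CSV files, load the matrix with `Matrix::readMM`, and reattach the row and column names.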
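Original content notice: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, please contact cloudcommunity@tencent.com for removal.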