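That said, the analysis workflow above is by now utterly commonplace, and there is actually a lot more you can do with an expression matrix. Take a recently published paper (2019 Jun 4, doi: 10.1080/2162402X.2019.1617588). The title is long: "Immune microenvironment profiling of gastrointestinal stromal tumors (GIST) shows gene expression patterns associated to immune checkpoint inhibitors response", but in practice it is just an analysis of expression matrices from cancer patients: gene expression profiles (GEP) from 31 KIT/PDGFRA-mutant GIST (gastrointestinal stromal tumors). Let's take a look at what they analyzed.
Both microarrays and sequencing can produce an expression matrix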
For the RNA-seq samples, the cDNA libraries were synthesized starting from 250 ng total RNA with TruSeq RNA Sample Prep Kit v2 (Illumina, San Diego, CA, USA) following the manufacturer’s protocol. HiScanSQ sequencer (Illumina) was used to generate sequences at 75 bp in paired-end mode yielding an average of 61 million mapped reads/sample, reaching an average coverage of 44X. Read pairs were mapped on reference human genome and the gene expression was quantified using kallisto adopting the transcripts per million (TPM) normalization.
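To make the TPM normalization mentioned above concrete, here is a minimal sketch of the arithmetic. kallisto already reports TPM itself (the `tpm` column of its `abundance.tsv`, next to `est_counts` and `eff_length`), so this is only an illustration; the function name `tpm` and the toy numbers are assumptions, not values from the paper.

```python
def tpm(counts, eff_lengths):
    """Convert estimated read counts to transcripts per million (TPM).

    counts      -- estimated counts per transcript
    eff_lengths -- effective transcript lengths in bp (same order)
    """
    # Length-normalized rate: reads per base of effective length.
    rates = [c / l for c, l in zip(counts, eff_lengths)]
    total = sum(rates)
    # Rescale so that the values of one sample sum to one million.
    return [r / total * 1e6 for r in rates]


# Toy data (hypothetical): three transcripts from one sample.
counts = [100, 200, 300]
eff_lengths = [1000, 2000, 1000]
values = tpm(counts, eff_lengths)
```

Because every sample is rescaled to the same total, TPM values are comparable within a sample, which is why the log2TPM step later in the methods can start directly from them.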
For microarray samples, the RNA was quality-controlled and labeled as indicated by the Affymetrix expression technical manual and then hybridized to HG-U133Plus 2.0 arrays. Gene expression data were normalized and quantified by the RMA algorithm (package oligo, R-bioconductor).
The analytical tool CIBERSORT (Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts) [35] was applied on 31 tumor samples that were analyzed either with Affymetrix Array (19 samples) or Illumina whole transcriptome sequencing (12 samples) as described in Table 1.
CIBERSORT uses a set of 22 immune cell reference profiles to derive a signature matrix which can be applied to deconvolute mixed samples in order to determine relative proportions of immune cells. Although the CIBERSORT algorithm was originally developed using microarray data, it was declared as “platform agnostic” [36] and, therefore, applicable to both Affymetrix and Illumina data.
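The idea of deconvolution with a signature matrix can be sketched in a few lines. Note the swap: CIBERSORT itself fits the mixture with nu-support vector regression, while the sketch below uses plain non-negative least squares solved by projected gradient descent, simply because it fits in pure Python. The function name `deconvolve`, the 4-gene by 3-cell-type signature matrix, and the mixture are all made up for illustration.

```python
def deconvolve(signature, mixture, n_iter=5000):
    """Estimate cell-type fractions f >= 0 minimizing ||S f - m||^2.

    signature -- list of rows (genes), each a list over cell types
    mixture   -- expression of the same genes in the bulk sample
    NOTE: a simplified stand-in, not the CIBERSORT algorithm.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # Safe step size: 1 / trace(S^T S) never exceeds 1 / largest eigenvalue.
    step = 1.0 / sum(v * v for row in signature for v in row)
    f = [0.0] * n_types
    for _ in range(n_iter):
        # Residual S f - m for every gene.
        resid = [sum(row[k] * f[k] for k in range(n_types)) - m
                 for row, m in zip(signature, mixture)]
        # Gradient S^T (S f - m), then project onto f >= 0.
        grad = [sum(signature[g][k] * resid[g] for g in range(n_genes))
                for k in range(n_types)]
        f = [max(0.0, fk - step * gk) for fk, gk in zip(f, grad)]
    total = sum(f)
    # Rescale to relative proportions that sum to 1.
    return [fk / total for fk in f] if total > 0 else f


# Hypothetical signature matrix: 4 genes x 3 cell types.
signature = [[10, 1, 1],
             [1, 10, 1],
             [1, 1, 10],
             [5, 5, 1]]
# Bulk sample built from fractions 0.5 / 0.3 / 0.2 of the three types.
mixture = [5.5, 3.7, 2.8, 4.2]
fractions = deconvolve(signature, mixture)
```

The sketch returns relative fractions only; the "absolute estimation" used for the clustering step in the paper is a CIBERSORT output mode that this toy does not reproduce.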
The analysis was performed separately for the two sets of data obtained with different techniques (microarray and RNA-seq).
For each set, an unsupervised hierarchical clustering analysis was adopted using the CIBERSORT absolute estimation with the aim of evaluating the variability of the main microenvironment cell subpopulations.
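As a rough picture of what unsupervised hierarchical clustering does with such cell-type scores, here is a small agglomerative clustering sketch. The excerpt does not state the distance metric or linkage method, so Euclidean distance and average linkage are assumptions here, and the four sample profiles are invented toy values.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(samples, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    samples -- list of score vectors, one per tumor sample
    Returns a list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = sum(euclidean(samples[a], samples[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical scores for 3 cell types in 4 samples.
samples = [[0.80, 0.10, 0.10],
           [0.75, 0.15, 0.10],
           [0.10, 0.20, 0.70],
           [0.15, 0.15, 0.70]]
groups = average_linkage(samples, n_clusters=2)
```

In practice one would usually reach for `hclust` in R or `scipy.cluster.hierarchy` and draw a heatmap with a dendrogram; the loop above only shows the merging logic behind it.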
The transcript quantification data were first normalized and log2 transformed, using quantile normalization for the microarray data and log2TPM calculation for the RNA-seq data.
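Quantile normalization forces every sample to share the same distribution of values. A minimal sketch, assuming a simple version that breaks ties by position rather than averaging them (production implementations handle ties more carefully); the function name and toy numbers are illustrative only.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (equal-length lists of values).

    Each value is replaced by the mean, across samples, of the values
    that hold the same rank. Ties are broken by position (assumption).
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the r-th smallest value per sample.
    rank_means = [sum(s[r] for s in sorted_samples) / len(samples)
                  for r in range(n)]
    result = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        result.append(out)
    return result


# Toy data (hypothetical): two samples, three probes each.
raw = [[5.0, 2.0, 3.0],
       [4.0, 1.0, 6.0]]
normalized = quantile_normalize(raw)
```

After this step the two samples contain exactly the same set of values, only assigned to different probes according to each sample's own ranking.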
An additional normalization was performed by subtracting the arithmetic mean expression of ten housekeeping genes (Supplementary Table S7).
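The log2TPM transform followed by housekeeping-gene centering can be sketched as below. The pseudocount of 1 is an assumption (the excerpt does not say how zeros were handled), and the gene names `HK_A`, `HK_B`, and `GENE_X` are placeholders, not the ten genes listed in Supplementary Table S7.

```python
import math


def log2_tpm(tpm, pseudocount=1.0):
    """log2-transform TPM values; the pseudocount avoids log2(0) (assumption)."""
    return {gene: math.log2(v + pseudocount) for gene, v in tpm.items()}


def housekeeping_normalize(log_expr, housekeeping):
    """Subtract the arithmetic mean of the housekeeping genes from every gene."""
    ref = sum(log_expr[g] for g in housekeeping) / len(housekeeping)
    return {gene: v - ref for gene, v in log_expr.items()}


# Toy sample with placeholder gene names (hypothetical values).
sample_tpm = {"HK_A": 7.0, "HK_B": 31.0, "GENE_X": 63.0}
log_expr = log2_tpm(sample_tpm)
centered = housekeeping_normalize(log_expr, ["HK_A", "HK_B"])
```

After centering, each value reads as a log2 fold difference relative to the average housekeeping level of that same sample, which gives the two platforms a common reference point.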