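Earlier, on 生信技能树, we systematically covered a great deal of RNA-seq background, along with the general workflow for analyzing an expression matrix.
For the differential expression analysis we used three pipelines: limma/voom, edgeR, and DESeq2. Many readers want to know which one they should choose and how the three differ.
For the underlying statistical theory, we recommend reading:
Here we go straight to the results, since we happened to come across this figure while revisiting TCGAbiolinks recently.
Breaking the figure down: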
DEA analyses of TCGA-BRCA data comparing luminal subtypes with normal samples. A-B) Volcano plots are shown where only those genes with logFC higher than 6 or lower than -6 are labelled and only the significant up- or down-regulated genes are shown as dots.
We carried out DEA using the limma (A) or edgeR pipelines (B) of TCGAbiolinks. C) The correlation plot between the logFC estimated by the two pipelines for the top 500 DE genes is shown. The genes discussed in the main text are highlighted in bold. D) The intersect between all the DE genes estimated by the two pipelines is shown using UpSet. https://doi.org/10.1371/journal.pcbi.1006701.g003
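To make the comparison in panels A-D concrete, here is a minimal Python sketch of the same three checks: which genes would be labelled on a volcano plot, how well the logFC values from two pipelines correlate, and how the two DE gene lists overlap. This is not the TCGAbiolinks code (that package is written in R). The gene names, logFC values, and FDR values are invented toy data, and the cutoffs `lfc_cut=1.0` and `fdr_cut=0.01` are assumptions chosen for illustration. Only the labelling threshold of 6 comes from the caption.

```python
import math

# Toy results: gene -> (logFC, FDR). All values are invented for illustration.
limma_res = {
    "GENE_A": (7.2, 1e-10),
    "GENE_B": (-6.5, 1e-8),
    "GENE_C": (2.1, 1e-4),
    "GENE_D": (0.3, 0.6),
    "GENE_E": (-3.0, 1e-5),
}
edger_res = {
    "GENE_A": (7.9, 1e-12),
    "GENE_B": (-6.1, 1e-9),
    "GENE_C": (2.4, 1e-3),
    "GENE_D": (1.5, 0.004),
    "GENE_E": (-3.3, 1e-6),
}


def de_genes(results, lfc_cut=1.0, fdr_cut=0.01):
    """Genes passing both cutoffs (the cutoffs are assumed, not from the paper)."""
    return {
        gene
        for gene, (lfc, fdr) in results.items()
        if abs(lfc) > lfc_cut and fdr < fdr_cut
    }


def genes_to_label(results, label_cut=6.0):
    """Genes with |logFC| above the labelling threshold used in panels A-B."""
    return {gene for gene, (lfc, _) in results.items() if abs(lfc) > label_cut}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


limma_de = de_genes(limma_res)
edger_de = de_genes(edger_res)

# Panel D analogue: overlap between the two DE gene lists.
shared = sorted(limma_de & edger_de)
only_limma = sorted(limma_de - edger_de)
only_edger = sorted(edger_de - limma_de)

# Panel C analogue: correlation of logFC over the genes both pipelines call DE.
r = pearson(
    [limma_res[g][0] for g in shared],
    [edger_res[g][0] for g in shared],
)

print("shared:", shared)
print("limma only:", only_limma, "| edgeR only:", only_edger)
print("labelled (limma):", sorted(genes_to_label(limma_res)))
print("logFC correlation on shared genes: %.3f" % r)
```

With real data you would export the result tables from each pipeline and load them in place of the two toy dictionaries. In this toy example the two gene lists are nearly identical and the logFC values agree closely, so the only thing to inspect is the gene called by one pipeline alone.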
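CNS (Cell, Nature, Science) papers these days draw on TCGA analyses to a greater or lesser degree. Work through this R package ten times and write up 200 notes of your own, and you have a study plan for the coming year.
The package covers every aspect of TCGA: https://bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html
Every case study in it is worth spending a few hours on!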