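I've read many Q&As about ordering the y-axis of a heatmap in ggplot2, so I feel bad posting yet another heatmap question, but I can't seem to get what I want. (This is probably because I'm new to R and only just getting to grips with its terminology and how it works.) Thanks in advance for your help!
I'm trying to produce a heatmap for a gene enrichment analysis. My data is imported from a .csv file with the columns Gene, Category, Description, Variable1, Variable2, Variable3. Each row lists a gene, the category the gene belongs to (each category contains several genes), a description of that category, and the numeric values for the samples (3 columns, one value per sample).
What I want is to order the y-axis by category while still plotting the values per gene. (Labelling the categories would be great too!) Below is the code I have so far; it seems to sort the y-axis alphabetically.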
library(ggplot2)
library(reshape2)
# Read the table and reshape it to long format: one row per gene/sample pair
GO_sum <- read.csv("~/R/FuncEnr/GO_sum.csv", header = TRUE)
GO_sum.m <- melt(GO_sum, id.vars = c("Gene", "Category", "Description"), na.rm = FALSE)
# One tile per sample (x) and gene (y), coloured by value
(GOplot <- ggplot(GO_sum.m, aes(variable, Gene)) +
  geom_tile(aes(fill = value), colour = "white") +
  scale_fill_gradient2(low = "darkred", high = "darkblue", guide = "colorbar"))
Thanks!
Here is some sample data (copy, paste, and save as .csv):
Gene,Category,Description,s1,s2,s3
G0001,GO:0000036,acyl carrier activity,-1.357472549,-1.357472549,-0.703587499
G0002,GO:0000103,sulfate assimilation,0,-0.761925294,-1.772268589
G0003,GO:0000104,succinate dehydrogenase activity,-1.192800096,-1.192800096,-1.192800096
G0014,GO:0000160,two-component signal transduction system (phosphorelay),0,-1.772268589,-1.192800096
G0005,GO:0000287,magnesium ion binding,-1.772268589,-1.772268589,-1.192800096
G0006,GO:0000287,magnesium ion binding,-1.192800096,-1.192800096,-1.164082367
G0007,GO:0000287,magnesium ion binding,-1.132072566,-1.772268589,-1.772268589
G0008,GO:0000287,magnesium ion binding,-1.452170577,0,-1.192800096
G0009,GO:0000287,magnesium ion binding,0,-1.772268589,-1.192800096
G0083,GO:0003676,nucleic acid binding,-1.192800096,-1.192800096,-1.772268589
G0044,GO:0003676,nucleic acid binding,-0.587905946,-0.363837338,-0.843984355
G0045,GO:0003676,nucleic acid binding,0.212339083,0.212339083,0.276358685
G0046,GO:0003676,nucleic acid binding,-0.374137972,-0.761925294,-0.761925294
G0147,GO:0003677,DNA binding,0,0,0
G0048,GO:0003677,DNA binding,-1.192800096,0,-1.192800096
G0049,GO:0003677,DNA binding,0.530699113,-0.340270054,-0.485584696
G0050,GO:0003677,DNA binding,-1.192800096,-0.374137972,-0.374137972
Posted on 2013-06-10 15:49:09
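I would suggest using facet_grid() to split your plot by the Description column, which holds the gene category descriptions. With the arguments scales="free_y" and space="free_y", the tiles have the same size in every facet. The one catch is that you should use shorter names for Description, because the long ones won't fit in the facet strips.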
ggplot(GO_sum.m, aes(variable, Gene)) +
  geom_tile(aes(fill = value), colour = "white") +
  scale_fill_gradient2(low = "darkred", high = "darkblue", guide = "colorbar") +
  facet_grid(Description ~ ., scales = "free_y", space = "free_y")
https://stackoverflow.com/questions/16994260
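If facets are not wanted, another route (not part of the answer above) is to fix the order of the Gene factor levels so that genes from the same category sit next to each other on the y-axis. The following is only a minimal sketch: it assumes the melted data has the Gene, Category, variable and value columns from the question, uses four rows from the sample data with values rounded to three decimals, and `gene_order` is just an illustrative name.

```r
library(ggplot2)
library(reshape2)

# Four rows from the question's sample data (values rounded for brevity)
GO_sum <- data.frame(
  Gene        = c("G0014", "G0005", "G0083", "G0044"),
  Category    = c("GO:0000160", "GO:0000287", "GO:0003676", "GO:0003676"),
  Description = c("two-component signal transduction system (phosphorelay)",
                  "magnesium ion binding",
                  "nucleic acid binding",
                  "nucleic acid binding"),
  s1 = c(0, -1.772, -1.193, -0.588),
  s2 = c(-1.772, -1.772, -1.193, -0.364),
  s3 = c(-1.193, -1.193, -1.772, -0.844),
  stringsAsFactors = FALSE
)

# Long format: one row per gene/sample pair
GO_sum.m <- melt(GO_sum, id.vars = c("Gene", "Category", "Description"))

# Sort genes by category first, then by gene name, and store that
# order as the factor levels; ggplot2 draws a discrete axis in level order
gene_order <- unique(GO_sum.m$Gene[order(GO_sum.m$Category, GO_sum.m$Gene)])
GO_sum.m$Gene <- factor(GO_sum.m$Gene, levels = gene_order)

p <- ggplot(GO_sum.m, aes(variable, Gene)) +
  geom_tile(aes(fill = value), colour = "white") +
  scale_fill_gradient2(low = "darkred", high = "darkblue", guide = "colorbar")
print(p)
```

The first level is drawn at the bottom of the y-axis, so wrapping `gene_order` in `rev()` would put the first category at the top instead. Unlike facet_grid(), this approach does not label the categories.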