R Plotting Notes | Drawing Basic Scatter Plots

DoubleHelix

Published 2020-11-03 11:31:26

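This article is included in the column: 生物信息云

Suggested reading first: R Plotting Notes | The R Graphics System and Common Plotting Functions and Parameters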

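1. Drawing scatter plots with plot()

The basic form of the plot() function in R is:

plot(x,y,...)

Here x and y are the horizontal and vertical coordinates of the points to draw, and ... stands for additional arguments. The full usage of the default method, plot.default(), is: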

plot(x, y = NULL, type = "p", xlim = NULL, ylim = NULL, log = "", main = NULL, sub = NULL, xlab = NULL, ylab = NULL, ann = par("ann"), axes = TRUE, frame.plot = axes, panel.first = NULL, panel.last = NULL, asp = NA, ...)

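The main arguments are:

(1) type: a one-character string giving the type of plot. Possible values:

"p": points (the default);
"l": lines;
"b": both points and lines;
"c": only the line segments of "b", with the points left out;
"o": both points and lines, with the lines drawn through the points;
"h": vertical lines from each point to the x-axis;
"s": a step plot (horizontal first, then vertical);
"S": a step plot (vertical first, then horizontal);
"n": an empty plot.

(2) main: a string giving the plot title.

(3) sub: a string giving the subtitle.

(4) xlab and ylab: strings giving the labels of the x-axis and y-axis.

(5) xlim and ylim: numeric vectors of length 2 giving the ranges of the x-axis and y-axis.

(6) pch: the plotting symbol used for points, given as an integer code or a single character; for example, pch=16 is a filled circle. It is not listed in the usage above and is passed through the ... arguments.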
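
To compare the type values side by side, the sketch below draws the same small series once per type in a grid of panels. The data, the panel layout, and the axis limits are made up for illustration, and pdf(NULL) is there only so the code runs without opening a window.

```r
# Illustrative data: ten points on a noisy upward trend (values are made up)
set.seed(1)
x <- 1:10
y <- x + rnorm(10)

types <- c("p", "l", "b", "c", "o", "h", "s", "S")

pdf(NULL)  # draw to a null device; remove this line (and dev.off()) to see the plots
old_par <- par(mfrow = c(2, 4), mar = c(2, 2, 2, 1))
for (tp in types) {
  # Same data every time; only type changes. pch = 16 draws filled circles.
  plot(x, y, type = tp, pch = 16,
       main = paste0('type = "', tp, '"'),
       xlim = c(0, 11), ylim = range(y) + c(-1, 1))
}
par(old_par)  # restore the previous layout and margins
dev.off()
```

Setting xlim and ylim explicitly keeps all eight panels on the same scale, which makes the differences between the types easier to see.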

Drawing a first scatter plot

#### First plot
x <- runif(50,0,2)
y <- runif(50,0,2)
plot(x, y, main="My first scatter plot", sub="subtitle",
     xlab="x-axis", ylab="y-axis", pch=16)

Adding text and lines

text(0.6,0.6,"(0.6,0.6)")
abline(h=.6,v=.6, col='red')

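A second scatter plot

#### Second plot
x <- runif(50,0,2)
y <- runif(50,0,2)

plot(x, y, type="n", xlab="", ylab="", axes=FALSE) # empty plot with no axes
points(x,y) # add the points
axis(1) # add the x-axis
axis(at=seq(0,2,0.5), side=2) # add the y-axis with ticks every 0.5
box() # draw the box around the plot
title(main="Scatter plot", sub="subtitle", xlab="x-axis", ylab="y-axis")
abline(h=0.6,v=0.6,col="red")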

2. Plotting with ggplot2

library(ggplot2) # load ggplot2 for the plots below
data(trees) # load the dataset
head(trees) # preview the dataset

Plotting

ggplot(trees, aes(x=Girth,y=Height)) +
  geom_point()

ggplot(trees, aes(x=Girth,y=Height)) +
  geom_point(alpha=0.5)

ggplot(trees, aes(x=Girth,y=Height)) +
  stat_bin2d()

ggplot(trees, aes(x=Girth,y=Height)) +
  stat_bin2d(bins=50) +
  scale_fill_gradient(low="lightblue", high="red" ,limits=c(0,5))
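
The trees dataset has only 31 rows, so with many bins most tiles contain a single tree. As a rough way to check what the 2D binning computed, the sketch below pulls the tile counts out of the built plot with ggplot_build(). The choice of bins = 10 and the variable names are illustrative assumptions, not part of the original examples.

```r
library(ggplot2)

p <- ggplot(trees, aes(x = Girth, y = Height)) +
  stat_bin_2d(bins = 10)  # bins = 10 is an arbitrary choice for illustration

# ggplot_build() returns the data each layer actually draws;
# for this layer that is one row per non-empty tile, with its count
tiles <- ggplot_build(p)$data[[1]]

n_tiles   <- nrow(tiles)       # number of non-empty tiles
total     <- sum(tiles$count)  # each tree falls into exactly one tile
max_count <- max(tiles$count)  # the most trees found in any one tile
```

If max_count is small, a fill scale with narrow limits, like the limits=c(0,5) used above, is enough to cover the data.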

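More advanced plotting

ggplot(data = trees, aes(Girth,Volume)) +
  geom_point(fill="black",colour="black",size=3,shape=21) +
  # loess smoother with its confidence band
  geom_smooth(method = 'loess',span=0.4,se=TRUE,colour="#00A5FF",fill="#00A5FF",alpha=0.2)+
  scale_y_continuous(breaks = seq(0, 125, 25))+
  theme(
    text=element_text(size=15,color="black"),
    # family="myfont" assumes a font has already been registered under that name
    plot.title=element_text(size=15,family="myfont",hjust=.5,color="black"),
    legend.position="none"
  )

geom_smooth() adds a smoothed fit with a confidence band. With method = 'loess' it fits a local regression, and span sets how much of the data each local fit uses, so smaller values follow the points more closely. Its built-in methods cover most routine experimental data.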
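
To get a feel for what span does, the same kind of local regression can be fitted directly with stats::loess(), which is what geom_smooth() uses for method = 'loess'. This is a minimal sketch: the second span value (0.9), the 50-point grid, and the variable names are choices made here for comparison.

```r
# Fit local regressions with two different spans
fit_wiggly <- loess(Volume ~ Girth, data = trees, span = 0.4)  # span used in the plot above
fit_smooth <- loess(Volume ~ Girth, data = trees, span = 0.9)  # larger span, chosen for comparison

# Predict on an evenly spaced grid of girths within the observed range
grid <- data.frame(Girth = seq(min(trees$Girth), max(trees$Girth), length.out = 50))
grid$wiggly <- predict(fit_wiggly, newdata = grid)
grid$smooth <- predict(fit_smooth, newdata = grid)

head(grid)
```

Plotting grid$wiggly and grid$smooth against grid$Girth with lines() shows how the smaller span bends more to follow individual points.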

Plotting data with a fitted linear model

fit <- lm(Volume ~ Girth, data = trees) # linear fit
trees$predicted <- predict(fit)   # store the predicted values
trees$residuals <- residuals(fit) # store the residuals (positive and negative)
trees$Abs_Residuals<-abs(trees$residuals)  # store the absolute residuals

ggplot(trees, aes(x = Girth, y = Volume)) +
  # bubble plot of the actual values; fill and size are both mapped to the absolute residual
  geom_point(aes(fill =Abs_Residuals, size = Abs_Residuals),shape=21,colour="black") +
  scale_fill_continuous(low = "black", high = "blue") + # fill gradient from black to blue
  geom_smooth(method = "lm", se = FALSE, color = "lightgrey") + # light grey linear fit
  geom_point(aes(y = predicted), shape = 1) + # predicted values as open circles
  geom_segment(aes(xend = Girth, yend = predicted), alpha = .2) + # segments joining actual and predicted values
  guides(fill = guide_legend(title="Residual"),
         size = guide_legend(title="Residual"))+
  ylim(c(0,80))+
  xlab("Girth")+
  ylab("Volume")+
  theme(text=element_text(size=15,face="plain",color="black"),
        axis.title=element_text(size=10,face="plain",color="black"),
        axis.text = element_text(size=10,face="plain",color="black"),
        legend.position = "right",
        legend.title  = element_text(size=13,face="plain",color="black"),
        legend.text = element_text(size=10,face="plain",color="black"),
        legend.background = element_rect(fill=alpha("white",0)))
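
The quantities mapped in this plot can be checked numerically. The sketch below refits the same model and confirms that each residual is the actual value minus the predicted value, then lists the three trees with the largest absolute residuals, which are the ones drawn as the largest bubbles. The helper names (actual, resid, worst) are introduced here for illustration.

```r
fit <- lm(Volume ~ Girth, data = trees)

actual    <- trees$Volume
predicted <- unname(predict(fit))
resid     <- unname(residuals(fit))

# A residual is the actual value minus the fitted value
all.equal(actual - predicted, resid)

# Rows with the largest |residual| correspond to the largest bubbles
worst <- order(abs(resid), decreasing = TRUE)[1:3]
data.frame(Girth     = trees$Girth[worst],
           Volume    = actual[worst],
           predicted = round(predicted[worst], 1),
           residual  = round(resid[worst], 1))
```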

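3. Other scatter plot functions

Besides the packages and functions above, other packages can draw more elaborate scatter plots, for example scatterplot() in the car package and xyplot() in the lattice package. scatterplot() in car extends the basic scatter plot: it can add fitted curves, marginal boxplots, and confidence ellipses, plot by subgroup, and identify points interactively.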

# Form 1 (formula method):
scatterplot(formula, data, subset, xlab, ylab, id=FALSE,
    legend=TRUE, ...)
# Form 2 (default method for x and y vectors):
scatterplot(x, y, boxplots=if (by.groups) "" else "xy",
            regLine=TRUE, legend=TRUE, id=FALSE, ellipse=FALSE, grid=TRUE,
            smooth=TRUE,
            groups, by.groups=!missing(groups),
            xlab=deparse(substitute(x)), ylab=deparse(substitute(y)),
            log="", jitter=list(), cex=par("cex"),
            col=carPalette()[-1], pch=1:n.groups,
            reset.par=TRUE, ...)

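Key arguments:

formula # model formula such as y~x; for plotting by group, y~x|z, where z is the grouping variable;
data # data frame that supplies the variables in the formula;
subset # expression selecting a subset of the data;
x, y # numeric vectors giving the horizontal (x-axis) and vertical (y-axis) coordinates;
boxplots # "x" draws a marginal boxplot below the horizontal x-axis; "y" draws one to the left of the vertical y-axis;
# "xy" draws marginal boxplots on both axes; "" or FALSE draws none;
regLine # adds a fitted regression line by default; FALSE omits it;
# the line is fit with lm() by default: regLine=list(method=lm, lty=1, lwd=2, col=col)
legend # logical; when plotting by group, TRUE shows a legend and FALSE omits it;
grid # logical; TRUE draws a light grey background grid;
groups # grouping variable or factor; groups are drawn with different colors, plotting symbols, and so on;
by.groups # if TRUE, regression lines are fit separately for each group;
xlab, ylab # labels for the x-axis and y-axis;
log # which axes to draw on a log scale: "x", "y", or "xy";
jitter # list with elements x, y, or both, giving the jitter factors for the horizontal and vertical coordinates of the points;
cex # size of the plotting characters, 1 by default;
# related arguments include cex.axis, cex.lab, cex.main, and cex.sub;
col # the plotting color when ungrouped; when grouped, a vector of colors whose length equals the number of groups;
pch # plotting symbols for the points; when grouped, symbols 1, 2, ... are used in order by default;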

library(car)
scatterplot(Volume ~ Girth, data = trees, 
            xlab="Girth", ylab="Volume",smooth=FALSE)
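
The call above uses the formula form. As a complement, here is a minimal sketch of the second form, which takes x and y vectors directly, with a few of the arguments from the list above set explicitly. The particular settings (boxplot on the x-axis only, a dashed red regression line, no grid) are arbitrary choices for illustration, and the car package is assumed to be installed.

```r
library(car)

# Default (x, y) form: pass numeric vectors instead of a formula
scatterplot(trees$Girth, trees$Volume,
            boxplots = "x",   # marginal boxplot for the x-axis only
            regLine = list(method = lm, lty = 2, lwd = 2, col = "red"),  # dashed red lm() line
            smooth = FALSE,   # no smoothed curve
            grid = FALSE,     # no background grid
            xlab = "Girth", ylab = "Volume")
```

Because x and y are passed as expressions such as trees$Girth, setting xlab and ylab avoids axis labels that show the literal expression text.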

The examples below are taken from the help file.

scatterplot(prestige ~ income, data=Prestige, ellipse=TRUE)
scatterplot(prestige ~ income, data=Prestige, smooth=list(smoother=quantregLine))
# use quantile regression for median and quartile fits
scatterplot(prestige ~ income | type, data=Prestige,
            smooth=list(smoother=quantregLine, var=TRUE, span=1, lwd=4, lwd.var=2))
scatterplot(prestige ~ income | type, data=Prestige, legend=list(coords="topleft"))
scatterplot(vocabulary ~ education, jitter=list(x=1, y=1),
            data=Vocab, smooth=FALSE, lwd=3)
scatterplot(infantMortality ~ ppgdp, log="xy", data=UN, id=list(n=5))
scatterplot(income ~ type, data=Prestige)

The ggscatter() function in the ggpubr package can also draw scatter plots.

library(ggpubr)
data(mtcars)
dat <- mtcars
dat$cyl <- as.factor(dat$cyl)
head(dat[, c("wt", "mpg", "cyl")], 3)
ggscatter(dat, x = "wt", y = "mpg",
          color = "cyl", shape = "cyl",
          palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          ellipse = TRUE, mean.point = TRUE,
          star.plot = TRUE)

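## Selected arguments
data, x, y # data is a data frame; x and y name the variables in it to plot
combine # logical, FALSE by default; used only when y is a vector of several variables; if TRUE, creates a multi-panel plot
merge # logical or character, FALSE by default; used only when y is a vector of several variables; if TRUE, merges the y variables in one plotting area;
# character values are "asis" or "flip"; with "flip", the y variables become the x tick labels and the x variable becomes the grouping variable
color, fill # color and fill color of the points
palette # color palette used for coloring or filling by group; can be the grey palette "grey" or a custom palette such as c("blue","red"),
# or a ggsci palette: "npg","aaas","lancet","jco","ucscgb","uchicago","simpsons", or "rickandmorty"
shape # point shape
size  # numeric; size of points and outlines
point  # logical; if TRUE, shows the points
rug # logical; if TRUE, adds a marginal rug
title # plot title
xlab, ylab  # labels for the x-axis and y-axis; xlab = FALSE hides the label, and likewise for ylab
facet.by  # character vector of length 1 or 2 naming the grouping variables for faceting; they must be columns of the data frame
panel.labs  # list of character vectors that rename the panel labels; usage:
# one grouping variable: panel.labs = list(sex = c("Male", "Female"))
# two grouping variables: panel.labs = list(sex = c("Male", "Female"), rx = c("Obs","Lev","Lev2"))
short.panel.labs # logical, TRUE by default; builds short panel labels by omitting the variable name
add  # adds a fitted line to the plot; for ggscatter() the allowed values are "none", "reg.line", and "loess"
add.params  # parameters for the element added by add, such as color, size, and linetype;
# usage: add.params = list(color = "red")
conf.int # logical; if TRUE, adds a confidence interval around the fitted line
conf.int.level # confidence level of the interval, 0.95 by default
fullrange # used only when add != "none"; whether the fit spans the full range of the plot or only the data
ellipse # logical; if TRUE, draws ellipses around the points
ellipse.level # size of the ellipse around the points, 0.95 by default
ellipse.type # character giving the frame type; allowed values include "convex", "confidence", "t", "norm", and "euclid"
ellipse.alpha # transparency of the ellipse fill color; use 0 for no fill
ellipse.border.remove # logical; if TRUE, removes the ellipse border line
mean.point # logical; if TRUE, adds the group mean points to the plot
mean.point.size # numeric size of the mean points
star.plot # logical; if TRUE, draws a star plot
star.plot.lty, star.plot.lwd # line type and line width of the star plot
label # name of the column holding point labels, or a vector of length nrow(data)
font.label # list that can contain size, face, and color; usage:
# font.label = list(size = 14, face = "bold", color ="red")
font.family # font family of the labels
label.select # character vector naming the labels to show
repel # logical; whether to use ggrepel to keep text labels from overlapping
label.rectangle # logical; if TRUE, adds a rectangle underneath the text to make it easier to read
parse # if TRUE, labels are parsed into expressions
cor.coef # logical; if TRUE, adds the correlation coefficient with its p-value to the plot
cor.coeff.args # list of arguments passed to stat_cor() to customize how the correlation coefficient is shown; usage:
# cor.coeff.args = list(method = "pearson", label.x.npc = "right", label.y.npc = "top")
cor.method # method for computing the correlation coefficient: "pearson", "kendall", or "spearman"
cor.coef.coord # numeric vector of length 2 giving the x and y coordinates of the correlation coefficient; NULL by default
cor.coef.size # font size of the correlation coefficient text
ggp # if not NULL, the points are added to this existing plot
show.legend.text # logical; whether text is included in the legend
ggtheme # ggplot2 theme function, theme_pubr() by default;
# allowed values include theme_gray(), theme_bw(), theme_minimal(), theme_classic(), theme_void(), ...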
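
As a small illustration of the add, conf.int, and cor.coef arguments, the sketch below draws mpg against wt with a regression line, its confidence band, and the Pearson correlation printed on the plot, then computes the same correlation with cor.test() for comparison. The axis label text is written here for illustration, and the ggpubr package is assumed to be installed.

```r
library(ggpubr)

# Scatter plot with a fitted line, its confidence band,
# and the Pearson correlation coefficient shown on the plot
p <- ggscatter(mtcars, x = "wt", y = "mpg",
               add = "reg.line", conf.int = TRUE,
               cor.coef = TRUE, cor.method = "pearson",
               xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
print(p)

# The same correlation computed directly, to compare with the label on the plot
ct <- cor.test(mtcars$wt, mtcars$mpg, method = "pearson")
ct$estimate
ct$p.value
```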

References:

1. R语言数据可视化之美, by 张杰

2. Help file for scatterplot()

3. Help file for ggscatter()

Originally published: 2020-10-28