Plotting with ggstatsplot | Statistics + visualization, a handy tool for academic research

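ggstatsplot is an extension of the ggplot2 package. It produces a polished figure together with the results of the statistical analysis in a single call, which makes it very useful for anyone who regularly does statistical analysis or bioinformatics work.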

Data preparation

The gapminder dataset contains life expectancy, GDP per capita, and population for 142 countries from 1952 to 2007, at 5-year intervals.

# Load the plotting package
library(ggstatsplot)
# Load the gapminder dataset
library(gapminder)
head(gapminder)
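As a quick check of the dataset description, the sketch below counts the countries and survey years in gapminder and tabulates how many countries each continent contributes in 2007. It uses only base R on the gapminder data frame; the variable names (n_countries, years, d2007, continent_counts) are introduced here for illustration.

```r
library(gapminder)

# Number of distinct countries and the survey years covered
n_countries <- length(unique(gapminder$country))
years <- sort(unique(gapminder$year))
n_countries
years

# Countries per continent in 2007 (d2007 is a name chosen for this sketch)
d2007 <- subset(gapminder, year == 2007)
continent_counts <- table(d2007$continent)
continent_counts
```

Oceania contributes only two countries (Australia and New Zealand), which is too few for a meaningful group comparison.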

The ggstatsplot package provides many plotting functions (listed at the end of this article); this article only demonstrates how to use ggbetweenstats.

Plotting with ggbetweenstats

1 Basic plot

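Show the distribution of life expectancy in each continent in 2007, and test whether mean life expectancy differs between continents. Is the difference significant?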

# Set a seed for reproducibility
set.seed(123)
# Oceania has too few data points, so drop it before the analysis
ggstatsplot::ggbetweenstats(
  data = dplyr::filter(
    .data = gapminder::gapminder,
    year == 2007, continent != "Oceania"
  ),
  x = continent,
  y = lifeExp,
  nboot = 10,
  messages = FALSE
)

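The figure shows, for each continent in 2007, the distribution of life expectancy as a box plot, a dot plot, and a violin plot, together with the group mean and sample size. The line at the top of the figure reports summary statistics for the overall model.

The meaning of each of these statistics is explained in a figure on the official website (figure not reproduced here).

Note: the function automatically chooses between an independent-samples t-test (2 groups) and a one-way ANOVA (3 or more groups), depending on the number of levels in the grouping variable.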
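The selection rule in the note can be sketched in base R. This is only an illustration under stated assumptions, not ggstatsplot's internal code: choose_between_test is a hypothetical helper, and it relies on the base R defaults of stats::t.test and stats::oneway.test (both use the Welch variant unless var.equal = TRUE). The built-in ToothGrowth and PlantGrowth datasets stand in for two-group and three-group data.

```r
# Hypothetical helper (not part of ggstatsplot): pick the test
# from the number of levels of the grouping variable.
choose_between_test <- function(data, x, y) {
  n_groups <- nlevels(droplevels(factor(data[[x]])))
  if (n_groups < 2) stop("need at least two groups")
  f <- stats::reformulate(x, response = y)  # builds the formula y ~ x
  if (n_groups == 2) {
    list(test = "t-test", result = stats::t.test(f, data = data))
  } else {
    list(test = "one-way ANOVA", result = stats::oneway.test(f, data = data))
  }
}

# Two groups (supp: OJ vs VC) -> t-test
two <- choose_between_test(ToothGrowth, x = "supp", y = "len")
# Three groups (ctrl, trt1, trt2) -> one-way ANOVA
three <- choose_between_test(PlantGrowth, x = "group", y = "weight")

two$test
three$test
```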

2 Adding test statistics

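The plot above reports the p-value of the overall test. Next, run pairwise comparisons between continents and add the test results to the plot.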

set.seed(123)
ggstatsplot::ggbetweenstats(
  data = dplyr::filter(
    .data = gapminder::gapminder,
    year == 2007, continent != "Oceania"
  ),
  x = continent,
  y = lifeExp,
  nboot = 10,
  messages = FALSE,
  effsize.type = "unbiased", # type of effect size (unbiased = omega)
  partial = FALSE, # partial omega or omega?
  pairwise.comparisons = TRUE, # display results from pairwise comparisons
  pairwise.display = "significant", # display only significant pairwise comparisons
  pairwise.annotation = "p.value", # annotate the pairwise comparisons using p-values
  p.adjust.method = "fdr" # adjust p-values for multiple tests using this method
)
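To make the effect of p.adjust.method = "fdr" concrete, the following base R sketch runs pairwise Welch t-tests on the same subset and compares raw and FDR-adjusted p-values. It is an illustration only: ggstatsplot may use a different pairwise procedure internally, so the numbers need not match those on the plot. The variable names d, raw, adjusted, and pw are assumptions of this sketch.

```r
library(gapminder)

# Same subset as in the plot: year 2007, Oceania removed
d <- droplevels(subset(gapminder, year == 2007 & continent != "Oceania"))

# Unadjusted pairwise Welch t-tests (pool.sd = FALSE gives separate variances)
raw <- pairwise.t.test(d$lifeExp, d$continent,
                       p.adjust.method = "none", pool.sd = FALSE)$p.value

# FDR (Benjamini-Hochberg) adjustment applied to the six comparisons
adjusted <- p.adjust(raw[!is.na(raw)], method = "fdr")

# The same adjustment requested directly from pairwise.t.test
pw <- pairwise.t.test(d$lifeExp, d$continent,
                      p.adjust.method = "fdr", pool.sd = FALSE)
pw$p.value
```

With four continents there are six pairwise comparisons, and each adjusted p-value is at least as large as its raw counterpart.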

3 Polishing the plot

# Add a title and caption, x- and y-axis labels, and outlier tagging, and change the theme and color palette.

set.seed(123)
gapminder %>% # dataframe to use
  ggstatsplot::ggbetweenstats(
    data = dplyr::filter(.data = ., year == 2007, continent != "Oceania"),
    x = continent, # grouping/independent variable
    y = lifeExp, # dependent variables
    xlab = "Continent", # label for the x-axis
    ylab = "Life expectancy", # label for the y-axis
    plot.type = "boxviolin", # type of plot: "box", "violin", or "boxviolin"
    type = "parametric", # type of statistical test: p (parametric), np (nonparametric), r (robust), bf (Bayes Factor)
    effsize.type = "biased", # type of effect size
    nboot = 10, # number of bootstrap samples used
    bf.message = TRUE, # display bayes factor in favor of null hypothesis
    outlier.tagging = TRUE, # whether outliers should be flagged
    outlier.coef = 1.5, # coefficient for Tukey's rule
    outlier.label = country, # label to attach to outlier values
    outlier.label.color = "red", # outlier point label color
    mean.plotting = TRUE, # whether the mean is to be displayed
    mean.color = "darkblue", # color for mean
    messages = FALSE, # turn off messages
    ggtheme = ggplot2::theme_gray(), # a different theme
    package = "yarrr", # package from which color palette is to be taken
    palette = "info2", # choosing a different color palette
    title = "Comparison of life expectancy across continents (Year: 2007)",
    caption = "Source: Gapminder Foundation"
  ) + # modifying the plot further
  ggplot2::scale_y_continuous(
    limits = c(35, 85),
    breaks = seq(from = 35, to = 85, by = 5)
  )
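The outlier.coef = 1.5 setting refers to Tukey's rule: a point is flagged when it lies more than coef times the interquartile range below the first quartile or above the third quartile of its group. A minimal base R sketch of that rule follows; is_tukey_outlier is a hypothetical helper, and it uses R's default quantile definition, which may differ in detail from what ggstatsplot computes.

```r
library(gapminder)

# Hypothetical helper: TRUE where a value falls outside
# [Q1 - coef * IQR, Q3 + coef * IQR]
is_tukey_outlier <- function(x, coef = 1.5) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
  iqr <- q[[2]] - q[[1]]
  x < q[[1]] - coef * iqr | x > q[[2]] + coef * iqr
}

d <- droplevels(subset(gapminder, year == 2007 & continent != "Oceania"))

# Apply the rule within each continent; ave() returns 0/1, so convert back
flag <- as.logical(ave(d$lifeExp, d$continent, FUN = is_tukey_outlier))

# Countries that would be labeled as outliers under this rule
d[flag, c("country", "continent", "lifeExp")]
```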

Other plotting functions

Function         Plot                     Description
ggbetweenstats   violin plots             for comparisons between groups/conditions
ggwithinstats    violin plots             for comparisons within groups/conditions
gghistostats     histograms               for distribution about numeric variable
ggdotplotstats   dot plots/charts         for distribution about labeled numeric variable
ggpiestats       pie charts               for categorical data
ggbarstats       bar charts               for categorical data
ggscatterstats   scatterplots             for correlations between two variables
ggcorrmat        correlation matrices     for correlations between multiple variables
ggcoefstats      dot-and-whisker plots    for regression models

For more, see the official documentation:

https://indrajeetpatil.github.io/ggstatsplot/index.html
