前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >「R」forestmodel给多水平变量添加整体p值

「R」forestmodel给多水平变量添加整体p值

作者头像
王诗翔呀
发布2022-01-21 08:55:52
7310
发布2022-01-21 08:55:52
举报
文章被收录于专栏:优雅R优雅R

前段时间收到来信:

代码语言:javascript
复制
Hi Shixiang 

I am writing to you about the forestmodel package in R. 

Thank you so much for the wonderful package that you created. 

I was wondering if there is a way to display the wald test p-value which is important for variables that have more than two levels. I tried to work around the code but did not find a way out. 

Best 
Aniket 

我不是作者,搞错了人,问我干啥呢~自个提问嘛

代码语言:javascript
复制
Hi Aniket,

I am not the author of forestmodel, you can see from https://github.com/NikNakk/forestmodel/ or CRAN package info. As for your question, I cannot understand it based on the current info you provided. Could you post an issue on https://github.com/NikNakk/forestmodel/issues?

Best,

很明白的回复吧。这位一点也不尴尬,继续问我。。。还仔细说问题了。

代码语言:javascript
复制
Hi Shixiang 

I posted the issue on github. There is another user who has requested same help. 

It’s basically about displaying a global p-value (from ward test or likelihood ratio test) when there are multiple categories (more than two, for e.g race- Asian, black, Caucasian, Etc) in a variable, in addition to individual p-value against the reference. 

Best 
Aniket

好吧,这下让我有点兴趣了。我仔细看了下issue(https://github.com/NikNakk/forestmodel/issues/31),发现提问人是想要把多水平变量的p值展示在森林图上。

我觉得挺没有必要的,而且有点奇怪,为什么呢?我展示了一个示例:

代码语言:javascript
复制
> library("survival")
> library("dplyr")
> pretty_lung <- lung %>%
+   transmute(time,
+             status,
+             Age = age,
+             Sex = factor(sex, labels = c("Male", "Female")),
+             ECOG = factor(lung$ph.ecog),
+             `Meal Cal` = meal.cal
+   )
> table(lung$ph.ecog)

  0   1   2   3 
 63 113  50   1 
> coxph(Surv(time, status) ~ ., pretty_lung)
Call:
coxph(formula = Surv(time, status) ~ ., data = pretty_lung)

                 coef  exp(coef)   se(coef)      z       p
Age         7.848e-03  1.008e+00  1.080e-02  0.727 0.46727
SexFemale  -5.050e-01  6.035e-01  1.913e-01 -2.640 0.00829
ECOG1       2.756e-01  1.317e+00  2.221e-01  1.241 0.21462
ECOG2       7.852e-01  2.193e+00  2.543e-01  3.087 0.00202
ECOG3       1.792e+00  6.004e+00  1.036e+00  1.731 0.08348
`Meal Cal` -6.031e-05  9.999e-01  2.300e-04 -0.262 0.79313

Likelihood ratio test=21.56  on 6 df, p=0.001453
n= 180, number of events= 133 
   (48 observations deleted due to missingness)

这种多变量的cox回归中,p值展示的是整个模型的结果,而ECOG这个因子变量本身建模时被拆分成了3个变量,是没法得到一个p值的。那为啥搞这个呢?

继续的交流了解到他们就是想要进行批量的单变量分析,想要展示整个变量的p值,还给我用图形举例说明了。

image-20210831195841536

image-20210831195908154

了解了问题,再加上我很久之前修改过这个包的源代码,提交过PR,所以大体感觉这个问题不难。想办法把上图中右侧的reference在需要时右侧添加文字就好了。通过添加一个新的参数来控制这种行为。

安装包:

代码语言:javascript
复制
remotes::install_github("ShixiangWang/forestmodel")

看看效果:

代码语言:javascript
复制
library("survival")
library("dplyr")
library("forestmodel")

pretty_lung <- lung %>%
  transmute(time,
            status,
            Age = age,
            Sex = factor(sex, labels = c("Male", "Female")),
            ECOG = factor(lung$ph.ecog),
            `Meal Cal` = meal.cal
  )

print(forest_model(coxph(Surv(time, status) ~ ., pretty_lung), show_global_p = "bottom"))
print(forest_model(coxph(Surv(time, status) ~ ., pretty_lung), show_global_p = "aside"))

image-20210831202115822

在实现的过程中发现将global p值加到最下方也是有益的,并不仅限于单因素模型使用。

image-20210831200453864

代码语言:javascript
复制
print(forest_model(coxph(Surv(time, status) ~ ECOG, pretty_lung), show_global_p = "aside"))

image-20210831200526104

由于我之前开发的ezcox包可以直接接收传到forestmodel的其他参数,所以只要无需改动ezcox就完成了对该包该特性的支持。有使用ezcox包的读者可以试试。

由于做了不少维护和开发工作,就这两天forestmodel作者将我加入了作者列表[1]。这算是无心插柳吗?

参考资料

[1]

作者列表: https://github.com/NikNakk/forestmodel/blob/d001c80f5b001ff8c4c82201b1f223be64415b8b/DESCRIPTION#L9

本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2021-09-08,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 优雅R 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 参考资料
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档