Building a clinical baseline table (Table 1, three-line table) in R with the compareGroups package

Original article by 凑齐六个字吧, published 2024-07-05 09:06:11, in the column 科研工具 (Research Tools).

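compareGroups is a widely used R package for building clinical baseline tables.

Its developers position it mainly as a tool for descriptive tables: it reports means, standard deviations, quantiles or frequencies for many variables at once, and applies statistical tests to compute P values between groups.

This post walks through the package, following the material on GitHub and several tutorials found online. Links to the references are at the end.

The developers describe the structure of the package with a diagram (not reproduced here). For users, the workflow comes down to three steps: compute, build, and export.

These three steps map to three key functions:

compareGroups()

createTable()

export2word() (the export function has many variants).

The developers also caution that the package is not meant for data quality control. The data you pass in should contain only the variables you want to analyse (or be cleaned in R beforehand), and you need to know how each variable should be classified, because the analysis relies on categorical variables being stored as factors and on variables being named through the label attribute.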
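As a minimal sketch of that preparation step, a coded variable can be turned into a factor and given a label with base R. The data frame `df` and its values are made up for illustration and do not come from the package.

```r
# Toy data, invented for illustration
df <- data.frame(sex = c(1, 2, 2, 1), age = c(54, 61, 47, 70))

# Store the categorical variable as a factor with readable levels
df$sex <- factor(df$sex, levels = c(1, 2), labels = c("Male", "Female"))

# The label attribute is the name that compareGroups shows in the table
attr(df$sex, "label") <- "Sex"
attr(df$age, "label") <- "Age (years)"

str(df)
```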

1. Install and load the package
# Either installation method works
#install.packages("compareGroups")
#library(devtools); devtools::install_github(repo = "isubirana/compareGroups")

# Load the package
library(compareGroups)
2. Load the data (the example dataset is cardiovascular)
data("regicor", package = "compareGroups") # load the example data
str(regicor)

#'data.frame': 2294 obs. of  25 variables:
# $ id      : num  2.26e+03 1.88e+03 3.00e+09 3.00e+09 3.00e+09 ...
#  ..- attr(*, "label")= Named chr "Individual id"
#  .. ..- attr(*, "names")= chr "id"
# $ year    : Factor w/ 3 levels "1995","2000",..: 3 3 2 2 2 2 2 1 3 1 ...
#  ..- attr(*, "label")= Named chr "Recruitment year"
#  .. ..- attr(*, "names")= chr "year"
# $ age     : int  70 56 37 69 70 40 66 53 43 70 ...
#  ..- attr(*, "label")= Named chr "Age"
#  .. ..- attr(*, "names")= chr "age"
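# Only part of the str() output is shown

Each $ introduces one variable (column), and each variable takes three lines.

The first line shows the variable's type and its first values (num, Factor, int and so on need no explanation here).

The second line is the variable's label attribute, which is the name that will appear in the final table.

The third line is the names attribute of that label, i.e. the variable's column name in the imported data.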
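To see every label at once rather than reading through str(), something like the following should work. It is only a sketch: `get_label` and `var_labels` are names made up here, not part of compareGroups.

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# Hypothetical helper: return a variable's label, or NA if it has none
get_label <- function(x) {
  lab <- attr(x, "label")
  if (is.null(lab)) NA_character_ else as.character(lab)[1]
}

# One label per column, named by the column name
var_labels <- vapply(regicor, get_label, character(1))
head(var_labels)
```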

3. Create time-to-event variables for cardiovascular events (CV) and death
library(survival)
regicor$tcv <- with(regicor, Surv(tocv, as.integer(cv=='Yes')))
attr(regicor$tcv, "label") <- "Cardiovascular"
regicor$tdeath <- with(regicor, Surv(todeath, as.integer(death=='Yes')))
attr(regicor$tdeath, "label") <- "Mortality"

[Figures: the data before and after wrapping the columns with Surv()]
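For readers who have not used Surv() before, here is a small sketch of what the wrapping does. The follow-up times and event statuses are toy values invented for illustration.

```r
library(survival)

# Toy follow-up data, invented for illustration
time   <- c(5, 8, 12, 3)
status <- c("Yes", "No", "Yes", "No")

# Surv() pairs each follow-up time with a 0/1 event indicator
s <- Surv(time, as.integer(status == "Yes"))

print(s)   # censored observations are printed with a trailing "+"
class(s)
```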

4. Overall description
compareGroups( ~ ., data = regicor)
#-------- Summary of results ---------
#   var                                              N    method            selection
#1  Individual id                                    2294 continuous normal ALL      
#2  Recruitment year                                 2294 categorical       ALL      
#3  Age                                              2294 continuous normal ALL      
#4  Sex                                              2294 categorical       ALL      
#5  Smoking status                                   2233 categorical       ALL      
#6  Systolic blood pressure                          2280 continuous normal ALL      
#7  Diastolic blood pressure                         2280 continuous normal ALL      
#8  History of hypertension                          2286 categorical       ALL      
#9  Hypertension treatment                           2251 categorical       ALL      
#10 Total cholesterol                                2193 continuous normal ALL      
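# Partial output shown

About the call: the variable to the left of "~" is the grouping variable. No grouping variable is set here, so the whole sample is described. The variables to the right of "~" are the ones to summarise, and "." stands for all variables.

About the output: var is the variable (shown by its label), N is the number of non-missing observations, method is how the variable is treated (continuous normal, continuous non-normal, or categorical), and selection is the filter applied to that variable (ALL means no filter).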
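The N column can be checked by hand with base R. This sketch counts the non-missing values of smoker, which should match the N shown for "Smoking status" in the output above.

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# Total number of rows in the data set
n_rows <- nrow(regicor)
n_rows

# Non-missing values of smoker, i.e. the N reported for "Smoking status"
n_smoker <- sum(!is.na(regicor$smoker))
n_smoker
```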

5. Group by year (three groups) and drop id
compareGroups(year~ . -id, data = regicor)
#-------- Summary of results by groups of 'Recruitment year'---------
#   var                                              N    p.value  method            selection
#1  Age                                              2294 0.078*   continuous normal ALL      
#2  Sex                                              2294 0.506    categorical       ALL      
#3  Smoking status                                   2233 <0.001** categorical       ALL      
# Partial output shown

chisq.test(regicor$sex, regicor$year)
# Pearson's Chi-squared test

#data:  regicor$sex and regicor$year
#X-squared = 1.364, df = 2, p-value = 0.5056

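In the same way, several variables can be dropped by chaining several "-" terms.

The output now has a p.value column, and the test behind it depends on the variable type. For sex, a categorical variable, it is a chi-squared test of whether the sex distribution differs across recruitment years, which is why chisq.test() above reproduces the same value (0.506). Interpreting these P values takes some statistical background.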
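To make the chi-squared test less of a black box, it helps to look at the contingency table it is computed from. A short sketch with base R functions only:

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# Counts of sex by recruitment year
tab <- table(regicor$sex, regicor$year)
tab

# Column percentages: the sex distribution within each year
round(100 * prop.table(tab, margin = 2), 1)

# The chi-squared test asks whether these distributions differ
p_sex <- chisq.test(tab)$p.value
p_sex
```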

6. Analysing part of the data
# subset restricts every variable in the formula to one subgroup
compareGroups(year ~ age  + smoker + chol, data = regicor, 
    subset = sex == "Female")
#-------- Summary of results by groups of 'year'---------

#  var               N    p.value  method            selection      
#1 Age               1193 0.351    continuous normal sex == "Female"
#2 Smoking status    1162 <0.001** categorical       sex == "Female"
#3 Total cholesterol 1139 0.004**  continuous normal sex == "Female"
#-----
#Signif. codes:  0 '**' 0.05 '*' 0.1 ' ' 1 

# selec applies a filter to individual variables: here only smoker is restricted to rows with chol > 100
compareGroups(year ~ age  + smoker + chol, data = regicor, 
              selec = list(smoker = chol > 100))
#-------- Summary of results by groups of 'Recruitment year'---------

#  var               N    p.value  method            selection 
#1 Age               2294 0.078*   continuous normal ALL       
#2 Smoking status    2143 <0.001** categorical       chol > 100
#3 Total cholesterol 2193 <0.001** continuous normal ALL       
#-----
#Signif. codes:  0 '**' 0.05 '*' 0.1 ' ' 1 

# Try to predict the result first, then run it yourself
compareGroups(year ~ age  + smoker + chol, data = regicor, 
              subset = sex == "Female",
              selec = list(smoker = chol > 100))
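The N values in these outputs can be reproduced by hand, which is a useful way to confirm what subset and selec are doing. This is a sketch of my reading of the two arguments: the first count should equal the 1139 reported for cholesterol under subset, and the second can be compared with the 2143 reported for smoker under selec.

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# subset = sex == "Female": women with a non-missing cholesterol value
n_chol_female <- sum(regicor$sex == "Female" & !is.na(regicor$chol))
n_chol_female

# selec = list(smoker = chol > 100): rows with a smoker value and chol > 100
n_smoker_sel <- sum(!is.na(regicor$smoker) & regicor$chol > 100, na.rm = TRUE)
n_smoker_sel
```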
7. Use method to choose how continuous variables are treated
compareGroups(year ~ age + smoker + chol, data=regicor, 
              method = c(chol=NA), 
              alpha= 0.01)

#-------- Summary of results by groups of 'Recruitment year'---------

#  var               N    p.value  method                selection
#1 Age               2294 0.078*   continuous normal     ALL      
#2 Smoking status    2233 <0.001** categorical           ALL      
#3 Total cholesterol 2193 <0.001** continuous non-normal ALL      
#-----
#Signif. codes:  0 '**' 0.05 '*' 0.1 ' ' 1 

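method: how each variable is treated. 1: numeric, normally distributed. 2: numeric, not normally distributed. 3: categorical.

NA: shapiro.test() decides whether the variable is normally distributed, and compareGroups picks the matching method automatically.

alpha: the significance threshold for the normality test.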
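The same decision can be sketched by hand with base R. This assumes the package runs the normality test on the whole variable, which is my reading of the description above; the rank-based Kruskal-Wallis test is the usual comparison for a non-normal variable across three groups.

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# Normality test on total cholesterol (shapiro.test drops missing values)
sw <- shapiro.test(regicor$chol)
sw$p.value < 0.01   # a P value below alpha means "treat as non-normal"

# Rank-based comparison of cholesterol across recruitment years
kw <- kruskal.test(chol ~ year, data = regicor)
kw$p.value
```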

8. Set the number of distinct values and the number of groups
regicor$age7gr <- as.integer(cut(regicor$age, breaks = c(-Inf, 
    55, 60, 65, 70, 75, 80, Inf), right = TRUE))
compareGroups(year ~ age7gr, data = regicor, method = c(age7gr = NA))

#-------- Summary of results by groups of 'Recruitment year'---------

#  var    N    p.value method                selection
#1 age7gr 2294 0.422   continuous non-normal ALL      
#-----
#Signif. codes:  0 '**' 0.05 '*' 0.1 ' ' 1 

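# min.dis: with method = NA, a non-factor variable is treated as continuous unless it has fewer than min.dis distinct values (5 by default); this argument changes that threshold.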
compareGroups(year ~ age7gr, data = regicor, method = c(age7gr = NA), 
    min.dis = 8) 

#-------- Summary of results by groups of 'Recruitment year'---------

#  var    N    p.value method      selection
#1 age7gr 2294 0.163   categorical ALL      
#-----
#Signif. codes:  0 '**' 0.05 '*' 0.1 ' ' 1 

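# max.ylev: the maximum number of levels allowed in the grouping variable (5 by default, as far as I can tell from the package documentation); year has only 3 levels, so the setting changes nothing here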
compareGroups(year ~ sex + age7gr + chol, data = regicor, max.ylev = 7)
#-------- Summary of results by groups of 'Recruitment year'---------

#  var               N    p.value  method            selection
#1 Sex               2294 0.506    categorical       ALL      
#2 age7gr            2294 0.552    continuous normal ALL      
#3 Total cholesterol 2193 <0.001** continuous normal ALL      
#-----
#Signif. codes:  0 '**' 0.05 '*' 0.1 ' ' 1
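Since min.dis is compared against the number of distinct values, it is worth counting them before choosing a threshold. A small sketch that rebuilds age7gr as a standalone vector:

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# Same grouping as above, kept as a standalone vector
age7gr <- as.integer(cut(regicor$age,
                         breaks = c(-Inf, 55, 60, 65, 70, 75, 80, Inf),
                         right = TRUE))

# Number of distinct values actually present in the data;
# this is the quantity that min.dis is compared against
n_distinct <- length(unique(age7gr))
n_distinct

# Frequency of each group
table(age7gr)
```

Intervals that contain no observations do not count, so the number of distinct values can be smaller than the number of intervals defined in breaks.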
9. When method is set to 2 for a variable, it is reported as "median [25th percentile; 75th percentile]"
resu1 <- compareGroups(year ~ age + chol, data = regicor, 
    method = c(chol = 2))
createTable(resu1) # build the table
#--------Summary descriptives table by 'Recruitment year'---------
#_____________________________________________________________________ 
#                      1995          2000          2005      p.overall 
#                      N=431         N=786        N=1077               
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Age                54.1 (11.7)   54.3 (11.2)   55.3 (10.6)    0.078   
#Total cholesterol 225 [196;254] 222 [193;250] 209 [184;238]  <0.001   
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 

# The quantiles are set with Q1 and Q3; Q1 = 0 gives the minimum and Q3 = 1 gives the maximum
resu2 <- compareGroups(year ~ age + chol, data = regicor, 
                       method = c(chol = 2), 
                       Q1 = 0.025, Q3 = 0.975) #2.5%, 97.5%
createTable(resu2)
#--------Summary descriptives table by 'Recruitment year'---------
#_____________________________________________________________________ 
#                      1995          2000          2005      p.overall 
#                      N=431         N=786        N=1077               
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Age                54.1 (11.7)   54.3 (11.2)   55.3 (10.6)    0.078   
#Total cholesterol 225 [148;311] 222 [150;315] 209 [133;318]  <0.001 
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
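The bracketed values can be cross-checked with base R. This is a sketch for the 1995 group; the results should be close to the table above, though small differences are possible because of rounding or the quantile definition used.

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# Total cholesterol in the 1995 recruitment group
chol_1995 <- regicor$chol[regicor$year == "1995"]

med_1995 <- median(chol_1995, na.rm = TRUE)
med_1995

# Default quartiles, then the wider 2.5% and 97.5% quantiles
quantile(chol_1995, probs = c(0.25, 0.75), na.rm = TRUE)
quantile(chol_1995, probs = c(0.025, 0.975), na.rm = TRUE)
```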
10. The simplify argument drops empty factor levels
# The developers create a new column with an extra, empty "Unknown" level
regicor$smk <- regicor$smoker
levels(regicor$smk) <- c("Never smoker", "Current or former < 1y", "Former >= 1y", "Unknown")
attr(regicor$smk, "label") <- "Smoking 4 cat."
cbind(table(regicor$smk))

# With the default simplify = TRUE, the empty level is removed and a warning is issued:
compareGroups(year ~ age + smk + bmi, data = regicor)
#Warning message:
#In compare.i(X[, i], y = y, selec.i = selec[i], method.i = method[i],  :
##  Some levels of 'smk' are removed since no observation in that/those levels

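# With simplify = FALSE the level is kept and the warning disappears, but no P value is computed for smk (shown as "." below)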
compareGroups(year ~ age + smk + bmi, data = regicor, simplify = FALSE)
#-------- Summary of results by groups of 'Recruitment year'---------

#  var             N    p.value  method            selection
# 1 Age             2294 0.078*   continuous normal ALL      
# 2 Smoking 4 cat.  2233 .        categorical       ALL      
# 3 Body mass index 2259 <0.001** continuous normal ALL      
# -----
# Signif. codes:  0 '**' 0.05 '*' 0.1 ' ' 1 

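# compareGroups objects can be updated with update()
# Not demonstrated in detail: update() adds or removes variables and changes arguments
# The example below uses the predimed data shipped with the package; adapt it to your own data
data("predimed", package = "compareGroups")
res <- compareGroups(group ~ age + sex + smoke + waist + hormo, 
    data = predimed)
res
res <- update(res, . ~ . - sex + bmi + toevent, subset = sex == 
    "Female", method = c(waist = 2, toevent = 2), selec = list(bmi = !is.na(hormo)))
res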
11. summary() prints more detailed results
res <- compareGroups(year ~ age + sex + smoker + chol, 
    method = c(chol = 2), data = regicor)
summary(res[c(1, 2, 4)])

# --- Descriptives of each row-variable by groups of 'Recruitment year' ---

#------------------- 
#row-variable: Age 

#      N    mean     sd       lower    upper    p.overall p.trend  p.1995 vs 2000 p.1995 vs 2005 p.2000 vs 2005
#[ALL] 2294 54.73627 11.04926 54.28388 55.18866                                                                
#1995  431  54.09745 11.7172  52.98813 55.20677 0.077837  0.031665 0.930249       0.143499       0.161195      
#2000  786  54.33715 11.21814 53.55168 55.12262                                                                
#2005  1077 55.28319 10.62606 54.64786 55.91853                                                                

#------------------- 
#row-variable: Sex 

#      Male Female Male%    Female%  p.overall p.trend  p.1995 vs 2000 p.1995 vs 2005 p.2000 vs 2005
#[ALL] 1101 1193   47.99477 52.00523                                                                
#1995  206  225    47.79582 52.20418 0.505601  0.543829 0.793746       0.793746       0.791583      
#2000  390  396    49.61832 50.38168                                                                
#2005  505  572    46.88951 53.11049                                                                

#------------------- 
#row-variable: Total cholesterol 

#      N    med Q1  Q3  lower upper p.overall p.trend p.1995 vs 2000 p.1995 vs 2005 p.2000 vs 2005
#[ALL] 2193 215 189 245 213   218                                                                 
#1995  403  225 196 254 220   230   0         0       0.330934       0              0             
#2000  715  222 193 250 217   227                                                                 
#2005  1075 209 184 238 206   211 
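The lower and upper columns for a normal variable look like a t-based 95% confidence interval for the mean. That is my reading rather than something stated above, but a quick check using only the figures reported for Age in the 1995 group agrees with it:

```r
# Values reported by summary() for Age in the 1995 group
n <- 431
m <- 54.09745
s <- 11.7172

# t-based 95% confidence interval for the mean
half_width <- qt(0.975, df = n - 1) * s / sqrt(n)
ci <- c(lower = m - half_width, upper = m + half_width)
round(ci, 5)
```

The result can be compared with the lower and upper values printed in the 1995 row.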
12. Plotting with plot()
plot(res[c(1)], file = "~/Desktop/", type = "png") # Age
plot(res[c(2)], file = "~/Desktop/", type = "png") # Sex
plot(res[c(3,4)], file = "~/Desktop/", type = "png") # smoker + chol

13. Extracting results such as P values, means, odds ratios and hazard ratios
# The developers provide example data (SNPs) along with this code
library(SNPassoc)
data(SNPs)
tab <- createTable(compareGroups(casco ~ snp10001 + snp10002 + snp10005 + snp10008 + snp10009, SNPs))
pvals <- getResults(tab, "p.overall")
p.adjust(pvals, method = "BH")
# snp10001  snp10002  snp10005  snp10008  snp10009 
#0.7051300 0.7072158 0.7583432 0.7583432 0.7072158

# OR and HR: set show.ratio = TRUE in createTable()
res1 <- compareGroups(tdeath ~ age + sex + bmi + smoker, 
                      data = regicor, 
                      # reference levels can also be set per variable by position, e.g. ref = c(smoker = 1, sex = 2)
                      ref = 1)
createTable(res1, show.ratio = TRUE)
#--------Summary descriptives table by 'Mortality'---------
#______________________________________________________________________________________ 
#                             No event      Event           HR        p.ratio p.overall 
#                              N=1975       N=173                                       
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Age                        54.3 (11.0)  60.5 (10.3) 1.05 [1.04;1.07] <0.001   <0.001   
#Sex:                                                                           0.349   
#    Male                   943 (47.7%)  87 (50.3%)        Ref.        Ref.             
#    Female                 1032 (52.3%) 86 (49.7%)  0.87 [0.64;1.17]  0.349            
#Body mass index            27.4 (4.45)  30.2 (5.01) 1.11 [1.08;1.14] <0.001   <0.001   
#Smoking status:                                                               <0.001   
#    Never smoker           1052 (54.3%) 86 (49.7%)        Ref.        Ref.             
#    Current or former < 1y 488 (25.2%)  69 (39.9%)  1.74 [1.27;2.39]  0.001            
#    Former >= 1y           397 (20.5%)  18 (10.4%)  0.56 [0.34;0.94]  0.027            
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

# Survival curves can be plotted too, although they are rather plain
plot(compareGroups(tdeath ~ sex, data = regicor), bivar = TRUE, 
    file = "~/Desktop/", type = "png")
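To see where a hazard ratio in the table comes from, it can be compared with a univariable Cox model fitted directly with the survival package. This sketch assumes that each HR is computed one variable at a time, which is my understanding of the package; under that assumption the value for age should be close to the 1.05 shown above.

```r
library(survival)
library(compareGroups)
data("regicor", package = "compareGroups")

# Univariable Cox model: time to death with age as the only covariate
fit <- coxph(Surv(todeath, as.integer(death == "Yes")) ~ age, data = regicor)

hr <- unname(exp(coef(fit)))   # hazard ratio per additional year of age
ci <- exp(confint(fit))        # 95% confidence interval on the HR scale
round(c(HR = hr, lower = ci[1, 1], upper = ci[1, 2]), 2)
```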
14. Build the descriptive table with createTable()
res <- compareGroups(year ~ age + sex + smoker + chol, 
                     data = regicor, 
                     selec = list(chol = sex == "Female"))
restab <- createTable(res)

# print() shows the descriptive table
print(restab, which.table = "descr")
#--------Summary descriptives table by 'Recruitment year'---------
#________________________________________________________________________ 
#                              1995        2000        2005     p.overall 
#                              N=431       N=786      N=1077              
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Age                        54.1 (11.7) 54.3 (11.2) 55.3 (10.6)   0.078   
#Sex:                                                             0.506   
#    Male                   206 (47.8%) 390 (49.6%) 505 (46.9%)           
#    Female                 225 (52.2%) 396 (50.4%) 572 (53.1%)           
#Smoking status:                                                 <0.001   
#    Never smoker           234 (56.4%) 414 (54.6%) 553 (52.2%)           
#    Current or former < 1y 109 (26.3%) 267 (35.2%) 217 (20.5%)           
#    Former >= 1y           72 (17.3%)  77 (10.2%)  290 (27.4%)           
#Total cholesterol          226 (42.4)  224 (44.9)  216 (50.3)    0.004   
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 

# Table of available (non-missing) data
print(restab, which.table = "avail")
#---Available data----
#________________________________________________________________________ 
#                  [ALL] 1995 2000 2005      method           select      
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Age               2294  431  786  1077 continuous-normal       ALL       
#Sex               2294  431  786  1077    categorical          ALL       
#Smoking status    2233  415  758  1060    categorical          ALL       
#Total cholesterol 1139  207  362  570  continuous-normal sex == "Female" 
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
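The "avail" table is just a count of non-missing values per group, so any row can be reproduced with base R. A sketch for smoking status:

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# Rows where smoking status is recorded
has_smoker <- !is.na(regicor$smoker)

# Available observations per recruitment year, and overall
avail <- table(regicor$year[has_smoker])
avail
sum(avail)
```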
15. Key arguments of createTable()
# Continuing with the same data
res <- compareGroups(year ~ age + sex + smoker + chol +death, 
                     data = regicor, 
                     selec = list(chol = sex == "Female"))
res
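# hide.no = "no": for two-category variables that have a "no" category, that category is hidden and only the other one is shown
createTable(res, hide.no = "no") # for example, death above has a "No" category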
#--------Summary descriptives table by 'Recruitment year'---------
#________________________________________________________________________ 
#                              1995        2000        2005     p.overall 
#                              N=431       N=786      N=1077              
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Age                        54.1 (11.7) 54.3 (11.2) 55.3 (10.6)   0.078   
#Sex:                                                             0.506   
#    Male                   206 (47.8%) 390 (49.6%) 505 (46.9%)           
#    Female                 225 (52.2%) 396 (50.4%) 572 (53.1%)           
#Smoking status:                                                 <0.001   
#    Never smoker           234 (56.4%) 414 (54.6%) 553 (52.2%)           
#    Current or former < 1y 109 (26.3%) 267 (35.2%) 217 (20.5%)           
#    Former >= 1y           72 (17.3%)  77 (10.2%)  290 (27.4%)           
#Total cholesterol          226 (42.4)  224 (44.9)  216 (50.3)    0.004   
#Overall death              18 (4.65%)  81 (11.0%)  74 (7.23%)   <0.001   
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 


# digits controls the number of decimal places for each variable
createTable(res, digits = c(age = 2, sex = 3))
#Age                        54.10 (11.72) 54.34 (11.22) 55.28 (10.63)   0.078   
#Sex:                                                                   0.506   
#    Male                   206 (47.796%) 390 (49.618%) 505 (46.890%)           
#    Female                 225 (52.204%) 396 (50.382%) 572 (53.110%)     


# show.n = TRUE adds a column with the number of available observations for each variable
createTable(res, show.n = TRUE)
#--------Summary descriptives table by 'Recruitment year'---------
#_____________________________________________________________________________ 
#                              1995        2000        2005     p.overall  N   
#                              N=431       N=786      N=1077                   
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Age                        54.1 (11.7) 54.3 (11.2) 55.3 (10.6)   0.078   2294 
#Sex:                                                             0.506   2294 
#    Male                   206 (47.8%) 390 (49.6%) 505 (46.9%)                
#    Female                 225 (52.2%) 396 (50.4%) 572 (53.1%)                
#Smoking status:                                                 <0.001   2233 
#    Never smoker           234 (56.4%) 414 (54.6%) 553 (52.2%)                
#    Current or former < 1y 109 (26.3%) 267 (35.2%) 217 (20.5%)                
#    Former >= 1y           72 (17.3%)  77 (10.2%)  290 (27.4%)                
#Total cholesterol          226 (42.4)  224 (44.9)  216 (50.3)    0.004   1139 
#Overall death:                                                  <0.001   2148 
#    No                     369 (95.3%) 657 (89.0%) 949 (92.8%)                
#    Yes                    18 (4.65%)  81 (11.0%)  74 (7.23%)                 
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 


# show.descr = FALSE hides the descriptive columns and shows only the P values
createTable(res, show.descr = FALSE)
#--------Summary descriptives table by 'Recruitment year'---------
#____________________________________ 
#                           p.overall 
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Age                          0.078   
#Sex:                                 
#    Male                     0.506   
#    Female                           
#Smoking status:                      
#    Never smoker            <0.001   
#    Current or former < 1y           
#    Former >= 1y                     
#Total cholesterol            0.004   
#Overall death:                       
#    No                      <0.001   
#    Yes                              
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

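# When the grouping variable has more than 2 categories, show.p.trend = TRUE adds a P value for trend (Pearson for normally distributed variables, Spearman otherwise)
createTable(res, show.p.trend = TRUE)

# show.p.mul: with more than two groups, pairwise comparisons can be shown (Tukey for normally distributed variables, Benjamini & Hochberg otherwise)
createTable(res, show.p.mul = TRUE)

# When the grouping variable is binary or survival data, show.ratio = TRUE shows the OR or HR (as mentioned above)
createTable(update(res, subset = year != 1995), show.ratio = TRUE)

# digits.ratio controls the number of decimal places of the OR and HR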
createTable(compareGroups(tdeath ~ year + age + sex, data = regicor),
            show.ratio = TRUE,
            digits.ratio = 3)
#--------Summary descriptives table by 'Mortality'---------
#________________________________________________________________________________ 
#                    No event      Event            HR          p.ratio p.overall 
#                     N=1975       N=173                                          
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Recruitment year:                                                       <0.001   
#    1995          369 (18.7%)  18 (10.4%)         Ref.          Ref.             
#    2000          657 (33.3%)  81 (46.8%)  2.416 [1.450;4.027]  0.001            
#    2005          949 (48.1%)  74 (42.8%)  1.509 [0.901;2.526]  0.118            
#Age               54.3 (11.0)  60.5 (10.3) 1.052 [1.037;1.067] <0.001   <0.001   
#Sex:                                                                     0.349   
#    Male          943 (47.7%)  87 (50.3%)         Ref.          Ref.             
#    Female        1032 (52.3%) 86 (49.7%)  0.867 [0.644;1.169]  0.349            
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
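For the trend and pairwise P values described in the comments above, rough base R equivalents for a normally distributed variable might look like this. It is a sketch of the general techniques (Pearson correlation with the group order, Tukey's HSD), not a claim about how the package implements them, so the numbers may not match the table exactly.

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# P for trend: correlation between age and the order of the year groups
trend <- cor.test(regicor$age, as.integer(regicor$year), method = "pearson")
trend$p.value

# Pairwise comparisons between years with Tukey's correction
tuk <- TukeyHSD(aov(age ~ year, data = regicor))
tuk$year[, "p adj"]
```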
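16. Adjusting the table before export: renaming columns, combining by rows and by columns, stratifying with strataTable, one-step tables with descrTable
# header.labels renames columns when printing or exporting: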
final <- createTable(compareGroups(tdeath ~ year + age + sex, data = regicor),
                    show.all = TRUE)
print(final, header.labels = c(p.overall = "p-value", all = "ALL"))
#--------Summary descriptives table by 'Mortality'---------
#_______________________________________________________________ 
#                      ALL        No event      Event    p-value 
#                     N=2148       N=1975       N=173            
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Recruitment year:                                       <0.001  
#    1995          387 (18.0%)  369 (18.7%)  18 (10.4%)          
#    2000          738 (34.4%)  657 (33.3%)  81 (46.8%)          
#    2005          1023 (47.6%) 949 (48.1%)  74 (42.8%)          
#Age               54.8 (11.0)  54.3 (11.0)  60.5 (10.3) <0.001  
#Sex:                                                     0.349  
#    Male          1030 (48.0%) 943 (47.7%)  87 (50.3%)          
#    Female        1118 (52.0%) 1032 (52.3%) 86 (49.7%)          
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 

# Tables can be stacked by rows with rbind
restab1 <- createTable(compareGroups(year ~ age + sex, data = regicor))
restab2 <- createTable(compareGroups(year ~ chol + smoker, data = regicor))
rbind(`Non-modifiable risk factors` = restab1, `Modifiable risk factors` = restab2)
#--------Summary descriptives table by 'Recruitment year'---------
#____________________________________________________________________________ 
#                                  1995        2000        2005     p.overall 
#                                  N=431       N=786      N=1077              
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ 
#Non-modifiable risk factors:
#    Age                        54.1 (11.7) 54.3 (11.2) 55.3 (10.6)   0.078   
#    Sex:                                                             0.506   
#        Male                   206 (47.8%) 390 (49.6%) 505 (46.9%)           
#        Female                 225 (52.2%) 396 (50.4%) 572 (53.1%)           
#Modifiable risk factors:
#    Total cholesterol          225 (43.1)  224 (44.4)  213 (45.9)   <0.001   
#    Smoking status:                                                 <0.001   
#        Never smoker           234 (56.4%) 414 (54.6%) 553 (52.2%)           
#        Current or former < 1y 109 (26.3%) 267 (35.2%) 217 (20.5%)           
#        Former >= 1y           72 (17.3%)  77 (10.2%)  290 (27.4%) 
x <- rbind(`Non-modifiable` = restab1, Modifiable = restab2)
rbind(`Non-modifiable` = restab1, Modifiable = restab2)[c(1,4)] # select the variables you want

# Combine by columns with cbind
res <- compareGroups(sex ~ age + chol, data = regicor)
alltab <- createTable(res, show.p.overall = FALSE)
femaletab <- createTable(update(res, subset = sex == "Female"), 
    show.p.overall = FALSE)
maletab <- createTable(update(res, subset = sex == "Male"), show.p.overall = FALSE)
cbind(ALL = alltab, FEMALE = femaletab, MALE = maletab)
#--------Summary descriptives table ---------
#___________________________________________________________________
#                            ALL              FEMALE        MALE    
#                  _______________________  ___________  ___________
#                     Male       Female       Female        Male     
#                    N=1101      N=1193       N=1193       N=1101    
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
#Age               54.8 (11.1) 54.7 (11.0)  54.7 (11.0)  54.8 (11.1) 
#Total cholesterol 217 (42.7)  220 (47.4)   220 (47.4)   217 (42.7)  
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

# strataTable: quick stratification
res <- compareGroups(sex ~ age + chol, data = regicor)
restab <- createTable(res, hide.no = "no")
strataTable(restab, "cv")
#--------Summary descriptives table ---------
#______________________________________________________________________________________
#                                 No                                 Yes               
#                  _________________________________  _________________________________
#                     Male       Female    p.overall     Male       Female    p.overall 
#                     N=996      N=1075                  N=46        N=46               
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
#Age               54.7 (11.1) 54.6 (11.0)   0.785    58.2 (11.5) 56.7 (10.6)   0.511   
#Total cholesterol 216 (42.4)  219 (46.5)    0.124    223 (44.4)  225 (56.2)    0.827   
#¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

# descrTable combines compareGroups and createTable in one step
# Try it yourself, ideally once the steps above feel familiar
descrTable(sex ~ age + chol, data = regicor)
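A quick way to convince yourself that the shortcut is equivalent is to build the same table both ways and compare the printed results. The object names `two_step` and `one_step` are just illustrative.

```r
library(compareGroups)
data("regicor", package = "compareGroups")

# Two-step version: compute, then build
two_step <- createTable(compareGroups(sex ~ age + chol, data = regicor))

# One-step version
one_step <- descrTable(sex ~ age + chol, data = regicor)

# Both are table objects of the same class and can be printed or exported alike
class(two_step)
class(one_step)
```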
17. Exporting the table
export2csv(restab, file='table1.csv') # export to CSV
export2html(restab, file='table1.html') # export to HTML
export2latex(restab, file='table1.tex') # export to LaTeX
export2pdf(restab, file='table1.pdf') # export to PDF
export2md(restab, file='table1.md') # export to Markdown
export2word(restab, file='table1.docx') # export to Word
export2xls(restab, file='table1.xlsx') # export to Excel

# strip shades alternating variables with a background colour
export2md(restab, strip = TRUE, first.strip = TRUE)
# size changes the font size
export2md(restab, size = 6)
# width changes the width of the variable-name column
export2md(restab, width = "400px")
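When experimenting, it can be convenient to export into a temporary folder and check the result from R instead of cluttering the working directory. A sketch using export2csv, where the path `out` is just an illustrative name:

```r
library(compareGroups)
data("regicor", package = "compareGroups")

restab <- descrTable(sex ~ age + chol, data = regicor)

# Write the table to a temporary location
out <- file.path(tempdir(), "table1.csv")
export2csv(restab, file = out)

# Confirm the file was written and peek at the first lines
file.exists(out)
readLines(out, n = 3)
```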

References:

1. https://htmlpreview.github.io/?https://github.com/isubirana/compareGroups/blob/master/compareGroups_vignette.html (package developers)

2. https://www.jstatsoft.org/article/view/v057i12 (package developers)

3. https://ayueme.github.io/R_medical_stat/comparegroups.html (阿越)

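If anything here is unclear or you spot a definite error, please get in touch through the official account (discussion is welcome). More content is available on the WeChat official account 生信方舟.

Originality statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, contact cloudcommunity@tencent.com for removal.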
