How to do a GLM when "contrasts can be applied only to factors with 2 or more levels"?

Stack Overflow user

Asked 2018-05-12 01:20:54

1 answer · 4.6K views · 0 followers · 2 votes

I want to fit a regression in R with glm, but I get a contrasts error. Is there a way to make this work?

mydf <- data.frame(Group=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12),
                   WL=rep(c(1,0),12), 
                   New.Runner=c("N","N","N","N","N","N","Y","N","N","N","N","N","N","Y","N","N","N","Y","N","N","N","N","N","Y"), 
                   Last.Run=c(1,5,2,6,5,4,NA,3,7,2,4,9,8,NA,3,5,1,NA,6,10,7,9,2,NA))

mod <- glm(formula = WL~New.Runner+Last.Run, family = binomial, data = mydf)
#Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
# contrasts can be applied only to factors with 2 or more levels

regression

glm

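1 Answer

Stack Overflow user

Accepted answer

Answered 2018-07-29 02:11:15

Using the debug_contr_error and debug_contr_error2 functions defined here: How to debug "contrasts can be applied only to factors with 2 or more levels" error?, we can easily locate the problem: once the rows with NA in Last.Run are dropped, only a single level is left in the variable New.Runner.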

info <- debug_contr_error2(WL ~ New.Runner + Last.Run, mydf)

info[c(2, 3)]
#$nlevels
#New.Runner 
#         1 
#
#$levels
#$levels$New.Runner
#[1] "N"

## the data frame that is actually used by `glm`
dat <- info$mf
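
The helper functions above are defined in the linked answer, not in base R. As a rough stand-in, here is a minimal base-R sketch of the same check, assuming glm's default handling of missing values (incomplete rows are dropped). The names `vars`, `used`, `is_cat` and `n_levels` are only illustrative.

```r
# Rebuild the data frame from the question.
mydf <- data.frame(
  Group = rep(1:12, each = 2),
  WL = rep(c(1, 0), 12),
  New.Runner = c("N","N","N","N","N","N","Y","N","N","N","N","N",
                 "N","Y","N","N","N","Y","N","N","N","N","N","Y"),
  Last.Run = c(1,5,2,6,5,4,NA,3,7,2,4,9,8,NA,3,5,1,NA,6,10,7,9,2,NA)
)

# Keep only the model variables and the complete rows,
# which mimics what glm does under the default na.action.
vars <- all.vars(WL ~ New.Runner + Last.Run)
used <- mydf[complete.cases(mydf[vars]), vars]

# Count the distinct values remaining in each categorical variable.
is_cat <- vapply(used, function(x) is.character(x) || is.factor(x), logical(1))
n_levels <- vapply(used[is_cat], function(x) length(unique(x)), integer(1))
print(n_levels)
```

Every "Y" in New.Runner sits on a row where Last.Run is NA, so all of those rows disappear before the model is fitted.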

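Contrasts cannot be applied to a single-level factor, because any kind of contrasts reduces the number of levels by 1. Since 1 - 1 = 0, this variable would be dropped from the model matrix entirely.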
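
To illustrate the "levels minus one" rule, the short sketch below counts the columns of the treatment contrast matrix for factors with 2, 3 and 4 levels. The variable name `n_cols` is only illustrative.

```r
# A contrast matrix for n levels has n - 1 columns.
n_cols <- vapply(2:4, function(n) ncol(contr.treatment(n)), integer(1))

for (i in seq_along(n_cols)) {
  cat(i + 1, "levels ->", n_cols[i], "contrast column(s)\n")
}
```

Extending the pattern down to a single level would leave zero columns, which is why R refuses to build the matrix.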

So, can we simply ask for no contrasts to be applied to a single-level factor? No. Every contrasts function forbids this:

contr.helmert(n = 1, contrasts = FALSE)
#Error in contr.helmert(n = 1, contrasts = FALSE) : 
#  not enough degrees of freedom to define contrasts

contr.poly(n = 1, contrasts = FALSE)
#Error in contr.poly(n = 1, contrasts = FALSE) : 
#  contrasts not defined for 0 degrees of freedom

contr.sum(n = 1, contrasts = FALSE)
#Error in contr.sum(n = 1, contrasts = FALSE) : 
#  not enough degrees of freedom to define contrasts

contr.treatment(n = 1, contrasts = FALSE)
#Error in contr.treatment(n = 1, contrasts = FALSE) : 
#  not enough degrees of freedom to define contrasts

contr.SAS(n = 1, contrasts = FALSE)
#Error in contr.treatment(n, base = if (is.numeric(n) && length(n) == 1L) n else length(n),  : 
#  not enough degrees of freedom to define contrasts

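In fact, if you think it through, you will conclude that a single-level factor without contrasts is just a dummy variable of all 1's, that is, the intercept. So you can certainly do the following:

dat$New.Runner <- 1    ## set it to 1, as if no contrasts were applied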

mod <- glm(formula = WL ~ New.Runner + Last.Run, family = binomial, data = dat)
#(Intercept)   New.Runner     Last.Run  
#     1.4582           NA      -0.2507

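You get an NA coefficient for New.Runner because of rank-deficiency. In fact, applying contrasts is a fundamental way to avoid rank-deficiency. It is only when a factor has a single level that applying contrasts becomes a paradox.

Let's look at the model matrix: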

model.matrix(mod)
#   (Intercept) New.Runner Last.Run
#1            1          1        1
#2            1          1        5
#3            1          1        2
#4            1          1        6
#5            1          1        5
#6            1          1        4
#8            1          1        3
#9            1          1        7
#10           1          1        2
#11           1          1        4
#12           1          1        9
#13           1          1        8
#15           1          1        3
#16           1          1        5
#17           1          1        1
#19           1          1        6
#20           1          1       10
#21           1          1        7
#22           1          1        9
#23           1          1        2

(Intercept) and New.Runner have identical columns, so only one of them can be estimated. If you want to estimate New.Runner, drop the intercept:

glm(formula = WL ~ 0 + New.Runner + Last.Run, family = binomial, data = dat)
#New.Runner    Last.Run  
#    1.4582     -0.2507
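
The duplicated column can also be confirmed numerically. The following is a small sketch that rebuilds the model matrix by hand from the 20 Last.Run values shown above and compares its rank with and without the intercept column. `X`, `rank_with` and `rank_without` are illustrative names.

```r
# Last.Run values of the 20 rows that survive NA removal.
Last.Run <- c(1, 5, 2, 6, 5, 4, 3, 7, 2, 4, 9, 8, 3, 5, 1, 6, 10, 7, 9, 2)

# Hand-built model matrix: intercept, the all-ones New.Runner, and Last.Run.
X <- cbind("(Intercept)" = 1, New.Runner = 1, Last.Run = Last.Run)

rank_with <- qr(X)$rank                                  # intercept included
rank_without <- qr(X[, c("New.Runner", "Last.Run")])$rank  # intercept dropped

cat("columns:", ncol(X), " rank with intercept:", rank_with, "\n")
cat("columns:", ncol(X) - 1, " rank without intercept:", rank_without, "\n")
```

With the intercept the matrix has three columns but rank two; without it, two columns and rank two, so every remaining coefficient is estimable.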

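Make sure you fully digest the rank-deficiency issue. If you have several single-level factors and replace all of them with 1, dropping the one intercept still leaves the model rank-deficient: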

dat$foo.factor <- 1
glm(formula = WL ~ 0 + New.Runner + foo.factor + Last.Run, family = binomial, data = dat)
#New.Runner  foo.factor    Last.Run  
#    1.4582          NA     -0.2507
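
A different route, which is not the approach taken in this answer, is to leave single-level categorical variables out of the formula altogether, since each of them only duplicates the intercept. The sketch below assumes that dropping them is acceptable for the analysis; `used`, `is_single`, `keep` and `f` are illustrative names.

```r
# Rebuild the data frame from the question.
mydf <- data.frame(
  Group = rep(1:12, each = 2),
  WL = rep(c(1, 0), 12),
  New.Runner = c("N","N","N","N","N","N","Y","N","N","N","N","N",
                 "N","Y","N","N","N","Y","N","N","N","N","N","Y"),
  Last.Run = c(1,5,2,6,5,4,NA,3,7,2,4,9,8,NA,3,5,1,NA,6,10,7,9,2,NA)
)

# Complete rows of the model variables, as glm would use them.
used <- na.omit(mydf[all.vars(WL ~ New.Runner + Last.Run)])

# Flag categorical variables that have fewer than two distinct values left.
is_single <- vapply(
  used,
  function(x) (is.character(x) || is.factor(x)) && length(unique(x)) < 2,
  logical(1)
)

# Rebuild the formula from the remaining predictors and fit.
keep <- setdiff(names(used)[!is_single], "WL")
f <- reformulate(keep, response = "WL")
mod <- glm(f, family = binomial, data = used)
print(coef(mod))
```

Here the formula reduces to WL ~ Last.Run, so no coefficient is reported as NA.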

4 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/50297260
