Unable to fix the following error in logistic regression
training=(IBM$Serial<625)
data=IBM[!training,]
dim(data)
stock.direction <- data$Direction
training_model=glm(stock.direction~data$lag2,data=data,family=binomial)
###Error### ---- Error in eval(family$initialize) : y values must be 0 <= y <= 1
Here are a few rows of the data I am using:
X Date Open High Low Close Adj.Close Volume Return lag1 lag2 lag3 Direction Serial
1 28-11-2012 190.979996 192.039993 189.270004 191.979996 165.107727 3603600 0.004010855 0.004010855 -0.001198021 -0.006354834 Up 1
2 29-11-2012 192.75 192.899994 190.199997 191.529999 164.720734 4077900 0.00114865 0.00114865 -0.004020279 -0.009502386 Up 2
3 30-11-2012 191.75 192 189.5 190.070007 163.465073 4936400 0.003630178 0.003630178 -0.001894039 -0.005576956 Up 3
4 03-12-2012 190.759995 191.300003 188.360001 189.479996 162.957703 3349600 0.001213907 0.001213907 -0.002480478 -0.001636046 Up 4
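Posted on 2018-01-28 00:29:26
The reason it demands y values between 0 and 1 is that the categorical column in your data (such as "Direction") is of type 'character'. With family=binomial, glm() needs a response it can read as 0/1, for example a factor or numeric values between 0 and 1, so you need to convert the column to type 'factor' with as.factor(data$Direction).
So: glm(Direction ~ lag2, data=...)
There is no need to declare stock.direction.
You can use the command class(variable)
to check the class of each variable; if a variable is character, convert it to a factor and store it as a new column in the same data frame. Then it should work.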
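The check-and-convert step described above can be sketched as follows. This is a minimal illustration on a made-up data frame: the name `toy` and its values are assumptions for demonstration only, not the poster's IBM data.

```r
# Hypothetical toy data; the values are illustrative only.
toy <- data.frame(
  lag2 = c(-0.0012, -0.0040, -0.0019, -0.0025, 0.0031, 0.0008),
  Direction = c("Up", "Down", "Up", "Down", "Up", "Down"),
  stringsAsFactors = FALSE
)

class(toy$Direction)    # "character": glm() would reject this response

# Convert the response to a factor in the same data frame
toy$Direction <- as.factor(toy$Direction)
class(toy$Direction)    # "factor"
levels(toy$Direction)   # "Down" "Up"

# Refer to columns by name and let data= supply them
fit <- glm(Direction ~ lag2, data = toy, family = binomial)
```

With a factor response, glm() treats the first level ("Down" here) as failure and the remaining level as success, so the fitted probabilities are for "Up".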
Posted on 2020-10-08 02:29:35
I got the same error, "Error in eval(family$initialize) : y values must be 0 <= y <= 1", and solved it by adding "stringsAsFactors=T" to the read.csv function.
BEFORE: gene.train = read.csv("gene.train.csv",header=T) # error
AFTER : gene.train = read.csv("gene.train.csv",header=T,stringsAsFactors=T) # no error
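To see the effect of that argument without the original file, here is a small sketch that reads the same inline CSV text both ways. The text in `csv_text` and the object names `raw` and `fac` are hypothetical stand-ins for gene.train.csv. Note that since R 4.0.0 the default is stringsAsFactors = FALSE, which is probably why the argument has to be given explicitly.

```r
# Hypothetical inline CSV standing in for a file on disk
csv_text <- "lag2,Direction
-0.0012,Up
-0.0040,Down
-0.0019,Up
-0.0025,Down
0.0031,Up
0.0008,Down"

# Strings kept as character: a character response triggers the glm() error
raw <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)
class(raw$Direction)

# Strings read as factors: the response is usable directly
fac <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)
class(fac$Direction)

fit <- glm(Direction ~ lag2, data = fac, family = binomial)
```

One trade-off of this approach is that every string column becomes a factor, including ones such as dates or IDs that may be better left as character.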
Posted on 2017-11-29 17:11:16
Without knowing more about your data, you should do something like this:
library(dplyr)
df <- read.table(header = T, stringsAsFactors = F, text ="X Date Open High Low Close Adj.Close Volume Return lag1 lag2 lag3 Direction Serial
1 28-11-2012 190.979996 192.039993 189.270004 191.979996 165.107727 3603600 0.004010855 0.004010855 -0.001198021 -0.006354834 Up 1
2 29-11-2012 192.75 192.899994 190.199997 191.529999 164.720734 4077900 0.00114865 0.00114865 -0.004020279 -0.009502386 Up 2
3 30-11-2012 191.75 192 189.5 190.070007 163.465073 4936400 0.003630178 0.003630178 -0.001894039 -0.005576956 Up 3
4 03-12-2012 190.759995 191.300003 188.360001 189.479996 162.957703 3349600 0.001213907 0.001213907 -0.002480478 -0.001636046 Up 4
1 28-11-2012 190.979996 192.039993 189.270004 191.979996 165.107727 3603600 0.004010855 0.004010855 -0.001198021 -0.006354834 Up 1
2 29-11-2012 192.75 192.899994 190.199997 191.529999 164.720734 4077900 0.00114865 0.00114865 -0.004020279 -0.009502386 Down 2
3 30-11-2012 191.75 192 189.5 190.070007 163.465073 4936400 0.003630178 0.003630178 -0.001894039 -0.005576956 Up 3
4 03-12-2012 190.759995 191.300003 188.360001 189.479996 162.957703 3349600 0.001213907 0.001213907 -0.002480478 -0.001636046 Down 4
") %>%
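  # Encode the response as numeric 0/1 so that glm() accepts it
  mutate(bin = ifelse(Direction == "Up", 1, 0))
# Fit the logistic regression on the 0/1 response
glm(bin ~ High, family = "binomial", data = df)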
https://stackoverflow.com/questions/47546658