
R - xgb.cv train/test error does not change with each iteration

Stack Overflow user
Asked on 2018-12-12 05:14:19

I'm programming in R and trying to determine the optimal hyperparameters for an xgboost model I want to run. I have a dataset with about 700 variables (some numeric, others one-hot encoded) and about 25,000 observations. I'm trying to predict whether each observation is large (prediction = 1) or small (prediction = 0). The problem is that when I run the xgb.cv function, the train-error and test-error do not change from one iteration to the next. Below are my code and the resulting console printout. Can anyone explain why the errors stay the same? Thanks very much!

The specific R code:

dtrain <- xgb.DMatrix(data = pred[train, ], label = resp[train])
xgb.cv(data = dtrain,
       params = list(objective = "binary:logistic",
                     eta = 0.01,
                     max_depth = 10,
                     min_child_weight = 20,
                     colsample_bytree = 0.2),
       nfold = 5,
       nrounds = 100,
       verbose = TRUE,
       early_stopping_rounds = 8,
       maximize = FALSE)

Console printout:

[1] train-error:0.014422+0.000491   test-error:0.014422+0.001965
Multiple eval metrics are present. Will use test_error for early stopping.
Will train until test_error hasn't improved in 8 rounds.

[2] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[3] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[4] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[5] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[6] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[7] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[8] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[9] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
Stopping. Best iteration:
[1] train-error:0.014422+0.000491   test-error:0.014422+0.001965

Thanks again for your help!


1 Answer

Stack Overflow user

Posted on 2018-12-12 14:47:42

You have to put several candidate values in the params list! Use c():

params = list(objective = "binary:logistic",
              eta = c(0.01, 0.05, 0.1, 0.5, 1),
              max_depth = 10,
              min_child_weight = 20,
              colsample_bytree = c(0.1, 0.2, 0.5, 1))
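
Note that a single xgb.cv call still fits with one value per parameter, so a straightforward way to try every combination is to loop over the candidates and keep the best cross-validated error. A minimal sketch, reusing the dtrain from the question (the grid values and the bookkeeping variables are illustrative, not part of the original answer):

library(xgboost)

# Try each combination of candidate values with xgb.cv and keep the best one
grid <- expand.grid(eta = c(0.01, 0.05, 0.1, 0.5, 1),
                    colsample_bytree = c(0.1, 0.2, 0.5, 1))

best_err <- Inf
best_params <- NULL
for (i in seq_len(nrow(grid))) {
  cv <- xgb.cv(data = dtrain,
               params = list(objective = "binary:logistic",
                             eval_metric = "error",
                             eta = grid$eta[i],
                             max_depth = 10,
                             min_child_weight = 20,
                             colsample_bytree = grid$colsample_bytree[i]),
               nfold = 5,
               nrounds = 100,
               verbose = FALSE,
               early_stopping_rounds = 8,
               maximize = FALSE)
  err <- min(cv$evaluation_log$test_error_mean)  # best mean CV error for this combination
  if (err < best_err) {
    best_err <- err
    best_params <- grid[i, ]
  }
}
best_params  # the winning combination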

Note: you can also find the best tuning parameters with the caret and mlr packages.

library(caret)
modelLookup(model = "xgbTree")
#     model        parameter                          label forReg forClass probModel
# 1 xgbTree          nrounds          # Boosting Iterations   TRUE     TRUE      TRUE
# 2 xgbTree        max_depth                 Max Tree Depth   TRUE     TRUE      TRUE
# 3 xgbTree              eta                      Shrinkage   TRUE     TRUE      TRUE
# 4 xgbTree            gamma         Minimum Loss Reduction   TRUE     TRUE      TRUE
# 5 xgbTree colsample_bytree     Subsample Ratio of Columns   TRUE     TRUE      TRUE
# 6 xgbTree min_child_weight Minimum Sum of Instance Weight   TRUE     TRUE      TRUE
# 7 xgbTree        subsample           Subsample Percentage   TRUE     TRUE      TRUE

# Computation time is very long
tuneGrid <- expand.grid(nrounds = 1000,
                        max_depth = 2:14,
                        eta = seq(0.01, 0.1, by = 0.01),  # note: c(0.01:0.1) would yield only 0.01
                        gamma = c(0, 1),
                        colsample_bytree = c(0.5, 1),     # must be greater than 0
                        min_child_weight = c(0, 1),
                        subsample = c(0.5, 1))            # must be greater than 0


set.seed(1)
# NB: caret::train() expects a data.frame here, not the xgb.DMatrix from the question
model <- train(form = factor(categ) ~ .,
               data = dtrain,
               method = "xgbTree",
               verbose = TRUE,
               metric = "Accuracy",
               nthread = 3,
               tuneGrid = tuneGrid)

# NB: categ is your categorical target variable
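
Once training finishes, caret stores the winning combination and the resampled accuracy of every grid point on the fitted object; a short usage sketch, assuming the model object from above:

model$bestTune       # the hyperparameter combination that performed best
head(model$results)  # cross-validated accuracy for each grid point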

With mlr:

library(mlr)
lrn <- makeLearner(cl = "classif.xgboost", nrounds = 10)

# List all of the learner's tunable parameters
getParamSet(x = lrn)  # or use lrn$par.set


# NB: as with caret, the task needs a data.frame, not an xgb.DMatrix
tsk <- makeClassifTask(data = dtrain, target = "categ")
ps <- makeParamSet(makeNumericParam(id = "eta", lower = 0, upper = 1),
                   makeNumericParam(id = "lambda", lower = 0, upper = 200),
                   makeIntegerParam(id = "max_depth", lower = 1, upper = 20))
control <- makeTuneControlMBO(budget = 100)
cv10 <- makeResampleDesc(method = "CV", iters = 10, stratify = TRUE)

# Find the optimal parameters
set.seed(1)
tr <- tuneParams(learner = lrn,
                 task = tsk,
                 resampling = cv10,
                 measures = acc,
                 par.set = ps,
                 control = control)

# Replace with the optimized parameters
lrn <- setHyperPars(learner = lrn, par.vals = tr$x)

# Fit the final model with the tuned parameters
model <- mlr::train(learner = lrn, task = tsk)
model$learner.model
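
To sanity-check the tuned learner, you can predict on the task and score the predictions; a usage sketch assuming the objects above (scoring on the training task is optimistic, so a held-out set gives an honest estimate):

pred <- predict(model, task = tsk)  # predict on the (training) task
performance(pred, measures = acc)   # accuracy of those predictions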
The original content of this page is provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/-100006265
