Q: XGB: Combining H2O holdout predictions

Stack Overflow user

Asked 2018-07-21 13:40:10

2 answers · 1.1K views · 0 followers · 4 votes

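When using:

"keep_cross_validation_predictions": True
"keep_cross_validation_fold_assignment": True

in H2O's XGBoost estimator, I am unable to map the cross-validated probabilities back to the original dataset. There is a documentation example for R, but not for Python (combining holdout predictions).

Any leads on how to do this in Python?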

python

h2o

xgboost

2 Answers

Stack Overflow user

Accepted answer

Posted 2018-07-22 06:23:54

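The cross-validated predictions are stored in two different places: once as a list of length k (for k folds) in model.cross_validation_predictions(), and once as an H2OFrame, with the CV predictions in the same order as the original training rows, in model.cross_validation_holdout_predictions(). The latter is usually what people want (we added it later, which is why there are two versions).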
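How the two forms relate can be sketched with plain Python lists instead of H2O frames. This toy example assumes, as an illustration only, that each per-fold entry has one value per training row and holds a zero placeholder for rows outside that fold's holdout set. The names `per_fold_p1`, `fold_assignment`, and `combine_holdout` are made up for the sketch and are not H2O API.

```python
# Toy stand-ins (not H2O objects): 6 training rows, k = 3 folds.
# fold_assignment[i] is the fold in which row i was held out.
fold_assignment = [0, 1, 2, 0, 1, 2]

# One prediction vector per fold, each as long as the training set.
# Assumption: entries outside that fold's holdout rows are zero placeholders.
per_fold_p1 = [
    [0.9, 0.0, 0.0, 0.2, 0.0, 0.0],  # fold 0 held out rows 0 and 3
    [0.0, 0.6, 0.0, 0.0, 0.4, 0.0],  # fold 1 held out rows 1 and 4
    [0.0, 0.0, 0.7, 0.0, 0.0, 0.1],  # fold 2 held out rows 2 and 5
]


def combine_holdout(per_fold, folds):
    """For each training row, take the prediction from the fold that held it out."""
    return [per_fold[fold][row] for row, fold in enumerate(folds)]


# One value per training row, in the original row order.
holdout_p1 = combine_holdout(per_fold_p1, fold_assignment)
print(holdout_p1)
```

The combined list plays the role of the single holdout frame: one out-of-fold prediction per training row, in training order, which is why it can be lined up with the original data directly.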

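Yes, unfortunately the R example for getting this frame in the "Cross-Validation" section of the H2O User Guide does not have a Python version (there is a ticket to fix that). The keep_cross_validation_predictions parameter documentation shows only one of the two locations.

Here is an updated example using XGBoost that shows both types of CV predictions: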

import h2o
from h2o.estimators.xgboost import H2OXGBoostEstimator
h2o.init()

# Import a sample binary outcome training set into H2O
train = h2o.import_file("http://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")

# Identify predictors and response
x = train.columns
y = "response"
x.remove(y)

# For binary classification, response should be a factor
train[y] = train[y].asfactor()

# try using the `keep_cross_validation_predictions` (boolean parameter):
# first initialize your estimator, set nfolds parameter
xgb = H2OXGBoostEstimator(keep_cross_validation_predictions = True, nfolds = 5, seed = 1)

# then train your model
xgb.train(x = x, y = y, training_frame = train)

# print the cross-validation predictions as a list
xgb.cross_validation_predictions()

# print the cross-validation predictions as an H2OFrame
xgb.cross_validation_holdout_predictions()

The CV holdout predictions frame looks like this:

Out[57]:
  predict         p0        p1
---------  ---------  --------
        1  0.396057   0.603943
        1  0.149905   0.850095
        1  0.0407018  0.959298
        1  0.140991   0.859009
        0  0.67361    0.32639
        0  0.865698   0.134302
        1  0.12927    0.87073
        1  0.0549603  0.94504
        1  0.162544   0.837456
        1  0.105603   0.894397

[10000 rows x 3 columns]
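Because the holdout frame is in the same row order as the training frame, one way to map the probabilities back to the original rows is to column-bind the two. The following is a minimal sketch, not an official recipe: `attach_cv_predictions` is a hypothetical helper name, and it assumes the prediction column names (`predict`, `p0`, `p1`) do not collide with columns already in the training frame.

```python
def attach_cv_predictions(model, frame):
    """Return `frame` with the model's CV holdout predictions bound on as columns.

    Hypothetical helper (not part of H2O). Assumes `model` was trained on
    `frame` with keep_cross_validation_predictions=True and nfolds set.
    """
    preds = model.cross_validation_holdout_predictions()
    if preds is None:
        raise ValueError(
            "no holdout predictions found; "
            "train with keep_cross_validation_predictions=True"
        )
    # Same row order as the training frame, so a plain column bind lines them up.
    combined = frame.cbind(preds)

    # Fold ids are only present when the model was trained with
    # keep_cross_validation_fold_assignment=True; skip them otherwise.
    folds = model.cross_validation_fold_assignment()
    if folds is not None:
        combined = combined.cbind(folds)
    return combined


# Usage with the objects from the example above (requires a running H2O cluster):
#   train_with_cv = attach_cv_predictions(xgb, train)
#   train_with_cv.as_data_frame()   # convert to pandas if needed
```

Binding the fold assignment alongside the predictions makes it possible to tell which fold's model produced each row's probability.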

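Votes: 3

Stack Overflow user

Posted 2018-07-21 18:06:41

There is an example of this on GBM for Python, and it should be exactly the same for XGB. According to that page, you should be able to do something like this: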

# nfolds (or a fold column) must be set, otherwise no cross-validation is run
model = H2OXGBoostEstimator(keep_cross_validation_predictions = True, nfolds = 5)

model.train(x = predictors, y = response, training_frame = train)

cv_predictions = model.cross_validation_predictions()

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/51453164
