sklearn's RandomForestClassifier has a parameter:
oob_score : bool (default=False) Whether to use out-of-bag samples to estimate the generalization accuracy.
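This is usually referred to as the out-of-bag (OOB) error. As the description suggests, the parameter tells the forest to use the OOB samples to estimate performance on unseen data. Note that the fitted attribute oob_score_ is an accuracy, so the OOB error is 1 - oob_score_.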
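Stack Overflow has a fairly thorough explanation of OOB. Here is my own understanding: each tree is trained on a bootstrap sample of the training set, so some training samples (about 37% on average) are left out of that tree's training data and can act as test data for it.
This makes it possible to test the model while it is being trained. Empirically: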
out-of-bag estimate is as accurate as using a test set of the same size as the training set.
In other words, the OOB error serves as an (approximately) unbiased estimate of the test error.
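To see how close the two numbers are in practice, here is a minimal sketch that compares oob_score_ with the accuracy on a held-out test set of the same size as the training set. The synthetic dataset, the 50/50 split, and the hyperparameters are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration, not from the original post).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=0
)

# Hold out a test set the same size as the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# oob_score=True requires bootstrap=True (the default).
clf = RandomForestClassifier(
    n_estimators=200, oob_score=True, bootstrap=True, random_state=0
)
clf.fit(X_train, y_train)

oob_accuracy = clf.oob_score_              # accuracy estimated from OOB samples
test_accuracy = clf.score(X_test, y_test)  # accuracy on the held-out set

print(f"OOB accuracy:  {oob_accuracy:.3f}  (OOB error:  {1 - oob_accuracy:.3f})")
print(f"Test accuracy: {test_accuracy:.3f}  (test error: {1 - test_accuracy:.3f})")
```

The two accuracies should land close to each other, although the exact gap depends on the data and the number of trees.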
To summarize in one sentence, with Zi = (xi, yi) denoting the i-th training sample:
The out-of-bag (OOB) error is the average error for each Zi calculated using predictions from the trees that do not contain Zi in their respective bootstrap sample. This allows the RandomForestClassifier to be fit and validated whilst being trained.
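The definition above can be sketched by hand. The following is a simplified illustration, not scikit-learn's internal implementation: it bags plain decision trees, records for each Zi only the predictions of trees whose bootstrap sample did not contain Zi, and aggregates them by majority vote (the library's own aggregation may differ in detail). The dataset and the number of trees are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic binary data (assumption); labels are the integers 0 and 1.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
n_samples = len(X)
n_trees = 100
n_classes = len(np.unique(y))

# votes[i, c] counts trees that did NOT train on sample i and predicted class c.
votes = np.zeros((n_samples, n_classes), dtype=int)

for t in range(n_trees):
    # Bootstrap sample: draw n_samples indices with replacement.
    in_bag = rng.integers(0, n_samples, size=n_samples)

    # Samples never drawn are out-of-bag for this tree.
    oob_mask = np.ones(n_samples, dtype=bool)
    oob_mask[in_bag] = False

    tree = DecisionTreeClassifier(max_features="sqrt", random_state=t)
    tree.fit(X[in_bag], y[in_bag])

    # Only OOB samples receive a vote from this tree.
    oob_idx = np.flatnonzero(oob_mask)
    pred = tree.predict(X[oob_idx])
    votes[oob_idx, pred] += 1

# Skip samples that were in-bag for every tree (very unlikely with 100 trees).
has_vote = votes.sum(axis=1) > 0
oob_pred = votes.argmax(axis=1)
oob_error = np.mean(oob_pred[has_vote] != y[has_vote])

print(f"OOB error over {has_vote.sum()} samples: {oob_error:.3f}")
```

Because every prediction counted for Zi comes from a tree that never saw Zi, the resulting error behaves like a validation error even though no separate validation set was set aside.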