This chapter covers the following recipes:
1. K-fold cross validation
2. Automatic cross validation
3. Cross validation with ShuffleSplit
4. Stratified k-fold
5. Poor man's grid search
6. Brute force grid search
7. Using dummy estimators to compare results
8. Regression model evaluation
9. Feature selection
10. Feature selection on L1 norms
11. Persisting models with joblib
Although the chapters are unordered by design, you could argue that, as far as the art of data science goes, we've saved the best for last.
For the most part, each recipe in this chapter applies to the various models we've worked with. In some ways, you can think of this chapter as being about tuning parameters and selecting features. Ultimately, we need to choose some criteria to determine the "best" model, and we'll use various measures to define "best". These are covered in the Regression model evaluation recipe.
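To make the idea of choosing a "best" model by a stated criterion concrete, here is a minimal sketch. It assumes scikit-learn is installed, uses synthetic data from make_regression, and picks mean squared error on a held-out split as the criterion. The dataset, the two candidate models, and the split size are illustrative assumptions, not the recipes' own choices.

```python
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data (an assumption made for illustration only).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two hypothetical candidates: a linear model and a predict-the-mean baseline.
candidates = {
    "linear": LinearRegression(),
    "mean_baseline": DummyRegressor(strategy="mean"),
}

# Score every candidate with the same criterion on the same held-out data.
mse = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    mse[name] = mean_squared_error(y_test, model.predict(X_test))

# "Best" here means lowest held-out mean squared error.
best = min(mse, key=mse.get)
print(best, mse)
```

A different criterion (for example, mean absolute error) could rank models differently, which is why the measure has to be chosen explicitly.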
Then, in the Cross validation with ShuffleSplit recipe, we will randomize the evaluation across subsets of the data to help avoid overfitting.
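As a rough preview of that randomized evaluation, the sketch below scores one model on several random train/test partitions using scikit-learn's ShuffleSplit and cross_val_score. The synthetic data, the linear model, the number of splits, and the test fraction are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Synthetic regression data (an assumption made for illustration only).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Five independent random partitions, each holding out 25% of the rows.
cv = ShuffleSplit(n_splits=5, test_size=0.25, random_state=0)

# One score per partition; for a regressor the default score is R^2.
scores = cross_val_score(LinearRegression(), X, y, cv=cv)
print(scores.mean(), scores.std())
```

Looking at the spread of the scores, and not only their mean, gives a sense of how much the evaluation depends on which rows happened to be held out.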
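This article is a translation of a foreign-language text.
In case of copyright infringement, please contact cloudcommunity@tencent.com to request removal.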