I don't know why the ensembles are not calling the classifiers. Maybe some of the parameters are mixed up?
Forest Cover Type data:
X has shape (581012, 54)
Y has shape (581012,)
from sklearn.ensemble import VotingClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn import model_selection
classifier_names = ["logistic regression", "linear SVM", "nearest centroids", "decision tree"]
classifiers = [LogisticRegression, LinearSVC, NearestCentroid, DecisionTreeClassifier]
ensemble1 = VotingClassifier(classifiers)
ensemble2 = BaggingClassifier(classifiers)
ensemble3 = AdaBoostClassifier(classifiers)
ensembles = [ensemble1, ensemble2, ensemble3]
seed = 7
for ensemble in ensembles:
    kfold = model_selection.KFold(n_splits=10, random_state=seed)
    for classifier in classifiers:
        model = ensemble(base_estimator=classifier, random_state=seed)
        results = model_selection.cross_val_score(ensemble, X, Y, cv=kfold)
        print(results.mean())
I expected the ensembles to run the classifiers, but the first ensemble did not run. I changed the order to put BaggingClassifier first, but it raised the same "not callable" error.
Posted on 2019-01-06 14:37:26
For VotingClassifier, the estimators should be a list of (name, model) tuples. Note that you need to create an instance of each model first and then pass it inside the tuple. From the documentation:
estimators : list of (string, estimator) tuples. Invoking the fit method on the VotingClassifier will fit clones of those original estimators, which will be stored in the class attribute self.estimators_. An estimator can be set to None using set_params.
For the other two ensembles, you can only pass a single base estimator, plus n_estimators copies of that same base model. Loop over the different classifiers as you did in your code, but redefine the ensemble model each time. From the documentation:
base_estimator : object or None, optional (default=None). The base estimator to fit on random subsets of the dataset. If None, then the base estimator is a decision tree.
n_estimators : int, optional (default=10). The number of base estimators in the ensemble.
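The loop described above — the same bagging wrapper re-created for each base model, instead of receiving the whole classifier list at once — can be sketched like this (a minimal sketch on the iris data; the n_estimators value and dataset are illustrative, not from the question):

```python
from sklearn import datasets, model_selection
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.neighbors import NearestCentroid
from sklearn.tree import DecisionTreeClassifier

X, y = datasets.load_iris(return_X_y=True)
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=7)

# One bagging ensemble per base model: the ensemble is re-defined
# inside the loop rather than given the list of classifier classes.
for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("linear SVM", LinearSVC()),
                  ("nearest centroids", NearestCentroid()),
                  ("decision tree", DecisionTreeClassifier())]:
    # The base estimator is passed positionally: the keyword is
    # base_estimator in older scikit-learn, estimator in newer releases.
    bagging = BaggingClassifier(clf, n_estimators=10, random_state=7)
    scores = model_selection.cross_val_score(bagging, X, y, cv=kfold)
    print(name, scores.mean())
```

Each iteration fits and scores a fresh BaggingClassifier, so the four base models are compared under the same cross-validation splits.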
Try this!
from sklearn import datasets, model_selection
from sklearn.ensemble import VotingClassifier, BaggingClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.neighbors import NearestCentroid
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
X, y = iris.data[:, 1:3], iris.target
classifier_names = ["logistic regression", "linear SVM", "nearest centroids", "decision tree"]
classifiers = [LogisticRegression(), LinearSVC(), NearestCentroid(), DecisionTreeClassifier()]
ensemble1 = VotingClassifier([(n, c) for n, c in zip(classifier_names, classifiers)])
ensemble2 = BaggingClassifier(base_estimator=DecisionTreeClassifier(), n_estimators=10)
ensemble3 = AdaBoostClassifier(base_estimator=DecisionTreeClassifier(), n_estimators=10)
ensembles = [ensemble1, ensemble2, ensemble3]
seed = 7
for ensemble in ensembles:
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
    results = model_selection.cross_val_score(ensemble, X, y, cv=kfold)
    print(results.mean())
Posted on 2019-01-07 04:35:50
from sklearn import model_selection
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, BaggingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.neighbors import NearestCentroid
from sklearn.tree import DecisionTreeClassifier
import warnings
warnings.filterwarnings("ignore")
seed = 7
classifier_names = ["logistic regression", "linear SVM", "nearest centroids", "decision tree"]
classifiers = [LogisticRegression, LinearSVC, NearestCentroid, DecisionTreeClassifier]
for classifier in classifiers:
    # RandomForestClassifier always builds decision trees; it takes no base estimator.
    ensemble1 = RandomForestClassifier(n_estimators=20, random_state=seed)
    # AdaBoost needs base estimators whose fit accepts sample_weight,
    # so it will fail for e.g. NearestCentroid.
    ensemble2 = AdaBoostClassifier(base_estimator=classifier(),
                                   n_estimators=5, learning_rate=1, random_state=seed)
    ensemble3 = BaggingClassifier(base_estimator=classifier(),
                                  max_samples=0.5, n_estimators=20, random_state=seed)
    # Hard voting (the default): LinearSVC and NearestCentroid have no
    # predict_proba, so voting="soft" would raise an error here.
    ensemble4 = VotingClassifier([(n, c()) for n, c in zip(classifier_names, classifiers)])
    ensembles = [ensemble1, ensemble2, ensemble3, ensemble4]
    for ensemble in ensembles:
        kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
        results = model_selection.cross_val_score(ensemble, X[1:100], y[1:100], cv=kfold)
        print("The mean accuracy of {}: {:.3f}".format(type(ensemble).__name__, results.mean()))
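A note on the voting mode: voting="soft" averages predict_proba across the estimators, which LinearSVC and NearestCentroid do not implement, so soft voting only works when every estimator can produce class probabilities. A minimal soft-voting sketch restricted to probability-capable models (the iris data and the GaussianNB choice are illustrative, not from the question):

```python
from sklearn import datasets, model_selection
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = datasets.load_iris(return_X_y=True)

# Every estimator here implements predict_proba, so their class
# probabilities can be averaged by the soft vote.
soft = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier(random_state=7)),
                ("nb", GaussianNB())],
    voting="soft",
)
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=7)
scores = model_selection.cross_val_score(soft, X, y, cv=kfold)
print(scores.mean())
```

With estimators that lack predict_proba, stick to the default hard voting, which only needs predict.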
https://stackoverflow.com/questions/54051501