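I would like to calculate AUC, precision, and accuracy for my classifier. I am doing supervised learning.
Here is my working code. It works fine for binary classes, but not for multiple classes. Please assume that you have a dataframe with binary classes: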
sample_features_dataframe = self._get_sample_features_dataframe()
labeled_sample_features_dataframe = retrieve_labeled_sample_dataframe(sample_features_dataframe)
labeled_sample_features_dataframe, binary_class_series, multi_class_series = self._prepare_dataframe_for_learning(labeled_sample_features_dataframe)
k = 10
k_folds = StratifiedKFold(binary_class_series, k)
for train_indexes, test_indexes in k_folds:
    train_set_dataframe = labeled_sample_features_dataframe.loc[train_indexes.tolist()]
    test_set_dataframe = labeled_sample_features_dataframe.loc[test_indexes.tolist()]
    train_class = binary_class_series[train_indexes]
    test_class = binary_class_series[test_indexes]
    selected_classifier = RandomForestClassifier(n_estimators=100)
    selected_classifier.fit(train_set_dataframe, train_class)
    predictions = selected_classifier.predict(test_set_dataframe)
    predictions_proba = selected_classifier.predict_proba(test_set_dataframe)
    roc += roc_auc_score(test_class, predictions_proba[:,1])
    accuracy += accuracy_score(test_class, predictions)
    recall += recall_score(test_class, predictions)
    precision += precision_score(test_class, predictions)
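At the end I divide each accumulated result by K to get the average AUC, precision, and so on. This code works fine. However, I cannot calculate the same metrics for multiple classes: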
train_class = multi_class_series[train_indexes]
test_class = multi_class_series[test_indexes]
selected_classifier = RandomForestClassifier(n_estimators=100)
selected_classifier.fit(train_set_dataframe, train_class)
predictions = selected_classifier.predict(test_set_dataframe)
predictions_proba = selected_classifier.predict_proba(test_set_dataframe)
I found that for multiple classes I have to pass "weighted" as the average parameter:
roc += roc_auc_score(test_class, predictions_proba[:,1], average="weighted")
I got an error: raise ValueError("{0} format is not supported".format(y_type))
ValueError: multiclass format is not supported
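For context, more recent scikit-learn releases (0.22 and later, as far as I know) let roc_auc_score accept multiclass targets when it is given the full predict_proba matrix together with multi_class="ovr" or "ovo", rather than a single probability column. The sketch below is one way the loop above might be adapted under that assumption. The synthetic data from make_classification, the function name cross_validated_multiclass_scores, and the choice of weighted averaging for every metric are illustrative assumptions, not part of the original code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import StratifiedKFold


def cross_validated_multiclass_scores(features, labels, k=5, seed=0):
    """Average weighted one-vs-rest AUC, accuracy, recall and precision over k folds."""
    folds = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    totals = {"roc_auc": 0.0, "accuracy": 0.0, "recall": 0.0, "precision": 0.0}

    for train_idx, test_idx in folds.split(features, labels):
        clf = RandomForestClassifier(n_estimators=50, random_state=seed)
        clf.fit(features[train_idx], labels[train_idx])

        y_true = labels[test_idx]
        predictions = clf.predict(features[test_idx])
        # Keep the full (n_samples, n_classes) matrix; slicing out column 1
        # only makes sense in the binary case.
        probabilities = clf.predict_proba(features[test_idx])

        totals["roc_auc"] += roc_auc_score(
            y_true, probabilities, multi_class="ovr", average="weighted")
        totals["accuracy"] += accuracy_score(y_true, predictions)
        # recall and precision also need an explicit average for multiclass.
        totals["recall"] += recall_score(y_true, predictions, average="weighted")
        totals["precision"] += precision_score(
            y_true, predictions, average="weighted", zero_division=0)

    # Divide the accumulated sums by the number of folds.
    return {name: total / k for name, total in totals.items()}


# Synthetic 3-class data standing in for the labeled dataframe in the question.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
scores = cross_validated_multiclass_scores(X, y)
print(scores)
```

Stratifying the folds keeps every class present in each test split, which the one-vs-rest AUC needs in order to match the label set against the probability columns. With a dataframe, the same indexing would presumably go through .iloc instead of plain array indexing.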
https://stackoverflow.com/questions/39685740