Q: Manually calculating AUC in Python

Stack Overflow user

Asked 2018-07-21 03:34:11

1 answer, 855 views, 0 followers, 0 votes

In R, I can manually calculate and plot the AUC with the following code and a for loop:

test = data.frame(cbind(dt$DV, predicted_prob))
colnames(test)[1] = 'DV' 
colnames(test)[2] = 'DV_pred_prob' 

TP = rep(NA,101)
FN = rep(NA,101)
FP = rep(NA,101)
TN = rep(NA,101)
Sensitivity = rep(NA,101)
Specificity = rep(NA,101)
AUROC = 0

for(i in 0:100){
  test$temp = 0
  test[test$DV_pred_prob > (i/100),"temp"] = 1
  TP[i+1] = nrow(test[test$DV==1 & test$temp==1,])
  FN[i+1] = nrow(test[test$DV==1 & test$temp==0,])
  FP[i+1] = nrow(test[test$DV==0 & test$temp==1,])
  TN[i+1] = nrow(test[test$DV==0 & test$temp==0,])
  Sensitivity[i+1] = TP[i+1] / (TP[i+1] + FN[i+1] )
  Specificity[i+1] = TN[i+1] / (TN[i+1] + FP[i+1] )
  if(i>0){
    AUROC = AUROC+0.5*(Specificity[i+1] - Specificity[i])*(Sensitivity[i+1] + 
Sensitivity[i])
  }
}

data = data.frame(cbind(Sensitivity,Specificity,id=(0:100)/100))

I am trying to write the same code in Python, but I get the error "TypeError: 'Series' objects are mutable, thus they cannot be hashed".

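I am very new to Python and am trying to work in both R and Python so that I learn both languages. Can anyone point me in the right direction to solve this?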

predictions = pd.DataFrame(predictions[1])
actual = pd.DataFrame(y_test)
test = pd.concat([actual.reset_index(drop=True), predictions], axis=1)
# Rename column Renew to 'actual' and '1' to 'predictions'
test.rename(columns={"Renew": "actual", 1: "predictions"}, inplace=True)

TP = np.repeat('NA', 101)
FN = np.repeat('NA', 101)
FP = np.repeat('NA', 101)
TN = np.repeat('NA', 101)
Sensitivity = np.repeat('NA', 101)
Specificity = np.repeat('NA', 101)
AUROC = 0

for i in range(100):
    test['temp'] = 0
    test[test['predictions'] > (i/100), "temp"] = 1
    TP[i+1] = [test[test["actual"]==1 and test["temp"]==1,]].shape[0]
    FN[i+1] = [test[test["actual"]==1 and test["temp"]==0,]].shape[0]
    FP[i+1] = [test[test["actual"]==0 and test["temp"]==1,]].shape[0]
    TN[i+1] = [test[test["actual"]==0 and test["temp"]==0,]].shape[0]
    Sensitivity[i+1] = TP[i+1] / (TP[i+1] + FN[i+1])
    Specificity[i+1] = TN[i+1] / (TN[i+1] + FP[i+1])
    if(i > 0):
            AUROC = AUROC+0.5*(Specificity[i+1] - Specificity[i])*(Sensitivity[i+1] + Sensitivity[i])

The error seems to occur in the part of the code that contains (i/100).

python

for-loop

auc

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-07-21 04:14:24

Pandas indexing doesn't work the way you expect. You can't use df[rows, cols]; use .loc instead (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html)

So yes, you are right: the error is caused by this line:

test[test['predictions'] > (i/100), "temp"] = 1

To fix it, use:

test.loc[test['predictions'] > (i/100), "temp"] = 1
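As a minimal sketch of that fix, assuming a small made-up DataFrame (the values are illustrative and not taken from the question), .loc accepts a boolean row mask together with a column label:

```python
import pandas as pd

# Illustrative data only; the column names mirror the question
test = pd.DataFrame({"actual": [0, 1, 1], "predictions": [0.2, 0.6, 0.9]})
test["temp"] = 0

threshold = 0.5  # stands in for i / 100 inside the loop
# Row selector is a boolean Series, column selector is a label
test.loc[test["predictions"] > threshold, "temp"] = 1

print(test["temp"].tolist())  # [0, 1, 1]
```

Only the rows where the mask is True are updated; the others keep the 0 assigned beforehand.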

...Then you will run into problems with the next 4 lines, which all follow this pattern:

TP[i+1] = test[test["actual"]==1 and test["temp"]==1,].shape[0]

You need to wrap each comparison in parentheses and change and to &. There is a good discussion of why here: Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). So your code should look like this:

TP[i+1] = len(test[(test["actual"]==1) & (test["temp"]==1)])

Note: we can use the len function to count the rows instead of taking the first element of the DataFrame's shape attribute. That is just my preference, though.
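A short sketch of the and versus & behaviour, again on made-up data (the four rows below are an assumption chosen so that exactly one row is a true positive):

```python
import pandas as pd

# Illustrative data only
test = pd.DataFrame({"actual": [1, 1, 0, 0], "temp": [1, 0, 1, 0]})

is_positive = test["actual"] == 1
is_flagged = test["temp"] == 1

# `and` asks each Series for a single True/False, which pandas refuses
try:
    is_positive and is_flagged
    raised = False
except ValueError:
    raised = True

# `&` combines the two masks element by element
tp_len = len(test[(test["actual"] == 1) & (test["temp"] == 1)])
# Summing the boolean mask gives the same count without building a sub-frame
tp_sum = int(((test["actual"] == 1) & (test["temp"] == 1)).sum())

print(raised, tp_len, tp_sum)
```

The parentheses matter because & binds more tightly than ==, so without them Python would try to evaluate 1 & test["temp"] first.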

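Finally, you can't set 'NA' values that way in Python; you could use np.nan instead. The final if statement will fail because your placeholder arrays hold strings. I think np.zeros(101) will work for you.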
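To see why the string placeholders get in the way, here is a small sketch comparing the three kinds of placeholder array (the variable names are made up for illustration):

```python
import numpy as np

as_strings = np.repeat('NA', 101)  # what the question's code builds
as_zeros = np.zeros(101)           # numeric placeholder suggested above
as_nan = np.full(101, np.nan)      # numeric placeholder that marks "not filled yet"

print(as_strings.dtype.kind)        # 'U' means fixed-width Unicode strings
print(as_zeros.dtype, as_nan.dtype)
```

Both numeric options support the arithmetic in the loop. np.nan has the advantage that a slot you forgot to fill stays visibly missing instead of looking like a real 0.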

Your full code with my edits:

import numpy as np
import pandas as pd

# predictions and y_test come from your own model code (not shown here)
predictions = pd.DataFrame(predictions[1])
actual = pd.DataFrame(y_test)
test = pd.concat([actual.reset_index(drop=True), predictions], axis=1)

# Rename column Renew to 'actual' and '1' to 'predictions'
test.columns = ['actual', 'predictions']  # <- You can assign column names using a list

TP = np.zeros(101)
FN = np.zeros(101)
FP = np.zeros(101)
TN = np.zeros(101)
Sensitivity = np.zeros(101)
Specificity = np.zeros(101)
AUROC = 0

# R's 0:100 includes 100, so use range(101). Python is 0-based, so the
# result for threshold i/100 goes in slot i (R's i+1 would overflow at i = 100).
for i in range(101):
    test['temp'] = 0
    test.loc[test['predictions'] > (i / 100), 'temp'] = 1
    TP[i] = len(test[(test["actual"]==1) & (test["temp"]==1)])
    FN[i] = len(test[(test["actual"]==1) & (test["temp"]==0)])
    FP[i] = len(test[(test["actual"]==0) & (test["temp"]==1)])
    TN[i] = len(test[(test["actual"]==0) & (test["temp"]==0)])
    Sensitivity[i] = TP[i] / (TP[i] + FN[i])
    Specificity[i] = TN[i] / (TN[i] + FP[i])
    if i > 0:
        # Trapezoid between the previous threshold and this one
        AUROC += 0.5 * (Specificity[i] - Specificity[i-1]) * (Sensitivity[i] + Sensitivity[i-1])
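If you want something you can run without the model objects, here is a self-contained sketch of the same threshold sweep written with NumPy broadcasting instead of a loop. The function name manual_auc and the toy inputs are my own assumptions, and it assumes both classes (0 and 1) are present so the divisions are defined:

```python
import numpy as np

def manual_auc(actual, predicted_prob, steps=100):
    """Trapezoidal AUC over thresholds 0, 1/steps, ..., 1 (hypothetical helper)."""
    actual = np.asarray(actual)
    predicted_prob = np.asarray(predicted_prob, dtype=float)
    thresholds = np.arange(steps + 1) / steps

    # predicted[i, j] is True when sample j is flagged positive at threshold i
    predicted = predicted_prob[None, :] > thresholds[:, None]
    pos = actual == 1
    neg = actual == 0

    tp = (predicted & pos).sum(axis=1)
    fn = (~predicted & pos).sum(axis=1)
    fp = (predicted & neg).sum(axis=1)
    tn = (~predicted & neg).sum(axis=1)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Same trapezoid rule as the loop: width in specificity, mean height in sensitivity
    auc = np.sum(0.5 * np.diff(specificity) * (sensitivity[1:] + sensitivity[:-1]))
    return float(auc), sensitivity, specificity

# Toy data: the two positives score higher than the two negatives
auc_perfect, sens, spec = manual_auc([0, 0, 1, 1], [0.15, 0.25, 0.75, 0.85])
# Same scores with the labels flipped
auc_inverted, _, _ = manual_auc([1, 1, 0, 0], [0.15, 0.25, 0.75, 0.85])
print(auc_perfect, auc_inverted)
```

With perfectly separated scores the sweep gives an area of 1, and flipping the labels gives 0, which is a quick sanity check before trusting it on real predictions. Because the thresholds sit on a fixed grid of 0.01, the result on real data is an approximation of the exact AUC.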

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/51449292
