
A Detailed Explanation of AP and mAP in Object Detection (with Complete Code)


Author: JimmyHua

Source: https://zhuanlan.zhihu.com/p/70667071

Reposted from: CVer

Definitions

Accuracy

✔️ Accuracy = number of correctly predicted samples / total number of samples, i.e. the fraction of samples predicted correctly (this includes both correctly predicted positives and correctly predicted negatives; in object detection there is no such thing as a correctly predicted negative, so Accuracy is not used there).

Precision

✔️ Precision measures how accurate the predictions for a particular class are.

✔️ Precision is always defined with respect to a specific class; if no class is specified, Precision is meaningless (some sources quote Precision without naming a class, because in binary classification "Precision" conventionally means the Precision of the positive class).

Recall

✔️ Like Precision, Recall is meaningless without a class: whenever Recall is mentioned, it refers to the Recall of a particular class. Recall is, for a given class, the ratio of correctly predicted instances to all Ground Truth instances of that class.

✍️ When computing Recall, the denominator is the number of Ground Truth instances of the class; when computing Precision, the denominator is the number of predicted instances of the class.
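
To make this concrete, here is a minimal sketch; the TP/FP/FN counts below are hypothetical values for a single class:

# Hypothetical counts for one class: 7 correct detections (TP),
# 3 wrong detections (FP), and 5 ground-truth boxes that were missed (FN).
tp, fp, fn = 7, 3, 5

precision = tp / (tp + fp)  # denominator = number of predicted boxes of this class
recall = tp / (tp + fn)     # denominator = number of ground-truth boxes of this class

print(precision, recall)    # 0.7 0.58...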

F1 Score

The F1 score is defined as the harmonic mean of Precision and Recall:

$F_1 = 2 \cdot \dfrac{Precision \cdot Recall}{Precision + Recall}$

More generally, the $F_\beta$ score is defined as

$F_\beta = (1 + \beta^2) \cdot \dfrac{Precision \cdot Recall}{\beta^2 \cdot Precision + Recall}$

The $F_2$ and $F_{0.5}$ scores are commonly used in statistics: in the $F_2$ score, Recall is weighted more heavily than Precision, while in the $F_{0.5}$ score it is the other way around.
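
A small sketch of these formulas in code (the precision and recall values are the hypothetical ones from the example above):

def f_beta(precision, recall, beta=1.0):
    # F-beta score; beta=1 gives the ordinary F1 (harmonic mean of precision and recall)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.7, 0.5833
print(f_beta(p, r))        # F1
print(f_beta(p, r, 2.0))   # F2: recall weighted more heavily
print(f_beta(p, r, 0.5))   # F0.5: precision weighted more heavily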

AP: Average Precision

Plotting Precision against Recall, with Recall on the x-axis and Precision on the y-axis, gives the PR curve; AP is defined as the area under this curve, i.e.:

$AP = \int_0^1 p(r)\,dr$

Since evaluating this integral directly is difficult, an interpolation scheme is used. The interpolated precision at recall level $r$ is the maximum precision achieved at any recall $\tilde{r} \ge r$ (the "precision envelope"):

$p_{\text{interp}}(r) = \max_{\tilde{r} \ge r} p(\tilde{r})$

AP is then computed as the sum of the rectangular areas under this envelope, taken over the recall values $r_i$ at which the curve changes:

$AP = \sum_i (r_{i+1} - r_i)\, p_{\text{interp}}(r_{i+1})$

Code walkthrough

computer_mAP.py

from voc_eval import voc_eval
import os

mAP = []
# Compute the AP of each class
for i in range(8):
    class_name = str(i)  # the class names here are simply 0, 1, 2, ..., 7
    rec, prec, ap = voc_eval('path/{}.txt', 'path/Annotations/{}.xml', 'path/test.txt', class_name, './')
    print("{} :\t{}".format(class_name, ap))
    mAP.append(ap)

mAP = tuple(mAP)

print("***************************")
# Print the overall mAP
print("mAP :\t{}".format(float(sum(mAP) / len(mAP))))

AP calculation

import numpy as np

def voc_ap(rec, prec, use_07_metric=False):
    """ ap = voc_ap(rec, prec, [use_07_metric])
    Compute VOC AP given precision and recall.
    If use_07_metric is true, uses the
    VOC 07 11 point method (default:False).
    """
    # The 11-point interpolation used for VOC2007; no longer the default
    if use_07_metric:
        # 11 point metric
        ap = 0.
        for t in np.arange(0., 1.1, 0.1):
            if np.sum(rec >= t) == 0:
                p = 0
            else:
                p = np.max(prec[rec >= t])
            ap = ap + p / 11.
    else:
        # correct AP calculation
        # first append sentinel values at the end
        mrec = np.concatenate(([0.], rec, [1.]))   # [0., 0.0666, 0.1333, 0.1333, 0.4, 0.4666, 1.]
        mpre = np.concatenate(([0.], prec, [0.]))  # [0., 1., 0.6666, 0.6666, 0.4285, 0.3043, 0.]

        # compute the precision envelope
        # replace each precision value by the maximum precision at any higher recall
        for i in range(mpre.size - 1, 0, -1):
            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])  # [1., 1., 0.6666, 0.6666, 0.4285, 0.3043, 0.]

        # to calculate area under PR curve, look for points
        # where X axis (recall) changes value
        i = np.where(mrec[1:] != mrec[:-1])[0]  # indices where consecutive recall values differ
        print(mrec[1:], mrec[:-1])
        print(i)  # [0, 1, 3, 4, 5]

        # AP= AP1 + AP2+ AP3+ AP4
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap

rec = np.array([0.0666, 0.1333,0.1333, 0.4, 0.4666])
prec = np.array([1., 0.6666, 0.6666, 0.4285, 0.3043])
ap = voc_ap(rec, prec)

print(ap)  # output: 0.2456
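
To trace where 0.2456 comes from: after the precision envelope is applied, mpre = [1., 1., 0.6666, 0.6666, 0.4285, 0.3043, 0.] and the recall value changes at i = [0, 1, 3, 4, 5], so the summed rectangles are

(0.0666 − 0)×1 + (0.1333 − 0.0666)×0.6666 + (0.4 − 0.1333)×0.4285 + (0.4666 − 0.4)×0.3043 + (1 − 0.4666)×0 ≈ 0.0666 + 0.0445 + 0.1143 + 0.0203 + 0 ≈ 0.2456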

voc_eval explained

1. Annotation

<annotation>
    <folder>VOC2007</folder>
    <filename>009961.jpg</filename>
    <source>
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>334575803</flickrid>
    </source>
    <owner>
        <flickrid>dictioncanary</flickrid>
        <name>Lucy</name>
    </owner>
    <size><!--image shape-->
        <width>500</width>
        <height>374</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented><!--whether segmentation labels exist-->
    <object>
        <name>dog</name> <!--class name-->
        <pose>Unspecified</pose><!--pose of the object-->
        <truncated>0</truncated><!--whether the object is partially occluded/truncated (>15%)-->
        <difficult>0</difficult><!--whether the object is hard to recognize, i.e. can only be identified from context; such objects are annotated but usually ignored in evaluation-->
        <bndbox><!--bounding box-->
            <xmin>69</xmin>
            <ymin>4</ymin>
            <xmax>392</xmax>
            <ymax>345</ymax>
        </bndbox>
    </object>
</annotation>

2. Prediction

<image id> <confidence> <left> <top> <right> <bottom>

Example

class_0.txt:
000004 0.702732 89 112 516 466
000006 0.870849 373 168 488 229
000006 0.852346 407 157 500 213
000006 0.914587 2 161 55 221
000008 0.532489 175 184 232 201
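
For illustration, this is how one of these lines gets split apart inside voc_eval below (the literal line is taken from the class_0.txt example above):

# Split one detection line into image id, confidence and box coordinates,
# exactly as voc_eval does.
line = "000004 0.702732 89 112 516 466"
fields = line.strip().split(' ')
image_id = fields[0]                   # '000004'
confidence = float(fields[1])          # 0.702732
bbox = [float(v) for v in fields[2:]]  # [89.0, 112.0, 516.0, 466.0]
print(image_id, confidence, bbox)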

3. Eval

# --------------------------------------------------------
# Fast/er R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Bharath Hariharan
# --------------------------------------------------------

import xml.etree.ElementTree as ET
import os
import pickle
import numpy as np

# Read the label data from an annotation file
def parse_rec(filename):
    """ Parse a PASCAL VOC xml file """
    tree = ET.parse(filename)
    objects = []
    for obj in tree.findall('object'):
        obj_struct = {}
        obj_struct['name'] = obj.find('name').text
        obj_struct['pose'] = obj.find('pose').text
        obj_struct['truncated'] = int(obj.find('truncated').text)
        obj_struct['difficult'] = int(obj.find('difficult').text)
        bbox = obj.find('bndbox')
        obj_struct['bbox'] = [int(bbox.find('xmin').text),
                              int(bbox.find('ymin').text),
                              int(bbox.find('xmax').text),
                              int(bbox.find('ymax').text)]
        objects.append(obj_struct)

    return objects
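
# For the annotation shown in section 1 above (saved as 009961.xml), parse_rec
# would return one dict per <object>, e.g.:
#   [{'name': 'dog', 'pose': 'Unspecified', 'truncated': 0,
#     'difficult': 0, 'bbox': [69, 4, 392, 345]}]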

# Compute AP; see the explanation earlier in this post
def voc_ap(rec, prec, use_07_metric=False):
    """ ap = voc_ap(rec, prec, [use_07_metric])
    Compute VOC AP given precision and recall.
    If use_07_metric is true, uses the
    VOC 07 11 point method (default:False).
    """
    if use_07_metric:
        # 11 point metric
        ap = 0.
        for t in np.arange(0., 1.1, 0.1):
            if np.sum(rec >= t) == 0:
                p = 0
            else:
                p = np.max(prec[rec >= t])
            ap = ap + p / 11.
    else:
        # correct AP calculation
        # first append sentinel values at the end
        mrec = np.concatenate(([0.], rec, [1.]))
        mpre = np.concatenate(([0.], prec, [0.]))

        # compute the precision envelope
        for i in range(mpre.size - 1, 0, -1):
            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])

        # to calculate area under PR curve, look for points
        # where X axis (recall) changes value
        i = np.where(mrec[1:] != mrec[:-1])[0]

        # and sum (Delta recall) * prec
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap

# Main function: read the predictions and the ground truth, compute Recall, Precision and AP
def voc_eval(detpath,
             annopath,
             imagesetfile,
             classname,
             cachedir,
             ovthresh=0.5,
             use_07_metric=False):
    """rec, prec, ap = voc_eval(detpath,
                                annopath,
                                imagesetfile,
                                classname,
                                [ovthresh],
                                [use_07_metric])
    Top level function that does the PASCAL VOC evaluation.
    detpath: Path to detections
        detpath.format(classname) is the txt file with detections for the class being evaluated.
    annopath: Path to annotations
        annopath.format(imagename) is the xml label file of an image.
    imagesetfile: text file listing the test images, one image path per line
    classname: the class to evaluate
    cachedir: directory used to cache the parsed annotations
    [ovthresh]: IoU overlap threshold (default = 0.5)
    [use_07_metric]: whether to use the VOC07 11-point AP computation (default False)
    """
    # assumes detections are in detpath.format(classname)
    # assumes annotations are in annopath.format(imagename)
    # assumes imagesetfile is a text file with each line an image name
    # cachedir caches the annotations in a pickle file

    # first load the ground truth
    if not os.path.isdir(cachedir):
        os.mkdir(cachedir)
    cachefile = os.path.join(cachedir, 'annots.pkl')
    # read list of images
    with open(imagesetfile, 'r') as f:
        lines = f.readlines()
    # all image names, without path and extension
    imagenames = [os.path.basename(x.strip()).split('.jpg')[0] for x in lines]

    # if the cache file does not exist yet, parse the annotations and write it
    if not os.path.isfile(cachefile):
        # load annots
        recs = {}
        for i, imagename in enumerate(imagenames):
            recs[imagename] = parse_rec(annopath.format(imagename))
            if i % 100 == 0:  # progress report
                print('Reading annotation for {:d}/{:d}'.format(
                    i + 1, len(imagenames)))
        # save
        print('Saving cached annotations to {:s}'.format(cachefile))
        with open(cachefile, 'wb') as f:
            # pickle a dict mapping each xml file name to its list of parsed objects
            pickle.dump(recs, f)
    else:
        # load
        with open(cachefile, 'rb') as f:
            recs = pickle.load(f)

    # for every image, extract the ground-truth boxes of the class being evaluated
    class_recs = {}  # holds the Ground Truth data
    npos = 0
    for imagename in imagenames:
        # Ground Truth objects of the given class in this image
        R = [obj for obj in recs[imagename] if obj['name'] == classname]

        bbox = np.array([x['bbox'] for x in R])
        # 'difficult' is almost always 0/False
        difficult = np.array([x['difficult'] for x in R]).astype(bool)
        det = [False] * len(R)  # list of len(R) False flags: has this gt box been matched yet?
        npos = npos + sum(~difficult)  # ~difficult negates the flags; counts non-difficult gt boxes

        # record the Ground Truth content for this image
        class_recs[imagename] = {'bbox': bbox,
                                 'difficult': difficult,
                                 'det': det}

    # read the detections of this class
    detfile = detpath.format(classname)

    with open(detfile, 'r') as f:
        lines = f.readlines()

    splitlines = [x.strip().split(' ') for x in lines]
    image_ids = [x[0].split('.')[0] for x in splitlines]  # image IDs

    confidence = np.array([float(x[1]) for x in splitlines])  # detection confidence scores
    BB = np.array([[float(z) for z in x[2:]] for x in splitlines])  # bounding box values

    # indices that sort the confidences in descending order
    sorted_ind = np.argsort(-confidence)
    sorted_scores = np.sort(-confidence)
    BB = BB[sorted_ind, :]  # reorder the boxes from highest to lowest confidence
    image_ids = [image_ids[x] for x in sorted_ind]  # reorder the image IDs accordingly

    # go down dets and mark TPs and FPs
    nd = len(image_ids) 

    tp = np.zeros(nd)
    fp = np.zeros(nd)
    for d in range(nd):
        R = class_recs[image_ids[d]]  # ground truth for this detection's image

        """
        1. If the predictions are already (x_min, y_min, x_max, y_max), the
           top/left/bottom/right conversion below is not needed.
        2. If the predictions are (x_center, y_center, w, h), the conversion is required.
        3. The overlap computation must use [left, top, right, bottom], matching the
           label's [x_min, y_min, x_max, y_max].
        """
        bb = BB[d, :].astype(float)

        # convert to (x_min, y_min, x_max, y_max)
        top = int(bb[1]-bb[3]/2)
        left = int(bb[0]-bb[2]/2)
        bottom = int(bb[1]+bb[3]/2)
        right = int(bb[0]+bb[2]/2)
        bb = [left, top, right, bottom]

        ovmax = -np.inf  # best overlap found so far, initialized to negative infinity
        BBGT = R['bbox'].astype(float)

        if BBGT.size > 0:
            # compute overlaps
            # intersection
            ixmin = np.maximum(BBGT[:, 0], bb[0])
            iymin = np.maximum(BBGT[:, 1], bb[1])
            ixmax = np.minimum(BBGT[:, 2], bb[2])
            iymax = np.minimum(BBGT[:, 3], bb[3])
            iw = np.maximum(ixmax - ixmin + 1., 0.)
            ih = np.maximum(iymax - iymin + 1., 0.)
            inters = iw * ih

            # union
            uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +
                   (BBGT[:, 2] - BBGT[:, 0] + 1.) *
                   (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)

            overlaps = inters / uni
            ovmax = np.max(overlaps)    # largest overlap
            jmax = np.argmax(overlaps)  # index of the gt box with the largest overlap
        # count TPs and FPs
        if ovmax > ovthresh:
            if not R['difficult'][jmax]:
                # once a gt box is marked as detected, any further detection that also
                # overlaps it above the threshold is not counted as another correct detection
                if not R['det'][jmax]:
                    tp[d] = 1.
                    R['det'][jmax] = 1  # mark this gt box as already detected
                else:
                    fp[d] = 1.
        else:
            fp[d] = 1.
        print("**************")

    # compute precision recall
    fp = np.cumsum(fp)  # np.cumsum() accumulates element by element
    tp = np.cumsum(tp)
    rec = tp / float(npos)

    # avoid divide by zero in case the first detection matches a difficult
    # ground truth
    # np.finfo(np.float64).eps is a tiny positive number, used to avoid dividing by zero
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps) 
    ap = voc_ap(rec, prec, use_07_metric)

    return rec, prec, ap
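
As a side note, the intersection/union arithmetic inside voc_eval can be pulled out into a small standalone helper. The sketch below (voc_iou is a name chosen here, not part of the original script) reproduces the same math, including the +1 terms that follow the VOC convention of treating pixel coordinates as inclusive:

import numpy as np

def voc_iou(bb, BBGT):
    """bb: one box [xmin, ymin, xmax, ymax]; BBGT: (N, 4) array of ground-truth boxes."""
    ixmin = np.maximum(BBGT[:, 0], bb[0])
    iymin = np.maximum(BBGT[:, 1], bb[1])
    ixmax = np.minimum(BBGT[:, 2], bb[2])
    iymax = np.minimum(BBGT[:, 3], bb[3])
    iw = np.maximum(ixmax - ixmin + 1., 0.)  # intersection width (0 if the boxes do not overlap)
    ih = np.maximum(iymax - iymin + 1., 0.)  # intersection height
    inters = iw * ih
    uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.)
           + (BBGT[:, 2] - BBGT[:, 0] + 1.) * (BBGT[:, 3] - BBGT[:, 1] + 1.)
           - inters)
    return inters / uni

# e.g. the dog box from the annotation above vs. the first prediction in class_0.txt
print(voc_iou([89., 112., 516., 466.], np.array([[69., 4., 392., 345.]])))  # ~[0.371]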

References

Evaluation function eval.py: https://www.cnblogs.com/JZ-Ser/articles/7846399.html

voc_eval.py explained: https://blog.csdn.net/shawncheer/article/details/78317711
