[Object Detection] YOLOv5: Displaying Chinese Labels / Custom Colors

zstar

Published 2022-09-21 08:15:05

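Preface

This post shows how to convert the labels output by YOLOv5 to Chinese and how to assign custom label colors. I am using YOLOv5 version 5.0.

Source Code Logic Analysis

In detect.py, these two lines set the label names and colors.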

# Get names and colors
names = model.module.names if hasattr(model, 'module') else model.names
colors = [[random.randint(0, 255) for _ in range(3)] for _ in names]

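So the class names are not supplied when detection runs; they are embedded in the saved model weights file.

Create a new file, load_model.py, that loads the trained model: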

import torch
ckpt1 = torch.load('runs/train/exp21/weights/best.pt')
print("Done")

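Running this under a debugger with a breakpoint shows that the class names are stored inside the model.

As for the colors, the program generates three random RGB values per class on every run, so they are not stable from one run to the next.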
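One way to make the colors repeatable without picking them by hand is to draw them from a seeded random generator. The sketch below is only an illustration and is not part of the original detect.py: the helper name `make_colors` and the seed value are assumptions, and the class names are borrowed from the example later in this post.

```python
import random


def make_colors(names, seed=0):
    """Return one three-value color per class name, identical on every run."""
    rng = random.Random(seed)  # private generator, so the global random state is untouched
    return [[rng.randint(0, 255) for _ in range(3)] for _ in names]


names = ['truck', 'panzer', 'tank', 'SUV', 'cam_net', 'cam_tar']
colors = make_colors(names)
```

Seeding gives stable colors but no control over which color a class gets, which is why the rest of this post uses a hand-picked table instead.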

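Approach Analysis

With the loading logic above in mind, there are two main ways to display Chinese labels.

Approach 1

Change names in data.yaml to Chinese directly. Be aware that open() does not necessarily default to UTF-8 (it uses the platform's locale encoding), so the encoding used to read the file has to be set explicitly. In train.py, change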

with open(opt.data) as f:

to

with open(opt.data, encoding='UTF-8') as f:

In test.py, change

with open(data) as f:

to

with open(data, encoding='UTF-8') as f:

This approach means the model has to be retrained, and some minor issues still remain afterwards.
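The encoding issue can be reproduced without YOLOv5. The sketch below writes a small file containing Chinese names as UTF-8 and reads it back with an explicit encoding. The file name and its contents are made up for illustration. Without the `encoding` argument, `open()` falls back to the platform's locale encoding, which on a Chinese-language Windows system is typically GBK rather than UTF-8.

```python
import os
import tempfile

# Hypothetical stand-in for data.yaml; only the names line matters here.
content = "names: ['卡车', '坦克']\n"

path = os.path.join(tempfile.mkdtemp(), 'data_demo.yaml')
with open(path, 'w', encoding='UTF-8') as f:
    f.write(content)

# Passing encoding explicitly makes the read independent of the system locale.
with open(path, encoding='UTF-8') as f:
    loaded = f.read()

print(loaded == content)
```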

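Approach 2

Convert the text at the moment the label is rendered. OpenCV's text drawing does not support Chinese by default, so the following steps are needed:

Convert the image from OpenCV format to PIL format;
Draw the text with PIL;
Convert the PIL image back to OpenCV format.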

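Implementation

I go with Approach 2.

Download the Font

First, download a font that supports Chinese. I use SimHei; download link: http://www.font5.com.cn/ziti_xiazai.php?id=151&part=1237887120&address=0

The code later in this post loads the font from Font/simhei.ttf, so save it under that path relative to the directory detect.py is run from.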

Confusion Matrix Font Changes

Near the top of utils/metrics.py, after the existing imports, add:

plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

Then change this line

sn.set(font_scale=1.0 if self.nc < 50 else 0.8)  # for label size

to

sn.set(font='SimHei', font_scale=1.0 if self.nc < 50 else 0.8)  # for label size

Chinese Labels / Color Changes

In the Write results section of detect.py, modify the code as follows:

# Write results
for *xyxy, conf, cls in reversed(det):
    if save_txt:  # Write to file
        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
        line = (cls, *xywh, conf) if opt.save_conf else (cls, *xywh)  # label format
        with open(txt_path + '.txt', 'a') as f:
            f.write(('%g ' * len(line)).rstrip() % line + '\n')

    if save_img or view_img:  # Add bbox to image
        # label = f'{names[int(cls)]} {conf:.2f}'
        # label = None  # uncomment to hide labels
        # plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)

        # English label; plot_one_box uses it to size the filled label background
        label = '%s %.2f' % (names[int(cls)], conf)
        # Fixed colors, one per class
        color_dict = {'1': [0, 131, 252], '2': [190, 90, 92], '3': [142, 154, 78], '4': [2, 76, 82], '5': [119, 80, 5], '6': [189, 163, 234]}
        # Chinese output ('类别N' means 'Class N'; substitute your own Chinese labels)
        if names[int(cls)] == 'truck':
            ch_text = '%s %.2f' % ('类别1', conf)
            color_single = color_dict['1']
        elif names[int(cls)] == 'panzer':
            ch_text = '%s %.2f' % ('类别2', conf)
            color_single = color_dict['2']
        elif names[int(cls)] == 'tank':
            ch_text = '%s %.2f' % ('类别3', conf)
            color_single = color_dict['3']
        elif names[int(cls)] == 'SUV':
            ch_text = '%s %.2f' % ('类别4', conf)
            color_single = color_dict['4']
        elif names[int(cls)] == 'cam_net':
            ch_text = '%s %.2f' % ('类别5', conf)
            color_single = color_dict['5']
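        elif names[int(cls)] == 'cam_tar':
            ch_text = '%s %.2f' % ('类别6', conf)
            color_single = color_dict['6']
        else:
            # Fallback so ch_text and color_single are always defined for unmapped classes
            ch_text = label
            color_single = colors[int(cls)]

        im0 = plot_one_box(xyxy, im0, label=label, ch_text=ch_text, color=color_single, line_thickness=3)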
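The if/elif chain above can also be written as a lookup table, which keeps each class's Chinese label and color in one place. This is a sketch of an alternative, not the code used in this post: the names `CLASS_STYLE` and `style_for` are hypothetical, and the gray fallback color for unknown classes is an arbitrary choice.

```python
# English class name -> (Chinese label, color); values copied from the snippet above.
CLASS_STYLE = {
    'truck': ('类别1', [0, 131, 252]),
    'panzer': ('类别2', [190, 90, 92]),
    'tank': ('类别3', [142, 154, 78]),
    'SUV': ('类别4', [2, 76, 82]),
    'cam_net': ('类别5', [119, 80, 5]),
    'cam_tar': ('类别6', [189, 163, 234]),
}


def style_for(name, conf, default_color=(128, 128, 128)):
    """Return (text, color) for one detection, falling back to the raw class name."""
    ch_name, color = CLASS_STYLE.get(name, (name, list(default_color)))
    return '%s %.2f' % (ch_name, conf), color


ch_text, color_single = style_for('tank', 0.87)
```

In detect.py, the two returned values would then be passed to plot_one_box as ch_text and color.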

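I picked the colors from a palette I put together myself.

Next, import the following in utils/plots.py:

from PIL import Image, ImageDraw, ImageFont

Add the helper cv2ImgAddText and modify the plot_one_box function: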

def cv2ImgAddText(img, text, left, top, textColor=(0, 255, 0), textSize=25):
    # Convert the image from OpenCV format (BGR ndarray) to PIL format (RGB)
    if isinstance(img, np.ndarray):  # check whether this is an OpenCV image
        img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(img)
    # Font path is relative to the current working directory
    fontText = ImageFont.truetype("Font/simhei.ttf", textSize, encoding="utf-8")
    draw.text((left, top - 2), text, textColor, font=fontText)
    # Convert back to OpenCV format
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)



def plot_one_box(x, img, color=None, label=None, ch_text=None, line_thickness=None):
    # Plots one bounding box on image img and returns the resulting image
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1]
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        # cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
        # Draw with PIL instead of cv2.putText; fall back to label if no Chinese text was passed
        img = cv2ImgAddText(img, ch_text or label, c1[0], c2[1], (255, 255, 255), 25)
    return img  # defined on every path, including when label is empty