A Step-by-Step Guide to Training and Testing YOLOv8 on a Custom Dataset with DAMODEL (丹摩智算)

AI浩

Published 2024-10-22 12:41:58

Included in the column: AI智韵

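Summary

DAMODEL (丹摩智算) is a compute cloud built for AI. It provides compute resources and infrastructure for developing, training, and deploying AI applications.

Official site: https://www.damodel.com/console/overview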

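Platform advantages

💡 Easy to use

Instances come with 124 GB of RAM and a 100 GB system disk, deploy in one click, and start in about three seconds.

💡 Plenty of resources

The GPU lineup runs from entry-level to professional cards, so it covers both beginner projects and advanced workloads.

💡 Strong performance

The platform runs its own IDC with brand-new GPUs, giving every developer access to top-tier compute and dedicated support.

💡 Affordable

Compute is priced low, new accounts receive a coupon on registration, and there are regular community promotions.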

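Supported GPUs

Besides the common RTX 4090, the platform offers high-end GPUs such as the H800 PCIe and H800 SXM, which you cannot buy in China.

GPU	VRAM (GB)	RAM (GB per GPU)	CPU cores per GPU	Storage	Description
RTX 4090	24	60	11	100 GB system disk + 50 GB data disk	Cost-effective; recommended for beginners; suited to model inference
RTX 4090	24	124	15	100 GB system disk + 50 GB data disk	Cost-effective; recommended for beginners and professional users; suited to model inference
H800 SXM	80	252	27	100 GB system disk + 50 GB data disk	Top-tier; recommended for professional users; suited to model training and inference
H800 PCIe	80	124	21	100 GB system disk + 50 GB data disk	Top-tier; recommended for professional users; suited to model training and inference
L40S	48	124	21	100 GB system disk + 50 GB data disk	Professional-grade; recommended for professional users; suited to model training and inference
P40	24	12	6	100 GB system disk + 50 GB data disk	Cost-effective; recommended for beginners; suited to model inference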

Preparing the dataset

Labelme dataset

I use a dataset that I annotated myself some time ago. Download link: https://download.csdn.net/download/hhhhhhhhhhwwwwwwwwww/63242994

The classes are: ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2', 'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10', 'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4', 'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33', 'an-70', 'su-24', 'tu-22', 'il-76']
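The conversion script further down looks up each annotation's label in this class list, so a label that is missing from the list stops the conversion with a ValueError. A minimal sketch of a pre-check is shown here; the helper names count_labels and unknown_labels are hypothetical, and it assumes the usual Labelme JSON layout with a "shapes" list whose items carry a "label" field.

```python
import json
from collections import Counter
from pathlib import Path


def count_labels(labelme_dir):
    """Count how often each lower-cased label occurs in the Labelme JSON files."""
    counts = Counter()
    for json_path in Path(labelme_dir).glob("*.json"):
        with open(json_path, "r", encoding="utf-8") as f:
            annotation = json.load(f)
        # Assumption: every annotation keeps its objects under "shapes".
        for shape in annotation.get("shapes", []):
            counts[shape["label"].lower()] += 1
    return counts


def unknown_labels(counts, classes):
    """Return labels that occur in the annotations but not in the class list."""
    return sorted(set(counts) - set(classes))
```

Calling unknown_labels(count_labels("USA-Labelme/"), classes) before converting should return an empty list; anything else points at a typo in an annotation or a class missing from the list.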

Format conversion

Convert the Labelme dataset to the YOLOv8 format. The conversion code is as follows:

import json
import os
import shutil
from glob import glob

import cv2
import numpy as np
from sklearn.model_selection import train_test_split


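def convert(size, box):
    """Convert a pixel box to normalized YOLO (x_center, y_center, w, h).

    size is (width, height); box is (xmin, xmax, ymin, ymax).
    """
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    # Labelme coordinates are 0-based, so no "- 1" offset (a habit from
    # 1-based VOC annotations) is applied to the box centre.
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)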


def change_2_yolo5(files, txt_Name):
    """Write a YOLO-format .txt file next to each Labelme JSON file.

    txt_Name (the split name) is kept for compatibility but is not used.
    Returns the image file names; every image is assumed to be a .jpg.
    """
    imag_name = []
    for json_file_ in files:
        json_filename = labelme_path + json_file_ + ".json"
        with open(json_filename, "r", encoding="utf-8") as f:
            json_file = json.load(f)
        imag_name.append(json_file_ + '.jpg')
        height, width, channels = cv2.imread(labelme_path + json_file_ + ".jpg").shape
        with open('%s/%s.txt' % (labelme_path, json_file_), 'w') as out_file:
            for multi in json_file["shapes"]:
                points = np.array(multi["points"])
                # Bounding box of the shape; negative coordinates are clamped to 0.
                xmin = max(points[:, 0].min(), 0)
                xmax = max(points[:, 0].max(), 0)
                ymin = max(points[:, 1].min(), 0)
                ymax = max(points[:, 1].max(), 0)
                label = multi["label"].lower()
                # Skip degenerate boxes with zero width or height.
                if xmax <= xmin or ymax <= ymin:
                    continue
                cls_id = classes.index(label)
                b = (float(xmin), float(xmax), float(ymin), float(ymax))
                bb = convert((width, height), b)
                out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    return imag_name

def image_txt_copy(files, scr_path, dst_img_path, dst_txt_path):
    """
    :param files: list of image file names
    :param scr_path: directory that holds the images and their .txt labels
    :param dst_img_path: directory the images are copied to
    :param dst_txt_path: directory the matching .txt labels are copied to
    :return:
    """
    for file in files:
        img_path = scr_path + file
        print(file)
        shutil.copy(img_path, dst_img_path + file)
        # splitext keeps file names that contain extra dots intact.
        txt_name = os.path.splitext(file)[0] + '.txt'
        shutil.copy(scr_path + txt_name, dst_txt_path + txt_name)


if __name__ == '__main__':
    classes = ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
               'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
               'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
               'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
               'an-70', 'su-24', 'tu-22', 'il-76']

    # 1. Path to the Labelme annotations and images
    labelme_path = "USA-Labelme/"
    isUseTest = True  # whether to create a test set (not consulted below; a test split is always made)
    # 2. Collect the files to process (base names without the .json suffix)
    files = glob(labelme_path + "*.json")

    files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
    for i in files:
        print(i)
    # 3. Split: 10% test, then 10% of the remainder for validation
    trainval_files, test_files = train_test_split(files, test_size=0.1, random_state=55)
    train_files, val_files = train_test_split(trainval_files, test_size=0.1, random_state=55)
    train_name_list = change_2_yolo5(train_files, "train")
    print(train_name_list)
    val_name_list = change_2_yolo5(val_files, "val")
    test_name_list = change_2_yolo5(test_files, "test")
    # 4. Create the dataset folders
    file_List = ["train", "val", "test"]
    for file in file_List:
        os.makedirs('./VOC/images/%s' % file, exist_ok=True)
        os.makedirs('./VOC/labels/%s' % file, exist_ok=True)
    # 5. Copy images and labels into the matching split folders
    image_txt_copy(train_name_list, labelme_path, './VOC/images/train/', './VOC/labels/train/')
    image_txt_copy(val_name_list, labelme_path, './VOC/images/val/', './VOC/labels/val/')
    image_txt_copy(test_name_list, labelme_path, './VOC/images/test/', './VOC/labels/test/')

When the script finishes, the YOLOv8-format dataset is in the ./VOC folder.
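Each line of a generated label file has the form "class_id x_center y_center width height", with the four box values normalized by the image size. For example, a 200×100 pixel box centred at (200, 100) in a 1000×500 image becomes "cls 0.2 0.2 0.2 0.2". The sketch below is one way to spot-check the generated files; parse_yolo_line and find_bad_lines are hypothetical helper names, not part of Ultralytics.

```python
def parse_yolo_line(line, num_classes):
    """Parse one 'cls x y w h' label line and check its value ranges."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    cls_id = int(parts[0])
    x, y, w, h = (float(v) for v in parts[1:])
    if not 0 <= cls_id < num_classes:
        raise ValueError(f"class id {cls_id} is outside 0..{num_classes - 1}")
    for name, value in (("x", x), ("y", y), ("w", w), ("h", h)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is not normalized to [0, 1]")
    return cls_id, x, y, w, h


def find_bad_lines(label_path, num_classes):
    """Return (line_number, message) pairs for every invalid line in a label file."""
    problems = []
    with open(label_path, "r", encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                parse_yolo_line(line, num_classes)
            except ValueError as err:
                problems.append((number, str(err)))
    return problems
```

With the 32 classes used here, find_bad_lines("VOC/labels/train/some_image.txt", 32) should return an empty list for a correctly converted file (the file name is a placeholder).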

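Local debugging

Download YOLOv8 from the official repository on GitHub: ultralytics/ultralytics (NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite), or simply run pip install ultralytics. If you plan to modify the model or build your own variants, installing with pip is not recommended; work from the downloaded source instead.

After downloading, unzip the archive and put the generated YOLO dataset under a datasets folder (you need to create this folder yourself), as shown below:

Install the required libraries: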

pip install opencv-python
pip install numpy==1.23.5
pip install pyyaml
pip install tqdm
pip install matplotlib

Install only the packages you are missing. Pay attention to the numpy version: if you have 2.0 or later installed, be sure to downgrade it.
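A quick way to see whether the installed numpy needs downgrading is to look at its major version. This is a minimal sketch; numpy_is_compatible is a hypothetical helper that only encodes the advice above (major version below 2).

```python
def numpy_is_compatible(version):
    """Return True when the version string has a major version below 2."""
    major = int(version.split(".")[0])
    return major < 2


try:
    import numpy
    status = "OK" if numpy_is_compatible(numpy.__version__) else "downgrade needed"
    print(f"numpy {numpy.__version__}: {status}")
except ImportError:
    print("numpy is not installed")
```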

Next, create a VOC.yaml file in the project root, as shown below:

Add the following content:

train: ./VOC/images/train # train images
val: ./VOC/images/val # val images
test: ./VOC/images/test # test images (optional)

names: ['c17', 'c5', 'helicopter', 'c130', 'f16', 'b2',
    'other', 'b52', 'kc10', 'command', 'f15', 'kc135', 'a10',
    'b1', 'aew', 'f22', 'p3', 'p8', 'f35', 'f18', 'v22', 'f4',
    'globalhawk', 'u2', 'su-27', 'il-38', 'tu-134', 'su-33',
    'an-70', 'su-24', 'tu-22', 'il-76']
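The three paths in VOC.yaml point at the images folders; the labels are expected in the parallel labels folders created by the conversion script. Before training, it can be worth confirming that every image has a label file. The following is a sketch under that assumption; images_without_labels and the suffix set are illustrative choices, not Ultralytics code.

```python
from pathlib import Path

# Assumption: these are the image types that may appear in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def images_without_labels(dataset_root, split):
    """List images in <root>/images/<split> that lack <root>/labels/<split>/<stem>.txt."""
    root = Path(dataset_root)
    image_dir = root / "images" / split
    label_dir = root / "labels" / split
    missing = []
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not an image
        if not (label_dir / (image_path.stem + ".txt")).exists():
            missing.append(image_path.name)
    return missing
```

For the layout used in this article, images_without_labels("datasets/VOC", "train") should return an empty list, and likewise for "val" and "test".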

Then create train.py, as shown below:

Add the following code to train.py:

from ultralytics import YOLO
if __name__ == '__main__':
    # Load the model
    model = YOLO("ultralytics/cfg/models/v8/yolov8l.yaml")  # build a new model from the YAML config (trained from scratch)
    print(model.model)

    # Use the model
    results = model.train(data="VOC.yaml", epochs=100, device='0', batch=16, workers=0)  # train the model

Now training can start: click Run to execute train.py.

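Training on DAMODEL

Create an account and log in on the official site to reach the main page.

Click GPU Cloud Instances, then click Create Instance to open the instance creation page.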

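Billing type: choose pay-as-you-go, daily, monthly, and so on, according to your needs.

Instance configuration: filter the list of configurations by number of GPUs, number of CPU cores, and similar properties.

After choosing a configuration, set a data disk of suitable capacity.

The Selected Configuration panel shows the details of the current configuration.

Next, choose an image. The mainstream frameworks are all supported; select PyTorch to see the available PyTorch images.

Click Create Key Pair; in the dialog that opens, create a new key or import an existing public key.

Click Create Now to create the instance.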

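I created a P40 instance because the 4090s were all taken. The instance is ready after a short wait.

Once it has been created, click JupyterLab to open the console.

Compress the project we just set up into a zip archive, ready for upload.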
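Any archive tool works for this step. As one option, the sketch below builds the zip in Python and leaves out folders that do not need to be uploaded; zip_project and the SKIP_DIRS set are hypothetical, and the skipped names (earlier training output, caches, git metadata) are an assumption about what the project folder contains.

```python
import zipfile
from pathlib import Path

# Assumption: these folders are not needed on the remote instance.
SKIP_DIRS = {"runs", "__pycache__", ".git"}


def zip_project(project_dir, zip_path):
    """Zip project_dir into zip_path, keeping the project folder as the top-level entry."""
    project_dir = Path(project_dir).resolve()
    zip_path = Path(zip_path).resolve()
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(project_dir.rglob("*")):
            rel = path.relative_to(project_dir)
            # Skip directories themselves and anything with a skipped name in its path.
            if path.is_dir() or SKIP_DIRS.intersection(rel.parts):
                continue
            if path == zip_path:
                continue  # never add the archive to itself
            archive.write(path, (Path(project_dir.name) / rel).as_posix())
    return zip_path
```

For example, zip_project("ultralytics-main", "ultralytics-main.zip") produces an archive that unpacks into an ultralytics-main folder.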

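Click the folder-shaped tab to go to the root directory, then click the ↑ button to open the file upload dialog.

Select the file and click Open.

When the upload finishes, click Terminal to get a familiar command-line interface.

Type ls and you will see the archive we just uploaded.

Then enter: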

unzip ultralytics-main.zip

This unpacks the archive, as shown below:

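After unpacking, the extracted folder appears in the file browser on the left. Click to enter it.

Click train.py and choose Open With → Editor.

With train.py open, you can adjust its parameters.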
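One parameter worth checking on a fresh instance is device: train.py passes device='0', which assumes a CUDA GPU at index 0 is visible. A small sketch of a fallback is given here; pick_device is a hypothetical helper, and it relies only on torch.cuda.is_available() from PyTorch.

```python
def pick_device():
    """Return '0' when PyTorch can see a CUDA GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query, so fall back to CPU.
        return "cpu"
    return "0" if torch.cuda.is_available() else "cpu"


print("device:", pick_device())
```

The returned value can be passed as the device argument of model.train in train.py in place of the hard-coded '0'.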

Install the libraries that YOLOv8 needs to run:

pip install opencv-python

If you hit an error such as ImportError: libGL.so.1: cannot open shared object file: No such file or direc, install the headless build:

pip install opencv-python-headless

pip install pyyaml
pip install tqdm
pip install matplotlib
pip install pandas

If some packages fail to download, try switching the package index:

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

Then run python train.py to start training.

Testing

The code for test.py is as follows:

from ultralytics import YOLO


if __name__ == '__main__':
    # Load a model
    # model = YOLO('yolov8m.pt')  # load an official model
    model = YOLO('runs/detect/train/weights/best.pt')  # load a custom model
    results = model.predict(source="ultralytics/assets", device='0')  # predict on every image in the folder
    print(results)
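Printing results dumps every result object in full, which is hard to read. As a rough sketch, the detections can be tallied per class instead; count_detections is a hypothetical helper, and it assumes each result exposes boxes.cls (class ids) and names (id-to-name mapping) as Ultralytics detection results do.

```python
from collections import Counter


def count_detections(results):
    """Count detected objects per class name across a list of detection results."""
    counts = Counter()
    for result in results:
        # boxes.cls holds one class id per detected box.
        for cls_id in result.boxes.cls.tolist():
            counts[result.names[int(cls_id)]] += 1
    return counts
```

In test.py this would be called as print(count_detections(results)) after model.predict.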