我正在尝试detectron2,并想训练样本模型。
当运行下面的代码时,我会得到(<class 'RuntimeError'>, RuntimeError('No CUDA GPUs are available'), <traceback object at 0x7f42b094ebc0>)
。代码如下:
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
# import some common libraries
import matplotlib.pyplot as plt
import numpy as np
import cv2
# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.data.datasets import register_coco_instances
import random
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
import os
# To verify the data loading is correct, let's visualize the annotations of randomly selected samples in the training set:
register_coco_instances("fruits_nuts", {}, "../data/trainval.json", "../data/images")
fruits_nuts_metadata = MetadataCatalog.get("fruits_nuts")
dataset_dicts = DatasetCatalog.get("fruits_nuts")
'''
for d in random.sample(dataset_dicts, 3):
img = cv2.imread(d["file_name"])
visualizer = Visualizer(img[:, :, ::-1], metadata=fruits_nuts_metadata, scale=0.5)
vis = visualizer.draw_dataset_dict(d)
cv2.imshow('new', vis.get_image()[:, :, ::-1])
cv2.waitKey(0)
'''
# train model
cfg = get_cfg()
cfg.merge_from_file("../detectron2_repo/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("fruits_nuts",)
cfg.DATASETS.TEST = () # no metrics implemented for this dataset
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl" # initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.02
cfg.SOLVER.MAX_ITER = 300 # 300 iterations seems good enough, but you can certainly train longer
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128 # faster, and good enough for this toy dataset
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3 # 3 classes (data, fig, hazelnut)
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
我运行了火炬的脚本collect_env.py
:
/home/project/.venv/bin/python /home/project/src/collect_env.py
Collecting environment information...
PyTorch version: 1.10.2+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.8.10 (default, Nov 26 2021, 20:14:08) [GCC 9.3.0] (64-bit runtime)
Python platform: Linux-5.13.0-27-generic-x86_64-with-glibc2.29
Is CUDA available: False
CUDA runtime version: 10.1.243
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.1
[pip3] torch==1.10.2
[pip3] torchvision==0.11.3
[conda] Could not collect
Process finished with exit code 0
我在系统上有一个RTX3080显卡。然而,在我看来,它似乎找不到。
有什么建议吗?
有没有办法不用数据自动化系统来进行培训?
谢谢你的回复!
发布于 2022-04-12 01:07:44
我不确定这对你有用吗。但让我们从Windows用户的角度来看。我在Windows 10上使用Detectron2,启用了RTX3060膝上型计算机GPU CUDA。
你首先要检查的是数据自动化系统。您可以使用以下命令进行检查:
nvcc -V
应该显示这一信息:
C:\Users\User>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:41:42_Pacific_Daylight_Time_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0
要检查您的Pytorch是否在启用CUDA的情况下安装,请使用以下命令(来自其网站的引用):
import torch
torch.cuda.is_available()
正如您在这个问题中共享的系统信息一样,您还没有在您的系统上安装CUDA。而且您的系统没有检测到系统上任何可用的GPU (驱动程序)。据我所知,他们建议通过(Nvidia) GPU安装Pytorch 库达来运行Detectron2。(有关更多细节,您可以查看Pytorch网站和Detectron2 GitHub回购 )。
或,您可以使用以下选项:
将这一行代码添加到您的python程序中(作为此issues#300的参考):
cfg.MODEL.DEVICE = "cpu"
希望能帮上忙。干杯。
https://stackoverflow.com/questions/70910160
复制相似问题