Source code: https://github.com/matterport/Mask_RCNN Author's homepage: http://www.yansongsong.cn/
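Mask R-CNN is a convolutional network proposed by Kaiming He that builds on the earlier Faster R-CNN architecture and handles object instance segmentation in a single pass. The method detects objects efficiently while also producing a high-quality segmentation mask for each instance. The core idea of the paper is to extend Faster R-CNN with an extra branch that predicts an object mask in parallel with the existing detection branch.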
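About this open-source code: it is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It is based on a Feature Pyramid Network (FPN) and a ResNet101 backbone.
The repository includes:
The code is documented and designed to be easy to extend. If you use it in your research, please consider citing this repository (bibtex below). If you work on 3D vision, you might find our recently released Matterport3D dataset useful as well. This dataset was created from 3D-reconstructed spaces captured by our customers, who agreed to make them publicly available for academic use. You can see more examples here.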
First, download the source code from the project repository to your machine: https://github.com/matterport/Mask_RCNN
Python 3.4, TensorFlow 1.3, Keras 2.0.8, and the other common packages listed in requirements.txt.
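To train or test on MS COCO, you'll also need:
If you use Docker, the code has been verified to work on this Docker container.
Why is pycocotools needed? Reading the source shows that training on the COCO dataset uses the pycocotools module; without it, the code raises an error and will not run.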
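pycocotools
To train or test on MS COCO, install pycocotools from one of these repos. (This is the MS COCO requirement mentioned above: pycocotools must be installed.)
Once all of the steps above are done, the Keras version of Mask R-CNN is installed. Let's try it out.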
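Before moving on, it can help to confirm that the packages named above are importable in the active environment. The following is a minimal sketch, not part of the repository: `installed_versions` is a hypothetical helper, and it assumes each package exposes a `__version__` attribute (those that do not are reported as "unknown").

```python
import importlib


def installed_versions(module_names):
    """Map each module name to its __version__, "unknown", or None if the import fails."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None  # not installed in this environment
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # Module names taken from the requirements discussed above.
    for name, version in installed_versions(["tensorflow", "keras", "pycocotools"]).items():
        print("{}: {}".format(name, version if version is not None else "NOT INSTALLED"))
```

Any line reporting NOT INSTALLED points at an installation step that still needs attention.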
Using the Python environment in which Mask R-CNN was installed, start Jupyter Notebook from a command line or shell:
jupyter notebook
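To set Jupyter Notebook's default directory so that the project is easier to open, see this blog post: https://www.cnblogs.com/awakenedy/p/9075712.html
Once it is running, a web page opens automatically; if it does not, copy the address from the terminal and open it manually.
Go to the root directory of the downloaded Mask R-CNN repository and open samples/demo.ipynb.
The code is as follows: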
Mask R-CNN Demo
A quick intro to using the pre-trained model to detect and segment objects.
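In [1]: Import the required modules, set the parameters, and download the network weights. Because the download is slow, it is best to download https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5 into the repository root yourself before running the code below.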
import os
import sys
import random
import math
import numpy as np
import skimage.io
import matplotlib
import matplotlib.pyplot as plt
# Root directory of the project
ROOT_DIR = os.path.abspath("../")
# Import Mask RCNN
sys.path.append(ROOT_DIR) # To find local version of the library
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize
# Import COCO config
sys.path.append(os.path.join(ROOT_DIR, "samples/coco/")) # To find local version
import coco
%matplotlib inline
# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")
# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)
# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "images")
Using TensorFlow backend.
We'll be using a model trained on the MS-COCO dataset. The configurations of this model are in the CocoConfig class in coco.py.
For inferencing, modify the configurations a bit to fit the task. To do so, sub-class the CocoConfig class and override the attributes you need to change.
In [2]: Set the inference configuration
class InferenceConfig(coco.CocoConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
config.display()
Configurations:
BACKBONE resnet101
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 1
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE None
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.7
DETECTION_NMS_THRESHOLD 0.3
FPN_CLASSIF_FC_LAYERS_SIZE 1024
GPU_COUNT 1
GRADIENT_CLIP_NORM 5.0
IMAGES_PER_GPU 1
IMAGE_CHANNEL_COUNT 3
IMAGE_MAX_DIM 1024
IMAGE_META_SIZE 93
IMAGE_MIN_DIM 800
IMAGE_MIN_SCALE 0
IMAGE_RESIZE_MODE square
IMAGE_SHAPE [1024 1024 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 100
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME coco
NUM_CLASSES 81
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
PRE_NMS_LIMIT 6000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 1000
TOP_DOWN_PYRAMID_SIZE 256
TRAIN_BN False
TRAIN_ROIS_PER_IMAGE 200
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 50
WEIGHT_DECAY 0.0001
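Several of the values printed above determine how many anchors the RPN evaluates per image. The sketch below reproduces that arithmetic under two assumptions: each pyramid level uses a single scale from RPN_ANCHOR_SCALES, and feature maps are the image side divided by the stride, rounded up. `count_anchors` is a hypothetical helper written for illustration, not a function from the library.

```python
import math


def count_anchors(image_size, strides, ratios_per_location, anchor_stride=1):
    """Count anchors over all pyramid levels of a square image."""
    total = 0
    for stride in strides:
        cells = math.ceil(image_size / stride)        # feature map side length
        positions = math.ceil(cells / anchor_stride)  # anchor centers per side
        total += positions * positions * ratios_per_location
    return total


# IMAGE_MAX_DIM = 1024, BACKBONE_STRIDES = [4, 8, 16, 32, 64],
# three RPN_ANCHOR_RATIOS, RPN_ANCHOR_STRIDE = 1
print(count_anchors(1024, [4, 8, 16, 32, 64], 3))  # 261888
```

The result, 261888, matches the second dimension of the anchors shape reported in the detection log later in this walkthrough.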
In [3]: Build the network model and load the weights
# Create model object in inference mode.
model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
# Load weights trained on MS-COCO
model.load_weights(COCO_MODEL_PATH, by_name=True)
WARNING:tensorflow:From c:\datas\apps\rj\miniconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From c:\datas\apps\rj\miniconda3\envs\tf_gpu\lib\site-packages\mask_rcnn-2.1-py3.6.egg\mrcnn\model.py:772: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
The model classifies objects and returns class IDs, which are integer values that identify each class. Some datasets assign integer values to their classes and some don't. For example, in the MS-COCO dataset, the 'person' class is 1 and 'teddy bear' is 88. The IDs are often sequential, but not always. The COCO dataset, for example, has classes associated with class IDs 70 and 72, but not 71.
To improve consistency, and to support training on data from multiple sources at the same time, our Dataset class assigns its own sequential integer IDs to each class. For example, if you load the COCO dataset using our Dataset class, the 'person' class would get class ID = 1 (just like COCO) and the 'teddy bear' class is 78 (different from COCO). Keep that in mind when mapping class IDs to class names.
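The remapping idea can be illustrated in a few lines. This is only a sketch of the concept, not the Dataset class's actual implementation: `build_sequential_ids` is a hypothetical helper, and the ID list is a small illustrative subset rather than the full COCO category list.

```python
def build_sequential_ids(source_ids):
    """Map arbitrary source class IDs to contiguous IDs, keeping 0 free for background."""
    return {source_id: new_id
            for new_id, source_id in enumerate(sorted(source_ids), start=1)}


# Illustrative subset with a gap at 71, as described above.
mapping = build_sequential_ids([1, 2, 70, 72, 88])
print(mapping)  # {1: 1, 2: 2, 70: 3, 72: 4, 88: 5}
```

Because gaps collapse, an ID late in the source numbering ends up smaller after remapping, which is why 'teddy bear' moves from 88 to 78 when the full COCO list is used.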
To get the list of class names, you'd load the dataset and then use the class_names property like this.
# Load COCO dataset
dataset = coco.CocoDataset()
dataset.load_coco(COCO_DIR, "train")
dataset.prepare()
# Print class names
print(dataset.class_names)
We don't want to require you to download the COCO dataset just to run this demo, so we're including the list of class names below. The index of the class name in the list represents its ID (first class is 0, second is 1, third is 2, etc.).
In [4]: Define the class names
# COCO Class names
# Index of the class in the list is its ID. For example, to get ID of
# the teddy bear class, use: class_names.index('teddy bear')
class_names = ['BG', 'person', 'bicycle', 'car', 'motorcycle', 'airplane',
'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird',
'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear',
'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
'kite', 'baseball bat', 'baseball glove', 'skateboard',
'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
'keyboard', 'cell phone', 'microwave', 'oven', 'toaster',
'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors',
'teddy bear', 'hair drier', 'toothbrush']
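In [5]: Load an image and run detection. The original notebook reads a random image from the images folder. Here I commented out those two lines and load a photo I prepared myself, a picture of my alma mater. To use your own photo, just change image_file to its path.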
# Load a random image from the images folder
#file_names = next(os.walk(IMAGE_DIR))[2]
#image = skimage.io.imread(os.path.join(IMAGE_DIR, random.choice(file_names)))
image_file = os.path.join(IMAGE_DIR, "ahnu.jpg")
image = skimage.io.imread(image_file)
# Run detection
results = model.detect([image], verbose=1)
# Visualize results
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
class_names, r['scores'])
Processing 1 images
image shape: (768, 1024, 3) min: 0.00000 max: 255.00000 uint8
molded_images shape: (1, 1024, 1024, 3) min: -123.70000 max: 151.10000 float64
image_metas shape: (1, 93) min: 0.00000 max: 1024.00000 float64
anchors shape: (1, 261888, 4) min: -0.35390 max: 1.29134 float32
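Beyond visualizing, the result dictionary can be inspected directly. The sketch below assumes the layout used in the cell above: `rois` as an N x 4 array of (y1, x1, y2, x2) boxes, `masks` as an H x W x N boolean array, and `class_ids` and `scores` of length N. `summarize_detections` is a hypothetical helper, and the data is synthetic so the example runs without the model.

```python
import numpy as np


def summarize_detections(result, class_names):
    """Return one dict per detection: label, score, box size, and mask area."""
    rows = []
    for i, class_id in enumerate(result["class_ids"]):
        y1, x1, y2, x2 = result["rois"][i]
        rows.append({
            "label": class_names[class_id],
            "score": float(result["scores"][i]),
            "box_hw": (int(y2 - y1), int(x2 - x1)),
            "mask_pixels": int(result["masks"][:, :, i].sum()),
        })
    return rows


# Synthetic stand-in for results[0]; a real run would pass r from the cell above.
names = ["BG", "person", "bicycle"]
masks = np.zeros((8, 8, 2), dtype=bool)
masks[1:4, 1:3, 0] = True   # 3 x 2 region for the first instance
masks[4:8, 2:7, 1] = True   # 4 x 5 region for the second instance
fake_result = {
    "rois": np.array([[1, 1, 4, 3], [4, 2, 8, 7]]),
    "class_ids": np.array([1, 2]),
    "scores": np.array([0.98, 0.75], dtype=np.float32),
    "masks": masks,
}
rows = summarize_detections(fake_result, names)
for row in rows:
    print(row)
```

With the real model, calling summarize_detections(r, class_names) would give one row per detected object in the photo.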
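I trained the samples/shapes/train_shapes.ipynb example and successfully used multiple GPUs. If you run into problems, see my fix further below.
We're providing pre-trained weights for MS COCO to make it easier to start. You can use those weights as a starting point to train your own variation on the network. Training and evaluation code is in samples/coco/coco.py. You can import this module in a Jupyter notebook (see the provided notebooks for examples), or you can run it directly from the command line: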
# Train a new model starting from pre-trained COCO weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=coco
# Train a new model starting from ImageNet weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=imagenet
# Continue training a model that you had trained earlier
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5
# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=last
You can also run the COCO evaluation code with:
# Run COCO evaluation on the last trained model
python3 samples/coco/coco.py evaluate --dataset=/path/to/coco/ --model=last
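The training schedule, learning rate, and other parameters should be set in samples/coco/coco.py.
Start by reading the blog post about the balloon color splash sample. It covers the process from annotating images to training to using the results in a sample application.
In summary, to train the model on your own dataset you'll need to extend two classes:
Config
This class contains the default configuration. Subclass it and modify the attributes you need to change.
Dataset
This class provides a consistent way to work with any dataset. It allows you to use new datasets for training without having to change the code of the model. It also supports loading multiple datasets at the same time, which is useful if the objects you want to detect are not all available in one dataset.
See examples in samples/shapes/train_shapes.ipynb, samples/coco/coco.py, samples/balloon/balloon.py, and samples/nucleus/nucleus.py.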
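The subclassing pattern described above can be sketched as follows. To keep the example runnable without the repository, `Config` and `Dataset` here are simplified stand-ins defined in the block itself, not the library's classes; the method names `add_class`, `add_image`, and `load_mask` follow the repository's samples, while `SquaresConfig`, `SquaresDataset`, and `load_squares` are hypothetical names made up for illustration.

```python
import numpy as np


class Config:
    """Simplified stand-in for the library's Config (NOT the real class)."""
    NAME = None
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2
    NUM_CLASSES = 1  # background only

    def __init__(self):
        self.BATCH_SIZE = self.GPU_COUNT * self.IMAGES_PER_GPU


class Dataset:
    """Simplified stand-in for the library's Dataset (NOT the real class)."""
    def __init__(self):
        self.class_info = [{"source": "", "id": 0, "name": "BG"}]
        self.image_info = []

    def add_class(self, source, class_id, class_name):
        self.class_info.append({"source": source, "id": class_id, "name": class_name})

    def add_image(self, source, image_id, path, **kwargs):
        self.image_info.append(dict(source=source, id=image_id, path=path, **kwargs))


class SquaresConfig(Config):
    """Override only the attributes that differ from the defaults."""
    NAME = "squares"
    IMAGES_PER_GPU = 1
    NUM_CLASSES = 1 + 1  # background + square


class SquaresDataset(Dataset):
    """Toy dataset: each image contains one 8 x 8 square."""
    def load_squares(self, count, size=32):
        self.add_class("squares", 1, "square")
        for i in range(count):
            self.add_image("squares", image_id=i, path=None,
                           size=size, corner=(4 + i, 4 + i), side=8)

    def load_mask(self, image_id):
        """Return (masks of shape H x W x N, class IDs of length N) for one image."""
        info = self.image_info[image_id]
        mask = np.zeros((info["size"], info["size"], 1), dtype=bool)
        y, x = info["corner"]
        mask[y:y + info["side"], x:x + info["side"], 0] = True
        return mask, np.array([1], dtype=np.int32)


config = SquaresConfig()
dataset = SquaresDataset()
dataset.load_squares(count=3)
mask, class_ids = dataset.load_mask(0)
print(config.BATCH_SIZE, mask.shape, class_ids)
```

In a real project the two subclasses would inherit from the repository's own Config and Dataset, and the dataset would also supply the image pixels; the samples listed above show complete versions.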
I tested samples/shapes/train_shapes.ipynb. Single-GPU training generally works without problems, but running on multiple GPUs may produce this error:
Keras object has no attribute '_is_graph_network'
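Fix:
Downgrading Keras to 2.1.6 resolves the problem:
pip install keras==2.1.6
To speed up the install, use a mirror:
pip install keras==2.1.6 -i https://pypi.tuna.tsinghua.edu.cn/simple
This implementation follows the Mask RCNN paper for the most part, but there are a few cases where we deviated in favor of code simplicity and generalization. These are some of the differences we're aware of. If you encounter other differences, please let us know.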