图像地点场景类型识别(PlaceCNN)实践

sparkexpert

发布于 2019-05-26 14:01:26

2.8K1

发布于 2019-05-26 14:01:26

文章被收录于专栏：大数据智能实战

　　从图像中判断图像场景所处的地点类型，是图像理解的一种常见任务。本质上场景类别标注数据足够的情况下，它可以属于图像分类的一种，因此直接利用现有成熟的网络架构如ResNet就可以实现较高精度的图像涉及场所的识别。

　　本文实践采自：http://places2.csail.mit.edu/download.html

该数据集涵盖了365种图像场景，同时还提供了多种网络架构的预训练模型，主要如下：

Pre-trained CNN models on Places365-Standard:

AlexNet-places365: deploy weights
GoogLeNet-places365: deploy weights
VGG16-places365: deploy weights
VGG16-hybrid1365: deploy weights
ResNet152-places365 fine-tuned from ResNet152-ImageNet: deploy weights
ResNet152-hybrid1365: deploy weights
ResNet152-places365 trained from scratch using Torch: torch model converted caffemodel:deploy weights. It is the original ResNet with 152 layers. On the validation set, the top1 error is 45.26% and the top5 error is 15.02%.
ResNet50-places365 trained from scratch using Torch: torch model. It is Preact ResNet with 50 layers. The top1 error is 44.82% and the top5 error is 14.71%.
To use the alexnet and vgg16 caffemodels in Torch, use the torch library loadcaffe, where you could simply load the caffe model use the following commands. But note that the input image scale should be from 0-255, which is different to the 0-1 scale in the previous resnet Torch models trained from scratch in fb.resnet.torch.

2、实验结果

将上图地点分类为：酒巴、饭店或者咖啡屋。

这是数据集中的一张测试照片，定义为会议室。

这个候车厅的识别也是非常准确的。

见：https://timgsa.baidu.com/timg?image&quality=80&size=b9999_10000&sec=1523878667027&di=287398ec5e55869341ba2747794612a3&imgtype=0&src=http%3A%2F%2Fimg.pconline.com.cn%2Fimages%2Fphotoblog%2F8%2F7%2F0%2F0%2F8700542%2F20094%2F30%2F1241086150942.jpg