Large Kernel Matters–Improve Semantic Segmentation by Global Convolutional Network https://arxiv.org/abs/1703.02719
Semantic segmentation needs to solve two problems at once: classification and localization. Every object in the image must be precisely segmented, and each object must also be classified. These two tasks place different demands on CNN design. For the classification task, the models are required to be invariant to various transformations like translation and rotation.
But for the localization task, models should be transformation-sensitive, i.e., they must precisely locate every pixel of each semantic category.
Current semantic segmentation algorithms mainly focus on localization, which may be suboptimal for classification.
How can this contradiction be resolved? The strategy here is to use large kernels.
To this end, the paper designs a Global Convolutional Network (GCN) that adopts large kernels. From the localization view, the structure must be fully convolutional, without the fully-connected or global pooling layers used by many classification networks, since those layers discard localization information.
From the classification view, motivated by the densely-connected structure of classification models, the kernel size of the convolutional structure should be as large as possible.
On the computational cost of the GCN module: instead of directly using a larger kernel or global convolution, the GCN module employs a combination of 1 × k + k × 1 and k × 1 + 1 × k convolutions, which enables dense connections within a large k × k region of the feature map. A sketch of such a block is given below.
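To make this concrete, here is a minimal PyTorch sketch of such a GCN block. This is not the authors' code: the class name, channel arguments, and the default k = 15 are illustrative assumptions, though k = 15 matches the largest kernel the paper evaluates. The two separable branches are summed, so each output position is connected to a full k × k input region.

```python
import torch
import torch.nn as nn

class GCN(nn.Module):
    """Global Convolutional Network block (illustrative sketch):
    two branches of stacked 1 x k and k x 1 convolutions, summed
    to cover a k x k region of the feature map."""

    def __init__(self, in_ch: int, out_ch: int, k: int = 15):
        super().__init__()
        pad = k // 2  # keeps spatial size for odd k
        # Branch A: 1 x k followed by k x 1.
        self.branch_a = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, (1, k), padding=(0, pad)),
            nn.Conv2d(out_ch, out_ch, (k, 1), padding=(pad, 0)),
        )
        # Branch B: k x 1 followed by 1 x k.
        self.branch_b = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, (k, 1), padding=(pad, 0)),
            nn.Conv2d(out_ch, out_ch, (1, k), padding=(0, pad)),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # No nonlinearity inside the block, per the paper's design.
        return self.branch_a(x) + self.branch_b(x)
```

For example, `GCN(2048, 21)(torch.randn(1, 2048, 16, 16))` maps a backbone feature map to 21 class channels while preserving spatial size.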
To improve segmentation accuracy near object boundaries, the paper further proposes a Boundary Refinement (BR) block, shown in Figure 2(C), which models boundary alignment as a residual structure (sketched below).
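The text above only says BR is a residual structure; Figure 2(C) of the paper depicts the residual branch as a small conv–ReLU–conv stack. A minimal sketch under that assumption, reusing the imports from the GCN snippet:

```python
class BoundaryRefinement(nn.Module):
    """Residual boundary refinement: x + R(x), where R is a small
    conv-ReLU-conv branch (sketch based on the paper's Figure 2(C))."""

    def __init__(self, ch: int):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Identity plus a learned residual sharpens predictions at edges.
        return x + self.residual(x)
```

The identity path preserves the coarse score map, so the residual branch only has to learn the boundary correction.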
Ablations show that the larger the kernel size, the better the segmentation results.
Comparison of GCN with two other convolution structures:
The three figures above mainly demonstrate that GCN performs better than the other convolution structures; a rough parameter count below makes the efficiency side of this comparison concrete.
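One reason GCN compares favorably is parameter count: a plain k × k convolution costs O(k²) weights per channel pair, while the two separable branches cost O(k). A back-of-the-envelope check (the channel count C = 256 is an arbitrary example; bias terms are ignored):

```python
def full_kxk_params(k: int, c: int) -> int:
    # Plain k x k convolution with c input and c output channels.
    return k * k * c * c

def gcn_params(k: int, c: int) -> int:
    # Two branches, each a pair of 1 x k and k x 1 convolutions:
    # four kernels of length k in total.
    return 4 * k * c * c

c = 256
for k in (3, 7, 15):
    print(f"k={k:2d}: full {full_kxk_params(k, c):>12,}  "
          f"GCN {gcn_params(k, c):>12,}")
```

The ratio is k²/(4k) = k/4 at equal channels, so the savings grow linearly in k: at k = 15 the separable form uses roughly 3.75× fewer weights than a full 15 × 15 convolution.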
Results: on the PASCAL VOC 2012 test set the paper reports 82.2% mIoU, and on the Cityscapes test set 76.9% mIoU, both state of the art at the time.