Semantic segmentation: Global Deconvolutional Networks for Semantic Segmentation, BMVC 2016. https://github.com/DrSleep/GDN
CNN-based semantic segmentation has advanced rapidly over the past two years, but pixel-wise labelling with CNNs has its own unique challenges: 1) an accurate deconvolution, or upsampling, of the low-resolution output into a higher-resolution segmentation mask; 2) the inclusion of global information, or context, within locally extracted features.
The paper proposes a network architecture, the Global Deconvolutional Network, to address both problems. The main highlight of the model is that it significantly reduces the number of parameters while maintaining relatively high accuracy.
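3 Global Deconvolutional Network
3.1 Baseline Models
Two publicly available segmentation models are chosen as baselines: FCN-32s and DeepLab. Both are based on the VGG 16-layer net with the fully connected layers converted into convolutional layers, and both use a pixel-wise softmax loss as the objective function.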
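To make the pixel-wise softmax loss concrete, here is a minimal NumPy sketch. The function name `pixelwise_softmax_loss` and the (C, H, W) score layout are assumptions made for illustration; this is not taken from the authors' code.

```python
import numpy as np

def pixelwise_softmax_loss(logits, labels):
    """Mean softmax cross-entropy over all pixels.

    logits: float array of shape (C, H, W), one score map per class.
    labels: int array of shape (H, W) with class indices in [0, C).
    (Hypothetical helper; layout and name are assumptions.)
    """
    # Subtract the per-pixel maximum for numerical stability.
    shifted = logits - logits.max(axis=0, keepdims=True)
    # Log-softmax over the class axis, computed independently per pixel.
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    # Pick the log-probability of the ground-truth class at each pixel.
    rows, cols = np.indices(labels.shape)
    picked = log_probs[labels, rows, cols]
    return -picked.mean()
```

With uniform scores over C classes, every pixel contributes log(C) to the loss, which is a quick sanity check for an implementation like this.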
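3.2 Global Interpolation
After a series of convolution and pooling layers, the input image becomes an encoded image whose spatial size has been heavily downsampled. To output a segmentation map at the original image size, this encoded image has to be decoded and upsampled at the same time. For this purpose the paper designs a learnable global interpolation.
Let x denote the decoded information, I the input RGB image, and y the upsampled signal.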
Unlike simple bilinear interpolation, which operates only on the four closest points, the equation above allows much more information on the rectangular grid to be included in each upsampled value.
This operation is differentiable.
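The interpolation formula itself is not reproduced in these notes, so the sketch below rests on an assumption: that the upsampling is a separable linear map applied per channel, y_c = K_h x_c K_w^T, with learnable matrices K_h (H x h) and K_w (W x w). All function names are hypothetical, and the linear-interpolation initialization is only one plausible choice, not necessarily what the authors use.

```python
import numpy as np

def linear_interp_matrix(out_size, in_size):
    """Build an (out_size, in_size) matrix that performs 1-D linear
    interpolation. Assumed initialization; in a learnable layer every
    entry would afterwards be free to change."""
    K = np.zeros((out_size, in_size))
    # Position of each output sample in input coordinates.
    pos = np.linspace(0, in_size - 1, out_size)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, in_size - 1)
    frac = pos - lo
    idx = np.arange(out_size)
    K[idx, lo] += 1 - frac
    K[idx, hi] += frac
    return K

def global_interpolation(x, K_h, K_w):
    """Assumed form y_c = K_h @ x_c @ K_w.T for every channel c.
    x: (C, h, w), K_h: (H, h), K_w: (W, w) -> y: (C, H, W)."""
    return np.einsum('Hh,chw,Ww->cHW', K_h, x, K_w)

def grad_wrt_input(grad_y, K_h, K_w):
    """Gradient of a scalar loss w.r.t. x given grad_y = dL/dy.
    The map is linear, so this is K_h.T @ grad_y_c @ K_w per channel."""
    return np.einsum('Hh,cHW,Ww->chw', K_h, grad_y, K_w)

# Usage: upsample a 2-channel 3x4 map to 8x10.
K_h = linear_interp_matrix(8, 3)
K_w = linear_interp_matrix(10, 4)
x = np.random.default_rng(0).normal(size=(2, 3, 4))
y = global_interpolation(x, K_h, K_w)
```

Under this assumed form, once K_h and K_w are dense, every output pixel can depend on every input position in its row and column of the grid, which matches the contrast with bilinear interpolation drawn above. Because the map is linear in x, K_h and K_w, its gradients have a closed form, so it can be trained with backpropagation.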
3.3 Multi-task loss
The loss functions are defined as follows:
Overall, each component of the proposed approach aims to capture global information and incorporate it into the network, hence the name Global Deconvolutional Network. In addition, the proposed interpolation effectively upsamples the coarse output, and a nonlinear upsampling can be obtained by adding an activation function on top of the block.
4 Experiments