NIPS 2018 Deep Learning Papers and Code Roundup

Source: Penguin Account (企鹅号) - 机器学习blog

[1] DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors

Arash Vahdat, Evgeny Andriyash, William G. Macready

Quadrant.ai, D-Wave Systems Inc.

https://papers.nips.cc/paper/7457-dvae-discrete-variational-autoencoders-with-relaxed-boltzmann-priors.pdf

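In variational autoencoders (VAEs), Boltzmann machine distributions are an effective prior over binary latent variables. However, earlier methods for training discrete VAEs optimize the evidence lower bound (ELBO) rather than the tighter importance-weighted bound.

This paper proposes two ways of relaxing Boltzmann machines to continuous distributions so that training can use the importance-weighted bound. They rely on generalized overlapping transformations and the Gaussian integral trick.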
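To make the difference between the two bounds concrete, here is a minimal NumPy sketch. It does not implement the paper's Boltzmann relaxations; it assumes a toy continuous model (standard normal prior, unit-variance Gaussian likelihood, and the prior reused as a deliberately poor proposal), and the function names are made up for illustration. It only shows that averaging K importance weights inside the logarithm gives a bound at least as tight as the ELBO.

```python
import numpy as np

def log_mean_exp(a, axis):
    """Numerically stable log(mean(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(a - m), axis=axis))

def elbo_and_iw_bound(log_w):
    """log_w has shape (batch, K): log p(x, z_k) - log q(z_k | x)."""
    elbo = log_w.mean(axis=1).mean()             # E[log w]
    iw = log_mean_exp(log_w, axis=1).mean()      # E[log (1/K) sum_k w_k]
    return elbo, iw

# Toy model (an assumption of this sketch): p(z) = N(0, 1), p(x|z) = N(z, 1),
# and the proposal q(z|x) is simply the prior, so log w = log p(x|z).
rng = np.random.default_rng(0)
x_obs = 1.5
K = 50
z = rng.standard_normal((1000, K))
log_w = -0.5 * (x_obs - z) ** 2 - 0.5 * np.log(2.0 * np.pi)

elbo, iw = elbo_and_iw_bound(log_w)
# Exact log-marginal for this toy model: x ~ N(0, 2)
log_px = -0.5 * np.log(2.0 * np.pi * 2.0) - x_obs ** 2 / 4.0
print(f"ELBO={elbo:.3f}  IW bound (K={K})={iw:.3f}  log p(x)={log_px:.3f}")
```

With K = 1 the two quantities coincide; increasing K moves the importance-weighted estimate toward log p(x).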

A comparison of the results of several methods is shown below

Code

https://github.com/QuadrantAI/dvae

[2] Deep Attentive Tracking via Reciprocative Learning

Shi Pu, Yibing Song, Chao Ma, Honggang Zhang, Ming-Hsuan Yang

Beijing University of Posts and Telecommunications, Tencent AI Lab, Shanghai Jiao Tong University, University of California at Merced

https://papers.nips.cc/paper/7463-deep-attentive-tracking-via-reciprocative-learning.pdf

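Visual attention, a concept from cognitive neuroscience, helps focus on the most relevant parts of the target data. In recent years many researchers have tried to apply attention mechanisms to computer vision. In visual tracking, following a target whose appearance changes substantially is challenging. Attention maps help by selectively attending to temporally robust features.

Existing tracking-by-detection methods mainly use additional attention modules to generate feature weights. This paper instead proposes a reciprocative learning algorithm that exploits visual attention to train a deep classifier. The algorithm consists of a forward pass and a backward operation that together generate attention maps, and these maps serve as a regularization term added to the original classification loss during training. The deep classifier thereby learns to attend to the regions of the target object that are robust to appearance changes.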
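The forward-then-backward idea can be sketched on a tiny model. The snippet below assumes a two-layer network with a tanh hidden layer, takes the gradient of the class score with respect to the input as the attention map, and adds a penalty on that map to a binary cross-entropy loss. The network, the function names, and in particular the mean-based penalty are illustrative assumptions, not the regularizer defined in the paper.

```python
import numpy as np

def forward(x, W1, w2):
    """Forward pass: hidden activations and a scalar class score (logit)."""
    h = np.tanh(W1 @ x)
    return h, float(w2 @ h)

def attention_map(x, W1, w2):
    """Backward operation: d(score)/d(x), computed analytically via the chain rule."""
    h, _ = forward(x, W1, w2)
    return W1.T @ (w2 * (1.0 - h ** 2))

def regularized_loss(x, y, W1, w2, lam=0.1):
    """Binary cross-entropy plus a hypothetical attention penalty weighted by lam."""
    _, score = forward(x, W1, w2)
    ce = np.logaddexp(0.0, -score) if y == 1 else np.logaddexp(0.0, score)
    a = attention_map(x, W1, w2)
    # Assumed penalty: push attention up for positives and down for negatives.
    penalty = -a.mean() if y == 1 else a.mean()
    return float(ce + lam * penalty)

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 16)) * 0.3
w2 = rng.standard_normal(8)
x = rng.standard_normal(16)

att = attention_map(x, W1, w2)
loss_pos = regularized_loss(x, 1, W1, w2)
print(att.shape, loss_pos)
```

In a real tracker the input would be an image patch and the gradient would come from automatic differentiation; the analytic gradient is used here only to keep the sketch dependency-free.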

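The main contributions of the paper are as follows

The framework of the proposed algorithm, with explanation, is shown below

Example results of the proposed method are shown below

A comparison across different parameters and methods is shown below

A comparison of several methods on different datasets is shown below

The paper for CCOT is

Beyond correlation filters: Learning continuous convolution operators for visual tracking, ECCV 2016

Code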

https://github.com/martin-danelljan/Continuous-ConvOp

The paper for CREST is

CREST: Convolutional residual learning for visual tracking, ICCV 2017

Code

https://github.com/ybsong00/CREST-Release

The paper for MDNet is

Learning multi-domain convolutional neural networks for visual tracking, CVPR 2016

Code

https://github.com/HyeonseobNam/MDNet

The paper for MCPF is

Multi-task correlation particle filter for robust object tracking, CVPR 2017

Related paper collection

https://github.com/foolwood/benchmark_results

The paper for ADNet is

Action-decision networks for visual tracking with deep reinforcement learning, CVPR 2017

Code

https://github.com/hellbell/ADNet

The paper for BACF is

Learning background-aware correlation filters for visual tracking, CVPR 2017

Code

https://github.com/LCAR979/BACF

Related papers and code collection

https://github.com/HEscop/TBCF

The paper for SINT is

Siamese instance search for tracking, CVPR 2016

Code

https://github.com/taotaoorange/SINT

The paper for SRDCFD is

Adaptive decontamination of the training set: A unified formulation for discriminative visual tracking, CVPR 2016

The paper for ACFN is

Attentional correlation filter network for adaptive visual tracking, CVPR 2017

Code

https://github.com/jongwon20000/ACFN

A comparison of several methods on the dataset is shown below

The paper for Staple is

Staple: Complementary learners for real-time tracking, CVPR 2016

Code

https://github.com/bertinetto/staple

The paper for EBT is

Beyond local search: Tracking objects everywhere with instance-specific proposals, CVPR 2016

Code

http://www.votchallenge.net/vot2016/download/02_EBT.zip

The paper for SiamFC is

Fully-convolutional siamese networks for object tracking, ECCVW 2016

Code

https://github.com/torrvision/siamfc-tf

https://github.com/rafellerc/Pytorch-SiamFC

https://github.com/bertinetto/siamese-fc

Code

https://ybsong00.github.io/nips18_tracking/index

[3] Attention in Convolutional LSTM for Gesture Recognition

Liang Zhang, Guangming Zhu, Lin Mei, Peiyi Shen, Syed Afaq Ali Shah, Mohammed Bennamoun

Xidian University, Central Queensland University, University of Western Australia

https://papers.nips.cc/paper/7465-attention-in-convolutional-lstm-for-gesture-recognition.pdf

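Convolutional long short-term memory (ConvLSTM) networks are widely used for activity and action recognition, and researchers have embedded various attention mechanisms into LSTM and ConvLSTM networks. Earlier work combined 3D convolutional neural networks with ConvLSTM for gesture recognition; this paper incorporates attention mechanisms into ConvLSTM.

The paper examines several ConvLSTM variants:

1) removing the convolutional structures from the three gates of ConvLSTM

2) applying an attention mechanism to the ConvLSTM input

3) reconstructing the input gate with channel-wise attention

4) reconstructing the output gate with channel-wise attention

The authors find that the spatial convolutions in the three gates contribute very little to spatiotemporal feature fusion, and that embedding attention mechanisms in the input and output gates does not improve feature fusion. ConvLSTM mainly contributes to temporal fusion: when given spatial or spatiotemporal features as input, its recurrent steps learn long-term spatiotemporal features. Based on this, the authors derive a new LSTM variant in which convolutional structures are embedded only in the input-to-state transition.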
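One possible reading of "gates without convolution, convolution only in the input-to-state transition" is sketched below in NumPy. It assumes the gates are computed from globally average-pooled features through dense weights (one gate value per channel), and that the candidate update is a convolution of the input alone. The pooling choice, the omission of any hidden-state term in the candidate, and all parameter names are assumptions of this sketch rather than the paper's exact formulation.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive zero-padded 'same' cross-correlation.
    x: (C_in, H, W); k: (C_out, C_in, kh, kw) with odd kh and kw."""
    c_out, c_in, kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    _, H, W = x.shape
    out = np.zeros((c_out, H, W))
    for i in range(kh):
        for j in range(kw):
            # Kernel tap (i, j): mix input channels over the shifted window.
            out += np.einsum("oc,chw->ohw", k[:, :, i, j], xp[:, i:i + H, j:j + W])
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def step(x, h, c, p):
    """One recurrent step with pooled (non-convolutional) gates."""
    x_bar = x.mean(axis=(1, 2))   # global average pooling over space, shape (C_in,)
    h_bar = h.mean(axis=(1, 2))   # shape (C_hid,)
    gate = {}
    for name in ("i", "f", "o"):
        g = sigmoid(p["Wx" + name] @ x_bar + p["Wh" + name] @ h_bar + p["b" + name])
        gate[name] = g[:, None, None]            # broadcast over H and W
    cand = np.tanh(conv2d_same(x, p["K"]))       # the only convolution in the cell
    c_new = gate["f"] * c + gate["i"] * cand
    h_new = gate["o"] * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
C_in, C_hid, H, W = 3, 4, 5, 5
p = {"K": rng.standard_normal((C_hid, C_in, 3, 3)) * 0.1}
for name in ("i", "f", "o"):
    p["Wx" + name] = rng.standard_normal((C_hid, C_in)) * 0.1
    p["Wh" + name] = rng.standard_normal((C_hid, C_hid)) * 0.1
    p["b" + name] = np.zeros(C_hid)

x = rng.standard_normal((C_in, H, W))
h0 = np.zeros((C_hid, H, W))
c0 = np.zeros((C_hid, H, W))
h1, c1 = step(x, h0, c0, p)
print(h1.shape, c1.shape)
```

Because each gate is a single value per channel, the gate parameters no longer scale with the kernel size, which is the kind of saving the finding about gate convolutions suggests.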

The four LSTM variants are illustrated below

A comparison of the results of several networks is shown below

A comparison with other methods is shown below

The paper for ResNet50 is

Gesture recognition: Focus on the hands, CVPR 2018

The paper for Pyramidal C3D is

Large-scale isolated gesture recognition using pyramidal 3d convolutional networks, ICPR 2016

The paper for C3D is

Large-scale gesture recognition with a fusion of rgb-d data based on the c3d model, ICPR 2016

The paper for Res3D is

Multimodal gesture recognition based on the resc3d network, ICCV 2017

The paper for 3DCNN+BiConvLSTM+2DCNN is

Learning spatiotemporal features using 3dcnn and convolutional lstm for gesture recognition, ICCV 2017

Code

https://github.com/GuangmingZhu/Conv3D_BICLSTM

Code

https://github.com/GuangmingZhu/AttentionConvLSTM
