[Computer Vision Paper Express] 2018-03-01

Amusi

Published on 2018-04-12 09:38:21

Collected in the column: CVer

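[1] "Stereoscopic Neural Style Transfer"

CVPR 2018

This paper presents the first attempt at stereoscopic neural style transfer, responding to the emerging demand from 3D movies and AR/VR. The authors first carefully examine what happens when existing monocular style transfer methods are applied to the left and right views of a stereoscopic image separately. This shows that the original disparity consistency is not well preserved in the stylized results, which causes 3D viewing fatigue. To address this, they add a new disparity loss to the widely used style loss function by enforcing a bidirectional disparity constraint in non-occluded regions. For a practical real-time solution, they propose the first feed-forward network for this task, jointly training a stylization sub-network and a disparity sub-network and integrating them in a feature-level middle domain. The disparity sub-network is also the first end-to-end network to estimate bidirectional disparity and occlusion masks simultaneously. Finally, the network is extended to stereoscopic video by considering both temporal coherence and disparity consistency. Experiments show that the method clearly outperforms the baseline algorithms both quantitatively and qualitatively.

Note: This paper resembles "Neural Stereoscopic Image Style Transfer" from Tencent AI Lab, covered in the 2018-02-28 digest, although the emphasis differs. It looks like stereoscopic style transfer can be "milked" for a whole wave of papers! Very interesting research!

arXiv: https://arxiv.org/abs/1802.10591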
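
To make the idea behind entry [1] concrete, here is a minimal sketch of a bidirectional disparity loss restricted to non-occluded pixels. It is not the authors' implementation. It assumes single-channel images stored as nested lists, integer disparities, and the convention that a left pixel at column x corresponds to the right pixel at column x - d (and the reverse for the right view). The function names are hypothetical.

```python
def one_way_loss(src, dst, disp, mask, sign):
    """Mean squared difference between src and its match in dst.

    A pixel (y, x) in src is matched to (y, x + sign * disp[y][x]) in dst.
    Pixels with mask 0 (assumed occluded) or with a match outside the
    image are skipped.
    """
    total, count = 0.0, 0
    for y, row in enumerate(src):
        for x, value in enumerate(row):
            if not mask[y][x]:
                continue
            xd = x + sign * disp[y][x]
            if 0 <= xd < len(row):
                total += (value - dst[y][xd]) ** 2
                count += 1
    return total / count if count else 0.0


def bidirectional_disparity_loss(left, right, disp_l, disp_r, mask_l, mask_r):
    """Sum of the left-to-right and right-to-left consistency terms."""
    return (one_way_loss(left, right, disp_l, mask_l, -1)
            + one_way_loss(right, left, disp_r, mask_r, +1))


# Toy example: the right view is the left view shifted by one pixel,
# so a constant disparity of 1 makes the two stylized views consistent.
left = [[1, 2, 3, 4]]
right = [[2, 3, 4, 0]]
ones = [[1, 1, 1, 1]]
print(bidirectional_disparity_loss(left, right, ones, ones, ones, ones))  # 0.0
```

In a real training setup this term would be computed on stylized outputs with a differentiable warp and added to the style loss; the sketch only shows what is being penalized.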

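[2] "Novelty Detection with GAN"

The paper proposes a method based on the generative adversarial network (GAN) framework to determine whether an input comes from the known set of classes, and from which specific class, or from an unknown domain that belongs to none of the known classes. The paper shows that a multi-class discriminator trained with a generator that generates samples from a mixture of nominal and novel data distributions is the optimal novelty detector (this part is best quoted from the original). The authors approximate that generator with a mixture generator trained with the feature matching loss, and show empirically that the method outperforms conventional novelty detection methods. The results demonstrate a simple yet powerful new application of the GAN framework to novelty detection.

Note: Novelty detection is a very interesting research direction.

arXiv: https://arxiv.org/abs/1802.10560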
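
As a rough illustration of how a multi-class discriminator could double as a novelty detector at test time, the sketch below assumes the discriminator emits K + 1 logits, where the last one stands for "novel". That output layout is an assumption made for illustration; the digest does not spell out the paper's exact design, and `classify_or_reject` is a hypothetical name.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_reject(logits, class_names):
    """Return (label, probability).

    `logits` must hold len(class_names) + 1 values; the extra last entry
    is the assumed "novel" output of the discriminator.
    """
    if len(logits) != len(class_names) + 1:
        raise ValueError("expected one logit per known class plus one for novel")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = "novel" if best == len(class_names) else class_names[best]
    return label, probs[best]


print(classify_or_reject([2.0, 0.1, 0.3], ["cat", "dog"])[0])  # cat
print(classify_or_reject([0.1, 0.2, 3.0], ["cat", "dog"])[0])  # novel
```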

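[3] "Using Deep Learning for Segmentation and Counting within Microscopy Data"

The paper proposes a convolutional neural network approach that combines the recently described Feature Pyramid Network (FPN) with a VGG-style network to segment and count cells in a given microscopy image.

Note: Cell counting is a very interesting application of deep learning. It is a ubiquitous but tedious task that would benefit greatly from automation. From basic biology questions to clinical trials, cell counts provide key quantitative feedback that drives research. Unfortunately, cell counting is usually done by hand and is very time-consuming. Overlapping cells, multiple focal planes, and poor imaging quality make the task even harder.

arXiv: https://arxiv.org/abs/1802.10548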
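
One simple way to turn a segmentation into a count is to count connected foreground regions in the predicted binary mask. The sketch below does that with a breadth-first flood fill and 4-connectivity. It is a generic baseline, not the counting procedure from the paper (which the digest does not describe), and `count_cells` is a hypothetical name.

```python
from collections import deque


def count_cells(mask):
    """Count 4-connected regions of truthy pixels in a 2D binary mask."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            count += 1  # found an unvisited region; flood-fill it
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


mask = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
print(count_cells(mask))  # 3
```

Touching or overlapping cells merge into one region under this scheme, which is exactly the difficulty the note above mentions.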

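[4] "Brain Tumor Segmentation and Radiomics Survival Prediction: Contribution to the BRATS 2017 Challenge"

The paper proposes a robust segmentation algorithm based on convolutional neural networks (CNNs). The architecture is inspired by the popular U-Net and carefully modified to maximize brain tumor segmentation performance.

Note: U-Net really is a powerhouse in the field of image segmentation!

arXiv: https://arxiv.org/abs/1802.10508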

[5] "HSI-CNN: A Novel Convolution Neural Network for Hyperspectral Image"

ICPR 2018 review

Targeting the characteristics of hyperspectral image data, the paper proposes a new convolutional neural network framework called HSI-CNN. Firstly, the spectral-spatial feature is extracted from a target pixel and its neighbors. Then, a number of one-dimensional feature maps, obtained by convolution operation on spectral-spatial features, are stacked into a two-dimensional matrix. Finally, the two-dimensional matrix considered as an image is fed into standard CNN. This is why we call it HSI-CNN. (Quoted from the original here, which reads very nicely.)

Note: Hyperspectral images, wow! Sounds impressive!

arXiv: https://arxiv.org/abs/1802.10478
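
The middle step of entry [5], stacking one-dimensional feature maps into a two-dimensional matrix, can be sketched in a few lines. This is an illustration under assumptions: the spectral-spatial feature is taken as an already extracted list of numbers, the kernels are fixed rather than learned, and "convolution" means the sliding dot product used in CNNs. Function names are hypothetical.

```python
def conv1d_valid(signal, kernel):
    """Sliding dot product of kernel over signal, without padding."""
    steps = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(steps)
    ]


def stack_feature_maps(feature_vector, kernels):
    """Apply each 1D kernel and stack the results as rows of a 2D matrix.

    All kernels are assumed to share one length so the rows line up.
    """
    return [conv1d_valid(feature_vector, kernel) for kernel in kernels]


spectrum = [1, 2, 3, 4, 5]           # toy spectral-spatial feature
kernels = [[1, 0, -1], [1, 1, 1]]    # two illustrative, non-learned filters
matrix = stack_feature_maps(spectrum, kernels)
print(matrix)  # [[-2, -2, -2], [6, 9, 12]]
```

The resulting matrix is what would then be treated as an image and passed to a standard 2D CNN.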

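[6] "A Simple Method to improve Initialization Robustness for Active Contours driven by Local Region Fitting Energy"

The paper proposes a simple, general method to make models based on local fitting more robust to the initial contour. The core idea is to exchange the fitting values on the two sides of the contour so that, throughout curve evolution, the fitting values inside the contour are always larger (or always smaller) than those outside. Experiments show that the method improves robustness to the initial contour while preserving the original advantages of local fitting models.

Note: Active contour models are interesting, although I am not familiar with them. And this paper does not use deep learning. No deep learning at all!

arXiv: https://arxiv.org/abs/1802.10437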
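
The swapping idea in entry [6] can be pictured with a tiny sketch: wherever the inside fitting value is smaller than the outside one, the two are exchanged so a fixed ordering holds at every pixel. This is only a reading of the summary above, with per-pixel fitting values given as plain lists. How the paper computes those values and uses them in the evolution equation is not covered here, and `order_fitting_values` is a hypothetical name.

```python
def order_fitting_values(f_inside, f_outside, inside_larger=True):
    """Swap per-pixel fitting values so one side always dominates.

    With inside_larger=True the returned inside value is never smaller
    than the outside value at the same pixel; with False it is never larger.
    """
    ordered_in, ordered_out = [], []
    for a, b in zip(f_inside, f_outside):
        high, low = (a, b) if a >= b else (b, a)
        if inside_larger:
            ordered_in.append(high)
            ordered_out.append(low)
        else:
            ordered_in.append(low)
            ordered_out.append(high)
    return ordered_in, ordered_out


print(order_fitting_values([0.2, 0.9, 0.5], [0.7, 0.1, 0.5]))
# ([0.7, 0.9, 0.5], [0.2, 0.1, 0.5])
```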

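[7] "FINE-GRAINED WOUND TISSUE ANALYSIS USING DEEP NEURAL NETWORK"

The paper proposes using a pre-trained deep neural network (DNN) for patch-level feature extraction and classification. Experiments show state-of-the-art results. The authors plan to release the dataset later to promote research on wound assessment (very considerate of them).

Note: Wound assessment is a very interesting research direction.

arXiv: https://arxiv.org/abs/1802.10426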

[8] "Convolutional Neural Networks with Alternately Updated Clique"

CVPR 2018

The paper proposes a new convolutional neural network with alternately updated cliques (CliqueNet). The CliqueNet has some unique properties. For each layer, it is both the input and output of any other layer in the same block, so that the information flow among layers is maximized. During propagation, the newly updated layers are concatenated to re-update previously updated layer, and parameters are reused for multiple times. This recurrent feedback structure is able to bring higher level visual information back to refine low-level filters and achieve spatial attention. (Quoted directly from the original.)

Experiments on image recognition datasets including CIFAR-10, CIFAR-100, SVHN, and ImageNet show that the proposed model achieves state-of-the-art results with fewer parameters.

Note: A joint paper from Peking University and Shanghai Jiao Tong University (trembling in awe).

arXiv: https://arxiv.org/abs/1802.10419

[9] "Compressing Neural Networks using the Variational Information Bottleneck"

In this paper we focus on pruning individual neurons, which can simultaneously trim model size, FLOPs, and run-time memory. We demonstrate state-of-the-art compression rates across an array of datasets and network architectures.

Note: A joint paper from Tsinghua University, ShanghaiTech University, and Microsoft Research. Its focus seems somewhat similar to that of [9].

arXiv: https://arxiv.org/abs/1802.10399

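[10] "Deep-6DPose: Recovering 6D Object Pose from a Single RGB Image"

The paper proposes an end-to-end deep learning framework (Deep-6DPose) that jointly detects, segments, and, most importantly, recovers the 6D poses of object instances from a single RGB image.

arXiv: https://arxiv.org/abs/1802.10367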

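[11] "Learning to Adapt Structured Output Space for Semantic Segmentation"

CVPR 2018

The paper proposes an adversarial learning method for domain adaptation in semantic segmentation. Because semantic segmentation produces structured outputs that contain spatial similarities between the source and target domains, the paper applies adversarial learning in the output space. To further strengthen the adapted model, the authors build a multi-level adversarial network that performs output-space domain adaptation at different feature levels.

Note: The code is open source! The code is open source! (PyTorch-based)

arXiv: https://arxiv.org/abs/1802.10349

GitHub: https://github.com/wasidennis/AdaptSegNet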

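[12] "Neural Photometric Stereo Reconstruction for General Reflectance Surfaces"

The paper proposes a new convolutional neural network framework for photometric stereo, the problem of recovering the surface normals of a 3D object from multiple images observed under different illumination.

arXiv: https://arxiv.org/abs/1802.10328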
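
For readers new to photometric stereo, the classical Lambertian formulation gives a feel for the problem in entry [12]: with known light directions L and observed intensities I at one pixel, the model I = L (albedo * n) can be solved for the normal n. The sketch below handles exactly three lights via Cramer's rule. It is the textbook baseline, not the paper's neural method, which targets general (non-Lambertian) reflectance. Function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 system a x = b with Cramer's rule."""
    d = det3(a)
    if d == 0:
        raise ValueError("light directions must not be coplanar")
    solution = []
    for col in range(3):
        replaced = [[b[r] if c == col else a[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def lambertian_normal(light_dirs, intensities):
    """Return (unit normal, albedo) for one pixel lit from three directions."""
    g = solve3(light_dirs, intensities)          # g = albedo * normal
    albedo = math.sqrt(sum(v * v for v in g))
    if albedo == 0:
        raise ValueError("pixel is completely dark; normal is undefined")
    return [v / albedo for v in g], albedo


lights = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]       # toy axis-aligned lights
normal, albedo = lambertian_normal(lights, [0.0, 0.3, 0.4])
print(normal, albedo)
```

Shadows and specular highlights break the Lambertian assumption, which is the kind of limitation that motivates learned approaches such as the one in this paper.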

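[13] "A Model for Medical Diagnosis Based on Plantar Pressure"

This paper proposes a model that applies convolutional neural networks to plantar pressure data for medical diagnosis.

Note: So deep learning can be used like this too! Amazing! Plantar pressure analysis!

arXiv: https://arxiv.org/abs/1802.10316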

[14] "Lp-Norm Constrained Coding With Frank-Wolfe Network"

We propose the Frank-Wolfe Network (F-W Net), whose architecture is inspired by unrolling and truncating the Frank-Wolfe algorithm for solving an Lp-norm constrained problem. (Very fundamental research; quoted directly from the original because it is too hard to paraphrase.)

arXiv: https://arxiv.org/abs/1802.10252

GitHub: https://github.com/sunke123/FW-Net
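
Since entry [14] is hard to picture from the abstract alone, here is a minimal sketch of the plain Frank-Wolfe algorithm that the network is said to unroll. It covers only a simplified special case chosen for illustration: minimizing 0.5 * ||x - b||^2 subject to ||x||_1 <= radius, with the standard step size 2 / (t + 2) and a fixed number of steps. The learned, truncated network from the paper is not reproduced, and `frank_wolfe_l1` is a hypothetical name.

```python
def frank_wolfe_l1(b, radius, num_steps):
    """Run num_steps Frank-Wolfe iterations inside the L1 ball."""
    x = [0.0] * len(b)                      # feasible starting point
    for t in range(num_steps):
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Linear minimization over the L1 ball: pick the vertex that
        # opposes the largest-magnitude gradient coordinate.
        i = max(range(len(grad)), key=lambda k: abs(grad[k]))
        vertex = [0.0] * len(b)
        vertex[i] = -radius if grad[i] > 0 else radius
        gamma = 2.0 / (t + 2.0)             # standard diminishing step size
        x = [(1.0 - gamma) * xi + gamma * vi for xi, vi in zip(x, vertex)]
    return x


x = frank_wolfe_l1([3.0, 0.5], radius=1.0, num_steps=20)
print(x)  # close to [1.0, 0.0], the projection of b onto the unit L1 ball
```

Each iterate is a convex combination of feasible points, so the constraint holds without any projection step. Running a fixed, small number of such steps is what makes the procedure easy to unroll into network layers.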

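[15] "Joint Event Detection and Description in Continuous Video Streams"

The paper proposes a joint event detection and description network (JEDDi-Net) that solves the dense video captioning task in an end-to-end fashion.

arXiv: https://arxiv.org/abs/1802.10250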

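[16] "IM2HEIGHT: Height Estimation from Single Monocular Imagery via Fully Residual Convolutional-Deconvolutional Network"

The paper proposes a fully convolutional-deconvolutional network architecture with residual learning that can be trained end to end to estimate height from a single monocular remote sensing image.

Note: Amazing! Estimating height from a single monocular remote sensing image! Bookmarking this one. It is somewhat similar to "Mono-Camera 3D Multi-Object Tracking Using Deep Learning Detections and PMBM Filtering" from the 2018-02-28 digest, which also estimates distance from monocular images.

arXiv: https://arxiv.org/abs/1802.10249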

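[17] "Neural Aesthetic Image Reviewer"

The paper proposes a model called the Neural Aesthetic Image Reviewer, which not only gives an aesthetic rating for an image but also generates a textual description explaining why the image leads to a plausible rating score. To improve performance across the different tasks, it proposes two multi-task architectures based on shared aesthetic semantic layers and task-specific embedding layers.

Note: I wonder what the model would output for one of my own lovely selfies? Off-the-charts good looks, I am sure!

arXiv: https://arxiv.org/abs/1802.10240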

[18] "Improved Explainability of Capsule Networks: Relevance Path by Agreement"

The paper studies and analyzes the structure and behavior of CapsNets and illustrates the potential explainability properties of such networks. In addition, the paper shows the possibility of transforming deep learning architectures into transparent networks via incorporation of capsules in different layers instead of convolution layers of the CNNs.

Note: Research on the famous CapsNets deserves a thumbs-up!

arXiv: https://arxiv.org/abs/1802.10204

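[19] "Brain Tumor Type Classification via Capsule Networks"

The paper uses CapsNets to classify a medical image dataset. Because CapsNets are robust to rotation and affine transformation and require far less training data, they may handle medical image datasets effectively, including brain magnetic resonance imaging (MRI) images. The results show that the method successfully overcomes the problems CNNs face in brain tumor classification.

Note: Applying CapsNets to a real-world problem, very cutting-edge!

arXiv: https://arxiv.org/abs/1802.10200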

[20] "Tell Me Where to Look: Guided Attention Inference Network"

CVPR 2018

In one common framework we address three shortcomings of previous approaches in modeling such attention maps: We (1) first time make attention maps an explicit and natural component of the end-to-end training, (2) provide self-guidance directly on these maps by exploring supervision from the network itself to improve them, and (3) seamlessly bridge the gap between using weak and extra supervision if available. Despite its simplicity, experiments on the semantic segmentation task demonstrate the effectiveness of our methods. We clearly surpass the state-of-the-art on Pascal VOC 2012 val. and test set. Besides, the proposed framework provides a way not only explaining the focus of the learner but also feeding back with direct guidance towards specific tasks. Under mild assumptions our method can also be understood as a plug-in to existing weakly supervised learners to improve their generalization performance.

Note: A weakly supervised learning problem, very cool!

arXiv: https://arxiv.org/abs/1802.10171