
[Computer Vision Paper Digest] 2018-03-05

Amusi
Published 2018-04-12 09:59:39
Included in the column: CVer

Note: this post covers 16 papers, spanning object detection, image segmentation, style transfer, GANs, and other topics.

[1]《Hashing with Mutual Information》

2018 IEEE PAMI

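Abstract: The paper proposes a supervised hashing method based on optimizing an information-theoretic quantity, mutual information. It shows that optimizing mutual information can reduce ambiguity in the neighborhood structure induced in the learned Hamming space, which is essential for high retrieval performance. To this end, the paper optimizes mutual information in deep neural networks with minibatch stochastic gradient descent, using a formulation that maximally and efficiently exploits the available supervision.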

arXiv: https://arxiv.org/abs/1803.00974
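
To make the quantity in [1] concrete, here is a minimal sketch that estimates the mutual information between Hamming distance to a query and a binary neighbor label. It only evaluates the quantity on toy data; it does not attempt the paper's training procedure. The function names and the toy codes are assumptions made for illustration.

```python
import math
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def mutual_information(samples):
    """I(D; C) in bits, estimated from (distance, is_neighbor) samples."""
    n = len(samples)
    joint = Counter(samples)
    dist_counts = Counter(d for d, _ in samples)
    label_counts = Counter(c for _, c in samples)
    mi = 0.0
    for (d, c), count in joint.items():
        p_joint = count / n
        p_indep = (dist_counts[d] / n) * (label_counts[c] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi

def query_mi(query, database):
    """Mutual information for one query against (code, is_neighbor) entries."""
    return mutual_information([(hamming(query, code), nb) for code, nb in database])

# Toy data (made up): the query and two small databases of 4-bit codes.
query = (0, 0, 0, 0)
separated_db = [((0, 0, 0, 0), True), ((0, 0, 0, 1), True),
                ((1, 1, 1, 0), False), ((1, 1, 1, 1), False)]
ambiguous_db = [((0, 0, 0, 1), True), ((0, 0, 1, 0), False),
                ((1, 1, 1, 0), True), ((1, 1, 0, 1), False)]

mi_separated = query_mi(query, separated_db)
mi_ambiguous = query_mi(query, ambiguous_db)
```

When neighbors and non-neighbors sit at disjoint distances, the distance fully determines the label and the estimate reaches the label entropy. When each distance is shared equally by both groups, the estimate drops to zero, which is the ambiguity the abstract refers to.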

[2]《Tree Species Identification from Bark Images Using Convolutional Neural Networks》

Under review for IROS 2018

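Abstract: In this work, we present a new dataset of more than 23,000 high-resolution bark images covering 23 different species, and use deep learning to establish a benchmark. We obtain an accuracy of 93.88%, and show that majority voting over all images of a tree can reach 97.81%. Our experiments also indicate that collecting a large number of trees matters more than collecting a large number of images, and that the images of a single tree should be taken at different locations.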

Note: Oh! Bark recognition. Suddenly the detection work in my own area looks clearer too!

arXiv: https://arxiv.org/abs/1803.00949
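
The majority-voting step mentioned in [2] can be sketched as follows. This is a minimal illustration, assuming a classifier has already produced one label per image; the tree identifiers and species labels are made up.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the per-image predictions of one tree."""
    if not predictions:
        raise ValueError("need at least one prediction")
    # Counter.most_common keeps first-seen order among labels with equal counts.
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical per-image predictions grouped by tree.
per_tree_predictions = {
    "tree_a": ["oak", "oak", "birch", "oak"],
    "tree_b": ["pine", "spruce", "spruce"],
}
tree_labels = {tree: majority_vote(preds)
               for tree, preds in per_tree_predictions.items()}
```

A single misclassified image no longer decides the label of the tree, which is why the per-tree accuracy can exceed the per-image accuracy.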

[3]《Protecting JPEG Images Against Adversarial Attacks》

Accepted to IEEE Data Compression Conference

Abstract: These adversarial attacks make imperceptible modifications to an image that fool DNN classifiers. We present an adaptive JPEG encoder which defends against many of these attacks. Experimentally, we show that our method produces images with high visual quality while greatly reducing the potency of state-of-the-art attacks. Our algorithm requires only a modest increase in encoding time, produces a compressed image which can be decompressed by an off-the-shelf JPEG decoder, and classified by an unmodified classifier. (Quoting the original here seems more appropriate.)

Note: JPEG, good stuff!

arXiv: https://arxiv.org/abs/1803.00940

[4]《Multi-Instance Dynamic Ordinal Random Fields for Weakly-supervised Facial Behavior Analysis》

submitted TIP (June 2017)

Abstract: We propose a Multi-Instance-Learning (MIL) approach for weakly-supervised learning problems, where a training set is formed by bags (sets of feature vectors or instances) and only labels at bag-level are provided. Specifically, we consider the Multi-Instance Dynamic-Ordinal-Regression (MI-DOR) setting, where the instance labels are naturally represented as ordinal variables and bags are structured as temporal sequences. To this end, we propose Multi-Instance Dynamic Ordinal Random Fields (MI-DORF). (Not being lazy; quoting the original here seems more appropriate.)

Note: Multi-Instance Learning (MIL), an interesting research direction!

arXiv: https://arxiv.org/abs/1803.00907

[5]《Monocular Depth Estimation using Multi-Scale Continuous CRFs as Sequential Deep Networks》

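Abstract: The paper addresses monocular depth estimation from a single still image. Inspired by the effectiveness of recent work on multi-scale convolutional neural networks (CNNs), it proposes a deep model that fuses complementary information from multiple CNN side outputs. Unlike earlier methods that use concatenation or weighted averaging, the integration is obtained through continuous Conditional Random Fields (CRFs).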

Note: I think I should spend a day organizing the papers on depth estimation from monocular images (papers in this area show up very often).

arXiv: https://arxiv.org/abs/1803.00891

[6]《Pose-Robust Face Recognition via Deep Residual Equivariant Mapping》

Accepted to CVPR 2018. The Chinese University of Hong Kong and SenseTime

Abstract: In this study, we hypothesize that there is an inherent mapping between frontal and profile faces, and consequently, their discrepancy in the deep representation space can be bridged by an equivariant mapping. To exploit this mapping, we formulate a novel Deep Residual EquivAriant Mapping (DREAM) block, which is capable of adaptively adding residuals to the input deep representation to transform a profile face representation to a canonical pose that simplifies recognition. The DREAM block consistently enhances the performance of profile face recognition for many strong deep networks, including ResNet models, without deliberately augmenting training data of profile faces. The block is easy to use, light-weight, and can be implemented with a negligible computational overhead. (A paper this good, does it really need my translation?)

Note: Oh, profile face recognition, impressive! Cheers for SenseTime!

arXiv: https://arxiv.org/abs/1803.00839

[7]《Deep Unsupervised Intrinsic Image Decomposition by Siamese Training》

Abstract: We harness modern intrinsic decomposition tools based on deep learning to increase their applicability on real-world use cases. Traditional techniques are derived from the Retinex theory: handmade prior assumptions constrain an optimization to yield a unique solution that is qualitatively satisfying on a limited set of examples. Modern techniques based on supervised deep learning leverage large-scale databases that are usually synthetic or sparsely annotated. Decomposition quality on images in the wild is therefore arguable. We propose an end-to-end deep learning solution that can be trained without any ground truth supervision, as this is hard to obtain. Time-lapses form an ubiquitous source of data that (under a scene staticity assumption) capture a constant albedo under varying shading conditions. We exploit this natural relationship to train in an unsupervised siamese manner on image pairs. Yet, the trained network applies to single images at inference time. We present a new dataset to demonstrate our siamese training on, and reach results that compete with the state of the art, despite the unsupervised nature of our training scheme. (Honestly, I don't understand it.)

arXiv: https://arxiv.org/abs/1803.00805

[8]《Automated Map Reading: Image Based Localisation in 2-D Maps Using Binary Semantic Descriptors》

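Abstract: The paper describes a novel approach to image based localisation in urban environments using semantic matching between images and a 2-D map. This contrasts with the vast majority of existing approaches, which match against image databases. We use highly compact binary descriptors to represent semantic features at locations, which greatly improves scalability compared with existing methods and has the potential for greater invariance to variable imaging conditions.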

arXiv: https://arxiv.org/abs/1803.00788

[9]《Contained Neural Style Transfer for Decorated Logo Generation》

Accepted by DAS2018

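Abstract: We propose using neural style transfer with clip art and text to create new and genuine logos. The paper introduces a new loss function based on the distance transform of the input image, which preserves the silhouettes of text and objects. The proposed method confines the style transfer to a designated area only.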

Note: Applying style transfer to logos, how interesting!

arXiv: https://arxiv.org/abs/1803.00686

[10]《SD-CNN: a Shallow-Deep CNN for Improved Breast Cancer Diagnosis》

Abstract: In this research, we propose a Shallow-Deep Convolutional Neural Network (SD-CNN) where a shallow CNN is developed to derive "virtual" recombined images from LE images, and a deep CNN is employed to extract novel features from LE, recombined or "virtual" recombined images for ensemble models to classify the cases as benign vs. cancer.

arXiv: https://arxiv.org/abs/1803.00663

[11]《Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift》

Accepted to CVPR2018

Abstract: In this paper, we propose a deep cocktail network (DCTN) to battle the domain and category shifts among multiple sources. We evaluate DCTN in three domain adaptation benchmarks, which clearly demonstrate the superiority of our framework. (Excerpted from the abstract.)

Note: A very hardcore paper, recommended reading, even though I can't even follow the abstract!

arXiv: https://arxiv.org/abs/1803.00830

[12]《Driving Digital Rock towards Machine Learning: predicting permeability with Gradient Boosting and Deep Neural Networks》

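Abstract: The paper presents a study testing the applicability of machine learning techniques to predicting the permeability of digitized rock samples. We prepare a training set containing 3D images of sandstone samples imaged with X-ray microtomography, together with the corresponding permeability values simulated with a pore network approach. We also use Minkowski functionals and deep-learning-based descriptors of 3D images and 2D slices as input features for training the predictive models and for prediction. We compare the predictive power of various feature sets and methods; the latter include gradient boosting and various architectures of deep neural networks (DNN). The results demonstrate the applicability of machine learning to image-based permeability prediction and open a new area of Digital Rock research.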

Note: Digital Rock? Interesting research!

arXiv: https://arxiv.org/abs/1803.00758

[13]《Deep-neural-network based sinogram synthesis for sparse-view CT image reconstruction》

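Abstract: The paper introduces a deep-neural-network-based sinogram synthesis method for sparse-view CT, and shows that it outperforms existing interpolation methods and iterative image reconstruction approaches.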

arXiv: https://arxiv.org/abs/1803.00694

[14]《Meta-Learning for Semi-Supervised Few-Shot Classification》

Published as a conference paper at ICLR 2018

Abstract: In few-shot classification, we are interested in learning algorithms that train a classifier from only a handful of labeled examples. Recent progress in few-shot classification has featured meta-learning, in which a parameterized model for a learning algorithm is defined and trained on episodes representing different classification problems, each with a small labeled training set and its corresponding test set. In this work, we advance this few-shot classification paradigm towards a scenario where unlabeled examples are also available within each episode. We consider two situations: one where all unlabeled examples are assumed to belong to the same set of classes as the labeled examples of the episode, as well as the more challenging situation where examples from other distractor classes are also provided. To address this paradigm, we propose novel extensions of Prototypical Networks (Snell et al., 2017) that are augmented with the ability to use unlabeled examples when producing prototypes. These models are trained in an end-to-end way on episodes, to learn to leverage the unlabeled examples successfully. We evaluate these methods on versions of the Omniglot and miniImageNet benchmarks, adapted to this new framework augmented with unlabeled examples. We also propose a new split of ImageNet, consisting of a large set of classes, with a hierarchical structure. Our experiments confirm that our Prototypical Networks can learn to improve their predictions due to unlabeled examples, much like a semi-supervised algorithm would. (A hardcore paper, so no translation!)

Note: Few-shot learning, a very interesting research direction!

arXiv: https://arxiv.org/abs/1803.00676
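
The abstract of [14] says the extended Prototypical Networks use unlabeled examples when producing prototypes, without spelling out how. One plausible reading, sketched here as an assumption rather than the paper's exact procedure, is a single soft k-means step: start from the mean of the labeled embeddings, softly assign each unlabeled embedding to the prototypes, and recompute the weighted means. The embeddings are taken as given (no embedding network), and all names and numbers are hypothetical.

```python
import math

def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def soft_assign(point, prototypes):
    """Softmax over negative squared distances to each prototype."""
    logits = [-sq_dist(point, p) for p in prototypes]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def refine_prototypes(support, unlabeled):
    """support: dict label -> list of embeddings; unlabeled: list of embeddings."""
    labels = sorted(support)
    protos = [mean(support[c]) for c in labels]
    weights = [soft_assign(u, protos) for u in unlabeled]
    refined = {}
    for k, c in enumerate(labels):
        dim = len(protos[k])
        # Labeled examples count with weight 1, unlabeled ones with soft weight.
        num = [sum(v[i] for v in support[c]) for i in range(dim)]
        den = float(len(support[c]))
        for u, w in zip(unlabeled, weights):
            num = [acc + w[k] * x for acc, x in zip(num, u)]
            den += w[k]
        refined[c] = [acc / den for acc in num]
    return refined

# Toy episode (made up): one labeled embedding per class, two unlabeled ones.
support = {"a": [[0.0, 0.0]], "b": [[4.0, 0.0]]}
unlabeled = [[1.0, 0.0], [3.0, 0.0]]
refined = refine_prototypes(support, unlabeled)
```

In this toy episode each prototype moves toward the unlabeled point nearest to it, which is the kind of benefit from unlabeled data the abstract describes. Handling distractor classes would need an extra mechanism that this sketch does not include.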

[15]《Semi-parametric Topological Memory for Navigation》

Published as a conference paper at ICLR 2018

Abstract: We introduce a new memory architecture for navigation in previously unseen environments, inspired by landmark-based navigation in animals. The proposed semi-parametric topological memory (SPTM) consists of a (non-parametric) graph with nodes corresponding to locations in the environment and a (parametric) deep network capable of retrieving nodes from the graph based on observations. The graph stores no metric information, only connectivity of locations corresponding to the nodes. We use SPTM as a planning module in a navigation system. Given only 5 minutes of footage of a previously unseen maze, an SPTM-based navigation agent can build a topological map of the environment and use it to confidently navigate towards goals. The average success rate of the SPTM agent in goal-directed navigation across test environments is higher than the best-performing baseline by a factor of three. A video of the agent is available at https://youtu.be/vRF7f4lhswo.

Note: Keyword: NAVIGATION

arXiv: https://arxiv.org/abs/1803.00653
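
Since the graph in [15] stores only connectivity, planning over it reduces to finding a path with the fewest hops. The sketch below uses breadth-first search for that; the choice of BFS is an assumption, and the abstract does not say which search the authors use. The retrieval network is left out: node ids for the current location and the goal are assumed to be known, and the adjacency list is made up.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Fewest-hop path from start to goal in an unweighted graph, or None."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue  # already reached by a path that is no longer
            parents[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

# Hypothetical topological map: nodes are locations, edges mean "reachable".
maze = {
    0: [1],
    1: [0, 2, 4],
    2: [1, 3],
    3: [2, 4],
    4: [1, 3],
    5: [],  # a location with no recorded connections
}
route = shortest_path(maze, 0, 3)
no_route = shortest_path(maze, 0, 5)
```

An agent would then head for the next node on the route and re-localize after each step; that control loop is outside the scope of this sketch.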

[16]《Fast and accurate computation of orthogonal moments for texture analysis》

Abstract: In this work we propose a fast and stable algorithm for the computation of the orthogonal moments of an image. Indeed, the traditional orthogonal moments formulations are characterized by a high discriminative power, but also by a large computational complexity, which limits their real-time application. The recursive approach described in this paper aims to solve these limitations. In our experiments, we evaluate the effectiveness of the recursive formulations and its performance for the reconstruction task. The results show a great reduction in the computational complexity with respect to the closed form formulation, together with a greater accuracy in reconstruction. Then, in order to assess and compare the accuracy of the computed moments in texture analysis, we perform classification experiments on six well-known databases of texture images. Again, the recursive formulation performs better in classification than the closed form representation. More importantly, if computed from the GLCM of the image, the new moments outperform significantly some of the most diffused state-of-the-art descriptors for texture classification.

Note: I don't really understand orthogonal moments!

arXiv: https://arxiv.org/abs/1803.00638
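
To illustrate the contrast in [16] between recursive and closed-form formulations, the sketch below evaluates one orthogonal polynomial family both ways. Legendre polynomials are used purely as an assumption for illustration, since the abstract does not name the moment families or the recurrences the paper uses. The code evaluates the polynomial basis only, not full image moments.

```python
import math

def legendre_recursive(max_order, x):
    """P_0(x)..P_max_order(x) using (n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}."""
    values = [1.0, x]
    for n in range(1, max_order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[:max_order + 1]

def legendre_closed_form(n, x):
    """P_n(x) from the explicit sum with binomial coefficients."""
    total = 0.0
    for k in range(n // 2 + 1):
        total += ((-1) ** k * math.comb(n, k) * math.comb(2 * n - 2 * k, n)
                  * x ** (n - 2 * k))
    return total / 2 ** n

x = 0.3
recursive_values = legendre_recursive(8, x)
closed_values = [legendre_closed_form(n, x) for n in range(9)]
```

The recurrence reuses the two previous orders, so all orders up to `max_order` cost a handful of multiplications each. The closed form recomputes binomial coefficients and powers for every order, which hints at why a recursive formulation can be cheaper.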

--------divider--------

Hehe, I was a bit lazy and left many abstracts untranslated. Please forgive me!

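This article is part of the Tencent Cloud self-media syndication program and is shared from a WeChat official account.
Originally published: 2018-03-05. In case of infringement, please contact cloudcommunity@tencent.com for removal.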

This article is shared from the CVer WeChat official account.
