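[Paper Recommendations] Five Recent Papers on Visual Object Tracking: Recursive Neural Networks, Depth-Adaptive Computational Policies, a Visual Object Tracking Benchmark, Deep Kernelized Correlation Filters, and Detect-and-Track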

[Overview] The Zhuanzhi content team has compiled five recent papers on visual object tracking and introduces them below.

1. Learning Hierarchical Features for Visual Object Tracking with Recursive Neural Networks



Authors: Li Wang, Ting Liu, Bing Wang, Xulei Yang, Gang Wang

Abstract: Recently, deep learning has achieved very promising results in visual object tracking. Deep neural networks in existing tracking methods require a lot of training data to learn a large number of parameters. However, training data is not sufficient for visual object tracking as annotations of a target object are only available in the first frame of a test sequence. In this paper, we propose to learn hierarchical features for visual object tracking by using tree structure based Recursive Neural Networks (RNN), which have fewer parameters than other deep neural networks, e.g. Convolutional Neural Networks (CNN). First, we learn RNN parameters to discriminate between the target object and background in the first frame of a test sequence. Tree structure over local patches of an exemplar region is randomly generated by using a bottom-up greedy search strategy. Given the learned RNN parameters, we create two dictionaries regarding target regions and corresponding local patches based on the learned hierarchical features from both top and leaf nodes of multiple random trees. In each of the subsequent frames, we conduct sparse dictionary coding on all candidates to select the best candidate as the new target location. In addition, we online update two dictionaries to handle appearance changes of target objects. Experimental results demonstrate that our feature learning algorithm can significantly improve tracking performance on benchmark datasets.

Venue: arXiv, January 6, 2018

URL:

http://www.zhuanzhi.ai/document/c8ed971ddd77d456c1270db089240e13
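
The abstract above notes that a tree-structured recursive network needs fewer parameters than a CNN. A minimal sketch of why: one weight matrix is reused at every merge in the tree, so the parameter count does not grow with the number of patches. Everything here is an illustrative assumption rather than the authors' implementation: the names `compose` and `root_feature`, the feature size `d`, the random (untrained) weights, and merging adjacent nodes in a uniformly random order as a stand-in for the paper's tree-generation strategy.

```python
import math
import random


def compose(W, b, left, right):
    """Parent feature = tanh(W [left; right] + b); W and b are shared by every merge."""
    children = left + right  # list concatenation, length 2d
    return [math.tanh(sum(w * c for w, c in zip(row, children)) + bias)
            for row, bias in zip(W, b)]


def root_feature(leaves, W, b, rng):
    """Merge adjacent nodes in a random order until a single root vector remains."""
    nodes = list(leaves)
    while len(nodes) > 1:
        i = rng.randrange(len(nodes) - 1)  # pick an adjacent pair (assumption)
        nodes[i:i + 2] = [compose(W, b, nodes[i], nodes[i + 1])]
    return nodes[0]


rng = random.Random(0)
d = 4  # hypothetical feature size per patch
W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * d)] for _ in range(d)]
b = [0.0] * d

# Nine hypothetical local-patch features laid out in a row.
leaves = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(9)]
root = root_feature(leaves, W, b, rng)

# The parameter count depends only on d, not on how many patches are merged.
num_params = d * (2 * d) + d
```

Calling `root_feature` with different seeds yields different trees over the same patches, which is one simple way to obtain several random trees from a single exemplar region.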

2. Depth-Adaptive Computational Policies for Efficient Visual Tracking



Authors: Chris Ying, Katerina Fragkiadaki

Abstract: Current convolutional neural networks algorithms for video object tracking spend the same amount of computation for each object and video frame. However, it is harder to track an object in some frames than others, due to the varying amount of clutter, scene complexity, amount of motion, and object's distinctiveness against its background. We propose a depth-adaptive convolutional Siamese network that performs video tracking adaptively at multiple neural network depths. Parametric gating functions are trained to control the depth of the convolutional feature extractor by minimizing a joint loss of computational cost and tracking error. Our network achieves accuracy comparable to the state-of-the-art on the VOT2016 benchmark. Furthermore, our adaptive depth computation achieves higher accuracy for a given computational cost than traditional fixed-structure neural networks. The presented framework extends to other tasks that use convolutional neural networks and enables trading speed for accuracy at runtime.

Venue: arXiv, January 2, 2018

URL:

http://www.zhuanzhi.ai/document/b4cf6bf8987ce1aaeea88df664be1177
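
The abstract above describes gates trained on a joint loss of computational cost and tracking error. One common way to write such a loss, shown here only as a sketch, is the expected value of (error + weight × cost) under the probability of halting at each depth. The abstract does not give the paper's exact gate or loss, so the functions `stop_distribution` and `joint_loss`, the trade-off weight `lam`, and all numbers below are assumptions for illustration.

```python
def stop_distribution(gate_probs):
    """P(halt at depth d), given each gate's halting probability.

    The deepest layer always halts, so its gate value is ignored.
    """
    dist, keep_going = [], 1.0
    for depth, gate in enumerate(gate_probs):
        halt = 1.0 if depth == len(gate_probs) - 1 else gate
        dist.append(keep_going * halt)
        keep_going *= 1.0 - halt
    return dist


def joint_loss(gate_probs, errors, costs, lam):
    """Expected tracking error plus lam times expected computational cost."""
    dist = stop_distribution(gate_probs)
    return sum(p * (err + lam * cost)
               for p, err, cost in zip(dist, errors, costs))


# Hypothetical three-depth network: deeper layers are more accurate but cost more.
gates = [0.5, 0.5, 0.0]
errors = [0.4, 0.2, 0.1]   # made-up tracking error at each depth
costs = [1.0, 2.0, 3.0]    # made-up relative compute at each depth

loss = joint_loss(gates, errors, costs, lam=0.1)
```

Raising `lam` makes early exits cheaper relative to their extra error, which corresponds to the speed-for-accuracy trade mentioned at the end of the abstract.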

3. Long-Term Visual Object Tracking Benchmark



Authors: Abhinav Moudgil, Vineet Gandhi

Abstract: In this paper, we propose a new long video dataset (called Track Long and Prosper - TLP) and benchmark for visual object tracking. The dataset consists of 50 videos from real world scenarios, encompassing a duration of over 400 minutes (676K frames), making it more than 20 folds larger in average duration per sequence and more than 8 folds larger in terms of total covered duration, as compared to existing generic datasets for visual tracking. The proposed dataset paves a way to suitably assess long term tracking performance and possibly train better deep learning architectures (avoiding/reducing augmentation, which may not reflect realistic real world behavior). We benchmark the dataset on 17 state of the art trackers and rank them according to tracking accuracy and run time speeds. We further categorize the test sequences with different attributes and present a thorough quantitative and qualitative evaluation. Our most interesting observations are (a) existing short sequence benchmarks fail to bring out the inherent differences in tracking algorithms which widen up while tracking on long sequences and (b) the accuracy of most trackers abruptly drops on challenging long sequences, suggesting the potential need of research efforts in the direction of long term tracking.

Venue: arXiv, December 28, 2017

URL:

http://www.zhuanzhi.ai/document/fb89e63302d559deced080c7620e490b
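
The dataset figures quoted in the abstract above can be turned into rough per-sequence numbers. The arithmetic below treats "over 400 minutes" as exactly 400 and "676K" as exactly 676,000, so the results are approximations rather than figures reported by the authors.

```python
total_frames = 676_000   # "676K frames", taken as exact for this estimate
num_videos = 50
total_minutes = 400      # "over 400 minutes", taken as a lower bound

frames_per_video = total_frames / num_videos     # average sequence length in frames
minutes_per_video = total_minutes / num_videos   # average duration, at least this long

# Because the true duration exceeds 400 minutes, this is an upper estimate
# of the average frame rate.
implied_fps = total_frames / (total_minutes * 60)

print(f"{frames_per_video:.0f} frames and about {minutes_per_video:.0f} min per video, "
      f"roughly {implied_fps:.1f} fps")
```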

4. Tracking in Aerial Hyperspectral Videos using Deep Kernelized Correlation Filters



Authors: Burak Uzkent, Aneesh Rangnekar, Matthew J. Hoffman

Abstract: Hyperspectral imaging holds enormous potential to improve the state-of-the-art in aerial vehicle tracking with low spatial and temporal resolutions. Recently, adaptive multi-modal hyperspectral sensors, controlled by Dynamic Data Driven Applications Systems (DDDAS) methodology, have attracted growing interest due to their ability to record extended data quickly from the aerial platforms. In this study, we apply popular concepts from traditional object tracking - (1) Kernelized Correlation Filters (KCF) and (2) Deep Convolutional Neural Network (CNN) features - to the hyperspectral aerial tracking domain. Specifically, we propose the Deep Hyperspectral Kernelized Correlation Filter based tracker (DeepHKCF) to efficiently track aerial vehicles using an adaptive multi-modal hyperspectral sensor. We address low temporal resolution by designing a single KCF-in-multiple Regions-of-Interest (ROIs) approach to cover a reasonable large area. To increase the speed of deep convolutional features extraction from multiple ROIs, we design an effective ROI mapping strategy. The proposed tracker also provides flexibility to couple it to the more advanced correlation filter trackers. The DeepHKCF tracker performs exceptionally with deep features set up in a synthetic hyperspectral video generated by the Digital Imaging and Remote Sensing Image Generation (DIRSIG) software. Additionally, we generate a large, synthetic, single-channel dataset using DIRSIG to perform vehicle classification in the Wide Area Motion Imagery (WAMI) platform. This way, the high-fidelity of the DIRSIG software is proved and a large scale aerial vehicle classification dataset is released to support studies on vehicle detection and tracking in the WAMI platform.

Venue: arXiv, December 27, 2017

URL:

http://www.zhuanzhi.ai/document/04b73ae2f925a548b8cf690eb0932717
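
For readers unfamiliar with the correlation-filter idea that the abstract above builds on, here is a toy one-dimensional sketch with a linear kernel. It is not the DeepHKCF tracker: it has no deep features, no hyperspectral bands, and no multiple ROIs. The helper names, the decaying toy template, and the regularization value `lam` are assumptions, and a naive O(n²) DFT is used so the block depends only on the standard library.

```python
import cmath
import math


def dft(v):
    """Naive unnormalized discrete Fourier transform."""
    n = len(v)
    return [sum(v[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def idft(v_hat):
    """Inverse of dft, including the 1/n normalization."""
    n = len(v_hat)
    return [sum(v_hat[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]


def cyclic_shift(v, s):
    """Shift v right by s positions with wrap-around."""
    n = len(v)
    return [v[(m - s) % n] for m in range(n)]


def gaussian_labels(n, sigma=1.0):
    """Regression targets that peak at zero shift and are symmetric under wrap-around."""
    return [math.exp(-0.5 * (min(m, n - m) / sigma) ** 2) for m in range(n)]


def train(x, y, lam=1e-3):
    """Ridge regression over all cyclic shifts of x, solved per frequency."""
    k_hat = [abs(c) ** 2 for c in dft(x)]  # spectrum of the linear-kernel autocorrelation
    return [yh / (kh + lam) for yh, kh in zip(dft(y), k_hat)]


def detect(alpha_hat, x, z):
    """Response for every candidate displacement of the new signal z."""
    kxz_hat = [xc.conjugate() * zc for xc, zc in zip(dft(x), dft(z))]
    response = idft([a.conjugate() * k for a, k in zip(alpha_hat, kxz_hat)])
    return [r.real for r in response]


template = [0.5 ** m for m in range(16)]   # toy 1-D appearance signal (assumption)
alpha_hat = train(template, gaussian_labels(len(template)))

moved = cyclic_shift(template, 5)          # simulate the target moving by 5 samples
response = detect(alpha_hat, template, moved)
estimated_shift = max(range(len(response)), key=response.__getitem__)
```

Because all cyclic shifts of the template form a circulant system, training and detection reduce to element-wise operations in the Fourier domain. That is the source of the speed of correlation-filter trackers, and a real implementation would replace the naive DFT with an FFT over 2-D feature maps.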

5. Detect-and-Track: Efficient Pose Estimation in Videos



Authors: Rohit Girdhar, Georgia Gkioxari, Lorenzo Torresani, Manohar Paluri, Du Tran

Abstract: This paper addresses the problem of estimating and tracking human body keypoints in complex, multi-person video. We propose an extremely lightweight yet highly effective approach that builds upon the latest advancements in human detection and video understanding. Our method operates in two-stages: keypoint estimation in frames or short clips, followed by lightweight tracking to generate keypoint predictions linked over the entire video. For frame-level pose estimation we experiment with Mask R-CNN, as well as our own proposed 3D extension of this model, which leverages temporal information over small clips to generate more robust frame predictions. We conduct extensive ablative experiments on the newly released multi-person video pose estimation benchmark, PoseTrack, to validate various design choices of our model. Our approach achieves an accuracy of 55.2% on the validation and 51.8% on the test set using the Multi-Object Tracking Accuracy (MOTA) metric, and achieves state of the art performance on the ICCV 2017 PoseTrack keypoint tracking challenge.

Venue: arXiv, December 26, 2017

URL:

http://www.zhuanzhi.ai/document/98ca61d328d8eb3633ea8010f75b0a5d
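
The MOTA scores quoted in the abstract above follow the usual CLEAR MOT definition: one minus the ratio of all errors (misses, false positives, identity switches) to the number of ground-truth objects. The sketch below uses that standard formula with made-up counts. How PoseTrack aggregates the score across keypoints is not stated in the abstract and is not modelled here.

```python
def mota(num_gt, false_negatives, false_positives, id_switches):
    """Multi-Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT.

    All counts are summed over every frame of the evaluated video(s).
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Hypothetical counts, not taken from the paper.
score = mota(num_gt=1000, false_negatives=250, false_positives=120, id_switches=30)
print(f"MOTA = {score:.1%}")
```

MOTA has no lower bound: a tracker that produces more errors than there are ground-truth objects receives a negative score.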

This article was shared from the WeChat official account Zhuanzhi (Quan_Zhuanzhi). Compiled by the Zhuanzhi content team.

Originally published: 2018-01-23
