Robotics Academic Digest [6.24]

WeChat Official Account: arXiv Daily Academic Digest

Published 2021-07-02 18:19:44


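Visit www.arxivdaily.com for digests with abstracts, covering CS, physics, mathematics, economics, statistics, finance, biology, and electrical engineering, plus search, bookmarking, and posting features.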

cs.RO (Robotics): 10 papers in total

【1】 Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation

Authors: Stephen James, Kentaro Wada, Tristan Laidlow, Andrew J. Davison
Affiliation: Dyson Robotics Lab, Imperial College London
Note: Videos and code found at this https URL
Link: https://arxiv.org/abs/2106.12534
Abstract: Reflecting on the last few years, the biggest breakthroughs in deep reinforcement learning (RL) have been in the discrete action domain. Robotic manipulation, however, is inherently a continuous control environment, and continuous control reinforcement learning algorithms often depend on actor-critic methods that are sample-inefficient and inherently difficult to train, due to the joint optimisation of the actor and critic. To that end, we explore how we can bring the stability of discrete action RL algorithms to the robot manipulation domain. We extend the recently released ARM algorithm by replacing the continuous next-best pose agent with a discrete next-best pose agent. Discretisation of rotation is trivial given its bounded nature, while translation is inherently unbounded, making discretisation difficult. We formulate the translation prediction as a voxel prediction problem by discretising the 3D space; however, voxelisation of a large workspace is memory intensive and would not work with a high density of voxels, which is crucial to obtaining the resolution needed for robotic manipulation. We therefore propose to apply this voxel prediction in a coarse-to-fine manner by gradually increasing the resolution. In each step, we extract the highest valued voxel as the predicted location, which is then used as the centre of the higher-resolution voxelisation in the next step. This coarse-to-fine prediction is applied over several steps, giving a near-lossless prediction of the translation. We show that our new coarse-to-fine algorithm is able to accomplish RLBench tasks much more efficiently than the continuous control equivalent, and even train some real-world tasks, tabula rasa, in less than 7 minutes, with only 3 demonstrations. Moreover, we show that by moving to a voxel representation, we are able to easily incorporate observations from multiple cameras.
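The coarse-to-fine step described in this abstract (pick the highest-valued voxel, then re-voxelise around it at higher resolution) can be illustrated with a small sketch. This is not the authors' implementation: the function name `coarse_to_fine_argmax`, the scoring function `toy_q`, the grid size, and the choice to shrink the cube to exactly one voxel per level are all assumptions made for illustration. In the paper the values come from a learned Q-function, not a closed-form score.

```python
import numpy as np


def coarse_to_fine_argmax(q_fn, centre, extent, grid=8, depth=3):
    """Hypothetical sketch: voxelise a cube, keep the best voxel, zoom in, repeat.

    q_fn maps an array of 3D points with shape (..., 3) to scores with shape (...).
    Returns the final predicted location and the edge length of the last voxel.
    """
    centre = np.asarray(centre, dtype=float)
    for _ in range(depth):
        voxel = extent / grid
        # Voxel centres along one axis, expressed relative to the cube centre.
        offsets = (np.arange(grid) + 0.5) * voxel - extent / 2.0
        xs, ys, zs = np.meshgrid(offsets, offsets, offsets, indexing="ij")
        points = centre + np.stack([xs, ys, zs], axis=-1)  # (grid, grid, grid, 3)
        values = q_fn(points)                               # (grid, grid, grid)
        best = np.unravel_index(np.argmax(values), values.shape)
        centre = points[best]  # highest-valued voxel becomes the next centre
        extent = voxel         # assumption: the next cube spans just that voxel
    return centre, extent


def toy_q(points, target=np.array([0.31, -0.12, 0.47])):
    """Stand-in score (assumption): higher when closer to a fixed target."""
    return -np.sum((points - target) ** 2, axis=-1)


location, resolution = coarse_to_fine_argmax(toy_q, centre=[0.0, 0.0, 0.0], extent=1.0)
```

With these assumed settings, three levels of an 8x8x8 grid evaluate 3 x 512 = 1,536 voxels, whereas a single grid at the same final resolution would need 512^3 = 134,217,728, which is the memory argument the abstract makes.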

【2】 Decentralized Spatial-Temporal Trajectory Planning for Multicopter Swarms

Authors: Xin Zhou, Zhepei Wang, Xiangyong Wen, Jiangchao Zhu, Chao Xu, Fei Gao
Affiliation: Zhejiang University; all authors are with the State Key Laboratory of Industrial Control Technology, China, and Huzhou Institute
Note: A technical report from FAST Lab, Zhejiang University
Link: https://arxiv.org/abs/2106.12481
Abstract: Multicopter swarms with a decentralized structure are flexible and robust by nature, while efficient spatial-temporal trajectory planning still remains a challenge. This report introduces decentralized spatial-temporal trajectory planning, which brings a well-formed trajectory representation named MINCO into multi-agent scenarios. Our method ensures high-quality local planning for each agent subject to any constraint from either the coordination of the swarm or safety requirements in cluttered environments. Then, the local trajectory generation is formulated as an unconstrained optimization problem that is efficiently solved in milliseconds. Moreover, a decentralized asynchronous mechanism is designed to trigger the local planning for each agent. A systematic solution is presented with detailed descriptions of careful engineering considerations. Extensive benchmarks and indoor/outdoor experiments validate its wide applicability and high quality. Our software will be released for the reference of the community.

【3】 Formalizing the Execution Context of Behavior Trees for Runtime Verification of Deliberative Policies

Authors: Michele Colledanchise, Giuseppe Cicala, Daniele E. Domenichelli, Lorenzo Natale, Armando Tacchella
Affiliation: Giuseppe Cicala and Armando Tacchella are with Università degli Studi di Genova
Link: https://arxiv.org/abs/2106.12474
Abstract: Our research aims to enable automated property verification of deliberative components in robot control architectures. We focus on a formalization of the execution context of Behavior Trees (BTs) to provide a scalable, yet formally grounded, methodology to enable runtime verification and prevent unexpected robot behaviors from hampering deployment. To this end, we consider a message-passing model that accommodates both synchronous and asynchronous composition of parallel components, in which BTs and other components execute and interact according to the communication patterns commonly adopted in robotic software architectures. We introduce a formal property specification language to encode requirements and build runtime monitors. We performed a set of experiments both in simulation and on the real robot, demonstrating the feasibility of our approach in a realistic application, and its integration in a typical robot software architecture. We also provide an OS-level virtualization environment to reproduce the experiments in the simulated scenario.

【4】 Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers

Authors: Apratim Bhattacharyya, Daniel Olmeda Reino, Mario Fritz, Bernt Schiele
Note: To appear at CVPR 2021
Link: https://arxiv.org/abs/2106.12442
Abstract: Accurate prediction of pedestrian and bicyclist paths is integral to the development of reliable autonomous vehicles in dense urban environments. The interactions between vehicle and pedestrian or bicyclist have a significant impact on the trajectories of traffic participants, e.g. stopping or turning to avoid collisions. Although recent datasets and trajectory prediction approaches have fostered the development of autonomous vehicles, the amount of vehicle-pedestrian (bicyclist) interactions modeled is sparse. In this work, we propose Euro-PVI, a dataset of pedestrian and bicyclist trajectories. In particular, our dataset caters to more diverse and complex interactions in dense urban scenarios compared to the existing datasets. To address the challenges in predicting future trajectories with dense interactions, we develop a joint inference model that learns an expressive multi-modal shared latent space across agents in the urban scene. This enables our Joint-$\beta$-cVAE approach to better model the distribution of future trajectories. We achieve state-of-the-art results on the nuScenes and Euro-PVI datasets, demonstrating the importance of capturing interactions between ego-vehicle and pedestrians (bicyclists) for accurate predictions.

【5】 Uncertainty-Aware Model-Based Reinforcement Learning with Application to Autonomous Driving

Authors: Jingda Wu, Zhiyu Huang, Chen Lv
Affiliation: Member, IEEE, A
Link: https://arxiv.org/abs/2106.12194
Abstract: To further improve the learning efficiency and performance of reinforcement learning (RL), in this paper we propose a novel uncertainty-aware model-based RL (UA-MBRL) framework, and then implement and validate it in autonomous driving under various task scenarios. First, an action-conditioned ensemble model capable of uncertainty assessment is established as the virtual environment model. Then, a novel uncertainty-aware model-based RL framework is developed based on the adaptive truncation approach, providing virtual interactions between the agent and the environment model, and improving RL's training efficiency and performance. The developed algorithms are then implemented in end-to-end autonomous vehicle control tasks, and validated and compared with state-of-the-art methods under various driving scenarios. The validation results suggest that the proposed UA-MBRL method surpasses the existing model-based and model-free RL approaches in terms of learning efficiency and achieved performance. The results also demonstrate the adaptiveness and robustness of the proposed method under various autonomous driving scenarios.
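One plausible reading of "ensemble model with uncertainty assessment" plus "truncation" is sketched below: several learned dynamics models predict the next state, their disagreement serves as an uncertainty estimate, and a virtual rollout is cut short once that estimate exceeds a threshold. The abstract does not specify the actual UA-MBRL rule, so `truncated_rollout`, the standard-deviation disagreement measure, the fixed threshold, and the toy linear models are all assumptions.

```python
import numpy as np


def truncated_rollout(models, policy, state, max_steps=20, threshold=0.1):
    """Hypothetical sketch: roll out in an ensemble model until it becomes unsure.

    models: list of callables (state, action) -> predicted next state.
    policy: callable state -> action.
    Returns a list of (next_state, action, uncertainty) tuples.
    """
    state = np.asarray(state, dtype=float)
    trajectory = []
    for _ in range(max_steps):
        action = policy(state)
        predictions = np.stack([m(state, action) for m in models])
        # Assumed uncertainty measure: largest per-dimension std across members.
        uncertainty = float(predictions.std(axis=0).max())
        if uncertainty > threshold:
            break  # stop generating virtual experience the ensemble disagrees on
        state = predictions.mean(axis=0)
        trajectory.append((state, action, uncertainty))
    return trajectory


def make_toy_model(bias):
    """Toy dynamics (assumption): members disagree more as |state| grows."""
    return lambda s, a: s + a + bias * s


toy_models = [make_toy_model(b) for b in (-0.05, 0.0, 0.05)]
toy_policy = lambda s: np.full_like(s, 0.5)
virtual_steps = truncated_rollout(toy_models, toy_policy, state=[0.0])
```

In this toy setup the rollout ends early because disagreement grows with the state magnitude; a real system would use learned neural network dynamics and may adapt the cut-off during training.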

【6】 Collaborative Visual Inertial SLAM for Multiple Smart Phones

Authors: Jialing Liu, Ruyu Liu, Kaiqi Chen, Jianhua Zhang, Dongyan Guo
Affiliation: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China; School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
Note: 6 pages, 4 figures, ICRA 2021
Link: https://arxiv.org/abs/2106.12186
Abstract: The efficiency and accuracy of mapping are crucial in large-scene and long-term AR applications. Multi-agent cooperative SLAM is the precondition of multi-user AR interaction. The cooperation of multiple smart phones has the potential to improve the efficiency and robustness of task completion and can complete tasks that a single agent cannot do. However, it depends on robust communication, efficient location detection, robust mapping, and efficient information sharing among agents. We propose a multi-intelligence collaborative monocular visual-inertial SLAM deployed on multiple iOS mobile devices with a centralized architecture. Each agent can independently explore the environment, run a visual-inertial odometry module online, and then send all the measurement information to a central server with higher computing resources. The server manages all the information received, detects overlapping areas, merges and optimizes the map, and shares information with the agents when needed. We have verified the performance of the system on public datasets and in real environments. The accuracy of mapping and fusion of the proposed system is comparable to VINS-Mono, which requires higher computing resources.

【7】 Prevention and Resolution of Conflicts in Social Navigation -- a Survey

Authors: Reuth Mirsky, Xuesu Xiao, Justin Hart, Peter Stone
Link: https://arxiv.org/abs/2106.12113
Abstract: With the approaching goal of having robots collaborate in shared human-robot environments, navigation in this context becomes both crucial and desirable. Recent developments in robotics have encountered and tackled some of the challenges of navigating in mixed human-robot environments, and in recent years we observe a surge of related work that specifically targets the question of how to handle conflicts between agents in social navigation. These contributions offer models, algorithms, and evaluation metrics; however, as this research area is inherently interdisciplinary, many of the relevant papers are not comparable and there is no standard vocabulary among researchers. The main goal of this survey is to bridge this gap by proposing such a common language, using it to survey existing work, and highlighting open problems. It starts by defining a conflict in social navigation, and offers a detailed taxonomy of its components. This survey then maps existing work while discussing papers using the framing of the proposed taxonomy. Finally, this paper proposes some future directions and problems that are currently at the frontier of social navigation to help focus research efforts.

【8】 Bregman Gradient Policy Optimization

Authors: Feihu Huang, Shangqian Gao, Heng Huang
Affiliation: Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, USA
Note: 18 pages, 3 pages
Link: https://arxiv.org/abs/2106.12112
Abstract: In this paper, we design a novel Bregman gradient policy optimization framework for reinforcement learning based on Bregman divergences and momentum techniques. Specifically, we propose a Bregman gradient policy optimization (BGPO) algorithm based on the basic momentum technique and mirror descent iteration. At the same time, we present an accelerated Bregman gradient policy optimization (VR-BGPO) algorithm based on a momentum variance-reduced technique. Moreover, we introduce a convergence analysis framework for our Bregman gradient policy optimization under the nonconvex setting. Specifically, we prove that BGPO achieves a sample complexity of $\tilde{O}(\epsilon^{-4})$ for finding an $\epsilon$-stationary point while requiring only one trajectory at each iteration, and VR-BGPO reaches the best known sample complexity of $\tilde{O}(\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which also requires only one trajectory at each iteration. In particular, by using different Bregman divergences, our methods unify many existing policy optimization algorithms and their new variants, such as the existing (variance-reduced) policy gradient algorithms and (variance-reduced) natural policy gradient algorithms. Extensive experimental results on multiple reinforcement learning tasks demonstrate the efficiency of our new algorithms.
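To make "mirror descent iteration" with a Bregman divergence concrete, here is a generic textbook sketch, not the BGPO algorithm itself. It assumes a toy bandit where the policy is a probability vector, uses the negative-entropy mirror map (whose Bregman divergence is the KL divergence, giving a multiplicative update), and averages noisy gradients with a simple exponential momentum. The names `kl_mirror_ascent_step` and `run_bandit` and all hyperparameters are hypothetical.

```python
import numpy as np


def kl_mirror_ascent_step(policy, grad_estimate, lr):
    """One mirror ascent step on the simplex under the KL Bregman divergence.

    Maximising <g, p> - (1/lr) * KL(p || policy) over the simplex gives
    p_new proportional to policy * exp(lr * g).
    """
    logits = np.log(np.clip(policy, 1e-300, None)) + lr * grad_estimate
    logits -= logits.max()  # numerical stability before exponentiating
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum()


def run_bandit(rewards, steps=200, lr=0.5, beta=0.3, noise=0.1, seed=0):
    """Toy problem (assumption): maximise expected reward <rewards, policy>."""
    rng = np.random.default_rng(seed)
    rewards = np.asarray(rewards, dtype=float)
    policy = np.full(len(rewards), 1.0 / len(rewards))
    momentum = np.zeros_like(rewards)
    for _ in range(steps):
        # The gradient of <rewards, policy> w.r.t. policy is rewards; add noise
        # to mimic a stochastic single-sample estimate.
        noisy_grad = rewards + noise * rng.standard_normal(len(rewards))
        momentum = (1.0 - beta) * momentum + beta * noisy_grad
        policy = kl_mirror_ascent_step(policy, momentum, lr)
    return policy


final_policy = run_bandit([0.1, 0.9, 0.4])
```

Swapping the mirror map changes the update: with the squared Euclidean norm the unconstrained step reduces to an ordinary gradient step, which is the sense in which different Bregman divergences recover different policy optimization methods.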

【9】 Robust Task Scheduling for Heterogeneous Robot Teams under Capability Uncertainty

Authors: Bo Fu, William Smith, Denise Rizzo, Matthew Castanier, Maani Ghaffari, Kira Barton
Affiliation: and Kira Barton are with the University of Michigan
Note: Video: this https URL
Link: https://arxiv.org/abs/2106.12111
Abstract: This paper develops a stochastic programming framework for multi-agent systems where task decomposition, assignment, and scheduling problems are simultaneously optimized. Due to their inherent flexibility and robustness, multi-agent systems are applied in a growing range of real-world problems that involve heterogeneous tasks and uncertain information. Most previous works assume a unique way to decompose a task into roles that can later be assigned to the agents. This assumption is not valid for a complex task where the roles can vary and multiple decomposition structures exist. Meanwhile, it is unclear how uncertainties in task requirements and agent capabilities can be systematically quantified and optimized under a multi-agent system setting. A representation for complex tasks is proposed to avoid the non-convex task decomposition enumeration: agent capabilities are represented as a vector of random distributions, and task requirements are verified by a generalizable binary function. The conditional value at risk (CVaR) is chosen as a metric in the objective function to generate robust plans. An efficient algorithm is described to solve the model, and the whole framework is evaluated in two different practical test cases: capture-the-flag and robotic service coordination during a pandemic (e.g., COVID-19). Results demonstrate that the framework is scalable, generalizable, and provides low-cost plans that ensure a high probability of success.
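For readers unfamiliar with CVaR, the sketch below shows the standard sample-based estimate: the average of the worst (1 - alpha) fraction of sampled costs. It only illustrates why a CVaR objective favours robust plans and says nothing about the paper's stochastic program; `empirical_cvar` and the two cost samples are made up for illustration.

```python
import math

import numpy as np


def empirical_cvar(costs, alpha=0.9):
    """Sample CVaR at level alpha: mean of the worst (1 - alpha) share of costs."""
    worst_first = np.sort(np.asarray(costs, dtype=float))[::-1]
    # Round before ceil so float error in (1 - alpha) * n cannot add a sample.
    tail_size = max(1, math.ceil(round((1.0 - alpha) * len(worst_first), 9)))
    return float(worst_first[:tail_size].mean())


# Hypothetical cost samples for two plans (not data from the paper).
risky_plan = [1.0] * 9 + [20.0]  # usually cheap, occasionally very costly
safe_plan = [3.0] * 10           # always moderately costly

mean_risky, mean_safe = np.mean(risky_plan), np.mean(safe_plan)
cvar_risky = empirical_cvar(risky_plan, alpha=0.9)
cvar_safe = empirical_cvar(safe_plan, alpha=0.9)
```

In this made-up example the risky plan has the lower mean cost but the higher CVaR, so an objective built on CVaR would select the safe plan.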

【10】 Active Exploitation of Redundancies in Reconfigurable Multi-Robot Systems

Authors: Thomas M. Roehr
Note: 18 pages, 8 figures, 8 tables
Link: https://arxiv.org/abs/2106.12079
Abstract: While traditional robotic systems come with a monolithic system design, reconfigurable multi-robot systems can share and shift physical resources in an on-demand fashion. Multi-robot operations can benefit from this flexibility by actively managing system redundancies depending on current tasks and having more options to respond to failure events. To support this active exploitation of redundancies in robotic systems, this paper details an organization model as a basis for planning with reconfigurable multi-robot systems. The model allows redundancies to be exploited when optimizing a multi-robot system's probability of survival with respect to a desired mission. The resulting planning approach trades safety against efficiency in robotic operations and thereby offers a new perspective and tool to design and improve multi-robot missions. We use a simulated multi-robot planetary exploration mission to evaluate this approach and highlight an exemplary performance landscape.

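This article is part of the Tencent Cloud self-media synchronized exposure program and is shared from a WeChat official account.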

Originally published: 2021-06-24. In case of infringement, contact cloudcommunity@tencent.com for removal.

Robotics