
Robotics arXiv Digest [7.9]

Author: arXiv每日学术速递 (WeChat official account)
Published: 2021-07-27 10:43:19
Column: arXiv每日学术速递

Visit www.arxivdaily.com for the digest with abstracts, covering CS, Physics, Mathematics, Economics, Statistics, Finance, Biology, and Electrical Engineering, plus search, bookmarking, and posting features. Click "Read the original" to visit.

cs.RO (Robotics), 19 papers in total

【1】 RMA: Rapid Motor Adaptation for Legged Robots

Authors: Ashish Kumar, Zipeng Fu, Deepak Pathak, Jitendra Malik
Affiliations: Carnegie Mellon University, UC Berkeley, Facebook
Note: RSS 2021. Webpage at this https URL
Link: https://arxiv.org/abs/2107.04034
Abstract: Successful real-world deployment of legged robots would require them to adapt in real-time to unseen scenarios like changing terrains, changing payloads, wear and tear. This paper presents Rapid Motor Adaptation (RMA) algorithm to solve this problem of real-time online adaptation in quadruped robots. RMA consists of two components: a base policy and an adaptation module. The combination of these components enables the robot to adapt to novel situations in fractions of a second. RMA is trained completely in simulation without using any domain knowledge like reference trajectories or predefined foot trajectory generators and is deployed on the A1 robot without any fine-tuning. We train RMA on a varied terrain generator using bioenergetics-inspired rewards and deploy it on a variety of difficult terrains including rocky, slippery, deformable surfaces in environments with grass, long vegetation, concrete, pebbles, stairs, sand, etc. RMA shows state-of-the-art performance across diverse real-world as well as simulation experiments. Video results at https://ashish-kmr.github.io/rma-legged-robots/
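The abstract describes RMA as a base policy plus an adaptation module that estimates environment properties from recent state-action history. Below is a minimal, hypothetical PyTorch sketch of that two-module composition; the layer sizes, observation dimensions, history length, and latent width are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AdaptationModule(nn.Module):
    """Estimates a latent extrinsics vector from recent state-action history (sizes assumed)."""
    def __init__(self, obs_dim=30, act_dim=12, history=50, z_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear((obs_dim + act_dim) * history, 256), nn.ReLU(),
            nn.Linear(256, z_dim),
        )

    def forward(self, history):            # history: (B, T, obs_dim + act_dim)
        return self.net(history)

class BasePolicy(nn.Module):
    """Maps the current observation plus latent extrinsics to joint-space commands."""
    def __init__(self, obs_dim=30, z_dim=8, act_dim=12):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + z_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim), nn.Tanh(),
        )

    def forward(self, obs, z):
        return self.net(torch.cat([obs, z], dim=-1))

# Deployment-time composition: the adaptation module uses proprioceptive history only.
adapt, policy = AdaptationModule(), BasePolicy()
hist = torch.zeros(1, 50, 42)              # (batch, time, obs + action)
obs = torch.zeros(1, 30)
action = policy(obs, adapt(hist))          # (1, 12) joint command
print(action.shape)
```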

【2】 The Atlas of Lane Changes: Investigating Location-dependent Lane Change Behaviors Using Measurement Data from a Customer Fleet

Authors: Florian Wirthmüller, Jochen Hipp, Christian Reichenbächer, Manfred Reichert
Affiliations: Reichert is with the Institute of Databases and Information Systems (DBIS) at Ulm University; Reichenbächer is with the Wilhelm-Schickard-Institute for Informatics at Eberhard Karls University Tübingen
Note: the article has been accepted for publication at the 24th IEEE Intelligent Transportation Systems Conference (ITSC), 8 pages, 11 figures
Link: https://arxiv.org/abs/2107.04029
Abstract: The prediction of surrounding traffic participants behavior is a crucial and challenging task for driver assistance and autonomous driving systems. Today's approaches mainly focus on modeling dynamic aspects of the traffic situation and try to predict traffic participants behavior based on this. In this article we take a first step towards extending this common practice by calculating location-specific a-priori lane change probabilities. The idea behind this is straight forward: The driving behavior of humans may vary in exactly the same traffic situation depending on the respective location. E.g. drivers may ask themselves: Should I pass the truck in front of me immediately or should I wait until reaching the less curvy part of my route lying only a few kilometers ahead? Although, such information is far away from allowing behavior prediction on its own, it is obvious that today's approaches will greatly benefit when incorporating such location-specific a-priori probabilities into their predictions. For example, our investigations show that highway interchanges tend to enhance driver's motivation to perform lane changes, whereas curves seem to have lane change-dampening effects. Nevertheless, the investigation of all considered local conditions shows that superposition of various effects can lead to unexpected probabilities at some locations. We thus suggest dynamically constructing and maintaining a lane change probability map based on customer fleet data in order to support onboard prediction systems with additional information. For deriving reliable lane change probabilities a broad customer fleet is the key to success.
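The core idea of accumulating location-specific a-priori lane-change probabilities from fleet observations can be illustrated with a small, hypothetical sketch; the grid binning, counting scheme, and smoothing constants are assumptions for illustration, not the paper's method.

```python
from collections import defaultdict

class LaneChangeAtlas:
    """Location-binned prior lane-change probabilities from fleet traversals (illustrative only)."""
    def __init__(self, cell_size_m=200.0):
        self.cell = cell_size_m
        self.passes = defaultdict(int)        # traversals of a map cell
        self.lane_changes = defaultdict(int)  # lane changes observed in that cell

    def _key(self, x_m, y_m):
        return (int(x_m // self.cell), int(y_m // self.cell))

    def update(self, x_m, y_m, lane_change_observed):
        k = self._key(x_m, y_m)
        self.passes[k] += 1
        self.lane_changes[k] += int(lane_change_observed)

    def prior(self, x_m, y_m, alpha=1.0, beta=20.0):
        # Smoothed estimate so sparsely observed cells fall back to a low base rate.
        k = self._key(x_m, y_m)
        return (self.lane_changes[k] + alpha) / (self.passes[k] + alpha + beta)

atlas = LaneChangeAtlas()
atlas.update(1250.0, 80.0, lane_change_observed=True)   # e.g. a cell near an interchange
atlas.update(1250.0, 80.0, lane_change_observed=False)
print(round(atlas.prior(1250.0, 80.0), 3))
```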

【3】 Multi-Modality Task Cascade for 3D Object Detection

Authors: Jinhyung Park, Xinshuo Weng, Yunze Man, Kris Kitani
Affiliations: Carnegie Mellon University
Link: https://arxiv.org/abs/2107.04013
Abstract: Point clouds and RGB images are naturally complementary modalities for 3D visual understanding - the former provides sparse but accurate locations of points on objects, while the latter contains dense color and texture information. Despite this potential for close sensor fusion, many methods train two models in isolation and use simple feature concatenation to represent 3D sensor data. This separated training scheme results in potentially sub-optimal performance and prevents 3D tasks from being used to benefit 2D tasks that are often useful on their own. To provide a more integrated approach, we propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions, which are then used to further refine the 3D boxes. We show that including a 2D network between two stages of 3D modules significantly improves both 2D and 3D task performance. Moreover, to prevent the 3D module from over-relying on the overfitted 2D predictions, we propose a dual-head 2D segmentation training and inference scheme, allowing the 2nd 3D module to learn to interpret imperfect 2D segmentation predictions. Evaluating our model on the challenging SUN RGB-D dataset, we improve upon state-of-the-art results of both single modality and fusion networks by a large margin ($\textbf{+3.8}$ mAP@0.5). Code will be released $\href{https://github.com/Divadi/MTC_RCNN}{\text{here.}}$

【4】 3D Neural Scene Representations for Visuomotor Control

Authors: Yunzhu Li, Shuang Li, Vincent Sitzmann, Pulkit Agrawal, Antonio Torralba
Affiliations: MIT CSAIL
Note: First two authors contributed equally. Project Page: this https URL
Link: https://arxiv.org/abs/2107.04004
Abstract: Humans have a strong intuitive understanding of the 3D environment around us. The mental model of the physics in our brain applies to objects of different materials and enables us to perform a wide range of manipulation tasks that are far beyond the reach of current robots. In this work, we desire to learn models for dynamic 3D scenes purely from 2D visual observations. Our model combines Neural Radiance Fields (NeRF) and time contrastive learning with an autoencoding framework, which learns viewpoint-invariant 3D-aware scene representations. We show that a dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks involving both rigid bodies and fluids, where the target is specified in a viewpoint different from what the robot operates on. When coupled with an auto-decoding framework, it can even support goal specification from camera viewpoints that are outside the training distribution. We further demonstrate the richness of the learned 3D dynamics model by performing future prediction and novel view synthesis. Finally, we provide detailed ablation studies regarding different system designs and qualitative analysis of the learned representations.
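The abstract combines NeRF with time contrastive learning to obtain viewpoint-invariant representations. A minimal sketch of a time-contrastive (InfoNCE-style) loss is shown below, where embeddings of two camera views at the same time step are pulled together and other time steps in the batch act as negatives; the temperature and batch layout are assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def time_contrastive_loss(anchor, positive, temperature=0.1):
    """anchor, positive: (T, D) embeddings of the same T time steps seen from two viewpoints.
    For each time step, the matching time step in the other view is the positive;
    all other time steps in the batch serve as negatives."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature          # (T, T) cosine similarities
    targets = torch.arange(a.size(0))         # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with random features standing in for per-view scene embeddings.
z_view1 = torch.randn(16, 128)
z_view2 = torch.randn(16, 128)
print(time_contrastive_loss(z_view1, z_view2).item())
```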

【5】 Incorporating Gaze into Social Navigation

Authors: Justin Hart, Reuth Mirsky, Xuesu Xiao, Peter Stone
Note: Accepted for publication in the Robotics: Science and Systems Workshop on Social Robot Navigation (RSS 2021)
Link: https://arxiv.org/abs/2107.04001
Abstract: Most current approaches to social navigation focus on the trajectory and position of participants in the interaction. Our current work on the topic focuses on integrating gaze into social navigation, both to cue nearby pedestrians as to the intended trajectory of the robot and to enable the robot to read the intentions of nearby pedestrians. This paper documents a series of experiments in our laboratory investigating the role of gaze in social navigation.

【6】 Active Safety Envelopes using Light Curtains with Probabilistic Guarantees

Authors: Siddharth Ancha, Gaurav Pathak, Srinivasa G. Narasimhan, David Held
Affiliations: Carnegie Mellon University, Pittsburgh PA, USA
Note: 18 pages, Published at Robotics: Science and Systems (RSS) 2021
Link: https://arxiv.org/abs/2107.04000
Abstract: To safely navigate unknown environments, robots must accurately perceive dynamic obstacles. Instead of directly measuring the scene depth with a LiDAR sensor, we explore the use of a much cheaper and higher resolution sensor: programmable light curtains. Light curtains are controllable depth sensors that sense only along a surface that a user selects. We use light curtains to estimate the safety envelope of a scene: a hypothetical surface that separates the robot from all obstacles. We show that generating light curtains that sense random locations (from a particular distribution) can quickly discover the safety envelope for scenes with unknown objects. Importantly, we produce theoretical safety guarantees on the probability of detecting an obstacle using random curtains. We combine random curtains with a machine learning based model that forecasts and tracks the motion of the safety envelope efficiently. Our method accurately estimates safety envelopes while providing probabilistic safety guarantees that can be used to certify the efficacy of a robot perception system to detect and avoid dynamic obstacles. We evaluate our approach in a simulated urban driving environment and a real-world environment with moving pedestrians using a light curtain device and show that we can estimate safety envelopes efficiently and effectively. Project website: https://siddancha.github.io/projects/active-safety-envelopes-with-guarantees
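The key claim is that curtains sensing random locations detect an unknown obstacle with a provable probability. The toy 1D Monte Carlo simulation below only illustrates that notion by estimating how often randomly placed sensing windows intersect a fixed obstacle; the discretization and uniform placement are assumptions, and the paper derives analytical guarantees rather than simulating them.

```python
import random

def detection_probability(obstacle_cells, n_cells=100, curtain_width=10,
                          n_curtains=3, trials=20000, seed=0):
    """Fraction of trials in which at least one of n_curtains randomly placed
    sensing windows overlaps the (fixed but unknown to the sensor) obstacle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        detected = False
        for _ in range(n_curtains):
            start = rng.randrange(n_cells - curtain_width + 1)
            window = range(start, start + curtain_width)
            if any(c in obstacle_cells for c in window):
                detected = True
                break
        hits += detected
    return hits / trials

# Obstacle occupying cells 40-44 of a 100-cell scene; three random curtains per frame.
print(detection_probability(obstacle_cells=set(range(40, 45))))
```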

【7】 Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers

Authors: Ruihan Yang, Minghao Zhang, Nicklas Hansen, Huazhe Xu, Xiaolong Wang
Affiliations: UC San Diego, Tsinghua University, UC Berkeley
Note: Our project page with videos is at this https URL
Link: https://arxiv.org/abs/2107.03996
Abstract: We propose to address quadrupedal locomotion tasks using Reinforcement Learning (RL) with a Transformer-based model that learns to combine proprioceptive information and high-dimensional depth sensor inputs. While learning-based locomotion has made great advances using RL, most methods still rely on domain randomization for training blind agents that generalize to challenging terrains. Our key insight is that proprioceptive states only offer contact measurements for immediate reaction, whereas an agent equipped with visual sensory observations can learn to proactively maneuver environments with obstacles and uneven terrain by anticipating changes in the environment many steps ahead. In this paper, we introduce LocoTransformer, an end-to-end RL method for quadrupedal locomotion that leverages a Transformer-based model for fusing proprioceptive states and visual observations. We evaluate our method in challenging simulated environments with different obstacles and uneven terrain. We show that our method obtains significant improvements over policies with only proprioceptive state inputs, and that Transformer-based models further improve generalization across environments. Our project page with videos is at https://RchalYang.github.io/LocoTransformer .
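LocoTransformer fuses proprioceptive states with depth images through a shared Transformer. A minimal, hypothetical PyTorch sketch of such cross-modal token fusion follows; the patch size, embedding width, head count, and layer depth are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Tokenize depth-image patches and the proprioceptive state, then fuse with a Transformer."""
    def __init__(self, proprio_dim=48, patch=16, img=64, d_model=128, act_dim=12):
        super().__init__()
        self.patch_embed = nn.Conv2d(1, d_model, kernel_size=patch, stride=patch)
        self.proprio_embed = nn.Linear(proprio_dim, d_model)
        n_tokens = (img // patch) ** 2 + 1
        self.pos = nn.Parameter(torch.zeros(1, n_tokens, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, act_dim)

    def forward(self, depth, proprio):               # depth: (B,1,64,64), proprio: (B,48)
        img_tokens = self.patch_embed(depth).flatten(2).transpose(1, 2)   # (B, 16, d)
        state_token = self.proprio_embed(proprio).unsqueeze(1)            # (B, 1, d)
        tokens = torch.cat([state_token, img_tokens], dim=1) + self.pos   # (B, 17, d)
        fused = self.encoder(tokens)
        return self.head(fused[:, 0])                # action read off the state token

model = CrossModalFusion()
print(model(torch.zeros(2, 1, 64, 64), torch.zeros(2, 48)).shape)  # torch.Size([2, 12])
```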

【8】 Offline Meta-Reinforcement Learning with Online Self-Supervision

Authors: Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine
Affiliations: UC Berkeley
Note: 10 pages, 6 figures
Link: https://arxiv.org/abs/2107.03974
Abstract: Meta-reinforcement learning (RL) can be used to train policies that quickly adapt to new tasks with orders of magnitude less data than standard RL, but this fast adaptation often comes at the cost of greatly increasing the amount of reward supervision during meta-training time. Offline meta-RL removes the need to continuously provide reward supervision because rewards must only be provided once when the offline dataset is generated. In addition to the challenges of offline RL, a unique distribution shift is present in meta RL: agents learn exploration strategies that can gather the experience needed to learn a new task, and also learn adaptation strategies that work well when presented with the trajectories in the dataset, but the adaptation strategies are not adapted to the data distribution that the learned exploration strategies collect. Unlike the online setting, the adaptation and exploration strategies cannot effectively adapt to each other, resulting in poor performance. In this paper, we propose a hybrid offline meta-RL algorithm, which uses offline data with rewards to meta-train an adaptive policy, and then collects additional unsupervised online data, without any ground truth reward labels, to bridge this distribution shift problem. Our method uses the offline data to learn the distribution of reward functions, which is then sampled to self-supervise reward labels for the additional online data. By removing the need to provide reward labels for the online experience, our approach can be more practical to use in settings where reward supervision would otherwise be provided manually. We compare our method to prior work on offline meta-RL on simulated robot locomotion and manipulation tasks and find that using additional data and self-generated rewards significantly improves an agent's ability to generalize.
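The central mechanism is to learn a distribution over reward functions from the rewarded offline data and then sample from it to label reward-free online data. The schematic sketch below illustrates that relabeling loop with a bootstrapped linear-reward ensemble standing in for the learned distribution; all function names and modeling choices here are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def fit_reward_ensemble(offline_transitions, n_members=5, rng=np.random.default_rng(0)):
    """Stand-in for learning a distribution over reward functions: fit several
    ridge-regression reward models on bootstrapped samples of (state, action, reward)."""
    models = []
    for _ in range(n_members):
        idx = rng.integers(0, len(offline_transitions), len(offline_transitions))
        X = np.array([np.concatenate([s, a]) for s, a, _ in offline_transitions])[idx]
        y = np.array([r for _, _, r in offline_transitions])[idx]
        w = np.linalg.solve(X.T @ X + 1e-2 * np.eye(X.shape[1]), X.T @ y)
        models.append(w)
    return models

def self_supervise(online_transitions, reward_models, rng=np.random.default_rng(1)):
    """Label reward-free online transitions with rewards sampled from the learned distribution."""
    labeled = []
    for s, a in online_transitions:
        w = reward_models[rng.integers(len(reward_models))]   # sample a reward function
        labeled.append((s, a, float(np.concatenate([s, a]) @ w)))
    return labeled

offline = [(np.random.randn(4), np.random.randn(2), np.random.randn()) for _ in range(200)]
online = [(np.random.randn(4), np.random.randn(2)) for _ in range(50)]
relabeled = self_supervise(online, fit_reward_ensemble(offline))
print(len(relabeled), relabeled[0][2])
```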

【9】 Navigate-and-Seek: a Robotics Framework for People Localization in Agricultural Environments

Authors: Riccardo Polvara, Francesco Del Duchetto, Gerhard Neumann, Marc Hanheide
Affiliations: School of Computer Science, University of Lincoln
Link: https://arxiv.org/abs/2107.03850
Abstract: The agricultural domain offers a working environment where many human laborers are nowadays employed to maintain or harvest crops, with huge potential for productivity gains through the introduction of robotic automation. Detecting and localizing humans reliably and accurately in such an environment, however, is a prerequisite to many services offered by fleets of mobile robots collaborating with human workers. Consequently, in this paper, we expand on the concept of a topological particle filter (TPF) to accurately and individually localize and track workers in a farm environment, integrating information from heterogeneous sensors and combining local active sensing (exploiting a robot's onboard sensing employing a Next-Best-Sense planning approach) and global localization (using affordable IoT GNSS devices). We validate the proposed approach in topologies created for the deployment of robotics fleets to support fruit pickers in a real farm environment. By combining multi-sensor observations on the topological level complemented by active perception through the NBS approach, we show that we can improve the accuracy of picker localization in comparison to prior work.

【10】 Identification of Gait Phases with Neural Networks for Smooth Transparent Control of a Lower Limb Exoskeleton

Authors: Vittorio Lippi, Cristian Camardella, Alessandro Filippeschi, Francesco Porcini
Affiliations: University Hospital of Freiburg, Neurology, Freiburg, Germany; Scuola Superiore Sant’Anna, TeCIP Institute, PERCRO Laboratory, Pisa, Italy; Scuola Superiore Sant’Anna, Department of Excellence in Robotics and AI, Pisa, Italy
Link: https://arxiv.org/abs/2107.03746
Abstract: Lower limbs exoskeletons provide assistance during standing, squatting, and walking. Gait dynamics, in particular, implies a change in the configuration of the device in terms of contact points, actuation, and system dynamics in general. In order to provide a comfortable experience and maximize performance, the exoskeleton should be controlled smoothly and in a transparent way, which means respectively, minimizing the interaction forces with the user and jerky behavior due to transitions between different configurations. A previous study showed that a smooth control of the exoskeleton can be achieved using a gait phase segmentation based on joint kinematics. Such a segmentation system can be implemented as linear regression and should be personalized for the user after a calibration procedure. In this work, a nonlinear segmentation function based on neural networks is implemented and compared with linear regression. An on-line implementation is then proposed and tested with a subject.
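The comparison described, a linear regressor versus a small neural network mapping joint kinematics to a continuous gait-phase value, can be sketched as follows; the synthetic gait data, the sine/cosine phase encoding, and the scikit-learn models are illustrative assumptions rather than the paper's features or network.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

# Synthetic walking data: hip/knee angles and velocities over a 1 Hz gait cycle.
rng = np.random.default_rng(0)
t = np.linspace(0, 20, 2000)                              # 20 s of walking
phase = (2 * np.pi * t) % (2 * np.pi)                     # ground-truth gait phase in [0, 2*pi)
hip = 30 * np.sin(phase) + 8 * np.sin(2 * phase)          # synthetic hip angle (deg)
knee = 60 * np.clip(np.sin(phase - 0.4), 0, None) ** 2    # synthetic knee angle (deg)
X = np.column_stack([hip, knee, np.gradient(hip, t), np.gradient(knee, t)])
X += rng.standard_normal(X.shape)                         # measurement noise
y = np.column_stack([np.sin(phase), np.cos(phase)])       # continuous, wrap-free phase target

linear = LinearRegression().fit(X[:1500], y[:1500])
mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0).fit(X[:1500], y[:1500])

def phase_rmse_deg(model):
    pred = model.predict(X[1500:])
    est = np.arctan2(pred[:, 0], pred[:, 1])
    err = np.angle(np.exp(1j * (est - phase[1500:])))     # wrapped phase error
    return np.degrees(np.sqrt(np.mean(err ** 2)))

print("linear RMSE (deg):", round(phase_rmse_deg(linear), 2))
print("MLP RMSE (deg):   ", round(phase_rmse_deg(mlp), 2))
```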

【11】 Adaptation of Quadruped Robot Locomotion with Meta-Learning

Authors: Arsen Kuzhamuratov, Dmitry Sorokin, Alexander Ulanov, A. I. Lvovsky
Affiliations: Russian Quantum Center, Moscow, Russia; Moscow Institute of Physics and Technology, Russia; University of Oxford, United Kingdom
Note: 14 pages, 6 figures
Link: https://arxiv.org/abs/2107.03741
Abstract: Animals have remarkable abilities to adapt locomotion to different terrains and tasks. However, robots trained by means of reinforcement learning are typically able to solve only a single task and a transferred policy is usually inferior to that trained from scratch. In this work, we demonstrate that meta-reinforcement learning can be used to successfully train a robot capable to solve a wide range of locomotion tasks. The performance of the meta-trained robot is similar to that of a robot that is trained on a single task.

【12】 Full-Body Torque-Level Non-linear Model Predictive Control for Aerial Manipulation

Authors: Josep Martí-Saumell, Joan Solà, Angel Santamaria-Navarro, Juan Andrade-Cetto
Note: Submitted to Transactions on Robotics. 17 pages, 16 figures
Link: https://arxiv.org/abs/2107.03722
Abstract: Non-linear model predictive control (nMPC) is a powerful approach to control complex robots (such as humanoids, quadrupeds, or unmanned aerial manipulators (UAMs)) as it brings important advantages over other existing techniques. The full-body dynamics, along with the prediction capability of the optimal control problem (OCP) solved at the core of the controller, allows to actuate the robot in line with its dynamics. This fact enhances the robot capabilities and allows, e.g., to perform intricate maneuvers at high dynamics while optimizing the amount of energy used. Despite the many similarities between humanoids or quadrupeds and UAMs, full-body torque-level nMPC has rarely been applied to UAMs. This paper provides a thorough description of how to use such techniques in the field of aerial manipulation. We give a detailed explanation of the different parts involved in the OCP, from the UAM dynamical model to the residuals in the cost function. We develop and compare three different nMPC controllers: Weighted MPC, Rail MPC, and Carrot MPC, which differ on the structure of their OCPs and on how these are updated at every time step. To validate the proposed framework, we present a wide variety of simulated case studies. First, we evaluate the trajectory generation problem, i.e., optimal control problems solved offline, involving different kinds of motions (e.g., aggressive maneuvers or contact locomotion) for different types of UAMs. Then, we assess the performance of the three nMPC controllers, i.e., closed-loop controllers solved online, through a variety of realistic simulations. For the benefit of the community, we have made available the source code related to this work.

【13】 Towards Autonomous Pipeline Inspection with Hierarchical Reinforcement Learning

Authors: Nicolò Botteghi, Luuk Grefte, Mannes Poel, Beril Sirmacek, Christoph Brune, Edwin Dertien, Stefano Stramigioli
Affiliations: University of Twente; Beril Sirmacek is with the Department of Computer Science, Jönköping University
Link: https://arxiv.org/abs/2107.03685
Abstract: Inspection and maintenance are two crucial aspects of industrial pipeline plants. While robotics has made tremendous progress in the mechanic design of in-pipe inspection robots, the autonomous control of such robots is still a big open challenge due to the high number of actuators and the complex manoeuvres required. To address this problem, we investigate the usage of Deep Reinforcement Learning for achieving autonomous navigation of in-pipe robots in pipeline networks with complex topologies. Moreover, we introduce a hierarchical policy decomposition based on Hierarchical Reinforcement Learning to learn robust high-level navigation skills. We show that the hierarchical structure introduced in the policy is fundamental for solving the navigation task through pipes and necessary for achieving navigation performances superior to human-level control.

【14】 Graph and Recurrent Neural Network-based Vehicle Trajectory Prediction For Highway Driving

Authors: Xiaoyu Mo, Yang Xing, Chen Lv
Link: https://arxiv.org/abs/2107.03663
Abstract: Integrating trajectory prediction to the decision-making and planning modules of modular autonomous driving systems is expected to improve the safety and efficiency of self-driving vehicles. However, a vehicle's future trajectory prediction is a challenging task since it is affected by the social interactive behaviors of neighboring vehicles, and the number of neighboring vehicles can vary in different situations. This work proposes a GNN-RNN based Encoder-Decoder network for interaction-aware trajectory prediction, where vehicles' dynamics features are extracted from their historical tracks using RNN, and the inter-vehicular interaction is represented by a directed graph and encoded using a GNN. The parallelism of GNN implies the proposed method's potential to predict multi-vehicular trajectories simultaneously. Evaluation on the dataset extracted from the NGSIM US-101 dataset shows that the proposed model is able to predict a target vehicle's trajectory in situations with a variable number of surrounding vehicles.
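The described pipeline (per-vehicle history encoded by an RNN, inter-vehicle interaction aggregated over a directed graph, and a decoder rolling out the target's future trajectory) can be sketched minimally as follows; the single round of mean message passing and all layer sizes are assumptions, not the paper's exact GNN.

```python
import torch
import torch.nn as nn

class GnnRnnPredictor(nn.Module):
    """Encode each vehicle's track with a GRU, mix features over a directed graph, decode the target's future."""
    def __init__(self, feat=4, hidden=64, horizon=25):
        super().__init__()
        self.encoder = nn.GRU(feat, hidden, batch_first=True)
        self.message = nn.Linear(hidden, hidden)            # one round of neighbor-to-node messages
        self.decoder = nn.GRU(2, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)
        self.horizon = horizon

    def forward(self, tracks, adj):
        # tracks: (N, T, feat) past states of N vehicles; adj: (N, N) directed adjacency (row = receiver).
        _, h = self.encoder(tracks)                          # h: (1, N, hidden)
        h = h.squeeze(0)
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = torch.relu(h + (adj @ self.message(h)) / deg)    # mean-aggregate neighbor messages
        # Decode the target vehicle (index 0) autoregressively from its interaction-aware state.
        state, pos = h[0:1].unsqueeze(0), torch.zeros(1, 1, 2)
        future = []
        for _ in range(self.horizon):
            out, state = self.decoder(pos, state)
            pos = self.out(out)
            future.append(pos)
        return torch.cat(future, dim=1)                      # (1, horizon, 2) predicted positions

model = GnnRnnPredictor()
tracks, adj = torch.randn(5, 30, 4), (torch.rand(5, 5) > 0.5).float()
print(model(tracks, adj).shape)                              # torch.Size([1, 25, 2])
```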

【15】 4D Attention: Comprehensive Framework for Spatio-Temporal Gaze Mapping

Authors: Shuji Oishi, Kenji Koide, Masashi Yokozuka, Atsuhiko Banno
Affiliations: National Institute of Advanced Industrial Science and Technology (AIST)
Link: https://arxiv.org/abs/2107.03606
Abstract: This study presents a framework for capturing human attention in the spatio-temporal domain using eye-tracking glasses. Attention mapping is a key technology for human perceptual activity analysis or Human-Robot Interaction (HRI) to support human visual cognition; however, measuring human attention in dynamic environments is challenging owing to the difficulty in localizing the subject and dealing with moving objects. To address this, we present a comprehensive framework, 4D Attention, for unified gaze mapping onto static and dynamic objects. Specifically, we estimate the glasses pose by leveraging a loose coupling of direct visual localization and Inertial Measurement Unit (IMU) values. Further, by installing reconstruction components into our framework, dynamic objects not captured in the 3D environment map are instantiated based on the input images. Finally, a scene rendering component synthesizes a first-person view with identification (ID) textures and performs direct 2D-3D gaze association. Quantitative evaluations showed the effectiveness of our framework. Additionally, we demonstrated the applications of 4D Attention through experiments in real situations.

【16】 Reinforcement Learning based Negotiation-aware Motion Planning of Autonomous Vehicles

Authors: Zhitao Wang, Yuzheng Zhuang, Qiang Gu, Dong Chen, Hongbo Zhang, Wulong Liu
Link: https://arxiv.org/abs/2107.03600
Abstract: For autonomous vehicles integrating onto roadways with human traffic participants, it requires understanding and adapting to the participants' intention and driving styles by responding in predictable ways without explicit communication. This paper proposes a reinforcement learning based negotiation-aware motion planning framework, which adopts RL to adjust the driving style of the planner by dynamically modifying the prediction horizon length of the motion planner in real time adaptively w.r.t the event of a change in environment, typically triggered by traffic participants' switch of intents with different driving styles. The framework models the interaction between the autonomous vehicle and other traffic participants as a Markov Decision Process. A temporal sequence of occupancy grid maps are taken as inputs for RL module to embed an implicit intention reasoning. Curriculum learning is employed to enhance the training efficiency and the robustness of the algorithm. We applied our method to narrow lane navigation in both simulation and real world to demonstrate that the proposed method outperforms the common alternative due to its advantage in alleviating the social dilemma problem with proper negotiation skills.
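The framework's interface, an RL policy that observes a temporal stack of occupancy grids and outputs the prediction-horizon length used by the downstream motion planner, might look roughly like the sketch below; the CNN, the candidate horizon set, and the planner stub are assumptions for illustration only.

```python
import torch
import torch.nn as nn

HORIZON_CHOICES = [10, 20, 30, 40]        # candidate prediction horizons (time steps), assumed

class HorizonPolicy(nn.Module):
    """Map a temporal stack of occupancy grids to a distribution over horizon lengths."""
    def __init__(self, n_frames=4, grid=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_frames, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 13 * 13, 128), nn.ReLU(),
            nn.Linear(128, len(HORIZON_CHOICES)),
        )

    def forward(self, grids):               # grids: (B, n_frames, 64, 64)
        return self.net(grids)

def plan_with_horizon(horizon_steps):
    """Stub for the downstream planner: a longer horizon yields a more conservative plan."""
    return {"horizon": horizon_steps}

policy = HorizonPolicy()
obs = torch.zeros(1, 4, 64, 64)             # last 4 occupancy-grid frames
action = policy(obs).argmax(dim=-1).item()  # greedy choice for illustration
print(plan_with_horizon(HORIZON_CHOICES[action]))
```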

【17】 Design and Deployment of an Autonomous Unmanned Ground Vehicle for Urban Firefighting Scenarios

Authors: Kshitij Jindal, Anthony Wang, Dinesh Thakur, Alex Zhou, Vojtech Spurny, Viktor Walter, George Broughton, Tomas Krajnik, Martin Saska, Giuseppe Loianno
Affiliations: Department of MAE, New York University, Brooklyn; University of Pennsylvania, Walnut Street, Philadelphia; CTU in Prague, Prague, Czech Republic; Department of ECE and MAE
Link: https://arxiv.org/abs/2107.03582
Abstract: Autonomous mobile robots have the potential to solve missions that are either too complex or dangerous to be accomplished by humans. In this paper, we address the design and autonomous deployment of a ground vehicle equipped with a robotic arm for urban firefighting scenarios. We describe the hardware design and algorithm approaches for autonomous navigation, planning, fire source identification and abatement in unstructured urban scenarios. The approach employs on-board sensors for autonomous navigation and thermal camera information for source identification. A custom electro-mechanical pump is responsible to eject water for fire abatement. The proposed approach is validated through several experiments, where we show the ability to identify and abate a sample heat source in a building. The whole system was developed and deployed during the Mohamed Bin Zayed International Robotics Challenge (MBZIRC) 2020, for Challenge No. 3 Fire Fighting Inside a High-Rise Building and during the Grand Challenge where our approach scored the highest number of points among all UGV solutions and was instrumental to win the first place.

【18】 LanguageRefer: Spatial-Language Model for 3D Visual Grounding

Authors: Junha Roh, Karthik Desingh, Ali Farhadi, Dieter Fox
Affiliations: Paul G. Allen School, University of Washington, United States
Note: 11 pages, 3 figures
Link: https://arxiv.org/abs/2107.03438
Abstract: To realize robots that can understand human instructions and perform meaningful tasks in the near future, it is important to develop learned models that can understand referential language to identify common objects in real-world 3D scenes. In this paper, we develop a spatial-language model for a 3D visual grounding problem. Specifically, given a reconstructed 3D scene in the form of a point cloud with 3D bounding boxes of potential object candidates, and a language utterance referring to a target object in the scene, our model identifies the target object from a set of potential candidates. Our spatial-language model uses a transformer-based architecture that combines spatial embedding from bounding-box with a finetuned language embedding from DistilBert and reasons among the objects in the 3D scene to find the target object. We show that our model performs competitively on visio-linguistic datasets proposed by ReferIt3D. We provide additional analysis of performance in spatial reasoning tasks decoupled from perception noise, the effect of view-dependent utterances in terms of accuracy, and view-point annotations for potential robotics applications.

【19】 Graded Symmetry Groups: Plane and Simple

Authors: Martin Roelfs, Steven De Keninck
Affiliations: University of Amsterdam
Note: 17 pages, 12 figures, 1 cheat sheet
Link: https://arxiv.org/abs/2107.03771
Abstract: The symmetries described by Pin groups are the result of combining a finite number of discrete reflections in (hyper)planes. The current work shows how an analysis using geometric algebra provides a picture complementary to that of the classic matrix Lie algebra approach, while retaining information about the number of reflections in a given transformation. This imposes a graded structure on Lie groups, which is not evident in their matrix representation. By embracing this graded structure, the invariant decomposition theorem was proven: any composition of $k$ linearly independent reflections can be decomposed into $\lceil k/2 \rceil$ commuting factors, each of which is the product of at most two reflections. This generalizes a conjecture by M. Riesz, and has e.g. the Mozzi-Chasles' theorem as its 3D Euclidean special case. To demonstrate its utility, we briefly discuss various examples such as Lorentz transformations, Wigner rotations, and screw transformations. The invariant decomposition also directly leads to closed form formulas for the exponential and logarithmic function for all Spin groups, and identifies element of geometry such as planes, lines, points, as the invariants of $k$-reflections. We conclude by presenting novel matrix/vector representations for geometric algebras $\mathbb{R}_{pqr}$, and use this in E(3) to illustrate the relationship with the classic covariant, contravariant and adjoint representations for the transformation of points, planes and lines.
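The invariant decomposition theorem stated in the abstract can be written as a formula; the LaTeX below restates it directly from the abstract, with symbol names of our choosing.

```latex
% Invariant decomposition theorem (restated from the abstract; notation is ours).
% A composition of k linearly independent reflections r_1, ..., r_k factors into
% ceil(k/2) mutually commuting factors, each a product of at most two reflections:
\[
  r_k \, r_{k-1} \cdots r_1 \;=\; F_1 F_2 \cdots F_{\lceil k/2 \rceil},
  \qquad F_i F_j = F_j F_i \quad \text{for all } i, j,
\]
\[
  \text{with each } F_j \text{ the product of at most two reflections.}
\]
% 3D Euclidean special case (Mozzi-Chasles): every rigid-body motion is a screw motion,
% i.e. a commuting composition of a rotation about a line and a translation along it.
```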

Originally published 2021-07-09 via the WeChat official account arXiv每日学术速递.
