
Robotics arXiv Daily Digest [6.18]

Author: arXiv Daily Academic Digest (WeChat official account)
Published: 2021-07-02 19:04:13

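Visit www.arxivdaily.com for the digest with abstracts, covering CS, physics, mathematics, economics, statistics, finance, biology, and electrical engineering, with search, bookmarking, and posting features.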

cs.RO (Robotics): 20 papers in total

【1】 No-frills Dynamic Planning using Static Planners

Authors: Mara Levy, Vasista Ayyagari, Abhinav Shrivastava
Affiliation: University of Maryland, College Park
Comments: ICRA 2021
Link: https://arxiv.org/abs/2106.09714
Abstract: In this paper, we address the task of interacting with dynamic environments where the changes in the environment are independent of the agent. We study this through the context of trapping a moving ball with a UR5 robotic arm. Our key contribution is an approach to utilize a static planner for dynamic tasks using a Dynamic Planning add-on; that is, if we can successfully solve a task with a static target, then our approach can solve the same task when the target is moving. Our approach has three key components: an off-the-shelf static planner, a trajectory forecasting network, and a network to predict robot's estimated time of arrival at any location. We demonstrate the generalization of our approach across environments. More information and videos at https://mlevy2525.github.io/DynamicAddOn.
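The abstract above names three components: a static planner, a trajectory forecaster, and an arrival-time predictor. One plausible way to combine the last two is sketched below. It is an illustration only, not the authors' implementation: `pick_intercept`, `toy_forecast`, and `toy_eta` are hypothetical names, and the two learned networks are replaced by simple closed-form stand-ins.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def pick_intercept(
    forecast: Sequence[Tuple[float, Point]],
    eta: Callable[[Point], float],
    margin: float = 0.0,
) -> Optional[Tuple[float, Point]]:
    """Return the earliest forecast waypoint the robot can reach in time.

    forecast: (time, position) pairs for the moving target, in time order.
    eta:      estimated time for the robot to arrive at a position.
    margin:   extra slack (seconds) required before the target arrives.
    """
    for t_target, position in forecast:
        if eta(position) + margin <= t_target:
            # This static goal can be handed to the static planner.
            return t_target, position
    return None  # no reachable intercept point within the forecast horizon


def toy_forecast(steps: int = 10, dt: float = 0.1):
    """Stand-in for a forecasting network: ball moves along +x at 1 m/s."""
    return [(k * dt, (k * dt, 0.5, 0.0)) for k in range(1, steps + 1)]


def toy_eta(p: Point, home: Point = (1.0, 0.5, 0.0), speed: float = 2.0) -> float:
    """Stand-in for an ETA network: straight-line distance over constant speed."""
    dist = sum((a - b) ** 2 for a, b in zip(p, home)) ** 0.5
    return dist / speed


goal = pick_intercept(toy_forecast(), toy_eta)
```

The selected waypoint is then treated as a fixed target, which is what lets an unmodified static planner be reused for the moving-target task.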

【2】 SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

Authors: Linxi Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar
Affiliations: The University of Texas at Austin, California Institute of Technology
Comments: ICML 2021
Link: https://arxiv.org/abs/2106.09678
Abstract: Generalization has been a long-standing challenge for reinforcement learning (RL). Visual RL, in particular, can be easily distracted by irrelevant factors in high-dimensional observation space. In this work, we consider robust policy learning which targets zero-shot generalization to unseen visual environments with large distributional shift. We propose SECANT, a novel self-expert cloning technique that leverages image augmentation in two stages to decouple robust representation learning from policy optimization. Specifically, an expert policy is first trained by RL from scratch with weak augmentations. A student network then learns to mimic the expert policy by supervised learning with strong augmentations, making its representation more robust against visual variations compared to the expert. Extensive experiments demonstrate that SECANT significantly advances the state of the art in zero-shot generalization across 4 challenging domains. Our average reward improvements over prior SOTAs are: DeepMind Control (+26.5%), robotic manipulation (+337.8%), vision-based autonomous driving (+47.7%), and indoor object navigation (+15.8%). Code release and video are available at https://linxifan.github.io/secant-site/.

【3】 Future mobility as a bio-inspired collaborative system

Authors: Naroa Coretti Sánchez, Juan Múgica González, Luis Alonso Pastor, Kent Larson
Affiliations: MIT Media Lab, Cambridge, MA, USA; Polytechnic University of Madrid, Madrid, Spain
Link: https://arxiv.org/abs/2106.09543
Abstract: The current trends towards vehicle-sharing, electrification, and autonomy are predicted to transform mobility. Combined appropriately, they have the potential of significantly improving urban mobility. However, what will come after most vehicles are shared, electric, and autonomous remains an open question, especially regarding the interactions between vehicles and how these interactions will impact system-level behaviour. Inspired by nature and supported by swarm robotics and vehicle platooning models, this paper proposes a future mobility in which shared, electric, and autonomous vehicles behave as a bio-inspired collaborative system. The collaboration between vehicles will lead to a system-level behaviour analogous to natural swarms. Natural swarms can divide tasks, cluster, build together, or transport cooperatively. In this future mobility, vehicles will cluster by connecting either physically or virtually, which will enable the possibility of sharing energy, data or computational power, provide services or transfer cargo, among others. Vehicles will collaborate either with vehicles that are part of the same fleet, or with any other vehicle on the road, by finding mutualistic relationships that benefit both parties. The field of swarm robotics has already translated some of the behaviours from natural swarms to artificial systems and, if we further translate these concepts into urban mobility, exciting ideas emerge. Within mobility-related research, the coordinated movement proposed in vehicle platooning models can be seen as a first step towards collaborative mobility. This paper contributes with the proposal of a framework for future mobility that integrates current research and mobility trends in a novel and unique way.

【4】 KIT Bus: A Shuttle Model for CARLA Simulator

Authors: Yusheng Xiang, Shuo Wang, Tianqing Su, Jun Li, Samuel S. Mao, Marcus Geimer
Affiliations: Institute of Vehicle System Technology, Karlsruhe Institute of Technology, Karlsruhe, Germany; Guanghua School of Management, Peking University, Beijing, China; Center of AI, University of Technology Sydney, Ultimo, Australia; Department of Mechanical Engineering
Comments: 6 pages, 12 figures
Link: https://arxiv.org/abs/2106.09508
Abstract: With the continuous development of science and technology, self-driving vehicles will surely change the nature of transportation and realize the automotive industry's transformation in the future. Compared with self-driving cars, self-driving buses are more efficient in carrying passengers and more environmentally friendly in terms of energy consumption. Therefore, it is speculated that in the future, self-driving buses will become more and more important. As a simulator for autonomous driving research, the CARLA simulator can help people accumulate experience in autonomous driving technology faster and safer. However, a shortcoming is that there is no modern bus model in the CARLA simulator. Consequently, people cannot simulate autonomous driving on buses or the scenarios interacting with buses. Therefore, we built a bus model in 3ds Max software and imported it into the CARLA to fill this gap. Our model, namely KIT bus, is proven to work in the CARLA by testing it with the autopilot simulation. The video demo is shown on our Youtube.

【5】 Making Sense of Complex Sensor Data Streams

Authors: Rongrong Liu, Birgitta Dresp-Langley
Affiliations: ICube Lab, Robotics Department, Strasbourg University, Strasbourg, France; ICube Lab, UMR, Centre National de la Recherche Scientifique (CNRS), Strasbourg
Link: https://arxiv.org/abs/2106.09500
Abstract: This concept paper draws from our previous research on individual grip force data collected from biosensors placed on specific anatomical locations in the dominant and non dominant hands of operators performing a robot assisted precision grip task for minimally invasive endoscopic surgery. The specificity of the robotic system on the one hand, and that of the 2D image guided task performed in a real world 3D space on the other, constrain the individual hand and finger movements during task performance in a unique way. Our previous work showed task specific characteristics of operator expertise in terms of specific grip force profiles, which we were able to detect in thousands of highly variable individual data. This concept paper is focused on two complementary data analysis strategies that allow achieving such a goal. In contrast with other sensor data analysis strategies aimed at minimizing variance in the data, it is in this case here necessary to decipher the meaning of the full extent of intra and inter individual variance in the sensor data by using the appropriate statistical analyses, as shown in the first part of this paper. Then, it is explained how the computation of individual spatio temporal grip force profiles permits detecting expertise specific differences between individual users. It is concluded that these two analytic strategies are complementary. They enable drawing meaning from thousands of biosensor data reflecting human grip performance and its evolution with training, while fully taking into account their considerable inter and intra individual variability.

【6】 Synthesizing Modular Manipulators For Tasks With Time, Obstacle, And Torque Constraints

Authors: Thais Campos, Hadas Kress-Gazit
Affiliation: Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, New York
Link: https://arxiv.org/abs/2106.09487
Abstract: Modular robots can be tailored to achieve specific tasks and rearranged to achieve previously infeasible ones. The challenge is choosing an appropriate design from a large search space. In this work, we describe a framework that automatically synthesizes the design and controls for a serial chain modular manipulator given a task description. The task includes points to be reached in the 3D space, time constraints, a load to be sustained at the end-effector, and obstacles to be avoided in the environment. These specifications are encoded as a constrained optimization in the robot's kinematics and dynamics and, if a solution is found, the formulation returns the specific design and controls to perform the task. Finally, we demonstrate our approach on a complex specification in which the robot navigates a constrained environment while holding an object.

【7】 Modelling resource allocation in uncertain system environment through deep reinforcement learning

Authors: Neel Gandhi, Shakti Mishra
Affiliation: School of Technology, Pandit Deendayal Energy University, Gandhinagar, Gujarat
Comments: Accepted at IRMAS'21
Link: https://arxiv.org/abs/2106.09461
Abstract: Reinforcement Learning has applications in field of mechatronics, robotics, and other resource-constrained control system. Problem of resource allocation is primarily solved using traditional predefined techniques and modern deep learning methods. The drawback of predefined and most deep learning methods for resource allocation is failing to meet the requirements in cases of uncertain system environment. We can approach problem of resource allocation in uncertain system environment alongside following certain criteria using deep reinforcement learning. Also, reinforcement learning has ability for adapting to new uncertain environment for prolonged period of time. The paper provides a detailed comparative analysis on various deep reinforcement learning methods by applying different components to modify architecture of reinforcement learning with use of noisy layers, prioritized replay, bagging, duelling networks, and other related combination to obtain improvement in terms of performance and reduction of computational cost. The paper identifies problem of resource allocation in uncertain environment could be effectively solved using Noisy Bagging duelling double deep Q network achieving efficiency of 97.7% by maximizing reward with significant exploration in given simulated environment for resource allocation.

【8】 CRIL: Continual Robot Imitation Learning via Generative and Prediction Model

Authors: Chongkai Gao, Haichuan Gao, Shangqi Guo, Tianren Zhang, Feng Chen
Affiliation: Department of Automation, Tsinghua University
Link: https://arxiv.org/abs/2106.09422
Abstract: Imitation learning (IL) algorithms have shown promising results for robots to learn skills from expert demonstrations. However, for versatile robots nowadays that need to learn diverse tasks, providing and learning the multi-task demonstrations all at once are both difficult. To solve this problem, in this work we study how to realize continual imitation learning ability that empowers robots to continually learn new tasks one by one, thus reducing the burden of multi-task IL and accelerating the process of new task learning at the same time. We propose a novel trajectory generation model that employs both a generative adversarial network and a dynamics prediction model to generate pseudo trajectories from all learned tasks in the new task learning process to achieve continual imitation learning ability. Our experiments on both simulation and real world manipulation tasks demonstrate the effectiveness of our method.

【9】 Cat-like Jumping and Landing of Legged Robots in Low-gravity Using Deep Reinforcement Learning

Authors: Nikita Rudin, Hendrik Kolvenbach, Vassilios Tsounis, Marco Hutter
Comments: Published in IEEE Transactions on Robotics
Link: https://arxiv.org/abs/2106.09357
Abstract: In this article, we show that learned policies can be applied to solve legged locomotion control tasks with extensive flight phases, such as those encountered in space exploration. Using an off-the-shelf deep reinforcement learning algorithm, we trained a neural network to control a jumping quadruped robot while solely using its limbs for attitude control. We present tasks of increasing complexity leading to a combination of three-dimensional (re-)orientation and landing locomotion behaviors of a quadruped robot traversing simulated low-gravity celestial bodies. We show that our approach easily generalizes across these tasks and successfully trains policies for each case. Using sim-to-real transfer, we deploy trained policies in the real world on the SpaceBok robot placed on an experimental testbed designed for two-dimensional micro-gravity experiments. The experimental results demonstrate that repetitive, controlled jumping and landing with natural agility is possible.

【10】 A new robotic hand based on the design of fingers with spatial motions

Authors: Pol Hamon, Damien Chablat, Franck Plestan
Affiliations: Armor Méca, ZI La Grignardais, Pleslin-Trigavou, France; École Centrale de Nantes, LS2N, UMR CNRS, rue de la Noë, Nantes, France
Link: https://arxiv.org/abs/2106.09331
Abstract: This article presents a new hand architecture with three under-actuated fingers. Each finger performs spatial movements to achieve more complex and varied grasping than the existing planar-movement fingers. The purpose of this hand is to grasp complex-shaped workpieces as they leave the machining centres. Among the taxonomy of grips, cylindrical and spherical grips are often used to grasp heavy objects. A combination of these two modes makes it possible to capture most of the workpieces machined with 5-axis machines. However, the change in grasping mode requires the fingers to reconfigure themselves to perform spatial movements. This solution requires the addition of two or three actuators to change the position of the fingers and requires sensors to recognize the shape of the workpiece and determine the type of grasp to be used. This article proposes to extend the notion of under-actuated fingers to spatial movements. After a presentation of the kinematics of the fingers, the problem of stability is discussed as well as the transmission of forces in this mechanism. The complete approach for calculating the stability conditions is presented from the study of Jacobian force transmission matrices. CAD representations of the hand and its behavior in spherical and cylindrical grips are presented.

【11】 Towards bio-inspired unsupervised representation learning for indoor aerial navigation

Authors: Ni Wang, Ozan Catal, Tim Verbelen, Matthias Hartmann, Bart Dhoedt
Affiliations: IDLab, Ghent University - imec, Belgium; imec, Belgium
Link: https://arxiv.org/abs/2106.09326
Abstract: Aerial navigation in GPS-denied, indoor environments, is still an open challenge. Drones can perceive the environment from a richer set of viewpoints, while having more stringent compute and energy constraints than other autonomous platforms. To tackle that problem, this research displays a biologically inspired deep-learning algorithm for simultaneous localization and mapping (SLAM) and its application in a drone navigation system. We propose an unsupervised representation learning method that yields low-dimensional latent state descriptors, that mitigates the sensitivity to perceptual aliasing, and works on power-efficient, embedded hardware. The designed algorithm is evaluated on a dataset collected in an indoor warehouse environment, and initial results show the feasibility for robust indoor aerial navigation.

【12】 Design of a prototypical platform for autonomous and connected vehicles

Authors: Stefano Arrigoni, Simone Mentasti, Federico Cheli, Matteo Matteucci, Francesco Braghin
Link: https://arxiv.org/abs/2106.09307
Abstract: Self-driving technology is expected to revolutionize different sectors and is seen as the natural evolution of road vehicles. In the last years, real-world validation of designed and virtually tested solutions is growing in importance since simulated environments will never fully replicate all the aspects that can affect results in the real world. To this end, this paper presents our prototype platform for experimental research on connected and autonomous driving projects. In detail, the paper presents the overall architecture of the vehicle focusing both on mechanical aspects related to remote actuation and sensors set-up and software aspects by means of a comprehensive description of the main algorithms required for autonomous driving as ego-localization, environment perception, motion planning, and actuation. Finally, experimental tests conducted in an urban-like environment are reported to validate and assess the performances of the overall system.

【13】 Field trial on Ocean Estimation for Multi-Vessel Multi-Float-based Active perception

Authors: Giovanni D'urso, James Ju Heon Lee, Ki Myung Brian Lee, Jackson Shields, Brenton Leighton, Oscar Pizarro, Chanyeol Yoo, Robert Fitch
Comments: 7 pages, 6 figures, presented at "ICRA2021, 1st Advanced Marine Robotics TC Workshop: Active Perception"
Link: https://arxiv.org/abs/2106.09279
Abstract: Marine vehicles have been used for various scientific missions where information over features of interest is collected. In order to maximise efficiency in collecting information over a large search space, we should be able to deploy a large number of autonomous vehicles that make a decision based on the latest understanding of the target feature in the environment. In our previous work, we have presented a hierarchical framework for the multi-vessel multi-float (MVMF) problem where surface vessels drop and pick up underactuated floats in a time-minimal way. In this paper, we present the field trial results using the framework with a number of drifters and floats. We discovered a number of important aspects that need to be considered in the proposed framework, and present the potential approaches to address the challenges.

【14】 Learning Robot Exploration Strategy with 4D Point-Clouds-like Information as Observations

Authors: Zhaoting Li, Tingguang Li, Jiankun Wang, Max Q.-H. Meng
Link: https://arxiv.org/abs/2106.09257
Abstract: Being able to explore unknown environments is a requirement for fully autonomous robots. Many learning-based methods have been proposed to learn an exploration strategy. In the frontier-based exploration, learning algorithms tend to learn the optimal or near-optimal frontier to explore. Most of these methods represent the environments as fixed size images and take these as inputs to neural networks. However, the size of environments is usually unknown, which makes these methods fail to generalize to real world scenarios. To address this issue, we present a novel state representation method based on 4D point-clouds-like information, including the locations, frontier, and distance information. We also design a neural network that can process these 4D point-clouds-like information and generate the estimated value for each frontier. Then this neural network is trained using the typical reinforcement learning framework. We test the performance of our proposed method by comparing it with other five methods and test its scalability on a map that is much larger than maps in the training set. The experiment results demonstrate that our proposed method needs shorter average traveling distances to explore whole environments and can be adopted in maps with arbitrarily sizes.

【15】 Decentralised Intelligence, Surveillance, and Reconnaissance in Unknown Environments with Heterogeneous Multi-Robot Systems

Authors: Ki Myung Brian Lee, Felix H. Kong, Ricardo Cannizzaro, Jennifer L. Palmer, David Johnson, Chanyeol Yoo, Robert Fitch
Affiliation: University of Technology Sydney
Comments: 7 pages, 6 figures. Presented at ICRA2021 Workshop on Robot Swarms in the Real World: From Design to Deployment
Link: https://arxiv.org/abs/2106.09219
Abstract: We present the design and implementation of a decentralised, heterogeneous multi-robot system for performing intelligence, surveillance and reconnaissance (ISR) in an unknown environment. The team consists of functionally specialised robots that gather information and others that perform a mission-specific task, and is coordinated to achieve simultaneous exploration and exploitation in the unknown environment. We present a practical implementation of such a system, including decentralised inter-robot localisation, mapping, data fusion and coordination. The system is demonstrated in an efficient distributed simulation. We also describe an UAS platform for hardware experiments, and the ongoing progress.

【16】 Learning from Demonstration without Demonstrations

Authors: Tom Blau, Gilad Francis, Philippe Morere
Affiliation: School of Computer Science, The University of Sydney
Comments: International Conference on Robotics and Automation (ICRA), 2021. arXiv admin note: substantial text overlap with arXiv:2001.06940
Link: https://arxiv.org/abs/2106.09203
Abstract: State-of-the-art reinforcement learning (RL) algorithms suffer from high sample complexity, particularly in the sparse reward case. A popular strategy for mitigating this problem is to learn control policies by imitating a set of expert demonstrations. The drawback of such approaches is that an expert needs to produce demonstrations, which may be costly in practice. To address this shortcoming, we propose Probabilistic Planning for Demonstration Discovery (P2D2), a technique for automatically discovering demonstrations without access to an expert. We formulate discovering demonstrations as a search problem and leverage widely-used planning algorithms such as Rapidly-exploring Random Tree to find demonstration trajectories. These demonstrations are used to initialize a policy, then refined by a generic RL algorithm. We provide theoretical guarantees of P2D2 finding successful trajectories, as well as bounds for its sampling complexity. We experimentally demonstrate the method outperforms classic and intrinsic exploration RL techniques in a range of classic control and robotics tasks, requiring only a fraction of exploration samples and achieving better asymptotic performance.

【17】 Automatic Curricula via Expert Demonstrations

Authors: Siyu Dai, Andreas Hofmann, Brian Williams
Affiliation: Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, United States
Comments: Preprint, work in progress
Link: https://arxiv.org/abs/2106.09159
Abstract: We propose Automatic Curricula via Expert Demonstrations (ACED), a reinforcement learning (RL) approach that combines the ideas of imitation learning and curriculum learning in order to solve challenging robotic manipulation tasks with sparse reward functions. Curriculum learning solves complicated RL tasks by introducing a sequence of auxiliary tasks with increasing difficulty, yet how to automatically design effective and generalizable curricula remains a challenging research problem. ACED extracts curricula from a small amount of expert demonstration trajectories by dividing demonstrations into sections and initializing training episodes to states sampled from different sections of demonstrations. Through moving the reset states from the end to the beginning of demonstrations as the learning agent improves its performance, ACED not only learns challenging manipulation tasks with unseen initializations and goals, but also discovers novel solutions that are distinct from the demonstrations. In addition, ACED can be naturally combined with other imitation learning methods to utilize expert demonstrations in a more efficient manner, and we show that a combination of ACED with behavior cloning allows pick-and-place tasks to be learned with as few as 1 demonstration and block stacking tasks to be learned with 20 demonstrations.
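The reset-state curriculum described in this abstract (split a demonstration into sections, start episodes from the last section, and move toward the beginning as the agent improves) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the ACED implementation: the class name `ReverseDemoCurriculum`, the success-rate threshold `promote_at`, and the sliding `window` are all hypothetical choices.

```python
import random
from typing import Generic, List, Sequence, Tuple, TypeVar

S = TypeVar("S")


class ReverseDemoCurriculum(Generic[S]):
    """Sample episode reset states from one section of a demonstration,
    moving from the final section toward the first as success improves."""

    def __init__(self, demo: Sequence[S], num_sections: int,
                 promote_at: float = 0.8, window: int = 20) -> None:
        if not 1 <= num_sections <= len(demo):
            raise ValueError("num_sections must be between 1 and len(demo)")
        self.demo = list(demo)
        self.num_sections = num_sections
        self.promote_at = promote_at      # assumed success-rate threshold
        self.window = window              # assumed number of recent episodes
        self.section = num_sections - 1   # start closest to the goal
        self.results: List[bool] = []

    def bounds(self) -> Tuple[int, int]:
        """Half-open index range [lo, hi) of the current section."""
        n = len(self.demo)
        lo = self.section * n // self.num_sections
        hi = (self.section + 1) * n // self.num_sections
        return lo, hi

    def sample_reset_state(self, rng: random.Random) -> S:
        lo, hi = self.bounds()
        return self.demo[rng.randrange(lo, hi)]

    def report(self, success: bool) -> None:
        """Record an episode outcome; move one section earlier once the
        recent success rate reaches the threshold."""
        self.results.append(success)
        recent = self.results[-self.window:]
        rate = sum(recent) / self.window
        if len(recent) == self.window and rate >= self.promote_at and self.section > 0:
            self.section -= 1
            self.results.clear()


curriculum = ReverseDemoCurriculum(list(range(10)), num_sections=5,
                                   promote_at=1.0, window=2)
first_state = curriculum.sample_reset_state(random.Random(0))
```

Starting near the goal means early episodes see the sparse reward often; each promotion lengthens the part of the task the agent must solve on its own.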

【18】 Planning on a (Risk) Budget: Safe Non-Conservative Planning in Probabilistic Dynamic Environments

Authors: Hung-Jui Huang, Kai-Chi Huang, Michal Čáp, Yibiao Zhao, Ying Nian Wu, Chris L. Baker
Affiliation: University of California
Comments: 9 pages, 5 figures, International Conference on Robotics and Automation 2021
Link: https://arxiv.org/abs/2106.09127
Abstract: Planning in environments with other agents whose future actions are uncertain often requires compromise between safety and performance. Here our goal is to design efficient planning algorithms with guaranteed bounds on the probability of safety violation, which nonetheless achieve non-conservative performance. To quantify a system's risk, we define a natural criterion called interval risk bounds (IRBs), which provide a parametric upper bound on the probability of safety violation over a given time interval or task. We present a novel receding horizon algorithm, and prove that it can satisfy a desired IRB. Our algorithm maintains a dynamic risk budget which constrains the allowable risk at each iteration, and guarantees recursive feasibility by requiring a safe set to be reachable by a contingency plan within the budget. We empirically demonstrate that our algorithm is both safer and less conservative than strong baselines in two simulated autonomous driving experiments in scenarios involving collision avoidance with other vehicles, and additionally demonstrate our algorithm running on an autonomous class 8 truck.
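To make the idea of a risk budget concrete, the sketch below uses the simplest possible accounting, which is an assumption of this example and not the paper's algorithm: by the union bound, if the per-step violation probabilities of the executed plans sum to at most the total budget, the violation probability over the whole interval is at most that budget. The class `RiskBudget` and the (cost, risk) candidate format are hypothetical, and a zero-risk candidate stands in for the contingency plan.

```python
from typing import List, Optional, Tuple

Candidate = Tuple[float, float]  # (cost, estimated violation probability)


class RiskBudget:
    """Union-bound accounting of risk spent over a task interval."""

    def __init__(self, total: float) -> None:
        self.total = total
        self.spent = 0.0

    @property
    def remaining(self) -> float:
        return self.total - self.spent

    def choose(self, candidates: List[Candidate]) -> Optional[Candidate]:
        """Pick the cheapest candidate whose risk fits the remaining budget,
        then charge its risk against the budget."""
        feasible = [c for c in candidates if c[1] <= self.remaining]
        if not feasible:
            return None  # no plan fits; a real planner must avoid this state
        best = min(feasible, key=lambda c: c[0])
        self.spent += best[1]
        return best


budget = RiskBudget(total=0.05)
options = [(1.0, 0.03), (2.0, 0.0)]   # fast-but-risky vs. slow contingency
first = budget.choose(options)
second = budget.choose(options)
```

In this toy run the risky plan is affordable once; afterwards only the zero-risk fallback fits, which is the trade-off between performance and safety that the abstract describes.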

【19】 Safe Reinforcement Learning Using Advantage-Based Intervention

Authors: Nolan Wagener, Byron Boots, Ching-An Cheng Affiliation: Institute for Robotics and Intelligent Machines; Allen School of Computer Science and Engineering, University of Washington Comments: Appearing in ICML 2021. 28 pages, 7 figures Link: https://arxiv.org/abs/2106.09110 Abstract: Many sequential decision problems involve finding a policy that maximizes total reward while obeying safety constraints. Although much recent research has focused on the development of safe reinforcement learning (RL) algorithms that produce a safe policy after training, ensuring safety during training as well remains an open problem. A fundamental challenge is performing exploration while still satisfying constraints in an unknown Markov decision process (MDP). In this work, we address this problem for the chance-constrained setting. We propose a new algorithm, SAILR, that uses an intervention mechanism based on advantage functions to keep the agent safe throughout training and optimizes the agent's policy using off-the-shelf RL algorithms designed for unconstrained MDPs. Our method comes with strong guarantees on safety during both training and deployment (i.e., after training and without the intervention mechanism) and policy performance compared to the optimal safety-constrained policy. In our experiments, we show that SAILR violates constraints far less during training than standard safe RL and constrained MDP approaches and converges to a well-performing policy that can be deployed safely without intervention. Our code is available at https://github.com/nolanwagener/safe_rl.
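
One plausible reading of "an intervention mechanism based on advantage functions" is sketched below. It is an assumption-laden illustration, not the code from the linked repository: it supposes a tabular estimate `q_unsafe[(state, action)]` of the probability of eventually violating a constraint, a fixed backup policy, and a threshold `eta`, all of which are hypothetical names introduced here.

```python
def advantage(q_unsafe, backup_policy, state, action):
    """How much riskier `action` is than the backup policy's action in `state`."""
    baseline = q_unsafe[(state, backup_policy[state])]
    return q_unsafe[(state, action)] - baseline


def shielded_action(q_unsafe, backup_policy, state, proposed_action, eta):
    """Return (action_to_execute, intervened).

    The proposed action is overridden by the backup policy whenever its
    risk advantage exceeds the threshold `eta`.
    """
    if advantage(q_unsafe, backup_policy, state, proposed_action) > eta:
        return backup_policy[state], True
    return proposed_action, False


if __name__ == "__main__":
    q_unsafe = {("s", "left"): 0.05, ("s", "right"): 0.60, ("s", "brake"): 0.02}
    backup = {"s": "brake"}
    print(shielded_action(q_unsafe, backup, "s", "right", eta=0.1))
    print(shielded_action(q_unsafe, backup, "s", "left", eta=0.1))
```

Because the check only wraps the action the learner proposes, the learner itself can remain an ordinary unconstrained RL algorithm, which is consistent with the abstract's description.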

【20】 Convex Optimization for Trajectory Generation

Authors: Danylo Malyuta, Taylor P. Reynolds, Michael Szmuk, Thomas Lew, Riccardo Bonalli, Marco Pavone, Behcet Acikmese Affiliation: William E. Boeing Department of Aeronautics and Astronautics, University of Washington, Seattle, WA, USA; Department of Aeronautics and Astronautics, Stanford University, Stanford, CA, USA Comments: 68 pages, 42 figures, 5 tables. This work has been submitted to the IEEE for possible publication Link: https://arxiv.org/abs/2106.09125 Abstract: Reliable and efficient trajectory generation methods are a fundamental need for autonomous dynamical systems of tomorrow. The goal of this article is to provide a comprehensive tutorial of three major convex optimization-based trajectory generation methods: lossless convexification (LCvx), and two sequential convex programming algorithms known as SCvx and GuSTO. In this article, trajectory generation is the computation of a dynamically feasible state and control signal that satisfies a set of constraints while optimizing key mission objectives. The trajectory generation problem is almost always nonconvex, which typically means that it is not readily amenable to efficient and reliable solution onboard an autonomous vehicle. The three algorithms that we discuss use problem reformulation and a systematic algorithmic strategy to nonetheless solve nonconvex trajectory generation tasks through the use of a convex optimizer. The theoretical guarantees and computational speed offered by convex optimization have made the algorithms popular in both research and industry circles. To date, the list of applications includes rocket landing, spacecraft hypersonic reentry, spacecraft rendezvous and docking, aerial motion planning for fixed-wing and quadrotor vehicles, robot motion planning, and more. Among these applications are high-profile rocket flights conducted by organizations like NASA, Masten Space Systems, SpaceX, and Blue Origin. This article aims to give the reader the tools and understanding necessary to work with each algorithm, and to know what each method can and cannot do. A publicly available source code repository supports the provided numerical examples. By the end of the article, the reader should be ready to use the methods, to extend them, and to contribute to their many exciting modern applications.
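
For readers new to the topic, the following toy example shows what "trajectory generation as a convex problem" can look like in the simplest case. It is not LCvx, SCvx, or GuSTO: it is a minimum-energy transfer for a one-dimensional double integrator starting at rest at the origin, which is an equality-constrained least-squares problem with a closed-form solution. The function names and the discretization are assumptions made for this sketch.

```python
def min_energy_controls(num_steps, dt, target_pos, target_vel=0.0):
    """Minimize sum(u[k]**2) subject to reaching the target state.

    Dynamics per step: pos += dt*vel + 0.5*dt*dt*u, then vel += dt*u.
    The terminal state is linear in the controls, so the least-norm
    solution is u = G^T (G G^T)^{-1} b with G = [c; d] and b the target.
    """
    if num_steps < 2:
        raise ValueError("need at least 2 steps to set position and velocity")
    # Influence of u[k] on the final position and on the final velocity.
    c = [dt * dt * (num_steps - k - 0.5) for k in range(num_steps)]
    d = [dt] * num_steps
    # 2x2 Gram matrix G G^T and its explicit inverse applied to b.
    m11 = sum(x * x for x in c)
    m12 = sum(x * y for x, y in zip(c, d))
    m22 = sum(y * y for y in d)
    det = m11 * m22 - m12 * m12
    lam1 = (m22 * target_pos - m12 * target_vel) / det
    lam2 = (-m12 * target_pos + m11 * target_vel) / det
    return [c[k] * lam1 + d[k] * lam2 for k in range(num_steps)]


def simulate(controls, dt):
    """Roll the double integrator forward from rest at the origin."""
    pos, vel = 0.0, 0.0
    for u in controls:
        pos += dt * vel + 0.5 * dt * dt * u
        vel += dt * u
    return pos, vel


if __name__ == "__main__":
    u = min_energy_controls(num_steps=10, dt=0.1, target_pos=1.0)
    print(simulate(u, dt=0.1))
```

Real problems of the kind the tutorial covers add nonconvex dynamics and constraints, which is where the reformulation and sequential strategies named in the abstract come in.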

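This article is part of the Tencent Cloud self-media sharing program and is shared from a WeChat official account.
Originally published: 2021-06-18. In case of infringement, contact cloudcommunity@tencent.com for removal.

This article is shared from the WeChat official account arXiv每日学术速递.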
