
Robotics Academic Digest [7.7]

Author: WeChat official account arXiv每日学术速递
Published: 2021-07-27 10:27:20
Column: arXiv每日学术速递

cs.RO (Robotics): 19 papers in total

【1】 Learned Visual Navigation for Under-Canopy Agricultural Robots

Authors: Arun Narenthiran Sivakumar, Sahil Modi, Mateus Valverde Gasparino, Che Ellis, Andres Eduardo Baquero Velasquez, Girish Chowdhary, Saurabh Gupta
Affiliations: Department of Agricultural and Biological Engineering, University of Illinois at Urbana-Champaign (UIUC); Department of Computer Science, UIUC; Department of Electrical and Computer Engineering, UIUC; EarthSense Inc.
Comments: RSS 2021. Project website with data and videos: this https URL
Link: https://arxiv.org/abs/2107.02792
Abstract: We describe a system for visually guided autonomous navigation of under-canopy farm robots. Low-cost under-canopy robots can drive between crop rows under the plant canopy and accomplish tasks that are infeasible for over-the-canopy drones or larger agricultural equipment. However, autonomously navigating them under the canopy presents a number of challenges: unreliable GPS and LiDAR, high cost of sensing, challenging farm terrain, clutter due to leaves and weeds, and large variability in appearance over the season and across crop types. We address these challenges by building a modular system that leverages machine learning for robust and generalizable perception from monocular RGB images from low-cost cameras, and model predictive control for accurate control in challenging terrain. Our system, CropFollow, is able to autonomously drive 485 meters per intervention on average, outperforming a state-of-the-art LiDAR-based system (286 meters per intervention) in extensive field testing spanning over 25 km.

【2】 Geometrical Postural Optimisation of 7-DoF Limb-Like Manipulators

Authors: Carlo Tiseo, Sydney Rebecca Charitos, Michael Mistry
Affiliations: ECR, Institute of Perception Action & Behaviour, School of Informatics, University of Edinburgh, Edinburgh, UK
Link: https://arxiv.org/abs/2107.02715
Abstract: Robots are moving towards applications in less structured environments, but their model-based controllers are challenged by the tasks' complexity and intrinsic environmental unpredictability. Studying biological motor control can provide insights into overcoming these limitations due to the high dexterity and stability observable in humans and animals. This work presents a geometrical solution to the postural optimisation of 7-DoF limb-like mechanisms that is robust to singularities and computationally efficient. The theoretical formulation identifies two separate decoupled optimisation strategies. The shoulder and elbow strategy aligns the plane of motion with the expected plane of motion and guarantees the reachability of the end-posture. The wrist strategy ensures the end-effector orientation, which is essential to retain manipulability when nearing a singular configuration. The numerical results confirmed the theoretical observations and allowed us to identify the effect of different grasp strategies on system manipulability. The geometrical method was numerically tested in thousands of configurations, proving to be both robust and accurate. The tested scenarios include left and right arm postures, singular configurations, and walking scenarios. The proposed geometrical approach can find application in developing efficient and robust interaction controllers that could be applied in computational neuroscience and robotics.

【3】 Real-time Pose Estimation from Images for Multiple Humanoid Robots

Authors: Arash Amini, Hafez Farazi, Sven Behnke
Affiliations: University of Bonn, Computer Science Institute VI, Autonomous Intelligent Systems, Friedrich-Hirzebruch-Allee, Bonn, Germany
Link: https://arxiv.org/abs/2107.02675
Abstract: Pose estimation commonly refers to computer vision methods that recognize people's body postures in images or videos. With recent advancements in deep learning, we now have compelling models to tackle the problem in real-time. Since these models are usually designed for human images, one needs to adapt existing models to work on other creatures, including robots. This paper examines different state-of-the-art pose estimation models and proposes a lightweight model that can work in real-time on humanoid robots in the RoboCup Humanoid League environment. Additionally, we present a novel dataset called the HumanoidRobotPose dataset. The results of this work have the potential to enable many advanced behaviors for soccer-playing robots.

【4】 Best Axes Composition: Multiple Gyroscopes IMU Sensor Fusion to Reduce Systematic Error

Authors: Marsel Faizullin, Gonzalo Ferrer
Affiliations: Skolkovo Institute of Science and Technology
Comments: European Conference on Mobile Robots 2021
Link: https://arxiv.org/abs/2107.02632
Abstract: In this paper, we have proposed an algorithm to combine multiple cheap Inertial Measurement Unit (IMU) sensors to calculate 3D orientations accurately. Our approach takes into account the inherent and non-negligible systematic error in the gyroscope model and provides a solution based on the error observed during previous instants of time. Our algorithm, the Best Axis Composition (BAC), dynamically chooses the best-fitting axes among IMUs to improve the estimation performance. We have compared our approach with a probabilistic Multiple IMU (MIMU) approach, and we have validated our algorithm on our collected dataset. As a result, it takes as few as 2 IMUs to significantly improve accuracy, while other MIMU approaches need a higher number of sensors to achieve the same results.
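The per-axis selection idea described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' BAC algorithm: the function name `select_best_axes`, the use of the mean absolute error over a recent window as the fitness score, and the sample numbers are all assumptions made for illustration only.

```python
from statistics import mean


def select_best_axes(recent_errors):
    """Pick, for each gyroscope axis, the IMU with the smallest recent error.

    recent_errors maps an IMU name to a dict of axis -> list of recent
    error samples (hypothetical units). Assumption: the mean absolute
    error over the window serves as the fitness score.
    """
    best = {}
    for axis in ("x", "y", "z"):
        scores = {
            imu: mean(abs(e) for e in axes[axis])
            for imu, axes in recent_errors.items()
        }
        # Choose the IMU name whose score is lowest for this axis.
        best[axis] = min(scores, key=scores.get)
    return best


# Hypothetical error windows for two IMUs.
errors = {
    "imu_a": {"x": [0.01, 0.02], "y": [0.05, 0.04], "z": [0.01, 0.01]},
    "imu_b": {"x": [0.03, 0.03], "y": [0.01, 0.02], "z": [0.02, 0.03]},
}
chosen = select_best_axes(errors)
```

With these made-up numbers the composite gyroscope would take its x and z axes from `imu_a` and its y axis from `imu_b`, which shows how two sensors can already yield a better combined triad than either one alone.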

【5】 Open-Source LiDAR Time Synchronization System by Mimicking GPS-clock

Authors: Marsel Faizullin, Anastasiia Kornilova, Gonzalo Ferrer
Affiliations: Skolkovo Institute of Science and Technology, Moscow, Russia
Comments: IEEE Sensors 2021 Conference
Link: https://arxiv.org/abs/2107.02625
Abstract: Time synchronization of multiple sensors is one of the main issues when building sensor networks. Data fusion algorithms and their applications, such as LiDAR-IMU Odometry (LIO), rely on precise timestamping. We introduce an open-source LiDAR to inertial measurement unit (IMU) hardware time synchronization system, which could be generalized to multiple sensors such as cameras, encoders, other LiDARs, etc. The system mimics a GPS-supplied clock interface by a microcontroller-powered platform and provides 1 microsecond synchronization precision. In addition, we evaluate the system precision compared to other synchronization methods, including timestamping provided by ROS software and the LiDAR inner clock, showing clear advantages over both baseline methods.
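As a rough illustration of what mimicking a GPS clock can involve, the sketch below frames a timestamp as an NMEA 0183 `$GPRMC` sentence with its checksum. This is an assumption, not something stated in the abstract: many LiDARs accept a pulse-per-second signal together with such a sentence, but the paper's actual interface may differ. The helper names and the zeroed position fields are hypothetical.

```python
from functools import reduce


def nmea_checksum(payload):
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return format(reduce(lambda acc, ch: acc ^ ord(ch), payload, 0), "02X")


def frame_sentence(payload):
    """Wrap a payload into a complete NMEA sentence (without line ending)."""
    return "$" + payload + "*" + nmea_checksum(payload)


def is_valid(sentence):
    """Check that the trailing checksum matches the payload."""
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    payload, _, checksum = sentence[1:].rpartition("*")
    return nmea_checksum(payload) == checksum.strip().upper()


# Hypothetical payload: 12:35:19 UTC on 23 March 2021, position fields zeroed.
payload = "GPRMC,123519.00,A,0000.00,N,00000.00,E,0.0,0.0,230321,,"
sentence = frame_sentence(payload)
```

In a real setup the microcontroller would emit such a sentence shortly after each pulse edge, so that the sensor can attach an absolute time to the pulse; the timing of that pairing is hardware-specific and not modeled here.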

【6】 Tactile Sensing with a Tendon-Driven Soft Robotic Finger

Authors: Chang Cheng, Yadong Yan, Mingjun Guan, Jianan Zhang, Yu Wang
Affiliations: School of Biological Sci. and Medical Engr., Beihang University, Beijing, China; Dept. of Math. and Computer Sci., Colorado College, Colorado, USA
Comments: 6 pages, 10 figures, submitted to ICCMA 2021
Link: https://arxiv.org/abs/2107.02546
Abstract: In this paper, a novel tactile sensing mechanism for soft robotic fingers is proposed. Inspired by the proprioception mechanism found in mammals, the proposed approach infers tactile information from a strain sensor attached to the finger's tendon. We perform experiments to test the tactile sensing capabilities of the proposed structures, and our results indicate this method is capable of palpating texture and stiffness in both abduction and flexion contact. Under systematic cross validation, the proposed system achieved 100% and 99.7% accuracy in texture and stiffness discrimination respectively, which validates the viability of this approach. Furthermore, we use statistical tools to determine the significance of various features extracted for classification.

【7】 Approximate Topological Optimization using Multi-Mode Estimation for Robot Motion Planning

Authors: Andreas Orthey, Florian T. Pokorny, Marc Toussaint
Affiliations: Max Planck Institute for Intelligent Systems; KTH Royal Institute of Technology, Sweden; Technical University of Berlin
Comments: 5 pages, Extended Abstract for RSS Workshop on Geometry and Topology in Robotics
Link: https://arxiv.org/abs/2107.02498
Abstract: In this extended abstract, we report on ongoing work towards an approximate multimodal optimization algorithm with asymptotic guarantees. Multimodal optimization is the problem of finding all local optimal solutions (modes) to a path optimization problem. This is important to compress path databases, as contingencies for replanning and as a source of symbolic representations. Following ideas from Morse theory, we define modes as paths invariant under optimization of a cost functional. We develop a multi-mode estimation algorithm which approximately finds all modes of a given motion optimization problem and asymptotically converges. This is made possible by integrating sparse roadmaps with an existing single-mode optimization algorithm. Initial evaluation results show the multi-mode estimation algorithm as a promising direction to study path spaces from a topological point of view.

【8】 Learning a Generative Transition Model for Uncertainty-Aware Robotic Manipulation

Authors: Lars Berscheid, Pascal Meißner, Torsten Kröger
Affiliations: Karlsruhe Institute of Technology (KIT); University of Aberdeen
Comments: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)
Link: https://arxiv.org/abs/2107.02464
Abstract: Robot learning of real-world manipulation tasks remains challenging and time consuming, even though actions are often simplified by single-step manipulation primitives. In order to compensate for the removed time dependency, we additionally learn an image-to-image transition model that is able to predict a next state including its uncertainty. We apply this approach to bin picking, the task of emptying a bin using grasping as well as pre-grasping manipulation as fast as possible. The transition model is trained with up to 42000 pairs of real-world images before and after a manipulation action. Our approach enables two important skills: First, for applications with flange-mounted cameras, picks per hour (PPH) can be increased by around 15% by skipping image measurements. Second, we use the model to plan action sequences ahead of time and optimize time-dependent rewards, e.g. to minimize the number of actions required to empty the bin. We evaluate both improvements with real-robot experiments and achieve over 700 PPH in the YCB Box and Blocks Test.

【9】 Fast-Learning Grasping and Pre-Grasping via Clutter Quantization and Q-map Masking

Authors: Dafa Ren, Xiaoqiang Ren, Xiaofan Wang, S. Tejaswi Digumarti, Guodong Shi
Affiliations: The University of Sydney
Link: https://arxiv.org/abs/2107.02452
Abstract: Grasping objects in cluttered scenarios is a challenging task in robotics. Performing pre-grasp actions such as pushing and shifting to scatter objects is a way to reduce clutter. Based on deep reinforcement learning, we propose a Fast-Learning Grasping (FLG) framework that can integrate pre-grasping actions along with grasping to pick up objects from cluttered scenarios with reduced real-world training time. We associate rewards for performing moving actions with the change of environmental clutter and utilize a hybrid triggering method, leading to data-efficient learning and synergy. Then we use the output of an extended fully convolutional network as the value function of each pixel point of the workspace and establish an accurate estimation of the grasp probability for each action. We also introduce a mask function as prior knowledge to enable the agents to focus on the accurate pose adjustment to improve the effectiveness of collecting training data and, hence, to learn efficiently. We carry out pre-training of the FLG in a simulated environment, and then the learnt model is transferred to the real world with minimal fine-tuning for further learning during actions. Experimental results demonstrate a 94% grasp success rate and the ability to generalize to novel objects. Compared to state-of-the-art approaches in the literature, the proposed FLG framework can achieve a similar or higher grasp success rate with less training in the real world. Supplementary video is available at https://youtu.be/e04uDLsxfDg.

【10】 A Hierarchical Dual Model of Environment- and Place-Specific Utility for Visual Place Recognition

Authors: Nikhil Varma Keetha, Michael Milford, Sourav Garg
Affiliations: Indian Institute of Technology (ISM) Dhanbad
Comments: Accepted to IEEE Robotics and Automation Letters (RA-L) and IROS 2021
Link: https://arxiv.org/abs/2107.02440
Abstract: Visual Place Recognition (VPR) approaches have typically attempted to match places by identifying visual cues, image regions or landmarks that have high "utility" in identifying a specific place. But this concept of utility is not singular; rather, it can take a range of forms. In this paper, we present a novel approach to deduce two key types of utility for VPR: the utility of visual cues "specific" to an environment, and to a particular place. We employ contrastive learning principles to estimate both the environment- and place-specific utility of Vector of Locally Aggregated Descriptors (VLAD) clusters in an unsupervised manner, which is then used to guide local feature matching through keypoint selection. By combining these two utility measures, our approach achieves state-of-the-art performance on three challenging benchmark datasets, while simultaneously reducing the required storage and compute time. We provide further analysis demonstrating that unsupervised cluster selection produces semantically meaningful results, that finer-grained categorization often has higher utility for VPR than high-level semantic categorization (e.g. building, road), and characterise how these two utility measures vary across different places and environments. Source code is made publicly available at https://github.com/Nik-V9/HEAPUtil.

【11】 DL-AMP and DBTO: An Automatic Merge Planning and Trajectory Optimization and Its Application in Autonomous Driving

Authors: Yuncheng Jiang, Qi Lin, Jiwei Zhang, Jun Wang, Danjian Qian, Yuxi Cai
Comments: 8 pages; preprint on Feb 2, 2021; accepted by ITSC 2021 on Jun 25, 2021
Link: https://arxiv.org/abs/2107.02413
Abstract: This paper presents an automatic merging algorithm for autonomous driving vehicles, which decouples the specific motion planning problem into a Dual-Layer Automatic Merge Planning (DL-AMP) and a Descent-Based Trajectory Optimization (DBTO). This work leads to great improvements in finding the best merge opportunity, lateral and longitudinal merge planning and control, trajectory postprocessing and driving comfort.

【12】 Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling

Authors: Qingyong Hu, Bo Yang, Linhai Xie, Stefano Rosa, Yulan Guo, Zhihua Wang, Niki Trigoni, Andrew Markham
Affiliations: Yang is with the Department of Computing, The Hong Kong Polytechnic University; Guo is with the School of Electronics and Communication Engineering, Sun Yat-sen University
Comments: IEEE TPAMI 2021. arXiv admin note: substantial text overlap with arXiv:1911.11236
Link: https://arxiv.org/abs/2107.02389
Abstract: We study the problem of efficient semantic segmentation of large-scale 3D point clouds. By relying on expensive sampling techniques or computationally heavy pre/post-processing steps, most existing approaches are only able to be trained and operate over small-scale point clouds. In this paper, we introduce RandLA-Net, an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Although remarkably computation and memory efficient, random sampling can discard key features by chance. To overcome this, we introduce a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details. Comparative experiments show that our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches. Moreover, extensive experiments on five large-scale point cloud datasets, including Semantic3D, SemanticKITTI, Toronto3D, NPM3D and S3DIS, demonstrate the state-of-the-art semantic segmentation performance of our RandLA-Net.
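The trade-off between random sampling and a more complex point selection method can be sketched in pure Python. Farthest point sampling is used here as a representative "more complex" method; that choice is an assumption, since the abstract does not name one. The function names and the toy point cloud are hypothetical, and this is not RandLA-Net code.

```python
import random


def random_sample(points, k, seed=0):
    """Pick k points uniformly at random; cost does not depend on geometry."""
    rng = random.Random(seed)
    return rng.sample(points, k)


def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_sample(points, k):
    """Greedy farthest point sampling: roughly len(points) * k distance updates."""
    chosen = [points[0]]
    # Squared distance from every point to its nearest chosen point so far.
    dist = [sq_dist(p, points[0]) for p in points]
    while len(chosen) < k:
        idx = max(range(len(points)), key=dist.__getitem__)
        chosen.append(points[idx])
        dist = [min(d, sq_dist(p, points[idx])) for d, p in zip(dist, points)]
    return chosen


# Toy cloud of 1000 random points in the unit cube.
rng = random.Random(42)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(1000)]
subset_random = random_sample(cloud, 100)
subset_fps = farthest_point_sample(cloud, 100)
```

The greedy method revisits every point for each selected sample, while the random draw does not look at coordinates at all. That cost difference is what makes random sampling attractive at large scale, and it is also why the abstract pairs it with a feature aggregation module to compensate for points dropped by chance.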

【13】 Multi-Modal Motion Planning Using Composite Pose Graph Optimization

Authors: L. Lao Beyer, N. Balabanska, E. Tal, S. Karaman
Affiliations: Laboratory for Information and Decision Systems, Massachusetts Institute of Technology (MIT), Cambridge, MA
Comments: 7 pages, 6 figures, to be included in proceedings of IEEE International Conference on Robotics and Automation 2021
Link: https://arxiv.org/abs/2107.02384
Abstract: In this paper, we present a motion planning framework for multi-modal vehicle dynamics. Our proposed algorithm employs transcription of the optimization objective function, vehicle dynamics, and state and control constraints into sparse factor graphs, which, combined with mode transition constraints, constitute a composite pose graph. By formulating the multi-modal motion planning problem in composite pose graph form, we enable utilization of efficient techniques for optimization on sparse graphs, such as those widely applied in dual estimation problems, e.g., simultaneous localization and mapping (SLAM). The resulting motion planning algorithm optimizes the multi-modal trajectory, including the location of mode transitions, and is guided by the pose graph optimization process to eliminate unnecessary transitions, enabling efficient discovery of optimized mode sequences from rough initial guesses. We demonstrate multi-modal trajectory optimization in both simulation and real-world experiments for vehicles with various dynamics models, such as an airplane with taxi and flight modes, and a vertical take-off and landing (VTOL) fixed-wing aircraft that transitions between hover and horizontal flight modes.

【14】 Real-Time Motion Planning of a Hydraulic Excavator using Trajectory Optimization and Model Predictive Control

Authors: Dongjae Lee*, Inkyu Jang*, Jeonghyun Byun, Hoseong Seo, H. Jin Kim
Comments: 8 pages, 8 figures, accepted to the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Link: https://arxiv.org/abs/2107.02366
Abstract: Automation of excavation tasks requires real-time trajectory planning satisfying various constraints. To guarantee both constraint feasibility and real-time trajectory re-plannability, we present an integrated framework for real-time optimization-based trajectory planning of a hydraulic excavator. The proposed framework is composed of two main modules: a global planner and a real-time local planner. The global planner computes the entire global trajectory considering excavation volume and energy minimization, while the local counterpart tracks the global trajectory in a receding horizon manner, satisfying dynamic feasibility, physical constraints, and disturbance-awareness. We validate the proposed planning algorithm in a simulation environment where two types of operations are conducted in the presence of emulated disturbance from hydraulic friction and soil-bucket interaction: shallow and deep excavation. The optimized global trajectories are obtained on the order of a second and are tracked by the local planner at faster than 30 Hz. To the best of our knowledge, this work presents the first real-time motion planning framework that satisfies constraints of a hydraulic excavator, such as force/torque, power, cylinder displacement, and flow rate limits.

【15】 Physical Interaction as Communication: Learning Robot Objectives Online from Human Corrections

Authors: Dylan P. Losey, Andrea Bajcsy, Marcia K. O'Malley, Anca D. Dragan
Affiliations: University of California; Rice University; Department of Mechanical Engineering
Link: https://arxiv.org/abs/2107.02349
Abstract: When a robot performs a task next to a human, physical interaction is inevitable: the human might push, pull, twist, or guide the robot. The state-of-the-art treats these interactions as disturbances that the robot should reject or avoid. At best, these robots respond safely while the human interacts; but after the human lets go, these robots simply return to their original behavior. We recognize that physical human-robot interaction (pHRI) is often intentional: the human intervenes on purpose because the robot is not doing the task correctly. In this paper, we argue that when pHRI is intentional it is also informative: the robot can leverage interactions to learn how it should complete the rest of its current task even after the person lets go. We formalize pHRI as a dynamical system, where the human has in mind an objective function they want the robot to optimize, but the robot does not get direct access to the parameters of this objective, which are internal to the human. Within our proposed framework, human interactions become observations about the true objective. We introduce approximations to learn from and respond to pHRI in real-time. We recognize that not all human corrections are perfect: often users interact with the robot noisily, and so we improve the efficiency of robot learning from pHRI by reducing unintended learning. Finally, we conduct simulations and user studies on a robotic manipulator to compare our proposed approach to the state-of-the-art. Our results indicate that learning from pHRI leads to better task performance and improved human satisfaction.

【16】 Multi-Modal Mutual Information (MuMMI) Training for Robust Self-Supervised Deep Reinforcement Learning

Authors: Kaiqi Chen, Yong Lee, Harold Soh
Affiliations: Dept. of Computer Science, National University of Singapore
Comments: 10 pages, Published in ICRA 2021
Link: https://arxiv.org/abs/2107.02339
Abstract: This work focuses on learning useful and robust deep world models using multiple, possibly unreliable, sensors. We find that current methods do not sufficiently encourage a shared representation between modalities; this can cause poor performance on downstream tasks and over-reliance on specific sensors. As a solution, we contribute a new multi-modal deep latent state-space model, trained using a mutual information lower-bound. The key innovation is a specially-designed density ratio estimator that encourages consistency between the latent codes of each modality. We tasked our method with learning policies (in a self-supervised manner) on multi-modal Natural MuJoCo benchmarks and a challenging Table Wiping task. Experiments show our method significantly outperforms state-of-the-art deep reinforcement learning methods, particularly in the presence of missing observations.

【17】 Pedestrian Emergence Estimation and Occlusion-Aware Risk Assessment for Urban Autonomous Driving

Authors: Mert Koc, Ekim Yurtsever, Keith Redmill, Umit Ozguner
Affiliations: Department of Electrical and Computer Engineering, The Ohio State University
Comments: Accepted to ITSC2021
Link: https://arxiv.org/abs/2107.02326
Abstract: Avoiding unseen or partially occluded vulnerable road users (VRUs) is a major challenge for fully autonomous driving in urban scenes. However, occlusion-aware risk assessment systems have not been widely studied. Here, we propose a pedestrian emergence estimation and occlusion-aware risk assessment system for urban autonomous driving. First, the proposed system utilizes available contextual information, such as visible cars and pedestrians, to estimate pedestrian emergence probabilities in occluded regions. These probabilities are then used in a risk assessment framework, and incorporated into a longitudinal motion controller. The proposed controller is tested against several baseline controllers that recapitulate some commonly observed driving styles. The simulated test scenarios include randomly placed parked cars and pedestrians, most of whom are occluded from the ego vehicle's view and emerge randomly. The proposed controller outperformed the baselines in terms of safety and comfort measures.

【18】 Autonomous Robotic Endoscope Control based on Semantically Rich Instructions

Authors: Caspar Gruijthuijsen, Luis C. Garcia-Peraza-Herrera, Gianni Borghesan, Dominiek Reynaerts, Jan Deprest, Sebastien Ourselin, Tom Vercauteren, Emmanuel Vander Poorten
Affiliations: Department of Mechanical Engineering, KU Leuven, Leuven, Belgium; Department of Medical Physics and Biomedical Engineering, University College London, United Kingdom; Department of Development and Regeneration, Division Woman and Child, KU Leuven, Belgium
Comments: Caspar Gruijthuijsen and Luis C. Garcia-Peraza-Herrera contributed equally to this work
Link: https://arxiv.org/abs/2107.02317
Abstract: In keyhole interventions, surgeons rely on a colleague to act as a camera assistant when their hands are occupied with surgical instruments. This often leads to reduced image stability, increased task completion times and sometimes errors. Robotic endoscope holders (REHs), controlled by a set of basic instructions, have been proposed as an alternative, but their unnatural handling increases the cognitive load of the surgeon, hindering their widespread clinical acceptance. We propose that REHs collaborate with the operating surgeon via semantically rich instructions that closely resemble those issued to a human camera assistant, such as "focus on my right-hand instrument". As a proof-of-concept, we present a novel system that paves the way towards a synergistic interaction between surgeons and REHs. The proposed platform allows the surgeon to perform a bi-manual coordination and navigation task, while a robotic arm autonomously performs various endoscope positioning tasks. Within our system, we propose a novel tooltip localization method based on surgical tool segmentation, and a novel visual servoing approach that ensures smooth and correct motion of the endoscope camera. We validate our vision pipeline and run a user study of this system. Through successful application in a medically proven bi-manual coordination and navigation task, the framework has been shown to be a promising starting point towards broader clinical adoption of REHs.

【19】 A visual introduction to Gaussian Belief Propagation

Authors: Joseph Ortiz, Talfan Evans, Andrew J. Davison
Affiliations: Imperial College London; DeepMind
Comments: See online version of this article: this https URL
Link: https://arxiv.org/abs/2107.02308
Abstract: In this article, we present a visual introduction to Gaussian Belief Propagation (GBP), an approximate probabilistic inference algorithm that operates by passing messages between the nodes of arbitrarily structured factor graphs. A special case of loopy belief propagation, GBP updates rely only on local information and will converge independently of the message schedule. Our key argument is that, given recent trends in computing hardware, GBP has the right computational properties to act as a scalable distributed probabilistic inference framework for future machine learning systems.
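The local character of a GBP update can be illustrated with scalar Gaussians in information form. A variable's belief is the product of its incoming messages, and multiplying Gaussians corresponds to adding their precisions and information values. The sketch below is a minimal illustration under that standard identity; the function names and the two example messages are hypothetical and are not taken from the article.

```python
def to_information(mean, variance):
    """Convert (mean, variance) to information form (eta, lam)."""
    return mean / variance, 1.0 / variance


def belief_from_messages(messages):
    """Combine incoming Gaussian messages at one variable node.

    Each message is (eta, lam) with lam = 1 / variance and
    eta = mean / variance. The product of Gaussians is obtained
    by summing these parameters, using only local information.
    """
    eta = sum(m[0] for m in messages)
    lam = sum(m[1] for m in messages)
    return eta / lam, 1.0 / lam  # (mean, variance) of the belief


# Two hypothetical factor-to-variable messages about the same scalar.
msgs = [to_information(0.0, 1.0), to_information(2.0, 1.0)]
mean, variance = belief_from_messages(msgs)
```

Two equally confident messages centered at 0 and 2 yield a belief centered between them with half the variance of either message. A full GBP implementation would also need the factor-to-variable update, which involves marginalizing the factor and is omitted here.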

Originally published on 2021-07-07 on the WeChat official account arXiv每日学术速递. In case of infringement, contact cloudcommunity@tencent.com for removal.
