Robotics Academic Digest [9.9]

WeChat official account: arXiv每日学术速递 (arXiv Daily Academic Digest)
Published 2021-09-16 16:50:33

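Update: the H5 page now supports collapsible abstracts for a better reading experience. Visit arxivdaily.com, which covers CS, physics, mathematics, economics, statistics, finance, biology, and electrical engineering, and also offers search, bookmarking, and other features.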

cs.RO (Robotics): 11 papers in total

【1】 Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking
Link: https://arxiv.org/abs/2109.03805

Authors: Whye Kit Fong, Rohit Mohan, Juana Valeria Hurtado, Lubing Zhou, Holger Caesar, Oscar Beijbom, Abhinav Valada
Affiliations: Department of Computer Science, University of Freiburg
Comments: The benchmark is available at this https URL and this https URL
Abstract: Panoptic scene understanding and tracking of dynamic agents are essential for robots and automated vehicles to navigate in urban environments. As LiDARs provide accurate illumination-independent geometric depictions of the scene, performing these tasks using LiDAR point clouds provides reliable predictions. However, existing datasets lack diversity in the type of urban scenes and have a limited number of dynamic object instances which hinders both learning of these tasks as well as credible benchmarking of the developed methods. In this paper, we introduce the large-scale Panoptic nuScenes benchmark dataset that extends our popular nuScenes dataset with point-wise groundtruth annotations for semantic segmentation, panoptic segmentation, and panoptic tracking tasks. To facilitate comparison, we provide several strong baselines for each of these tasks on our proposed dataset. Moreover, we analyze the drawbacks of the existing metrics for the panoptic tracking problem and propose a novel instance-centric metric that addresses the concerns. We present extensive experiments that demonstrate the utility of Panoptic nuScenes compared to existing datasets and make the online evaluation server available at nuScenes.org. We believe that this extension will accelerate the research of novel methods for scene understanding of dynamic urban environments.

【2】 FIDNet: LiDAR Point Cloud Semantic Segmentation with Fully Interpolation Decoding
Link: https://arxiv.org/abs/2109.03787

Authors: Yiming Zhao, Lin Bai, Xinming Huang
Note: After submission to IROS, we have some follow-up updates; please scroll to the end and take a look at the supplementary file after the reference section. Further details can be found in our code.
Comments: Accepted by IROS'21, code link: this https URL
Abstract: Projecting the point cloud on the 2D spherical range image transforms the LiDAR semantic segmentation to a 2D segmentation task on the range image. However, the LiDAR range image is still naturally different from the regular 2D RGB image; for example, each position on the range image encodes the unique geometry information. In this paper, we propose a new projection-based LiDAR semantic segmentation pipeline that consists of a novel network structure and an efficient post-processing step. In our network structure, we design a FID (fully interpolation decoding) module that directly upsamples the multi-resolution feature maps using bilinear interpolation. Inspired by the 3D distance interpolation used in PointNet++, we argue this FID module is a 2D version distance interpolation on $(\theta, \phi)$ space. As a parameter-free decoding module, the FID largely reduces the model complexity while maintaining good performance. Besides the network structure, we empirically find that our model predictions have clear boundaries between different semantic classes. This makes us rethink whether the widely used K-nearest-neighbor post-processing is still necessary for our pipeline. Then, we realize the many-to-one mapping causes the blurring effect that some points are mapped into the same pixel and share the same label. Therefore, we propose to process those occluded points by assigning the nearest predicted label to them. This NLA (nearest label assignment) post-processing step shows a better performance than KNN with faster inference speed in the ablation study. On the SemanticKITTI dataset, our pipeline achieves the best performance among all projection-based methods with $64 \times 2048$ resolution and all point-wise solutions. With a ResNet-34 as the backbone, both the training and testing of our model can be finished on a single RTX 2080 Ti with 11G memory. The code is released.
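The spherical range-image projection that entry 【2】 builds on can be illustrated with a minimal sketch. This is not the authors' released code: the function name `spherical_projection` and the vertical field-of-view defaults (+3° up, -25° down) are assumptions chosen for illustration, and only the 64 × 2048 resolution comes from the abstract. The sketch also shows the many-to-one mapping the abstract mentions, where several points fall into the same pixel and only the nearest one is kept.

```python
import numpy as np

def spherical_projection(points, height=64, width=2048,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud onto a (height, width) range image.

    The field-of-view defaults are illustrative assumptions, not values
    taken from the paper. Returns the range image (0 where no point
    landed) and the per-point (row, col) pixel indices.
    """
    points = np.asarray(points, dtype=np.float64)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)
    depth_safe = np.maximum(depth, 1e-8)  # avoid division by zero

    yaw = np.arctan2(y, x)             # azimuth in [-pi, pi]
    pitch = np.arcsin(z / depth_safe)  # elevation angle

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    # Map angles to continuous pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (fov_up - pitch) / fov * height

    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # Many-to-one mapping: keep the smallest range per pixel.
    image = np.full((height, width), np.inf, dtype=np.float32)
    np.minimum.at(image, (rows, cols), depth.astype(np.float32))
    image[np.isinf(image)] = 0.0
    return image, rows, cols
```

Calling `spherical_projection(cloud)` on an `(N, 3)` array returns the range image together with each point's pixel indices. Points whose stored range differs from their own range are the occluded ones that a post-processing step such as KNN or the abstract's nearest label assignment would need to handle.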

【3】 An Online Framework for Cognitive Load Assessment in Assembly Tasks
Link: https://arxiv.org/abs/2109.03627

Authors: Marta Lagomarsino, Marta Lorenzini, Elena De Momi, Arash Ajoudani
Affiliations: Human-Robot Interfaces and Physical Interaction Laboratory, Istituto Italiano di Tecnologia, Via San Quirico, Genoa, Italy; Department of Electronics, Information and Bioengineering, Politecnico di Milano, Via Giuseppe Colombo, Milan, Italy
Abstract: The ongoing trend towards Industry 4.0 has revolutionised ordinary workplaces, profoundly changing the role played by humans in the production chain. Research on ergonomics in industrial settings mainly focuses on reducing the operator's physical fatigue and discomfort to improve throughput and avoid safety hazards. However, as the production complexity increases, the cognitive resources demand and mental workload could compromise the operator's performance and the efficiency of the shop floor workplace. State-of-the-art methods in cognitive science work offline and/or involve bulky equipment hardly deployable in industrial settings. This paper presents a novel method for online assessment of cognitive load in manufacturing, primarily assembly, by detecting patterns in human motion directly from the input images of a stereo camera. Head pose estimation and skeleton tracking are exploited to investigate the workers' attention and assess hyperactivity and unforeseen movements. Pilot experiments suggest that our factor assessment tool provides significant insights into workers' mental workload, even confirmed by correlations with physiological and performance measurements. According to data gathered in this study, a vision-based cognitive load assessment has the potential to be integrated into the development of mechatronic systems for improving cognitive ergonomics in manufacturing.

【4】 Tactile Image-to-Image Disentanglement of Contact Geometry from Motion-Induced Shear
Link: https://arxiv.org/abs/2109.03615

Authors: Anupam K. Gupta, Laurence Aitchison, Nathan F. Lepora
Affiliations: Department of Engineering Maths and Bristol Robotics Laboratory, University of Bristol, United Kingdom; Department of Computer Science
Comments: 15 pages, 6 figures, under review CoRL 2021
Abstract: Robotic touch, particularly when using soft optical tactile sensors, suffers from distortion caused by motion-dependent shear. The manner in which the sensor contacts a stimulus is entangled with the tactile information about the geometry of the stimulus. In this work, we propose a supervised convolutional deep neural network model that learns to disentangle, in the latent space, the components of sensor deformations caused by contact geometry from those due to sliding-induced shear. The approach is validated by reconstructing unsheared tactile images from sheared images and showing they match unsheared tactile images collected with no sliding motion. In addition, the unsheared tactile images give a faithful reconstruction of the contact geometry that is not possible from the sheared data, and robust estimation of the contact pose that can be used for servo control sliding around various 2D shapes. Finally, the contact geometry reconstruction in conjunction with servo control sliding were used for faithful full object reconstruction of various 2D shapes. The methods have broad applicability to deep learning models for robots with a shear-sensitive sense of touch.

【5】 LiDARTouch: Monocular metric depth estimation with a few-beam LiDAR
Link: https://arxiv.org/abs/2109.03569

Authors: Florent Bartoccioni, Éloi Zablocki, Patrick Pérez, Matthieu Cord, Karteek Alahari
Affiliations: Valeo.ai, rue de Courcelle, Paris, France; Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France
Comments: Preprint. Under review
Abstract: Vision-based depth estimation is a key feature in autonomous systems, which often relies on a single camera or several independent ones. In such a monocular setup, dense depth is obtained with either additional input from one or several expensive LiDARs, e.g., with 64 beams, or camera-only methods, which suffer from scale-ambiguity and infinite-depth problems. In this paper, we propose a new alternative of densely estimating metric depth by combining a monocular camera with a light-weight LiDAR, e.g., with 4 beams, typical of today's automotive-grade mass-produced laser scanners. Inspired by recent self-supervised methods, we introduce a novel framework, called LiDARTouch, to estimate dense depth maps from monocular images with the help of "touches" of LiDAR, i.e., without the need for dense ground-truth depth. In our setup, the minimal LiDAR input contributes on three different levels: as an additional model's input, in a self-supervised LiDAR reconstruction objective function, and to estimate changes of pose (a key component of self-supervised depth estimation architectures). Our LiDARTouch framework achieves new state of the art in self-supervised depth estimation on the KITTI dataset, thus supporting our choices of integrating the very sparse LiDAR signal with other visual features. Moreover, we show that the use of a few-beam LiDAR alleviates scale ambiguity and infinite-depth issues that camera-only methods suffer from. We also demonstrate that methods from the fully-supervised depth-completion literature can be adapted to a self-supervised regime with a minimal LiDAR signal.

【6】 Autonomous search of an airborne release in urban environments using informed tree planning
Link: https://arxiv.org/abs/2109.03542

Authors: Callum Rhodes, Cunjia Liu, Paul Westoby, Wen-Hua Chen
Affiliations: Dept. of Aeronautical and Automotive Engineering, Loughborough University, Loughborough, UK; CBR Division, Dstl Porton Down
Abstract: The use of autonomous vehicles for chemical source localisation is a key enabling tool for disaster response teams to safely and efficiently deal with chemical emergencies. Whilst much work has been performed on source localisation using autonomous systems, most previous works have assumed an open environment or employed simplistic obstacle avoidance, separate to the estimation procedure. In this paper, we explore the coupling of the path planning task for both source term estimation and obstacle avoidance in a holistic framework. The proposed system intelligently produces potential gas sampling locations based on the current estimation of the wind field and the local map. Then a tree search is performed to generate paths toward the estimated source location that traverse around any obstacles and still allow for exploration of potentially superior sampling locations. The proposed informed tree planning algorithm is then tested against the Entrotaxis technique in a series of high fidelity simulations. The proposed system is found to reduce source position error far more efficiently than Entrotaxis in a feature rich environment, whilst also exhibiting vastly more consistent and robust results.

【7】 Recalibrating the KITTI Dataset Camera Setup for Improved Odometry Accuracy
Link: https://arxiv.org/abs/2109.03462

Authors: Igor Cvišić, Ivan Marković, Ivan Petrović
Affiliations: University of Zagreb Faculty of Electrical Engineering and Computing
Abstract: Over the last decade, one of the most relevant public datasets for evaluating odometry accuracy is the KITTI dataset. Beside the quality and rich sensor setup, its success is also due to the online evaluation tool, which enables researchers to benchmark and compare algorithms. The results are evaluated on the test subset solely, without any knowledge about the ground truth, yielding unbiased, overfit free and therefore relevant validation for robot localization based on cameras, 3D laser or combination of both. However, as any sensor setup, it requires prior calibration and rectified stereo images are provided, introducing dependence on the default calibration parameters. Given that, a natural question arises if a better set of calibration parameters can be found that would yield higher odometry accuracy. In this paper, we propose a new approach for one shot calibration of the KITTI dataset multiple camera setup. The approach yields better calibration parameters, both in the sense of lower calibration reprojection errors and lower visual odometry error. We conducted experiments where we show for three different odometry algorithms, namely SOFT2, ORB-SLAM2 and VISO2, that odometry accuracy is significantly improved with the proposed calibration parameters. Moreover, our odometry, SOFT2, in conjunction with the proposed calibration method achieved the highest accuracy on the official KITTI scoreboard with 0.53% translational and 0.0009 deg/m rotational error, outperforming even 3D laser-based methods.

【8】 Joint Search of Optimal Topology and Trajectory for Planar Linkages
Link: https://arxiv.org/abs/2109.03392

Authors: Zherong Pan, Min Liu, Xifeng Gao, Dinesh Manocha
Affiliations: Department of Computer Science, University of Illinois at Urbana-Champaign; Department of Computer Science and Electrical & Computer Engineering, University of Maryland at College Park; Department of Computer Science, Florida State University
Comments: 18 pages
Abstract: We present an algorithm to compute planar linkage topology and geometry, given a user-specified end-effector trajectory. Planar linkage structures convert rotational or prismatic motions of a single actuator into an arbitrarily complex periodic motion, which is an important component when building low-cost, modular robots, mechanical toys, and foldable structures in our daily lives (chairs, bikes, and shelves). The design of such structures requires trial and error even for experienced engineers. Our research provides semi-automatic methods for exploring novel designs given high-level specifications and constraints. We formulate this problem as a non-smooth numerical optimization with quadratic objective functions and non-convex quadratic constraints involving mixed-integer decision variables (MIQCQP). We propose and compare three approximate algorithms to solve this problem: mixed-integer conic-programming (MICP), mixed-integer nonlinear programming (MINLP), and simulated annealing (SA). We evaluated these algorithms searching for planar linkages involving 10-14 rigid links. Our results show that the best performance can be achieved by combining MICP and MINLP, leading to a hybrid algorithm capable of finding the planar linkages within a couple of hours on a desktop machine, which significantly outperforms the SA baseline in terms of optimality. We highlight the effectiveness of our optimized planar linkages by using them as legs of a walking robot.

【9】 Convex Iteration for Distance-Geometric Inverse Kinematics
Link: https://arxiv.org/abs/2109.03374

Authors: Matthew Giamou, Filip Marić, David M. Rosen, Valentin Peretroukhin, Nicholas Roy, Ivan Petrović, Jonathan Kelly
Affiliations: University of Zagreb
Comments: Submitted to IEEE Robotics and Automation Letters
Abstract: Inverse kinematics (IK) is the problem of finding robot joint configurations that satisfy constraints on the position or pose of one or more end-effectors. For robots with redundant degrees of freedom, there is often an infinite, nonconvex set of solutions. The IK problem is further complicated when collision avoidance constraints are imposed by obstacles in the workspace. In general, closed-form expressions yielding feasible configurations do not exist, motivating the use of numerical solution methods. However, these approaches rely on local optimization of nonconvex problems, often requiring an accurate initialization or numerous re-initializations to converge to a valid solution. In this work, we first formulate complicated inverse kinematics problems as convex feasibility problems whose low-rank feasible points provide exact IK solutions. We then present CIDGIK (Convex Iteration for Distance-Geometric Inverse Kinematics), an algorithm that solves these feasibility problems with a sequence of semidefinite programs whose objectives are designed to encourage low-rank minimizers. Our problem formulation elegantly unifies the configuration space and workspace constraints of a robot: intrinsic robot geometry and obstacle avoidance are both expressed as simple linear matrix equations and inequalities. Our experimental results for a variety of popular manipulator models demonstrate faster and more accurate convergence than a conventional nonlinear optimization-based approach, especially in environments with many obstacles.

【10】 Certifiable Outlier-Robust Geometric Perception: Exact Semidefinite Relaxations and Scalable Global Optimization
Link: https://arxiv.org/abs/2109.03349

Authors: Heng Yang, Luca Carlone
Affiliations: Massachusetts Institute of Technology
Comments: 18 pages main text, 10 pages supplementary
Abstract: We propose the first general and scalable framework to design certifiable algorithms for robust geometric perception in the presence of outliers. Our first contribution is to show that estimation using common robust costs, such as truncated least squares (TLS), maximum consensus, Geman-McClure, Tukey's biweight, among others, can be reformulated as polynomial optimization problems (POPs). By focusing on the TLS cost, our second contribution is to exploit sparsity in the POP and propose a sparse semidefinite programming (SDP) relaxation that is much smaller than the standard Lasserre's hierarchy while preserving exactness, i.e., the SDP recovers the optimizer of the nonconvex POP with an optimality certificate. Our third contribution is to solve the SDP relaxations at an unprecedented scale and accuracy by presenting STRIDE, a solver that blends global descent on the convex SDP with fast local search on the nonconvex POP. Our fourth contribution is an evaluation of the proposed framework on six geometric perception problems including single and multiple rotation averaging, point cloud and mesh registration, absolute pose estimation, and category-level object pose and shape estimation. Our experiments demonstrate that (i) our sparse SDP relaxation is exact with up to 60%-90% outliers across applications; (ii) while still being far from real-time, STRIDE is up to 100 times faster than existing SDP solvers on medium-scale problems, and is the only solver that can solve large-scale SDPs with hundreds of thousands of constraints to high accuracy; (iii) STRIDE provides a safeguard to existing fast heuristics for robust estimation (e.g., RANSAC or Graduated Non-Convexity), i.e., it certifies global optimality if the heuristic estimates are optimal, or detects and allows escaping local optima when the heuristic estimates are suboptimal.
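To make the truncated least squares (TLS) cost named in entry 【10】 concrete, here is a minimal sketch. It assumes the commonly used form in which each squared residual is capped at a threshold `c**2`, so an outlier contributes at most a constant. The function names `tls_cost` and `tls_inlier_mask`, the threshold value, and the toy measurements are illustrative assumptions, and the paper's exact formulation (for example, any noise-bound normalisation of the residuals) may differ. The sketch only evaluates the cost and does not implement the SDP relaxation or the STRIDE solver.

```python
import numpy as np

def tls_cost(residuals, c):
    """Truncated least squares: sum of min(r_i^2, c^2) over all residuals."""
    r2 = np.square(np.asarray(residuals, dtype=float))
    return float(np.sum(np.minimum(r2, c * c)))

def tls_inlier_mask(residuals, c):
    """Residuals with |r_i| <= c are treated as inliers under the TLS cost."""
    return np.abs(np.asarray(residuals, dtype=float)) <= c

# Toy scalar estimation problem (hypothetical data): four consistent
# measurements near 1.0 and one gross outlier.
measurements = np.array([1.0, 1.1, 0.9, 1.05, 8.0])
threshold = 0.5  # assumed inlier threshold

cost_at_one = tls_cost(measurements - 1.0, threshold)
cost_at_mean = tls_cost(measurements - measurements.mean(), threshold)
inliers_at_one = tls_inlier_mask(measurements - 1.0, threshold)
```

In this toy problem the plain mean is dragged toward the outlier, so every residual exceeds the threshold and the TLS cost there is higher than at the estimate 1.0, where only the outlier is truncated. That capping is what makes the cost robust, and also what makes it nonconvex.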

【11】 Adaptive Computing in Robotics, Leveraging ROS 2 to Enable Software-Defined Hardware for FPGAs
Link: https://arxiv.org/abs/2109.03276

Authors: Víctor Mayoral-Vilches, Giulio Corradi
Abstract: Traditional software development in robotics is about programming functionality in the CPU of a given robot with a pre-defined architecture and constraints. With adaptive computing, instead, building a robotic behavior is about programming an architecture. By leveraging adaptive computing, roboticists can adapt one or more of the properties of its computing systems (e.g. its determinism, power consumption, security posture, or throughput) at run time. Roboticists are not, however, hardware engineers, and embedded expertise is scarce among them. This white paper adopts a ROS 2 roboticist-centric view for adaptive computing and proposes an architecture to include FPGAs as a first-class participant of the ROS 2 ecosystem. The architecture proposed is platform- and technology-agnostic, and is easily portable. The core components of the architecture are disclosed under an Apache 2.0 license, paving the way for roboticists to leverage adaptive computing and create software-defined hardware.

The Chinese titles and abstracts in the original post were machine-translated and are for reference only.

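This article is part of the Tencent Cloud self-media sharing program and is shared from a WeChat official account.
Originally published: 2021-09-09. For infringement concerns, contact cloudcommunity@tencent.com to request removal.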

Shared from the WeChat official account arXiv每日学术速递.
