SLAM comprises two main tasks: localization and mapping. In mobile robotics and autonomous driving this is a crucial, chicken-and-egg problem: for a robot to move accurately it needs a map of the environment, yet building that map requires knowing the robot's position.
This series of articles is organized into four parts:
The first part introduces Lidar SLAM, covering Lidar sensors, open-source Lidar SLAM systems, deep learning in Lidar SLAM, and challenges and future directions.
The second part focuses on Visual SLAM, including camera sensors and open-source visual SLAM systems at different map densities.
The third part covers visual-inertial odometry SLAM, deep learning in visual SLAM, and future directions.
The fourth part discusses the fusion of Lidar and vision.
In 1990, [1] first proposed using the EKF (Extended Kalman Filter) to incrementally estimate the posterior distribution over the robot pose and the positions of landmarks. In essence, the robot starts from an unknown position in an unknown environment, localizes itself by repeatedly observing environmental features as it moves, and then builds an incremental map of the surroundings from the estimated poses, thereby achieving simultaneous localization and mapping.
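To make the filtering idea concrete, here is a minimal EKF-SLAM sketch in Python: a 2D pose with range-bearing landmark observations and known data association. The state layout, noise handling, and function names are illustrative assumptions, not the exact formulation of [1].

```python
# Minimal EKF-SLAM sketch: state mu = [x, y, theta, l1x, l1y, l2x, l2y, ...]
import numpy as np

def predict(mu, Sigma, u, R):
    """Propagate the pose by odometry u = (dx, dy, dtheta) in the robot frame."""
    x, y, th = mu[0], mu[1], mu[2]
    dx, dy, dth = u
    mu = mu.copy()
    mu[0] += dx * np.cos(th) - dy * np.sin(th)
    mu[1] += dx * np.sin(th) + dy * np.cos(th)
    mu[2] += dth
    # Jacobian of the motion model w.r.t. the full state (identity except pose)
    G = np.eye(len(mu))
    G[0, 2] = -dx * np.sin(th) - dy * np.cos(th)
    G[1, 2] =  dx * np.cos(th) - dy * np.sin(th)
    Sigma = G @ Sigma @ G.T
    Sigma[:3, :3] += R                      # motion noise acts on the pose only
    return mu, Sigma

def update(mu, Sigma, z, j, Q):
    """Fuse a range-bearing measurement z = (r, phi) of landmark j."""
    lx, ly = mu[3 + 2 * j], mu[4 + 2 * j]
    dx, dy = lx - mu[0], ly - mu[1]
    q = dx**2 + dy**2
    z_hat = np.array([np.sqrt(q), np.arctan2(dy, dx) - mu[2]])
    H = np.zeros((2, len(mu)))              # measurement Jacobian
    H[0, 0], H[0, 1] = -dx / np.sqrt(q), -dy / np.sqrt(q)
    H[1, 0], H[1, 1], H[1, 2] = dy / q, -dx / q, -1.0
    H[0, 3 + 2*j], H[0, 4 + 2*j] = dx / np.sqrt(q), dy / np.sqrt(q)
    H[1, 3 + 2*j], H[1, 4 + 2*j] = -dy / q, dx / q
    S = H @ Sigma @ H.T + Q
    K = Sigma @ H.T @ np.linalg.inv(S)      # Kalman gain
    innov = z - z_hat
    innov[1] = (innov[1] + np.pi) % (2 * np.pi) - np.pi   # wrap the bearing
    mu = mu + K @ innov
    Sigma = (np.eye(len(mu)) - K @ H) @ Sigma
    return mu, Sigma
```

The map grows by appending landmark coordinates to `mu`; the cost is dominated by the dense covariance update, which is one reason later systems moved to particle filters and graph optimization.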
Localization itself has become a complex and actively studied problem in recent years. The choice of localization technology depends on the requirements for cost, accuracy, update frequency, and robustness, and can be met by GPS (Global Positioning System), IMU (Inertial Measurement Unit), wireless signals, and so on [2]. However, GPS works only outdoors, IMUs suffer from accumulated error, and wireless technologies, as active systems, struggle to balance cost against accuracy. With rapid development, SLAM equipped with Lidar, cameras, IMUs, and other sensors has risen in recent years. Starting from filter-based SLAM, graph-based SLAM now plays the dominant role: the algorithms have evolved from the KF (Kalman Filter), EKF, and PF (Particle Filter) to graph-based optimization, and single-threaded implementations have been replaced by multi-threaded ones. SLAM technology has likewise moved from early military prototypes to today's multi-sensor-fusion robotics applications.
Lidar Sensors
Lidar sensors can be divided into 2D Lidar and 3D Lidar, defined by the number of laser beams. In terms of production technology, Lidar can also be divided into mechanical Lidar, hybrid solid-state Lidar such as MEMS (Micro-Electro-Mechanical Systems), and solid-state Lidar. Solid-state Lidar can be produced with phased-array and flash technology.
Velodyne: Its mechanical Lidars include the VLP-16, HDL-32E, and HDL-64E; in hybrid solid-state it offers the 32-beam Ultra Puck Auto. It is arguably the Lidar brand with the most documentation and the most mature software ecosystem.
SLAMTEC: It offers low-cost Lidars and robot platforms such as the RPLIDAR A1, A2, and A3. These single-beam Lidars are a good entry point into Lidar SLAM; paired with a mobile platform, they are enough to build a mobile robot.
Ouster: It offers mechanical Lidars with 16 to 128 channels.
Quanergy: The S3 was the world's first announced solid-state Lidar, the M8 is a mechanical Lidar, and the S3-QI is a micro solid-state Lidar.
Ibeo: Its mechanical Lidars include the Lux 4L and Lux 8L. In partnership with Valeo, it released a hybrid solid-state Lidar named Scala.
The trend in Lidar is toward miniaturized, lightweight solid-state devices, which are expected to capture the market and satisfy most product applications. Other Lidar companies include, but are not limited to, SICK, Hokuyo, HESAI, RoboSense, LeddarTech, ISureStar, Benewake, Livox, Innovusion, Innoviz, Trimble, and Leishen Intelligent System.
2D Lidar SLAM
• Gmapping: The most widely used SLAM package in robotics, based on the RBPF (Rao-Blackwellized Particle Filter) method. It adds scan matching to estimate the robot's position [3] and is an improved, grid-map version of FastSLAM [4]. (The scan-matching step these systems share is sketched after this list.)
• HectorSlam: It combines a 2D SLAM system and 3D navigation with scan-matching technology and an inertial sensing system [5].
• KartoSLAM: A graph-based SLAM system [6].
• LagoSLAM: Its basis is graph-based SLAM, with a method for minimizing a nonlinear, non-convex cost function [7].
• CoreSLAM: An algorithm designed to be easy to understand with minimal loss of performance [8].
• Cartographer: Google's SLAM system [9]. It adopts submaps and loop-closure detection for better product-grade performance, and provides SLAM in 2D and 3D across multiple platforms and sensor configurations.
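As a concrete example of the scan matching these 2D systems rely on, below is a minimal brute-force correlative scan matcher in Python. The grid representation, window sizes, and function names are illustrative assumptions; real systems such as Cartographer accelerate this search with branch-and-bound and refine it at sub-resolution.

```python
# Minimal 2D scan-to-map matching by exhaustive pose search over a small window.
import numpy as np

def score_pose(grid, resolution, scan_xy, pose):
    """Count scan endpoints that land on occupied cells under a candidate pose.
    grid: 2D array of 0/1 occupancy; scan_xy: (N, 2) points in the robot frame."""
    x, y, th = pose
    c, s = np.cos(th), np.sin(th)
    wx = x + c * scan_xy[:, 0] - s * scan_xy[:, 1]   # robot frame -> map frame
    wy = y + s * scan_xy[:, 0] + c * scan_xy[:, 1]
    ix = (wx / resolution).astype(int)
    iy = (wy / resolution).astype(int)
    ok = (ix >= 0) & (ix < grid.shape[0]) & (iy >= 0) & (iy < grid.shape[1])
    return grid[ix[ok], iy[ok]].sum()

def match_scan(grid, resolution, scan_xy, init_pose,
               lin_win=0.2, lin_step=0.05, ang_win=0.1, ang_step=0.02):
    """Search a small (x, y, theta) window around init_pose for the best score."""
    best_pose, best_score = init_pose, -np.inf
    for dx in np.arange(-lin_win, lin_win + 1e-9, lin_step):
        for dy in np.arange(-lin_win, lin_win + 1e-9, lin_step):
            for dth in np.arange(-ang_win, ang_win + 1e-9, ang_step):
                pose = (init_pose[0] + dx, init_pose[1] + dy, init_pose[2] + dth)
                sc = score_pose(grid, resolution, scan_xy, pose)
                if sc > best_score:
                    best_pose, best_score = pose, sc
    return best_pose, best_score
```

In an RBPF system such as Gmapping, a score of this kind also serves to weight each particle's pose hypothesis against its own map.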
3D Lidar SLAM
• LOAM: A method for state estimation and real-time mapping using a 3D Lidar [10]. It also comes in a back-and-forth spinning version (referring to the way the laser is swept) and a continuous-scanning 2D Lidar version.
• LeGO-LOAM: It takes the point cloud from a horizontally mounted Velodyne VLP-16 Lidar and optional IMU data as input. The system outputs 6-DOF pose estimates in real time, with global optimization and loop-closure detection [11].
• Cartographer: It supports both 2D and 3D SLAM [9].
• IMLS-SLAM: It proposes a new low-drift SLAM algorithm based solely on 3D LiDAR data, built on a scan-to-model matching framework [10]. (The classical point-to-point ICP baseline that such scan-matching pipelines refine is sketched below.)
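For reference, here is a minimal point-to-point ICP sketch in Python: the classical scan-registration baseline that systems like LOAM and IMLS-SLAM improve on with feature-based and scan-to-model matching. Iteration count and tolerance are illustrative assumptions.

```python
# Minimal point-to-point ICP for registering two 3D scans.
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Closed-form least-squares R, t aligning src to dst (Kabsch / SVD)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                 # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t

def icp(source, target, iters=30, tol=1e-6):
    """Align source (Nx3) to target (Mx3); returns the 4x4 transform."""
    tree = cKDTree(target)
    src = source.copy()
    T = np.eye(4)
    prev_err = np.inf
    for _ in range(iters):
        dist, idx = tree.query(src)          # nearest-neighbor correspondences
        R, t = best_rigid_transform(src, target[idx])
        src = src @ R.T + t
        step = np.eye(4)
        step[:3, :3], step[:3, 3] = R, t
        T = step @ T
        err = dist.mean()
        if abs(prev_err - err) < tol:        # stop when the fit stops improving
            break
        prev_err = err
    return T
```

ICP alternates between re-associating points and solving the closed-form rigid transform; LOAM's edge/plane features and IMLS-SLAM's implicit surface model both replace the naive nearest-neighbor step to reduce drift.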
Lidar SLAM Based on Deep Learning
Feature extraction and detection based on deep learning:
PointNetVLAD [11] allows end-to-end training to extract a global descriptor from a given 3D point cloud, addressing point-cloud-based place recognition as a retrieval task.
VoxelNet [12] is a generic 3D detection network that unifies feature extraction and bounding-box prediction into a single-stage, end-to-end trainable deep network. Other work in this direction can be seen in BirdNet [13].
LMNet [14] describes an efficient single-stage deep convolutional neural network for object detection that outputs an objectness map and bounding-box offset values for each point.
PIXOR [15] is a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions.
Yolo3D [16] builds on the success of one-shot regression meta-architectures in 2D perspective image space and extends them to generate oriented 3D object bounding boxes from LiDAR point clouds.
PointCNN [17] proposes learning an X-transformation from the input point cloud, which is then applied through the element-wise product and sum operations of a typical convolution operator.
MV3D [18] is a sensor-fusion framework that takes LiDAR point clouds and RGB images as input and predicts oriented 3D bounding boxes.
PU-GAN [19] presents a new point-cloud upsampling network based on a GAN (Generative Adversarial Network).
Segmentation and recognition of point clouds:
Methods for segmenting 3D point clouds can be grouped into edge-based methods, region growing, model fitting, hybrid methods, machine-learning applications, and deep learning [20]. This article focuses on the deep learning methods.
PointNet [21] designs a novel neural network that consumes point clouds directly, supporting classification, segmentation, and semantic parsing. (A minimal sketch of the core idea follows.)
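The key idea in PointNet is a shared per-point MLP followed by a symmetric max-pool, making the output invariant to point ordering. Below is a minimal PyTorch sketch; the layer sizes are illustrative assumptions, and the input/feature transform (T-Net) modules of the real PointNet are omitted.

```python
# Tiny PointNet-style classifier: shared MLP per point + order-invariant pooling.
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Conv1d with kernel size 1 == the same MLP applied to every point
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, num_classes),
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (batch, num_points, 3) -> (batch, 3, num_points) for Conv1d
        feats = self.point_mlp(xyz.transpose(1, 2))
        global_feat = feats.max(dim=2).values   # symmetric, order-invariant pool
        return self.head(global_feat)

logits = TinyPointNet()(torch.randn(2, 1024, 3))   # -> shape (2, 10)
```

Because max-pooling is symmetric, permuting the input points leaves the global feature unchanged, which is what lets the network consume raw, unordered point clouds.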
PointNet++ [22] builds on PointNet to learn hierarchical features at increasing contextual scales. Building on PointNet++, VoteNet [23] constructs an end-to-end 3D object detection pipeline for point clouds. SegMap [24] is a map-representation solution to the localization and mapping problem based on extracting segments from 3D point clouds.
SqueezeSeg [25] is a convolutional neural network with a recurrent CRF (Conditional Random Field) for real-time road-object segmentation from 3D Lidar point clouds.
PointSIFT [26] is a semantic segmentation framework for 3D point clouds. It is based on a simple module that extracts features from neighboring points in eight directions.
PointWise [27] presents a convolutional neural network for semantic segmentation and object recognition with 3D point clouds. 3P-RNN [28] is a novel end-to-end approach for semantic segmentation of unstructured point clouds along two horizontal directions, exploiting inherent contextual features. Other similar work includes, but is not limited to, SPG [29] and the review [30].
SegMatch [31] is a loop-closure method based on the detection and matching of 3D segments.
Kd-Network [32] is designed for 3D model recognition tasks and works with unstructured point clouds.
DeepTemporalSeg [33] proposes a deep convolutional neural network (DCNN) for temporally consistent semantic segmentation of LiDAR scans.
LU-Net [34] achieves semantic segmentation instead of applying a global 3D segmentation method.
Localization with point clouds:
The paper [35] presents a novel learning-based LiDAR localization system that achieves centimeter-level localization accuracy.
SuMa++ [36] computes point-wise semantic segmentation labels over the whole scan, which makes it possible to build a semantically enriched map of labeled surfels and to improve projective scan matching through semantic constraints.
Challenges and Future of Lidar SLAM
1) Cost and adaptability
The advantage of Lidar is that it provides 3D information and is unaffected by changes in ambient light, even at night; its field of view is also large, up to 360 degrees. However, Lidar's high technical barrier leads to long development cycles and high cost. The trend for the future is miniaturization, reasonable cost, solid-state designs, and high reliability and adaptability.
2) Low-texture and dynamic environments
Most SLAM systems can work only in a fixed environment, but real environments keep changing. Moreover, low-texture environments such as long corridors and large pipes cause trouble for Lidar SLAM. [37] uses an IMU to assist 2D SLAM in overcoming such obstacles, and [38] incorporates the time dimension into the mapping process so that the robot can maintain an accurate map while operating in a dynamic environment. How to make Lidar SLAM more robust to low-texture and dynamic environments, and how to keep the map up to date, deserve deeper consideration.
3) Adversarial sensor attacks
Deep neural networks are vulnerable to adversarial examples, which has been demonstrated in camera-based perception. For LiDAR-based perception the threat is just as significant but has barely been explored. [39] first spoofed a LiDAR through a relay attack, interfering with its output data and distance estimates; its novel saturation attack can completely blind a Velodyne VLP-16 in a chosen direction. [40] explores strategically controlled spoofing attacks to fool machine-learning models: it casts the task as an optimization problem, designing an input perturbation function and an objective function, and raises the attack success rate to around 75%. Such attacks can deceive SLAM systems built on LiDAR point clouds while remaining stealthy and hard to detect or defend against, so research on protecting LiDAR SLAM systems from adversarial sensor attacks should become a new topic. (A toy sketch of this optimization view of spoofing follows.)
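To illustrate the optimization framing only, here is a toy sketch of spoofed-point optimization against a point-cloud classifier with the interface of the TinyPointNet sketch above. The model interface, step sizes, and bound are illustrative assumptions; this is not the physical attack of [40], which must also respect LiDAR optics and sensor timing.

```python
# Toy digital spoofing sketch: gradient steps on a small set of injected points
# to push a point-cloud classifier toward a chosen target class.
import torch

def optimize_spoofed_points(model, cloud, spoof_init, target_class,
                            steps=100, lr=0.01, bound=0.5):
    """cloud: (N, 3) benign points; spoof_init: (K, 3) initial injected points."""
    spoof = spoof_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([spoof], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        full = torch.cat([cloud, spoof], dim=0).unsqueeze(0)  # (1, N+K, 3)
        logits = model(full)
        loss = -logits[0, target_class]   # maximize the target-class logit
        loss.backward()
        opt.step()
        with torch.no_grad():             # keep injected points near their start
            spoof.clamp_(spoof_init - bound, spoof_init + bound)
    return spoof.detach()
```

The defense question raised above is then: how should a SLAM pipeline detect that a handful of physically implausible points is steering its perception?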
References
[1] Randall Smith, Matthew Self, and Peter Cheeseman. Estimating uncertain spatial relationships in robotics. In Autonomous robot vehicles, pages 167–193. Springer, 1990.
[2] Baichuan Huang, Jingbin Liu, Wei Sun, and Fan Yang. A robust indoor positioning method based on bluetooth low energy with separate channel information. Sensors, 19(16):3487, 2019.
[3] Sebastian Thrun, Wolfram Burgard, and Dieter Fox. Probabilistic robotics. MIT press, 2005.
[4] Michael Montemerlo, Sebastian Thrun, Daphne Koller, Ben Wegbreit, et al. Fastslam 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In IJCAI, pages 1151–1156, 2003.
[5] Stefan Kohlbrecher, Oskar Von Stryk, Johannes Meyer, and Uwe Klingauf. A flexible and scalable slam system with full 3d motion estimation. In 2011 IEEE International Symposium on Safety, Security, and Rescue Robotics, pages 155–160. IEEE, 2011.
[6] Kurt Konolige, Giorgio Grisetti, Rainer Kümmerle, Wolfram Burgard, Benson Limketkai, and Regis Vincent. Efficient sparse pose adjustment for 2d mapping. In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 22–29. IEEE, 2010.
[7] Luca Carlone, Rosario Aragues, José A Castellanos, and Basilio Bona. A linear approximation for graph-based simultaneous localization and mapping. Robotics: Science and Systems VII, pages 41–48, 2012.
[8] B Steux and O El Hamzaoui. TinySLAM: A slam algorithm in less than 200 lines c-language program. Proceedings of the Control Automation Robotics & Vision (ICARCV), Singapore, pages 7–10, 2010.
[9] Wolfgang Hess, Damon Kohler, Holger Rapp, and Daniel Andor. Realtime loop closure in 2d lidar slam. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 1271–1278. IEEE, 2016.
[10] Jean-Emmanuel Deschaud. Imls-slam: scan-to-model matching based on 3d data. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 2480–2485. IEEE, 2018.
[11] Mikaela Angelina Uy and Gim Hee Lee. Pointnetvlad: Deep point cloud based retrieval for large-scale place recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4470–4479, 2018.
[12] Yin Zhou and Oncel Tuzel. Voxelnet: End-to-end learning for point cloud based 3d object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4490–4499, 2018.
[13] Jorge Beltrán, Carlos Guindel, Francisco Miguel Moreno, Daniel Cruzado, Fernando Garcia, and Arturo De La Escalera. Birdnet: a 3d object detection framework from lidar information. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pages 3517–3523. IEEE, 2018.
[14] Kazuki Minemura, Hengfui Liau, Abraham Monrroy, and Shinpei Kato. Lmnet: Real-time multiclass object detection on cpu using 3d lidar.
[15] Bin Yang, Wenjie Luo, and Raquel Urtasun. Pixor: Real-time 3d object detection from point clouds. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pages 7652–7660, 2018.
[16] Waleed Ali, Sherif Abdelkarim, Mahmoud Zidan, Mohamed Zahran, and Ahmad El Sallab. Yolo3d: End-to-end real-time 3d oriented object bounding box detection from lidar point cloud. In Proceedings of the European Conference on Computer Vision (ECCV), pages 0–0, 2018.
[17] Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Xinhan Di, and Baoquan Chen. Pointcnn: Convolution on x-transformed points. In Advances in Neural Information Processing Systems, pages 820–830, 2018.
[18] Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, and Tian Xia. Multi-view 3d object detection network for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1907–1915, 2017.
[19] Ruihui Li, Xianzhi Li, Chi-Wing Fu, Daniel Cohen-Or, and Pheng-Ann Heng. Pu-gan: A point cloud upsampling adversarial network. In Proceedings of the IEEE International Conference on Computer Vision, pages 7203–7212, 2019.
[20] E Grilli, F Menna, and F Remondino. A review of point clouds segmentation and classification algorithms. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 42:339, 2017.
[21] Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. arXiv preprint arXiv:1612.00593, 2016.
[22] Charles R Qi, Li Yi, Hao Su, and Leonidas J Guibas. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. arXiv preprint arXiv:1706.02413, 2017.
[23] Charles R Qi, Or Litany, Kaiming He, and Leonidas J Guibas. Deep hough voting for 3d object detection in point clouds. arXiv preprint arXiv:1904.09664, 2019.
[24] Renaud Dube, Andrei Cramariuc, Daniel Dugas, Juan Nieto, Roland Siegwart, and Cesar Cadena. SegMap: 3d segment mapping using data-driven descriptors. In Robotics: Science and Systems (RSS), 2018.
[25] Bichen Wu, Alvin Wan, Xiangyu Yue, and Kurt Keutzer. Squeezeseg: Convolutional neural nets with recurrent crf for real-time road-object segmentation from 3d lidar point cloud. ICRA, 2018.
[26] Mingyang Jiang, Yiran Wu, Tianqi Zhao, Zelin Zhao, and Cewu Lu. Pointsift: A sift-like network module for 3d point cloud semantic segmentation. arXiv preprint arXiv:1807.00652, 2018.
[27] Binh-Son Hua, Minh-Khoi Tran, and Sai-Kit Yeung. Pointwise convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2018.
[28] Xiaoqing Ye, Jiamao Li, Hexiao Huang, Liang Du, and Xiaolin Zhang. 3d recurrent neural networks with context fusion for point cloud semantic segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), pages 403–417, 2018.
[29] Loic Landrieu and Martin Simonovsky. Large-scale point cloud semantic segmentation with superpoint graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4558–4567, 2018.
[30] Renaud Dubé, Daniel Dugas, Elena Stumm, Juan Nieto, Roland Siegwart, and Cesar Cadena. Segmatch: Segment based place recognition in 3d point clouds. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 5266–5272. IEEE, 2017.
[31] Roman Klokov and Victor Lempitsky. Escape from cells: Deep kd-networks for the recognition of 3d point cloud models. In Proceedings of the IEEE International Conference on Computer Vision, pages 863–872, 2017.
[32] Ayush Dewan and Wolfram Burgard. Deeptemporalseg: Temporally consistent semantic segmentation of 3d lidar scans. arXiv preprint arXiv:1906.06962, 2019.
[33] Pierre Biasutti, Vincent Lepetit, Jean-François Aujol, Mathieu Brédif, and Aurélie Bugeau. Lu-net: An efficient network for 3d lidar point cloud semantic segmentation based on end-to-end-learned 3d features and u-net. August 2019.
[34] Shaoshuai Shi, Xiaogang Wang, and Hongsheng Li. Pointrcnn: 3d object proposal generation and detection from point cloud. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–779, 2019.
[35] Weixin Lu, Yao Zhou, Guowei Wan, Shenhua Hou, and Shiyu Song. L3-net: Towards learning based lidar localization for autonomous driving. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[36] Xieyuanli Chen, Andres Milioto, and Emanuele Palazzolo. Suma++: Efficient lidar-based semantic slam. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2019.
[37] Zhongli Wang, Yan Chen, Yue Mei, Kuo Yang, and Baigen Cai. Imuassisted 2d slam method for low-texture and dynamic environments. Applied Sciences, 8(12):2534, 2018.
[38] Aisha Walcott-Bryant, Michael Kaess, Hordur Johannsson, and John J Leonard. Dynamic pose graph slam: Long-term mapping in low dynamic environments. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1871–1878. IEEE, 2012.
[39] Hocheol Shin, Dohyun Kim, Yujin Kwon, and Yongdae Kim. Illusion and dazzle: Adversarial optical channel exploits against lidars for automotive applications. In International Conference on Cryptographic Hardware and Embedded Systems, pages 445–467. Springer, 2017.
[40] Yulong Cao, Chaowei Xiao, Benjamin Cyr, Yimeng Zhou, Won Park, Sara Rampazzi, Qi Alfred Chen, Kevin Fu, and Z Morley Mao. Adversarial sensor attack on lidar-based perception in autonomous driving. arXiv preprint arXiv:1907.06826, 2019.