Introducing a Deep Learning Paper on Point Clouds: PointNet

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas, Stanford University

[arXiv version] [Code on GitHub]

Applications of PointNet. We propose a novel deep net architecture that consumes raw point cloud (set of points) without voxelization or rendering. It is a unified architecture that learns both global and local point features, providing a simple, efficient and effective approach for a number of 3D recognition tasks.

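Applications of PointNet: the paper proposes a new deep network architecture that takes raw point clouds as input, with no voxelization or rendering step. It is a single unified architecture that learns both global and local point features, which gives a simple, efficient, and effective approach to a range of 3D recognition tasks.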

Abstract

Point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data to regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues. In this paper, we design a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input. Our network, named PointNet, provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing. Though simple, PointNet is highly efficient and effective. Empirically, it shows strong performance on par or even better than state of the art. Theoretically, we provide analysis towards understanding of what the network has learnt and why the network is robust with respect to input perturbation and corruption.

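A point cloud is an important type of geometric data structure. Because its format is irregular, most researchers convert such data into regular 3D voxel grids or collections of images. This makes the data needlessly large and causes problems of its own. The paper designs a neural network that consumes point clouds directly and respects the permutation invariance of the input points. The network, named PointNet, offers one unified architecture for object classification, part segmentation, and semantic parsing of scenes. It is simple, yet efficient and effective. Empirically, it performs on par with or better than the state of the art. Theoretically... (in short, it is very good and very robust. I am only reading the paper to learn from it, and with limited experimental resources I have not run the code.) My translation skills are limited, so corrections from experts are very welcome!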

Introduction

In this paper we explore deep learning architectures capable of reasoning about 3D geometric data such as point clouds or meshes. Typical convolutional architectures require highly regular input data formats, like those of image grids or 3D voxels, in order to perform weight sharing and other kernel optimizations. Since point clouds or meshes are not in a regular format, most researchers typically transform such data to regular 3D voxel grids or collections of images (e.g., views) before feeding them to a deep net architecture. This data representation transformation, however, renders the resulting data unnecessarily voluminous, while also introducing quantization artifacts that can obscure natural invariances of the data.

For this reason we focus on a different input representation for 3D geometry using simply point clouds and name our resulting deep nets PointNets. Point clouds are simple and unified structures that avoid the combinatorial irregularities and complexities of meshes, and thus are easier to learn from. The PointNet, however, still has to respect the fact that a point cloud is just a set of points and therefore invariant to permutations of its members, necessitating certain symmetrizations in the net computation. Further invariances to rigid motions also need to be considered.

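In this paper we explore deep learning architectures that can reason about 3D geometric data such as point clouds or meshes. Typical convolutional architectures need highly regular input formats, such as image grids or 3D voxels, so that they can share weights and apply other kernel optimizations (my translation is rough here; I am just an interested reader, so please bear with me). Because point clouds and meshes are not in a regular format, most researchers convert them into regular 3D voxel grids or collections of images before feeding them to a deep network. This conversion makes the data needlessly large and, as the original puts it, introduces quantization artifacts that can obscure natural invariances of the data.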

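For this reason, the paper studies a different input representation for 3D geometry, plain point clouds, and calls the resulting deep networks PointNets. Point clouds are simple, unified structures that avoid the combinatorial irregularities and complexities of meshes, so they are easier to learn from. However... (my translation skills are limited.)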
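To make the "unnecessarily voluminous" and "quantization artifacts" remarks concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the grid resolution of 32, the 1024 random points, and the helper names `voxel_index` and `voxel_center` are all assumptions chosen for illustration.

```python
import random

def voxel_index(point, resolution):
    """Index of the grid cell that contains a point of the unit cube."""
    # min(...) keeps a coordinate of exactly 1.0 inside the last cell.
    return tuple(min(int(c * resolution), resolution - 1) for c in point)

def voxel_center(index, resolution):
    """Center of a cell: the position a point is snapped to by the grid."""
    return tuple((i + 0.5) / resolution for i in index)

rng = random.Random(0)
resolution = 32  # assumed grid resolution, for illustration only
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(1024)]

occupied = {voxel_index(p, resolution) for p in cloud}
raw_numbers = 3 * len(cloud)      # x, y, z per point
total_cells = resolution ** 3     # one value per cell in a dense grid

# Largest distance between a point and the cell center that replaces it.
max_error = 0.0
for p in cloud:
    center = voxel_center(voxel_index(p, resolution), resolution)
    error = sum((a - b) ** 2 for a, b in zip(p, center)) ** 0.5
    max_error = max(max_error, error)

print("numbers stored as a point cloud:", raw_numbers)
print("cells stored as a dense grid:", total_cells)
print("occupied cells:", len(occupied))
print("worst snapping error:", max_error)
```

Under these assumptions the dense grid stores roughly ten times as many values as the raw coordinates, most of its cells are empty, and every point may be moved by up to half a cell diagonal when it is snapped to the grid.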

PointNet Architecture

To deal with unordered input set, key to our approach is the use of a single symmetric function, max pooling. Effectively the network learns a set of optimization functions/criteria that select interesting or informative points of the point cloud and encode the reason for their selection. The final fully connected layers of the network aggregate these learnt optimal values into the global descriptor for the entire shape as mentioned above (shape classification) or are used to predict per point labels (shape segmentation). Our input format is easy to apply rigid or affine transformations to, as each point transforms independently. Thus we can add a data-dependent spatial transformer network that attempts to canonicalize the data before the PointNet processes them, so as to further improve the results.

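To handle an unordered input set, the key to the approach is a single symmetric function: max pooling. In effect, the network learns a set of optimization functions or criteria that select the interesting or informative points of the point cloud and encode the reason for selecting them. The final fully connected layers aggregate these learned optimal values into a global descriptor for the whole shape (shape classification) or use them to predict per-point labels (shape segmentation). (My translation ability really is limited here.) The input point cloud format makes it easy to apply rigid or affine transformations... (I only got the gist of the rest. My aim is just to let people know this paper exists, and anyone interested can dig deeper. It is an example of combining deep learning with 3D data, and there will probably be more and more of these.)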
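The role of max pooling as a symmetric function can be checked with a small sketch in plain Python. This is not the authors' implementation: the weights are random and untrained, and the layer sizes (8 and 16) and function names are assumptions made only for this example. The same small MLP is applied to every point, and the per-channel maximum over all points gives the global feature.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def make_layer(n_in, n_out, rng):
    """Random weights and biases for one fully connected layer (untrained)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.1, 0.1) for _ in range(n_out)]
    return weights, biases

def point_feature(point, layers):
    """Apply the same MLP to one point; the weights are shared by all points."""
    x = list(point)
    for weights, biases in layers:
        x = [relu(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

def global_feature(points, layers):
    """Max-pool every feature channel over all points (a symmetric function)."""
    features = [point_feature(p, layers) for p in points]
    return [max(channel) for channel in zip(*features)]

rng = random.Random(0)
layers = [make_layer(3, 8, rng), make_layer(8, 16, rng)]  # assumed sizes
cloud = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(50)]

shuffled = cloud[:]
rng.shuffle(shuffled)

# The pooled descriptor does not depend on the order of the points.
print(global_feature(cloud, layers) == global_feature(shuffled, layers))
```

Because each point is processed independently and the maximum ignores order, shuffling the points leaves the descriptor unchanged. Replacing `max` with something order-dependent, such as concatenating the per-point features, would break this property.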

Figure 2. PointNet architecture. The classification network takes n points as input, applies input and feature transformations, and then aggregates point features by max pooling. The output is classification score for m classes. The segmentation network is an extension to the classification net. It concatenates global and local features and outputs per point scores. mlp stands for multi-layer perceptron, the numbers in bracket are its layer sizes. Batchnorm is used for all layers with ReLU. Dropout layers are used for the last mlp in classification net.

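Figure 2, PointNet architecture: the classification network takes n points as input, applies input and feature transformations, and then aggregates point features by max pooling. The output is a classification score for each of m classes. The segmentation network is an extension of the classification network: it concatenates global and local features and outputs per-point scores. mlp stands for multi-layer perceptron, and the numbers in brackets are its layer sizes. Batch normalization is used for all layers, with ReLU. Dropout layers are used for the last mlp in the classification network.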
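The step where the segmentation network "concatenates global and local features" can be sketched as follows. The feature values, the tiny channel counts, the weights, and the function names are all hypothetical, and a single linear layer stands in for the segmentation mlp, so treat this as an illustration of the data flow rather than the real network.

```python
def max_pool(features):
    """Channel-wise maximum over all points (the global feature)."""
    return [max(channel) for channel in zip(*features)]

def concat_global_and_local(local_features, global_feature):
    """Append the same global feature to every point's local feature."""
    return [list(local) + list(global_feature) for local in local_features]

def per_point_labels(features, weights):
    """Score each point against every class, then pick the best class."""
    labels = []
    for f in features:
        scores = [sum(w * v for w, v in zip(row, f)) for row in weights]
        labels.append(scores.index(max(scores)))
    return labels

# Hypothetical activations for n = 3 points (made-up numbers).
local_features = [[0.1, 0.5], [0.9, 0.2], [0.3, 0.7]]               # 2 local channels
deep_features = [[1.0, 0.0, 2.0], [0.5, 3.0, 1.0], [0.2, 0.1, 4.0]]  # 3 deeper channels

global_feature = max_pool(deep_features)
combined = concat_global_and_local(local_features, global_feature)

# Hypothetical weights for m = 2 part classes; each row has 2 + 3 entries.
weights = [[1.0, 0.0, 0.1, 0.1, 0.1],
           [0.0, 1.0, 0.1, 0.1, 0.1]]
labels = per_point_labels(combined, weights)

print(global_feature)
print(labels)
```

Every point ends up with its own local channels plus an identical copy of the shape-level descriptor, which is what lets a per-point classifier use both local detail and global context.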

Object Part Segmentation Results

Figure 3. Part Segmentation Results. We visualize the CAD part segmentation results across all 16 object categories. We show both results for partial simulated Kinect scans (left block) and complete ShapeNet CAD models (right block).

Figure 3, part segmentation results: the CAD part segmentation results are visualized across all 16 object categories, for both partial simulated Kinect scans (left block) and complete ShapeNet CAD models (right block).

Semantic Segmentation Results

Figure 4. Semantic Segmentation Results. Top row is input point cloud with color. Bottom row is output semantic segmentation result (on points) displayed in the same camera viewpoint as input.

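Figure 4, semantic segmentation results: the top row is the input point cloud with color. The bottom row is the output semantic segmentation result (on points), shown from the same camera viewpoint as the input.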

Visualizing What PointNet has Learnt

Figure 5. Point function visualization. Our network learns a collection of point functions that select representative/critical points from an input point cloud. Here, we randomly pick 15 point functions from the 1024 functions in our model and visualize the activation regions for them.

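Figure 5, point function visualization: the network learns a collection of point functions that select representative (critical) points from an input point cloud. Here, 15 point functions are picked at random from the 1024 functions in the model, and their activation regions are visualized.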

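Figure 6. Visualizing Critical Points and Shape Upper-bound. The first row shows the input point clouds. The second row shows the critical points picked by our PointNet. The third row shows the upper-bound shape for the input: any input point set that falls between the critical point set and the upper-bound set will result in the same classification result.

Figure 6, visualizing critical points and the shape upper bound: the first row shows the input point clouds. The second row shows the critical points that PointNet picked. The third row shows the upper-bound shape for the input. Any input point set that lies between the critical point set and the upper-bound set produces the same classification result.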
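The claim about critical points and the upper-bound shape follows from max pooling, and a small sketch in plain Python can check it. The network here uses random, untrained weights with assumed layer sizes (8 and 16), the helper names are invented for this example, and the "upper-bound" part only samples random candidate points rather than computing the full upper-bound set.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def make_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.1, 0.1) for _ in range(n_out)]
    return weights, biases

def point_feature(point, layers):
    """Shared per-point MLP (untrained, for illustration only)."""
    x = list(point)
    for weights, biases in layers:
        x = [relu(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

def max_pool(features):
    return [max(channel) for channel in zip(*features)]

def critical_indices(features):
    """For each channel, the index of one point that attains the maximum."""
    n_channels = len(features[0])
    return sorted({max(range(len(features)), key=lambda i: features[i][c])
                   for c in range(n_channels)})

rng = random.Random(1)
layers = [make_layer(3, 8, rng), make_layer(8, 16, rng)]  # assumed sizes

def random_point():
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))

cloud = [random_point() for _ in range(200)]
features = [point_feature(p, layers) for p in cloud]
full = max_pool(features)

# Keep only the points that supply some channel's maximum.
critical = [cloud[i] for i in critical_indices(features)]
from_critical = max_pool([point_feature(p, layers) for p in critical])

# Add sampled points whose features never exceed the pooled maximum.
candidates = [random_point() for _ in range(500)]
harmless = [p for p in candidates
            if all(v <= best for v, best in zip(point_feature(p, layers), full))]
from_upper = max_pool([point_feature(p, layers) for p in cloud + harmless])

print(len(critical), "critical points out of", len(cloud))
print(from_critical == full, from_upper == full)
```

Since the classifier only sees the pooled feature, any point set between the critical subset and the enlarged set yields the same feature and therefore the same classification. With 16 channels there can be at most 16 critical points in this sketch.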

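I came across this paper by chance and see it as an example of combining deep learning with 3D point clouds. I believe this is where things are heading; deep learning is very popular, after all! Apologies again to everyone: my English and my abilities are limited. I just wanted to share this and learn something along the way, so please go easy on me. Thank you.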

This article was shared from the WeChat public account 点云PCL (dianyunPCL). Author: 石城大海

Originally published: 2017-08-10