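1. Introduction

3D data is crucial for self-driving cars, autonomous robots, virtual reality and augmented reality. Unlike 2D images, which are represented as pixel arrays, 3D data can be represented as polygonal meshes, volumetric pixel (voxel) grids, point clouds, and so on.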

1.1. Point clouds

1.2. Deep learning on point clouds

https://arxiv.org/pdf/1604.03265.pdf

1.3. PointNet

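• Point clouds are unordered. The algorithm must be invariant to permutations of the input set.
• If we rotate a chair, it's still a chair, right? The network must be invariant to rigid transformations.
• The network should capture interactions among points.

The authors of PointNet introduce a neural network that takes all of these properties into account. It handles classification, part segmentation and semantic segmentation tasks.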
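The first property can be illustrated with a small sketch (an illustration, not the authors' code). A symmetric function, such as an element-wise maximum over all points, returns the same value for any ordering of the input; the model code in section 2.4 relies on max pooling for this reason. The helper name `global_max_feature` and the toy points are assumptions made for this example.

```python
import random

def global_max_feature(points):
    """Element-wise maximum over a set of points (a symmetric function)."""
    dims = len(points[0])
    return tuple(max(p[i] for p in points) for i in range(dims))

# a toy "point cloud" of three 3D points (illustrative values)
points = [(0.1, 0.5, -0.2), (0.7, -0.3, 0.4), (-0.6, 0.2, 0.9)]

shuffled = points[:]
random.shuffle(shuffled)

# the pooled feature does not depend on the order of the points
assert global_max_feature(points) == global_max_feature(shuffled)
```

Any other symmetric aggregation (sum, mean) would be order-independent in the same way; the maximum is simply the one used in the code further down.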

https://arxiv.org/pdf/1612.00593.pdf

2. Implementation

https://arxiv.org/pdf/1612.00593.pdf

https://github.com/nikitakaraevv/pointnet/blob/master/nbs/PointNetClass.ipynb

2.1. Dataset

http://3dvision.princeton.edu/projects/2014/3DShapeNets/

```python
import numpy as np
import random
import math
!pip install path.py;
from path import Path
```

```python
!wget http://3dvision.princeton.edu/projects/2014/3DShapeNets/ModelNet10.zip
!unzip -q ModelNet10.zip
path = Path("ModelNet10")
```

```python
def read_off(file):
    # the first line of an OFF file must be the literal header 'OFF'
    if 'OFF' != file.readline().strip():
        raise ValueError('Not a valid OFF header')
    # second line: number of vertices, faces and edges
    n_verts, n_faces, __ = tuple([int(s) for s in file.readline().strip().split(' ')])
    verts = [[float(s) for s in file.readline().strip().split(' ')] for i_vert in range(n_verts)]
    # each face line starts with its vertex count, which we drop
    faces = [[int(s) for s in file.readline().strip().split(' ')][1:] for i_face in range(n_faces)]
    return verts, faces

with open(path/"bed/train/bed_0001.off", 'r') as f:
    mesh = read_off(f)
```
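To make the file layout concrete, here is a minimal sketch that parses a hand-written OFF mesh from memory instead of from the dataset. The tetrahedron data and the name `parse_off` are assumptions made for this example; the parsing logic follows the same steps as `read_off`.

```python
import io

# A minimal OFF file: header, counts (vertices, faces, edges),
# then one vertex per line, then one face per line as "n i0 i1 ...".
OFF_TETRAHEDRON = """OFF
4 4 6
0 0 0
1 0 0
0 1 0
0 0 1
3 0 1 2
3 0 1 3
3 0 2 3
3 1 2 3
"""

def parse_off(file):
    """Parse an OFF stream into (verts, faces)."""
    if file.readline().strip() != 'OFF':
        raise ValueError('Not a valid OFF header')
    n_verts, n_faces, _ = (int(s) for s in file.readline().split())
    verts = [[float(s) for s in file.readline().split()] for _ in range(n_verts)]
    # drop the leading vertex count of every face line
    faces = [[int(s) for s in file.readline().split()][1:] for _ in range(n_faces)]
    return verts, faces

verts, faces = parse_off(io.StringIO(OFF_TETRAHEDRON))
print(len(verts), len(faces))
print(faces[0])
```

Each face is stored as a list of indices into `verts`, which is what the point sampling step consumes.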

2.2. Point sampling

```python
verts, faces = mesh
areas = np.zeros((len(faces)))
verts = np.array(verts)

# function to calculate triangle area by its vertices
# https://en.wikipedia.org/wiki/Heron%27s_formula
def triangle_area(pt1, pt2, pt3):
    side_a = np.linalg.norm(pt1 - pt2)
    side_b = np.linalg.norm(pt2 - pt3)
    side_c = np.linalg.norm(pt3 - pt1)
    s = 0.5 * (side_a + side_b + side_c)
    return max(s * (s - side_a) * (s - side_b) * (s - side_c), 0)**0.5

# we calculate areas of all faces in our mesh
for i in range(len(areas)):
    areas[i] = (triangle_area(verts[faces[i][0]],
                              verts[faces[i][1]],
                              verts[faces[i][2]]))
```
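As a sanity check on the area computation, the sketch below compares Heron's formula with the cross-product formula on a 3-4-5 right triangle, whose area is 6. The function names `heron_area` and `cross_area` and the test triangle are assumptions made for this example. The `max(..., 0)` clamp is kept because floating-point rounding can make the product slightly negative for nearly degenerate triangles.

```python
import math

def heron_area(p1, p2, p3):
    """Triangle area from side lengths (Heron's formula)."""
    a = math.dist(p1, p2)
    b = math.dist(p2, p3)
    c = math.dist(p3, p1)
    s = 0.5 * (a + b + c)
    return max(s * (s - a) * (s - b) * (s - c), 0.0) ** 0.5

def cross_area(p1, p2, p3):
    """Triangle area as half the norm of the cross product of two edges."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

# right triangle with legs 3 and 4
tri = ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0))
print(heron_area(*tri), cross_area(*tri))
```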

```python
k = 3000
# we sample 'k' faces with probabilities proportional to their areas
# weights are used to create a distribution.
# they don't have to sum up to one.
sampled_faces = (random.choices(faces,
                                weights=areas,
                                k=k))

# function to sample points on a triangle surface
def sample_point(pt1, pt2, pt3):
    # barycentric coordinates on a triangle
    # https://mathworld.wolfram.com/BarycentricCoordinates.html
    s, t = sorted([random.random(), random.random()])
    f = lambda i: s * pt1[i] + (t-s) * pt2[i] + (1-t) * pt3[i]
    return (f(0), f(1), f(2))

pointcloud = np.zeros((k, 3))

# sample points on chosen faces for the point cloud of size 'k'
for i in range(len(sampled_faces)):
    pointcloud[i] = (sample_point(verts[sampled_faces[i][0]],
                                  verts[sampled_faces[i][1]],
                                  verts[sampled_faces[i][2]]))
```
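The barycentric step can be checked in isolation. In this sketch, sorting two uniform random numbers `s <= t` gives three weights `s`, `t - s` and `1 - t` that are non-negative and sum to one, so every sampled point is a convex combination of the vertices and stays inside the triangle. The helper `barycentric_weights`, the seeded generator and the unit test triangle are assumptions made for this example.

```python
import random

def barycentric_weights(rng):
    """Three non-negative weights that sum to one."""
    s, t = sorted([rng.random(), rng.random()])
    return s, t - s, 1.0 - t

def sample_point(pt1, pt2, pt3, rng):
    """Random point on the triangle (pt1, pt2, pt3)."""
    w1, w2, w3 = barycentric_weights(rng)
    return tuple(w1 * pt1[i] + w2 * pt2[i] + w3 * pt3[i] for i in range(3))

rng = random.Random(0)  # seeded only to make the sketch repeatable
tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
samples = [sample_point(*tri, rng) for _ in range(1000)]

# for this triangle, "inside" means x >= 0, y >= 0, x + y <= 1 and z == 0
inside = all(x >= 0 and y >= 0 and x + y <= 1 + 1e-12 and z == 0
             for x, y, z in samples)
print(inside)
```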

2.3. Augmentations

```python
# normalize: center at the origin and scale into the unit sphere
norm_pointcloud = pointcloud - np.mean(pointcloud, axis=0)
norm_pointcloud /= np.max(np.linalg.norm(norm_pointcloud, axis=1))

# rotation around z-axis
theta = random.random() * 2. * math.pi # rotation angle
rot_matrix = np.array([[ math.cos(theta), -math.sin(theta),    0],
                       [ math.sin(theta),  math.cos(theta),    0],
                       [0,                             0,      1]])

# rotate the normalized cloud, so the noise scale below matches the unit sphere
rot_pointcloud = rot_matrix.dot(norm_pointcloud.T).T

# add some noise
noise = np.random.normal(0, 0.02, rot_pointcloud.shape)
noisy_pointcloud = rot_pointcloud + noise
```
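A small sketch of what the z-axis rotation does to a single point: the z coordinate is unchanged and the distance to the origin is preserved, which is why this augmentation changes the orientation of an object but not its shape. The helper `rotate_z` and the sample point are assumptions made for this example; the formula is the same matrix product written out by hand.

```python
import math

def rotate_z(point, theta):
    """Rotate a 3D point by angle theta (radians) around the z-axis."""
    x, y, z = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

p = (1.0, 2.0, 3.0)
q = rotate_z(p, math.pi / 2)  # quarter turn: (x, y) -> (-y, x)

# the rotation keeps z and the distance to the origin
print(q)
print(math.dist(p, (0, 0, 0)), math.dist(q, (0, 0, 0)))
```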

2.4. Model

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tnet(nn.Module):
    def __init__(self, k=3):
        super().__init__()
        self.k = k
        self.conv1 = nn.Conv1d(k, 64, 1)
        self.conv2 = nn.Conv1d(64, 128, 1)
        self.conv3 = nn.Conv1d(128, 1024, 1)
        self.fc1 = nn.Linear(1024, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, k*k)

        self.bn1 = nn.BatchNorm1d(64)
        self.bn2 = nn.BatchNorm1d(128)
        self.bn3 = nn.BatchNorm1d(1024)
        self.bn4 = nn.BatchNorm1d(512)
        self.bn5 = nn.BatchNorm1d(256)

    def forward(self, input):
        # input.shape == (bs, k, n): channels first, as nn.Conv1d expects
        bs = input.size(0)
        xb = F.relu(self.bn1(self.conv1(input)))
        xb = F.relu(self.bn2(self.conv2(xb)))
        xb = F.relu(self.bn3(self.conv3(xb)))
        # max over all points gives one global feature vector per sample
        pool = nn.MaxPool1d(xb.size(-1))(xb)
        flat = nn.Flatten(1)(pool)
        xb = F.relu(self.bn4(self.fc1(flat)))
        xb = F.relu(self.bn5(self.fc2(xb)))

        # initialize as identity
        init = torch.eye(self.k, requires_grad=True).repeat(bs, 1, 1)
        if xb.is_cuda:
            init = init.cuda()
        # add identity to the output
        matrix = self.fc3(xb).view(-1, self.k, self.k) + init
        return matrix
```

```python
class Transform(nn.Module):
    def __init__(self):
        super().__init__()
        self.input_transform = Tnet(k=3)
        self.feature_transform = Tnet(k=64)
        self.conv1 = nn.Conv1d(3, 64, 1)

        self.conv2 = nn.Conv1d(64, 128, 1)
        self.conv3 = nn.Conv1d(128, 1024, 1)

        self.bn1 = nn.BatchNorm1d(64)
        self.bn2 = nn.BatchNorm1d(128)
        self.bn3 = nn.BatchNorm1d(1024)

    def forward(self, input):
        matrix3x3 = self.input_transform(input)
        # batch matrix multiplication
        xb = torch.bmm(torch.transpose(input, 1, 2), matrix3x3).transpose(1, 2)
        xb = F.relu(self.bn1(self.conv1(xb)))

        matrix64x64 = self.feature_transform(xb)
        xb = torch.bmm(torch.transpose(xb, 1, 2), matrix64x64).transpose(1, 2)

        xb = F.relu(self.bn2(self.conv2(xb)))
        xb = self.bn3(self.conv3(xb))
        xb = nn.MaxPool1d(xb.size(-1))(xb)
        output = nn.Flatten(1)(xb)
        return output, matrix3x3, matrix64x64
```

```python
class PointNet(nn.Module):
    def __init__(self, classes=10):
        super().__init__()
        self.transform = Transform()
        self.fc1 = nn.Linear(1024, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, classes)

        self.bn1 = nn.BatchNorm1d(512)
        self.bn2 = nn.BatchNorm1d(256)
        self.dropout = nn.Dropout(p=0.3)
        self.logsoftmax = nn.LogSoftmax(dim=1)

    def forward(self, input):
        xb, matrix3x3, matrix64x64 = self.transform(input)
        xb = F.relu(self.bn1(self.fc1(xb)))
        xb = F.relu(self.bn2(self.dropout(self.fc2(xb))))
        output = self.fc3(xb)
        return self.logsoftmax(output), matrix3x3, matrix64x64
```

```python
def pointnetloss(outputs, labels, m3x3, m64x64, alpha = 0.0001):
    criterion = torch.nn.NLLLoss()
    bs = outputs.size(0)
    id3x3 = torch.eye(3, requires_grad=True).repeat(bs, 1, 1)
    id64x64 = torch.eye(64, requires_grad=True).repeat(bs, 1, 1)
    if outputs.is_cuda:
        id3x3 = id3x3.cuda()
        id64x64 = id64x64.cuda()
    diff3x3 = id3x3 - torch.bmm(m3x3, m3x3.transpose(1, 2))
    diff64x64 = id64x64 - torch.bmm(m64x64, m64x64.transpose(1, 2))
    return criterion(outputs, labels) + alpha * (torch.norm(diff3x3) + torch.norm(diff64x64)) / float(bs)
```
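The regularization term in this loss penalizes transformation matrices that are far from orthogonal. A minimal pure-Python sketch of the quantity for a single matrix follows: the Frobenius norm of `I - A A^T` is zero for a rotation and grows when the matrix scales its input. The helper names `matmul_transpose` and `orthogonality_penalty` and the two example matrices are assumptions made for this example; the loss above applies `torch.norm` to the whole batch of differences at once.

```python
import math

def matmul_transpose(a):
    """Return A @ A^T for a square matrix given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def orthogonality_penalty(a):
    """Frobenius norm of I - A A^T; zero exactly when A is orthogonal."""
    n = len(a)
    aat = matmul_transpose(a)
    return math.sqrt(sum(((1.0 if i == j else 0.0) - aat[i][j]) ** 2
                         for i in range(n) for j in range(n)))

theta = 0.3
rotation = [[math.cos(theta), -math.sin(theta), 0.0],
            [math.sin(theta),  math.cos(theta), 0.0],
            [0.0,              0.0,             1.0]]
scaling = [[2.0, 0.0, 0.0],
           [0.0, 2.0, 0.0],
           [0.0, 0.0, 2.0]]

print(orthogonality_penalty(rotation))  # numerically zero
print(orthogonality_penalty(scaling))   # sqrt(27), since I - 4I = -3I
```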

2.5. Training

https://github.com/nikitakaraevv/pointnet/blob/master/nbs/PointNetClass.ipynb

3. Final words

https://github.com/nikitakaraevv/pointnet/blob/master/nbs/PointNetClass.ipynb

[1] Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas, PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation (2017), CVPR 2017

http://stanford.edu/~rqi/pointnet/

http://news.mit.edu/2019/deep-learning-point-clouds-1021

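[3] Loic Landrieu, Semantic Segmentation of 3D Point Clouds (2019), Université Paris-Est, Machine Learning and Optimization working group

[4] Charles R. Qi et al., Volumetric and Multi-View CNNs for Object Classification on 3D Data (2016), arxiv.org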

https://arxiv.org/pdf/1604.03265.pdf
