pytorch

sofu456

Published 2019-07-09 14:32:08

Included in the column: sofu456

Installation

pip3 install https://download.pytorch.org/whl/cpu/torch-1.0.1-cp35-cp35m-win_amd64.whl
pip3 install torchvision    # computer-vision toolkit: datasets, pretrained models, image transforms

Visualization tools

visdom, tensorboardX

Printing the model

print(net)   # net is the model object; prints its layer structure

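PyTorch (more highly encapsulated than TensorFlow)

Core concepts: tensor, op, Storage (a one-dimensional array of a single data type), torch.nn (network structure), torch.autograd (automatic differentiation)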

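Automatic differentiation

Variable (in torch.autograd; see also constant in torch.nn.init) was originally distinct from Tensor, with the underlying tensor available through .data. It carries requires_grad (whether a gradient is needed) and volatile; setting requires_grad to False reduces computation. Since PyTorch 0.4, Variable has been merged into Tensor and volatile has been replaced by torch.no_grad().
forward is the forward pass (nn.Module overloads the call operator (), so calling module(x) runs forward); backward is the backward pass.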
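
The autograd notes above can be sketched as follows. This is a minimal illustration, assuming PyTorch 0.4 or later (plain tensors instead of Variable, torch.no_grad() instead of volatile); the names x, y, z and linear are made up for the example.

```python
import torch

# A leaf tensor that tracks gradients (plays the role of the old Variable).
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Forward pass: y = sum(x^2)
y = (x ** 2).sum()

# Backward pass: dy/dx = 2x is accumulated into x.grad
y.backward()
grad = x.grad.clone()

# Inside no_grad() no graph is recorded, which saves memory and compute.
with torch.no_grad():
    z = x * 2
tracks_grad = z.requires_grad

# nn.Module implements __call__, so module(input) runs forward().
linear = torch.nn.Linear(3, 1)
out = linear(x.detach())
```

Here grad holds 2x for each element, and z does not track gradients because it was created inside no_grad().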

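Parameter ==> TensorFlow (the author's analogy is placeholder, although as a trainable tensor it is closer to tf.Variable); Module ==> TensorFlow (session and computation graph)

Use item() to convert a one-element tensor to a Python number; tensor.view() changes the shape. See also tensor dimension transformations.

torch.max returns the maximum value ==> tensor.argmax() returns the position of the maximum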
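
A short sketch of these calls on a small hand-written tensor (the values are arbitrary and chosen only for illustration):

```python
import torch

t = torch.tensor([[1.0, 5.0, 3.0],
                  [4.0, 2.0, 6.0]])

# item() converts a one-element tensor to a Python number.
total = t.sum().item()

# view() reshapes without copying (the tensor must be contiguous).
flat = t.view(-1)
reshaped = t.view(3, 2)

# torch.max over a dimension returns both the values and their indices.
values, indices = torch.max(t, dim=1)

# argmax returns only the position of the maximum.
pos = t.argmax(dim=1)
```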

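Defining a dataset

import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class MyDataSet(Dataset):
    def __init__(self, path):
        self.data = []
        for file in os.listdir(path):
            # The first two characters of the file name are used as the integer label.
            self.data.append((Image.open(os.path.join(path, file)), int(file[:2])))
        # Resize every image to 70x70 and convert it to a tensor.
        self.transform = transforms.Compose([
            transforms.Resize((70, 70)),
            transforms.ToTensor()])

    def __getitem__(self, index):
        img, label = self.data[index]
        img = self.transform(img)
        return img, label    # img is returned as a tensor

    def __len__(self):
        return len(self.data)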

Loading data

data_loader = DataLoader(train_data, batch_size=64, shuffle=True)
for data, txt in data_loader:
    # tensor -> Variable; since PyTorch 0.4 this wrapper is a no-op and can be dropped
    img, label = torch.autograd.Variable(data), torch.autograd.Variable(txt)
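
To see what the loader yields without real image files, here is a small sketch that assumes a synthetic TensorDataset in place of train_data; the shapes and the batch size of 4 are illustrative only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for train_data: 10 fake 3x70x70 images with labels.
images = torch.rand(10, 3, 70, 70)
labels = torch.arange(10)
train_data = TensorDataset(images, labels)

data_loader = DataLoader(train_data, batch_size=4, shuffle=True)

batch_shapes = []
for data, target in data_loader:
    # Each iteration yields one mini-batch; the last one may be smaller.
    batch_shapes.append((tuple(data.shape), tuple(target.shape)))
```

With 10 samples and a batch size of 4, the loader produces batches of 4, 4 and 2 samples.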

Running the gradient step

optimizer.zero_grad()
loss.backward()
optimizer.step()
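
In context, the three calls sit inside a training loop. The following is a minimal sketch assuming a tiny linear model, SGD and a synthetic regression target, none of which come from the original notes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(4, 1)                       # hypothetical tiny model
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
criterion = nn.MSELoss()

inputs = torch.randn(16, 4)
targets = inputs.sum(dim=1, keepdim=True)   # synthetic regression target

losses = []
for step in range(50):
    optimizer.zero_grad()                   # clear gradients left from the previous step
    loss = criterion(net(inputs), targets)  # forward pass
    loss.backward()                         # compute gradients
    optimizer.step()                        # update parameters
    losses.append(loss.item())
```

Without zero_grad(), gradients from successive backward() calls would accumulate in each parameter's .grad.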

Saving and restoring parameters

torch.save({'net':unet.state_dict(),'optimizer':optimizer.state_dict()}, fileSave)
checkpoint = torch.load(fileSave)
unet.load_state_dict(checkpoint['net'])
optimizer.load_state_dict(checkpoint['optimizer'])
unet.eval()                     # switch to evaluation (test) mode
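
A self-contained round trip of the same pattern, assuming a small nn.Linear in place of unet and a temporary file in place of fileSave (both are placeholders for illustration):

```python
import os
import tempfile
import torch
import torch.nn as nn

net = nn.Linear(3, 2)                       # hypothetical stand-in for unet
optimizer = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)

path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save({"net": net.state_dict(), "optimizer": optimizer.state_dict()}, path)

# Restore into freshly constructed objects with the same architecture.
restored = nn.Linear(3, 2)
restored_opt = torch.optim.SGD(restored.parameters(), lr=0.01, momentum=0.9)
checkpoint = torch.load(path)
restored.load_state_dict(checkpoint["net"])
restored_opt.load_state_dict(checkpoint["optimizer"])
restored.eval()

same_weights = torch.equal(net.weight, restored.weight)
```

Saving state_dict() rather than the whole model object keeps the checkpoint independent of how the class is defined on disk, at the cost of having to rebuild the model before loading.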

Evaluation (test) mode

nn.Module.eval()    # disables dropout and makes BatchNorm use its running statistics
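
The effect on dropout can be checked directly. A minimal sketch with a standalone nn.Dropout layer (the input of ones is chosen so that the scaling is easy to see):

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
train_out = drop(x)     # entries are zeroed at random, survivors scaled by 1/(1-p)

drop.eval()
eval_out = drop(x)      # dropout is the identity in evaluation mode

# eval() does not turn off autograd; wrap inference in no_grad() for that.
with torch.no_grad():
    _ = drop(x)
```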

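Tensor operations

cat concatenates tensors; split (or chunk) splits a tensor; unsqueeze adds a dimension; squeeze removes dimensions of size 1; permute reorders dimensions; transpose swaps two dimensions. dim=0 is the first dimension, 1 the second, 2 the third.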
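
A quick sketch of each operation with the resulting shapes noted in comments (the tensors are dummy data):

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

cat0 = torch.cat([a, b], dim=0)          # (4, 3): stacked along the first dimension
cat1 = torch.cat([a, b], dim=1)          # (2, 6): stacked along the second dimension

parts = torch.split(cat0, 2, dim=0)      # pieces of size 2 -> two (2, 3) tensors
thirds = torch.chunk(cat1, 3, dim=1)     # 3 pieces -> three (2, 2) tensors

u = a.unsqueeze(0)                       # (1, 2, 3)
s = u.squeeze(0)                         # (2, 3)

img = torch.zeros(70, 70, 3)             # H, W, C layout
chw = img.permute(2, 0, 1)               # (3, 70, 70): C, H, W layout
t = a.transpose(0, 1)                    # (3, 2)
```

Note that split takes the size of each piece, while chunk takes the number of pieces.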

Reducing memory usage

Reduce the batch size, reduce the input size, call gc.collect() (import gc), or save a checkpoint to file and reload it.

tensorboardX

from tensorboardX import SummaryWriter
writer = SummaryWriter(comment='Net')
writer.add_graph(unet, data)    # data: any single input batch
writer.add_scalar('loss', train_loss, epoch)

Python progress bar

tqdm. For command-line arguments, use sys.argv (sys.argv[0] is the script name; the arguments start at sys.argv[1]).
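
The layout of sys.argv can be illustrated with a small helper. parse_args and the sample command line are hypothetical and exist only for this sketch:

```python
import sys

def parse_args(argv):
    """Split an argv-style list into (script_name, list_of_arguments)."""
    script = argv[0]     # the script name, as in sys.argv[0]
    args = argv[1:]      # the actual command-line arguments
    return script, args

# Simulates running: python train.py --epochs 10
script, args = parse_args(["train.py", "--epochs", "10"])

# In a real script the call would be parse_args(sys.argv).
```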

Autoencoder

import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()

        # Compress
        self.encoder = nn.Sequential(
            nn.Linear(28*28, 128),
            nn.Tanh(),             # Tanh keeps values in [-1, 1], the range of the normalized data
            nn.Linear(128, 64),
            nn.Tanh(),
            nn.Linear(64, 12),
            nn.Tanh(),
            nn.Linear(12, 3),   # compress to 3 features, suitable for 3D visualization
        )
        # Decompress
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.Tanh(),
            nn.Linear(12, 64),
            nn.Tanh(),
            nn.Linear(64, 128),
            nn.Tanh(),
            nn.Linear(128, 28*28),
            nn.Sigmoid(),       # the activation keeps the output in (0, 1)
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded

autoencoder = AutoEncoder()

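The autoencoder works on a flattened (one-dimensional) input; training increases the similarity between the decoded output and the original input.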
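
What that training looks like can be sketched as below. This assumes a reduced encoder and decoder pair (fewer layers than the AutoEncoder above), random data in place of real images, and Adam with MSE loss; none of these choices come from the original notes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Reduced stand-in for the AutoEncoder: same idea, fewer layers.
encoder = nn.Sequential(nn.Linear(28 * 28, 64), nn.Tanh(), nn.Linear(64, 3))
decoder = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 28 * 28), nn.Sigmoid())

params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=0.005)
loss_func = nn.MSELoss()

images = torch.rand(32, 1, 28, 28)      # fake batch in place of real data
x = images.view(-1, 28 * 28)            # flatten each image to a 1-D vector

losses = []
for step in range(50):
    encoded = encoder(x)
    decoded = decoder(encoded)
    loss = loss_func(decoded, x)        # reconstruction error against the input itself
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

The target of the loss is the input x itself, which is what makes this unsupervised: no labels are involved.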

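Image network algorithms

VGG: image classifier; the ImageNet-pretrained model distinguishes 1000 classes (its final fully connected layer has 1000 outputs); the output is a reduced-dimension vector.
CNN: image classification with reduced-dimension output (in the convolutional part, Linear is replaced by Conv2d).
U-Net: image segmentation; the output is an image.
YOLO: object detection.
AOD-Net: dehazing algorithm; see https://blog.csdn.net/qq_35608277/article/details/86010157
Style transfer: uses transfer learning from VGG (pre-training here refers specifically to transfer learning, which needs relatively little data); see https://ptorch.com/news/133.html
Ensemble learning: see https://www.cnblogs.com/pinard/p/6131423.html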

PyTorch fastai and TensorFlow Hub

Higher-level wrappers: a single API call provides DNN functionality.

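Difference between autoencoders and GANs

An autoencoder's decoder takes the encoder's output as its input; a GAN's generator takes random noise.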

Face swapping

Train an autoencoder with several decoders; after encoding, swap in a different decoder.
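
As a rough sketch of that idea, assuming one shared encoder and one decoder per identity (the layer sizes, the flattened 64x64 input and the names decoder_a and decoder_b are all illustrative, and training is omitted):

```python
import torch
import torch.nn as nn

# One shared encoder, one decoder per identity (A and B).
encoder = nn.Sequential(nn.Linear(64 * 64, 128), nn.Tanh())
decoder_a = nn.Sequential(nn.Linear(128, 64 * 64), nn.Sigmoid())
decoder_b = nn.Sequential(nn.Linear(128, 64 * 64), nn.Sigmoid())

face_a = torch.rand(1, 64 * 64)          # fake flattened face of person A

# Training (not shown) would reconstruct A through decoder_a and B through decoder_b.
# Swapping: encode A's face, then decode it with B's decoder.
with torch.no_grad():
    latent = encoder(face_a)
    swapped = decoder_b(latent)
```

Because both decoders read the same latent space, the output keeps the pose and expression captured by the encoder while taking its appearance from the chosen decoder.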

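Commonly used modules

numpy for matrix operations, scipy for mathematical computation, pandas for data analysis, networkx for graph theory, matplotlib for plotting (pyplot.ion, pyplot.ioff, pyplot.pause for the draw interval, pyplot.show), pygame for drawing

sklearn and keras: higher-level wrappers for machine learning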

Mainstream algorithms

Image Classification
VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
ResNet: Deep Residual Learning for Image Recognition
DenseNet: Densely Connected Convolutional Networks
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

Semantic Segmentation
DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
PSPNet: Pyramid Scene Parsing Network
DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes

Object Detection
SSD: Single Shot MultiBox Detector
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
YOLOv3: An Incremental Improvement
FPN: Feature Pyramid Networks for Object Detection

Pose Estimation
CPM: Convolutional Pose Machines
OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Instance Segmentation
Mask R-CNN

Generative Adversarial Networks
Pix2pix: Image-to-Image Translation with Conditional Adversarial Nets
CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

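Error reference

Fully connected layer size mismatch: in torch.nn.Linear(64*n*n, 128), n must match the output size of the previous layer.
Assertion cur_target >= 0 and cur_target < n_classes failed: the label values fall outside the range of output classes, i.e. the number of network outputs does not match the labels. See https://blog.csdn.net/weixin_36411839/article/details/82720551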
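
Both errors can be avoided by checking shapes and label ranges up front. The sketch below assumes the Linear input is 64*n*n (64 channels times an n by n feature map); the convolution settings, the 70x70 input and n_classes = 10 are illustrative choices.

```python
import torch
import torch.nn as nn

conv = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.MaxPool2d(2),
)

x = torch.rand(1, 3, 70, 70)
features = conv(x)                        # (1, 64, 35, 35)

# in_features of the Linear layer must equal channels * height * width.
n = features.shape[-1]
fc = nn.Linear(64 * n * n, 128)
out = fc(features.view(features.size(0), -1))

# Labels passed to CrossEntropyLoss must lie in [0, n_classes).
n_classes = 10
logits = torch.rand(4, n_classes)
targets = torch.tensor([0, 3, 9, 5])
valid = bool(((targets >= 0) & (targets < n_classes)).all())
loss = nn.CrossEntropyLoss()(logits, targets)
```

Reading n from the actual feature map, rather than hard-coding it, keeps the Linear layer consistent when the input size or the convolution stack changes.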

Originally published: 2019-04-04. In case of infringement, contact cloudcommunity@tencent.com for removal.