Caffe模型转PaddlePaddle的Fluid版本预测模型

夜雨飘零

发布于 2020-05-06 11:50:18

7172

发布于 2020-05-06 11:50:18

原文博客：Doi技术团队链接地址：https://blog.doiduoyi.com/authors/1584446358138 初心：记录优秀的Doi技术团队学习经历

前言

有不少开发者在学习深度学习框架的时候会开源一些训练好的模型，我们可以使用这些模型来运用到我们自己的项目中。如果使用的是同一个深度学习框架，那就很方便，可以直接使用，但是如果时不同深度学习框架，我们就要对模型转换一下。下面我们就介绍如何把Caffe的模型转换成PaddlePaddle的Fluid模型。

环境准备

在线安装最新发布的PaddlePaddle，可以使用pip命令直接在线安装PaddlePaddle。

pip install paddlepaddle

下在安装最新的PaddlePaddle，可以在下面选择适合自己PaddlePaddle的版本，这里下载的是最新编译得到的，然后使用pip命令安装。

http://www.paddlepaddle.org/documentation/docs/zh/0.14.0/new_docs/beginners_guide/install/install_doc.html#id26

克隆PaddlePaddle下的models源码。

git clone https://github.com/PaddlePaddle/models.git

转换模型

进入到上一步克隆的models的caffe2fluid目录。

cd models/fluid/image_classification/caffe2fluid/

下载转换时所需要的依赖的Python文件。

cd proto/ && wget https://raw.githubusercontent.com/ethereon/caffe-tensorflow/master/kaffe/caffe/caffepb.py

修改刚下载的文件名：

mv caffepb.py caffe_pb2.py

获取需要转换的Caffe模型，笔者是参考以下这个开源获取的：

https://gist.github.com/ksimonyan/211839e770f7b538e2d8

首先是获取VGG_ILSVRC_16_layers_deploy.prototxt网络文件。

cd ../ && wget https://gist.githubusercontent.com/ksimonyan/211839e770f7b538e2d8/raw/ded9363bd93ec0c770134f4e387d8aaaaa2407ce/VGG_ILSVRC_16_layers_deploy.prototxt

其次是获取VGG_ILSVRC_16_layers.caffemodel权重文件。

wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel

把Caffe模型转换成Fluid版本的网络结构文件和权重文件，其中VGG16.py是PaddlePaddle定义网络结构的Python文件，VGG16.npy是网络的权重文件。

python convert.py VGG_ILSVRC_16_layers_deploy.prototxt \
        --caffemodel VGG_ILSVRC_16_layers.caffemodel \
        --data-output-path VGG16.npy \
        --code-output-path VGG16.py

在执行过程中会生产以下信息：

register layer[Axpy]
register layer[Flatten]
register layer[ArgMax]
register layer[Reshape]
register layer[ROIPooling]
register layer[PriorBox]
register layer[Permute]
register layer[DetectionOutput]
register layer[Normalize]
register layer[Select]
register layer[Crop]
register layer[Reduction]

------------------------------------------------------------
    WARNING: PyCaffe not found!
    Falling back to a pure protocol buffer implementation.
    * Conversions will be drastically slower.
------------------------------------------------------------

Type                 Name                                          Param               Output
----------------------------------------------------------------------------------------------
Data                 data                                             --    (10, 3, 224, 224)
Convolution          conv1_1                               (64, 3, 3, 3)   (10, 64, 224, 224)
Convolution          conv1_1                                       (64,)   (10, 64, 224, 224)
Convolution          conv1_2                              (64, 64, 3, 3)   (10, 64, 224, 224)
Convolution          conv1_2                                       (64,)   (10, 64, 224, 224)
Pooling              pool1                                            --   (10, 64, 112, 112)
Convolution          conv2_1                             (128, 64, 3, 3)  (10, 128, 112, 112)
Convolution          conv2_1                                      (128,)  (10, 128, 112, 112)
Convolution          conv2_2                            (128, 128, 3, 3)  (10, 128, 112, 112)
Convolution          conv2_2                                      (128,)  (10, 128, 112, 112)
Pooling              pool2                                            --    (10, 128, 56, 56)
Convolution          conv3_1                            (256, 128, 3, 3)    (10, 256, 56, 56)
Convolution          conv3_1                                      (256,)    (10, 256, 56, 56)
Convolution          conv3_2                            (256, 256, 3, 3)    (10, 256, 56, 56)
Convolution          conv3_2                                      (256,)    (10, 256, 56, 56)
Convolution          conv3_3                            (256, 256, 3, 3)    (10, 256, 56, 56)
Convolution          conv3_3                                      (256,)    (10, 256, 56, 56)
Pooling              pool3                                            --    (10, 256, 28, 28)
Convolution          conv4_1                            (512, 256, 3, 3)    (10, 512, 28, 28)
Convolution          conv4_1                                      (512,)    (10, 512, 28, 28)
Convolution          conv4_2                            (512, 512, 3, 3)    (10, 512, 28, 28)
Convolution          conv4_2                                      (512,)    (10, 512, 28, 28)
Convolution          conv4_3                            (512, 512, 3, 3)    (10, 512, 28, 28)
Convolution          conv4_3                                      (512,)    (10, 512, 28, 28)
Pooling              pool4                                            --    (10, 512, 14, 14)
Convolution          conv5_1                            (512, 512, 3, 3)    (10, 512, 14, 14)
Convolution          conv5_1                                      (512,)    (10, 512, 14, 14)
Convolution          conv5_2                            (512, 512, 3, 3)    (10, 512, 14, 14)
Convolution          conv5_2                                      (512,)    (10, 512, 14, 14)
Convolution          conv5_3                            (512, 512, 3, 3)    (10, 512, 14, 14)
Convolution          conv5_3                                      (512,)    (10, 512, 14, 14)
Pooling              pool5                                            --      (10, 512, 7, 7)
InnerProduct         fc6                                   (4096, 25088)           (10, 4096)
InnerProduct         fc6                                         (4096,)           (10, 4096)
Dropout              drop6                                            --           (10, 4096)
InnerProduct         fc7                                    (4096, 4096)           (10, 4096)
InnerProduct         fc7                                         (4096,)           (10, 4096)
Dropout              drop7                                            --           (10, 4096)
InnerProduct         fc8                                    (1000, 4096)           (10, 1000)
InnerProduct         fc8                                         (1000,)           (10, 1000)
Softmax              prob                                             --           (10, 1000)
Converting data...
Saving data...
Saving source...
set env variable before using converted model if used custom_layers:

使用PaddlePaddle的网络结构文件和权重文件生成预测模型文件。

python VGG16.py VGG16.npy ./fluid_models

执行上面的命令之后，就可以生成预测模型了，并存放在当前目录的fluid_models目录下，一共有两文件，分别是model和params，这个跟我们使用paddle.fluid.io.save_inference_model接口是一样的，接口文档在这里。在下一步我们会使用这个模型文件来预测我们的图片。

测试预测模型

获得预测模型之后，我们可以使用它来在PaddlePaddle预测图像，首先要编写一个PaddlePaddle的预测程序：

# coding=utf-8
import os
import time
from PIL import Image
import numpy as np
import paddle.v2 as paddle
import paddle.fluid as fluid


def load_image(file):
    im = Image.open(file)
    im = im.resize((224, 224), Image.ANTIALIAS)
    im = np.array(im).astype(np.float32)
    # PIL打开图片存储顺序为H(高度)，W(宽度)，C(通道)。
    # PaddlePaddle要求数据顺序为CHW，所以需要转换顺序。
    im = im.transpose((2, 0, 1))
    # CIFAR训练图片通道顺序为B(蓝),G(绿),R(红),
    # 而PIL打开图片默认通道顺序为RGB,因为需要交换通道。
    im = im[(2, 1, 0), :, :]  # BGR
    # 减去均值
    mean = [123.68, 116.78, 103.94]
    mean = np.array(mean, dtype=np.float32)
    mean = mean[:, np.newaxis, np.newaxis]
    im -= mean

    return im


def infer_one(image_path, use_cuda, model_path, model_filename, params_filename):
    # 是否使用GPU
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    # 生成调试器
    exe = fluid.Executor(place)

    inference_scope = fluid.core.Scope()
    with fluid.scope_guard(inference_scope):
        # 加载模型
        [inference_program, feed_target_names, fetch_targets] = fluid.io.load_inference_model(model_path, exe, model_filename=model_filename, params_filename=params_filename)

        # 获取预测数据
        test_datas = [load_image(image_path)]
        test_data = np.array(test_datas)

        # 开始预测
        results = exe.run(inference_program,
                          feed={feed_target_names[0]: test_data},
                          fetch_list=fetch_targets)

        results = np.argsort(-results[0])

        result = results[0][0]

        print("infer label is: %d" % result)


if __name__ == '__main__':
    image_path = "0b77aba2-9557-11e8-a47a-c8ff285a4317.jpg"
    use_cuda = False
    model_path = "fluid_models/"
    model_filename = "model"
    params_filename = "params"
    infer_one(image_path, use_cuda, model_path, model_filename, params_filename)

使用上面的程序就是使用转换的模型来预测图片了。要注意训练模型时对图片的处理。