TensorFlow - TF-Slim Usage Overview

1. TF-Slim Installation, Configuration, and API List

1.1 Installing and Configuring TF-Slim

After installing TensorFlow, check that TF-Slim was installed correctly:

python -c "import tensorflow.contrib.slim as slim; eval = slim.evaluation.evaluate_once"

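Because this overview uses TF-Slim for image classification, you also need to install the TF-Slim image models library, tensorflow/models/research/slim. Assume the library is installed at TF_MODELS, and add TF_MODELS/research/slim to the Python path.

Import the Python modules: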

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import matplotlib.pyplot as plt
import math
import numpy as np
import tensorflow as tf
import time

from datasets import dataset_utils

# Main slim library
from tensorflow.contrib import slim

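1.2 API List

By combining variables, layers, and scopes, TF-Slim lets you define models compactly. Each of these elements is described below.

Variables

Creating a variable in native TensorFlow requires either a predefined value or an initialization mechanism. In addition, if a variable must be created on a specific device, such as a GPU, the device has to be specified explicitly. To reduce the code needed for variable creation, TF-Slim provides a set of thin wrapper functions in variables.py that make variables easier to define.

For example, to create a weight variable that is initialized from a truncated normal distribution, regularized with an L2 loss, and placed on the CPU, you only need to declare the following: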

weights = slim.variable('weights',
                        shape=[10, 10, 3, 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')

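Native TensorFlow has two kinds of variables: regular variables and local (transient) variables. The vast majority are regular variables: once created, they can be saved to disk with a Saver. Local variables exist only for the duration of a session and are not saved to disk.

TF-Slim further differentiates variables by defining model variables, which represent the parameters of a model. Model variables are trained or fine-tuned during learning and are loaded from a checkpoint during evaluation or inference. Examples include the variables created by slim.fully_connected or slim.conv2d. Non-model variables are all other variables that are used during learning or evaluation but are not needed to perform inference. For example, global_step is used during learning and evaluation, but it is not part of the model. Similarly, moving-average variables may mirror model variables, but the moving averages are not themselves model variables.

With TF-Slim, both model variables and regular variables can be easily created and retrieved: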

# Model Variables
weights = slim.model_variable('weights',
                              shape=[10, 10, 3, 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()

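How does this work? When you create a model variable through TF-Slim's layers or directly through slim.model_variable, TF-Slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. What if you have your own custom layers or variable-creation routine and still want TF-Slim to manage the model variables? TF-Slim provides a convenience function for adding a model variable to its collection: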

my_model_variable = CreateViaCustomCode()

# Letting TF-Slim know about the additional variable.

slim.add_model_variable(my_model_variable)

Layers

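TensorFlow's set of operations is quite extensive, but developers of neural networks typically think of models in terms of higher-level concepts such as layers, losses, metrics, and networks. A layer, such as a convolutional layer, a fully connected layer, or a batch-normalization layer, is more abstract than a single TensorFlow operation and typically involves several operations. Furthermore, unlike more primitive operations, a layer usually (but not always) has variables (tunable parameters) associated with it. For example, a convolutional layer in a neural network is composed of several low-level operations:

1. Create the weight and bias variables.

2. Convolve the weights with the input from the previous layer.

3. Add the biases to the result of the convolution.

4. Apply an activation function.

Using only plain TensorFlow code, this can be rather laborious: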

input = ...
with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope)

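To alleviate the need to duplicate this code repeatedly, TF-Slim provides a number of convenient operations defined at the more abstract level of neural network layers. For example, compare the code above with the corresponding TF-Slim call: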

input = ...

net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')
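The four steps that a convolutional layer bundles together can be sketched in plain Python. This is a simplified 1-D illustration, not how slim.conv2d is implemented; the function name conv1d_layer and the sample numbers are assumptions chosen for the example.

```python
def conv1d_layer(inputs, kernel, bias):
    """A 1-D 'VALID' convolution layer: slide the kernel, add the bias, apply ReLU."""
    width = len(kernel)
    outputs = []
    for start in range(len(inputs) - width + 1):
        window = inputs[start:start + width]
        # Step 2: convolve the weights with the input window.
        response = sum(x * k for x, k in zip(window, kernel))
        # Step 3: add the bias.
        response += bias
        # Step 4: apply the activation function (ReLU here).
        outputs.append(max(0.0, response))
    return outputs


# Step 1: create the weight and bias "variables" (fixed values in this sketch).
kernel = [1.0, 0.0, -1.0]
bias = 0.5

activations = conv1d_layer([1.0, 2.0, 4.0, 3.0, 0.0], kernel, bias)
print(activations)  # [0.0, 0.0, 4.5]
```

A layer abstraction hides exactly this bookkeeping: the caller supplies the input and the number of outputs, and the variables, bias addition, and activation are handled inside the layer.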

TF-Slim provides standard implementations of numerous components for building neural networks, including:

Layer                       TF-Slim
BiasAdd                     slim.bias_add
BatchNorm                   slim.batch_norm
Conv2d                      slim.conv2d
Conv2dInPlane               slim.conv2d_in_plane
Conv2dTranspose (Deconv)    slim.conv2d_transpose
FullyConnected              slim.fully_connected
AvgPool2D                   slim.avg_pool2d
Dropout                     slim.dropout
Flatten                     slim.flatten
MaxPool2D                   slim.max_pool2d
OneHotEncoding              slim.one_hot_encoding
SeparableConv2D             slim.separable_conv2d
UnitNorm                    slim.unit_norm

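TF-Slim also provides two meta-operations, repeat and stack, that let users apply the same operation repeatedly. For example, consider the following snippet from the VGG network, whose layers perform several convolutions in a row between pooling layers: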

net = ...

net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')

net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')

net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')

net = slim.max_pool2d(net, [2, 2], scope='pool2')

One way to reduce this code duplication is a for loop:

net = ...
for i in range(3):
    net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i+1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')

This can be made even cleaner with TF-Slim's repeat operation:

net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')

net = slim.max_pool2d(net, [2, 2], scope='pool2')

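slim.repeat not only applies the same arguments in-line, it is also smart enough to unroll the scopes, so that the scope assigned to each subsequent call of slim.conv2d is appended with an underscore and the iteration number. More concretely, the scopes in the example above are named 'conv3/conv3_1', 'conv3/conv3_2', and 'conv3/conv3_3'.

Furthermore, TF-Slim's slim.stack operation lets a caller repeatedly apply the same operation with different arguments to create a stack or tower of layers. slim.stack also creates a new scope for each operation it creates. For example, here is a simple way to create a multilayer perceptron (MLP):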

# Verbose way:

x = slim.fully_connected(x, 32, scope='fc/fc_1')

x = slim.fully_connected(x, 64, scope='fc/fc_2')

x = slim.fully_connected(x, 128, scope='fc/fc_3')


# Equivalent, TF-Slim way using slim.stack:

slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')

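In this example, slim.stack calls slim.fully_connected three times, passing the output of one invocation as the input of the next. The number of output units changes from 32 to 64 to 128. Similarly, stack can simplify a tower of multiple convolutions: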

# Verbose way:

x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')

x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')

x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')

x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')

# Using stack:

slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
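To make the scope expansion concrete, here is a minimal pure-Python sketch. It is not TF-Slim's implementation: repeat, stack, and fake_layer below are hypothetical stand-ins that only record which scope name and arguments each call would receive.

```python
def fake_layer(inputs, *args, scope=None):
    """Hypothetical layer: records (scope, args) instead of building graph ops."""
    return inputs + [(scope, args)]


def repeat(inputs, repetitions, layer, *args, scope):
    """Apply `layer` several times with identical arguments, numbering the scopes."""
    outputs = inputs
    for i in range(repetitions):
        outputs = layer(outputs, *args, scope='%s/%s_%d' % (scope, scope, i + 1))
    return outputs


def stack(inputs, layer, stack_args, scope):
    """Apply `layer` once per entry of `stack_args`, numbering the scopes."""
    outputs = inputs
    for i, layer_args in enumerate(stack_args):
        if not isinstance(layer_args, (list, tuple)):
            layer_args = [layer_args]  # a bare value is a single positional argument
        outputs = layer(outputs, *layer_args, scope='%s/%s_%d' % (scope, scope, i + 1))
    return outputs


conv_calls = repeat([], 3, fake_layer, 256, [3, 3], scope='conv3')
fc_calls = stack([], fake_layer, [32, 64, 128], scope='fc')

print([name for name, _ in conv_calls])
# ['conv3/conv3_1', 'conv3/conv3_2', 'conv3/conv3_3']
print(fc_calls)
# [('fc/fc_1', (32,)), ('fc/fc_2', (64,)), ('fc/fc_3', (128,))]
```

The difference between the two helpers is only in where the arguments come from: repeat reuses one argument list, while stack takes one argument entry per layer.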

Scopes

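In addition to the scope mechanisms built into TensorFlow (name_scope and variable_scope), TF-Slim adds a new scoping mechanism called arg_scope. This scope lets a user specify one or more operations and a set of arguments that will be passed to each of those operations inside the arg_scope. Consider the following code: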

net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')

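Clearly, these three convolutional layers share many of the same hyperparameters. Two have the same padding, and all three have the same weights_initializer and weights_regularizer. This code is hard to read and contains many repeated values. One solution is to specify default values using variables: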

padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')

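This approach ensures that all three convolutions share exactly the same parameter values, but it does not reduce the amount of code. With an arg_scope, we can both ensure that each layer uses the same values and simplify the code: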

with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')

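As the example shows, arg_scope makes the code cleaner, simpler, and easier to maintain. Note that although argument values are specified in the arg_scope, they can be overridden locally. In particular, while the padding argument defaults to 'SAME', the second convolution overrides it with 'VALID'. An arg_scope can also be nested, and multiple operations can be specified in the same scope. For example: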

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],
                          weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                          scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')

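In this example, the first arg_scope applies the same weights_initializer and weights_regularizer arguments to the conv2d and fully_connected layers in its scope. In the second arg_scope, additional default arguments are specified for conv2d only.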

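Working Example: Specifying the VGG16 Layers

By combining TF-Slim variables, operations, and scopes, a normally very complex network can be written in very few lines of code. For example, the entire VGG architecture can be defined with just the following snippet: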

def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                        weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.fully_connected(net, 4096, scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 4096, scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
    return net

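Training Models

Training a TensorFlow model requires a model, a loss function, the gradient computation, and a training routine that iteratively updates the model weights. TF-Slim provides common loss functions as well as a set of helper functions that run the training and evaluation routines.

Losses

The loss function defines the quantity we want to minimize. For classification problems, this is typically the cross entropy between the true distribution and the predicted probability distribution. For regression problems, this is often the sum of squared differences between the true and predicted values.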
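As a small worked example of the two losses just described, the following plain-Python sketch computes a cross entropy and a sum of squared differences by hand. The function names and sample values are illustrative assumptions; they are not the slim.losses implementations.

```python
import math


def cross_entropy(true_dist, predicted_dist):
    """Cross entropy H(p, q) = -sum_i p_i * log(q_i), skipping zero-probability terms."""
    return -sum(t * math.log(p) for t, p in zip(true_dist, predicted_dist) if t > 0)


def sum_of_squares(targets, predictions):
    """Sum of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions))


# One-hot label for class 1 and a predicted distribution over three classes.
ce = cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])
# Two regression targets and their predictions.
sq = sum_of_squares([1.0, 2.0], [1.5, 1.0])

print(round(ce, 4))  # 0.3567, i.e. -ln(0.7)
print(sq)            # 1.25, i.e. 0.25 + 1.0
```

With a one-hot label, the cross entropy reduces to the negative log of the probability assigned to the correct class, so a confident correct prediction drives the loss toward zero.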

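Certain models, such as multi-task learning models, require the use of multiple loss functions simultaneously. In other words, the loss function ultimately being minimized is the sum of various other loss functions. For example, consider a model that predicts both the type of scene in an image and the depth of each pixel. This model's loss function is the sum of the classification loss and the depth-prediction loss.

TF-Slim provides an easy-to-use mechanism for defining and keeping track of loss functions via the losses module. Consider the simple case where we want to train the VGG network: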

import tensorflow as tf

slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg

# Load the images and labels.
images, labels = ...

# Create the model.
predictions, _ = vgg.vgg_16(images)

# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)

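In this example, we start by creating the model (using TF-Slim's VGG implementation) and add the standard classification loss. Now let's turn to a multi-task model that produces multiple outputs: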

# Load the images and labels.

images, scene_labels, depth_labels = ...

# Create the model.

scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.

classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)

sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:

total_loss = classification_loss + sum_of_squares_loss

total_loss = slim.losses.get_total_loss(add_regularization_losses=False)

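In this example, we have two losses, which we add by calling slim.losses.softmax_cross_entropy and slim.losses.sum_of_squares. We can obtain the total loss either by adding them together (total_loss) or by calling slim.losses.get_total_loss(). How does this work? When you create a loss function through TF-Slim, TF-Slim adds the loss to a special TensorFlow collection of loss functions. This lets you either manage the total loss manually or let TF-Slim manage it for you.

What if you have a custom loss function and want TF-Slim to manage it too? loss_ops.py also has a function that adds this loss to TF-Slim's collection. For example: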

# Load the images and labels.

images, scene_labels, depth_labels, pose_labels = ...

# Create the model.

scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.

classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)

sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)

slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:

regularization_loss = tf.add_n(slim.losses.get_regularization_losses())

total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization Loss is included in the total loss by default).

total_loss2 = slim.losses.get_total_loss()

In this example, we can again either compute the total loss manually or let TF-Slim know about the additional loss and let TF-Slim handle it.
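The bookkeeping behind a loss collection can be sketched in a few lines of plain Python. This is only an illustration of the idea under simplifying assumptions: losses are plain floats rather than tensors, and the module-level lists and function names below are hypothetical, not the slim.losses API.

```python
# Hypothetical stand-ins for the graph collections that hold losses.
_LOSSES = []
_REGULARIZATION_LOSSES = []


def add_loss(value):
    """Register a task loss so that get_total_loss() can find it later."""
    _LOSSES.append(value)
    return value


def add_regularization_loss(value):
    """Register a regularization term (e.g. an L2 penalty on the weights)."""
    _REGULARIZATION_LOSSES.append(value)
    return value


def get_total_loss(add_regularization_losses=True):
    """Sum every registered loss, optionally including regularization terms."""
    total = sum(_LOSSES)
    if add_regularization_losses:
        total += sum(_REGULARIZATION_LOSSES)
    return total


classification_loss = add_loss(0.5)
sum_of_squares_loss = add_loss(0.25)
pose_loss = add_loss(0.125)           # a custom loss registered by hand
add_regularization_loss(0.0625)

manual_total = classification_loss + sum_of_squares_loss + pose_loss + 0.0625
print(get_total_loss())                                  # 0.9375
print(get_total_loss(add_regularization_losses=False))   # 0.875
```

Because every loss is registered at creation time, the caller never has to thread individual loss values through the code that builds the training op.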

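Training Loop

TF-Slim provides a simple but powerful set of tools for training models in learning.py. These include a train function that repeatedly measures the loss, computes gradients, and saves the model to disk, as well as several convenience functions for manipulating gradients. For example, once we have specified the model, the loss function, and the optimization scheme, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization: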

g = tf.Graph()

# Create the model and specify the losses...
...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)

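In this example, slim.learning.train is given the train_op, which is used both to compute the loss and to apply the gradient step. logdir specifies the directory where checkpoints and event files are stored. The number of gradient steps can be limited to any number; here we ask for 1000 steps. Finally, save_summaries_secs=300 indicates that summaries are computed every 5 minutes, and save_interval_secs=600 indicates that a model checkpoint is saved every 10 minutes.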
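The core of such a training loop, stripped of checkpoints and summaries, is a bounded sequence of "compute loss, compute gradient, apply step" iterations. The sketch below shows that skeleton on a one-parameter quadratic; the train function, its arguments, and the toy objective are assumptions for illustration and are unrelated to the internals of slim.learning.train.

```python
def train(loss_and_grad, w, learning_rate, number_of_steps, log_every_n_steps=None):
    """Run plain gradient descent for a fixed number of steps.

    Returns the final parameter and the loss measured at the last step.
    """
    loss = None
    for step in range(1, number_of_steps + 1):
        loss, grad = loss_and_grad(w)      # measure the loss and its gradient
        w = w - learning_rate * grad       # apply the gradient step
        if log_every_n_steps and step % log_every_n_steps == 0:
            print('step %d: loss = %.6f' % (step, loss))
    return w, loss


def quadratic(w):
    """Toy objective (w - 3)^2 with gradient 2 * (w - 3)."""
    return (w - 3.0) ** 2, 2.0 * (w - 3.0)


final_w, last_loss = train(quadratic, w=0.0, learning_rate=0.1,
                           number_of_steps=1000, log_every_n_steps=250)
print(final_w)
```

Here number_of_steps plays the same role as in the call above: it caps how many gradient steps are taken regardless of whether the loss has converged.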

To illustrate this, let's examine the following example of training the VGG network:

import tensorflow as tf

slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg

...

train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
    tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
    # Set up the data loading:
    images, labels = ...

    # Define the model (vgg_16 returns the logits and a dict of end points):
    predictions, _ = vgg.vgg_16(images, is_training=True)

    # Specify the loss function:
    slim.losses.softmax_cross_entropy(predictions, labels)

    total_loss = slim.losses.get_total_loss()
    tf.summary.scalar('losses/total_loss', total_loss)

    # Specify the optimization scheme:
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

    # create_train_op that ensures that when we evaluate it to get the loss,
    # the update_ops are done and the gradient updates are computed.
    train_tensor = slim.learning.create_train_op(total_loss, optimizer)

    # Actually runs training.
    slim.learning.train(train_tensor, train_log_dir)

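Fine-Tuning Existing Models

Brief Recap on Restoring Variables from a Checkpoint

After a model has been trained, it can be restored using tf.train.Saver(), which restores variables from a given checkpoint. For many cases, tf.train.Saver() provides a simple mechanism for restoring all or just some of the variables.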

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to restore all the variables.
restorer = tf.train.Saver()

# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Do some work with the model
    ...

See Restoring Variables and Choosing which Variables to Save and Restore sections of the Variables page for more details.

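Partially Restoring Models

It is often desirable to fine-tune a pre-trained model on an entirely new dataset or even a new task. In these situations, TF-Slim's helper functions can be used to select a subset of variables to restore: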

# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...

# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])

# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Do some work with the model
    ...

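Restoring Models with Different Variable Names

When restoring variables from a checkpoint, the Saver locates the variable names in the checkpoint file and maps them to variables in the current graph. Above, we created a saver by passing it a list of variables. In that case, the names to locate in the checkpoint file were implicitly obtained from each variable's var.op.name. This works well when the variable names in the checkpoint file match those in the graph. Sometimes, however, we want to restore a model from a checkpoint whose variables have different names from those in the current graph. In that case, we must provide the Saver with a dictionary that maps each checkpoint variable name to the corresponding graph variable. Consider the following example, where the checkpoint variable names are obtained via a simple function: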

# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
    return 'vgg16/' + var.op.name

# Assuming that 'conv1/weights' and 'conv1/bias' should be restored from 'conv1/params1' and 'conv1/params2'
def name_in_checkpoint(var):
    if "weights" in var.op.name:
        return var.op.name.replace("weights", "params1")
    if "bias" in var.op.name:
        return var.op.name.replace("bias", "params2")

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var): var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
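The renaming dictionary can be previewed without TensorFlow by applying the same mapping to plain name strings. In this sketch the variable names are made-up examples, and the final fallback branch is an addition for names that match neither pattern (without it, such names would map to None).

```python
def name_in_checkpoint(graph_name):
    """Map a graph variable name to the name assumed to be used in the checkpoint."""
    if "weights" in graph_name:
        return graph_name.replace("weights", "params1")
    if "bias" in graph_name:
        return graph_name.replace("bias", "params2")
    return graph_name  # fallback added in this sketch: keep the name unchanged


# Hypothetical names of variables in the current graph.
graph_variable_names = ["conv1/weights", "conv1/bias", "global_step"]

# Keys are checkpoint names, values identify the graph variables to fill in.
checkpoint_to_graph = {name_in_checkpoint(n): n for n in graph_variable_names}
print(checkpoint_to_graph)
# {'conv1/params1': 'conv1/weights', 'conv1/params2': 'conv1/bias', 'global_step': 'global_step'}
```

Printing the mapping like this before building a Saver is a cheap way to confirm that every checkpoint key is distinct and spelled as expected.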

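Fine-Tuning a Model on a Different Task

Consider the case where we have a pre-trained VGG16 model. The model was trained on the ImageNet dataset, which has 1000 classes. However, we would like to apply it to the Pascal VOC dataset, which has only 20 classes. To do so, we can initialize the new model with the values of the pre-trained model, excluding the final fully connected layers: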

# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model
predictions, _ = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)

# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)
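The effect of the exclude list can be illustrated on plain name strings. This is a simplified sketch that assumes scope-prefix matching on hypothetical variable names; it is not the implementation of slim.get_variables_to_restore.

```python
def select_variables_to_restore(all_names, exclude):
    """Keep every name that does not live under one of the excluded scopes."""
    def is_excluded(name):
        return any(name == scope or name.startswith(scope + '/') for scope in exclude)
    return [name for name in all_names if not is_excluded(name)]


# Hypothetical variable names of a VGG-style model.
all_names = ['conv1/weights', 'conv5/biases', 'fc6/weights', 'fc7/biases', 'fc8/weights']

to_restore = select_variables_to_restore(all_names, exclude=['fc6', 'fc7', 'fc8'])
print(to_restore)  # ['conv1/weights', 'conv5/biases']
```

Only the convolutional variables survive the filter, so the fully connected layers keep their fresh initialization and can be trained for the new label set.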

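Evaluating Models

Once we have trained a model (or even while the model is still training), we want to see how well it performs in practice. This is accomplished by picking a set of evaluation metrics that grade the model's performance. The evaluation code typically loads the data, performs inference, compares the results to the ground truth, and records the evaluation scores. This step may be performed once or repeated periodically.

Metrics

We define a metric to be a performance measure that is not a loss function (losses are directly optimized during training) but that we are still interested in for the purpose of evaluating the model. For example, we might want to minimize log loss, while our metrics of interest might be the F1 score (a test accuracy measure) or the Intersection over Union (IoU) score, which is not differentiable and therefore cannot be used as a loss.

TF-Slim provides a set of metric operations that make evaluating models easy. Abstractly, computing the value of a metric can be divided into three parts:

       1. Initialization: initialize the variables used to compute the metric.

       2. Aggregation: perform the operations (sums, etc.) used to compute the metric.

       3. Finalization: (optionally) perform any final operation needed to compute the metric value, for example computing means, mins, or maxes.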

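For example, to compute the mean absolute error, two variables, a count and a total, are initialized to zero. During aggregation, we observe a set of predictions and labels, compute their absolute differences, and add the sum to total. Each time we observe another value, count is incremented. Finally, during finalization, total is divided by count to obtain the mean.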
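The three phases can be written out as a small plain-Python class. This is a sketch of the idea only: the class name and its methods are hypothetical, values are plain floats, and count is incremented once per observed value.

```python
class StreamingMeanAbsoluteError:
    """Accumulates the mean absolute error over successive batches."""

    def __init__(self):
        # Initialization: both accumulators start at zero.
        self.total = 0.0
        self.count = 0

    def update(self, predictions, labels):
        """Aggregation: fold one batch into the running totals, then report the value."""
        for prediction, label in zip(predictions, labels):
            self.total += abs(prediction - label)
            self.count += 1
        return self.value()

    def value(self):
        """Finalization: total / count. Reading it does not change any state."""
        return self.total / self.count if self.count else 0.0


metric = StreamingMeanAbsoluteError()
metric.update([1.0, 2.0], [1.5, 2.0])   # absolute errors: 0.5 and 0.0
metric.update([4.0], [3.0])             # absolute error: 1.0
print(metric.value())                   # 0.5, i.e. (0.5 + 0.0 + 1.0) / 3
```

Splitting the metric into an updating call and a read-only call is what allows the same metric to be accumulated over many batches and read at any point.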

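The following example demonstrates the API for declaring metrics. Because metrics are often evaluated on a test set, which is different from the training set (on which the loss is computed), we assume we are using test data: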

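images, labels = LoadTestData(...)
predictions = MyModel(images)

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
# The third argument is the normalizer; using labels here is an assumption.
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels, labels)
# mean_relative_errors is assumed to be a tensor of relative errors defined elsewhere.
pl_value_op, pl_update_op = slim.metrics.streaming_percentage_less(mean_relative_errors, 0.3)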

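As the example illustrates, creating a metric returns two values: a value_op and an update_op. The value_op is an idempotent operation that returns the current value of the metric. The update_op is an operation that performs the aggregation step described above and also returns the value of the metric. Keeping track of each value_op and update_op can be laborious. To deal with this, TF-Slim provides two convenience functions: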

# Aggregates the value and update ops in two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
    slim.metrics.streaming_mean_absolute_error(predictions, labels),
    slim.metrics.streaming_mean_squared_error(predictions, labels))

# Aggregates the value and update ops in two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

Putting it all together:

import tensorflow as tf

slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg

# Load the data
images, labels = load_data(...)

# Define the network
predictions, _ = vgg.vgg_16(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    for batch_id in range(num_batches):
        # list(...) is needed on Python 3, where dict.values() returns a view.
        sess.run(list(names_to_updates.values()))

    metric_values = sess.run(list(names_to_values.values()))
    for metric, value in zip(names_to_values.keys(), metric_values):
        print('Metric %s has value: %f' % (metric, value))

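Note that metric_ops.py can be used in isolation, without layers.py or loss_ops.py.

Evaluation Loop

TF-Slim provides an evaluation module (evaluation.py) that contains helper functions for writing model evaluation scripts using the metrics from metric_ops.py. These include a function that periodically runs evaluations, evaluates metrics over batches of data, and prints and summarizes the metric results. For example: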

import math
import tensorflow as tf

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.accuracy(predictions, labels),
    'precision': slim.metrics.precision(predictions, labels),
    'recall': slim.metrics.recall(predictions, labels),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
  op = tf.summary.scalar(metric_name, metric_value)
  op = tf.Print(op, [metric_value], metric_name)
  summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = int(math.ceil(num_examples / float(batch_size)))

# Setup the global step.
slim.get_or_create_global_step()

checkpoint_dir = ... # Where the model checkpoints are stored.
output_dir = ... # Where the summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    output_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)
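The number of evaluation batches in the listing above is a ceiling division, which guarantees that the final partial batch is still evaluated. The arithmetic for the numbers used there can be checked with a few lines of standard-library Python:

```python
import math

num_examples = 10000
batch_size = 32

# Ceiling division: one extra batch covers the leftover examples.
num_batches = int(math.ceil(num_examples / float(batch_size)))

full_batches, leftover = divmod(num_examples, batch_size)
print(num_batches)             # 313
print(full_batches, leftover)  # 312 16
```

With 10000 examples and batches of 32, there are 312 full batches plus 16 leftover examples, so 313 evaluation steps are needed to see every example once.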

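2. Building Your First Neural Network with TF-Slim

This section uses a simple multilayer perceptron (MLP) to solve a regression problem. The MLP has two hidden layers, and the model output is a single node. When the model function is called, it creates a number of nodes and automatically adds them to the global TF Graph in the current scope. When a layer with tunable parameters (such as a fully connected layer) is created, the parameter variable nodes are created automatically and added to the Graph as well. A variable scope places all the nodes under a common name, so the Graph has a hierarchical structure. This helps when visualizing the TF Graph in TensorBoard and when querying related variables. As specified by the arg_scope, the fully connected layers all use the same L2 weight decay and ReLU activation. (The final layer, however, overrides these defaults and uses an identity activation function.) The example also shows how to add a Dropout layer after the first fully connected layer (FC1). At test time, no dropout nodes are needed; the average activations are used instead. The model therefore needs to know whether it is in the training or the testing phase, because the computational graph differs between the two cases (although the variables that hold the model parameters are shared and have the same name/scope).

2.1 Defining the Regression Model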

def regression_model(inputs, is_training=True, scope="deep_regression"):
    """
    Creates the regression model.

    Args:
        inputs: A node that yields a `Tensor` of size [batch_size, dimensions].
        is_training: Whether or not we're currently training the model.
        scope: An optional variable_op scope for the model.

    Returns:
        predictions: 1-D `Tensor` of shape [batch_size] of responses.
        end_points: A dict of end points representing the hidden layers.
    """
    with tf.variable_scope(scope, 'deep_regression', [inputs]):
        end_points = {}
        # Set the default weights_regularizer and activation for each fully_connected layer.
        with slim.arg_scope([slim.fully_connected],
                            activation_fn=tf.nn.relu,
                            weights_regularizer=slim.l2_regularizer(0.01)):

            # Creates a fully connected layer from the inputs with 32 hidden units.
            net = slim.fully_connected(inputs, 32, scope='fc1')
            end_points['fc1'] = net

            # Adds a dropout layer to prevent over-fitting.
            net = slim.dropout(net, 0.8, is_training=is_training)

            # Adds another fully connected layer with 16 hidden units.
            net = slim.fully_connected(net, 16, scope='fc2')
            end_points['fc2'] = net

            # Creates a fully-connected layer with a single hidden unit. Note that the
            # layer is made linear by setting activation_fn=None.
            predictions = slim.fully_connected(net, 1, activation_fn=None, scope='prediction')
            end_points['out'] = predictions

            return predictions, end_points

2.2 Creating the Model and Inspecting Its Structure

with tf.Graph().as_default():
    # Dummy placeholders for arbitrary number of 1d inputs and outputs
    inputs = tf.placeholder(tf.float32, shape=(None, 1))
    outputs = tf.placeholder(tf.float32, shape=(None, 1))

    # Build the model.
    predictions, end_points = regression_model(inputs) # Adds nodes (tensors) to the Graph.

    # Print the name and shape of each tensor.
    print("Layers")
    for k, v in end_points.items():
        print('name = {}, shape = {}'.format(v.name, v.get_shape()))

    # Print the name and shape of the parameter nodes (values not yet initialized).
    print("\n")
    print("Parameters")
    for v in slim.get_model_variables():
        print('name = {}, shape = {}'.format(v.name, v.get_shape()))

2.3 Generating Random 1-D Regression Data

def produce_batch(batch_size, noise=0.3):
    xs = np.random.random(size=[batch_size, 1]) * 10
    ys = np.sin(xs) + 5 + np.random.normal(size=[batch_size, 1], scale=noise) # Add random noise.
    return [xs.astype(np.float32), ys.astype(np.float32)]

x_train, y_train = produce_batch(200)
x_test, y_test = produce_batch(200)
plt.scatter(x_train, y_train)

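2.4 Fitting the Model

Training the model requires specifying a loss function and an optimizer, and then calling slim.learning.train.

The slim.learning.train function mainly does the following:

  • At each iteration, it evaluates the train_op, which uses the optimizer to update the parameters on the current minibatch of data. It also updates the global_step.
  • It periodically saves model checkpoints to the specified directory, which makes it possible to resume training from a checkpoint.

def convert_data_to_tensors(x, y):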
    inputs = tf.constant(x)
    inputs.set_shape([None, 1])

    outputs = tf.constant(y)
    outputs.set_shape([None, 1])
    return inputs, outputs

# Train the regression model with a mean squared error loss.
ckpt_dir = '/tmp/regression_model/'

with tf.Graph().as_default():
    tf.logging.set_verbosity(tf.logging.INFO) # Enable info-level logging.

    inputs, targets = convert_data_to_tensors(x_train, y_train)

    # Build the model.
    predictions, nodes = regression_model(inputs, is_training=True)

    # Add the loss function to the Graph.
    loss = tf.losses.mean_squared_error(labels=targets, predictions=predictions)

    # The total loss is the user-defined loss plus any regularization losses.
    total_loss = slim.losses.get_total_loss()

    # Specify the optimizer and create the train op:
    optimizer = tf.train.AdamOptimizer(learning_rate=0.005)
    train_op = slim.learning.create_train_op(total_loss, optimizer)

    # Run the training inside a session.
    final_loss = slim.learning.train(
        train_op,
        logdir=ckpt_dir,
        number_of_steps=5000,
        save_summaries_secs=5,
        log_every_n_steps=500)

print("Finished training. Last batch loss:", final_loss)
print("Checkpoint saved in %s" % ckpt_dir)

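2.5 Training with Multiple Loss Functions

Some tasks require optimizing several objectives at once. TF-Slim makes it easy to compute multiple losses. (This example does not optimize the total loss; it only shows how to compute it.)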

with tf.Graph().as_default():
    inputs, targets = convert_data_to_tensors(x_train, y_train)
    predictions, end_points = regression_model(inputs, is_training=True)

    # Add multiple loss nodes to the Graph.
    mean_squared_error_loss = tf.losses.mean_squared_error(labels=targets, predictions=predictions)
    absolute_difference_loss = slim.losses.absolute_difference(predictions, targets)

    # The following two ways to compute the total loss are equivalent.
    regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
    total_loss1 = mean_squared_error_loss + absolute_difference_loss + regularization_loss

    # By default, the regularization loss is included in the total loss.
    # This is useful for training, but not for testing.
    total_loss2 = slim.losses.get_total_loss(add_regularization_losses=True)

    # Initialize the variables.
    init_op = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init_op) # Initialize the parameters with random weights.

        total_loss1, total_loss2 = sess.run([total_loss1, total_loss2])

        print('Total Loss1: %f' % total_loss1)
        print('Total Loss2: %f' % total_loss2)

        print('Regularization Losses:')
        for loss in slim.losses.get_regularization_losses():
            print(loss)

        print('Loss Functions:')
        for loss in slim.losses.get_losses():
            print(loss)

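2.6 Loading a Saved Model to Make Predictions

with tf.Graph().as_default():
    inputs, targets = convert_data_to_tensors(x_test, y_test)

    # Build the model structure. (The parameters are loaded later.)
    predictions, end_points = regression_model(inputs, is_training=False)

    # Create a session and restore the parameters from the checkpoint.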
    sv = tf.train.Supervisor(logdir=ckpt_dir)
    with sv.managed_session() as sess:
        inputs, predictions, targets = sess.run([inputs, predictions, targets])

plt.scatter(inputs, targets, c='r')
plt.scatter(inputs, predictions, c='b')
plt.title('red=true, blue=predicted')

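2.7 Computing Evaluation Metrics on the Test Set

In TF-Slim terminology, losses are optimized, whereas metrics are used only for evaluation, and the two may differ (precision and recall, for example). This example computes the mean squared error and the mean absolute error as metrics.

Each metric declaration creates several local variables (which must be initialized via tf.local_variables_initializer()) and returns both a value_op and an update_op. During evaluation, the value_op returns the current value of the metric. The update_op loads a new batch of data, runs the model to obtain the predictions, and accumulates the metric statistics before returning the current value of the metric. The value nodes and update nodes are stored in two dictionaries.

After the metric nodes are created, they can be passed to slim.evaluation, which evaluates these nodes repeatedly. Finally, the final value of each metric is printed.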

with tf.Graph().as_default():
    inputs, targets = convert_data_to_tensors(x_test, y_test)
    predictions, end_points = regression_model(inputs, is_training=False)

    # Specify metrics to evaluate:
    names_to_value_nodes, names_to_update_nodes = slim.metrics.aggregate_metric_map({
      'Mean Squared Error': slim.metrics.streaming_mean_squared_error(predictions, targets),
      'Mean Absolute Error': slim.metrics.streaming_mean_absolute_error(predictions, targets)
    })

    # Make a session which restores the old graph parameters, and then run eval.
    sv = tf.train.Supervisor(logdir=ckpt_dir)
    with sv.managed_session() as sess:
        metric_values = slim.evaluation.evaluation(
            sess,
            num_evals=1, # Single pass over data
            eval_op=list(names_to_update_nodes.values()),
            final_op=list(names_to_value_nodes.values()))

    names_to_values = dict(zip(names_to_value_nodes.keys(), metric_values))
    for key, value in names_to_values.items():
      print('%s: %f' % (key, value))

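3. Reading Data with TF-Slim

Reading data with TF-Slim involves two main components:

3.1 Dataset

A TF-Slim Dataset contains descriptive information about a dataset that is needed to read it, such as the list of data files and how the data is encoded. It also contains metadata, including the class labels, the sizes of the train/test splits, and descriptions of the tensors that the dataset provides. For example, some datasets contain images and labels, while others additionally provide bounding-box annotations. The Dataset object allows the same API to be used across different data contents and encoding types. The TF-Slim Dataset works especially well for data stored as TFRecords files, where each record contains a tf.train.Example protocol buffer. TF-Slim uses a consistent convention for naming the keys and values of each Example record.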

3.2 DatasetDataProvider

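The TF-Slim DatasetDataProvider is the class that actually reads the data from a dataset. It is highly configurable, so data can be read in the ways that suit different training setups. For example, it can be single-threaded or multi-threaded. If the data is sharded across multiple files, it can read each file sequentially or read from every file simultaneously.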
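The difference between reading shards one after another and reading from all shards at once can be sketched with plain Python lists. This only illustrates the two read orders; it does not use DatasetDataProvider, and the shard contents and function names are made up for the example.

```python
import itertools


def read_sequentially(shards):
    """Exhaust each shard in turn before moving on to the next one."""
    return list(itertools.chain.from_iterable(shards))


def read_interleaved(shards):
    """Take one record from every shard per round until all shards are exhausted."""
    sentinel = object()
    records = []
    for group in itertools.zip_longest(*shards, fillvalue=sentinel):
        records.extend(record for record in group if record is not sentinel)
    return records


# Three hypothetical shards of unequal length.
shards = [['a0', 'a1', 'a2'], ['b0', 'b1'], ['c0']]

print(read_sequentially(shards))  # ['a0', 'a1', 'a2', 'b0', 'b1', 'c0']
print(read_interleaved(shards))   # ['a0', 'b0', 'c0', 'a1', 'b1', 'a2']
```

Interleaved reading mixes records from different files early on, which matters when each shard was written in a sorted or class-grouped order.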

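3.3 Example: The Flowers Dataset

Scripts are provided for converting several common image datasets to the TFRecord format, along with the Dataset descriptions needed to read them.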

  • Download the Flowers dataset in TFRecord format:

import tensorflow as tf
from datasets import dataset_utils

url = "http://download.tensorflow.org/data/flowers.tar.gz"
flowers_data_dir = '/tmp/flowers'

if not tf.gfile.Exists(flowers_data_dir):
    tf.gfile.MakeDirs(flowers_data_dir)

dataset_utils.download_and_uncompress_tarball(url, flowers_data_dir)

  • Visualize a few examples from the Flowers TFRecord data:

from datasets import flowers
import tensorflow as tf
from tensorflow.contrib import slim

with tf.Graph().as_default():
    dataset = flowers.get_split('train', flowers_data_dir)
    data_provider = slim.dataset_data_provider.DatasetDataProvider(
        dataset, common_queue_capacity=32, common_queue_min=1)
    image, label = data_provider.get(['image', 'label'])

    with tf.Session() as sess:
        with slim.queues.QueueRunners(sess):
            for i in range(4):
                np_image, np_label = sess.run([image, label])
                height, width, _ = np_image.shape
                class_name = name = dataset.labels_to_names[np_label]

                plt.figure()
                plt.imshow(np_image)
                plt.title('%s, %d x %d' % (name, height, width))
                plt.axis('off')
                plt.show()

4. Training a CNN

This section trains an image classifier based on a simple CNN.

4.1 Model Definition

def my_cnn(images, num_classes, is_training):  # is_training is not used...
    with slim.arg_scope([slim.max_pool2d], kernel_size=[3, 3], stride=2):
        net = slim.conv2d(images, 64, [5, 5])
        net = slim.max_pool2d(net)
        net = slim.conv2d(net, 64, [5, 5])
        net = slim.max_pool2d(net)
        net = slim.flatten(net)
        net = slim.fully_connected(net, 192)
        net = slim.fully_connected(net, num_classes, activation_fn=None)
        return net
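
To see how many features reach the first fully connected layer, the spatial sizes can be traced by hand. The sketch below (Python 3) does this in plain Python, assuming that `slim.conv2d` defaults to stride 1 with 'SAME' padding and that `slim.max_pool2d` uses 'VALID' padding; the helper names are hypothetical.

```python
import math

def conv_same(size, stride=1):
    """Spatial output size of a 'SAME'-padded convolution."""
    return math.ceil(size / stride)

def pool_valid(size, kernel=3, stride=2):
    """Spatial output size of a 'VALID'-padded pooling layer."""
    return (size - kernel) // stride + 1

def my_cnn_flat_size(height, width, depth=64):
    """Number of features produced by flatten() for a given input size."""
    for _ in range(2):  # two conv + max-pool stages, as in my_cnn
        height, width = conv_same(height), conv_same(width)
        height, width = pool_valid(height), pool_valid(width)
    return height * width * depth

print(my_cnn_flat_size(28, 28))
print(my_cnn_flat_size(299, 299))
```

Under these assumptions the flattened size depends on the input resolution, so the weight matrix of the first fully connected layer is tied to the image size used when the model is built.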

4.2 Apply the model to randomly generated images

import tensorflow as tf

with tf.Graph().as_default():
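    # The model can handle any input size because the first layer is convolutional.
    # The size of the model is determined when the images tensor is first passed to my_cnn.
    # Once the variables are initialized, the size of every weight matrix is fixed.
    # Because of the fully connected layers, all subsequent images must have the
    # same size as the first image.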
    batch_size, height, width, channels = 3, 28, 28, 3
    images = tf.random_uniform([batch_size, height, width, channels], maxval=1)

    # Create the model.
    num_classes = 10
    logits = my_cnn(images, num_classes, is_training=True)
    probabilities = tf.nn.softmax(logits)

    # Initialize all variables (including the model parameters) randomly.
    init_op = tf.global_variables_initializer()

    with tf.Session() as sess:
        # Run init_op, evaluate the model outputs, and print the results:
        sess.run(init_op)
        probabilities = sess.run(probabilities)

print('Probabilities Shape:')
print(probabilities.shape)  # batch_size x num_classes 

print('\nProbabilities:')
print(probabilities)

print('\nSumming across all classes (Should equal 1):')
print(np.sum(probabilities, 1)) # Each row sums to 1
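
The "rows sum to 1" check can be reproduced without TensorFlow. The following is a minimal NumPy sketch of a row-wise softmax applied to random logits; the `softmax` helper and the random inputs are illustrative and stand in for `tf.nn.softmax` and the model output.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() from overflowing."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=1, keepdims=True)

rng = np.random.RandomState(0)
logits = rng.uniform(-5.0, 5.0, size=(3, 10))  # batch_size x num_classes
probs = softmax(logits)

print(probs.shape)
print(np.sum(probs, axis=1))  # each row sums to 1 up to rounding
```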

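4.3 Train the model on the Flowers dataset

This section uses the training function from TF-Slim's learning.py. First, define a load_batch function that loads batches of data from the dataset. Then train the model for a single step (just to demonstrate the API) and evaluate the result.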

from preprocessing import inception_preprocessing
import tensorflow as tf

from tensorflow.contrib import slim


def load_batch(dataset, batch_size=32, height=299, width=299, is_training=False):
    """
    Loads a single batch of data.

    Args:
      dataset: The dataset to load.
      batch_size: The number of images in the batch.
      height: The size of each image after preprocessing.
      width: The size of each image after preprocessing.
      is_training: Whether or not we're currently training or evaluating.

    Returns:
      images: A Tensor of size [batch_size, height, width, 3], image samples that have been preprocessed.
      images_raw: A Tensor of size [batch_size, height, width, 3], image samples that can be used for visualization.
      labels: A Tensor of size [batch_size], whose values range between 0 and dataset.num_classes.
    """
    data_provider = slim.dataset_data_provider.DatasetDataProvider(
        dataset, common_queue_capacity=32,
        common_queue_min=8)
    image_raw, label = data_provider.get(['image', 'label'])

    # Preprocess image for usage by Inception.
    image = inception_preprocessing.preprocess_image(image_raw, height, width, is_training=is_training)

    # Preprocess the image for display purposes.
    image_raw = tf.expand_dims(image_raw, 0)
    image_raw = tf.image.resize_images(image_raw, [height, width])
    image_raw = tf.squeeze(image_raw)

    # Batch it up.
    images, images_raw, labels = tf.train.batch(
          [image, image_raw, label],
          batch_size=batch_size,
          num_threads=1,
          capacity=2 * batch_size)

    return images, images_raw, labels


##
from datasets import flowers

# This might take a few minutes.
train_dir = '/tmp/tfslim_model/'
print('Will save model to %s' % train_dir)

with tf.Graph().as_default():
    tf.logging.set_verbosity(tf.logging.INFO)

    dataset = flowers.get_split('train', flowers_data_dir)
    images, _, labels = load_batch(dataset)

    # Create the model:
    logits = my_cnn(images, num_classes=dataset.num_classes, is_training=True)

    # Specify the loss function:
    one_hot_labels = slim.one_hot_encoding(labels, dataset.num_classes)
    slim.losses.softmax_cross_entropy(logits, one_hot_labels)
    total_loss = slim.losses.get_total_loss()

    # Create some summaries to visualize the training process:
    tf.summary.scalar('losses/Total Loss', total_loss)

    # Specify the optimizer and create the train op:
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    train_op = slim.learning.create_train_op(total_loss, optimizer)

    # Run the training:
    final_loss = slim.learning.train(
      train_op,
      logdir=train_dir,
      number_of_steps=1, # For speed, we run just 1 training step
      save_summaries_secs=1)

    print('Finished training. Final batch loss %f' % final_loss)
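
For intuition about the number being minimized, here is a minimal NumPy sketch of softmax cross-entropy with one-hot labels. It is not the TF-Slim implementation: the helper names are hypothetical, and it ignores the weights, label smoothing, and regularization terms that the library loss and `get_total_loss` can include.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape [batch] into a [batch, num_classes] matrix."""
    return np.eye(num_classes)[labels]

def softmax_cross_entropy(logits, one_hot_labels):
    """Mean over the batch of -sum(labels * log_softmax(logits))."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    log_probs = shifted - np.log(np.sum(np.exp(shifted), axis=1, keepdims=True))
    return float(np.mean(-np.sum(one_hot_labels * log_probs, axis=1)))

logits = np.zeros((4, 5))          # uniform predictions over 5 classes
labels = np.array([0, 1, 2, 3])
loss = softmax_cross_entropy(logits, one_hot(labels, 5))
print(loss)  # equals log(5) for uniform predictions
```

A loss near log(num_classes) is what an untrained classifier that predicts uniformly would produce, which gives a baseline for reading the value reported after a single step.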

4.4 Evaluation metrics

This example computes prediction accuracy and top-5 classification accuracy (recall@5).

from datasets import flowers

# This might take a few minutes.
with tf.Graph().as_default():
    tf.logging.set_verbosity(tf.logging.DEBUG)

    dataset = flowers.get_split('train', flowers_data_dir)
    images, _, labels = load_batch(dataset)

    logits = my_cnn(images, num_classes=dataset.num_classes, is_training=False)
    predictions = tf.argmax(logits, 1)

    # Define the metrics:
    names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
        'eval/Accuracy': slim.metrics.streaming_accuracy(predictions, labels),
        'eval/Recall@5': slim.metrics.streaming_recall_at_k(logits, labels, 5),
    })

    print('Running evaluation Loop...')
    checkpoint_path = tf.train.latest_checkpoint(train_dir)
    metric_values = slim.evaluation.evaluate_once(
        master='',
        checkpoint_path=checkpoint_path,
        logdir=train_dir,
        eval_op=list(names_to_updates.values()),
        final_op=list(names_to_values.values()))

    names_to_values = dict(zip(names_to_values.keys(), metric_values))
    for name in names_to_values:
        print('%s: %f' % (name, names_to_values[name]))
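
Streaming metrics come as a value/update pair: the update folds one batch into running totals, and the value reads the aggregate. The NumPy sketch below imitates that split for accuracy and recall@k. The `StreamingMetrics` class is hypothetical and only mirrors the idea, not the TF-Slim implementation.

```python
import numpy as np

class StreamingMetrics(object):
    """Hypothetical helper; accumulates accuracy and recall@k over batches."""

    def __init__(self, k=5):
        self.k = k
        self.total = 0
        self.correct = 0
        self.in_top_k = 0

    def update(self, logits, labels):
        # Role of the "update op": fold one batch into the running counts.
        top_k = np.argsort(-logits, axis=1)[:, :self.k]
        self.total += len(labels)
        self.correct += int(np.sum(top_k[:, 0] == labels))
        self.in_top_k += int(np.sum(np.any(top_k == labels[:, None], axis=1)))

    def values(self):
        # Role of the "value op": read the metrics accumulated so far.
        return {'eval/Accuracy': self.correct / float(self.total),
                'eval/Recall@%d' % self.k: self.in_top_k / float(self.total)}

metrics = StreamingMetrics(k=2)
metrics.update(np.array([[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]), np.array([1, 1]))
metrics.update(np.array([[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]), np.array([0, 0]))
print(metrics.values())
```

In this toy run two of the four predictions are correct and three of the four labels fall in the top 2, so the aggregated values are 0.5 and 0.75.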

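5. Using pre-trained models

Neural networks work best when they have many parameters, which makes them very flexible function approximators. That flexibility means they have to be trained on large datasets. Because such training is slow, TensorFlow provides many pre-trained models; see the Pre-trained Models list.

An open-source pre-trained model can be used as-is or adapted to a specific task. A common approach is to replace the final pre-softmax layer with newly initialized weights sized for the new set of class labels, then fine-tune. This is especially helpful when the new dataset is small.

The examples below use [inception-v1]: [inception-v3] is more accurate, but the former is faster.

Note that the final layer of VGG and ResNet has 1000 outputs rather than 1001. The ImageNet label set provided here includes a background class, but the VGG and ResNet models do not use it.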
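
The practical consequence of the 1000 versus 1001 difference is an off-by-one shift when mapping a predicted index to a readable name. The sketch below illustrates it with a hypothetical, shortened label map in which index 0 is the background class; `name_for_prediction` is an illustrative helper, not a library function.

```python
# Hypothetical, shortened label map in the 1001-class layout: index 0 is "background".
names_1001 = {0: 'background', 1: 'tench', 2: 'goldfish', 3: 'great white shark'}

def name_for_prediction(index, num_model_classes, names):
    """Map an argmax index to a readable name.

    Models without a background output are shifted by one
    relative to a label map that includes the background entry.
    """
    offset = len(names) - num_model_classes  # 0 with background, 1 without
    return names[index + offset]

# A model with a background output (Inception-style): indices line up.
print(name_for_prediction(1, num_model_classes=4, names=names_1001))
# A model without it (VGG/ResNet-style): shift the index by one.
print(name_for_prediction(0, num_model_classes=3, names=names_1001))
```

Both calls resolve to the same class name, which is why code for a 1000-output model looks up the label at the predicted index plus one.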

Examples using the Inception V1 and VGG-16 pre-trained models follow.

5.1 Download the Inception V1 checkpoint

from datasets import dataset_utils

url = "http://download.tensorflow.org/models/inception_v1_2016_08_28.tar.gz"
checkpoints_dir = '/tmp/checkpoints'

if not tf.gfile.Exists(checkpoints_dir):
    tf.gfile.MakeDirs(checkpoints_dir)

dataset_utils.download_and_uncompress_tarball(url, checkpoints_dir)

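5.2 Apply the pre-trained Inception V1 model

Each image must be resized to the input size that the model checkpoint expects; the preprocessing function used below takes care of this.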

import numpy as np
import os
import tensorflow as tf

try:
    import urllib2 as urllib
except ImportError:
    import urllib.request as urllib

from datasets import imagenet
from nets import inception
from preprocessing import inception_preprocessing

from tensorflow.contrib import slim

image_size = inception.inception_v1.default_image_size  # input image size

with tf.Graph().as_default():
    url = 'https://upload.wikimedia.org/wikipedia/commons/7/70/EnglishCockerSpaniel_simon.jpg'
    image_string = urllib.urlopen(url).read()
    image = tf.image.decode_jpeg(image_string, channels=3)
    processed_image = inception_preprocessing.preprocess_image(image, image_size, image_size, is_training=False)
    processed_images  = tf.expand_dims(processed_image, 0)

    # Create the model, use the default arg scope to configure the batch norm parameters.
    with slim.arg_scope(inception.inception_v1_arg_scope()):
        logits, _ = inception.inception_v1(processed_images, num_classes=1001, is_training=False)
    probabilities = tf.nn.softmax(logits)

    init_fn = slim.assign_from_checkpoint_fn(
        os.path.join(checkpoints_dir, 'inception_v1.ckpt'),
        slim.get_model_variables('InceptionV1'))

    with tf.Session() as sess:
        init_fn(sess)
        np_image, probabilities = sess.run([image, probabilities])
        probabilities = probabilities[0, 0:]
        sorted_inds = [i[0] for i in sorted(enumerate(-probabilities), key=lambda x:x[1])]

    plt.figure()
    plt.imshow(np_image.astype(np.uint8))
    plt.axis('off')
    plt.show()

    names = imagenet.create_readable_names_for_imagenet_labels()
    for i in range(5):
        index = sorted_inds[i]
        print('Probability %0.2f%% => [%s]' % (probabilities[index] * 100, names[index]))
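
The index-sorting idiom used above can also be written with `np.argsort`. The sketch below compares the two on a made-up probability vector (the values are illustrative, not model output); the two orderings agree whenever the probabilities are distinct.

```python
import numpy as np

probabilities = np.array([0.05, 0.40, 0.10, 0.25, 0.02, 0.15, 0.03])

# Idiom used in the walkthrough: sort (index, -probability) pairs.
sorted_inds = [i[0] for i in sorted(enumerate(-probabilities), key=lambda x: x[1])]

# Equivalent NumPy form: indices ordered by decreasing probability.
top5 = np.argsort(-probabilities)[:5]

for index in top5:
    print('Probability %0.2f%% => [class %d]' % (probabilities[index] * 100, index))
```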

5.3 Download the VGG-16 checkpoint

from datasets import dataset_utils
import tensorflow as tf

url = "http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz"
checkpoints_dir = '/tmp/checkpoints'

if not tf.gfile.Exists(checkpoints_dir):
    tf.gfile.MakeDirs(checkpoints_dir)

dataset_utils.download_and_uncompress_tarball(url, checkpoints_dir)

5.4 Apply the pre-trained VGG-16 model

Note: this model has 1000 classes rather than 1001.

import numpy as np
import os
import tensorflow as tf

try:
    import urllib2 as urllib
except ImportError:
    import urllib.request as urllib

from datasets import imagenet
from nets import vgg
from preprocessing import vgg_preprocessing

from tensorflow.contrib import slim

image_size = vgg.vgg_16.default_image_size

with tf.Graph().as_default():
    url = 'https://upload.wikimedia.org/wikipedia/commons/d/d9/First_Student_IC_school_bus_202076.jpg'
    image_string = urllib.urlopen(url).read()
    image = tf.image.decode_jpeg(image_string, channels=3)
    processed_image = vgg_preprocessing.preprocess_image(image, image_size, image_size, is_training=False)
    processed_images  = tf.expand_dims(processed_image, 0)

    # Create the model, using the default VGG arg scope.
    with slim.arg_scope(vgg.vgg_arg_scope()):
        # 1000 classes instead of 1001.
        logits, _ = vgg.vgg_16(processed_images, num_classes=1000, is_training=False)
    probabilities = tf.nn.softmax(logits)

    init_fn = slim.assign_from_checkpoint_fn(
        os.path.join(checkpoints_dir, 'vgg_16.ckpt'),
        slim.get_model_variables('vgg_16'))

    with tf.Session() as sess:
        init_fn(sess)
        np_image, probabilities = sess.run([image, probabilities])
        probabilities = probabilities[0, 0:]
        sorted_inds = [i[0] for i in sorted(enumerate(-probabilities), key=lambda x:x[1])]

    plt.figure()
    plt.imshow(np_image.astype(np.uint8))
    plt.axis('off')
    plt.show()

    names = imagenet.create_readable_names_for_imagenet_labels()
    for i in range(5):
        index = sorted_inds[i]
        # Shift the index of a class name by one. 
        print('Probability %0.2f%% => [%s]' % (probabilities[index] * 100, names[index+1]))

5.5 Fine-tune the model on a new dataset

Fine-tune the Inception model on the Flowers dataset.

# Note that this may take several minutes.

import os

from datasets import flowers
from nets import inception
from preprocessing import inception_preprocessing

from tensorflow.contrib import slim
image_size = inception.inception_v1.default_image_size


def get_init_fn():
    """Returns a function run by the chief worker to warm-start the training."""
    checkpoint_exclude_scopes = ["InceptionV1/Logits", "InceptionV1/AuxLogits"]  # the original output layers

    exclusions = [scope.strip() for scope in checkpoint_exclude_scopes]

    variables_to_restore = []
    for var in slim.get_model_variables():
        for exclusion in exclusions:
            if var.op.name.startswith(exclusion):
                break
        else:
            variables_to_restore.append(var)

    return slim.assign_from_checkpoint_fn(
        os.path.join(checkpoints_dir, 'inception_v1.ckpt'),
        variables_to_restore)


train_dir = '/tmp/inception_finetuned/'

with tf.Graph().as_default():
    tf.logging.set_verbosity(tf.logging.INFO)

    dataset = flowers.get_split('train', flowers_data_dir)
    images, _, labels = load_batch(dataset, height=image_size, width=image_size)

    # Create the model, use the default arg scope to configure the batch norm parameters.
    with slim.arg_scope(inception.inception_v1_arg_scope()):
        logits, _ = inception.inception_v1(images, num_classes=dataset.num_classes, is_training=True)

    # Specify the loss function:
    one_hot_labels = slim.one_hot_encoding(labels, dataset.num_classes)
    slim.losses.softmax_cross_entropy(logits, one_hot_labels)
    total_loss = slim.losses.get_total_loss()

    # Create some summaries to visualize the training process:
    tf.summary.scalar('losses/Total Loss', total_loss)

    # Specify the optimizer and create the train op:
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    train_op = slim.learning.create_train_op(total_loss, optimizer)

    # Run the training:
    final_loss = slim.learning.train(train_op,
                                     logdir=train_dir,
                                     init_fn=get_init_fn(),
                                     number_of_steps=2)


print('Finished training. Last batch loss %f' % final_loss)
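
The core of get_init_fn is a prefix filter over variable names: everything under an excluded scope is left at its fresh initialization, and the rest is restored from the checkpoint. A minimal pure-Python sketch of that filter follows; the helper name and the variable names in `all_names` are hypothetical examples in the style of the Inception V1 graph.

```python
def select_variables_to_restore(variable_names, exclude_scopes):
    """Keep every name that does not start with one of the excluded scopes."""
    prefixes = tuple(scope.strip() for scope in exclude_scopes)
    return [name for name in variable_names if not name.startswith(prefixes)]

# Hypothetical variable names, for illustration only.
all_names = [
    'InceptionV1/Conv2d_1a_7x7/weights',
    'InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/weights',
    'InceptionV1/Logits/Conv2d_0c_1x1/weights',
    'InceptionV1/Logits/Conv2d_0c_1x1/biases',
]
restored = select_variables_to_restore(
    all_names, ['InceptionV1/Logits', 'InceptionV1/AuxLogits'])
print(restored)
```

Excluding the logits scopes matters because the new output layer is sized for the new dataset's class count, so its shape would not match the weights stored in the checkpoint.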

5.6 Apply the fine-tuned model to the new dataset

import numpy as np
import tensorflow as tf
from datasets import flowers
from nets import inception

from tensorflow.contrib import slim

image_size = inception.inception_v1.default_image_size
batch_size = 3

with tf.Graph().as_default():
    tf.logging.set_verbosity(tf.logging.INFO)

    dataset = flowers.get_split('train', flowers_data_dir)
    images, images_raw, labels = load_batch(dataset, height=image_size, width=image_size)

    # Create the model, use the default arg scope to configure the batch norm parameters.
    with slim.arg_scope(inception.inception_v1_arg_scope()):
        logits, _ = inception.inception_v1(images, num_classes=dataset.num_classes, is_training=True)

    probabilities = tf.nn.softmax(logits)

    checkpoint_path = tf.train.latest_checkpoint(train_dir)
    init_fn = slim.assign_from_checkpoint_fn(checkpoint_path,
                                             slim.get_variables_to_restore())

    with tf.Session() as sess:
        with slim.queues.QueueRunners(sess):
            sess.run(tf.local_variables_initializer())
            init_fn(sess)
            np_probabilities, np_images_raw, np_labels = sess.run([probabilities, images_raw, labels])

            for i in range(batch_size): 
                image = np_images_raw[i, :, :, :]
                true_label = np_labels[i]
                predicted_label = np.argmax(np_probabilities[i, :])
                predicted_name = dataset.labels_to_names[predicted_label]
                true_name = dataset.labels_to_names[true_label]

                plt.figure()
                plt.imshow(image.astype(np.uint8))
                plt.title('Ground Truth: [%s], Prediction [%s]' % (true_name, predicted_name))
                plt.axis('off')
                plt.show()

Original source: https://github.com/tensorflow/models/blob/master/research/slim/slim_walkthrough.ipynb
