0. Preface:

1. A few habits and methods, distilled

2. Getting started with image recognition

2.1 The VGG-16 network

VGG is a remarkably clean deep neural network; what we care about here is the network structure shown in the figure above.

```python
# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

# Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
```

One of Keras's great strengths is how cleanly its interfaces are wrapped, with naming that is entirely self-explanatory: you can guess what a function does from its name and what a parameter controls from its name. The code above is a typical example. Compare it with Figure 1: the structure from Block 1 through Block 5 matches the figure exactly, each block consisting of two or three convolutional layers followed by a pooling layer. We will not go through the individual interfaces here; interested readers can consult the Keras documentation for a more detailed description of the parameters.
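A quick sanity check on the geometry of these five blocks: with `padding='same'` every convolution preserves the spatial size, while each 2x2, stride-2 pooling layer halves it, so a 224x224 input (the usual VGG-16 input size, not shown in the excerpt) leaves `block5_pool` at 7x7. A minimal sketch of that arithmetic:

```python
def spatial_size_after_blocks(input_size, num_blocks=5, pool_stride=2):
    # 'same' convs preserve the spatial size; each 2x2/stride-2 max pool halves it
    size = input_size
    for _ in range(num_blocks):
        size //= pool_stride
    return size

final_size = spatial_size_after_blocks(224)  # 224 -> 112 -> 56 -> 28 -> 14 -> 7
```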

```python
def conv_block(ipt, num_filter, groups, dropouts, num_channels=None):
    return paddle.networks.img_conv_group(
        input=ipt,
        num_channels=num_channels,
        pool_size=2,
        pool_stride=2,
        conv_num_filter=[num_filter] * groups,
        conv_filter_size=3,
        conv_act=paddle.activation.Relu(),
        conv_with_batchnorm=True,
        conv_batchnorm_drop_rate=dropouts,
        pool_type=paddle.pooling.Max())

conv1 = conv_block(input, 64, 2, [0.3, 0], 3)
conv2 = conv_block(conv1, 128, 2, [0.4, 0])
conv3 = conv_block(conv2, 256, 3, [0.4, 0.4, 0])
conv4 = conv_block(conv3, 512, 3, [0.4, 0.4, 0])
conv5 = conv_block(conv4, 512, 3, [0.4, 0.4, 0])
```

Image Convolution Group, used for VGG net.

Parameters:

- input (LayerOutput) – input layer.
- conv_num_filter (list|tuple) – list of output channel counts.
- pool_size (int) – pooling filter size.
- num_channels (int) – number of input channels.
- conv_padding (int) – convolution padding size.
- conv_filter_size (int) – convolution filter size.
- conv_act (BaseActivation) – activation function applied after each convolution.
- conv_with_batchnorm (list) – if conv_with_batchnorm[i] is true, a batch normalization operation follows the i-th convolution.
- conv_batchnorm_drop_rate (list) – if conv_with_batchnorm[i] is true, conv_batchnorm_drop_rate[i] is the drop rate of the corresponding batch norm.
- pool_stride (int) – pooling stride size.
- pool_type (BasePoolingType) – pooling type.
- param_attr (ParameterAttribute) – parameter attribute of the convolution layers; None means the default attribute.

Returns:

layer's output

In the `conv1` line, `input` is the input image; 64 means each convolutional layer in this block has 64 feature maps; 2 means the block has two convolutional layers; [0.3, 0] gives the dropout probability of each convolutional layer in turn; and 3 is the number of channels of the input image (RGB or BGR; this argument is required for the first block, whose input comes directly from the image).

In the next line, `conv1` is the input of the whole block, i.e. the output of the previous block; 128 means each convolutional layer in this block has 128 feature maps; 2 means the block has two convolutional layers; and [0.4, 0] gives the dropout probability of each convolutional layer in turn.
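Inside `conv_block`, the `groups` argument becomes the number of convolutional layers through a simple list expansion that is passed on as `conv_num_filter`:

```python
num_filter, groups = 128, 2
conv_num_filter = [num_filter] * groups  # one entry per conv layer in the block
print(conv_num_filter)  # prints [128, 128]
```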

```python
def img_conv_group(input,
                   conv_num_filter,
                   pool_size,
                   num_channels=None,
                   conv_padding=1,
                   conv_filter_size=3,
                   conv_act=None,
                   conv_with_batchnorm=False,
                   conv_batchnorm_drop_rate=0,
                   pool_stride=1,
                   pool_type=None,
                   param_attr=None):
    ...
    return img_pool_layer(
        input=tmp, stride=pool_stride, pool_size=pool_size, pool_type=pool_type)
```

```python
    # tail of vgg_bn_drop: two fully connected layers with batch_norm in between
    drop = paddle.layer.dropout(input=conv5, dropout_rate=0.5)
    fc1 = paddle.layer.fc(input=drop, size=512, act=paddle.activation.Linear())
    bn = paddle.layer.batch_norm(
        input=fc1,
        act=paddle.activation.Relu(),
        layer_attr=paddle.attr.Extra(drop_rate=0.5))
    fc2 = paddle.layer.fc(input=bn, size=512, act=paddle.activation.Linear())
    return fc2
```

(Strictly speaking, batch_norm here is not a real layer in the usual sense: its job is to normalize the previous layer's output to zero mean and unit variance.)
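As a tiny numeric illustration of that normalization step, here is the zero-mean/unit-variance transform in plain Python, independent of any framework (a real batch_norm layer additionally learns a per-channel scale and shift, which this sketch omits):

```python
import math

def normalize_batch(xs, eps=1e-5):
    # Shift to zero mean, then scale to unit variance -- the core of batch norm.
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

out = normalize_batch([1.0, 2.0, 3.0, 4.0])
# out now has mean ~0 and variance ~1
```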

```python
net = vgg_bn_drop(image)
```

2.2 The classification_cost function

```python
def classification_cost(input,
                        label,
                        weight=None,
                        name=None,
                        evaluator=classification_error_evaluator,
                        layer_attr=None,
                        coeff=1.):
    ...
    Layer(
        name=name,
        type="multi-class-cross-entropy",
        inputs=ipts,
        coeff=coeff,
        **ExtraLayerAttribute.to_kwargs(layer_attr))
```

```cpp
if (type == "multi-class-cross-entropy")
  return LayerPtr(new MultiClassCrossEntropy(config));
```

```cpp
/**
 * The cross-entropy loss for multi-class classification task.
 * The loss function is:
 *
 * \f[
 * L = - \sum_{k}{t_{k} \log(P(y=k))}
 * \f]
 */
class MultiClassCrossEntropy : public CostLayer {
public:
  explicit MultiClassCrossEntropy(const LayerConfig& config)
      : CostLayer(config) {}

  bool init(const LayerMap& layerMap,
            const ParameterMap& parameterMap) override;

  void forwardImp(Matrix& output, Argument& label, Matrix& cost) override;

  void backwardImp(Matrix& outputValue,
                   Argument& label,
                   Matrix& outputGrad) override;
};
```
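To make the loss formula concrete: with a one-hot target t, only the true class contributes to the sum, so the loss reduces to minus the log of the probability the network assigns to the correct label. A minimal sketch in plain Python (the 3-class probability vector is made up for illustration):

```python
import math

def multi_class_cross_entropy(probs, label):
    # L = -sum_k t_k * log(P(y=k)); with a one-hot t this is -log(probs[label])
    return -math.log(probs[label])

probs = [0.7, 0.2, 0.1]  # softmax output over 3 classes
loss = multi_class_cross_entropy(probs, 0)
# a confident, correct prediction yields a small loss (-log 0.7 ~ 0.357)
```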

2.3 The Momentum update method

```python
# Create optimizer
momentum_optimizer = paddle.optimizer.Momentum(
    momentum=0.9,
    regularization=paddle.optimizer.L2Regularization(rate=0.0002 * 128),
    learning_rate=0.1 / 128.0,
    learning_rate_decay_a=0.1,
    learning_rate_decay_b=50000 * 100,
    learning_rate_schedule='discexp')
```
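The 'discexp' schedule is a discrete exponential decay: the learning rate is multiplied by `learning_rate_decay_a` once every `learning_rate_decay_b` processed samples. A sketch of that rule applied to the settings above (the formula is our reading of the schedule name and parameters, and the helper `discexp_lr` is ours, not a PaddlePaddle API):

```python
def discexp_lr(base_lr, decay_a, decay_b, num_samples):
    # lr = base_lr * decay_a ** floor(num_samples / decay_b)
    return base_lr * decay_a ** (num_samples // decay_b)

base_lr = 0.1 / 128.0
lr_start = discexp_lr(base_lr, 0.1, 50000 * 100, 0)
lr_later = discexp_lr(base_lr, 0.1, 50000 * 100, 50000 * 100)
# after 5,000,000 samples the learning rate has dropped by a factor of 10
```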

```python
class Optimizer(object):
    def __init__(self, **kwargs):
        if 'batch_size' in kwargs:
            del kwargs['batch_size']  # not important for python library.

        def __impl__():
            v1_optimizers.settings(batch_size=1, **kwargs)

        self.__opt_conf_proto__ = config_parser_utils.parse_optimizer_config(
            __impl__)
        self.__opt_conf__ = swig_api.OptimizationConfig.createFromProto(
            self.__opt_conf_proto__)
```

```python
def settings(batch_size,
             learning_rate=1e-3,
             learning_rate_decay_a=0.,
             learning_rate_decay_b=0.,
             learning_rate_schedule='poly',
             learning_rate_args='',
             learning_method=None,
             regularization=None,
             is_async=False,
             model_average=None,
             ...
```

2.4 Miscellaneous

3. Summary:
