LeNet-5, published in 1998, marked the real debut of CNNs.
The network achieved over 99% accuracy on character recognition, and so it was used mainly for that task.
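However, the model did not catch on for quite some time afterwards. The main reasons were that it was computationally expensive (GPUs were not yet available at the time) and that, on non-OCR tasks, other algorithms (such as SVM) could match or even beat its results.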
LeNet = (conv + maxpooling) × 2 + fc × 2 + Gaussian
Here, each "rectangle" in the figure represents one feature map, and the network ends with two fully connected layers.
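As a rough illustration of what one (conv + maxpooling) stage does to a feature map, here is a minimal pure-Python sketch. The 6×6 input, the 3×3 kernel, and the helper names `conv2d_valid` and `max_pool` are hypothetical, chosen only for illustration; they are not part of LeNet or Caffe. The sketch assumes a single channel, a square kernel, stride 1 and no padding for the convolution, and, like most CNN frameworks, it computes a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a 2D image with stride 1 and no padding."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k) for dj in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2, stride=2):
    """Keep only the maximum of each size x size window."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[i * stride + di][j * stride + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 6x6 input and 3x3 kernel, for illustration only.
image = [[(r * 6 + c) % 7 for c in range(6)] for r in range(6)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

feature_map = conv2d_valid(image, kernel)  # 6x6 -> 4x4
pooled = max_pool(feature_map)             # 4x4 -> 2x2
print(len(feature_map), len(feature_map[0]))
print(len(pooled), len(pooled[0]))
```

The convolution shrinks each side by `kernel_size - 1`, and the 2×2 pooling with stride 2 then halves it. In a real layer, each output feature map sums such responses over all input channels and adds a bias.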
The reference code is taken from BVLC/caffe/examples/mnist/lenet.prototxt:
name: "LeNet"
# ========== Input ==========
layer {
name: "data"
type: "Input"
top: "data"
input_param { shape: { dim: 64 dim: 1 dim: 28 dim: 28 } }
}
# ========== Layer 1 ==========
layer { # convolution
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer { # max pooling
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
# ========== Layer 2 ==========
layer { # convolution
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer { # max pooling
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
# ========== Layer 3 ==========
layer { # fully connected
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer { # ReLU activation
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
# ========== Layer 4 ==========
layer { # fully connected
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer { # Softmax
name: "prob"
type: "Softmax"
bottom: "ip2"
top: "prob"
}
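To see how the feature-map sizes and parameter counts follow from the configuration above, the following Python sketch traces one 1×28×28 input through the layers. It is only a back-of-the-envelope check, not part of Caffe: the `layers` table is transcribed by hand from the prototxt, and the helper names `out_size` and `trace` are hypothetical. It assumes no padding. The sizes here divide evenly, so the rounding rule does not matter. ReLU and Softmax are omitted because they change neither the shape nor the parameter count.

```python
def out_size(size, kernel, stride=1):
    """Output side length of a conv/pool window with no padding."""
    return (size - kernel) // stride + 1


# (name, kind, num_output, kernel_size, stride), transcribed from the prototxt.
layers = [
    ("conv1", "conv", 20, 5, 1),
    ("pool1", "pool", None, 2, 2),
    ("conv2", "conv", 50, 5, 1),
    ("pool2", "pool", None, 2, 2),
    ("ip1", "fc", 500, None, None),
    ("ip2", "fc", 10, None, None),
]


def trace(channels=1, height=28, width=28):
    """Return (name, output shape, parameter count) for every layer."""
    rows = []
    c, h, w = channels, height, width
    for name, kind, num_output, kernel, stride in layers:
        if kind == "conv":
            # One kernel per (output, input) channel pair, plus one bias per output.
            params = num_output * c * kernel * kernel + num_output
            c, h, w = num_output, out_size(h, kernel, stride), out_size(w, kernel, stride)
        elif kind == "pool":
            params = 0
            h, w = out_size(h, kernel, stride), out_size(w, kernel, stride)
        else:
            # Fully connected: the input is flattened to c * h * w values.
            params = num_output * c * h * w + num_output
            c, h, w = num_output, 1, 1
        rows.append((name, (c, h, w), params))
    return rows


rows = trace()
for name, shape, params in rows:
    print(f"{name:6s} {shape!s:14s} {params}")
print("total", sum(p for _, _, p in rows))
```

Under these assumptions the shapes go 20×24×24, 20×12×12, 50×8×8, 50×4×4, then 500 and 10 units, for a total of 431,080 learnable parameters. Most of them (400,500) sit in `ip1`, the first fully connected layer.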