For training, the issues are mainly related to the BN layer:
layer {
  name: "conv6"
  type: "Convolution"
  bottom: "conv5_4"
  top: "conv6"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 1
  }
  convolution_param {
    num_output: 21
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "msra"
    }
  }
}
And what are the differences between your BN layer and the one from BVLC/caffe?
- We assume that newly added layers should have a larger lr than layers that are initialized from pretrained models, hence the lr_mult of 10 and 20 on conv6.
- For the BN problem, the one in ‘BVLC/caffe’ does the normalization first and is then followed by a ‘scale’ layer that learns the transform. The one in this repo merges the ‘slope’ and ‘bias’ of the ‘scale’ layer into the BN layer itself.
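As a rough illustration of the BVLC/caffe side of this comparison, the normalization and the learned slope/bias are typically written as two separate layers. This is only a sketch: the blob name conv_x and the layer names are made up for the example, while BatchNorm, Scale, use_global_stats and bias_term are the stock BVLC/caffe fields.

```prototxt
# Hypothetical example: BVLC/caffe-style batch norm applied to a blob "conv_x".
# Step 1: normalize with the batch (or running) mean and variance.
layer {
  name: "conv_x/bn"        # made-up name
  type: "BatchNorm"
  bottom: "conv_x"
  top: "conv_x"
  # The three blobs hold running statistics, so the solver does not update them.
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  batch_norm_param {
    use_global_stats: false  # false for training, true for testing
  }
}
# Step 2: a separate Scale layer learns the per-channel slope and bias.
layer {
  name: "conv_x/scale"     # made-up name
  type: "Scale"
  bottom: "conv_x"
  top: "conv_x"
  scale_param {
    bias_term: true        # learn a bias in addition to the slope
  }
}
```

With the BN layer in this repo, the slope and bias are held by the BN layer itself, so the second layer above would not be needed.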
cp new_caffe/include/caffe/util/cudnn.hpp ./include/caffe/util/cudnn.hpp
cp new_caffe/include/caffe/layers/cudnn_* ./include/caffe/layers/
cp new_caffe/src/caffe/layers/cudnn_* ./src/caffe/layers/