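Simply put, transfer learning takes a CNN model that has already been trained on a large dataset such as ImageNet and, after a few simple adjustments, applies it to your own project.
Transfer learning comes in three forms:
Why reuse only the convolutional base? Could the same classifier be reused as well? Generally, the convolutional base at the front of the network extracts low-level, generic features that carry over to many similar problems. The final fully connected layers, by contrast, encode high-level features tied to the specific problem, so they are much less reusable.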
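Code steps
Load the data
This step is routine: it mainly consists of preprocessing the image data and splitting the dataset.
Load the MobileNetV2 model (without the fully connected layers)
The Keras applications module provides Keras models with pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning, and can be imported from the keras.applications module.
from keras.applications import MobileNetV2
base_model = MobileNetV2(weights='imagenet', include_top=False)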
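from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

FC_SIZE = 256  # width of the new fully connected layer

def add_new_last_layer(base_model, nb_classes):
    """Attach a new classification head to the convolutional base."""
    x = base_model.output
    # GlobalAveragePooling2D converts an MxNxC tensor into a 1xC tensor, where C is the number of channels
    x = GlobalAveragePooling2D()(x)
    x = Dense(FC_SIZE, activation='relu')(x)
    predictions = Dense(nb_classes, activation='softmax')(x)
    # Model takes the keyword arguments inputs/outputs
    model = Model(inputs=base_model.input, outputs=predictions)
    return model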
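To make the pooling step concrete, here is a minimal NumPy sketch of what global average pooling computes. It is an illustration rather than the Keras implementation; the helper name `global_average_pool` and the 7x7x1280 feature-map shape are assumptions chosen for the example, and a channels-last layout is assumed.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average each channel over its spatial grid: (batch, M, N, C) -> (batch, C)."""
    return feature_maps.mean(axis=(1, 2))

# Hypothetical feature maps: batch of 2, 7x7 spatial grid, 1280 channels
features = np.random.rand(2, 7, 7, 1280)
pooled = global_average_pool(features)
print(pooled.shape)  # (2, 1280)
```

Each output value is simply the mean of one channel, so the layer has no trainable weights of its own.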
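Freeze all layers of base_model, then train.
def setup_to_transfer_learn(model, base_model):
    # Freeze every layer of the convolutional base so that only the new head is trained
    for layer in base_model.layers:
        layer.trainable = False
    # Compile after changing the trainable flags so the change takes effect
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

setup_to_transfer_learn(model, base_model)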
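In fact, this step can be combined with the previous one for more concise code:
from keras import layers, models
from keras.applications import MobileNetV2

# Add a fully connected classifier on top of conv_base.
# input_shape is an assumption here: Flatten needs a fixed spatial size to feed Dense.
conv_base = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
conv_base.trainable = False  # freeze the whole convolutional base
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))  # single sigmoid unit: binary classification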
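The two classifier heads shown so far differ a lot in size: one flattens the feature maps, the other averages them per channel. The arithmetic below is a rough sketch of that difference, assuming the convolutional base outputs a 7x7x1280 tensor (the shape MobileNetV2 produces for a 224x224 input) and that a 256-unit dense layer follows. The helper `dense_params` is a hypothetical name used only here.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

# Assumed output shape of the convolutional base for a 224x224 input
height, width, channels = 7, 7, 1280

flatten_features = height * width * channels  # Flatten keeps every spatial position
gap_features = channels                       # global average pooling keeps one value per channel

flatten_head = dense_params(flatten_features, 256)
gap_head = dense_params(gap_features, 256)

print(flatten_features, flatten_head)  # 62720 16056576
print(gap_features, gap_head)          # 1280 327936
```

With these assumptions the flattened head has roughly 49 times more parameters in its first dense layer, which is one reason a pooling head is often preferred on small datasets.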
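Freeze some of the layers and fine-tune the top of the network
Fine-tuning starts from a pre-trained network and retrains a small portion of its weights on the new dataset. It should be done with a very low learning rate so that the updates do not destroy the pre-trained features.
from keras.optimizers import SGD

def setup_to_finetune(model):
    # Keep the first NB_MobileNetV2_LAYERS_TO_FREEZE layers frozen
    for layer in model.layers[:NB_MobileNetV2_LAYERS_TO_FREEZE]:
        layer.trainable = False
    # Unfreeze the remaining top layers
    for layer in model.layers[NB_MobileNetV2_LAYERS_TO_FREEZE:]:
        layer.trainable = True
    # Recompile with a low learning rate (recent Keras versions expect learning_rate, not lr)
    model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])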
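To get a feel for what a learning rate of 0.0001 means, the sketch below applies the classic SGD-with-momentum update (velocity = momentum * velocity - learning_rate * gradient) to a single weight. It is a deliberately simplified setting: the constant gradient of 1.0, the 100 steps, and the function name are all assumptions made for illustration, not part of the training code above.

```python
def sgd_momentum_displacement(learning_rate, momentum=0.9, gradient=1.0, steps=100):
    """Total change of one weight after `steps` SGD-with-momentum updates
    under a constant gradient (a simplified, illustrative setting)."""
    weight, velocity = 0.0, 0.0
    for _ in range(steps):
        velocity = momentum * velocity - learning_rate * gradient
        weight += velocity
    return abs(weight)

low = sgd_momentum_displacement(learning_rate=0.0001)
high = sgd_momentum_displacement(learning_rate=0.01)
print(low, high)
```

Because the update is linear in the learning rate, the weight moves 100 times less with 0.0001 than with 0.01, which keeps the pre-trained weights close to their starting values.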
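A natural question here is what value NB_MobileNetV2_LAYERS_TO_FREEZE should have and how to find it. One way is to use PyCharm's debugger to inspect the contents of base_model.layers. Alternatively, you can select layers by name:
conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
    # Unfreeze from this layer onward; the name must exist in conv_base.layers
    # ('block_16_expand' is a MobileNetV2 block, whereas 'block5_conv1' is a VGG16 name)
    if layer.name == 'block_16_expand':
        set_trainable = True
    layer.trainable = set_trainable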
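The layer index can also be found without a debugger by enumerating the layers, and a name-based loop can be made to fail loudly when the name does not exist (otherwise every layer silently stays frozen). The following is a minimal sketch under stated assumptions: `index_of_layer` and `unfreeze_from` are hypothetical helpers, and the stand-in objects with made-up names only mimic the `name` and `trainable` attributes of Keras layers so the example runs without downloading weights.

```python
from types import SimpleNamespace

def index_of_layer(layers, name):
    """Return the position of the layer called `name`, or raise if it is absent."""
    for index, layer in enumerate(layers):
        if layer.name == name:
            return index
    raise ValueError(f"no layer named {name!r}")

def unfreeze_from(layers, name):
    """Freeze every layer before `name` and make that layer and the rest trainable."""
    start = index_of_layer(layers, name)
    for index, layer in enumerate(layers):
        layer.trainable = index >= start
    return start

# Stand-in objects with made-up names; with Keras, pass conv_base.layers instead
demo_layers = [SimpleNamespace(name=n, trainable=True)
               for n in ("stem", "block_a", "block_b", "head")]

# Print each index next to its layer name to pick a freeze boundary
for i, layer in enumerate(demo_layers):
    print(i, layer.name)

start = unfreeze_from(demo_layers, "block_b")
print(start, [layer.trainable for layer in demo_layers])
```

The returned index is the kind of value that could be used for a constant such as NB_MobileNetV2_LAYERS_TO_FREEZE.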
# Complete script combining the steps above
from keras.applications import MobileNetV2
from keras import layers
from keras.models import Model
from keras.optimizers import SGD
from keras.utils import plot_model

FC_SIZE = 256
IM_WIDTH, IM_HEIGHT = 28, 28  # not used below; the model is built without a fixed input_shape
nb_classes = 100
NB_MobileNetV2_LAYERS_TO_FREEZE = 149


def add_new_last_layer(base_model, nb_classes):
    """Attach a new classification head to the convolutional base."""
    x = base_model.output
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(FC_SIZE, activation='relu')(x)
    predictions = layers.Dense(nb_classes, activation='softmax')(x)
    model = Model(inputs=base_model.input, outputs=predictions)
    return model


def setup_to_transfer_learn(model, base_model):
    """Freeze the whole convolutional base and compile for training the new head."""
    for layer in base_model.layers:
        layer.trainable = False
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])


def setup_to_finetune(model):
    """Freeze the bottom layers, unfreeze the top ones, and recompile with a low learning rate."""
    for layer in model.layers[:NB_MobileNetV2_LAYERS_TO_FREEZE]:
        layer.trainable = False
    for layer in model.layers[NB_MobileNetV2_LAYERS_TO_FREEZE:]:
        layer.trainable = True
    # Recent Keras versions expect learning_rate, not lr
    model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])


if __name__ == '__main__':
    base_model = MobileNetV2(weights='imagenet', include_top=False)
    model = add_new_last_layer(base_model, nb_classes)
    # Phase 1: train only the new classification head
    setup_to_transfer_learn(model, base_model)
    model.fit()  # placeholder: pass your training data and settings here
    # Phase 2: fine-tune the unfrozen top layers
    setup_to_finetune(model)
    model.fit()  # placeholder: pass your training data and settings here
    model.summary()  # summary() prints on its own and returns None
    plot_model(model, to_file='mobilev2.png', show_shapes=True)