
Validation loss and validation accuracy curves fluctuating with a pretrained model

Stack Overflow user
Asked on 2019-11-15 07:25:46
1 answer · 421 views · 0 followers · Score 3

I am currently learning about neural networks, and I ran into a problem while trying to learn CNNs: I am trying to train on data containing spectrograms of music genres. My data consists of 27,000 spectrograms split into 3 classes (genres), and is divided 9:1 between training and validation.

Can somebody help me explain why my validation loss/accuracy results are unstable? I am using Keras's MobileNetV2 and connecting it to 3 dense layers. Here is a snippet of my code:

# Imports needed by the snippet below
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    validation_split=0.1)

train_generator = train_datagen.flow_from_dataframe(
    dataframe=traindf,
    directory="...",
    color_mode='rgb',
    x_col="ID",
    y_col="Class",
    subset="training",
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode="categorical",
    target_size=(64, 64))

valid_generator = train_datagen.flow_from_dataframe(
    dataframe=traindf,
    directory="...",
    color_mode='rgb',
    x_col="ID",
    y_col="Class",
    subset="validation",
    batch_size=32,
    seed=42,
    shuffle=True,
    class_mode="categorical",
    target_size=(64, 64))

base_model = MobileNetV2(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1025, activation='relu')(x)
x = Dense(1025, activation='relu')(x)
x = Dense(512, activation='relu')(x)
preds = Dense(3, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=preds)

model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])

step_size_train = train_generator.n//train_generator.batch_size
step_size_valid = valid_generator.n//valid_generator.batch_size
history = model.fit_generator(
    generator=train_generator,
    steps_per_epoch=step_size_train,
    validation_data=valid_generator,
    validation_steps=step_size_valid,
    epochs=75)

These are the plots of my validation loss and validation accuracy curves, and they fluctuate too much.

Is there any way to reduce the fluctuation, or to make it better? Am I overfitting or underfitting here? I have tried using Dropout(), but it only made things worse. What do I need to do to fix this problem?

Thank you, Aquila Setiawan Kanadi.


1 Answer

Stack Overflow user

Answered on 2020-04-22 16:14:38

To begin with, the images of the validation loss and validation accuracy are missing.

To answer your question, the following are the probable reasons for your fluctuating validation loss and validation accuracy:

  1. To build the model, you have added roughly 1.3 times the base_model's trainable weights on top of it. (model trainable parameters 5115398 - base_model trainable parameters 2223872 = 2891526)

Program statistics:

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from keras.utils.layer_utils import count_params

class color:
   PURPLE = '\033[95m'
   CYAN = '\033[96m'
   DARKCYAN = '\033[36m'
   BLUE = '\033[94m'
   GREEN = '\033[92m'
   YELLOW = '\033[93m'
   RED = '\033[91m'
   BOLD = '\033[1m'
   UNDERLINE = '\033[4m'
   END = '\033[0m'

base_model = tf.keras.applications.MobileNetV2(weights='imagenet', include_top=False)

#base_model.summary()
trainable_count = count_params(base_model.trainable_weights)
non_trainable_count = count_params(base_model.non_trainable_weights)
print("\n",color.BOLD + '  base_model Statistics !' + color.END)
print("Trainable Parameters :", color.BOLD + str(trainable_count) + color.END)
print("Non Trainable Parameters :", non_trainable_count,"\n")

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1025, activation='relu')(x)
x = Dense(1025, activation='relu')(x)
x = Dense(512, activation='relu')(x)
preds = Dense(3, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=preds)

#model.summary()
trainable_count = count_params(model.trainable_weights)
non_trainable_count = count_params(model.non_trainable_weights)
print(color.BOLD + '    model Statistics !' + color.END)
print("Trainable Parameters :", color.BOLD + str(trainable_count) + color.END)
print("Non Trainable Parameters :", non_trainable_count,"\n")

new_weights_added = count_params(model.trainable_weights) - count_params(base_model.trainable_weights)
print("Additional trainable weights added to the model excluding basel model trainable weights :", color.BOLD + str(new_weights_added) + color.END)

Output:

WARNING:tensorflow:`input_shape` is undefined or non-square, or `rows` is not in [96, 128, 160, 192, 224]. Weights for input shape (224, 224) will be loaded as the default.

   base_model Statistics !
Trainable Parameters : 2223872
Non Trainable Parameters : 34112 

    model Statistics !
Trainable Parameters : 5115398
Non Trainable Parameters : 34112 

Additional trainable weights added to the model excluding base model trainable weights : 2891526

  2. You are training the complete model weights (the MobileNetV2 weights plus the weights of the additional layers).

The solutions to your problem are:

  1. Customize the additional layers in such a way that the new trainable parameters are minimal compared to the base_model's trainable parameters; for example, add max-pooling layers and fewer dense layers.
  2. Freeze the base model with base_model.trainable = False and train only the new layers that you have added on top of the MobileNetV2 layers, as in the sketch below.
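A minimal sketch of option 2, reusing the names from the question's code (the compile settings are carried over from the question; the print line is just for verification):

# Freeze the entire MobileNetV2 backbone; only the newly added
# dense head will receive gradient updates.
base_model.trainable = False

# Recompiling is required after changing the trainable flag,
# otherwise the change does not take effect.
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Only the weights of the added dense head should remain trainable.
print("Trainable weight tensors after freezing:", len(model.trainable_weights))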

  3. Unfreeze the top layers of the base model (the MobileNetV2 layers) and set the bottom layers to be non-trainable. You can do it as shown below, where we freeze the model up to the 100th layer and the remaining layers will be trainable:

# Let's take a look to see how many layers are in the base model
print("Number of layers in the base model: ", len(base_model.layers))

# Fine-tune from this layer onwards
fine_tune_at = 100

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
  layer.trainable = False

Output:

Number of layers in the base model:  155
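Note that after changing the trainable flags the model must be compiled again for the change to take effect. A minimal sketch (the reduced learning rate is an assumption following common fine-tuning practice, not something stated in the answer):

# Recompile so the new frozen/trainable split takes effect.
# A lower learning rate (assumed here) helps avoid destroying the
# pretrained ImageNet weights while fine-tuning the unfrozen layers.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])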

  4. Train the model with hyperparameter tuning. You can find more information about hyperparameter tuning here.
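As one concrete illustration (the specific callbacks and values below are assumptions, not part of the original answer), a simple form of tuning is to lower the learning rate automatically when the validation loss plateaus, which often damps the kind of fluctuation shown in the question:

from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

# Halve the learning rate whenever val_loss stops improving, and stop
# training once no further progress is made for 10 epochs.
callbacks = [
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3,
                      min_lr=1e-6, verbose=1),
    EarlyStopping(monitor='val_loss', patience=10,
                  restore_best_weights=True),
]

history = model.fit_generator(
    generator=train_generator,
    steps_per_epoch=step_size_train,
    validation_data=valid_generator,
    validation_steps=step_size_valid,
    epochs=75,
    callbacks=callbacks)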

Score 0
The original content of this page is provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58868086
