I am trying to use TensorFlow Addons' MultiOptimizer for discriminative layer training, with different learning rates for different layers, but it does not work together with the ReduceLROnPlateau callback.
import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow.keras.callbacks import ReduceLROnPlateau
from transformers import AdamWeightDecay  # assumed source of AdamWeightDecay

reduce_lr = ReduceLROnPlateau(patience=5, min_delta=1e-4, min_lr=1e-7, verbose=0)

with tpu_strategy.scope():
    roberta_model = create_model(512)

    optimizers = [
        AdamWeightDecay(learning_rate=0.00001, weight_decay_rate=0.00001),
        AdamWeightDecay(learning_rate=0.0001, weight_decay_rate=0.0001)
    ]

    # specifying the optimizers and the layers in which each will operate
    optimizers_and_layers = [
        (optimizers[0], roberta_model.layers[:3]),
        (optimizers[1], roberta_model.layers[3:])
    ]

    # Using MultiOptimizer from TensorFlow Addons
    opt = tfa.optimizers.MultiOptimizer(optimizers_and_layers)

    roberta_model.compile(
        optimizer=opt,
        loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
        metrics=["accuracy"])

history = roberta_model.fit(train, epochs=50, validation_data=val, callbacks=[reduce_lr])
At the end of the first epoch it raises the following error:
AttributeError: 'MultiOptimizer' object has no attribute 'lr'
It works fine without the ReduceLROnPlateau callback.
I have tried several ways to solve this; my last attempt was to modify the callback, i.e. to write my own reduce-learning-rate-on-plateau callback. That is well beyond my coding skills, but I tried the following (I have commented the few changes I made to the original callback):
import logging

import numpy as np
import tensorflow as tf
from tensorflow.keras import backend


class My_ReduceLROnPlateau(tf.keras.callbacks.Callback):

    def __init__(self,
                 monitor='val_loss',
                 factor=0.1,
                 patience=10,
                 verbose=0,
                 mode='auto',
                 min_delta=1e-4,
                 cooldown=0,
                 min_lr=0,
                 **kwargs):
        super(My_ReduceLROnPlateau, self).__init__()

        self.monitor = monitor
        if factor >= 1.0:
            raise ValueError(
                f'ReduceLROnPlateau does not support a factor >= 1.0. Got {factor}')
        if 'epsilon' in kwargs:
            min_delta = kwargs.pop('epsilon')
            logging.warning('`epsilon` argument is deprecated and '
                            'will be removed, use `min_delta` instead.')
        self.factor = factor
        self.min_lr = min_lr
        self.min_delta = min_delta
        self.patience = patience
        self.verbose = verbose
        self.cooldown = cooldown
        self.cooldown_counter = 0  # Cooldown counter.
        self.wait = 0
        self.best = 0
        self.mode = mode
        self.monitor_op = None
        self._reset()

    def _reset(self):
        """Resets wait counter and cooldown counter."""
        if self.mode not in ['auto', 'min', 'max']:
            logging.warning('Learning rate reduction mode %s is unknown, '
                            'fallback to auto mode.', self.mode)
            self.mode = 'auto'
        if (self.mode == 'min' or
                (self.mode == 'auto' and 'acc' not in self.monitor)):
            self.monitor_op = lambda a, b: np.less(a, b - self.min_delta)
            self.best = np.Inf
        else:
            self.monitor_op = lambda a, b: np.greater(a, b + self.min_delta)
            self.best = -np.Inf
        self.cooldown_counter = 0
        self.wait = 0

    def on_train_begin(self, logs=None):
        self._reset()

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        logs['lr'] = backend.get_value(self.model.optimizer[1].lr)
        current = logs.get(self.monitor)
        if current is None:
            logging.warning('Learning rate reduction is conditioned on metric `%s` '
                            'which is not available. Available metrics are: %s',
                            self.monitor, ','.join(list(logs.keys())))
        else:
            if self.in_cooldown():
                self.cooldown_counter -= 1
                self.wait = 0
            if self.monitor_op(current, self.best):
                self.best = current
                self.wait = 0
            elif not self.in_cooldown():
                self.wait += 1
                if self.wait >= self.patience:
                    # Here below I tried to subscript self.model.optimizer,
                    # guessing that each index pointed to one of the optimizers,
                    # and reused the same code as in the original
                    # ReduceLROnPlateau to update the optimizers.
                    old_lr1 = backend.get_value(self.model.optimizer[1].lr)
                    old_lr0 = backend.get_value(self.model.optimizer[0].lr)
                    if old_lr1 > np.float32(self.min_lr):
                        new_lr1 = old_lr1 * self.factor
                        new_lr1 = max(new_lr1, self.min_lr)
                        backend.set_value(self.model.optimizer[1].lr, new_lr1)
                        new_lr0 = old_lr0 * self.factor
                        new_lr0 = max(new_lr0, self.min_lr)
                        backend.set_value(self.model.optimizer[0].lr, new_lr0)
                        if self.verbose > 0:
                            # print() used here in place of Keras' io_utils.print_msg
                            print(
                                f'\nEpoch {epoch + 1}: '
                                f'ReduceLROnPlateau reducing learning rate '
                                f'to {new_lr0} and {new_lr1}.')
                        self.cooldown_counter = self.cooldown
                        self.wait = 0

    def in_cooldown(self):
        return self.cooldown_counter > 0
Then I created the callback:
reduce_lr = My_ReduceLROnPlateau(patience=5, min_delta=1e-4, min_lr=1e-7, verbose=0)
and started training again. At the end of the first epoch I got the following error:
TypeError: 'MultiOptimizer' object is not subscriptable
In other words, you cannot subscript it like self.model.optimizer[1] or self.model.optimizer[0].
So my question is how to solve this, i.e. how to do discriminative layer training together with ReduceLROnPlateau, either by some other approach or by fixing my attempt at writing a new callback class.
Here is a link to the original ReduceLROnPlateau callback, i.e. without the few changes I made in my custom callback.
One solution might be to use this:
Note: Currently, tfa.optimizers.MultiOptimizer does not support callbacks that modify optimizers. However, you can instantiate the optimizer layer pairs with tf.keras.optimizers.schedules.LearningRateSchedule instead of a static learning rate.
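For illustration, a minimal sketch of that approach (the ExponentialDecay schedules and their values below are made-up placeholders, not taken from the question): each optimizer/layer pair is built with its own schedule, so no callback ever needs to touch the MultiOptimizer.

import tensorflow as tf
import tensorflow_addons as tfa
from transformers import AdamWeightDecay

# Hypothetical schedules standing in for the static learning rates used above.
slow_lr = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-5, decay_steps=1000, decay_rate=0.9)
fast_lr = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-4, decay_steps=1000, decay_rate=0.9)

# roberta_model as defined in the question.
optimizers_and_layers = [
    (AdamWeightDecay(learning_rate=slow_lr, weight_decay_rate=1e-5), roberta_model.layers[:3]),
    (AdamWeightDecay(learning_rate=fast_lr, weight_decay_rate=1e-4), roberta_model.layers[3:]),
]
opt = tfa.optimizers.MultiOptimizer(optimizers_and_layers)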
Posted on 2021-11-12 12:06:09
Looking at the code of tfa.optimizers.MultiOptimizer (in the create_optimizer_spec method), it seems the optimizers can be accessed via self.model.optimizer.optimizer_specs[0]["optimizer"] and self.model.optimizer.optimizer_specs[1]["optimizer"] in order to change their learning rates (which is also why self.model.optimizer[1] raises an error). With that change, your custom callback should work.
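As a concrete illustration, here is a minimal sketch of that idea; the helper name reduce_all_lrs is made up for this example and is not part of tfa:

from tensorflow.keras import backend


def reduce_all_lrs(multi_optimizer, factor, min_lr):
    """Scale the learning rate of every optimizer wrapped by a tfa MultiOptimizer."""
    for spec in multi_optimizer.optimizer_specs:
        old_lr = backend.get_value(spec["optimizer"].lr)
        new_lr = max(old_lr * factor, min_lr)
        backend.set_value(spec["optimizer"].lr, new_lr)

# Inside My_ReduceLROnPlateau.on_epoch_end, once the plateau is detected,
# the subscripted accesses would be replaced by something like:
#     reduce_all_lrs(self.model.optimizer, self.factor, self.min_lr)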
https://stackoverflow.com/questions/69934643