首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >PyTorch中的标号平滑

PyTorch中的标号平滑
EN

Stack Overflow用户
提问于 2019-04-15 01:14:30
回答 6查看 36.3K关注 0票数 34

我正在使用传输学习为斯坦福汽车数据集构建一个分类模型。我想要实现标号平滑来惩罚过度自信的预测和改进泛化。

TensorFlowCrossEntropyLoss中有一个简单的关键字参数。有人为PyTorch构建了类似的功能,我可以用它进行即插即用吗?

EN

回答 6

Stack Overflow用户

回答已采纳

发布于 2021-03-24 00:51:01

使用软目标,即硬目标加权平均和标签上的均匀分布,可以显著提高多类神经网络的泛化和学习速度。通过这种方式平滑标签可以防止网络变得过于自信,标签平滑已经应用于许多最先进的模型中,包括图像分类、语言翻译和语音识别。

标签平滑已经在交叉熵损失函数的Tensorflow中实现.BinaryCrossentropyCategoricalCrossentropy.但是目前还没有标签平滑PyTorch中的正式实现。然而,目前正在积极讨论这一问题,希望能向它提供一个正式的一揽子方案。下面是讨论主题:第7455期

在这里,我们将介绍一些标签平滑(LS)的可用最佳实现。基本上,有许多实现LS的方法。请参阅这方面的具体讨论,其中之一是这里另一个在这里。在这里,我们将以2的独特方式实现,每个版本都有两个版本;因此,总4

备选案文1: CrossEntropyLossWithProbs

通过这种方式,它接受one-hot目标向量。用户必须手动平滑目标向量。它可以在with torch.no_grad()范围内完成,因为它暂时将所有requires_grad标志设置为false。

  1. 杨德文来源
代码语言:javascript
复制
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from torch.nn.modules.loss import _WeightedLoss


class LabelSmoothingLoss(nn.Module):
    def __init__(self, classes, smoothing=0.0, dim=-1, weight = None):
        """if smoothing == 0, it's one-hot method
           if 0 < smoothing < 1, it's smooth method
        """
        super(LabelSmoothingLoss, self).__init__()
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing
        self.weight = weight
        self.cls = classes
        self.dim = dim

    def forward(self, pred, target):
        assert 0 <= self.smoothing < 1
        pred = pred.log_softmax(dim=self.dim)

        if self.weight is not None:
            pred = pred * self.weight.unsqueeze(0)   

        with torch.no_grad():
            true_dist = torch.zeros_like(pred)
            true_dist.fill_(self.smoothing / (self.cls - 1))
            true_dist.scatter_(1, target.data.unsqueeze(1), self.confidence)
        return torch.mean(torch.sum(-true_dist * pred, dim=self.dim))

此外,我们还在self. smoothing上添加了一个断言检查点,并在此实现上添加了损失加权支持。

  1. 沙阿来源

已经把答案发到这里了。这里我们指出,这个实现类似于杨德文的上面的实现,但是,这里我们提到了他的代码,同时最小化了一些code syntax

代码语言:javascript
复制
class SmoothCrossEntropyLoss(_WeightedLoss):
    def __init__(self, weight=None, reduction='mean', smoothing=0.0):
        super().__init__(weight=weight, reduction=reduction)
        self.smoothing = smoothing
        self.weight = weight
        self.reduction = reduction

    def k_one_hot(self, targets:torch.Tensor, n_classes:int, smoothing=0.0):
        with torch.no_grad():
            targets = torch.empty(size=(targets.size(0), n_classes),
                                  device=targets.device) \
                                  .fill_(smoothing /(n_classes-1)) \
                                  .scatter_(1, targets.data.unsqueeze(1), 1.-smoothing)
        return targets

    def reduce_loss(self, loss):
        return loss.mean() if self.reduction == 'mean' else loss.sum() \
        if self.reduction == 'sum' else loss

    def forward(self, inputs, targets):
        assert 0 <= self.smoothing < 1

        targets = self.k_one_hot(targets, inputs.size(-1), self.smoothing)
        log_preds = F.log_softmax(inputs, -1)

        if self.weight is not None:
            log_preds = log_preds * self.weight.unsqueeze(0)

        return self.reduce_loss(-(targets * log_preds).sum(dim=-1))

检查

代码语言:javascript
复制
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from torch.nn.modules.loss import _WeightedLoss


if __name__=="__main__":
    # 1. Devin Yang
    crit = LabelSmoothingLoss(classes=5, smoothing=0.5)
    predict = torch.FloatTensor([[0, 0.2, 0.7, 0.1, 0],
                                 [0, 0.9, 0.2, 0.2, 1], 
                                 [1, 0.2, 0.7, 0.9, 1]])
    v = crit(Variable(predict),
             Variable(torch.LongTensor([2, 1, 0])))
    print(v)

    # 2. Shital Shah
    crit = SmoothCrossEntropyLoss(smoothing=0.5)
    predict = torch.FloatTensor([[0, 0.2, 0.7, 0.1, 0],
                                 [0, 0.9, 0.2, 0.2, 1], 
                                 [1, 0.2, 0.7, 0.9, 1]])
    v = crit(Variable(predict),
             Variable(torch.LongTensor([2, 1, 0])))
    print(v)

tensor(1.4178)
tensor(1.4178)

备选案文2: LabelSmoothingCrossEntropyLoss

这样,它接受目标向量,并且使用不手动平滑目标向量,而是内建模块负责标签平滑。它允许我们根据F.nll_loss实现标签平滑。

(a)。王雷官来源 - (AFAIK),原版海报

(b)。数据龙:添加了来源的加权支持

此外,我们还稍微最小化了编码编写,使其更加简洁。

代码语言:javascript
复制
class LabelSmoothingLoss(torch.nn.Module):
    def __init__(self, smoothing: float = 0.1, 
                 reduction="mean", weight=None):
        super(LabelSmoothingLoss, self).__init__()
        self.smoothing   = smoothing
        self.reduction = reduction
        self.weight    = weight

    def reduce_loss(self, loss):
        return loss.mean() if self.reduction == 'mean' else loss.sum() \
         if self.reduction == 'sum' else loss

    def linear_combination(self, x, y):
        return self.smoothing * x + (1 - self.smoothing) * y

    def forward(self, preds, target):
        assert 0 <= self.smoothing < 1

        if self.weight is not None:
            self.weight = self.weight.to(preds.device)

        n = preds.size(-1)
        log_preds = F.log_softmax(preds, dim=-1)
        loss = self.reduce_loss(-log_preds.sum(dim=-1))
        nll = F.nll_loss(
            log_preds, target, reduction=self.reduction, weight=self.weight
        )
        return self.linear_combination(loss / n, nll)
  1. NVIDIA/深造-实例来源
代码语言:javascript
复制
class LabelSmoothing(nn.Module):
    """NLL loss with label smoothing.
    """
    def __init__(self, smoothing=0.0):
        """Constructor for the LabelSmoothing module.
        :param smoothing: label smoothing factor
        """
        super(LabelSmoothing, self).__init__()
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing

    def forward(self, x, target):
        logprobs = torch.nn.functional.log_softmax(x, dim=-1)
        nll_loss = -logprobs.gather(dim=-1, index=target.unsqueeze(1))
        nll_loss = nll_loss.squeeze(1)
        smooth_loss = -logprobs.mean(dim=-1)
        loss = self.confidence * nll_loss + self.smoothing * smooth_loss
        return loss.mean()

检查

代码语言:javascript
复制
if __name__=="__main__":
    # Wangleiofficial
    crit = LabelSmoothingLoss(smoothing=0.3, reduction="mean")
    predict = torch.FloatTensor([[0, 0.2, 0.7, 0.1, 0],
                                 [0, 0.9, 0.2, 0.2, 1], 
                                 [1, 0.2, 0.7, 0.9, 1]])

    v = crit(Variable(predict),
             Variable(torch.LongTensor([2, 1, 0])))
    print(v)

    # NVIDIA
    crit = LabelSmoothing(smoothing=0.3)
    predict = torch.FloatTensor([[0, 0.2, 0.7, 0.1, 0],
                                 [0, 0.9, 0.2, 0.2, 1], 
                                 [1, 0.2, 0.7, 0.9, 1]])
    v = crit(Variable(predict),
             Variable(torch.LongTensor([2, 1, 0])))
    print(v)

tensor(1.3883)
tensor(1.3883)

更新:正式加入

代码语言:javascript
复制
torch.nn.CrossEntropyLoss(weight=None, size_average=None, 
                          ignore_index=- 100, reduce=None, 
                          reduction='mean', label_smoothing=0.0)
票数 41
EN

Stack Overflow用户

发布于 2019-12-10 10:17:29

我一直在寻找从_Loss派生的选项,就像PyTorch中的其他损失类一样,并且尊重基本参数,比如reduction。不幸的是,我找不到直接的替代品,所以最终写了我自己的。不过,我还没有对此进行充分的测试:

代码语言:javascript
复制
import torch
from torch.nn.modules.loss import _WeightedLoss
import torch.nn.functional as F

class SmoothCrossEntropyLoss(_WeightedLoss):
    def __init__(self, weight=None, reduction='mean', smoothing=0.0):
        super().__init__(weight=weight, reduction=reduction)
        self.smoothing = smoothing
        self.weight = weight
        self.reduction = reduction

    @staticmethod
    def _smooth_one_hot(targets:torch.Tensor, n_classes:int, smoothing=0.0):
        assert 0 <= smoothing < 1
        with torch.no_grad():
            targets = torch.empty(size=(targets.size(0), n_classes),
                    device=targets.device) \
                .fill_(smoothing /(n_classes-1)) \
                .scatter_(1, targets.data.unsqueeze(1), 1.-smoothing)
        return targets

    def forward(self, inputs, targets):
        targets = SmoothCrossEntropyLoss._smooth_one_hot(targets, inputs.size(-1),
            self.smoothing)
        lsm = F.log_softmax(inputs, -1)

        if self.weight is not None:
            lsm = lsm * self.weight.unsqueeze(0)

        loss = -(targets * lsm).sum(-1)

        if  self.reduction == 'sum':
            loss = loss.sum()
        elif  self.reduction == 'mean':
            loss = loss.mean()

        return loss

其他备选方案:

票数 9
EN

Stack Overflow用户

发布于 2019-04-15 09:00:49

据我所知没有。

下面是两个PyTorch实现的示例:

票数 6
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/55681502

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档