
How do I add an L1 regularizer to the activations in PyTorch?

Stack Overflow user
Asked on 2018-08-01 07:01:36
2 answers · 0 views · 0 followers · 0 votes

I would like to add an L1 regularizer to the activation output of a ReLU. More generally, how can I apply a regularizer only to a particular layer in the network? In other words, I want a loss like:

crossentropy + lambda1*L1(layer1) + lambda2*L1(layer2) + ...

I believe the arguments supplied to torch.optim.Adagrad apply only to the cross-entropy loss, or perhaps to all of the network's parameters (weights). Either way, there seems to be no way to apply a regularizer to the activations of a single layer, and no L1 loss is provided.
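For reference, optimizer-level weight decay does behave that way: it adds an L2 penalty on every parameter handed to the optimizer and cannot target a single layer's activations. A minimal sketch of that baseline (the module and hyperparameters below are just placeholders):

import torch

model = torch.nn.Linear(128, 2)  # placeholder module

# weight_decay applies an L2 penalty to *all* parameters passed in;
# it cannot be restricted to one layer or to activations.
optimizer = torch.optim.Adagrad(model.parameters(), lr=1e-2, weight_decay=1e-4)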


2 Answers

Stack Overflow user

Posted on 2018-08-01 15:29:41

Here is how you can do it:

  • In the module's forward, return both the final output and the outputs of the layers you want to apply L1 regularization to.
  • Make the loss variable the sum of the cross-entropy loss of the output with respect to the targets and the L1 penalties.

Here is some example code:

import torch
from torch.autograd import Variable
from torch.nn import functional as F


class MLP(torch.nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.linear1 = torch.nn.Linear(128, 32)
        self.linear2 = torch.nn.Linear(32, 16)
        self.linear3 = torch.nn.Linear(16, 2)

    def forward(self, x):
        layer1_out = F.relu(self.linear1(x))
        layer2_out = F.relu(self.linear2(layer1_out))
        out = self.linear3(layer2_out)
        return out, layer1_out, layer2_out


def l1_penalty(var):
    return torch.abs(var).sum()


def l2_penalty(var):
    return torch.sqrt(torch.pow(var, 2).sum())


batchsize = 4
lambda1, lambda2 = 0.5, 0.01

model = MLP()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

# usually the following code is looped over all batches,
# but let's just do a dummy batch for brevity

inputs = Variable(torch.rand(batchsize, 128))
targets = Variable(torch.ones(batchsize).long())

optimizer.zero_grad()
outputs, layer1_out, layer2_out = model(inputs)
cross_entropy_loss = F.cross_entropy(outputs, targets)
l1_regularization = lambda1 * l1_penalty(layer1_out)
l2_regularization = lambda2 * l2_penalty(layer2_out)

loss = cross_entropy_loss + l1_regularization + l2_regularization
loss.backward()
optimizer.step()
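Note that in PyTorch 0.4 and later, Variable has been merged into Tensor, so the Variable wrappers above are optional; assuming such a version, the dummy batch can be built directly as:

inputs = torch.rand(batchsize, 128)
targets = torch.ones(batchsize, dtype=torch.long)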
Votes: 0

Stack Overflow user

Posted on 2018-08-01 16:04:06

The regularization should be applied to the weight parameters of each layer of the model, not to each layer's output.

import torch
from torch.autograd import Variable
from torch.nn import functional as F


class MLP(torch.nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.linear1 = torch.nn.Linear(128, 32)
        self.linear2 = torch.nn.Linear(32, 16)
        self.linear3 = torch.nn.Linear(16, 2)
    def forward(self, x):
        layer1_out = F.relu(self.linear1(x))
        layer2_out = F.relu(self.linear2(layer1_out))
        out = self.linear3(layer2_out)
        return out

batchsize = 4
lambda1, lambda2 = 0.5, 0.01

model = MLP()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

inputs = Variable(torch.rand(batchsize, 128))
targets = Variable(torch.ones(batchsize).long())
# initialize as float tensors so the (float) norms can be accumulated
l1_regularization, l2_regularization = torch.tensor(0.), torch.tensor(0.)

optimizer.zero_grad()
outputs = model(inputs)
cross_entropy_loss = F.cross_entropy(outputs, targets)
for param in model.parameters():
    l1_regularization += torch.norm(param, 1)
    l2_regularization += torch.norm(param, 2)

loss = cross_entropy_loss + l1_regularization + l2_regularization
loss.backward()
optimizer.step()
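If, as in the question, the penalty should only cover one specific layer rather than every parameter, the same pattern can be restricted to that layer's attributes. A minimal sketch reusing the MLP above:

# penalize only the first layer's weight matrix, reusing lambda1 from above
l1_layer1 = lambda1 * torch.norm(model.linear1.weight, 1)
loss = cross_entropy_loss + l1_layer1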
Votes: 0
The original page content was provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/-100008736
