Q: The parameters of my network in PyTorch are not updated

Stack Overflow user

Asked 2020-09-07 06:39:43

1 answer, 1.2K views, 0 followers, 1 vote

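I want to build an automatic calibration system with PyTorch.

I am trying to treat a homogeneous transformation matrix as the weights of a neural network.

I wrote my code with reference to the PyTorch tutorials, but after calling the backward method my custom parameters are not updated.

When I print the 'grad' attribute of each parameter, it is None.

My code is below. Is there anything wrong with it?

Any advice would be appreciated. Thank you.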

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import numpy as np

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.params = nn.Parameter(torch.rand(6))
        self.rx, self.ry, self.rz = self.params[0], self.params[1], self.params[2]
        self.tx, self.ty, self.tz = self.params[3], self.params[4], self.params[5]
        

    def forward(self, x):
        tr_mat = torch.tensor([[1, 0, 0, self.params[3]],
                                [0, 1, 0, self.params[4]],
                                [0, 0, 1, self.params[5]],
                                [0, 0, 0, 1]], requires_grad=True)

        rz_mat = torch.tensor([[torch.cos(self.params[2]), -torch.sin(self.params[2]), 0, 0],
                                [torch.sin(self.params[2]), torch.cos(self.params[2]), 0, 0],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], requires_grad=True)

        ry_mat = torch.tensor([[torch.cos(self.params[1]), 0, torch.sin(self.params[1]), 0],
                                [0, 1, 0, 0],
                                [-torch.sin(self.params[1]), 0, torch.cos(self.params[1]), 0],
                                [0, 0, 0, 1]], requires_grad=True)

        rx_mat = torch.tensor([[1, 0, 0, 0],
                                [0, torch.cos(self.params[0]), -torch.sin(self.params[0]), 0],
                                [0, torch.sin(self.params[0]), torch.cos(self.params[0]), 0],
                                [0, 0, 0, 1]], requires_grad=True)

        tf1 = torch.matmul(tr_mat, rz_mat)
        tf2 = torch.matmul(tf1, ry_mat)
        tf3 = torch.matmul(tf2, rx_mat)

        tr_local = torch.tensor([[1, 0, 0, x[0]],
                                [0, 1, 0, x[1]],
                                [0, 0, 1, x[2]],
                                [0, 0, 0, 1]])
        tf_output = torch.matmul(tf3, tr_local)
        output = tf_output[:3, 3]
        return output

    def get_loss(self, output):
        pass



model = Net()

input_ex = np.array([[-0.01, 0.05, 0.92],
                    [-0.06, 0.03, 0.94]])

output_ex = np.array([[-0.3, 0.4, 0.09],
                        [-0.5, 0.2, 0.07]])
print(list(model.parameters()))

optimizer = optim.Adam(model.parameters(), 0.001)
criterion = nn.MSELoss()

for input_np, label_np in zip(input_ex, output_ex):
    input_tensor = torch.from_numpy(input_np).float()
    label_tensor = torch.from_numpy(label_np).float()
    output = model(input_tensor)

    optimizer.zero_grad()
    loss = criterion(output, label_tensor)
    loss.backward()
    optimizer.step()

print(list(model.parameters()))

pytorch

1 Answer

Stack Overflow user

Accepted answer

Answered 2020-09-07 08:29:17

What happens

Your problem is related to PyTorch implicitly converting torch.tensor elements to float. Say you have this:

tr_mat = torch.tensor(
    [
        [1, 0, 0, self.params[3]],
        [0, 1, 0, self.params[4]],
        [0, 0, 1, self.params[5]],
        [0, 0, 0, 1],
    ],
    requires_grad=True,
)

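torch.tensor can only be constructed from a list of plain Python values; it cannot keep torch.tensor objects inside it. Under the hood, every element of self.params that can be converted to float is converted (here all of them can, e.g. self.params[3], self.params[4], self.params[5]).

When a tensor element is cast to float, its value is copied into a Python counterpart. The copy is no longer part of the computational graph: it is a new, pure Python variable, so nothing can be backpropagated through it.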
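The effect described above can be reproduced in a few lines. This is a minimal sketch, not part of the original answer: the names p, copied, and stacked are illustrative, and torch.stack is used here only as one graph-preserving alternative for comparison.

```python
import torch

p = torch.nn.Parameter(torch.tensor([0.5, 2.0]))

# torch.tensor copies the values out of p, so the result is a fresh leaf
# tensor with no link back to p (recent PyTorch versions may warn here).
copied = torch.tensor([p[0], p[1]], requires_grad=True)
copied.sum().backward()
grad_after_copy = p.grad          # still None: no gradient reached p
print(copied.grad_fn, grad_after_copy)

# torch.stack is an ordinary differentiable op, so p stays in the graph.
stacked = torch.stack([p[0], p[1]])
stacked.sum().backward()
print(stacked.grad_fn is not None, p.grad)
```

Setting requires_grad=True on the copied tensor only makes that copy a trainable leaf; it does not reconnect it to the parameter that the optimizer actually holds.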

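Solution

What you can do is select elements of self.params and insert them into identity (eye) matrices so that the gradient can flow. With that in mind, here is a rewrite of your forward method: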

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.params = nn.Parameter(torch.randn(6))

    def forward(self, x):
        sinus = torch.sin(self.params)
        cosinus = torch.cos(self.params)

        tr_mat = torch.eye(4)
        tr_mat[:-1, -1] = self.params[3:]

        rz_mat = torch.eye(4)
        rz_mat[0, 0] = cosinus[2]
        rz_mat[0, 1] = -sinus[2]
        rz_mat[1, 0] = sinus[2]
        rz_mat[1, 1] = cosinus[2]

        ry_mat = torch.eye(4)
        ry_mat[0, 0] = cosinus[1]
        ry_mat[0, 2] = sinus[1]
        ry_mat[2, 0] = -sinus[1]
        ry_mat[2, 2] = cosinus[1]

        rx_mat = torch.eye(4)
        rx_mat[1, 1] = cosinus[0]
        rx_mat[1, 2] = -sinus[0]
        rx_mat[2, 1] = sinus[0]
        rx_mat[2, 2] = cosinus[0]

        tf1 = torch.matmul(tr_mat, rz_mat)
        tf2 = torch.matmul(tf1, ry_mat)
        tf3 = torch.matmul(tf2, rx_mat)

        tr_local = torch.tensor(
            [[1, 0, 0, x[0]], [0, 1, 0, x[1]], [0, 0, 1, x[2]], [0, 0, 0, 1]],
        )
        tf_output = torch.matmul(tf3, tr_local)
        output = tf_output[:3, 3]
        return output

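(You may want to double-check this rewrite, but the idea holds.) Also note that tr_local can still be built "your way", because none of its values need to carry a gradient.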
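One way to double-check the eye-insertion idea is to test it on a single rotation in isolation. The sketch below is an assumption-laden illustration rather than part of the original answer: rot_z, angle, and point are hypothetical names, and the check relies on the fact that rotating the point (1, 0, 0) about z gives a y-coordinate of sin(theta), whose derivative is cos(theta).

```python
import torch

angle = torch.nn.Parameter(torch.tensor(0.3))

def rot_z(theta):
    # Start from the identity and write differentiable entries into it.
    mat = torch.eye(4)
    mat[0, 0] = torch.cos(theta)
    mat[0, 1] = -torch.sin(theta)
    mat[1, 0] = torch.sin(theta)
    mat[1, 1] = torch.cos(theta)
    return mat

# Homogeneous point (1, 0, 0); no gradient is needed for the input.
point = torch.tensor([1.0, 0.0, 0.0, 1.0])
rotated = rot_z(angle) @ point

# rotated[1] equals sin(theta), so the gradient should be cos(theta).
rotated[1].backward()
print(angle.grad, torch.cos(torch.tensor(0.3)))
```

If the two printed values match, the in-place writes into the identity matrix are being tracked by autograd, which is the property the rewritten forward method depends on.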

requires_grad

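Notice that requires_grad is not used anywhere in the rewritten code. That is because it is not the whole eye matrix that needs a gradient (we are not optimizing the 0s and 1s), but the parameters inserted into it. Usually you do not need requires_grad at all in your neural network code, because: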

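input tensors are not optimized (usually; they can be when you are doing adversarial attacks or similar)

nn.Parameter requires gradient by default (unless frozen)

layers and other neural-network-specific components require gradient by default (unless frozen)

values that do not need a gradient (input tensors) can still be backpropagated through once they pass through layers (or parameters, etc.) that do require one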
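The defaults listed above can be inspected directly. This is a small sketch under illustrative assumptions: the tensor shapes and the names weights, layer, x, and out are made up for the example and do not come from the question.

```python
import torch
import torch.nn as nn

weights = nn.Parameter(torch.zeros(3))  # tracked by default
layer = nn.Linear(3, 1)                 # its weight and bias are Parameters
x = torch.ones(3)                       # plain input tensor, not tracked

print(weights.requires_grad)        # True
print(layer.weight.requires_grad)   # True
print(x.requires_grad)              # False

# The input does not require a gradient, but the result does, because it
# passed through parameters that do.
out = layer(x * weights)
print(out.requires_grad)            # True

# "Freezing" a parameter means turning tracking off explicitly.
layer.weight.requires_grad_(False)
print(layer.weight.requires_grad)   # False
```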

Votes: 2

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/63772506
