I'm building MobileNetV1 in PyTorch, and every time I train the model I run out of memory (the run logs "Killed!" and crashes abruptly).
Here is my code.
Config file (YAML):
n_gpu: 0
arch:
  type: MobileNet
  args:
    in_channels: 3
    num_classes: 26
data_loader:
  type: BallDataLoader
  args:
    data_dir: data/balls/
    batch_size: 64
    shuffle: true
    validation_split: 0.2
    num_workers: 0
    resize:
      - 224
      - 224
optimizer:
  type: Adam
  args:
    lr: 1.0e-2
    weight_decay: 0
    amsgrad: true
loss: nll_loss
metrics:
  - accuracy
  - top_k_acc
lr_scheduler:
  type: StepLR
  args:
    step_size: 50
    gamma: 0.1
trainer:
  epochs: 50
  save_dir: saved/
  save_period: 2
  verbosity: 2
  monitor: min val_loss
  early_stop: 10
  tensorboard: true
modules.py:
import torch.nn as nn
import torch.nn.functional as F


class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=None):
        super().__init__()
        if padding is None:
            # "Same"-style padding for odd kernel sizes
            padding = kernel_size // 2
        # Depthwise step: one filter per input channel (groups=in_channels)
        self.depth_wise_conv = nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding, groups=in_channels)
        self.bn1 = nn.BatchNorm2d(in_channels)
        # Pointwise step: 1x1 convolution that mixes channels
        self.point_wise_conv = nn.Conv2d(in_channels, out_channels, (1, 1), 1, 0)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.in_channels = in_channels
        self.out_channels = out_channels

    def forward(self, x):
        x = self.depth_wise_conv(x)
        x = self.bn1(x)
        x = F.relu(x)
        x = self.point_wise_conv(x)
        x = self.bn2(x)
        x = F.relu(x)
        return x
model.py:
import torch.nn as nn
import torch.nn.functional as F

# DepthwiseSeparableConv comes from modules.py above.
# ImageNet is a base class from my own project; its definition is not shown here.


class MobileNet(ImageNet):
    def __init__(self, in_channels=3, num_classes=1000):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1, stride=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            DepthwiseSeparableConv(32, 64),
            DepthwiseSeparableConv(64, 128, stride=2),
            DepthwiseSeparableConv(128, 128),
            DepthwiseSeparableConv(128, 256),
            DepthwiseSeparableConv(256, 256),
            DepthwiseSeparableConv(256, 512, stride=2),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 512),
            DepthwiseSeparableConv(512, 1024, stride=1),
            DepthwiseSeparableConv(1024, 1024, stride=2),
            nn.AdaptiveAvgPool2d(1)
        )
        self.fc = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.convs(x)
        # Flatten the pooled (N, 1024, 1, 1) tensor to (N, 1024)
        x = x.view(-1, 1024)
        x = self.fc(x)
        # Log-probabilities, to match the nll_loss setting in the config
        x = F.log_softmax(x, dim=1)
        return x
So I found a model at https://github.com/jmjeon94/MobileNet-Pytorch, and it works. After several hours I still can't figure out why this happens: the two models are almost identical, and since the MobileNet architecture is very lightweight, I didn't expect it to take up this much memory. Could this be caused by the Python interpreter, or is something wrong with my code?
Posted on 2022-03-08 07:53:17
I think it's because of your batch size. Try a smaller batch size, such as 32, 16, 8, 4, or 2.
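To get a feel for why batch size matters here, the following is a rough back-of-the-envelope sketch, not a measurement. It walks the layer list from the question and adds up the forward activations that would be kept for backpropagation. The names `LAYERS`, `conv_out`, and `activation_bytes` are made up for this illustration, and the estimate assumes float32 activations, one stored tensor per conv and BatchNorm output plus one per non-in-place ReLU, and no framework overhead, so real usage will differ.

```python
# Rough forward-activation estimate for the layer list in the question.
# Assumptions (not from the original post): float32 values, every conv/BN
# output and every non-in-place ReLU output is kept for the backward pass,
# and nothing else (weights, gradients, optimizer state) is counted.

# (out_channels, stride): the first 3x3 conv, then each DepthwiseSeparableConv
LAYERS = [
    (32, 1),
    (64, 1), (128, 2), (128, 1), (256, 1), (256, 1), (512, 2),
    (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
    (1024, 1), (1024, 2),
]


def conv_out(size, stride, kernel=3, padding=1):
    """Output height/width of a convolution on a square input."""
    return (size + 2 * padding - kernel) // stride + 1


def activation_bytes(batch_size, image_size=224, in_channels=3, bytes_per_value=4):
    """Estimated bytes of stored activations for one forward pass."""
    size = image_size
    channels = in_channels
    values_per_sample = 0
    for index, (out_channels, stride) in enumerate(LAYERS):
        size = conv_out(size, stride)
        if index == 0:
            # conv + BN outputs; the in-place ReLU reuses the BN tensor
            values_per_sample += 2 * out_channels * size * size
        else:
            # depthwise conv, BN, ReLU keep the input channel count
            values_per_sample += 3 * channels * size * size
            # pointwise conv, BN, ReLU produce out_channels maps
            values_per_sample += 3 * out_channels * size * size
        channels = out_channels
    return values_per_sample * batch_size * bytes_per_value


if __name__ == "__main__":
    for batch_size in (64, 32, 16, 8, 4, 2):
        gib = activation_bytes(batch_size) / 1024 ** 3
        print(f"batch_size={batch_size:>2}: about {gib:.1f} GiB of activations")
```

Under these assumptions the estimate grows linearly with the batch size, so halving the batch roughly halves the activation memory.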
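Posted on 2022-03-08 09:12:58
I deleted the line nn.Conv2d(in_channels, 32, kernel_size= 3, padding= 1, stride = 1 ), retyped the same code, and now it runs. I still don't know why, but it seems the interpreter or the text editor was causing the error. Thanks for your attention, and special thanks to @Anmol for the effort; I really appreciate it.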
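One thing that may be worth checking when comparing two "almost identical" models is the stride of each layer, because the stride decides how large every later feature map is. The sketch below only traces feature-map sizes; it does not claim to be the cause of the crash, which the post does not confirm. `POSTED_STRIDES` is read from the code in the question. `PAPER_STRIDES` is an assumption: it is my reading of the layer table in the MobileNetV1 paper, with the last block taken as stride 1 so the final map stays 7x7. `feature_map_sizes` is a helper invented for this example.

```python
# Trace the feature-map height/width after each layer for a 224x224 input.
# Order: the first 3x3 conv, then the 13 depthwise separable blocks.

# Strides as written in the question's model.py
POSTED_STRIDES = [1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2]

# Assumption: strides as read from the MobileNetV1 paper's layer table
PAPER_STRIDES = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]


def feature_map_sizes(strides, image_size=224, kernel=3, padding=1):
    """Return the spatial size after each 3x3 (depthwise) convolution."""
    sizes = []
    size = image_size
    for stride in strides:
        size = (size + 2 * padding - kernel) // stride + 1
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    posted = feature_map_sizes(POSTED_STRIDES)
    paper = feature_map_sizes(PAPER_STRIDES)
    for index, (a, b) in enumerate(zip(posted, paper)):
        print(f"layer {index:>2}: posted {a:>3}x{a:<3}  paper {b:>3}x{b:<3}")
```

If the retyped or reference model uses different strides from the ones posted, the per-layer sizes printed here would show where the two diverge.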
https://stackoverflow.com/questions/71391513