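Today we walk through a must-do beginner project: CIFAR10 classification with PyTorch, that is, training a simple image classifier on the CIFAR10 dataset.
First, a quick look at the CIFAR10 dataset: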
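Dataset: CIFAR-10 and CIFAR-100 are labeled subsets of the 80 Million Tiny Images dataset.
Collected by: Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
Format: 60,000 32x32 color images in 10 classes, 6,000 images per class, split into 50,000 training images and 10,000 test images.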
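A visual look at the data: [figure: sample CIFAR10 images]
Our goal today is to train a neural network that takes a CIFAR10 image as input and outputs its predicted class (one of the 10 classes).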
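I. Overall steps:
Step 1: Load and normalize the CIFAR10 training and test sets with torchvision
Step 2: Define a convolutional neural network (CNN) in PyTorch
Step 3: Define a loss function
Step 4: Train the network on the training set
Step 5: Test the network on the test set
Step 6: Test the network on each class separately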
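II. Key points:
1. How to download the data:
Use torchvision.datasets.CIFAR10 to download the dataset and torch.utils.data.DataLoader to load it in batches.
# download=True fetches the data on the first run and reuses the local copy afterwards
train_data = torchvision.datasets.CIFAR10(root='./CIFAR10data', train=True,
                                          download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=4,
                                           shuffle=True, num_workers=2)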
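2. Defining the neural network
The required inheritance boilerplate:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
Between the convolutional layers and the fully connected layers, the feature maps must be flattened into a vector;
Layers are defined first (in __init__) and then used (in forward): conv -> relu -> pool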
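To see where the flattened size comes from, here is a minimal sketch of the shape arithmetic, assuming the layer sizes used in the full code later in this post (two 5x5 convolutions without padding, each followed by a 2x2 max-pool, with 16 output channels at the end). The helper name conv_out is made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window on a square input."""
    return (size + 2 * padding - kernel) // stride + 1

side = 32                                   # CIFAR10 images are 32x32
side = conv_out(side, kernel=5)             # conv1, 5x5 kernel: 32 -> 28
side = conv_out(side, kernel=2, stride=2)   # 2x2 max-pool:      28 -> 14
side = conv_out(side, kernel=5)             # conv2, 5x5 kernel: 14 -> 10
side = conv_out(side, kernel=2, stride=2)   # 2x2 max-pool:      10 -> 5

flat_features = 16 * side * side            # 16 channels of 5x5 maps
print(side, flat_features)
```

This is why the first fully connected layer takes 16*5*5 = 400 input features.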
3. Defining the loss function and optimizer:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
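nn.CrossEntropyLoss expects raw, unnormalized scores (logits) and applies log-softmax internally. The following plain-Python sketch computes the same quantity by hand for a single sample; the function name cross_entropy and the example scores are made up for illustration.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max before exponentiating, for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

logits = [2.0, 0.5, -1.0]                 # scores for 3 hypothetical classes
loss_right = cross_entropy(logits, 0)     # target is the highest-scoring class
loss_wrong = cross_entropy(logits, 2)     # target is the lowest-scoring class
uniform = cross_entropy([0.0] * 10, 3)    # 10 equal scores: loss is ln(10)

print(loss_right, loss_wrong, uniform)
```

A classifier that spreads its scores evenly over 10 classes has a loss of ln(10), about 2.303, which is a useful reference point when reading the first lines of a training log.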
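4. Training the network
input -> net -> loss -> loss.backward() -> optimizer.step()
(Since PyTorch 0.4, tensors no longer need to be wrapped in Variable before being passed to the network.)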
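The training step can be sketched on random data, without downloading anything. This is only an illustration under stated assumptions: the tiny nn.Linear model, the fake batch, and the learning rate of 0.1 are stand-ins chosen for this example, not the network or settings used in this post.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

model = nn.Linear(8, 3)                   # hypothetical stand-in for the CNN
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

inputs = torch.randn(16, 8)               # fake batch: 16 samples, 8 features
targets = torch.randint(0, 3, (16,))      # fake labels in {0, 1, 2}

losses = []
for step in range(20):
    optimizer.zero_grad()                 # clear gradients from the previous step
    outputs = model(inputs)               # forward pass
    loss = criterion(outputs, targets)    # compare scores with labels
    loss.backward()                       # compute gradients
    optimizer.step()                      # update the parameters
    losses.append(loss.item())            # .item() turns the 0-dim tensor into a float

print(losses[0], losses[-1])
```

Repeating the step on the same batch should drive the loss down, which is a quick way to check that the loop is wired correctly before training on real data.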
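5. Prediction: testing the network
Feed the test set through the network the same way as in training (without the backward pass and the optimizer step), and count the correct predictions:
correct += (pred == labels).sum().item()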
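As a small worked example of how the count of correct predictions is formed, the sketch below uses hand-made scores for 4 images and 3 classes (both are made up for illustration). torch.max along dimension 1 returns the largest score in each row together with its index, and the index is the predicted class.

```python
import torch

# Hand-made scores: 4 images (rows) x 3 classes (columns)
outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.2, 0.3],
                        [0.0, 0.1, 0.9],
                        [0.3, 0.2, 0.1]])
labels = torch.tensor([1, 0, 1, 0])

values, pred = torch.max(outputs, 1)       # pred holds the argmax of each row
correct = (pred == labels).sum().item()    # number of matches as a Python int
total = labels.size(0)
accuracy = 100 * correct / total

print(pred.tolist(), correct, total, accuracy)
```

Here the third image is predicted as class 2 while its label is 1, so 3 of 4 predictions match.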
6. Per-class testing
_, pred = torch.max(outputs, 1)  # index of the largest score in each row = predicted class
c = (pred == labels).squeeze()   # one boolean per image in the batch
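The per-class bookkeeping can be illustrated on a single hand-made batch. The 3-class tuple and the scores below are hypothetical and only serve to show how the two tallies are filled.

```python
import torch

classes = ('cat', 'dog', 'ship')            # hypothetical 3-class example
labels = torch.tensor([0, 1, 2, 1])
outputs = torch.tensor([[2.0, 0.1, 0.3],    # predicted 0, label 0: correct
                        [0.2, 0.1, 1.5],    # predicted 2, label 1: wrong
                        [0.0, 0.3, 0.9],    # predicted 2, label 2: correct
                        [0.1, 3.0, 0.2]])   # predicted 1, label 1: correct

_, pred = torch.max(outputs, 1)
c = (pred == labels)                        # one boolean per image

class_correct = [0] * len(classes)
class_total = [0] * len(classes)
for i in range(labels.size(0)):
    label = labels[i].item()                # plain int, usable as a list index
    class_correct[label] += int(c[i].item())
    class_total[label] += 1

for name, ok, n in zip(classes, class_correct, class_total):
    print('%5s: %d of %d correct' % (name, ok, n))
```

Each image increments the total for its true class, and the correct counter only when the prediction matches.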
III. Full code:
(1) Import the required packages
# -*- coding: utf-8 -*-
import torch
import torchvision
import torchvision.transforms as transforms
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
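(2) Load the data and normalize it into the required format
ToTensor: the loaded data are PIL images and must be converted to tensors (with values in [0, 1])
Normalize: maps the image data to the range [-1, 1] instead of the initial [0, 1]
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])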
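The arithmetic behind Normalize with a mean of 0.5 and a standard deviation of 0.5 can be checked by hand: each value x becomes (x - 0.5) / 0.5. The two helper functions below are written only for this illustration and are not part of torchvision.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-value rule applied by Normalize: (x - mean) / std."""
    return (x - mean) / std

def unnormalize(y, mean=0.5, std=0.5):
    """Inverse mapping: y * std + mean, the same as y / 2 + 0.5 here."""
    return y * std + mean

darkest = normalize(0.0)      # smallest pixel value after ToTensor
middle = normalize(0.5)
brightest = normalize(1.0)    # largest pixel value after ToTensor
round_trip = unnormalize(normalize(0.25))

print(darkest, middle, brightest, round_trip)
```

The inverse mapping is the one applied as img / 2 + 0.5 before an image is displayed.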
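(3) Download the data
# download=True fetches the data on the first run and reuses the local copy afterwards
train_data = torchvision.datasets.CIFAR10(root='./CIFAR10data', train=True,
                                          download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=4,
                                           shuffle=True, num_workers=2)
test_data = torchvision.datasets.CIFAR10(root='./CIFAR10data', train=False,
                                         download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=4,
                                          shuffle=False, num_workers=2)
Question: why is shuffle=False for test_loader but shuffle=True for train_loader?
Answer: shuffle randomizes the order of the samples. During training this yields randomly composed mini-batches in every epoch; at test time every test sample is evaluated exactly once and the order does not affect the result, so there is no need to shuffle.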
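The effect of shuffle is easy to observe on a toy dataset. The sketch below uses a TensorDataset holding the numbers 0 to 7 as a stand-in for CIFAR10 (an assumption made so that nothing has to be downloaded).

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

data = TensorDataset(torch.arange(8))       # toy dataset: samples 0..7

# shuffle=False: batches follow the dataset order
ordered = [batch[0].tolist()
           for batch in DataLoader(data, batch_size=4, shuffle=False)]

# shuffle=True: a new random order is drawn every time the loader is iterated
torch.manual_seed(0)
shuffled = [batch[0].tolist()
            for batch in DataLoader(data, batch_size=4, shuffle=True)]

print(ordered)
print(shuffled)
```

In both cases every sample appears exactly once per pass; only the order and the composition of the batches differ.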
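(4) Display images
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
def imshow(img):
    img = img / 2 + 0.5  # unnormalize: [-1, 1] back to [0, 1]
    npimg = img.numpy()
    # np.transpose reorders the axes: (channels, height, width) -> (height, width, channels)
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()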
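The transpose is needed because PyTorch stores an image as (channels, height, width) while matplotlib expects (height, width, channels). A minimal sketch with a made-up all-red image shows the effect on the array shape:

```python
import numpy as np

chw = np.zeros((3, 32, 32), dtype=np.float32)   # PyTorch layout: channels first
chw[0] = 1.0                                    # fill the red channel

hwc = np.transpose(chw, (1, 2, 0))              # new axis order: old axes 1, 2, 0

print(chw.shape, hwc.shape)
print(hwc[0, 0])                                # RGB values of the top-left pixel
```

After the transpose, indexing a pixel position returns its three color values together, which is the layout plt.imshow draws from.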
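(5) Define the convolutional neural network model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3 input channels, 6 output channels, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)     # 2x2 max-pooling with stride 2
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)       # one output per class
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # flatten into a vector
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        # no ReLU on the last layer: CrossEntropyLoss expects raw scores,
        # and clamping them at zero would discard negative scores
        x = self.fc3(x)
        return x
net = Net()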
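Before training, it can be worth checking that the model accepts a CIFAR10-sized batch and produces one score per class. The sketch below repeats the network in a self-contained form (returning the raw scores of the last layer) and feeds it a random batch; the random input is an assumption standing in for real images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)          # flatten to (batch, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                  # raw scores, one per class

net = Net()
batch = torch.randn(4, 3, 32, 32)           # random stand-in for 4 RGB 32x32 images
logits = net(batch)
n_params = sum(p.numel() for p in net.parameters())

print(tuple(logits.shape), n_params)
```

A batch of 4 images should yield a 4x10 score matrix. The parameter count shows that most of the weights sit in the first fully connected layer (400*120 + 120 = 48,120 of the total).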
(6) Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)  # SGD(parameters to update, learning rate, momentum)
(7) Train the network
for epoch in range(1):
    running_loss = 0.0
    # the second argument of enumerate (0) is the starting index
    for i, data in enumerate(train_loader, 0):
        inputs, targets = data
        optimizer.zero_grad()  # clear the gradients left over from the previous step
        outputs = net(inputs)
        loss = criterion(outputs, targets)  # cross-entropy loss between outputs and targets
        loss.backward()
        optimizer.step()
        # Question: why loss.item() rather than loss itself?
        # loss is a zero-dimensional tensor; loss.item() extracts its value as a Python float.
        # (Older PyTorch versions used loss.data[0], which raises an error in current versions.)
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches (i starts from 0)
            print('[%d,%5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
print('Finished Training')
Output (exact values vary from run to run):
'''
[1, 2000] loss: 2.252
[1, 4000] loss: 1.894
[1, 6000] loss: 1.677
[1, 8000] loss: 1.597
'''
(8) Test the network
dataiter = iter(test_loader)
images, labels = next(dataiter)  # the built-in next() works across PyTorch versions
imshow(torchvision.utils.make_grid(images))
print('GroundTruth:', ' '.join('%5s' % classes[labels[j]] for j in range(4)))
outputs = net(images)
_, pred = torch.max(outputs, 1)  # pred is a 1-D tensor of class indices
print('Predicted: ', ' '.join('%5s' % classes[pred[j]] for j in range(4)))
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for data in test_loader:
        images, labels = data
        outputs = net(images)
        _, pred = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (pred == labels).sum().item()
print('Accuracy of the network on the 10000 test images : %d %%' % (100 * correct / total))
(9) Analyze the results: which classes are classified well and which are not
class_correct = [0.0] * 10
class_total = [0.0] * 10
with torch.no_grad():
    for data in test_loader:
        images, labels = data
        outputs = net(images)
        _, pred = torch.max(outputs, 1)
        c = (pred == labels).squeeze()  # one boolean per image in the batch (4 here)
        for i in range(4):
            label = labels[i].item()
            class_correct[label] += c[i].item()
            class_total[label] += 1
for i in range(10):
    print('Accuracy of %5s : %2d %%' % (classes[i], 100 * class_correct[i] / class_total[i]))
That wraps up this small project. After reading, be sure to run it yourself and see what results you get!