PyTorch Beginner Notes: A First Look at PyTorch

触摸壹缕阳光

Last modified: 2021-03-25 19:13:20


Included in the column: AI机器学习与深度学习算法 (AI Machine Learning and Deep Learning Algorithms)

An Introduction to Deep Learning Frameworks

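The Torch framework was released in 2002. Torch is an open-source machine learning framework under the BSD License, but because it is built around Lua, a relatively niche programming language, it never became widely popular.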

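In October 2016, Facebook AI Research (FAIR) released a beta version of PyTorch, built on Torch. It is a Python-based scientific computing package that provides two high-level features:

Tensor computation (like NumPy) with strong GPU acceleration;
Deep neural networks built on an automatic differentiation system.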
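
As a rough illustration of the first feature, the sketch below shows a few tensor operations that mirror NumPy and the conversion between the two libraries. The values and variable names are made up for illustration, and the code falls back to the CPU when no GPU is present.

```python
import numpy as np
import torch

# Create a tensor much like a NumPy array.
t = torch.tensor([[1., 2.], [3., 4.]])

# Element-wise arithmetic, matrix multiplication, and reductions mirror NumPy.
doubled = t * 2
product = t @ t
total = t.sum()

# Convert between NumPy arrays and tensors (on the CPU the two share memory).
arr = np.array([[1., 2.], [3., 4.]], dtype=np.float32)
from_np = torch.from_numpy(arr)
back_to_np = doubled.numpy()

# Pick the GPU when one is available, otherwise stay on the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
t_dev = t.to(device)
print(product, total.item(), t_dev.device)
```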

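The first stable release, PyTorch 1.0, came out in December 2018. Note that there was a major update between PyTorch 0.3 and PyTorch 0.4, which introduced some incompatibilities: code written for PyTorch 0.3 or earlier needs a small number of changes to run on PyTorch 0.4 and later.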
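
To give a sense of what those changes look like, here is a minimal sketch of three common adjustments, written in the 0.4+ style with the older idiom noted in the comments. It is not a complete migration checklist, and the tensor values are made up for illustration.

```python
import torch

# Since 0.4, a plain tensor can track gradients; no Variable wrapper is needed.
w = torch.tensor([1., 2., 3.], requires_grad=True)
loss = (w ** 2).sum()
loss.backward()

# The 0.3-style `loss.data[0]` becomes `loss.item()` for a zero-dimensional tensor.
loss_value = loss.item()

# The 0.3-style `volatile=True` flag becomes a `torch.no_grad()` block for inference.
with torch.no_grad():
    prediction = w * 2

print(loss_value, w.grad, prediction.requires_grad)
```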

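The two most widely recognized deep learning frameworks today are PyTorch and TensorFlow. The most fundamental difference between PyTorch and TensorFlow 1.X is whether the framework is dynamic-graph-first or static-graph-first (TensorFlow 2.X also supports dynamic graphs).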

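With a dynamic graph, the program executes in the order the code is written. This makes debugging much easier and lets you express an idea directly in code.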
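
The sketch below illustrates the point under simple assumptions: `halve_until_small` is a hypothetical helper, not a PyTorch function. The number of loop iterations depends on the data, intermediate values can be printed like any Python object, and gradients still flow through whatever path was actually executed.

```python
import torch

def halve_until_small(x, limit=1.0):
    """Halve x until its norm drops below limit; the loop count depends on the data."""
    steps = 0
    while x.norm() >= limit:
        x = x / 2
        steps += 1
        # Intermediate results are available immediately, which helps debugging.
        print('step', steps, 'norm', x.norm().item())
    return x, steps

x = torch.tensor([3., 4.], requires_grad=True)  # the norm starts at 5
y, steps = halve_until_small(x)
y.sum().backward()  # differentiate through the loop iterations that actually ran
print(steps, x.grad)
```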

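A static graph lets the compiler optimize more aggressively, because building the computation graph is separated from running it. The cost is that code is harder to debug and errors cannot be caught as soon as they occur. In TensorFlow 1.X, getting the result of a node requires running the computation graph inside a Session. Once a static computation graph has been defined, it cannot be changed while it runs.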

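import tensorflow as tf  # TensorFlow 1.x API

# Build the computation graph
x_ph = tf.placeholder(tf.int32, name='x')
y_ph = tf.placeholder(tf.int32, name='y')
# Op names cannot contain spaces or '*', so use a valid identifier
z_ph = tf.multiply(x_ph, y_ph, name='x_times_y')

# Run the computation graph
with tf.Session() as sess:
    z_val = sess.run(z_ph, feed_dict={x_ph: [8], y_ph: [9]})
    print(z_val)  # [72]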
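
For contrast, the same multiplication can be sketched in PyTorch, where no placeholder or session is involved and the result exists as soon as the line runs. The variable names here are chosen for illustration.

```python
import torch

x = torch.tensor([8], dtype=torch.int32)
y = torch.tensor([9], dtype=torch.int32)

# The product is computed immediately, at the point where this line executes.
z = x * y
print(z.tolist())  # [72]
```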

PyTorch is recommended for researchers, and TensorFlow 2.X for engineers.

What Can PyTorch Do?

GPU acceleration

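import time
import torch

print(torch.__version__)
print(torch.cuda.is_available())

a = torch.randn(10000, 1000)
b = torch.randn(1000, 2000)

# Matrix multiplication on the CPU
t0 = time.time()
c = torch.matmul(a, b)
t1 = time.time()
print(a.device, t1 - t0, c.norm(2))

# Move both tensors to the GPU (requires a CUDA-capable device)
device = torch.device('cuda')
a = a.to(device)
b = b.to(device)

# The first run on CUDA also pays for one-time environment initialization
t0 = time.time()
c = torch.matmul(a, b)
torch.cuda.synchronize()  # CUDA kernels run asynchronously; wait before reading the clock
t2 = time.time()
print(a.device, t2 - t0, c.norm(2))

# The second run reflects the steady-state GPU time
t0 = time.time()
c = torch.matmul(a, b)
torch.cuda.synchronize()
t2 = time.time()
print(a.device, t2 - t0, c.norm(2))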
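
The timing pattern above can also be wrapped in a small helper that runs on whichever device is available. This is only a sketch: `time_matmul` is a hypothetical name, the matrix sizes are arbitrary, and the measured numbers will vary by machine.

```python
import time
import torch

def time_matmul(a, b, repeats=3):
    """Return per-run wall-clock times for matmul(a, b) on the tensors' device."""
    times = []
    for _ in range(repeats):
        if a.is_cuda:
            torch.cuda.synchronize()  # finish pending GPU work before starting the clock
        t0 = time.perf_counter()
        c = torch.matmul(a, b)
        if a.is_cuda:
            torch.cuda.synchronize()  # wait for the kernel before stopping the clock
        times.append(time.perf_counter() - t0)
    return times, c

# Fall back to the CPU when no CUDA device is present.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
a = torch.randn(1000, 500, device=device)
b = torch.randn(500, 200, device=device)
times, c = time_matmul(a, b)
print(device, times, tuple(c.shape))
```

On a GPU, the first entry in `times` is typically the largest because it includes the one-time initialization cost.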

Automatic differentiation, taking $y = a^2 x + bx + c$ as an example:

import torch
from torch import autograd

x = torch.tensor(1.)
# requires_grad=True marks a tensor as one that can be differentiated with respect to
# Such tensors must have a floating-point dtype
a = torch.tensor(1., requires_grad=True)
b = torch.tensor(2., requires_grad=True)
c = torch.tensor(3., requires_grad=True)
y = a**2 * x + b * x + c

print('before:', a.grad, b.grad, c.grad)  # before: None None None
# Compute the derivatives of y with respect to a, b, and c
grads = autograd.grad(y, [a, b, c])
print('after:', grads[0], grads[1], grads[2])  # after: tensor(2.) tensor(1.) tensor(1.)

When $x = 1$, $a = 1$, $b = 2$, and $c = 3$, we have $\frac{\partial y}{\partial a} = 2ax = 2 \times 1 \times 1 = 2$, $\frac{\partial y}{\partial b} = x = 1$, and $\frac{\partial y}{\partial c} = 1$.
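
The same derivatives can also be obtained with `y.backward()`, which stores each result in the `.grad` attribute of the corresponding tensor instead of returning a tuple. The sketch below reuses the values from the example and compares them with the hand-derived formulas; the `expected_*` names are introduced here only for the comparison.

```python
import torch

x = torch.tensor(1.)
a = torch.tensor(1., requires_grad=True)
b = torch.tensor(2., requires_grad=True)
c = torch.tensor(3., requires_grad=True)
y = a**2 * x + b * x + c

# backward() fills .grad on every leaf tensor created with requires_grad=True.
y.backward()

# Analytic derivatives: dy/da = 2ax, dy/db = x, dy/dc = 1.
expected_a = 2 * a.item() * x.item()
expected_b = x.item()
expected_c = 1.0
print(a.grad.item(), b.grad.item(), c.grad.item())
```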