Forward Propagation

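The forward pass shown in the figures is a single matrix multiply followed by the sigmoid activation. A minimal sketch, using the same shapes as the code later in the article (one sample with 3 features, a 3×2 weight matrix, no bias); the zero-initialized `W` here is only for illustration, the article initializes randomly:

```python
import numpy as np

def sigmoid(x):
    """Logistic activation: squashes each entry into (0, 1)."""
    return 1 / (1 + np.exp(-x))

# Hypothetical values for illustration.
X = np.array([[2.0, 4.0, -2.0]])   # input, shape (1, 3)
W = np.zeros((3, 2))               # weights, shape (3, 2)

y = X @ W            # linear combination, shape (1, 2)
y_o = sigmoid(y)     # activated output of the layer
```

With all-zero weights every pre-activation is 0, so every output is sigmoid(0) = 0.5.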
Cost function

Once the output layer's values have passed through the activation function, they are fed into the cost function to compute the errors. Here we use the cross-entropy cost function.

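For a single example, cross-entropy only looks at the probability the network assigns to the true class: L = -log(p_true). A small sketch (the probabilities are made up for illustration):

```python
import numpy as np

def cross_entropy(probs, label):
    """Cross-entropy for one example: -log of the true-class probability."""
    return -np.log(probs[label])

probs = np.array([0.9, 0.1])        # hypothetical predicted probabilities
low = cross_entropy(probs, 0)       # confident and correct -> small loss
high = cross_entropy(probs, 1)      # confident but wrong   -> large loss
```

The loss grows without bound as the true-class probability approaches 0, which is what pushes the gradient hard when the network is confidently wrong.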
Backpropagation

• First, go from the cost function back to the sigmoid layer

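For L = -log(y_o[true]), the derivative with respect to the output is -1/y_o at the true class and 0 everywhere else. A sketch of just this step (the helper name `dcost_dyo` is mine, not from the article):

```python
import numpy as np

def dcost_dyo(y_o, label):
    """dL/dy_o for cross-entropy: -1/y_o at the true class, 0 elsewhere."""
    grad = np.zeros_like(y_o)
    grad[label] = -1.0 / y_o[label]
    return grad

y_o = np.array([0.8, 0.2])
g = dcost_dyo(y_o, 0)   # -1/0.8 = -1.25 at the true class, 0 at the other
```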
• From the sigmoid layer to the output layer, the derivative is simply the derivative of the sigmoid function

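The sigmoid has the convenient property σ'(x) = σ(x)·(1 − σ(x)), so the backward pass can reuse the value already computed in the forward pass. A quick sketch, verified against a central finite difference:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def derivative_sigmoid(x):
    """sigma'(x) = sigma(x) * (1 - sigma(x)) -- reuses the forward value."""
    s = sigmoid(x)
    return s * (1 - s)

peak = derivative_sigmoid(0.0)   # 0.25 exactly: the slope is steepest at x = 0
```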
• From the output layer back to the input layer, the derivative is taken with respect to the weights; first look at how y depends on the weights

```
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def derivative_sigmoid(x):
    return np.multiply(sigmoid(x), (1 - sigmoid(x)))

# initialization
# X : 1*3
X = np.array([[2, 4, -2]])
# W : 3*2
W = np.random.normal(size=(3, 2))
# label
ycap = [0]
# number of training examples
num_examples = 1
# step size
h = 0.01

# forward propagation
y = np.dot(X, W)
y_o = sigmoid(y)
# loss calculation
loss = -np.sum(np.log(y_o[range(num_examples), ycap]))
print(loss)     # e.g. 3.6821105514 (your value will differ due to the random weight initialization)

# backprop starts
temp1 = np.copy(y_o)
# derivative of the cost function with respect to y_o: -1 / y_o at the true class
temp1[range(num_examples), ycap] = 1 / -(temp1[range(num_examples), ycap])
temp = np.zeros_like(y_o)
temp[range(num_examples), ycap] = 1
# derivative of cost with respect to y_o (zero everywhere except the true class)
dcost = np.multiply(temp, temp1)
# derivative of y_o with respect to y
dy_o = derivative_sigmoid(y)
# element-wise multiplication
dgrad = np.multiply(dcost, dy_o)
# derivative of cost with respect to the weights
dw = np.dot(X.T, dgrad)
# weight update
W -= h * dw

# forward prop again with the updated weights to find the new loss
y = np.dot(X, W)
y_o = sigmoid(y)
loss = -np.sum(np.log(y_o[range(num_examples), ycap]))
print(loss)     # e.g. 3.45476397276 (again, your value will differ)
```

To be continued.
