
# Some Notes on the Gradient Descent Algorithm

1. Preface

2. Main Text

2.1 Gradient

2.2 Gradient Descent

2.2.1 Batch Gradient Descent

2.2.2 Stochastic Gradient Descent

3. References

1. Preface

2. Main Text

2.1 Gradient

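The gradient ∇f = (∂f/∂x, ∂f/∂y) of a function points in the direction of steepest ascent, so stepping against it decreases f fastest. As a quick sanity check (a sketch of my own; the helper names are not from the original code), the hand-derived gradient of the Rosenbrock function used throughout this article can be compared against a finite-difference approximation:

```python
import numpy as np

def rosenbrock(x, y):
    # f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def analytic_grad(x, y):
    # partial derivatives derived by hand
    dfdx = -2 * (1 - x) - 400 * x * (y - x ** 2)
    dfdy = 200 * (y - x ** 2)
    return np.array([dfdx, dfdy])

def numeric_grad(x, y, h=1e-6):
    # central finite differences approximate the same vector
    dfdx = (rosenbrock(x + h, y) - rosenbrock(x - h, y)) / (2 * h)
    dfdy = (rosenbrock(x, y + h) - rosenbrock(x, y - h)) / (2 * h)
    return np.array([dfdx, dfdy])

print(analytic_grad(0.5, 0.5))   # close to the finite-difference result
print(numeric_grad(0.5, 0.5))
```

If the two printed vectors agree, the analytic derivatives plugged into the descent code below are trustworthy.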

2.2 Gradient Descent

```python
# -*- coding: utf-8 -*-
import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D


def Rosenbrock(x, y):
    """this function: f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2"""
    return np.power(1 - x, 2) + 100 * np.power(y - np.power(x, 2), 2)


if __name__ == '__main__':
    X = np.arange(-2, 2, 0.1)
    Y = np.arange(-2, 2, 0.1)
    X, Y = np.meshgrid(X, Y)  # dense grid so plot_surface gets 2-D arrays
    Z = Rosenbrock(X, Y)

    fig = plt.figure()
    ax = Axes3D(fig)
    ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap='rainbow')
    ax.set_xlabel('x label', color='r')
    ax.set_ylabel('y label', color='g')
    ax.set_zlabel('z label', color='b')
    plt.show()
```

```python
# -*- coding: utf-8 -*-
import random

import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D


def Rosenbrock(x, y):
    """this function: f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2"""
    return np.power(1 - x, 2) + 100 * np.power(y - np.power(x, 2), 2)


def drawPaht(px, py, pz, X, Y, func=Rosenbrock):
    fig = plt.figure()
    ax = Axes3D(fig)
    X, Y = np.meshgrid(X, Y)
    Z = func(X, Y)
    ax.set_xlabel('x label', color='r')
    ax.set_ylabel('y label', color='g')
    ax.set_zlabel('z label', color='b')
    ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap='rainbow')
    ax.plot(px, py, pz, 'r.')  # plot the points along the descent path
    plt.show()


def gradeAscent(X, Y, Maxcycles=100000, learnRate=0.0001):
    # Maxcycles and learnRate were missing from the original listing; these
    # values are assumed and may need tuning (too large a step diverges
    # on the Rosenbrock surface)
    new_x = [X]
    new_Y = [Y]
    g_z = [Rosenbrock(X, Y)]
    current_x = X
    current_Y = Y
    for cycle in range(Maxcycles):
        # partial derivatives of f: df/dy = 200*(y - x^2),
        # df/dx = -2*(1 - x) - 400*x*(y - x^2)
        current_Y -= learnRate * 200 * (Y - X * X)
        current_x -= learnRate * (-2 * (1 - X) - 400 * X * (Y - X * X))
        X = current_x
        Y = current_Y
        new_x.append(X)
        new_Y.append(Y)
        g_z.append(Rosenbrock(X, Y))
    return new_x, new_Y, g_z


if __name__ == '__main__':
    X = np.arange(-3, 4, 0.1)
    Y = np.arange(-3, 4, 0.1)
    x = random.uniform(-3, 4)
    y = random.uniform(-3, 4)
    print(x, y)
    x, y, z = gradeAscent(x, y)
    print(len(x), x)
    print(len(y), y)
    print(len(z), z)
    drawPaht(x, y, z, X, Y, Rosenbrock)
```

First we set x₀ = 1, so the hypothesis simplifies to:

h(x) = θ₀x₀ + θ₁x₁ + … + θₙxₙ = θᵀx

J denotes the mean squared error between the function's predicted values and the actual values Y:

J(θ) = (1/2) Σᵢ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)²

It measures how far the predictions deviate from the truth. With that in place, let us return to the problem above and minimize J by gradient descent.

2.2.1 Batch Gradient Descent (BGD)

Taking the partial derivative of J with respect to each weight and stepping against it, we arrive at the following update rule:

θⱼ := θⱼ + α Σᵢ (y⁽ⁱ⁾ − h(x⁽ⁱ⁾)) xⱼ⁽ⁱ⁾

To unpack this equation:

```python
# w: the weights; trainingSet: the set of training examples
# n: the number of features x_j, with j in [1, n]; a: the learning rate
for j in range(n):
    grad = 0.0
    for i in range(len(trainingSet)):  # every example contributes to one update
        grad += (y[i] - h(x[i])) * x[i][j]
    w[j] += a * grad
```
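The batch rule above can be sketched end to end on a toy linear-regression problem; the data, learning rate, and iteration count below are invented for illustration:

```python
import numpy as np

# toy data generated from y = 1 + 2*x1, with x0 = 1 as the intercept feature
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

w = np.zeros(2)   # all weights start at zero
a = 0.1           # learning rate (assumed value)

for _ in range(1000):
    h = X.dot(w)                       # predictions for EVERY example
    w += a * X.T.dot(y - h) / len(X)   # w_j += a * mean_i (y_i - h(x_i)) * x_ij

print(w)   # approaches [1, 2], the true intercept and slope
```

Each iteration touches the whole training set once before changing the weights, which is exactly what makes the method "batch" (and what makes it expensive on large datasets).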

2.2.2 Stochastic Gradient Descent (SGD)
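In contrast with the batch rule, stochastic gradient descent updates the weights using one randomly chosen training example at a time, so it starts making progress before a full pass over the data. A minimal sketch on the same kind of toy data (all names and values here are assumed for illustration):

```python
import random
import numpy as np

# toy data generated from y = 1 + 2*x1, with x0 = 1 as the intercept feature
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

w = np.zeros(2)
a = 0.05                   # learning rate (assumed value)
rng = random.Random(0)     # seeded so the run is repeatable

for _ in range(5000):
    i = rng.randrange(len(X))      # pick ONE example at random
    h = X[i].dot(w)                # prediction for that example only
    w += a * (y[i] - h) * X[i]     # immediate update: w_j += a*(y_i - h)*x_ij

print(w)   # noisy trajectory, but it lands near [1, 2]
```

The per-example updates are noisy, which is why SGD is usually run with a small (often decaying) learning rate; the noise also helps it escape shallow local structure on non-convex surfaces like the Rosenbrock function.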

3. References

http://cs229.stanford.edu/materials.html — Stanford machine learning lecture notes, Lecture 1.

https://blog.slinuxer.com/2016/09/sgd-comparison — "SGD算法比较" (a comparison of SGD variants).

Original link: http://kuaibao.qq.com/s/20180312A0DUPF00?refer=cp_1026 (republished via the Tencent Cloud Developer Community / 企鹅号).
