《统计学习方法》 (Statistical Learning Methods): A Python Implementation of the Perceptron

This series implements the algorithms in Li Hang's 《统计学习方法》 (Statistical Learning Methods) in Python, to deepen understanding of them. This post implements the first algorithm in the series: the perceptron!

The perceptron was proposed by F. Rosenblatt in the 1950s in his classic paper "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain". His original intent was to answer the following two questions about the human brain:

* In what form is information stored, or remembered?

* How does information contained in storage, or in memory, influence recognition and behavior?

At the time, cognitive scientists and neuroscientists had two competing explanations for how information is organized in the brain. One held that the brain encodes things one-to-one and stores each code somewhere: when a person sees a cat, the brain turns the cat into an image and assigns that image a neural code. The other held that the brain does not encode things at all; rather, it is an intricate network of switches, much like logic gates. When the brain perceives something, some switches open and connections form between certain neurons, while other switches close and neurons disconnect, and that set of states represents the thing. The latter explanation already contains the seed of connectionism, and the perceptron is built on the connectionist view. In the author's own words:

The theory to be presented here takes the empiricist, or "connectionist" position with regard to these questions. The theory has been developed for a hypothetical nervous system, or machine, called a perceptron. The perceptron is designed to illustrate some of the fundamental properties of intelligent systems in general.

That is the background of the perceptron algorithm. A diagram in the perceptron lecture notes from Columbia University's CS department sketches the overall framework; interested readers can consult the original via the link at the end of this post.

《统计学习方法》 describes this algorithm in detail and gives a rather elegant proof of its convergence. The code below implements the algorithms from the book's perceptron chapter.
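In the book's notation, the model and the update rule implemented below are:

```latex
% Perceptron model
f(x) = \operatorname{sign}(w \cdot x + b)

% A training sample (x_i, y_i), with y_i \in \{+1, -1\}, is misclassified when
y_i (w \cdot x_i + b) \le 0

% SGD update on a misclassified sample, with learning rate \eta
w \leftarrow w + \eta\, y_i x_i, \qquad b \leftarrow b + \eta\, y_i
```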

-------------------------------------------------------

import numpy


class Perceptron:

    def __init__(self, training_set=[], iterations=5, learning_rate=0.5):
        self._training_set = training_set
        self._iterations = iterations
        self._learning_rate = learning_rate
        self._setup()

    def _setup(self):
        if len(self._training_set) > 0:
            # each sample is [x_1, ..., x_n, y]: features followed by the label
            self._weight = numpy.zeros(len(self._training_set[0][:-1]))
            self._bias = 0.0
            self._alpha = numpy.zeros(len(self._training_set))
        else:
            self._weight = None
            self._bias = None
            self._alpha = None

    def raw_training(self):
        self._setup()
        if self._weight is None:
            return
        for _ in range(self._iterations):
            for sample in self._training_set:
                prediction = numpy.dot(self._weight, sample[:-1]) + self._bias
                # a sample is misclassified when y * (w . x + b) <= 0
                if sample[-1] * prediction <= 0:
                    self._weight = self._weight + numpy.multiply(
                        self._learning_rate * sample[-1], sample[:-1])
                    self._bias = self._bias + self._learning_rate * sample[-1]
        print("[+] Raw Perceptron - Stochastic Gradient Descent")
        print("[+] weight ", self._weight)
        print("[+] bias ", self._bias)
        print("[+] learning rate ", self._learning_rate)
        print("[+] iterations ", self._iterations)

    def calculate_gram_matrix(self):
        dim = len(self._training_set)
        self._gram = numpy.zeros((dim, dim))
        shape = numpy.shape(self._gram)
        for row in range(shape[0]):
            for col in range(shape[1]):
                self._gram[row][col] = numpy.dot(self._training_set[row][:-1],
                                                 self._training_set[col][:-1])
        print("[+] Gram matrix for duality training:")
        print(self._gram)

    def duality_training(self):
        self._setup()
        if self._weight is None:
            return
        # calculate the Gram matrix once; updates only look entries up
        self.calculate_gram_matrix()
        dim = len(self._training_set)
        for _ in range(self._iterations):
            for sample_idx, sample in enumerate(self._training_set):
                prediction = 0.0
                for idx in range(dim):
                    prediction += (self._alpha[idx] * self._training_set[idx][-1]
                                   * self._gram[sample_idx][idx])
                prediction = sample[-1] * (prediction + self._bias)
                if prediction <= 0:
                    self._alpha[sample_idx] = self._alpha[sample_idx] + self._learning_rate
                    self._bias = self._bias + self._learning_rate * sample[-1]
        # recover the weight vector: w = sum_i alpha_i * y_i * x_i
        self._weight = numpy.zeros(len(self._training_set[0][:-1]))
        for idx in range(dim):
            self._weight = numpy.add(
                numpy.dot(self._alpha[idx] * self._training_set[idx][-1],
                          self._training_set[idx][:-1]),
                self._weight)
        print("[+] Duality Perceptron - Stochastic Gradient Descent")
        print("[+] weight ", self._weight)
        print("[+] bias ", self._bias)
        print("[+] learning rate ", self._learning_rate)
        print("[+] iterations ", self._iterations)


if __name__ == '__main__':
    training_set = [[3, 3, 1], [4, 3, 1], [1, 1, -1]]
    perceptron = Perceptron(training_set=training_set, iterations=5, learning_rate=1.0)
    perceptron.raw_training()
    perceptron.duality_training()

-----------------------------------------------------

In the code above:

* raw_training corresponds to the primal form of the perceptron in the book: whenever a misclassified training sample is encountered, the weight _weight and the bias _bias are updated via stochastic gradient descent (SGD).

* duality_training corresponds to the dual form: the Gram matrix of the training samples is computed first (by calculate_gram_matrix), and the updates are then made by looking entries up in _gram.
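As a minimal sketch of one primal-form update, using the book's first sample [3, 3, 1] and learning rate 1.0 (the same values as in the script above):

```python
import numpy

w = numpy.zeros(2)                     # weight vector, initialized to zero
b = 0.0                                # bias
eta = 1.0                              # learning rate
x, y = numpy.array([3, 3]), 1          # features and label of the sample

# y * (w . x + b) = 0 here, so the sample counts as misclassified
if y * (numpy.dot(w, x) + b) <= 0:
    w = w + eta * y * x                # w <- w + eta * y * x
    b = b + eta * y                    # b <- b + eta * y

print(w, b)  # [3. 3.] 1.0
```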

The implementation relies mainly on numpy.dot for vector inner products; interested readers may want to dig further into numpy and numpy.linalg.
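For example, numpy.dot applied to two plain Python lists computes exactly the inner products that fill the Gram matrix (the values here are the feature parts of the first and third training samples above):

```python
import numpy

x1 = [3, 3]   # feature part of the first training sample
x2 = [1, 1]   # feature part of the third training sample

# inner product <x1, x2> = 3*1 + 3*1 = 6, one entry of the Gram matrix
print(numpy.dot(x1, x2))  # 6
```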

Reference:

* http://www.cs.columbia.edu/~mcollins/courses/6998-2012/notes/perc.converge.pdf

Original article: https://kuaibao.qq.com/s/20181009G1QCN700?refer=cp_1026