This series of posts implements the algorithms from Li Hang's Statistical Learning Methods (《统计学习方法》) in Python, to deepen my understanding of them. This post implements the first algorithm in the series: the perceptron!
The perceptron was proposed by F. Rosenblatt in the 1950s in his classic paper "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain". His original aim was to answer the following two questions about the human brain:
* In what form is information stored, or remembered?
* How does information contained in storage, or in memory, influence recognition and behavior?
At the time, cognitive scientists and neuroscientists had two competing explanations for how information is organized in the brain. One held that the brain encodes each thing it perceives one-to-one and stores that code somewhere: when a person sees a cat, the brain turns the cat into an image and then neurally encodes that image. The other held that the brain does not encode things at all; rather, the brain is an intricate network of switches, much like logic gates. When the brain perceives something, certain switches open and connections form between some neurons, while other switches close and other neurons disconnect, and it is this set of states that represents the thing. The latter explanation is already an early form of connectionism, and the perceptron is likewise grounded in connectionist thinking. In the author's own words:
The theory to be presented here takes the empiricist, or "connectionist" position with regard to these questions. The theory has been developed for a hypothetical nervous system, or machine, called a perceptron. The perceptron is designed to illustrate some of the fundamental properties of intelligent systems in general.
That is the background of the perceptron algorithm. The figure below comes from a Columbia University CS lecture handout on the perceptron and sketches the algorithm's overall framework; interested readers can view the original via the link at the end of this post.
Statistical Learning Methods describes this algorithm in detail and gives a rather elegant proof of its convergence. Below is an implementation of the algorithms in the book's perceptron chapter.
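For reference, here are the update rules the code below implements, as given in the book. The model is f(x) = sign(w·x + b). In the primal form, whenever a training sample (xᵢ, yᵢ) is misclassified, i.e. yᵢ(w·xᵢ + b) ≤ 0, stochastic gradient descent updates

    w ← w + η·yᵢ·xᵢ
    b ← b + η·yᵢ

where η is the learning rate. In the dual form, the weight is carried implicitly as w = Σᵢ αᵢ·yᵢ·xᵢ; a misclassification of sample i updates αᵢ ← αᵢ + η and b ← b + η·yᵢ, so training only ever needs the inner products xᵢ·xⱼ, which is why the Gram matrix is precomputed.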
-------------------------------------------------------
import numpy


class Perceptron:
    def __init__(self, training_set=None, iterations=5, learning_rate=0.5):
        # Each training sample is [x_1, ..., x_n, y], with label y in {+1, -1}.
        self._training_set = training_set if training_set is not None else []
        self._iterations = iterations
        self._learning_rate = learning_rate
        self._setup()

    def _setup(self):
        if len(self._training_set) > 0:
            self._weight = numpy.zeros(len(self._training_set[0][:-1]))
            self._bias = 0.0
            self._alpha = numpy.zeros(len(self._training_set))
        else:
            self._weight = None
            self._bias = None
            self._alpha = None
    def raw_training(self):
        self._setup()
        if self._weight is None:
            return
        for _ in range(self._iterations):
            for sample in self._training_set:
                prediction = numpy.dot(self._weight, sample[:-1]) + self._bias
                # A sample is misclassified when y * (w . x + b) <= 0.
                if sample[-1] * prediction <= 0:
                    self._weight = self._weight + numpy.multiply(self._learning_rate * sample[-1], sample[:-1])
                    self._bias = self._bias + self._learning_rate * sample[-1]
        print("[+] Raw Perceptron - Stochastic Gradient Descent")
        print("[+] weight ", self._weight)
        print("[+] bias ", self._bias)
        print("[+] learning rate ", self._learning_rate)
        print("[+] iterations ", self._iterations)
    def calculate_gram_matrix(self):
        dim = len(self._training_set)
        self._gram = numpy.zeros((dim, dim))
        shape = numpy.shape(self._gram)
        for row in range(shape[0]):
            for col in range(shape[1]):
                self._gram[row][col] = numpy.dot(self._training_set[row][:-1], self._training_set[col][:-1])
        print("[+] Gram matrix for dual training:")
        print(self._gram)
    def duality_training(self):
        self._setup()
        if self._weight is None:
            return
        # Precompute the Gram matrix of the training inputs.
        self.calculate_gram_matrix()
        dim = len(self._training_set)
        for _ in range(self._iterations):
            sample_idx = 0
            for sample in self._training_set:
                prediction = 0.0
                for idx in range(dim):
                    prediction += self._alpha[idx] * self._training_set[idx][-1] * self._gram[sample_idx][idx]
                prediction = sample[-1] * (prediction + self._bias)
                # Misclassified: bump this sample's alpha and update the bias.
                if prediction <= 0:
                    self._alpha[sample_idx] = self._alpha[sample_idx] + self._learning_rate
                    self._bias = self._bias + self._learning_rate * sample[-1]
                sample_idx += 1
        # Recover the weight vector: w = sum_i alpha_i * y_i * x_i.
        self._weight = 0.0
        for idx in range(dim):
            self._weight = numpy.add(numpy.multiply(self._alpha[idx] * self._training_set[idx][-1], self._training_set[idx][:-1]), self._weight)
        print("[+] Duality Perceptron - Stochastic Gradient Descent")
        print("[+] weight ", self._weight)
        print("[+] bias ", self._bias)
        print("[+] learning rate ", self._learning_rate)
        print("[+] iterations ", self._iterations)
if __name__ == '__main__':
    training_set = [[3, 3, 1], [4, 3, 1], [1, 1, -1]]
    perceptron = Perceptron(training_set=training_set, iterations=5, learning_rate=1.0)
    perceptron.raw_training()
    perceptron.duality_training()
-----------------------------------------------------
Here,
* the raw_training method corresponds to the primal form of the perceptron in the book: whenever a training sample is misclassified, it updates the weight _weight and the bias _bias via stochastic gradient descent (SGD);
* duality_training corresponds to the dual form of the perceptron in the book: it first computes the Gram matrix of the training samples (done by calculate_gram_matrix), then performs the corresponding updates by looking up _gram.
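Running the script with the training set above, which is Example 2.1 from the book (with learning rate η = 1), both forms converge within the five iterations to w = (1, 1) and b = -3, i.e. the separating hyperplane x₁ + x₂ - 3 = 0; the dual form ends with α = (2, 0, 5), matching the book's Example 2.2.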
The implementation mainly relies on numpy's vector dot product, numpy.dot; interested readers can dig further into numpy and numpy.linalg.
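As a quick illustration (a minimal standalone sketch, not part of the implementation above), numpy.dot computes exactly the inner products that fill the Gram matrix:

-------------------------------------------------------
import numpy

x1 = numpy.array([3, 3])  # input part of the first training sample
x3 = numpy.array([1, 1])  # input part of the third training sample

# Inner products of the kind that fill the Gram matrix.
print(numpy.dot(x1, x3))      # 6
print(numpy.dot(x1, x1))      # 18

# numpy.linalg adds norms, inverses, etc. on top of this.
print(numpy.linalg.norm(x1))  # 4.242640687119285
-------------------------------------------------------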
Reference:
* http://www.cs.columbia.edu/~mcollins/courses/6998-2012/notes/perc.converge.pdf