如何从一大堆数据里求出线性回归方程呢?假定输入数据存放在矩阵X中,而回归系数存放在向量w中。 那么对于给定的X,预测结果将会通过
给出。
现在的问题是,现有一些X和对于的y,怎样才能找到合适的w呢?一个常用的方式就是找出使误差最小的w。这里的误差是指预测值和真实y值的差。直接使用该差值累加求和可能使得正差值和负差值相互抵消,所以我们采用平方误差。平方误差写做:
该矩阵还可以表示为:
如果对w 求导,得到
令其等于0,解出如下:
注意,上述公式需要对矩阵求逆,因此只在逆矩阵存在时适用。这种方法就是普通的最小二乘法。
示例代码如下,可适用于多维数据。
import numpy as np
def loadDataSet(filename):
'''加载数据集'''
#numFeat = len(open(filename).readline().split("\t")) - 1 #特征数
dataMat =[]; labelMat =[]
with open(filename) as fr:
for line in fr.readlines():
a = line.strip().split('\t')
dataMat.append(list(map(float,a[:-1])))
labelMat.append(float(a[-1]))
return dataMat,labelMat
def standRegres(xArr,yArr):
'''计算回归系数'''
X = np.mat(xArr) # n*2
Y = np.mat(yArr).T # n*1
Xt= X.T #2*n
XtX = Xt * X #2*2
if np.linalg.det(XtX) == 0.0:#行列式为0,矩阵奇异
print("This matrix is singular, can not do invers")
Ws = XtX.I * Xt * Y # 2*2 *2* n * n*1 ,return 2 *1 #回归系数矩阵
return Ws
xArr,yArr = loadDataSet("ex0.txt")
ws = standRegres(xArr,yArr)
print (ws)
X=np.mat(xArr)
Y=np.mat(yArr)
Yhat = X* ws # 预测值
#print (Yhat)
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
#ax.scatter(X[:,1].flatten().A[0],Y.flatten().A[0])
ax.scatter(np.array(xArr)[:,1], yArr, color = "blue", s=20, label="raw data")
Xcopy = X.copy()
Xcopy.sort(0)
Yhat = Xcopy *ws
ax.plot(Xcopy[:,1],Yhat, color ="r", lw=2 ,
label ="最佳拟合直线 Y=%f X + %f" %(ws[1,0],ws[0,0]))
plt.title("最小二乘法 线性回归 \n(OLS Regres)",fontsize =16)
plt.xlabel("X")
plt.ylabel("Y")
plt.legend(loc="lower right")
plt.show()
print (np.corrcoef((X* ws).T,Y))#求预测值和真实值的相关系数
数据集如下。其中第一列全是1,作为添加到x中的常数项,和w0相乘表示预测y值中的常数项。最后一列为y,中间列为原始x值。
1.000000 0.067732 3.176513
1.000000 0.427810 3.816464
1.000000 0.995731 4.550095
1.000000 0.738336 4.256571
1.000000 0.981083 4.560815
1.000000 0.526171 3.929515
1.000000 0.378887 3.526170
1.000000 0.033859 3.156393
1.000000 0.132791 3.110301
1.000000 0.138306 3.149813
1.000000 0.247809 3.476346
1.000000 0.648270 4.119688
1.000000 0.731209 4.282233
1.000000 0.236833 3.486582
1.000000 0.969788 4.655492
1.000000 0.607492 3.965162
1.000000 0.358622 3.514900
1.000000 0.147846 3.125947
1.000000 0.637820 4.094115
1.000000 0.230372 3.476039
1.000000 0.070237 3.210610
1.000000 0.067154 3.190612
1.000000 0.925577 4.631504
1.000000 0.717733 4.295890
1.000000 0.015371 3.085028
1.000000 0.335070 3.448080
1.000000 0.040486 3.167440
1.000000 0.212575 3.364266
1.000000 0.617218 3.993482
1.000000 0.541196 3.891471
1.000000 0.045353 3.143259
1.000000 0.126762 3.114204
1.000000 0.556486 3.851484
1.000000 0.901144 4.621899
1.000000 0.958476 4.580768
1.000000 0.274561 3.620992
1.000000 0.394396 3.580501
1.000000 0.872480 4.618706
1.000000 0.409932 3.676867
1.000000 0.908969 4.641845
1.000000 0.166819 3.175939
1.000000 0.665016 4.264980
1.000000 0.263727 3.558448
1.000000 0.231214 3.436632
1.000000 0.552928 3.831052
1.000000 0.047744 3.182853
1.000000 0.365746 3.498906
1.000000 0.495002 3.946833
1.000000 0.493466 3.900583
1.000000 0.792101 4.238522
1.000000 0.769660 4.233080
1.000000 0.251821 3.521557
1.000000 0.181951 3.203344
1.000000 0.808177 4.278105
1.000000 0.334116 3.555705
1.000000 0.338630 3.502661
1.000000 0.452584 3.859776
1.000000 0.694770 4.275956
1.000000 0.590902 3.916191
1.000000 0.307928 3.587961
1.000000 0.148364 3.183004
1.000000 0.702180 4.225236
1.000000 0.721544 4.231083
1.000000 0.666886 4.240544
1.000000 0.124931 3.222372
1.000000 0.618286 4.021445
1.000000 0.381086 3.567479
1.000000 0.385643 3.562580
1.000000 0.777175 4.262059
1.000000 0.116089 3.208813
1.000000 0.115487 3.169825
1.000000 0.663510 4.193949
1.000000 0.254884 3.491678
1.000000 0.993888 4.533306
1.000000 0.295434 3.550108
1.000000 0.952523 4.636427
1.000000 0.307047 3.557078
1.000000 0.277261 3.552874
1.000000 0.279101 3.494159
1.000000 0.175724 3.206828
1.000000 0.156383 3.195266
1.000000 0.733165 4.221292
1.000000 0.848142 4.413372
1.000000 0.771184 4.184347
1.000000 0.429492 3.742878
1.000000 0.162176 3.201878
1.000000 0.917064 4.648964
1.000000 0.315044 3.510117
1.000000 0.201473 3.274434
1.000000 0.297038 3.579622
1.000000 0.336647 3.489244
1.000000 0.666109 4.237386
1.000000 0.583888 3.913749
1.000000 0.085031 3.228990
1.000000 0.687006 4.286286
1.000000 0.949655 4.628614
1.000000 0.189912 3.239536
1.000000 0.844027 4.457997
1.000000 0.333288 3.513384
1.000000 0.427035 3.729674
1.000000 0.466369 3.834274
1.000000 0.550659 3.811155
1.000000 0.278213 3.598316
1.000000 0.918769 4.692514
1.000000 0.886555 4.604859
1.000000 0.569488 3.864912
1.000000 0.066379 3.184236
1.000000 0.335751 3.500796
1.000000 0.426863 3.743365
1.000000 0.395746 3.622905
1.000000 0.694221 4.310796
1.000000 0.272760 3.583357
1.000000 0.503495 3.901852
1.000000 0.067119 3.233521
1.000000 0.038326 3.105266
1.000000 0.599122 3.865544
1.000000 0.947054 4.628625
1.000000 0.671279 4.231213
1.000000 0.434811 3.791149
1.000000 0.509381 3.968271
1.000000 0.749442 4.253910
1.000000 0.058014 3.194710
1.000000 0.482978 3.996503
1.000000 0.466776 3.904358
1.000000 0.357767 3.503976
1.000000 0.949123 4.557545
1.000000 0.417320 3.699876
1.000000 0.920461 4.613614
1.000000 0.156433 3.140401
1.000000 0.656662 4.206717
1.000000 0.616418 3.969524
1.000000 0.853428 4.476096
1.000000 0.133295 3.136528
1.000000 0.693007 4.279071
1.000000 0.178449 3.200603
1.000000 0.199526 3.299012
1.000000 0.073224 3.209873
1.000000 0.286515 3.632942
1.000000 0.182026 3.248361
1.000000 0.621523 3.995783
1.000000 0.344584 3.563262
1.000000 0.398556 3.649712
1.000000 0.480369 3.951845
1.000000 0.153350 3.145031
1.000000 0.171846 3.181577
1.000000 0.867082 4.637087
1.000000 0.223855 3.404964
1.000000 0.528301 3.873188
1.000000 0.890192 4.633648
1.000000 0.106352 3.154768
1.000000 0.917886 4.623637
1.000000 0.014855 3.078132
1.000000 0.567682 3.913596
1.000000 0.068854 3.221817
1.000000 0.603535 3.938071
1.000000 0.532050 3.880822
1.000000 0.651362 4.176436
1.000000 0.901225 4.648161
1.000000 0.204337 3.332312
1.000000 0.696081 4.240614
1.000000 0.963924 4.532224
1.000000 0.981390 4.557105
1.000000 0.987911 4.610072
1.000000 0.990947 4.636569
1.000000 0.736021 4.229813
1.000000 0.253574 3.500860
1.000000 0.674722 4.245514
1.000000 0.939368 4.605182
1.000000 0.235419 3.454340
1.000000 0.110521 3.180775
1.000000 0.218023 3.380820
1.000000 0.869778 4.565020
1.000000 0.196830 3.279973
1.000000 0.958178 4.554241
1.000000 0.972673 4.633520
1.000000 0.745797 4.281037
1.000000 0.445674 3.844426
1.000000 0.470557 3.891601
1.000000 0.549236 3.849728
1.000000 0.335691 3.492215
1.000000 0.884739 4.592374
1.000000 0.918916 4.632025
1.000000 0.441815 3.756750
1.000000 0.116598 3.133555
1.000000 0.359274 3.567919
1.000000 0.814811 4.363382
1.000000 0.387125 3.560165
1.000000 0.982243 4.564305
1.000000 0.780880 4.215055
1.000000 0.652565 4.174999
1.000000 0.870030 4.586640
1.000000 0.604755 3.960008
1.000000 0.255212 3.529963
1.000000 0.730546 4.213412
1.000000 0.493829 3.908685
1.000000 0.257017 3.585821
1.000000 0.833735 4.374394
1.000000 0.070095 3.213817
1.000000 0.527070 3.952681
1.000000 0.116163 3.129283
本文分享自 Python可视化编程机器学习OpenCV 微信公众号,前往查看
如有侵权,请联系 cloudcommunity@tencent.com 删除。
本文参与 腾讯云自媒体同步曝光计划 ,欢迎热爱写作的你一起参与!