
Least Squares Linear Regression

Author: 用户6021899
Published: 2019-08-14

How do we derive a linear-regression equation from a pile of data? Suppose the input data are stored in a matrix $X$ and the regression coefficients in a vector $w$. Then for a given $X$, the predictions are given by

$$\hat{y} = Xw$$

The question now is: given some $X$ and the corresponding $y$, how do we find a suitable $w$? A common approach is to find the $w$ that minimizes the error, where the error is the difference between the predicted value and the true $y$. Simply summing these differences would let positive and negative errors cancel out, so we use the squared error instead:

$$\sum_{i=1}^{m}\left(y_i - x_i^{T}w\right)^2$$

In matrix form this can be written as:

$$(y - Xw)^{T}(y - Xw)$$

Taking the derivative with respect to $w$ gives

$$-2X^{T}(y - Xw)$$

Setting this equal to zero and solving for $w$ yields:

$$\hat{w} = \left(X^{T}X\right)^{-1}X^{T}y$$

Note that this formula requires a matrix inverse, so it applies only when the inverse of $X^{T}X$ exists. This method is ordinary least squares (OLS).
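To make the closed-form solution concrete, here is a minimal self-contained sketch (the toy data and variable names are illustrative, not from the original article) that computes $\hat{w} = (X^{T}X)^{-1}X^{T}y$ directly and checks it against NumPy's built-in least-squares solver:

import numpy as np

# Toy design matrix: a column of 1s for the intercept, then the x values.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])  # exactly y = 1 + 2x

# Closed-form OLS solution: w = (X^T X)^{-1} X^T y
w_closed = np.linalg.inv(X.T @ X) @ (X.T @ y)

# NumPy's least-squares solver should agree.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w_closed)                        # [1. 2.]
print(np.allclose(w_closed, w_lstsq))  # True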

Example code follows; it also works for multi-dimensional data.

import numpy as np

def loadDataSet(filename):
    '''Load the dataset: all columns but the last are features, the last column is y.'''
    dataMat = []; labelMat = []
    with open(filename) as fr:
        for line in fr.readlines():
            a = line.strip().split('\t')
            dataMat.append(list(map(float, a[:-1])))
            labelMat.append(float(a[-1]))
    return dataMat, labelMat

def standRegres(xArr, yArr):
    '''Compute the OLS regression coefficients.'''
    X = np.mat(xArr)    # n*2
    Y = np.mat(yArr).T  # n*1
    Xt = X.T            # 2*n
    XtX = Xt * X        # 2*2
    if np.linalg.det(XtX) == 0.0:  # zero determinant: the matrix is singular
        print("This matrix is singular, cannot do inverse")
        return None
    Ws = XtX.I * Xt * Y  # (2*2) * (2*n) * (n*1) -> 2*1 coefficient matrix
    return Ws
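When $X^{T}X$ is singular (for example, with fewer samples than features, or perfectly collinear columns), standRegres above simply bails out. As a sketch of a more robust alternative (not part of the original article), np.linalg.lstsq solves the least-squares problem even for rank-deficient $X$:

def standRegresLstsq(xArr, yArr):
    '''OLS via np.linalg.lstsq; works even when X^T X is singular.'''
    X = np.asarray(xArr)
    y = np.asarray(yArr)
    ws, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares solution
    return ws

The driver script below loads the data, fits the model, and plots the result.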

xArr, yArr = loadDataSet("ex0.txt")
ws = standRegres(xArr, yArr)
print(ws)
X = np.mat(xArr)
Y = np.mat(yArr)
Yhat = X * ws  # predicted values
#print(Yhat)
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(np.array(xArr)[:, 1], yArr, color="blue", s=20, label="raw data")

Xcopy = X.copy()
Xcopy.sort(0)  # sort by x so the fitted line is drawn left to right
Yhat = Xcopy * ws
ax.plot(Xcopy[:, 1], Yhat, color="r", lw=2,
        label="best-fit line Y = %f X + %f" % (ws[1, 0], ws[0, 0]))
plt.title("Least Squares Linear Regression\n(OLS)", fontsize=16)
plt.xlabel("X")
plt.ylabel("Y")
plt.legend(loc="lower right")
plt.show()
print(np.corrcoef((X * ws).T, Y))  # correlation between predictions and true y values
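A note on the last line: np.corrcoef returns a 2×2 correlation matrix whose diagonal entries are each series' correlation with itself (always 1); the off-diagonal entry is the correlation between the predictions and the true y values, which should be close to 1 for a good linear fit. An illustrative fragment (reusing the variables from the script above):

corr = np.corrcoef((X * ws).T, Y)
print(corr[0, 1])  # correlation between Yhat and Y; closer to 1 means a better fit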

The dataset is listed at the end of this post. Its first column is all 1s: a constant term appended to x which, multiplied by w0, represents the intercept of the predicted y. The last column is y, and the middle column is the raw x value.
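If your raw file contains only the x values, this constant column can be added in code before fitting. A minimal sketch (the variable names are illustrative; the sample values are the first rows of the dataset below):

import numpy as np
x_raw = np.array([0.067732, 0.427810, 0.995731])   # raw x values
X = np.column_stack([np.ones_like(x_raw), x_raw])  # prepend a column of 1s for the intercept
print(X)

The full dataset: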

1.000000 0.067732 3.176513
1.000000 0.427810 3.816464
1.000000 0.995731 4.550095
1.000000 0.738336 4.256571
1.000000 0.981083 4.560815
1.000000 0.526171 3.929515
1.000000 0.378887 3.526170
1.000000 0.033859 3.156393
1.000000 0.132791 3.110301
1.000000 0.138306 3.149813
1.000000 0.247809 3.476346
1.000000 0.648270 4.119688
1.000000 0.731209 4.282233
1.000000 0.236833 3.486582
1.000000 0.969788 4.655492
1.000000 0.607492 3.965162
1.000000 0.358622 3.514900
1.000000 0.147846 3.125947
1.000000 0.637820 4.094115
1.000000 0.230372 3.476039
1.000000 0.070237 3.210610
1.000000 0.067154 3.190612
1.000000 0.925577 4.631504
1.000000 0.717733 4.295890
1.000000 0.015371 3.085028
1.000000 0.335070 3.448080
1.000000 0.040486 3.167440
1.000000 0.212575 3.364266
1.000000 0.617218 3.993482
1.000000 0.541196 3.891471
1.000000 0.045353 3.143259
1.000000 0.126762 3.114204
1.000000 0.556486 3.851484
1.000000 0.901144 4.621899
1.000000 0.958476 4.580768
1.000000 0.274561 3.620992
1.000000 0.394396 3.580501
1.000000 0.872480 4.618706
1.000000 0.409932 3.676867
1.000000 0.908969 4.641845
1.000000 0.166819 3.175939
1.000000 0.665016 4.264980
1.000000 0.263727 3.558448
1.000000 0.231214 3.436632
1.000000 0.552928 3.831052
1.000000 0.047744 3.182853
1.000000 0.365746 3.498906
1.000000 0.495002 3.946833
1.000000 0.493466 3.900583
1.000000 0.792101 4.238522
1.000000 0.769660 4.233080
1.000000 0.251821 3.521557
1.000000 0.181951 3.203344
1.000000 0.808177 4.278105
1.000000 0.334116 3.555705
1.000000 0.338630 3.502661
1.000000 0.452584 3.859776
1.000000 0.694770 4.275956
1.000000 0.590902 3.916191
1.000000 0.307928 3.587961
1.000000 0.148364 3.183004
1.000000 0.702180 4.225236
1.000000 0.721544 4.231083
1.000000 0.666886 4.240544
1.000000 0.124931 3.222372
1.000000 0.618286 4.021445
1.000000 0.381086 3.567479
1.000000 0.385643 3.562580
1.000000 0.777175 4.262059
1.000000 0.116089 3.208813
1.000000 0.115487 3.169825
1.000000 0.663510 4.193949
1.000000 0.254884 3.491678
1.000000 0.993888 4.533306
1.000000 0.295434 3.550108
1.000000 0.952523 4.636427
1.000000 0.307047 3.557078
1.000000 0.277261 3.552874
1.000000 0.279101 3.494159
1.000000 0.175724 3.206828
1.000000 0.156383 3.195266
1.000000 0.733165 4.221292
1.000000 0.848142 4.413372
1.000000 0.771184 4.184347
1.000000 0.429492 3.742878
1.000000 0.162176 3.201878
1.000000 0.917064 4.648964
1.000000 0.315044 3.510117
1.000000 0.201473 3.274434
1.000000 0.297038 3.579622
1.000000 0.336647 3.489244
1.000000 0.666109 4.237386
1.000000 0.583888 3.913749
1.000000 0.085031 3.228990
1.000000 0.687006 4.286286
1.000000 0.949655 4.628614
1.000000 0.189912 3.239536
1.000000 0.844027 4.457997
1.000000 0.333288 3.513384
1.000000 0.427035 3.729674
1.000000 0.466369 3.834274
1.000000 0.550659 3.811155
1.000000 0.278213 3.598316
1.000000 0.918769 4.692514
1.000000 0.886555 4.604859
1.000000 0.569488 3.864912
1.000000 0.066379 3.184236
1.000000 0.335751 3.500796
1.000000 0.426863 3.743365
1.000000 0.395746 3.622905
1.000000 0.694221 4.310796
1.000000 0.272760 3.583357
1.000000 0.503495 3.901852
1.000000 0.067119 3.233521
1.000000 0.038326 3.105266
1.000000 0.599122 3.865544
1.000000 0.947054 4.628625
1.000000 0.671279 4.231213
1.000000 0.434811 3.791149
1.000000 0.509381 3.968271
1.000000 0.749442 4.253910
1.000000 0.058014 3.194710
1.000000 0.482978 3.996503
1.000000 0.466776 3.904358
1.000000 0.357767 3.503976
1.000000 0.949123 4.557545
1.000000 0.417320 3.699876
1.000000 0.920461 4.613614
1.000000 0.156433 3.140401
1.000000 0.656662 4.206717
1.000000 0.616418 3.969524
1.000000 0.853428 4.476096
1.000000 0.133295 3.136528
1.000000 0.693007 4.279071
1.000000 0.178449 3.200603
1.000000 0.199526 3.299012
1.000000 0.073224 3.209873
1.000000 0.286515 3.632942
1.000000 0.182026 3.248361
1.000000 0.621523 3.995783
1.000000 0.344584 3.563262
1.000000 0.398556 3.649712
1.000000 0.480369 3.951845
1.000000 0.153350 3.145031
1.000000 0.171846 3.181577
1.000000 0.867082 4.637087
1.000000 0.223855 3.404964
1.000000 0.528301 3.873188
1.000000 0.890192 4.633648
1.000000 0.106352 3.154768
1.000000 0.917886 4.623637
1.000000 0.014855 3.078132
1.000000 0.567682 3.913596
1.000000 0.068854 3.221817
1.000000 0.603535 3.938071
1.000000 0.532050 3.880822
1.000000 0.651362 4.176436
1.000000 0.901225 4.648161
1.000000 0.204337 3.332312
1.000000 0.696081 4.240614
1.000000 0.963924 4.532224
1.000000 0.981390 4.557105
1.000000 0.987911 4.610072
1.000000 0.990947 4.636569
1.000000 0.736021 4.229813
1.000000 0.253574 3.500860
1.000000 0.674722 4.245514
1.000000 0.939368 4.605182
1.000000 0.235419 3.454340
1.000000 0.110521 3.180775
1.000000 0.218023 3.380820
1.000000 0.869778 4.565020
1.000000 0.196830 3.279973
1.000000 0.958178 4.554241
1.000000 0.972673 4.633520
1.000000 0.745797 4.281037
1.000000 0.445674 3.844426
1.000000 0.470557 3.891601
1.000000 0.549236 3.849728
1.000000 0.335691 3.492215
1.000000 0.884739 4.592374
1.000000 0.918916 4.632025
1.000000 0.441815 3.756750
1.000000 0.116598 3.133555
1.000000 0.359274 3.567919
1.000000 0.814811 4.363382
1.000000 0.387125 3.560165
1.000000 0.982243 4.564305
1.000000 0.780880 4.215055
1.000000 0.652565 4.174999
1.000000 0.870030 4.586640
1.000000 0.604755 3.960008
1.000000 0.255212 3.529963
1.000000 0.730546 4.213412
1.000000 0.493829 3.908685
1.000000 0.257017 3.585821
1.000000 0.833735 4.374394
1.000000 0.070095 3.213817
1.000000 0.527070 3.952681
1.000000 0.116163 3.129283