# Scikit-Learn Made Easy Series: Linear Regression and Related ML Evaluation Metrics

1. Evaluation Metrics

1) Mean Squared Error (MSE):
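In its standard form, with symbols chosen here for illustration ($m$ samples, true values $y_i$, predictions $\hat{y}_i$), MSE averages the squared residuals:

$$
\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2
$$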

2) Root Mean Squared Error (RMSE):
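RMSE is the square root of MSE, which brings the error back to the same units as the target. Using illustrative symbols ($m$ samples, true values $y_i$, predictions $\hat{y}_i$):

$$
\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}
$$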

3) Mean Absolute Error (MAE):
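MAE averages the absolute residuals, so large errors are penalized less heavily than under MSE. Using illustrative symbols ($m$ samples, true values $y_i$, predictions $\hat{y}_i$):

$$
\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right|
$$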

4) R Squared ($R^2$, the coefficient of determination):
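In its standard form, $R^2$ compares the model's squared error with that of a baseline that always predicts the mean $\bar{y}$. The symbols below ($m$ samples, true values $y_i$, predictions $\hat{y}_i$) are chosen for illustration:

$$
R^2 = 1 - \frac{\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{m}\left(y_i - \bar{y}\right)^2} = 1 - \frac{\mathrm{MSE}}{\mathrm{Var}(y)}
$$

As a quick sanity check, here is a minimal sketch that computes all four metrics by hand with NumPy and compares them with `sklearn.metrics`. The toy arrays are invented for illustration and are not data from this article.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy values, invented purely for illustration
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# Hand-written versions of the four metrics
mse = np.mean((y_true - y_pred) ** 2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(y_true - y_pred))
r2 = 1 - mse / np.var(y_true)  # np.var uses the population variance (ddof=0)

# Cross-check against scikit-learn's implementations
assert np.isclose(mse, mean_squared_error(y_true, y_pred))
assert np.isclose(mae, mean_absolute_error(y_true, y_pred))
assert np.isclose(r2, r2_score(y_true, y_pred))

print("MSE: %f, RMSE: %f, MAE: %f, R2: %f" % (mse, rmse, mae, r2))
```

A perfect model gives $R^2 = 1$, the mean baseline gives $R^2 = 0$, and a model worse than the baseline gives a negative value.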

2. Linear Regression

1) Warm-up: kNN Regression

Talk is cheap, so let's look at the code!
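A minimal sketch of kNN regression with default hyperparameters might look like the following. The attribute list further down suggests the article works with the Boston housing data, but `load_boston` has been removed from recent scikit-learn releases, so this sketch substitutes synthetic data from `make_regression` (an assumption, as are the variable names). The numbers it prints will not match the figures quoted in this article.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (assumption): 500 samples, 13 features
X, y = make_regression(n_samples=500, n_features=13, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Default model: n_neighbors=5, weights='uniform', p=2
knn_reg = KNeighborsRegressor()
knn_reg.fit(X_train, y_train)
y_pred = knn_reg.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("MAE: %f, MSE: %f, R2: %f" % (mae, mse, r2))

# score() on a regressor returns the same R^2 value
print(knn_reg.score(X_test, y_test))
```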

MAE: 3.651057, MSE: 25.966010, R2 Accuracy: 0.464484 0.602674505081

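• n_neighbors: the value of k; default n_neighbors=5.
• weights: how neighbors are weighted; default weights='uniform' (every neighbor counts equally), while weights='distance' weights each neighbor by the inverse of its distance.
• algorithm: the algorithm used to search for the nearest neighbors; default algorithm='auto', which picks a suitable algorithm based on the values passed to fit.
• p: the power parameter of the Minkowski distance; default p=2 (Euclidean distance), and p=1 gives Manhattan distance.
• n_jobs: the number of parallel jobs (CPU cores) used for the neighbor search; default n_jobs=None.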
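These hyperparameters can be tuned with a grid search. The sketch below is one way to do it, assuming `GridSearchCV` with 5-fold cross-validation, a small hypothetical parameter grid, and synthetic stand-in data; the grid the article actually searched is not shown, so treat the ranges as illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (assumption, not the article's dataset)
X, y = make_regression(n_samples=300, n_features=13, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical search space over k, the weighting scheme, and the Minkowski power
param_grid = {
    "n_neighbors": list(range(1, 11)),
    "weights": ["uniform", "distance"],
    "p": [1, 2],
}

grid_search = GridSearchCV(KNeighborsRegressor(), param_grid, cv=5, n_jobs=1)
grid_search.fit(X_train, y_train)

# Best combination found by cross-validation, then its R^2 on the held-out test set
print(grid_search.best_params_)
print(grid_search.best_estimator_.score(X_test, y_test))
```

Note that `best_score_` is the mean cross-validated score on the training folds, whereas `best_estimator_.score(X_test, y_test)` measures performance on data the search never saw.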

{'n_neighbors': 6, 'p': 1, 'weights': 'distance'} 0.735424490609

### 2) Linear Regression

A score of about 80% looks decent; after all, this is just the default model. Let's see which hyperparameters linear regression exposes.

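• fit_intercept: default fit_intercept=True; decides whether the model fits an intercept term.
• normalize: default normalize=False; ignored when fit_intercept=False. If True, X is normalized before regression by subtracting the mean and dividing by the L2 norm. This parameter was deprecated in scikit-learn 1.0 and removed in 1.2; in newer versions, scale the features beforehand, for example with StandardScaler.
• n_jobs: the number of CPU cores used for the computation.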
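A minimal sketch of fitting `LinearRegression` and reading back what it learned is shown below. It assumes synthetic stand-in data from `make_regression` with 13 features, so the score and coefficients it prints will differ from the values quoted in this article.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 500 samples, 13 features
X, y = make_regression(n_samples=500, n_features=13, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Ordinary least squares with an intercept term
lin_reg = LinearRegression(fit_intercept=True, n_jobs=None)
lin_reg.fit(X_train, y_train)

print(lin_reg.score(X_test, y_test))  # R^2 on the held-out test set
print(lin_reg.intercept_)             # fitted intercept
print(lin_reg.coef_)                  # one coefficient per feature
```

Each entry of `coef_` is the change in the predicted target for a one-unit change in the corresponding feature, with the other features held fixed.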

array([ -1.14235739e-01, 3.12783163e-02, -4.30926281e-02, -9.16425531e-02, -1.09940036e+01, 3.49155727e+00, -1.40778005e-02, -1.06270960e+00, 2.45307516e-01, -1.23179738e-02, -8.80618320e-01, 8.43243544e-03, -3.99667727e-01])

Attribute Information (in order)

• CRIM: per capita crime rate by town
• ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
• INDUS: proportion of non-retail business acres per town
• CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
• NOX: nitric oxides concentration (parts per 10 million)
• RM: average number of rooms per dwelling
• AGE: proportion of owner-occupied units built prior to 1940
• DIS: weighted distances to five Boston employment centres
• RAD: index of accessibility to radial highways
• TAX: full-value property-tax rate per \$10,000
• PTRATIO: pupil-teacher ratio by town
• B: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
• LSTAT: % lower status of the population
• MEDV: Median value of owner-occupied homes in \$1000's

array(['NOX', 'DIS', 'PTRATIO', 'LSTAT', 'CRIM', 'CHAS', 'INDUS', 'AGE', 'TAX', 'B', 'ZN', 'RAD', 'RM'],
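The ordering above is consistent with sorting the feature names by their fitted coefficients, from most negative to most positive. Here is a minimal sketch that reproduces it, assuming the standard Boston feature order (with RAD between DIS and TAX) and reusing the coefficient values printed earlier. Whether the original code used `np.argsort` is an assumption.

```python
import numpy as np

# Feature names in dataset order (assumed standard Boston ordering)
feature_names = np.array([
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT",
])

# Coefficients copied from the fitted model's output shown earlier
coef = np.array([
    -1.14235739e-01, 3.12783163e-02, -4.30926281e-02, -9.16425531e-02,
    -1.09940036e+01, 3.49155727e+00, -1.40778005e-02, -1.06270960e+00,
    2.45307516e-01, -1.23179738e-02, -8.80618320e-01, 8.43243544e-03,
    -3.99667727e-01,
])

# argsort returns the indices that sort coef in ascending order
ranked = feature_names[np.argsort(coef)]
print(ranked)
```

Read this way, NOX has the most negative coefficient and RM the most positive one. Keep in mind that raw coefficients depend on each feature's scale, so this ranking is only a rough guide to importance.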
