# 3. Bayesian statistics and Regularization

Content

3. Bayesian statistics and Regularization.

3.1 Underfitting and overfitting.

3.2 Bayesian statistics and regularization.

3.3 Optimizing the cost function with regularization.

3.3.1 Regularized linear regression.

3.3.2 Regularized logistic regression.

Keywords: underfitting, overfitting, regularization, Bayesian statistics

## 3.1 Underfitting and overfitting

Two common ways to address overfitting:

1. Reduce the number of features
• Manually keep only features that are likely to generalize, and remove ones that may be specific to the training set.
• Use a model selection algorithm.
2. Regularization

## 3.2 Bayesian statistics and regularization

To make a prediction on a new input, we can compute the posterior distribution of θ given the training set $S = \{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$ via Bayes' rule:

$$p(\theta \mid S) = \frac{p(S \mid \theta)\, p(\theta)}{p(S)} = \frac{\left( \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}, \theta) \right) p(\theta)}{\int_{\theta} \left( \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}, \theta) \right) p(\theta)\, d\theta}$$

Because the full posterior is usually intractable, a common approximation is the MAP (maximum a posteriori) estimate:

$$\theta_{\text{MAP}} = \arg\max_{\theta} \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}, \theta)\, p(\theta)$$

A common choice of prior is $\theta \sim \mathcal{N}(0, \tau^{2} I)$ (other assumptions are of course possible). In practice, the Bayesian MAP estimate reduces overfitting better than maximum likelihood estimation. For example, Bayesian logistic regression can be used for text classification problems in which the number of features is far larger than the number of training examples.
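As a sketch of why the MAP estimate behaves like regularization (a standard derivation, filled in here for clarity): taking the logarithm of the MAP objective under a Gaussian prior $\theta \sim \mathcal{N}(0, \tau^{2} I)$ gives

```latex
\theta_{\text{MAP}}
  = \arg\max_{\theta} \left[ \sum_{i=1}^{m} \log p(y^{(i)} \mid x^{(i)}, \theta)
    - \frac{1}{2\tau^{2}} \lVert \theta \rVert_{2}^{2} \right]
```

which is exactly maximum likelihood plus an L2 penalty with weight $\lambda = 1/\tau^{2}$: the smaller the prior variance $\tau^{2}$, the stronger the regularization.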

## 3.3 Optimizing the cost function with regularization

### 3.3.1 Regularized linear regression
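The regularized cost function for linear regression (the standard form; note that the penalty sum starts at $j = 1$):

```latex
J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^{2}
  + \lambda \sum_{j=1}^{n} \theta_j^{2} \right]
```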

(Note that the regularization term does not include $\theta_0$.)

λ must be chosen appropriately. If it is too large (e.g., $10^{10}$), all of the θ values are driven toward 0, so none of the features are actually learned and the model underfits. How to choose λ is discussed later; for now, assume a value between 0 and 10.
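For reference, the corresponding gradient descent update rule (standard form; $\alpha$ is the learning rate, and $\theta_0$ is updated without the penalty term):

```latex
\begin{aligned}
\theta_0 &:= \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)} \\
\theta_j &:= \theta_j \left( 1 - \alpha \frac{\lambda}{m} \right)
  - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)},
  \qquad j = 1, \dots, n
\end{aligned}
```

Since $0 < 1 - \alpha \lambda / m < 1$ for reasonable settings, each update shrinks $\theta_j$ slightly toward 0 before applying the usual gradient step.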

### 3.3.2 Regularized logistic regression
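The cost function for regularized logistic regression (written to match the code: cross-entropy loss plus an L2 penalty that excludes $\theta_0$):

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)})
  + \left( 1 - y^{(i)} \right) \log \left( 1 - h_\theta(x^{(i)}) \right) \right]
  + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^{2},
\qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}
```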

A MATLAB implementation of this cost function for regularized logistic regression is as follows:

```matlab
function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. the parameters.

m = length(y);  % number of training examples
n = size(X, 2); % number of features

h = sigmoid(X * theta); % hypothesis: sigmoid of the linear combination

% Cost: cross-entropy loss plus L2 penalty (theta(1), i.e. theta_0, is not regularized)
J = sum((-y) .* log(h) - (1 - y) .* log(1 - h)) / m ...
    + lambda * sum(theta(2:n) .^ 2) / (2 * m);

% Gradient: theta_0 gets no regularization term
grad = zeros(size(theta));
grad(1) = sum((h - y) .* X(:, 1)) / m;
for i = 2:n
    grad(i) = sum((h - y) .* X(:, i)) / m + lambda * theta(i) / m;
end

end
```

A MATLAB snippet that calls it looks like this:

```matlab
% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1 (you can vary this)
lambda = 1;

% Set options
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Optimize
[theta, J, exit_flag] = ...
    fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);
```
