# Linear Regression with Multiple Variables in Python

1. Loading the training data

The training set is a comma-separated file; each row holds two features (X1, X2) and the target value (Y):

X1,X2,Y

2104,3,399900

1600,3,329900

2400,3,369000

1416,2,232000

```python
import numpy as np

# Load the data
def load_data(filename):
    data = []
    with open(filename, 'r') as f:
        for line in f:
            line = line.strip().split(',')
            current = [int(item) for item in line]
            data.append(current)
    return data

data = np.array(load_data(filename), np.int64)  # filename: path to the CSV above

x = data[:, (0, 1)].reshape((-1, 2))
y = data[:, 2].reshape((-1, 1))
m = y.shape[0]

# Print out some data points
print('First 10 examples from the dataset: \n')
print(' x = ', x[range(10), :], '\ny=', y[range(10), :])
```

First 10 examples from the dataset:

x = [[2104    3]
 [1600    3]
 [2400    3]
 [1416    2]
 [3000    4]
 [1985    4]
 [1534    3]
 [1427    3]
 [1380    3]
 [1494    3]]
y= [[399900]
 [329900]
 [369000]
 [232000]
 [539900]
 [299900]
 [314900]
 [198999]
 [212000]
 [242500]]

2. Solving for theta with gradient descent

(1) With multiple features, make sure the features are on similar scales; this helps gradient descent converge much faster.

(2) The cost function is the same as in the single-variable case: half the mean of the squared errors, J(theta) = (1/(2*m)) * sum((X.dot(theta) - y)**2).

(3) Vectorized computation: the gradient of J is (1/m) * X.T.dot(X.dot(theta) - y), so each update step is

theta = theta - (alpha/m) * (X.T.dot(X.dot(theta) - y))
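As a quick sanity check on the vectorized form (a standalone sketch with made-up numbers, not from the original post), the vectorized gradient should match computing each partial derivative in a loop:

```python
import numpy as np

# Tiny synthetic design matrix (last column is the bias) and targets
X = np.array([[1.0, 2.0, 1.0],
              [2.0, 0.0, 1.0],
              [3.0, 1.0, 1.0]])
y = np.array([[6.0], [5.0], [7.0]])
theta = np.zeros((3, 1))
m = y.shape[0]

# Vectorized gradient: (1/m) * X^T (X theta - y)
grad_vec = (1 / m) * X.T.dot(X.dot(theta) - y)

# Same thing, one partial derivative per parameter
grad_loop = np.zeros((3, 1))
for j in range(3):
    grad_loop[j] = np.mean((X.dot(theta) - y).ravel() * X[:, j])

assert np.allclose(grad_vec, grad_loop)
```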

(4) Full code:

```python
# Feature scaling
def featureNormalize(X):
    mu = np.zeros((1, X.shape[1]))
    sigma = np.zeros((1, X.shape[1]))
    for i in range(X.shape[1]):
        mu[0, i] = np.mean(X[:, i])     # mean of each column
        sigma[0, i] = np.std(X[:, i])   # standard deviation of each column
    X_norm = (X - mu) / sigma
    return X_norm, mu, sigma

# Compute the cost
def computeCost(X, y, theta):
    m = y.shape[0]
    C = X.dot(theta) - y
    J = (C.T.dot(C)) / (2 * m)
    return J

# Gradient descent
def gradientDescent(X, y, theta, alpha, num_iters):
    m = y.shape[0]
    J_history = np.zeros((num_iters, 1))  # cost at each iteration
    for it in range(num_iters):
        # dJ/dtheta = (1/m) * X^T (X theta - y); shapes: (3,m) x (m,1) = (3,1)
        theta = theta - (alpha / m) * (X.T.dot(X.dot(theta) - y))
        J_history[it] = computeCost(X, y, theta)
    return J_history, theta

iterations = 10000  # number of iterations
alpha = 0.01        # learning rate
x = data[:, (0, 1)].reshape((-1, 2))
y = data[:, 2].reshape((-1, 1))
m = y.shape[0]
x, mu, sigma = featureNormalize(x)
X = np.hstack([x, np.ones((x.shape[0], 1))])  # append the bias column

theta = np.zeros((3, 1))

j = computeCost(X, y, theta)
J_history, theta = gradientDescent(X, y, theta, alpha, iterations)
print('Theta found by gradient descent', theta)
```

Theta found by gradient descent [[ 109447.79646964]
 [  -6578.35485416]
 [ 340412.65957447]]
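As a cross-check on the gradient-descent result (my own addition, using synthetic data since the post's dataset isn't bundled here), the closed-form normal equation theta = (X^T X)^(-1) X^T y recovers the parameters directly, with no learning rate or iterations:

```python
import numpy as np

# Synthetic stand-in for the normalized features plus a bias column;
# in the post, X and y come from the loaded dataset.
rng = np.random.default_rng(0)
x = rng.normal(size=(47, 2))
theta_true = np.array([[3.0], [-1.0], [5.0]])
X = np.hstack([x, np.ones((47, 1))])
y = X.dot(theta_true)

# Normal equation: solve X^T X theta = X^T y directly
theta_ne = np.linalg.solve(X.T.dot(X), X.T.dot(y))
```

On noiseless data this matches the generating parameters exactly; on the real dataset it should agree closely with the theta found by gradient descent.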

```python
import matplotlib.pyplot as plt

plt.plot(J_history)
plt.ylabel('cost')
plt.xlabel('iter count')
plt.title('convergence graph')
plt.show()
```

```python
# Predict the price of a new example; the input must be normalized
# with the training-set mu and sigma before applying theta
def predict(data):
    testx = np.array(data).reshape((1, -1))
    testx = (testx - mu) / sigma
    testx = np.hstack([testx, np.ones((testx.shape[0], 1))])
    price = testx.dot(theta)
    print('price is %d ' % (price))

predict([1650, 3])
```

price is 293081
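One detail worth calling out (my note, with hypothetical mu/sigma values): at prediction time, new inputs must be scaled with the training set's mean and standard deviation, never with statistics computed from the new data itself, or the learned theta no longer applies:

```python
import numpy as np

# Hypothetical training-set statistics for the two features
mu = np.array([[2000.68, 3.17]])
sigma = np.array([[786.20, 0.75]])

def normalize_for_predict(raw, mu, sigma):
    """Apply the *training* mean/std to a new example."""
    return (np.array(raw, dtype=float).reshape(1, -1) - mu) / sigma

x_new = normalize_for_predict([1650, 3], mu, sigma)
```

This is why `featureNormalize` returns `mu` and `sigma` along with the scaled data: `predict` reuses them verbatim.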
