Stanford TensorFlow Tutorial (3): Linear and Logistic Regression

致Great

Published 2018-06-14 16:09:55

This article is part of the column: 程序生活

1. Linear regression: predicting life expectancy from birth rate

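You are probably already familiar with linear regression, so we will not introduce it here. We will simply build a neural network with a single layer and use it to model the linear relationship between the independent variable X and the dependent variable Y.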

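Problem description: The figure below visualizes the relationship between birth rate and life expectancy, using data from countries around the world. It shows an interesting pattern: the higher the birth rate in a region, the shorter its average life expectancy.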

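The question is whether we can quantify the relationship between X and Y. In other words, if a country's birth rate is X and its life expectancy is Y, can we find a linear function f such that Y = f(X)? If we can quantify this relationship, then given a country's birth rate we can predict its life expectancy. Full dataset: https://datacatalog.worldbank.org/dataset/world-development-indicators For simplicity, we use only the 2010 data: https://github.com/chiphuyen/stanford-tensorflow-tutorials/blob/master/examples/data/birth_life_2010.txt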
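
As a rough illustration of what "finding a linear function" means, the sketch below fits a straight line by ordinary least squares using NumPy's `np.polyfit`. The five data points are hypothetical values invented for this example (they are not taken from the World Bank dataset), and `predict` is a helper name introduced here.

```python
import numpy as np

# Hypothetical (birth rate, life expectancy) pairs, made up for illustration.
# They lie exactly on the line y = 89 - 6x so the fit is easy to check.
birth_rate = np.array([1.5, 2.0, 3.0, 4.5, 6.0])
life_expectancy = np.array([80.0, 77.0, 71.0, 62.0, 53.0])

# Least-squares fit of a degree-1 polynomial: returns (slope, intercept).
w, b = np.polyfit(birth_rate, life_expectancy, deg=1)


def predict(x):
    """Predict life expectancy from a birth rate with the fitted line."""
    return w * x + b


print("w =", w, "b =", b)
print("prediction for birth rate 3.5:", predict(3.5))
```

This closed-form fit is a different technique from the gradient-descent training used in the rest of the article; it is only meant to show the kind of function Y = wX + b we are looking for.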

Data description:
Name: Birth rate - life expectancy in 2010
X = birth rate. Type: float.
Y = life expectancy. Type: float.
Number of datapoints: 190
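
The script later in this article loads the file through `utils.read_birth_life_data` from the course repository. The sketch below is not that function; it is a minimal stand-in that assumes the file is tab-separated with one header row and the columns country, birth rate, life expectancy. The function name `read_birth_life` and the two sample rows are hypothetical.

```python
import io

import numpy as np


def read_birth_life(source):
    """Read (birth rate, life expectancy) pairs from an open text stream.

    Assumed layout: a header line, then 'country<TAB>birth<TAB>life' rows.
    Returns a float32 array of shape (n_samples, 2) and n_samples.
    """
    source.readline()  # skip the header row
    rows = []
    for line in source:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        rows.append((float(parts[1]), float(parts[2])))
    data = np.asarray(rows, dtype=np.float32)
    return data, len(rows)


# Two made-up rows standing in for the real file contents.
sample = io.StringIO(
    "Country\tBirth rate\tLife expectancy\n"
    "CountryA\t1.5\t80.0\n"
    "CountryB\t5.0\t60.0\n"
)
data, n_samples = read_birth_life(sample)
print(data.shape, n_samples)
```

With the real file, the same function could be called as `read_birth_life(open('data/birth_life_2010.txt'))`, provided the layout assumption holds.
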
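
Method: First, we assume that the relationship between birth rate and life expectancy is linear, which means we can find an equation of the form Y = wX + b. To compute w and b, we use backpropagation on a one-layer neural network. For the loss function we use mean squared error: after each training epoch, we measure the mean squared difference between the actual values of Y and the predicted values. The script below, 03_linreg_starter.py, implements these steps.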
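
For reference, the quantities involved can be written out as follows. The notation is introduced here and is not from the original post: $i$ indexes a datapoint, $n$ is the number of datapoints, $\ell_i$ is the squared error on datapoint $i$, and $\eta$ is the learning rate.

$$
\hat{Y}_i = w X_i + b, \qquad \ell_i = \left(Y_i - \hat{Y}_i\right)^2, \qquad \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \ell_i
$$

$$
\frac{\partial \ell_i}{\partial w} = -2 X_i \left(Y_i - \hat{Y}_i\right), \qquad \frac{\partial \ell_i}{\partial b} = -2 \left(Y_i - \hat{Y}_i\right)
$$

$$
w \leftarrow w - \eta \frac{\partial \ell_i}{\partial w}, \qquad b \leftarrow b - \eta \frac{\partial \ell_i}{\partial b}
$$

With a single weight and a single bias, "backpropagation" reduces to these two partial derivatives; gradient descent applies the update once per datapoint when datapoints are fed one at a time.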

```python
# -*- coding: utf-8 -*-
# @Author: yanqiang
# @Date:   2018-05-10 22:31:37
# @Last Modified by:   yanqiang
# @Last Modified time: 2018-05-10 23:05:47
# Note: written for TensorFlow 1.x (tf.placeholder and tf.Session are 1.x APIs)
import tensorflow as tf
import utils
import matplotlib.pyplot as plt

DATA_FILE = 'data/birth_life_2010.txt'

# Step 1: read in data from the .txt file
# data is a numpy array of shape (190, 2), each row is a datapoint
data, n_samples = utils.read_birth_life_data(DATA_FILE)

# Step 2: create placeholders for X (birth rate) and Y (life expectancy)
X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32, name='Y')

# Step 3: create weight and bias, initialized to 0
w = tf.get_variable('weights', initializer=tf.constant(0.0))
b = tf.get_variable('bias', initializer=tf.constant(0.0))

# Step 4: construct model to predict Y (life expectancy from birth rate)
Y_predicted = w * X + b

# Step 5: use the squared error of a single datapoint as the loss function
loss = tf.square(Y - Y_predicted, name='loss')

# Step 6: use gradient descent with a learning rate of 0.001 to minimize loss
optimizer = tf.train.GradientDescentOptimizer(
    learning_rate=0.001).minimize(loss)

with tf.Session() as sess:
    # Step 7: initialize the necessary variables, in this case, w and b
    sess.run(tf.global_variables_initializer())

    # Step 8: train the model
    for i in range(100):  # run 100 epochs
        for x, y in data:
            # Each run of the training op performs one gradient step
            # on the datapoint (x, y)
            sess.run(optimizer, feed_dict={X: x, Y: y})
    # Step 9: output the values of w and b
    w_out, b_out = sess.run([w, b])


# Plot the real data and the fitted line
plt.plot(data[:, 0], data[:, 1], 'bo', label='Real data')
plt.plot(data[:, 0], data[:, 0] * w_out + b_out, 'r', label='Predicted data')
plt.legend()
plt.show()
```
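
To make the training loop less of a black box, here is a minimal NumPy sketch of the same idea: zero-initialized w and b, one gradient step on the squared error per datapoint, learning rate 0.001, 100 epochs. It is not the course's code and does not use TensorFlow. The function names `sgd_linear_fit` and `mean_squared_error` are introduced here, and the data is synthetic (generated from a made-up line plus noise), not the World Bank figures.

```python
import numpy as np


def sgd_linear_fit(xs, ys, learning_rate=0.001, epochs=100):
    """Fit y ~ w * x + b by per-sample gradient descent on squared error."""
    w, b = 0.0, 0.0  # same zero initialization as the script above
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = y - (w * x + b)  # Y - Y_predicted
            # Gradients of error**2: d/dw = -2 * x * error, d/db = -2 * error.
            # Both updates use the error computed before either one changes.
            w += learning_rate * 2.0 * x * error
            b += learning_rate * 2.0 * error
    return w, b


def mean_squared_error(xs, ys, w, b):
    """Mean squared difference between actual and predicted values."""
    return float(np.mean((ys - (w * xs + b)) ** 2))


# Synthetic stand-in data: 190 points around the made-up line y = 86 - 6x.
rng = np.random.default_rng(0)
xs = rng.uniform(1.0, 7.0, size=190)
ys = 86.0 - 6.0 * xs + rng.normal(0.0, 2.0, size=190)

w_fit, b_fit = sgd_linear_fit(xs, ys)
print("w =", w_fit, "b =", b_fit)
print("MSE =", mean_squared_error(xs, ys, w_fit, b_fit))
```

Because the learning rate is small, the intercept converges slowly; after 100 epochs the fitted values should be close to, but not exactly at, the least-squares solution.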

[utils.py and all of the code used later are on GitHub](https://github.com/chiphuyen/stanford-tensorflow-tutorials)

Prediction result: