
TensorFlow 2.0 Code Practice Series (8): Bidirectional Recurrent Neural Network Example

Author | Aymeric Damien

Editor | 奇予纪

Produced by | 磐创AI Team

Original project | https://github.com/aymericdamien/TensorFlow-Examples/

Bidirectional Recurrent Neural Network Example

Build a bidirectional recurrent neural network (BiRNN) with TensorFlow. Note that the code below uses the TensorFlow 1.x graph API (tf.placeholder, tf.contrib.rnn, tf.Session), so it needs to run under TensorFlow 1.x; tf.contrib was removed in 2.x.

BiRNN Overview


Reference: Long Short-Term Memory [1], Sepp Hochreiter & Jürgen Schmidhuber, Neural Computation 9(8): 1735-1780, 1997.

MNIST Dataset Overview

This example uses the MNIST dataset of handwritten digits. The dataset contains 60,000 training examples and 10,000 test examples. The digits have been size-normalized and centered in fixed-size images (28x28 pixels) with values ranging from 0 to 255. For simplicity, each image is flattened into a one-dimensional numpy array of 784 features (28*28).

To classify images with a recurrent neural network, we treat each image row as a sequence of pixels. Because the MNIST image shape is 28x28 pixels, every sample is processed as a sequence of 28 timesteps with 28 features each, as sketched below.
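As a quick illustration of that reshaping (a minimal NumPy-only sketch with placeholder pixel values, not part of the original code), a flattened 784-feature image becomes a 28-step sequence of 28-pixel rows:

import numpy as np

flat = np.zeros(784, dtype=np.float32)   # one flattened MNIST image (placeholder values)
sequence = flat.reshape(28, 28)          # 28 timesteps, each a row of 28 pixel features
print(sequence.shape)                    # (28, 28)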

For more information, see: http://yann.lecun.com/exdb/mnist/

from __future__ import print_function

import tensorflow as tf
from tensorflow.contrib import rnn
import numpy as np

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

output:

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
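Note that the tensorflow.examples.tutorials module used above was removed in later TensorFlow releases. If it is unavailable, the raw data can be loaded through tf.keras.datasets.mnist instead; the following is a sketch of one possible substitution (scaling and one-hot encoding are done manually to mimic the format above):

import numpy as np
import tensorflow as tf

# Load raw MNIST and reproduce the flattened, one-hot format used above
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype(np.float32) / 255.0   # flatten, scale to [0, 1]
y_train = np.eye(10, dtype=np.float32)[y_train]                 # one-hot labels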
# Training parameters
learning_rate = 0.001
training_steps = 10000
batch_size = 128
display_step = 200

# Network parameters
num_input = 28 # MNIST data input (image shape: 28*28)
timesteps = 28 # number of timesteps
num_hidden = 128 # number of features in the hidden layer
num_classes = 10 # total classes (digits 0-9)

# tf Graph input
X = tf.placeholder("float", [None, timesteps, num_input])
Y = tf.placeholder("float", [None, num_classes])
# Define weights
weights = {
    # Hidden layer weights => 2*num_hidden because of forward + backward cells
    'out': tf.Variable(tf.random_normal([2*num_hidden, num_classes]))
}
biases = {
    'out': tf.Variable(tf.random_normal([num_classes]))
}
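The factor of 2 is there because the bidirectional wrapper concatenates the forward and backward cell outputs along the feature axis, so the classifier sees vectors of depth 2*num_hidden (256 here). A NumPy sketch with dummy data, purely to show the shapes:

import numpy as np

fw = np.zeros((128, 128))   # (batch_size, num_hidden) forward output at a timestep
bw = np.zeros((128, 128))   # (batch_size, num_hidden) backward output at the same timestep
combined = np.concatenate([fw, bw], axis=1)
print(combined.shape)       # (128, 256) -- matches the 2*num_hidden rows of weights['out']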
def BiRNN(x, weights, biases):

    # Prepare the data shape to match the rnn function's requirements
    # Current data input shape: (batch_size, timesteps, num_input)
    # Required shape: a list of 'timesteps' tensors of shape (batch_size, num_input)

    # Unstack to get a list of 'timesteps' tensors of shape (batch_size, num_input)
    x = tf.unstack(x, timesteps, 1)

    # Define LSTM cells with TensorFlow
    # Forward cell
    lstm_fw_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
    # Backward cell
    lstm_bw_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)

    # Get LSTM cell outputs
    try:
        outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                                     dtype=tf.float32)
    except Exception: # older TensorFlow versions only return outputs, not states
        outputs = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                               dtype=tf.float32)

    # Linear activation, using the output of the last timestep of the RNN loop
    return tf.matmul(outputs[-1], weights['out']) + biases['out']
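For readers actually on TensorFlow 2.x, where tf.contrib no longer exists, roughly the same architecture can be sketched with Keras' Bidirectional wrapper. This is an assumed equivalent, not the article's original code:

import tensorflow as tf

# Bidirectional LSTM over 28 timesteps of 28 features, then a linear classifier
model = tf.keras.Sequential([
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128), input_shape=(28, 28)),
    tf.keras.layers.Dense(10)   # logits for the 10 digit classes
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])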
logits = BiRNN(X, weights, biases)
prediction = tf.nn.softmax(logits)

# Define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=logits, labels=Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

# Evaluate model
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initialize the variables (i.e. assign their default values)
init = tf.global_variables_initializer()
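The accuracy node simply compares argmax indices between the softmax predictions and the one-hot labels; a small NumPy sketch with made-up values shows the same computation:

import numpy as np

pred = np.array([[0.1, 0.9], [0.8, 0.2]])    # softmax outputs for two samples
label = np.array([[0.0, 1.0], [1.0, 0.0]])   # one-hot ground truth
acc = np.mean(np.argmax(pred, axis=1) == np.argmax(label, axis=1))
print(acc)   # 1.0 -- both predicted classes match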
# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    for step in range(1, training_steps+1):
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Reshape data to get 28 sequences of 28 elements
        batch_x = batch_x.reshape((batch_size, timesteps, num_input))
        # Run the optimization op (backprop)
        sess.run(train_op, feed_dict={X: batch_x, Y: batch_y})
        if step % display_step == 0 or step == 1:
            # Calculate batch loss and accuracy
            loss, acc = sess.run([loss_op, accuracy], feed_dict={X: batch_x,
                                                                 Y: batch_y})
            print("Step " + str(step) + ", Minibatch Loss= " + \
                  "{:.4f}".format(loss) + ", Training Accuracy= " + \
                  "{:.3f}".format(acc))

    print("Optimization Finished!")

    # Calculate accuracy for 128 MNIST test images
    test_len = 128
    test_data = mnist.test.images[:test_len].reshape((-1, timesteps, num_input))
    test_label = mnist.test.labels[:test_len]
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={X: test_data, Y: test_label}))

output:

Step 1, Minibatch Loss= 2.6218, Training Accuracy= 0.086
Step 200, Minibatch Loss= 2.1900, Training Accuracy= 0.211
Step 400, Minibatch Loss= 2.0144, Training Accuracy= 0.375
Step 600, Minibatch Loss= 1.8729, Training Accuracy= 0.445
Step 800, Minibatch Loss= 1.8000, Training Accuracy= 0.469
Step 1000, Minibatch Loss= 1.7244, Training Accuracy= 0.453
Step 1200, Minibatch Loss= 1.5657, Training Accuracy= 0.523
Step 1400, Minibatch Loss= 1.5473, Training Accuracy= 0.547
Step 1600, Minibatch Loss= 1.5288, Training Accuracy= 0.500
Step 1800, Minibatch Loss= 1.4203, Training Accuracy= 0.555
Step 2000, Minibatch Loss= 1.2525, Training Accuracy= 0.641
Step 2200, Minibatch Loss= 1.2696, Training Accuracy= 0.594
Step 2400, Minibatch Loss= 1.2000, Training Accuracy= 0.664
Step 2600, Minibatch Loss= 1.1017, Training Accuracy= 0.625
Step 2800, Minibatch Loss= 1.2656, Training Accuracy= 0.578
Step 3000, Minibatch Loss= 1.0830, Training Accuracy= 0.656
Step 3200, Minibatch Loss= 1.1522, Training Accuracy= 0.633
Step 3400, Minibatch Loss= 0.9484, Training Accuracy= 0.680
Step 3600, Minibatch Loss= 1.0470, Training Accuracy= 0.641
Step 3800, Minibatch Loss= 1.0609, Training Accuracy= 0.586
Step 4000, Minibatch Loss= 1.1853, Training Accuracy= 0.648
Step 4200, Minibatch Loss= 0.9438, Training Accuracy= 0.750
Step 4400, Minibatch Loss= 0.7986, Training Accuracy= 0.766
Step 4600, Minibatch Loss= 0.8070, Training Accuracy= 0.750
Step 4800, Minibatch Loss= 0.8382, Training Accuracy= 0.734
Step 5000, Minibatch Loss= 0.7397, Training Accuracy= 0.766
Step 5200, Minibatch Loss= 0.7870, Training Accuracy= 0.727
Step 5400, Minibatch Loss= 0.6380, Training Accuracy= 0.828
Step 5600, Minibatch Loss= 0.7975, Training Accuracy= 0.719
Step 5800, Minibatch Loss= 0.7934, Training Accuracy= 0.766
Step 6000, Minibatch Loss= 0.6628, Training Accuracy= 0.805
Step 6200, Minibatch Loss= 0.7958, Training Accuracy= 0.672
Step 6400, Minibatch Loss= 0.6582, Training Accuracy= 0.773
Step 6600, Minibatch Loss= 0.5908, Training Accuracy= 0.812
Step 6800, Minibatch Loss= 0.6182, Training Accuracy= 0.820
Step 7000, Minibatch Loss= 0.5513, Training Accuracy= 0.812
Step 7200, Minibatch Loss= 0.6683, Training Accuracy= 0.789
Step 7400, Minibatch Loss= 0.5337, Training Accuracy= 0.828
Step 7600, Minibatch Loss= 0.6428, Training Accuracy= 0.805
Step 7800, Minibatch Loss= 0.6708, Training Accuracy= 0.797
Step 8000, Minibatch Loss= 0.4664, Training Accuracy= 0.852
Step 8200, Minibatch Loss= 0.4249, Training Accuracy= 0.859
Step 8400, Minibatch Loss= 0.7723, Training Accuracy= 0.773
Step 8600, Minibatch Loss= 0.4706, Training Accuracy= 0.859
Step 8800, Minibatch Loss= 0.4800, Training Accuracy= 0.867
Step 9000, Minibatch Loss= 0.4636, Training Accuracy= 0.891
Step 9200, Minibatch Loss= 0.5734, Training Accuracy= 0.828
Step 9400, Minibatch Loss= 0.5548, Training Accuracy= 0.875
Step 9600, Minibatch Loss= 0.3575, Training Accuracy= 0.922
Step 9800, Minibatch Loss= 0.4566, Training Accuracy= 0.844
Step 10000, Minibatch Loss= 0.5125, Training Accuracy= 0.844

Optimization Finished!

Testing Accuracy: 0.890625

[1]: http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf
