# How to Forecast Time Series with TensorFlow: A Detailed Tutorial on the TFTS Library

### Author | 何之源

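• Reading time series data (two ways: from a NumPy array and from a CSV file)
• Forecasting a time series with an AR model
• Forecasting a time series with an LSTM model (both univariate and multivariate)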

    # coding: utf-8
    from __future__ import print_function
    import numpy as np
    import matplotlib
    matplotlib.use('agg')
    import matplotlib.pyplot as plt
    import tensorflow as tf
    from tensorflow.contrib.timeseries.python.timeseries import NumpyReader

    x = np.array(range(1000))
    noise = np.random.uniform(-0.2, 0.2, 1000)
    y = np.sin(np.pi * x / 100) + x / 200. + noise
    plt.plot(x, y)
    plt.savefig('timeseries_y.jpg')

    train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
        reader, batch_size=2, window_size=10)

    with tf.Session() as sess:
        batch_data = train_input_fn.create_batch()
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        one_batch = sess.run(batch_data[0])
        coord.request_stop()

    print('one_batch_data:', one_batch)
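To make the windowing concrete, here is a minimal pure-Python sketch of what drawing one batch of random windows could look like. It is not the TFTS implementation: `sample_random_windows` is a hypothetical helper, and the stand-in series is a plain trend rather than the article's noisy sine. It only mimics the shape of the result for batch_size=2 and window_size=10, where each batch element is a run of consecutive time points with their values.

```python
import random


def sample_random_windows(times, values, batch_size, window_size, rng=None):
    """Pick `batch_size` random contiguous windows of length `window_size`.

    Hypothetical helper for illustration; not part of TFTS.
    """
    rng = rng or random.Random()
    if window_size > len(times):
        raise ValueError("window_size exceeds the series length")
    batch_times, batch_values = [], []
    for _ in range(batch_size):
        # Choose a start index so that the whole window fits in the series.
        start = rng.randrange(len(times) - window_size + 1)
        batch_times.append(times[start:start + window_size])
        batch_values.append(values[start:start + window_size])
    return {"times": batch_times, "values": batch_values}


times = list(range(1000))
values = [t / 200.0 for t in times]  # stand-in series (assumption)
batch = sample_random_windows(times, values, batch_size=2, window_size=10,
                              rng=random.Random(0))
print(batch["times"])
```

Each call returns two windows of ten consecutive time points, which is the structure the printed one_batch value has in the TFTS version.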

    1,-0.6656603714
    2,-0.1164380359
    3,0.7398626488
    4,0.7368633029
    5,0.2289480898
    6,2.257073255
    7,3.023457405
    8,2.481161007
    9,3.773638612
    10,5.059257738
    11,3.553186083

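The first column of the CSV file is the time point, and the second column is the value observed at that time point. It can be read as follows: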

    # coding: utf-8
    from __future__ import print_function
    import tensorflow as tf

    csv_file_name = './data/period_trend.csv'
    reader = tf.contrib.timeseries.CSVReader(csv_file_name)
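For readers who want to inspect such a file without TensorFlow, the following is a small sketch using only the standard library. `read_time_value_csv` is a hypothetical helper, not a TFTS function, and it assumes the two-column layout described above (integer time point, then a float value, no header row).

```python
import csv
import io


def read_time_value_csv(fileobj):
    """Parse rows of `time,value` into two parallel lists.

    Hypothetical helper; assumes no header row.
    """
    times, values = [], []
    for row in csv.reader(fileobj):
        if not row:
            continue  # skip blank lines
        times.append(int(row[0]))
        values.append(float(row[1]))
    return {"times": times, "values": values}


# The first three rows of the sample data shown above.
sample = io.StringIO("1,-0.6656603714\n2,-0.1164380359\n3,0.7398626488\n")
data = read_time_value_csv(sample)
print(data["times"], data["values"])
```

With a real file, `open('./data/period_trend.csv')` can be passed in place of the in-memory sample.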

    x = np.array(range(1000))
    noise = np.random.uniform(-0.2, 0.2, 1000)
    y = np.sin(np.pi * x / 100) + x / 200. + noise
    plt.plot(x, y)
    plt.savefig('timeseries_y.jpg')

    data = {
        tf.contrib.timeseries.TrainEvalFeatures.TIMES: x,
        tf.contrib.timeseries.TrainEvalFeatures.VALUES: y,
    }

    reader = NumpyReader(data)

    train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
        reader, batch_size=16, window_size=40)

    ar = tf.contrib.timeseries.ARRegressor(
        periodicities=200, input_window_size=30, output_window_size=10,
        num_features=1,
        loss=tf.contrib.timeseries.ARModel.NORMAL_LIKELIHOOD_LOSS)

The num_features parameter is the dimensionality of the value observed at each time point. Here each step is a single scalar, so num_features=1.
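The window sizes in this section fit together: the training windows have window_size=40, which equals input_window_size + output_window_size = 30 + 10. The sketch below illustrates that relationship under the assumption that each window is split into 30 input points followed by 10 target points. `split_window` is a hypothetical helper for illustration, not the library's internal code.

```python
def split_window(window, input_window_size, output_window_size):
    """Split one training window into model inputs and prediction targets.

    Hypothetical helper; assumes the window is exactly input + output long.
    """
    expected = input_window_size + output_window_size
    if len(window) != expected:
        raise ValueError(
            "window has %d points, expected %d" % (len(window), expected))
    return window[:input_window_size], window[input_window_size:]


window = list(range(40))  # stand-in for 40 consecutive observations
inputs, targets = split_window(window, input_window_size=30,
                               output_window_size=10)
print(len(inputs), len(targets))
```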

    ar.train(input_fn=train_input_fn, steps=6000)

In TFTS, evaluation means running the trained model over the original training set, which lets us see how well the model fits the data. The corresponding code is:

    evaluation_input_fn = tf.contrib.timeseries.WholeDatasetInputFn(reader)
    evaluation = ar.evaluate(input_fn=evaluation_input_fn, steps=1)

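evaluation['start_tuple'] is used in the prediction step that follows. It holds, in effect, the last 30 output values and their time points. Starting from there, we can predict the values beyond step 1000. The corresponding code is: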

    (predictions,) = tuple(ar.predict(
        input_fn=tf.contrib.timeseries.predict_continuation_input_fn(
            evaluation, steps=250)))
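The idea of continuing a series from its last 30 observations can be sketched without TensorFlow. The code below is an illustration only: `continue_series` and `linear_extrapolation` are hypothetical helpers, the predictor is a naive slope extrapolation rather than the trained AR model, and it advances one step at a time, which is a simplification and not a description of how TFTS produces its 250 predicted steps.

```python
def continue_series(last_times, last_values, steps, predict_next):
    """Roll a one-step predictor forward, feeding predictions back in."""
    window = len(last_values)
    times = list(last_times)
    values = list(last_values)
    out_times, out_values = [], []
    for _ in range(steps):
        next_time = times[-1] + 1
        # Predict from the most recent `window` values (observed or predicted).
        next_value = predict_next(values[-window:])
        times.append(next_time)
        values.append(next_value)
        out_times.append(next_time)
        out_values.append(next_value)
    return {"times": out_times, "mean": out_values}


def linear_extrapolation(history):
    """Stand-in predictor: continue the average slope of the window."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + slope


last_times = list(range(970, 1000))            # the final 30 time points
last_values = [t / 200.0 for t in last_times]  # stand-in trend (assumption)
predictions = continue_series(last_times, last_values, steps=250,
                              predict_next=linear_extrapolation)
print(predictions["times"][0], predictions["times"][-1])
```

The returned dictionary uses the same 'times' and 'mean' keys that the plotting code in this tutorial reads from the real predictions.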

    plt.figure(figsize=(15, 5))
    plt.plot(data['times'].reshape(-1), data['values'].reshape(-1), label='origin')
    plt.plot(evaluation['times'].reshape(-1), evaluation['mean'].reshape(-1), label='evaluation')
    plt.plot(predictions['times'].reshape(-1), predictions['mean'].reshape(-1), label='prediction')
    plt.xlabel('time_step')
    plt.ylabel('values')
    plt.legend(loc=4)
    plt.savefig('predict_result.jpg')

    x = np.array(range(1000))
    noise = np.random.uniform(-0.2, 0.2, 1000)
    y = np.sin(np.pi * x / 50) + np.cos(np.pi * x / 50) + np.sin(np.pi * x / 25) + noise

    data = {
        tf.contrib.timeseries.TrainEvalFeatures.TIMES: x,
        tf.contrib.timeseries.TrainEvalFeatures.VALUES: y,
    }

    reader = NumpyReader(data)

    train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
        reader, batch_size=4, window_size=100)

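    # ts_estimators comes from the TFTS package; _LSTMModel is the tutorial's
    # custom model class, which is not shown in this section.
    from tensorflow.contrib.timeseries.python.timeseries import estimators as ts_estimators

    estimator = ts_estimators.TimeSeriesRegressor(
        model=_LSTMModel(num_features=1, num_units=128),
        optimizer=tf.train.AdamOptimizer(0.001))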

num_features=1 means a univariate time series: the quantity observed at each time point is a single number. num_units=128 means an LSTM with a hidden layer of size 128.
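For a sense of scale, the sketch below counts the trainable parameters of a standard LSTM cell for these settings. It uses the textbook formula (four gates, each with weights over the input and the previous hidden state, plus a bias) and is not read from _LSTMModel, which may add further layers on top. `lstm_cell_param_count` is a hypothetical helper.

```python
def lstm_cell_param_count(num_features, num_units):
    """Parameter count of a plain LSTM cell (no peepholes, no projection)."""
    # Each gate: weights for [input, previous hidden state] plus a bias.
    per_gate = (num_features + num_units) * num_units + num_units
    return 4 * per_gate  # input gate, forget gate, cell candidate, output gate


univariate = lstm_cell_param_count(num_features=1, num_units=128)
multivariate = lstm_cell_param_count(num_features=5, num_units=128)
print(univariate, multivariate)
```

Moving from one observed feature to five, as in the multivariate case later in this tutorial, changes the cell size only slightly, because most of the weights connect the 128 hidden units to themselves.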

    estimator.train(input_fn=train_input_fn, steps=2000)
    evaluation_input_fn = tf.contrib.timeseries.WholeDatasetInputFn(reader)
    evaluation = estimator.evaluate(input_fn=evaluation_input_fn, steps=1)
    # Predict starting after the evaluation
    (predictions,) = tuple(estimator.predict(
        input_fn=tf.contrib.timeseries.predict_continuation_input_fn(
            evaluation, steps=200)))

    0,0.926906299771,1.99107237682,2.56546245685,3.07914768197,4.04839057867
    1,0.108010001864,1.41645361423,2.1686839775,2.94963962176,4.1263503303
    2,-0.800567600028,1.0172132907,1.96434754116,2.99885333086,4.04300485864
    3,0.0607042871898,0.719540073421,1.9765012584,2.89265588817,4.0951014426
    4,0.933712200629,0.28052120776,1.41018552514,2.69232603996,4.06481164223
    5,-0.171730652974,0.260054421028,1.48770816369,2.62199129293,4.44572807842
    6,-1.00180162933,0.333045158863,1.50006392277,2.88888309683,4.24755865606
    7,0.0580061875336,0.688929398826,1.56543458772,2.99840358953,4.52726873347

    from os import path  # needed for path.join below

    csv_file_name = path.join("./data/multivariate_periods.csv")
    reader = tf.contrib.timeseries.CSVReader(
        csv_file_name,
        column_names=((tf.contrib.timeseries.TrainEvalFeatures.TIMES,)
                      + (tf.contrib.timeseries.TrainEvalFeatures.VALUES,) * 5))

    train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
        reader, batch_size=4, window_size=32)
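The column_names argument describes one time column followed by five value columns. As a rough, TensorFlow-free illustration of that layout, the sketch below builds the same kind of tuple and uses it to split one row of the sample file. The string constants "times" and "values" mirror the dictionary keys used in the plotting code earlier and are stated here as an assumption; `parse_multivariate_row` is a hypothetical helper.

```python
import csv
import io

TIMES = "times"    # assumed to match TrainEvalFeatures.TIMES
VALUES = "values"  # assumed to match TrainEvalFeatures.VALUES

# One time column followed by five value columns.
column_names = (TIMES,) + (VALUES,) * 5


def parse_multivariate_row(row, column_names):
    """Split one CSV row into its time point and its list of values."""
    if len(row) != len(column_names):
        raise ValueError("row has %d cells, expected %d"
                         % (len(row), len(column_names)))
    time_point, values = None, []
    for name, cell in zip(column_names, row):
        if name == TIMES:
            time_point = int(cell)
        else:
            values.append(float(cell))
    return time_point, values


# First row of the sample data shown above.
sample = io.StringIO(
    "0,0.926906299771,1.99107237682,2.56546245685,3.07914768197,4.04839057867\n")
first_row = next(csv.reader(sample))
time_point, values = parse_multivariate_row(first_row, column_names)
print(time_point, values)
```

Each time point therefore carries five numbers, which is why the model for this data is built with num_features=5.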

    estimator = ts_estimators.TimeSeriesRegressor(
        model=_LSTMModel(num_features=5, num_units=128),
        optimizer=tf.train.AdamOptimizer(0.001))
