I have a time series that looks like this:
2000 0.000
2001 -0.174
2002 -0.131
2003 0.127
2004 0.566
2005 0.723
2006 0.675
2007 1.171
2008 2.338
2009 2.625
2010 3.746
2011 3.612
2012 4.729
2013 8.156
2014 16.330
2015 27.584
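What is the most efficient way to estimate the linear trend of this series and then compute the gap between the trend line and the series?
Thanks in advance!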
Posted on 2019-10-07 22:14:44
Use a simple linear regression model from scikit-learn:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# The series from the question
a = {
    'year': [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015],
    'y_true': [0, -0.174, -0.131, 0.127, 0.566, 0.723, 0.675, 1.171, 2.338, 2.625, 3.746, 3.612, 4.729, 8.156, 16.330, 27.584],
}
df = pd.DataFrame(a)

# scikit-learn expects a 2-D feature array of shape (n_samples, 1)
x = np.array(df['year']).reshape(-1, 1)
y_true = df['y_true']

# Fit the linear trend and evaluate it at every year
linear_reg = LinearRegression().fit(x, y_true)
y_pred = linear_reg.predict(x)

# Gap between the series and the trend line (the residuals)
df['y_pred'] = y_pred
df['difference'] = y_true - y_pred
print(df)
Output:
    year  y_true     y_pred  difference
0   2000   0.000  -4.366596    4.366596
1   2001  -0.174  -3.183741    3.009741
2   2002  -0.131  -2.000887    1.869887
3   2003   0.127  -0.818032    0.945032
4   2004   0.566   0.364822    0.201178
5   2005   0.723   1.547676   -0.824676
6   2006   0.675   2.730531   -2.055531
7   2007   1.171   3.913385   -2.742385
8   2008   2.338   5.096240   -2.758240
9   2009   2.625   6.279094   -3.654094
10  2010   3.746   7.461949   -3.715949
11  2011   3.612   8.644803   -5.032803
12  2012   4.729   9.827657   -5.098657
13  2013   8.156  11.010512   -2.854512
14  2014  16.330  12.193366    4.136634
15  2015  27.584  13.376221   14.207779
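If pulling in scikit-learn feels heavy for a single straight-line fit, the same trend and differences can probably be obtained with NumPy alone. The sketch below is not part of the original answer. It assumes `np.polyfit` with degree 1 is an acceptable substitute for `LinearRegression`, and the names `t`, `slope`, `intercept`, `trend` and `residuals` are illustrative. The years are shifted to start at 0 before fitting, which leaves the slope unchanged and keeps the fit numerically well behaved.

```python
import numpy as np

# The series from the question
years = np.arange(2000, 2016)
y = np.array([0.000, -0.174, -0.131, 0.127, 0.566, 0.723, 0.675, 1.171,
              2.338, 2.625, 3.746, 3.612, 4.729, 8.156, 16.330, 27.584])

# Shift the time axis so it starts at 0; the intercept is then the
# trend value in the year 2000 rather than in year 0.
t = years - years[0]

# Degree-1 least-squares polynomial fit: returns [slope, intercept]
slope, intercept = np.polyfit(t, y, 1)

# Trend line evaluated at every year, and the gap to the observed series
trend = slope * t + intercept
residuals = y - trend

print(f"slope per year: {slope:.6f}")
for year, r in zip(years, residuals):
    print(year, round(r, 6))
```

The `residuals` array should match the `difference` column above, since both approaches solve the same ordinary least-squares problem.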
https://stackoverflow.com/questions/58277633