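I have time series code that generates linear and quadratic trends. I am confused about what to choose for the degree parameter. I came across the following definition: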
Within scikit-learn's PolynomialFeatures, when the argument degree is passed, all terms up to that degree are created.
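I just don't understand this definition. Is there a simple mathematical explanation? How can I make sure I am using the best degree?
Here is my code, in case you want it as a sample.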
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
import statsmodels.tsa.api as smt
import random
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
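# Simulated series: four 50-point segments with means 0, 30, 50, 20 and noise sd 5
y = [5*np.random.normal() for j in range(50)] + [30 + 5 * np.random.normal() for j in range(50)] + [50 + 5 * np.random.normal() for j in range(50)] + [20 + 5 * np.random.normal() for j in range(50)]
# Time index as a single-column feature matrix, shape (200, 1)
X = [x for x in range(len(y))]
X = np.reshape(X, (len(X), 1))
# Linear trend: y ~ a + b*t
model = LinearRegression()
model.fit(X, y)
trend = model.predict(X)
# Quadratic trend: expand t into [1, t, t^2], then fit a ridge regression
model = make_pipeline(PolynomialFeatures(2), Ridge())
model.fit(X, y)
quadratic = model.predict(X)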
fig = plt.figure(1, figsize=(15, 9))
ax = fig.add_subplot(111)
ax.plot(trend, label="Linear Trend")
ax.plot(quadratic, label="Quadratic Trend")
ax.plot(X, y, label='Time Series')
ax.legend()
plt.show()
Posted on 2017-08-25 21:33:56
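You are using 2 as the degree; the linear component is included in the quadratic. For example, if the computed linear component is 2x - 5 and the quadratic component is 3x^2 + x + 1, then what you get back from the function is their sum, 3x^2 + 3x - 4.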
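To make "all terms up to that degree" concrete, here is a minimal sketch that prints what PolynomialFeatures produces for a single input column. The array X_demo and the variable names are made up for illustration; only PolynomialFeatures and fit_transform come from scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# A tiny single-feature input, shaped (n_samples, 1) like X in the question
X_demo = np.array([[2.0], [3.0]])

# degree=2 expands each x into the columns [1, x, x^2]
expanded = PolynomialFeatures(degree=2).fit_transform(X_demo)
print(expanded)

# degree=3 keeps those columns and adds one more, x^3
expanded3 = PolynomialFeatures(degree=3).fit_transform(X_demo)
print(expanded3.shape)
```

With one input feature, degree d therefore gives d + 1 columns, and the regression that follows fits a single polynomial a + b*x + c*x^2 (for d = 2) over all of them at once. In that sense the linear term is already part of the degree-2 model.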
https://stackoverflow.com/questions/45886879