How to use seasonal_decompose. How to handle the various errors raised when using seasonal_decompose. How do we actually use or implement seasonal_decompose.
Posted on 2022-03-03 18:30:07
Import everything we need
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import datetime
from statsmodels.tsa.seasonal import seasonal_decompose
Prepare the test data
data = {'Unix Timestamp': ['1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12','1.61888E+12'],
'Date': ['4/20/2021 0:02','4/20/2021 0:01','4/20/2021 0:00','4/19/2021 23:59','4/19/2021 23:58','4/19/2021 23:57','4/19/2021 23:56','4/19/2021 23:55','4/19/2021 23:54','4/19/2021 23:53','4/19/2021 23:52','4/19/2021 23:51','4/19/2021 23:50','4/19/2021 23:49','4/19/2021 23:48','4/19/2021 23:47','4/19/2021 23:46','4/20/2021 0:02','4/20/2021 0:01','4/20/2021 0:00','4/19/2021 23:59','4/19/2021 23:58','4/19/2021 23:57','4/19/2021 23:56','4/19/2021 23:55','4/19/2021 23:54','4/19/2021 23:53','4/19/2021 23:52','4/19/2021 23:51','4/19/2021 23:50','4/19/2021 23:49','4/19/2021 23:48','4/19/2021 23:47','4/19/2021 23:46'],
'Symbol': ['BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD','BTCUSD'],
'Open': [55717.47,55768.94,55691.79,55777.86,55803.5,55690.64,55624.69,55651.82,55688.08,55749.28,55704.59,55779.38,55816.61,55843.69,55880.12,55890.88,0,55717.47,55768.94,55691.79,55777.86,55803.5,55690.64,55624.69,55651.82,55688.08,55749.28,55704.59,55779.38,55816.61,55843.69,55880.12,55890.88,0],
'High': [55723,55849.82,55793.15,55777.86,55823.88,55822.91,55713.02,55675.92,55730.21,55749.28,55759.27,55779.38,55835.57,55863.89,55916.47,55918.87,0,55723,55849.82,55793.15,55777.86,55823.88,55822.91,55713.02,55675.92,55730.21,55749.28,55759.27,55779.38,55835.57,55863.89,55916.47,55918.87,0],
'Low': [55541.69,55711.74,55691.79,55677.92,55773.08,55682.56,55624.63,55621.58,55641.46,55688.08,55695.42,55688.66,55769.46,55797.08,55815.99,55826.84,0,55541.69,55711.74,55691.79,55677.92,55773.08,55682.56,55624.63,55621.58,55641.46,55688.08,55695.42,55688.66,55769.46,55797.08,55815.99,55826.84,0]}
df=pd.DataFrame(data)
Run the decomposition
df_seasonal = seasonal_decompose(df)
We hit the first error
ValueError: could not convert string to float:
Let's fix the error above. To do so, run the code below, which converts the Date column from strings to datetime objects
df['Date'] = df['Date'].apply(
lambda x : datetime.datetime.strptime(str(x),'%m/%d/%Y %H:%M')
)
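As an alternative to the apply/strptime call above, pandas can parse the whole column in one step with pd.to_datetime. The following is a minimal sketch on a small made-up frame (the name sample and its three rows are assumptions for illustration, not part of the original data), using the same format string as above.

```python
import pandas as pd

# Hypothetical three-row sample in the same month/day/year layout as the Date column.
sample = pd.DataFrame({'Date': ['4/20/2021 0:02', '4/20/2021 0:01', '4/19/2021 23:59']})

# Vectorised parsing; passing the format explicitly avoids day/month guessing.
sample['Date'] = pd.to_datetime(sample['Date'], format='%m/%d/%Y %H:%M')

print(sample['Date'].dtype)  # a datetime64 dtype instead of object
```

The result should be equivalent to the lambda version; the main difference is that the parsing is done by pandas for the whole column at once.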
Now, if you run seasonal_decompose again, you will get a new error
df_seasonal = seasonal_decompose(df)
The new error will be
TypeError: float() argument must be a string or a number, not 'Timestamp'
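To fix this error, pass one column at a time, and make sure the column you pass is numeric (the Timestamp values in Date cannot be converted to float). Try the decomposition with the code below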
df_seasonal = seasonal_decompose(df['Open'])
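Both errors so far come from non-numeric columns reaching seasonal_decompose. One way to see which columns are safe to pass is to select the numeric ones by dtype. This is a sketch on a small hypothetical frame (sample is made up to mirror the mix of text, datetime, and numeric columns in the test data).

```python
import pandas as pd

# Hypothetical frame mixing text, datetime and numeric columns, like the test data.
sample = pd.DataFrame({
    'Symbol': ['BTCUSD', 'BTCUSD', 'BTCUSD'],
    'Date': pd.to_datetime(['2021-04-19 23:58', '2021-04-19 23:59', '2021-04-20 00:00']),
    'Open': [55803.5, 55777.86, 55691.79],
    'High': [55823.88, 55777.86, 55793.15],
})

# Keep only the numeric columns; these are the candidates for decomposition.
numeric_cols = sample.select_dtypes(include='number').columns.tolist()
print(numeric_cols)  # ['Open', 'High']
```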
Now you will get a new error, as shown below
ValueError: You must specify a period or x must be a pandas object with a PeriodIndex or a DatetimeIndex with a freq not set to None
There are two solutions to this error. First solution: use the period parameter of seasonal_decompose
df_seasonal = seasonal_decompose(df['Open'], period=1)  ## we have one observation per minute, so period is set to 1 here, but this is not necessarily the correct value
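In the code above, we have one observation per minute, so period was set to 1. However, this is not necessarily the correct value: period is the number of observations in one seasonal cycle of the input data (for example, 60 for an hourly cycle in per-minute data). For more on how to determine the period, and for the full list of freq abbreviations (pandas calls them offset aliases), see the links in the original post.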
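One rough way to get a starting value for period is to look for the lag at which the series correlates most strongly with itself. The helper below, estimate_period, is a hypothetical name written for this illustration (it is not a statsmodels function), and the sine series is synthetic; treat it as a heuristic sketch rather than a definitive method.

```python
import numpy as np

def estimate_period(values, max_lag=None):
    """Return the lag (>= 2) with the highest autocovariance.

    Illustrative heuristic only; real data usually needs detrending first.
    """
    x = np.asarray(values, dtype=float)
    x = x - x.mean()                      # remove the mean level
    n = len(x)
    if max_lag is None:
        max_lag = n // 2
    # Autocovariance for lags 0..n-1 (second half of the symmetric 'full' output).
    acov = np.correlate(x, x, mode='full')[n - 1:]
    lags = np.arange(2, max_lag + 1)
    return int(lags[np.argmax(acov[lags])])

# Synthetic series that repeats every 12 observations.
t = np.arange(120)
series = np.sin(2 * np.pi * t / 12)

period = estimate_period(series)
print(period)  # 12 for this synthetic series
```

On noisy or trending data the strongest lag may be a multiple of the true cycle or an artifact of the trend, so it is worth checking the estimate against a plot.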
Second solution: give the data a datetime index together with a frequency
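df = df.drop_duplicates(subset='Date')  ## the test data repeats every timestamp, and asfreq needs unique index labels
df = df.set_index('Date').asfreq('2Min')  ## M for months, S for seconds, Min for minutes. The data is already at 1-minute spacing; here it is put on a 2Min grid
df_seasonal = seasonal_decompose(df['Open'])  ## no period argument here: statsmodels tries to infer it from the index frequency. If it does not recognise the frequency, pass period explicitly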
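Before relying on the index frequency, it can help to confirm that the index really carries one. The sketch below uses a small made-up frame (raw and clean are hypothetical names) with out-of-order rows and one repeated timestamp, and shows one way to end up with a regular DatetimeIndex.

```python
import pandas as pd

# Hypothetical minute-level readings, out of order and with one repeated timestamp.
raw = pd.DataFrame({
    'Date': pd.to_datetime(['2021-04-19 23:58', '2021-04-19 23:56', '2021-04-19 23:57',
                            '2021-04-19 23:59', '2021-04-19 23:57']),
    'Open': [55803.5, 55624.69, 55690.64, 55777.86, 55690.64],
})

clean = (
    raw.drop_duplicates(subset='Date')   # asfreq needs unique index labels
       .set_index('Date')
       .sort_index()
       .asfreq('1min')                   # attach a fixed 1-minute frequency
)

print(clean.index.freq)            # the frequency now stored on the index
print(clean['Open'].isna().sum())  # gaps introduced by the regular grid, if any
```

If the second number is not zero, the regular grid exposed missing timestamps; seasonal_decompose does not handle missing values, so they would need to be filled or dropped first.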
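For the decomposition we also have to choose the model (it is additive by default). The model can be set to additive or multiplicative. A rule of thumb for choosing the right one is to look at a plot and check whether the trend and the seasonal variation stay relatively constant over time, in other words, whether they are linear. If they do, choose the additive model. Otherwise, if the trend and the seasonal variation grow or shrink over time, use the multiplicative model. This means that before running seasonal_decompose we should plot the preprocessed data and look for any trend or cycle.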
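The rule of thumb above can also be checked numerically instead of by eye. The sketch below compares the spread of the last cycle with the spread of the first one on two synthetic series; cycle_range_ratio is a hypothetical helper written for this illustration, and the series are made up.

```python
import numpy as np

def cycle_range_ratio(values, period):
    """Spread (max - min) of the last `period` points divided by that of the first."""
    x = np.asarray(values, dtype=float)
    first, last = x[:period], x[-period:]
    return (last.max() - last.min()) / (first.max() - first.min())

t = np.arange(120)
trend = 10 + 0.5 * t
season = np.sin(2 * np.pi * t / 12)

additive = trend + 3 * season               # seasonal swing stays the same size
multiplicative = trend * (1 + 0.3 * season) # seasonal swing grows with the level

ratio_add = cycle_range_ratio(additive, 12)
ratio_mul = cycle_range_ratio(multiplicative, 12)
print(round(ratio_add, 3), round(ratio_mul, 3))
```

A ratio close to 1 suggests an additive model, while a ratio well above or below 1 points towards a multiplicative one. Note also that statsmodels rejects model='multiplicative' when the series contains zero or negative values, which applies to the 0 entries in the test data above.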
Finally, we can run it without errors.
Another error we might see is TypeError: Index(...) must be called with a collection of some kind, 'seasonal' was passed. This again comes from using seasonal_decompose incorrectly, for example like this
df_bt_decomp = seasonal_decompose(df_bt[['Open','High']],period=1) ## this is wrong because two columns are passed together; both are value columns, and neither is an index.
https://stackoverflow.com/questions/71342080