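I want to turn my code for a single time series into an automated version that can be applied to many time series (my data consists of monthly series). For a single series, my general approach is to remove the seasonal component and then difference the series to achieve stationarity. I then use auto.arima to obtain the ARIMA parameters and use those parameters to build an ARIMA model with the original time series data. Finally, I forecast 4 months, compare the forecast with the actual data (which I held out beforehand), and calculate the RMSE. Since I cannot share my actual data, I simply generate a random time series and test set as an example; the results are, of course, not meaningful.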
library('forecast')
set.seed(123)
# create random time series and 4 months testing data
ts <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
test.data <- runif(4, min = 50, max = 3000)
# Decompose
comp.ts = decompose(ts)
# subtract the seasonal component
ts2 <- ts - comp.ts$seasonal
ts2 <- diff(ts2, differences=1)
auto.arima(ts2, trace = T, seasonal = TRUE,ic = 'aicc', max.p = 10,max.q = 10,max.P = 10,max.Q = 10,max.d = 10, stepwise = F)
# Use auto.arima outcome as input
my.arima <- Arima(ts2, order=c(0,0,0),seasonal = list(order = c(0,1,0), period = 12),method="ML", include.drift = F)
# Forecast and calculate RMSE
data.forecast <- forecast(my.arima, h=4, level=c(99.5))
my.difference <- test.data - data.forecast$mean
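my.rmse <- (sum(sqrt(my.difference^2)))/length(my.difference)
Since my actual data set contains more than 500 time series, I need to automate the whole process. Unfortunately, I have not worked with time series in R before, so I am stuck on how to automate it.
Assume 4 random time series and 4 random test sets. How can I build an automated process for these series (one I can also use for my 500+ real series) that does exactly the same as above?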
ts1 <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
ts2 <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
ts3 <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
ts4 <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
test.data1 <- runif(4, min = 50, max = 3000)
test.data2 <- runif(4, min = 50, max = 3000)
test.data3 <- runif(4, min = 50, max = 3000)
test.data4 <- runif(4, min = 50, max = 3000)
Thanks for your help!
Posted on 2019-07-12 08:52:22
Just wrap your workflow in a function.
serialArima <- function(ts, test.data) {
  library(forecast)
  # Decompose
  comp.ts <- decompose(ts)
  # Subtract the seasonal component, then difference once
  ts2 <- ts - comp.ts$seasonal
  ts2 <- diff(ts2, differences=1)
  auto.arima(ts2, trace=TRUE, seasonal=TRUE, ic='aicc', max.p=0, max.q=0, max.P=0,
             max.Q=0, max.d=0, stepwise=FALSE)
  # Use auto.arima outcome as input
  my.arima <- Arima(ts2, order=c(0, 0, 0),
                    seasonal=list(order=c(0, 1, 0), period=2),
                    method="ML", include.drift=FALSE)
  # Forecast and calculate RMSE
  # (as written, sum(sqrt(d^2))/n is the mean absolute error;
  # sqrt(mean(d^2)) would give the RMSE)
  data.forecast <- forecast(my.arima, h=4, level=c(99.5))
  my.difference <- test.data - data.forecast$mean
  my.rmse <- (sum(sqrt(my.difference^2)))/length(my.difference)
  return(list(data.forecast=data.forecast, my.difference=my.difference, my.rmse=my.rmse))
}
Single application
serialArima(ts, test.data)
# ARIMA(0,0,0) with zero mean : 82.45803
# ARIMA(0,0,0) with non-zero mean : 88.13593
#
#
#
# Best model: ARIMA(0,0,0) with zero mean
#
# $data.forecast
#         Point Forecast    Lo 99.5  Hi 99.5
# 2020.00      -349.1424  -2595.762 1897.477
# 2020.50       772.6014  -1474.018 3019.221
# 2021.00      -349.1424  -3526.342 2828.057
# 2021.50       772.6014  -2404.598 3949.801
#
# $my.difference
# Time Series:
# Start = c(2020, 1)
# End = c(2021, 2)
# Frequency = 2
# [1] 1497.2446 840.4139 2979.4553 993.5614
#
# $my.rmse
# [1] 1577.669
Multiple application
Map(serialArima, list(ts1, ts2, ts3, ts4),
    list(test.data1, test.data2, test.data3, test.data4))
https://stackoverflow.com/questions/56966778
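With 500+ series it is probably easier to keep the series and test sets in named lists than in separate ts1, ts2, ... variables, and to pull one number per series out of the Map() result afterwards. The following is only a sketch of that pattern. meanForecast is a hypothetical stand-in for serialArima so the example runs in base R without the forecast package, and the list names (series1, ...) are made up for illustration. The stand-in computes the RMSE as sqrt(mean(d^2)); the expression sum(sqrt(d^2))/n used in the question is the mean absolute error.

```r
set.seed(123)

# Hypothetical stand-in for serialArima(): forecasts every test point with the
# mean of the training series and returns the errors in a list.
# It exists only so this sketch runs in base R.
meanForecast <- function(series, test.data) {
  my.difference <- test.data - mean(series)
  list(my.difference = my.difference,
       my.rmse = sqrt(mean(my.difference^2)))
}

# Keep all series and test sets in named lists instead of ts1, ts2, ...
ids <- paste0("series", 1:4)
series.list <- setNames(
  lapply(ids, function(i) ts(runif(26, min = 50, max = 3000),
                             start = c(2017, 1), frequency = 12)),
  ids)
test.list <- setNames(
  lapply(ids, function(i) runif(4, min = 50, max = 3000)),
  ids)

# Map() pairs the i-th series with the i-th test set; tryCatch() makes a
# failing series return NULL instead of aborting the whole run.
results <- Map(function(s, tst) tryCatch(meanForecast(s, tst),
                                         error = function(e) NULL),
               series.list, test.list)

# Collect one RMSE per series into a named numeric vector (NA on failure).
rmse <- sapply(results, function(r) if (is.null(r)) NA_real_ else r$my.rmse)
print(rmse)
```

Swapping meanForecast for serialArima in the Map() call should give the same structure, since both return a list with a my.rmse element.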