The package mrf provides access to a univariate time series forecasting method. The method uses a redundant Haar wavelet transform to decompose a given time series into wavelet and smooth coefficients. The obtained wavelet and smooth coefficients are then processed in a specific prediction scheme to compute one-step forecasts; the scheme can be driven by an autoregression or a neural network. Multi-step forecasts are created recursively from one-step forecasts. A rolling forecasting origin (a type of cross-validation designed for time series forecasting) can be used to assess the performance of a specific model. To select the best model out of many candidate models, a model selection using cross-validation or nested cross-validation can be used. The best model is chosen with Akaike's information criterion (AIC), the Mean Absolute Error (MAE), or the Mean Root Error (MRE). The input space of the model selection function is searched with an evolutionary optimization method.
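To make the decomposition step concrete, a minimal sketch of a redundant (à trous) Haar transform is given below. The function redundant_haar, its boundary handling, and the exact averaging convention are assumptions of this illustration, not the package's implementation; the package's own decomposition is shown later via mrf::wavelet_decomposition.
# Minimal sketch of a redundant (a trous) Haar decomposition (illustration only,
# assuming the standard convention): smooth level j averages the current value
# with the value 2^(j-1) steps in the past, and wavelet level j is the difference
# between the previous and the current smooth level. Boundary values are copied.
redundant_haar = function(x, levels = 2) {
  n       = length(x)
  smooth  = matrix(NA_real_, nrow = levels, ncol = n)
  wavelet = matrix(NA_real_, nrow = levels, ncol = n)
  current = x
  for (j in 1:levels) {
    lag         = 2^(j - 1)
    next_smooth = current
    idx         = (lag + 1):n
    next_smooth[idx] = 0.5 * (current[idx] + current[idx - lag])
    wavelet[j, ] = current - next_smooth  # detail lost between the two smooth levels
    smooth[j, ]  = next_smooth
    current      = next_smooth
  }
  # By construction, x = smooth[levels, ] + colSums(wavelet) holds exactly.
  list(WaveletCoefficients = wavelet, SmoothCoefficients = smooth)
}
# Example usage (hypothetical data): redundant_haar(rnorm(100), levels = 2)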
In the following example, the seasonal univariate entsoe dataset is loaded and visualized:
library(mrf)
data("entsoe", package="mrf")
UnivariateData = entsoe$value
plot(UnivariateData, type = "l", xlab="Days", ylab="Electricity Demand MW/h", col = 4)
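The dataset is described above as seasonal. As a quick, generic check of this claim (this step is an assumption of this walkthrough, not part of the package workflow), an autocorrelation plot from base R can be used; recurring peaks at a fixed lag indicate seasonality.
# Generic seasonality check with base R (not part of the mrf workflow):
acf(UnivariateData, lag.max = 30, main = "Autocorrelation of the entsoe data")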
The decomposition of the entsoe time series is shown for two levels. All wavelet levels, but only the last smooth level, are used for further computation.
dec = mrf::wavelet_decomposition(UnivariateData = UnivariateData, Aggregation = c(2,4))
plot(dec$WaveletCoefficients[1,2:length(dec$SmoothCoefficients[1,])], type = "l", main = "Wavelet level 1", xlab="Days", ylab="Electricity Demand MW/h", col = 4)
plot(dec$WaveletCoefficients[2,4:length(dec$WaveletCoefficients[1,])], type = "l", main = "Wavelet level 2", xlab="Days", ylab="Electricity Demand MW/h", col = 4)
plot(dec$SmoothCoefficients[2,4:length(dec$SmoothCoefficients[1,])], type = "l", main = "Last smooth part level", xlab="Days", ylab="Electricity Demand MW/h", col = 4)
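As a sanity check of the decomposition shown above, its additive structure can be verified. This assumes (up to boundary effects) that the package follows the usual convention in which each wavelet level is the difference between successive smooth levels, so that summing all wavelet levels and the last smooth level reproduces the original series.
# Sanity check (under the additivity assumption stated above):
reconstructed = colSums(dec$WaveletCoefficients) + dec$SmoothCoefficients[2, ]
summary(abs(reconstructed - UnivariateData))  # expected to be (near) zero away from the boundary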
Multiresolution forecasts can be obtained by generating a model on training data and using the forecast functionality to create forecasts for different horizons. In the following, two examples are presented, one showing a one-step forecast, the other a multi-step forecast for horizon 2.
len_data = length(UnivariateData)
Train1 = UnivariateData[1:(len_data-1)]
Test1 = UnivariateData[len_data]
# One-step forecast (Multiresolution Forecast)
model1 = mrf_train(Train1)
one_step = mrf_forecast(model1, Horizon=1)
Error1 = one_step$Forecast - Test1
Train2 = UnivariateData[1:(len_data-2)]
Test2 = UnivariateData[(len_data-1):len_data]
# Multi-step forecast (Multiresolution Forecast)
# Horizon = 2 => Forecast with Horizon 1 and 2 as vector
model2 = mrf_train(Train2, Horizon=2)
multi_step = mrf_forecast(model2, Horizon=2)
Error2 = multi_step$Forecast - Test2
In order to obtain robust forecasting models, multiple models are evaluated over a large sample space. This can be done with a rolling forecasting origin, a cross-validation approach specifically designed for time series forecasting. Cross-validation repeatedly partitions the data into training and test sets. Time series data cannot be split arbitrarily: the training set always consists of observations occurring prior in time to the test set, and the training set must not be too short. All training data up to a specific time point, the origin, is used to create a forecast. In the following example, forecasts for horizon 2 (a one-step forecast plus a multi-step forecast for horizon 2) are computed. Such forecasts are computed here, as an example, for 3 days. An autoregression is used to compute the forecasts. The parameters Aggregation and CoefficientCombination are the result of a model selection done in experiments presented later.
CoefficientCombination = c(10,10,10)
Aggregation = c(2,4)
rw_forecasts = mrf::mrf_rolling_forecasting_origin(UnivariateData,
                                                   CoefficientCombination,
                                                   Aggregation, Horizon = 2,
                                                   Window = 3,
                                                   Method = "r",
                                                   NumClusters = 1)
Error = rw_forecasts$Error
Forecast = rw_forecasts$Forecast
MAE = sum(abs(Error))/(dim(Error)[1] * dim(Error)[2]) # Mean Absolute Error MAE