The mrf package

Quirin Stier

2022-02-23

The package mrf provides access to a univariate time series forecasting method. This forecasting method uses a redundant Haar wavelet transform to decompose a given time series in wavelet and smooth coefficients. The obtained wavelet and smooth coefficients are processed in a specific prediction scheme in order to compute one step forecasts. This scheme can be computed with an autoregression or a neural network. Multi step forecasts are created recursively using the one step forecast. A rolling forecasting origin (special type of cross validation designed for time series forecasting) can be used to assess the performance of a specific model. In order to assess a best model out of many possible models, a model selection using cross validation or nested cross validation can be used. The best model is chosen with Akaikes information criterion (AIC), the Mean Absolute Error (MAE) or the Mean Root Error (MRE). The input space in the model selection function is searched for with an evolutionary optimization method.

In the following example, the seasonal univariate entsoe dataset is loaded and visualized:

#library(mrf)
data("entsoe", package="mrf")
UnivariateData = entsoe$value
plot(UnivariateData, type = "l", xlab="Days", ylab="Electricity Demand MW/h", col = 4)

Decomposition of time series “ENTSOE”

The decomposition of the time series “ENTSOE” is shown for two levels. All wavelet levels and only the last smooth part level is used for further computation.

dec = mrf::wavelet_decomposition(UnivariateData = UnivariateData, Aggregation = c(2,4))
plot(dec$WaveletCoefficients[1,2:length(dec$SmoothCoefficients[1,])], type = "l", main = "Wavelet level 1", xlab="Days", ylab="Electricity Demand MW/h", col = 4)

plot(dec$WaveletCoefficients[2,4:length(dec$WaveletCoefficients[1,])], type = "l", main = "Wavelet level 2", xlab="Days", ylab="Electricity Demand MW/h",  col = 4)

plot(dec$SmoothCoefficients[2,4:length(dec$SmoothCoefficients[1,])], type = "l", main = "Last smooth part level", xlab="Days", ylab="Electricity Demand MW/h",  col = 4)

Multiresolution Forecasting

Multiresolution forecasts can be obtained by generating a model on training data and using the forecast functionality to create forecasts for different horizons. In the following, two examples are presented, one showing a one-step forecast, the other a multi-step forecast for horizon 2.

One-step forecasts:

len_data = length(UnivariateData)
Train1    = UnivariateData[1:(len_data)]
Test1     = UnivariateData[len_data]
# One-step forecast (Multiresolution Forecast)
model1   = mrf_train(Train1)
one_step = mrf_forecast(model1, Horizon=1)
Erro1    = one_step$Forecast - Test1

Multi-step forecasts:

Train2    = UnivariateData[1:(len_data-2)]
Test2     = UnivariateData[(len_data-1):len_data]
# Multi-step forecast (Multiresolution Forecast)
# Horizon = 2 => Forecast with Horizon 1 and 2 as vector
model2    = mrf_train(Train2, Horizon=2)
multi_step = mrf_forecast(model2, Horizon=2)
Error2     = multi_step$Forecast - Test2

Evaluating time series forecasting models with cross validation

In order to obtain robust forecasting models, multiple models are evaluated over a large sample space. This can be done with a rolling forecasting origin - a cross validation approach specially designed for time series forecasting. Cross validation is the partitioning into training and test data multiple times. Time series data cannot be split arbitrarily. The training set always consists of observations occurring in time prior to the test set. The training set must not be too short. All training data up to a specific time point - the origin - is used to create a forecast. In the following example, forecasts for horizon 2 (one-step forecast + multi-step forecast for horizon = “) are computed. Such forecasts are computed here as example for 3 days. The autoregression is used to compute the forecasts. The parameters Aggregation and CoefficientCombination are the results of a model selection done in experiments as presented later.

Rolling forecasting origin

CoefficientCombination = c(10,10,10)
Aggregation = c(2,4)
rw_forecasts = mrf::mrf_rolling_forecasting_origin(UnivariateData,
                                                   CoefficientCombination,
                                                   Aggregation,
                                                   Horizon = 2,
                                                   Window = 3,
                                                   Method = "r",
                                                   NumClusters = 1)
Error = rw_forecasts$Error
Forecast = rw_forecasts$Forecast
MAE = sum(abs(Error))/(dim(Error)[1] * dim(Error)[2]) # Mean Absolute Error