Automatic Selection and Ensembling

Introduction

The issm_modelspec() function is the primary entry point for modeling in the tsissm package. It defines a model specification with support for automatic selection across multiple configurations. In addition to selecting the single best model (e.g., by AIC), it can optionally return the top N ranked models, enabling downstream model ensembling for filtering, forecasting and simulation.

This vignette demonstrates that functionality using U.S. Advanced Retail Sales data available in the package.

Advanced Retail Sales

This dataset represents the monthly, non-seasonally adjusted advance estimate of retail trade sales, based on a sub-sample of firms from the larger Monthly Retail Trade Survey.

As shown in the plot below, the series exhibits two prominent structural breaks: one around 2008 during the Global Financial Crisis, and another during the COVID-19 pandemic. To address this, we use the auto_regressors() function from the tsaux package, which detects and encodes three types of anomalies: additive outliers, temporary changes, and structural breaks (see this blog post for details).

Modeling such anomalies is important. When ignored, they are often absorbed by other components of the model, leading to biased coefficient estimates and inflated forecast variance.

Code

data("us_retail_sales")
opar <- par(mfrow = c(1,1))
y <- as.xts(us_retail_sales)
par(mar = c(2,2,2,2))
plot(as.zoo(y), ylab = "Sales (Mil's S)", xlab = "", main = "Advance Retail Sales")
grid()

Code

par(opar)

Model Selection

For this demonstration, we reserve the last 38 months of data for forecasting evaluation and select the top 4 models for ensembling.

Code

train <- y["/2021"]
test <- y["2022/"]
lambda_pre_estimate <- box_cox(lambda = NA)$transform(train) |> attr("lambda")
xreg <- auto_regressors(train, frequency = 12, lambda = lambda_pre_estimate, sampling = "months", h = nrow(test), method = "sequential", 
                        check.rank = TRUE, discard.cval = 3.5, maxit.iloop = 10, maxit.oloop = 10, 
                        forc_dates = index(test))
spec <- issm_modelspec(train, auto = TRUE, slope = c(TRUE, FALSE), seasonal = TRUE, 
                             seasonal_harmonics = list(c(3,4,5)), xreg = xreg$xreg[index(train), ], 
                             seasonal_frequency = 12, ar = 0:2, ma = 0:2, lambda = lambda_pre_estimate, top_n = 4)
mod <- spec |> estimate()
#> Warning in eval_f(x, ...): NaN or Infinite value in likelihood
#> Warning in eval_f(x, ...): NaN or Infinite value in likelihood
#> Warning in eval_f(x, ...): NaN or Infinite value in likelihood
#> Warning in eval_f(x, ...): NaN or Infinite value in likelihood
#> Warning in eval_f(x, ...): NaN or Infinite value in likelihood

We first inspect the top selected model:

Code

mod$models[[1]] |> summary() |> as_flextable()
#> Warning in sqrt(diag(V)): NaNs produced

ISSM Model: AAA(12{5})+ARMA(1,1)+X(8)
	Estimate	Std. Error	t value	Pr(>\|t\|)
$\alpha$	0.3122	0.0388	8.0506	0.0000	***
$\beta$	0.0000
$\gamma^{12}_{1}$	-0.0035	0.0027	-1.2960	0.1950
$\gamma^{12}_{2}$	0.0036	0.0015	2.3521	0.0187	*
$\theta_{1}$	-0.9900	0.0123	-80.5804	0.0000	***
$\psi_{1}$	0.9339	0.0379	24.6326	0.0000	***
$\kappa_{1}$	-0.8377	0.4109	-2.0386	0.0415	*
$\kappa_{2}$	-2.7068	0.4216	-6.4201	0.0000	***
$\kappa_{3}$	-5.1388	0.4180	-12.2925	0.0000	***
$\kappa_{4}$	-1.7848	0.4255	-4.1941	0.0000	***
$\kappa_{5}$	3.0853	0.4538	6.7985	0.0000	***
$\kappa_{6}$	3.6757	0.4621	7.9537	0.0000	***
$\kappa_{7}$	-2.7213	0.3464	-7.8564	0.0000	***
$\kappa_{8}$	2.2838	0.3750	6.0894	0.0000	***
Signif. codes: 0 '*' 0.001 '' 0.01 '*' 0.05 '.' 0.1 ' ' 1
sigma^2: 0.4512
LogLik: -3614.342
AIC: 3672 \| BIC: 7399
DAC : 84 \| MAPE : 1.5
Model Equation
$y^{\omega}_t=w'x_{t-1} + \sum_{j=1}^{8} \kappa_j \xi_{j,t} + \varepsilon, \varepsilon \sim N\left(0,\sigma\right)$
$x_t = Fx_{t-1}+g\varepsilon_{t}$

This model uses 5 harmonics and an ARMA(2,1) structure. Most coefficients appear statistically significant. We can also summarize the top 4 selected models:

Code

mod$table |> kable()

iter	slope	slope_damped	seasonal	ar	ma	Seasonal12	variance	distribution	lambda	AIC	MAPE
45	TRUE	FALSE	TRUE	1	1	5	constant	norm	0.2521608	3672.342	0.0151402
51	TRUE	FALSE	TRUE	1	2	5	constant	norm	0.2521608	3674.406	0.0150257
47	TRUE	FALSE	TRUE	2	1	5	constant	norm	0.2521608	3677.111	0.0153474
53	TRUE	FALSE	TRUE	2	2	5	constant	norm	0.2521608	3677.701	0.0151564

All top models use 5 harmonics. The main differences lie in the ARMA specification, and one model does not include a slope term. The pairwise correlation among model predictions is very high — expected, given the small structural differences:

Code

mod$correlation |> kable()

	model_1	model_2	model_3	model_4
model_1	1.0000000	0.9946482	0.9870880	0.9934973
model_2	0.9946482	1.0000000	0.9816921	0.9930069
model_3	0.9870880	0.9816921	1.0000000	0.9884887
model_4	0.9934973	0.9930069	0.9884887	1.0000000

Prediction and Ensembling

The object returned by estimate() when top_n > 1 is of class tsissm.selection. This class supports tsfilter(), predict(), and simulate() methods, which apply operations to each retained model, returning individual results. These can then be ensembled using the tsensemble() function.

Below, we generate forecasts from the top model and from all 4 models, and then ensemble them using equal weights:

Code

p_top <- mod$models[[1]] |> predict(h = nrow(test), seed = 200, nsim = 4000, newxreg = xreg$xreg[index(test),])
p_all <- mod |> predict(h = nrow(test), seed = 200, nsim = 4000, newxreg = xreg$xreg[index(test),])
p_ensemble <- p_all |> tsensemble(weights = rep(1/4,4))

We visualize the top model’s forecast and overlay the ensembled forecast distribution. The actual data is also shown for reference.

Code

opar <- par(mfrow = c(1,1))
par(mar = c(2,2,2,2))
p_top |> plot(n_original = 50, gradient_color =  "aliceblue", interval_type = 2, interval_color = "steelblue", interval_width = 1, median_width = 1.5, xlab = "")
p_ensemble$distribution |> plot(gradient_color =  "aliceblue", interval_type = 3, interval_color = "green", interval_width = 1, median_width = 1.5, 
                                median_type = 1, median_color = "green", add = TRUE)
lines(index(test), as.numeric(test), col = 2, lwd = 1.5)
legend("topleft", c("Historical","Actual (Forecast)", "Top (Forecast)", "Ensemble (Forecast)"), col = c("red","red","black","green"), lty = c(1,1.5,1.5,1.5), bty = "n")

Code

par(opar)

Note: You can overlay multiple predictive distributions using the add = TRUE argument in plot() for class tsmodel.distribution.

The following table compares performance metrics between the top model and the ensemble forecast. The ensemble consistently outperforms the top model across all metrics:

Code

tab <- rbind(tsmetrics(p_top, actual = test, alpha = 0.1),
             tsmetrics(p_ensemble, actual = test, alpha = 0.1))
rownames(tab) <- c("Top","Ensemble")
tab |> kable()

	h	MAPE	MASE	MSLRE	BIAS	MIS	CRPS
Top	38	0.0209576	0.7466572	0.0007027	-8692.113	84545.86	9228.121
Ensemble	38	0.0204617	0.7295890	0.0006565	-7274.586	83071.95	8969.030

Conclusion

The tsissm package provides a flexible and robust framework for modeling, forecasting, and simulating time series data using the innovations state space model. By supporting full model enumeration, automatic selection, and simulation-based prediction, it offers a rich toolkit for handling complex seasonal patterns and heteroscedasticity, and the availability of regressors allows the handling of identified outliers and structural breaks.

This vignette demonstrated how to apply automatic model selection with support for retaining the top N models and performing ensemble forecasting. By leveraging both predictive distributions and weighted ensembling, users can obtain more stable and accurate forecasts — particularly valuable when models are structurally similar but individually uncertain.