Filtering and Prediction

Code
library(tsissm)
library(tsaux)
library(xts)
library(data.table)
library(zoo)

Introduction

Online filtering refers to the process of sequentially updating the estimated states of a time series model as new observations become available, without re-estimating the model parameters. This is a real-time inference technique where internal components of the model—such as level, trend, seasonal effects, or other latent states—are updated dynamically using the most recent data, while holding model parameters fixed.

This functionality is especially useful for real-time forecasting, where updated h-step ahead forecasts are generated as new data arrives, avoiding the computational cost of full re-estimation.

Demonstration

We’ll use the U.S. Advance Monthly Retail Sales dataset, which captures monthly, non-seasonally adjusted sales figures based on a sub-sample of firms from the broader Monthly Retail Trade Survey.

Estimation

Code
data("us_retail_sales")
y <- as.xts(us_retail_sales)
train <- y["/2023"]
test <- y["2024/"]
lambda_pre_estimate <- box_cox(lambda = NA)$transform(train) |> attr("lambda")
xreg <- auto_regressors(train, frequency = 12, lambda = lambda_pre_estimate, sampling = "months", h = 14, method = "sequential", 
                        check.rank = TRUE, discard.cval = 3.5, maxit.iloop = 10, maxit.oloop = 10, 
                        forc_dates = index(test))
spec <- issm_modelspec(train, slope = TRUE, seasonal = TRUE, seasonal_frequency = 12,
                       seasonal_harmonics = 5, ar = 1, ma = 0, lambda = lambda_pre_estimate, 
                       xreg = xreg$xreg[index(train),])
spec$parmatrix[grepl("^kappa", parameters), initial := xreg$init]
mod <- spec |> estimate()

Rolling Forecasts via Online Filtering

With the model estimated, we can now generate a rolling 1-step ahead forecast without re-estimation. This is similar to the functionality in tsbacktest, but online filtering allows for real-time updates to the model states.

Code
test_xreg <- xreg$xreg[index(test),]
predict_list <- vector(mode = "list", length = 14)
predict_list[[1]] <- predict(mod, h = 1, newxreg = test_xreg[1,])
new_mod <- mod
for (i in 2:14) {
    new_mod <-  tsfilter(new_mod, y = test[i - 1], newxreg = test_xreg[i - 1, ])
    predict_list[[i]] <- predict(new_mod, h = 1, newxreg = test_xreg[i,])
}
forecast_distribution <- do.call(cbind, lapply(predict_list, function(x) x$distribution))
forecast_mean <- do.call(rbind, lapply(predict_list, function(x) x$analytic_mean))
class(forecast_distribution) <- "tsmodel.distribution"
test_transformed <- cbind(fitted(new_mod)[index(test)], forecast_mean)
colnames(test_transformed) <- c("Filtered","Forecast")
head(test_transformed)
#>            Filtered Forecast
#> 2024-01-31 571622.1 571665.8
#> 2024-02-29 551434.4 551477.9
#> 2024-03-31 612547.6 612591.8
#> 2024-04-30 600184.3 600228.4
#> 2024-05-31 633269.2 633313.6
#> 2024-06-30 616013.2 616057.5

Filtering vs Forecasting at Horizon ( h = 1 )

You may notice a small discrepancy between the 1-step ahead filtered values (Filtered) and the corresponding forecasts (Forecast). This difference is not due to rounding, but rather to a bias adjustment applied during the back-transformation from the Box-Cox scale in forecasting.

The filtered value reflects the “best guess of the latent process given all information up to (t-1), whereas the forecast represents the expected value of the original variable on its natural scale.

To confirm the two are the same on the transformed scale:

Code
p <- tsmoments(mod, h = 1, newxreg = test_xreg[1,], transform = FALSE)
tmp_mod <-  tsfilter(mod, y = test[1], newxreg = test_xreg[1, ])
fitted_value <- tail(tmp_mod$spec$target$y, 1) - tail(tmp_mod$model$error,1)
all.equal(p$mean, fitted_value)
#> [1] TRUE

which are exactly the same in the transformed space.

The tsfilter method takes in new information (it need not be a scalar), and updates the information set in the estimated model, after checking for any duplicate time indices in the submitted data, or time indices before the last time index. Thus the returned object is just an updated tsissm.estimate class object.

Visual Comparison

We now compare the original fitted values with the filtered and forecasted values.

Code
filtered_model <- fitted(new_mod)
original_model <- fitted(mod)
forecast_model <- forecast_mean
Z <- cbind(original_model, filtered_model, forecast_model)
matplot(index(tail(Z,50)), coredata(tail(Z,50)), type = c("l","l","p"), col = c("black","orange","steelblue"), pch = c(1,1,1), lty = c(1,2,1), ylab = "", xlab = "", lwd = c(1,1.5, 1.8))
legend("topleft", c("Fitted","Filtered","Forecast"), col = c("black","orange","steelblue"), lty = c(1,1,NA), lwd = c(1,1.5, 1.8), pch = c(NA,NA,1), bty = "n")

Conclusion

This vignette demonstrates how online filtering can be used to sequentially update the internal state of a time series model without re-estimating parameters. This enables fast, real-time updates and 1-step ahead forecasts.