--- title: "Filtering and Prediction" output: rmarkdown::html_vignette: css: custom.css code_folding: hide toc: yes vignette: > %\VignetteIndexEntry{Filtering and Prediction} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup, warning=FALSE,message=FALSE} library(tsissm) library(tsaux) library(xts) library(data.table) library(zoo) ``` ## Introduction Online filtering refers to the process of sequentially updating the estimated states of a time series model as new observations become available, without re-estimating the model parameters. This is a real-time inference technique where internal components of the model—such as level, trend, seasonal effects, or other latent states—are updated dynamically using the most recent data, while holding model parameters fixed. This functionality is especially useful for real-time forecasting, where updated h-step ahead forecasts are generated as new data arrives, avoiding the computational cost of full re-estimation. ## Demonstration We'll use the U.S. Advance Monthly Retail Sales dataset, which captures monthly, non-seasonally adjusted sales figures based on a sub-sample of firms from the broader Monthly Retail Trade Survey. ### Estimation ```{r} data("us_retail_sales") y <- as.xts(us_retail_sales) train <- y["/2023"] test <- y["2024/"] lambda_pre_estimate <- box_cox(lambda = NA)$transform(train) |> attr("lambda") xreg <- auto_regressors(train, frequency = 12, lambda = lambda_pre_estimate, sampling = "months", h = 14, method = "sequential", check.rank = TRUE, discard.cval = 3.5, maxit.iloop = 10, maxit.oloop = 10, forc_dates = index(test)) spec <- issm_modelspec(train, slope = TRUE, seasonal = TRUE, seasonal_frequency = 12, seasonal_harmonics = 5, ar = 1, ma = 0, lambda = lambda_pre_estimate, xreg = xreg$xreg[index(train),]) spec$parmatrix[grepl("^kappa", parameters), initial := xreg$init] mod <- spec |> estimate() ``` ### Rolling Forecasts via Online Filtering With the model estimated, we can now generate a rolling 1-step ahead forecast without re-estimation. This is similar to the functionality in `tsbacktest`, but online filtering allows for real-time updates to the model states. ```{r} test_xreg <- xreg$xreg[index(test),] predict_list <- vector(mode = "list", length = 14) predict_list[[1]] <- predict(mod, h = 1, newxreg = test_xreg[1,]) new_mod <- mod for (i in 2:14) { new_mod <- tsfilter(new_mod, y = test[i - 1], newxreg = test_xreg[i - 1, ]) predict_list[[i]] <- predict(new_mod, h = 1, newxreg = test_xreg[i,]) } forecast_distribution <- do.call(cbind, lapply(predict_list, function(x) x$distribution)) forecast_mean <- do.call(rbind, lapply(predict_list, function(x) x$analytic_mean)) class(forecast_distribution) <- "tsmodel.distribution" test_transformed <- cbind(fitted(new_mod)[index(test)], forecast_mean) colnames(test_transformed) <- c("Filtered","Forecast") head(test_transformed) ``` ### Filtering vs Forecasting at Horizon ( h = 1 ) You may notice a small discrepancy between the 1-step ahead filtered values (Filtered) and the corresponding forecasts (Forecast). This difference is not due to rounding, but rather to a bias adjustment applied during the back-transformation from the Box-Cox scale in forecasting. - Filtering computes the conditional expectation of the transformed variable and inverts it without bias correction. - Forecasting returns the unbiased expectation on the original scale by applying a second-order bias correction. The filtered value reflects the "best guess of the latent process given all information up to (t-1), whereas the forecast represents the expected value of the original variable on its natural scale. To confirm the two are the same on the transformed scale: ```{r} p <- tsmoments(mod, h = 1, newxreg = test_xreg[1,], transform = FALSE) tmp_mod <- tsfilter(mod, y = test[1], newxreg = test_xreg[1, ]) fitted_value <- tail(tmp_mod$spec$target$y, 1) - tail(tmp_mod$model$error,1) all.equal(p$mean, fitted_value) ``` which are exactly the same in the transformed space. The `tsfilter` method takes in new information (it need not be a scalar), and updates the information set in the estimated model, after checking for any duplicate time indices in the submitted data, or time indices before the last time index. Thus the returned object is just an updated `tsissm.estimate` class object. ### Visual Comparison We now compare the original fitted values with the filtered and forecasted values. ```{r, fig.width=6,fig.height=4} filtered_model <- fitted(new_mod) original_model <- fitted(mod) forecast_model <- forecast_mean Z <- cbind(original_model, filtered_model, forecast_model) matplot(index(tail(Z,50)), coredata(tail(Z,50)), type = c("l","l","p"), col = c("black","orange","steelblue"), pch = c(1,1,1), lty = c(1,2,1), ylab = "", xlab = "", lwd = c(1,1.5, 1.8)) legend("topleft", c("Fitted","Filtered","Forecast"), col = c("black","orange","steelblue"), lty = c(1,1,NA), lwd = c(1,1.5, 1.8), pch = c(NA,NA,1), bty = "n") ``` ## Conclusion This vignette demonstrates how online filtering can be used to sequentially update the internal state of a time series model without re-estimating parameters. This enables fast, real-time updates and 1-step ahead forecasts.