---
title: "Nowcasting revisions using the Jacobs-Van Norden model"
output: rmarkdown::html_vignette
bibliography: references.bib
biblio-style: apalike
link-citations: true
vignette: >
  %\VignetteIndexEntry{Nowcasting revisions using the Jacobs-Van Norden model}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

rebuild_vignette_results <- identical(
  tolower(Sys.getenv("REVISER_REBUILD_VIGNETTE_RESULTS")),
  "true"
)

vignette_result_path <- function(name) {
  candidates <- c(
    file.path("vignettes", "precomputed", name),
    file.path("precomputed", name)
  )
  existing <- candidates[file.exists(candidates)]
  if (length(existing) > 0) {
    existing[[1]]
  } else {
    candidates[[1]]
  }
}

load_or_build_vignette_result <- function(name, builder) {
  path <- vignette_result_path(name)
  if (!rebuild_vignette_results) {
    if (!file.exists(path)) {
      stop("Missing precomputed vignette result: ", basename(path))
    }
    return(readRDS(path))
  }

  result <- builder()
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(result, path)
  result
}
```

This vignette describes the Jacobs-Van Norden (JVN) revision model as
implemented in `reviser::jvn_nowcast()`. The presentation follows the same
Durbin-Koopman state-space notation used in the KK vignette: observations are
linked to latent states through \(Z\), state dynamics through \(T\), and
innovations through \(R\), \(H\), and \(Q\)
[@durbinTimeSeriesAnalysis2012].

The key idea of the JVN framework is that revision errors are not treated as a
single residual. Instead, they are decomposed into **news** and **noise**.
News corresponds to genuinely new information incorporated by later releases,
whereas noise corresponds to transitory measurement error that is corrected in
subsequent vintages [@jacobsModelingDataRevisions2011].


## Revision decomposition

Let \(l\) denote the number of vintages used in the model and let
\(y_t^{t+j}\) be the estimate for reference period \(t\) available in vintage
\(t+j\). Stack the vintages into

$$
y_t =
\begin{bmatrix}
y_t^{t+1} \\
y_t^{t+2} \\
\vdots \\
y_t^{t+l}
\end{bmatrix}.
$$

Let \(\tilde y_t\) denote the latent "true" value and let \(\iota_l\) be an
\(l \times 1\) vector of ones. The JVN decomposition is

$$
y_t = \iota_l \tilde y_t + \nu_t + \zeta_t,
$$

where \(\nu_t\) is the news component and \(\zeta_t\) is the noise component.

- \(\nu_t\) captures information that was unavailable when early releases were
  produced and is therefore rationally incorporated later.
- \(\zeta_t\) captures transitory measurement error that is eventually revised
  away.

This decomposition is the main attraction of the JVN model: it separates
revisions that reflect learning about the economy from revisions that reflect
mistakes in earlier measurement.

## Durbin-Koopman state-space form

In the notation of @durbinTimeSeriesAnalysis2012, the generic state-space model
is

$$
y_t = Z \alpha_t + \varepsilon_t,
\qquad
\varepsilon_t \sim N(0, H),
$$

$$
\alpha_{t+1} = T \alpha_t + R \eta_t,
\qquad
\eta_t \sim N(0, Q).
$$

The current `reviser` implementation sets \(H = 0\), so all uncertainty enters
through the transition equation. It also fixes \(Q = I\) and places the scale
parameters directly in the shock-loading matrix \(R\).

## The `reviser` implementation

`jvn_nowcast()` implements a restricted but practical version of the JVN model.
The latent true value follows an AR(\(p\)) process, and the user may include a
news block, a noise block, or both. Optional spillovers are implemented as
diagonal persistence terms in the selected measurement-error blocks.

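
The normalization \(Q = I\) described in the state-space section costs no generality in the Gaussian model. As a short worked step (the diagonal example with \(\sigma_1, \dots, \sigma_k\) is illustrative notation, not necessarily the package's exact parameterization), the covariance of the state innovation depends on \(R\) and \(Q\) only through the product \(R Q R^\top\):

$$
\operatorname{Var}(R \eta_t) = R Q R^\top = R R^\top
\qquad \text{when } Q = I.
$$

For instance, with \(R = \operatorname{diag}(\sigma_1, \dots, \sigma_k)\) and \(Q = I\),

$$
R R^\top = \operatorname{diag}(\sigma_1^2, \dots, \sigma_k^2),
$$

which is the same innovation covariance one would obtain from \(\tilde R = I\) and \(\tilde Q = \operatorname{diag}(\sigma_1^2, \dots, \sigma_k^2)\).
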

When both news and noise are included, the state vector is

$$
\alpha_t =
\begin{bmatrix}
\tilde y_t \\
\tilde y_{t-1} \\
\vdots \\
\tilde y_{t-p+1} \\
\nu_t \\
\zeta_t
\end{bmatrix},
$$

where \(\nu_t\) and \(\zeta_t\) are both \(l \times 1\) vectors.

### Measurement equation

With \(l\) vintages and an AR(\(p\)) latent process, the observation matrix is

$$
Z =
\begin{bmatrix}
\iota_l & 0_{l \times (p - 1)} & I_l & I_l
\end{bmatrix},
$$

so the observation equation is

$$
y_t = Z \alpha_t.
$$

If only news or only noise is included, the corresponding block is simply
omitted from \(Z\).

### Transition equation

The true-value block follows the companion-form AR(\(p\)) transition

$$
\Phi =
\begin{bmatrix}
\rho_1 & \rho_2 & \cdots & \rho_{p-1} & \rho_p \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{bmatrix}.
$$

The full transition matrix can therefore be written compactly as

$$
T =
\begin{bmatrix}
\Phi & 0 & 0 \\
0 & T_{\nu} & 0 \\
0 & 0 & T_{\zeta}
\end{bmatrix},
$$

where \(T_{\nu}\) and \(T_{\zeta}\) are diagonal spillover blocks when
spillovers are enabled and zero matrices otherwise.

### Shock-loading matrix

The implementation uses \(Q = I\) and places the innovation standard
deviations inside \(R\).

- The first structural shock loads on the latent true value with coefficient
  \(\sigma_e\).
- The news shocks load negatively on the true value and positively on the news
  states in the upper-triangular pattern implied by `jvn_update_matrices()`.
  This enforces the idea that later vintages embed information unavailable to
  earlier vintages.
- The noise shocks load independently on the corresponding noise states with
  coefficients \(\sigma_{\zeta,1}, \dots, \sigma_{\zeta,l}\).

This is the main implementation detail that differs from writing every variance
parameter inside \(Q\): in `reviser`, \(Q\) is fixed and \(R\) carries the
scale parameters.

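
To make the block structure of \(Z\) and \(T\) concrete, the following base-R sketch assembles both matrices for \(l = 4\) vintages and an AR(2) latent process. It is an illustration written for this vignette, not the code used inside `jvn_nowcast()`; all object names and numeric values (`rho`, `t_news`, `t_noise`, `nu`, `zeta`, `alpha`) are hypothetical. The news part of \(R\) is left out because its exact loading pattern is defined by the package internals.

```{r}
# Illustrative dimensions and parameter values (assumptions, not estimates)
l <- 4                   # number of vintages
p <- 2                   # AR order of the latent true value
rho <- c(0.5, 0.2)       # hypothetical AR coefficients
t_news <- rep(0.1, l)    # hypothetical diagonal news spillovers
t_noise <- rep(0.3, l)   # hypothetical diagonal noise spillovers

# Observation matrix Z = [iota_l, 0_{l x (p - 1)}, I_l, I_l]
Z <- cbind(
  matrix(1, nrow = l, ncol = 1),
  matrix(0, nrow = l, ncol = p - 1),
  diag(l),
  diag(l)
)

# Companion-form AR(p) block: coefficients in the first row,
# a shifted identity underneath
Phi <- unname(rbind(rho, cbind(diag(p - 1), 0)))

# Block-diagonal transition matrix T = diag(Phi, T_news, T_noise)
m <- p + 2 * l
T_mat <- matrix(0, nrow = m, ncol = m)
T_mat[1:p, 1:p] <- Phi
news_idx <- p + seq_len(l)
noise_idx <- p + l + seq_len(l)
T_mat[cbind(news_idx, news_idx)] <- t_news
T_mat[cbind(noise_idx, noise_idx)] <- t_noise

# A hypothetical state vector: true value, its lag, news, noise
nu <- c(-0.3, -0.2, -0.1, 0)
zeta <- c(0.05, -0.02, 0.01, 0)
alpha <- c(1.0, 0.8, nu, zeta)

# Z alpha reproduces y_t = iota_l * true value + news + noise
y <- drop(Z %*% alpha)

dim(Z)
dim(T_mat)
y
```

Dropping the news or the noise block amounts to removing the corresponding `diag(l)` from `Z` and the matching rows and columns from `T_mat`.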

## Nested JVN specifications

The function covers the empirically relevant subclasses discussed by
@jacobsModelingDataRevisions2011.

- `include_news = TRUE`, `include_noise = FALSE`: pure news model
- `include_news = FALSE`, `include_noise = TRUE`: pure noise model
- `include_news = TRUE`, `include_noise = TRUE`: combined news-noise model
- `include_spillovers = TRUE`: diagonal persistence in the selected
  measurement-error block(s)

Because these are nested specifications, information criteria are often useful
for comparing them, although standard boundary-value caveats still apply.

## Example: Euro Area GDP revisions

We illustrate the workflow with four vintages of Euro Area GDP growth from
`reviser::gdp`.

```{r warning = FALSE, message = FALSE}
library(reviser)
library(dplyr)
library(tidyr)
library(tsbox)
library(ggplot2)

gdp_growth <- reviser::gdp |>
  tsbox::ts_pc() |>
  dplyr::filter(
    id == "EA",
    time >= min(pub_date),
    time <= as.Date("2020-01-01")
  ) |>
  tidyr::drop_na()

df <- get_nth_release(gdp_growth, n = 0:3)
df
```

The resulting data frame has one row per reference period and one column per
release, which is the format expected by `jvn_nowcast()`.

```{r warning = FALSE, message = FALSE}
fit_jvn <- load_or_build_vignette_result(
  "nowcasting-revisions-jvn-fit.rds",
  function() {
    jvn_nowcast(
      df = df,
      e = 4,
      ar_order = 2,
      h = 0,
      include_news = TRUE,
      include_noise = TRUE,
      include_spillovers = TRUE,
      spillover_news = TRUE,
      spillover_noise = TRUE,
      method = "MLE",
      standardize = FALSE,
      solver_options = list(
        method = "L-BFGS-B",
        maxiter = 100,
        se_method = "hessian"
      )
    )
  }
)

summary(fit_jvn)
```

The parameter table contains the AR coefficients, the latent-process innovation
scale \(\sigma_e\), the news and noise innovation scales, and, when selected,
the diagonal spillover persistence parameters.

```{r warning = FALSE, message = FALSE}
fit_jvn$params
```

The state named `true_lag_0` is the current latent true value.


```{r warning = FALSE, message = FALSE}
fit_jvn$states |>
  dplyr::filter(
    state == "true_lag_0",
    filter == "smoothed"
  ) |>
  dplyr::slice_tail(n = 8)
```

The default plot method shows the filtered estimate of the latent true value.

```{r warning = FALSE, message = FALSE}
plot(fit_jvn)
```

We can also inspect the smoothed news and noise states directly.

```{r warning = FALSE, message = FALSE}
fit_jvn$states |>
  dplyr::filter(
    filter == "smoothed",
    grepl("news|noise", state)
  ) |>
  ggplot(aes(x = time, y = estimate, color = state)) +
  geom_line() +
  labs(
    title = "Smoothed news and noise states",
    x = NULL,
    y = "State estimate"
  ) +
  theme_minimal()
```

## Other JVN specifications

Pure-news and pure-noise variants are obtained by switching off the unwanted
measurement-error block.

```{r eval = FALSE}
fit_news <- jvn_nowcast(
  df = df,
  e = 4,
  ar_order = 2,
  include_news = TRUE,
  include_noise = FALSE,
  include_spillovers = FALSE
)

fit_noise <- jvn_nowcast(
  df = df,
  e = 4,
  ar_order = 2,
  include_news = FALSE,
  include_noise = TRUE,
  include_spillovers = FALSE
)
```

If desired, the data can be approximately standardized before estimation using
`standardize = TRUE`. In that case, scaling metadata are returned in the
`scale` element of the fitted object.