This vignette provides a detailed explanation of the statistical
measures and hypothesis tests used in
get_revision_analysis(). The function is designed to
analyze revisions between preliminary and final data releases, helping
to assess bias, efficiency, correlation, seasonality, and the relative
contributions of news and noise in revisions. The function returns a
tibble with the requested statistics and hypothesis tests. The
following sections explain each statistic and hypothesis test and give,
in parentheses, the name of the corresponding output column used to
access it.
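To facilitate the discussion, we use the following notation: \(y_t^h\) denotes the initial (preliminary) release for period \(t\), \(y_t^f\) the final value, \(r_t^f\) the revision between the two, and \(N\) the number of observations.
The function takes two mandatory arguments: df, which holds the preliminary releases, and final_release, which holds the final values used as the benchmark.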
The function argument degree allows users to specify which statistics and hypothesis tests to include in the output:
degree = 1 includes information about revision size (default)
degree = 2 includes correlation statistics of revisions
degree = 3 includes news and noise tests
degree = 4 includes sign switches, seasonality analysis and Theil’s U
degree = 5 includes all the statistics and hypothesis tests
Revisions provide insights into the reliability of initial data releases. Various metrics assess their magnitude and distribution:
Mean revision ("Bias (mean)")
The mean revision quantifies systematic bias in the revisions:
\[ \text{Bias} = \bar{r} = \frac{1}{N} \sum_{t=1}^{N} r_t^{f} \]
If the mean revision is significantly different from zero, it indicates a tendency for initial releases to be systematically over- or underestimated. A t-test is conducted to test the null hypothesis \(H_0: \bar{r} = 0\).
"Bias (p-value)" reports the standard t-test p-value.
"Bias (robust p-value)" provides a heteroskedasticity-robust alternative.
Mean absolute revision ("MAR")
The mean absolute revision measures the average size of revisions, regardless of direction:
\[ \text{MAR} = \frac{1}{N} \sum_{t=1}^{N} |r_t^{f}| \]
This metric is useful when evaluating the magnitude of revisions rather than their direction.
Minimum and maximum ("Minimum", "Maximum")
These statistics capture the most extreme downward and upward revisions in the dataset.
Percentiles ("10Q", "Median", "90Q")
The 10th, 50th (median), and 90th percentiles provide a distributional perspective, helping to identify skewness and tail behavior in revisions.
Standard deviation ("Std. Dev.")
The standard deviation quantifies the dispersion of revisions:
\[ \sigma_r = \sqrt{ \frac{1}{N-1} \sum_{t=1}^{N} (r_t^{f} - \bar{r})^2 } \]
A higher standard deviation indicates greater variability in the revision process.
Noise-to-signal ratio ("Noise/Signal")
This metric measures the size of revisions relative to the variability of the final data, expressed as a ratio of standard deviations:
\[ \frac{\sigma_r}{\sigma_y} \]
where \(\sigma_y\) is the standard deviation of the final value \(y_t^f\). A high noise-to-signal ratio suggests that revisions are large relative to the underlying variability in the final data, indicating high uncertainty in initial estimates.
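As a rough illustration, the summary measures above can be reproduced in base R. This is a minimal sketch and not the package's implementation: the helper revision_summary() and the toy vectors are hypothetical, the revision is assumed to be the final value minus the initial release, quantile() uses R's default interpolation, and the robust p-value is left out.

```r
# Hypothetical helper: summary measures of revisions.
# Assumption: the revision is defined as r = y_f - y_h.
revision_summary <- function(y_h, y_f) {
  r <- y_f - y_h
  list(
    bias = mean(r),                                    # mean revision
    bias_p_value = t.test(r, mu = 0)$p.value,          # standard t-test of H0: mean = 0
    mar = mean(abs(r)),                                # mean absolute revision
    minimum = min(r),
    maximum = max(r),
    quantiles = quantile(r, probs = c(0.1, 0.5, 0.9)), # 10th, 50th, 90th percentiles
    std_dev = sd(r),                                   # uses the N - 1 denominator
    noise_signal = sd(r) / sd(y_f)                     # sd of revisions over sd of final values
  )
}

# Toy data (not taken from the gdp dataset)
y_f <- c(1.0, 0.5, -0.2, 0.8, 0.3)
y_h <- c(0.8, 0.6, -0.1, 0.5, 0.3)

summary_stats <- revision_summary(y_h, y_f)
summary_stats$bias
summary_stats$mar
summary_stats$noise_signal
```

With these toy vectors the revisions are 0.2, -0.1, -0.1, 0.3 and 0, so the mean revision is 0.06 and the mean absolute revision is 0.14.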
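library(reviser)
library(dplyr)
library(tsbox)
# Convert the GDP vintages to period-on-period growth rates and keep the "EA" series
gdp <- reviser::gdp |>
  ts_pc() |>
  filter(id == "EA") |>
  na.omit()
# First and second releases (release_0 and release_1)
df <- get_nth_release(gdp, 0:1)
# Latest available release, used as the final benchmark
final_release <- get_latest_release(gdp)
results <- get_revision_analysis(
  df,
  final_release,
  degree = 1
)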
results
#>
#> === Revision Analysis Summary ===
#>
#> # A tibble: 2 × 14
#> id release N `Bias (mean)` `Bias (p-value)` `Bias (robust p-value)`
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 EA release_0 178 0.055 0.002 0
#> 2 EA release_1 177 0.05 0.004 0.001
#> # ℹ 8 more variables: Minimum <dbl>, Maximum <dbl>, `10Q` <dbl>, Median <dbl>,
#> # `90Q` <dbl>, MAR <dbl>, `Std. Dev.` <dbl>, `Noise/Signal` <dbl>
#>
#> === Interpretation ===
#>
#> id=EA, release=release_0:
#> • Significant upward bias detected (p = 0 )
#> • Moderate revision volatility (Noise/Signal = 0.169 )
#>
#> id=EA, release=release_1:
#> • Significant upward bias detected (p = 0.001 )
#> • Moderate revision volatility (Noise/Signal = 0.165 )
Understanding how revisions relate to initial releases and past revisions can reveal patterns in the revision process:
Correlation with the initial release ("Correlation")
This measures whether revisions are systematically related to the initial release \(y_t^h\), which serves as a proxy for available information at the time of release:
\[ \rho = \frac{\sum (y_t^h - \bar{y}^h) (r_t^f - \bar{r})}{\sqrt{\sum (y_t^h - \bar{y}^h)^2} \sqrt{\sum (r_t^f - \bar{r})^2}} \]
A significant correlation suggests that initial estimates contain
information that predicts later revisions. A t-test
("Correlation (p-value)") is used to test whether \(\rho\) is significantly different from
zero.
First-order autocorrelation ("Autocorrelation (1st)")
The first-order autocorrelation measures the persistence in the revision process by examining whether past revisions predict future revisions:
\[ \rho_1 = \frac{\sum (r_t^f - \bar{r}) (r_{t-1}^f - \bar{r})}{\sqrt{\sum (r_t^f - \bar{r})^2 } \sqrt{\sum (r_{t-1}^f - \bar{r})^2}} \]
If revisions exhibit strong autocorrelation, it may indicate a
systematic pattern in the revision process rather than purely random
adjustments. A t-test
("Autocorrelation (1st p-value)") is used to assess whether
\(\rho_1\) is significantly different
from zero.
Ljung-Box test ("Autocorrelation up to 1yr (Ljung-Box p-value)")
This metric assesses the autocorrelation of revisions up to a lag of one year using the Ljung-Box test. The null hypothesis is that there is no autocorrelation up to the specified lag, i.e., the revisions are serially uncorrelated. The alternative hypothesis is that there is autocorrelation in the revisions.
The Ljung-Box test statistic is computed using the following formula:
\[ Q = N(N + 2) \sum_{k=1}^{m} \frac{\hat{\rho}_k^2}{N - k} \]
\(Q\) is the Ljung-Box statistic.
\(N\) is the number of observations in the time series.
\(\hat{\rho}_k\) is the sample autocorrelation at lag \(k\). This measures the correlation between the revisions at time \(t\) and time \(t-k\).
\(m\) is the number of lags considered (which corresponds to the frequency of the data: 4 for quarterly data, 12 for monthly data).
Under the null hypothesis, \(Q\) approximately follows a chi-squared distribution with \(m\) degrees of freedom, so large values of \(Q\) yield small p-values. A large sample autocorrelation at any lag up to \(m\) increases \(Q\); the larger the \(Q\)-statistic, the stronger the evidence that the revisions are autocorrelated.
These metrics collectively help evaluate the reliability, predictability, and potential biases in data revisions.
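The correlation t-test and the Ljung-Box statistic described above can be checked by hand in base R. The sketch below uses simulated data (the vectors and the coefficient -0.3 are purely illustrative) and compares the manual formulas with cor.test() and Box.test(). Note that acf() uses the standard sample autocorrelation estimator, which can differ slightly from the Pearson-type formula shown for \(\rho_1\); whether the package relies on these exact functions is an assumption.

```r
set.seed(123)
n <- 120
y_h <- rnorm(n)                       # simulated initial releases
r <- -0.3 * y_h + rnorm(n, sd = 0.5)  # simulated revisions, correlated with y_h

# Correlation between revisions and initial releases, with its t-test
rho <- cor(y_h, r)
t_stat <- rho * sqrt((n - 2) / (1 - rho^2))
p_corr <- 2 * pt(-abs(t_stat), df = n - 2)
ct <- cor.test(y_h, r)

# Ljung-Box statistic up to m lags (m = 4 mimics quarterly data)
m <- 4
rho_k <- acf(r, lag.max = m, plot = FALSE)$acf[-1]  # drop lag 0
q_stat <- n * (n + 2) * sum(rho_k^2 / (n - seq_len(m)))
p_lb <- pchisq(q_stat, df = m, lower.tail = FALSE)
bt <- Box.test(r, lag = m, type = "Ljung-Box")

# Manual results next to the built-in tests
c(manual = p_corr, cor.test = ct$p.value)
c(manual = q_stat, Box.test = unname(bt$statistic))
c(manual = p_lb, Box.test = bt$p.value)
```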
results <- get_revision_analysis(
  df,
  final_release,
  degree = 2
)
head(results)
#>
#> === Revision Analysis Summary ===
#>
#> # A tibble: 2 × 8
#> id release N Correlation Correlation (p-value…¹ Autocorrelation (1st…²
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 EA release… 178 -0.315 0 -0.092
#> 2 EA release… 177 -0.318 0 -0.157
#> # ℹ abbreviated names: ¹`Correlation (p-value)`, ²`Autocorrelation (1st)`
#> # ℹ 2 more variables: `Autocorrelation (1st p-value)` <dbl>,
#> # `Autocorrelation up to 1yr (Ljung-Box p-value)` <dbl>
#>
#> === Interpretation ===
#>
#> id=EA, release=release_0:
#> • Significant negative correlation between revisions and initial values (ρ = -0.315 , p = 0 )
#>
#> id=EA, release=release_1:
#> • Significant negative correlation between revisions and initial values (ρ = -0.318 , p = 0 )
#> • Significant autocorrelation in revisions
#> (ρ₁ = -0.157 ): revisions are persistent
Sign switches in data revisions occur when the direction of an economic indicator changes between its initial release and later revisions. We define two key metrics to assess the stability of these directional signals:
The first metric ("Fraction of correct sign") evaluates how often the sign of the initially reported value \(y_t^h\) matches the sign of the final revised value \(y_t^f\). Mathematically, we compute:
\[ \text{Fraction of sign consistency} = \frac{\sum_{t=1}^{T} \mathbb{1}(\text{sign}(y_t^h) = \text{sign}(y_t^f))}{T} \]
where \(\mathbb{1}(\cdot)\) is an indicator function that equals 1 if the signs match and 0 otherwise.
The second metric ("Fraction of correct growth rate change") assesses whether the direction of change in the variable remains consistent after revisions. Specifically, we compare the signs of the period-over-period differences:
\[ \text{Fraction of sign consistency in growth} = \frac{\sum_{t=2}^{T} \mathbb{1}(\text{sign}(\Delta y_t^h) = \text{sign}(\Delta y_t^f))}{T-1} \]
where \(\Delta y_t^h = y_t^h - y_{t-1}^h\) and \(\Delta y_t^f = y_t^f - y_{t-1}^f\) represent the first differences of the initially reported and final values, respectively.
A low fraction of correct signs in either metric suggests that early estimates may be unreliable in capturing the true direction of the variable.
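Both fractions reduce to a comparison of signs. The following minimal sketch computes them for two short toy series; the helper sign_agreement() and the data are hypothetical and only illustrate the two formulas above.

```r
# Hypothetical helper: share of periods in which two series agree in sign
sign_agreement <- function(a, b) mean(sign(a) == sign(b))

# Toy data (not taken from the gdp dataset)
y_h <- c(0.4, -0.1, 0.3, 0.2, -0.2)  # initial releases
y_f <- c(0.5, 0.1, 0.2, 0.3, -0.1)   # final values

# Fraction of correct sign: compare the levels
frac_sign <- sign_agreement(y_h, y_f)

# Fraction of correct growth rate change: compare the first differences
frac_growth <- sign_agreement(diff(y_h), diff(y_f))

frac_sign    # 0.8: the signs differ only in the second period
frac_growth  # 0.75: three of the four changes agree in direction
```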
The news and noise tests analyze the properties of data revisions in relation to the final and preliminary releases of an economic variable. These tests help to evaluate whether the revisions are systematic, whether they contain new information, or whether they are simply noisy adjustments (Mankiw and Shapiro 1986; Aruoba 2008).
Final revisions can be classified into two categories:
Noise: The initial announcement is an observation of the final series, measured with error. This means that the revision is uncorrelated with the final value but correlated with the data available when the estimate is made.
News: The initial announcement is an efficient forecast that reflects all available information, and subsequent estimates reduce the forecast error by incorporating new information. The revision is correlated with the final value but uncorrelated with the data available when the estimate is made, i.e., unpredictable using the information set at the time of the initial announcement.
To test these categories, we run two regression tests:
The noise test examines whether the revision can be predicted by the final value:
\[ {r}_t^f = \alpha + \beta \cdot {y_t^f} + \epsilon_t \]
Under the noise hypothesis, the preliminary value equals the final value plus a measurement error, so revisions should be uncorrelated with the final value. The null hypothesis is:
\[ H_0: \alpha = 0, \quad \beta = 0 \]
The output reports three tests:
A joint test ("Noise joint test (p-value)") that both \(\alpha = 0\) and \(\beta = 0\), using a heteroskedasticity and autocorrelation consistent (HAC) covariance matrix to ensure robust inference.
An individual test on the intercept ("Noise test Intercept (p-value)") of whether the revisions have a systematic bias.
An individual test on the coefficient ("Noise test Coefficient (p-value)") of whether the revisions are correlated with the final value, which is inconsistent with pure noise.
The news test examines whether the revision is predictable using the preliminary release:
\[ {r}_t^f = \alpha + \beta \cdot {y}_t^h + \epsilon_t \]
Here, we test whether revisions are systematically related to the initial value. The null hypothesis is:
\[ H_0: \alpha = 0, \quad \beta = 0 \]
The output reports three tests:
A joint test ("News joint test (p-value)") that both \(\alpha = 0\) and \(\beta = 0\), using a HAC covariance matrix to ensure robust inference.
An individual test on the intercept ("News test Intercept (p-value)") of whether the revisions have a systematic bias.
An individual test on the coefficient ("News test Coefficient (p-value)") of whether the revisions are correlated with the initial value, indicating inefficiency.
Note: Instead of regressing \(r_t^f\) on \(y_t^f\) or \(y_t^h\), one can similarly regress
\[ \begin{aligned} \text{Noise: } \quad y_t^h &= \alpha + \beta \cdot y_t^f + \epsilon_t \\ \text{News: } \quad y_t^f &= \alpha + \beta \cdot y_t^h + \epsilon_t \end{aligned} \]
and test \(\alpha = 0\) and \(\beta = 1\) for the presence of noise and news in the data revisions.
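The two regressions can be sketched in base R on simulated data. This is an illustration under stated assumptions, not the package's method: the helper joint_zero_test() is hypothetical, the revision is assumed to be \(y_t^f - y_t^h\), and a classical OLS F-test stands in for the HAC-based joint test, so the p-values will generally differ from those reported by the function.

```r
set.seed(42)
n <- 200
y_h <- rnorm(n)                  # simulated initial releases
y_f <- y_h + rnorm(n, sd = 0.3)  # final values add information unrelated to y_h
r <- y_f - y_h                   # revision (assumed sign convention)

# Hypothetical helper: OLS of r on x plus a classical F-test of alpha = beta = 0
joint_zero_test <- function(r, x) {
  fit <- lm(r ~ x)
  n_obs <- length(r)
  rss_full <- sum(resid(fit)^2)  # unrestricted residual sum of squares
  rss_null <- sum(r^2)           # restricted model: alpha = 0 and beta = 0
  f_stat <- ((rss_null - rss_full) / 2) / (rss_full / (n_obs - 2))
  list(
    coef = unname(coef(fit)),    # intercept first, slope second
    p_value = pf(f_stat, df1 = 2, df2 = n_obs - 2, lower.tail = FALSE)
  )
}

noise_test <- joint_zero_test(r, y_f)  # revision regressed on the final value
news_test <- joint_zero_test(r, y_h)   # revision regressed on the initial release

# Levels regression from the note: its slope equals 1 + beta of the news regression
levels_fit <- lm(y_f ~ y_h)
c(news_beta_plus_1 = news_test$coef[2] + 1,
  levels_slope = unname(coef(levels_fit)[2]))
```

The last line shows why testing \(\beta = 1\) in the levels regression is equivalent to testing \(\beta = 0\) in the revision regression: subtracting \(y_t^h\) from both sides only shifts the slope by one.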
results <- get_revision_analysis(
  df,
  final_release,
  degree = 3
)
head(results)
#>
#> === Revision Analysis Summary ===
#>
#> # A tibble: 2 × 17
#> id release N `News test Intercept` `News test Intercept (std.err)`
#> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 EA release_0 178 0.074 0.02
#> 2 EA release_1 177 0.069 0.018
#> # ℹ 12 more variables: `News test Intercept (p-value)` <dbl>,
#> # `News test Coefficient` <dbl>, `News test Coefficient (std.err)` <dbl>,
#> # `News test Coefficient (p-value)` <dbl>, `News joint test (p-value)` <dbl>,
#> # `Noise test Intercept` <dbl>, `Noise test Intercept (std.err)` <dbl>,
#> # `Noise test Intercept (p-value)` <dbl>, `Noise test Coefficient` <dbl>,
#> # `Noise test Coefficient (std.err)` <dbl>,
#> # `Noise test Coefficient (p-value)` <dbl>, …
#>
#> === Interpretation ===
#>
#> id=EA, release=release_0:
#> • Revisions contain NEWS (p = 0 ): systematic information
#> • Revisions contain NOISE (p = 0.007 ): measurement error
#>
#> id=EA, release=release_1:
#> • Revisions contain NEWS (p = 0 ): systematic information
#> • Revisions contain NOISE (p = 0.005 ): measurement error
To test whether seasonality is present in revisions, we employ a Friedman test. Note that this test is always performed on the first-differenced series.
The Friedman test ("Seasonality (Friedman p-value)") is
a non-parametric statistical test that examines whether the distribution
of ranked data differs systematically across seasonal periods, such as
months or quarters. It does not require distributional assumptions,
making it a robust tool for detecting seasonality in revision series.
The test is applied by first transforming the revision series into a matrix where each row represents a year and each column represents a specific month or quarter. Consider the data matrix \(\{x_{ij}\}_{n \times k}\) with \(n\) rows (i.e., the number of years in the sample) and \(k\) columns (i.e., either 12 months or 4 quarters, depending on the frequency of the data). This matrix is replaced by a new matrix \(\{r_{ij}\}_{n \times k}\), where the entry \(r_{ij}\) is the rank of \(x_{ij}\) within row \(i\). The test is executed using the stats::friedman.test() function. The null hypothesis is that the average ranking across columns is the same, indicating no seasonality in the revisions.
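The year-by-quarter layout described above can be sketched as follows. The data are simulated with a deliberately strong first-quarter effect, and the differencing step mentioned earlier is skipped for brevity, so this only illustrates how the matrix is passed to stats::friedman.test().

```r
set.seed(1)
n_years <- 20

# Simulated quarterly revisions with a pronounced first-quarter effect
seasonal_effect <- rep(c(2, 0, 0, 0), times = n_years)
revisions <- seasonal_effect + rnorm(4 * n_years, sd = 0.1)

# Rows are years (blocks), columns are quarters (groups)
x <- matrix(revisions, ncol = 4, byrow = TRUE,
            dimnames = list(NULL, paste0("Q", 1:4)))

# friedman.test() ranks the values within each row internally
ft <- stats::friedman.test(x)
ft$statistic
ft$p.value
```

Because the first quarter ranks highest in every simulated year, the test rejects the null hypothesis of equal average ranks across quarters.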
In the context of revision analysis, Theil’s inequality coefficient, or Theil’s U, provides a measure of the accuracy of initial estimates (\(y_t^h\)) relative to final or more refined values (\(y_t^f\)). Various definitions of Theil’s statistics exist, leading to different interpretations. This package considers two widely used variants: U1 and U2.
Theil’s U1 ("Theil's U1")
The first measure, U1, is given by:
\[ U_1 = \frac{\sqrt{\frac{1}{n}\sum^n_{t=1}(y_t^f-y_t^h)^2}}{\sqrt{\frac{1}{n}\sum^n_{t=1}{y_t^f}^2}+\sqrt{\frac{1}{n}\sum^n_{t=1}{y_t^h}^2}} \]
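The U1 statistic is bounded between 0 and 1:
U1 = 0 indicates that the initial estimates coincide with the final values.
Values closer to 1 indicate lower accuracy of the initial estimates.
However, U1 suffers from several limitations, as highlighted by Granger and Newbold (1973). A critical issue arises when one of \(y_t^h\) or \(y_t^f\) equals zero in all periods. In such cases, the denominator and numerator become equal, leading to U1 = 1 even when preliminary estimates are close to the final values.
Theil’s U2 ("Theil's U2")
To address these shortcomings, Theil et al. (1966) proposed an alternative measure, U2, which is defined as:
\[ U_2 = \frac{\sqrt{\sum_{t=1}^{n} \left( \frac{y_{t+1}^h- y_{t+1}^f}{y_t^f} \right)^2}}{\sqrt{\sum_{t=1}^{n} \left( \frac{y_{t+1}^f - y_t^f}{y_t^f} \right)^2}} \]
where:
\(y_{t+1}^h - y_{t+1}^f\) is the gap between the initial estimate and the final value for period \(t+1\).
\(y_{t+1}^f - y_t^f\) is the change in the final value between periods \(t\) and \(t+1\), i.e., the error of a naive no-change forecast.
Both terms are scaled by the final value \(y_t^f\) of the previous period.
Unlike U1, U2 is not bounded between 0 and 1. Instead:
U2 = 0 indicates that the initial estimates coincide with the final values.
U2 < 1 indicates that the initial estimates are more accurate than the naive no-change forecast.
U2 = 1 indicates that they are as accurate as the naive forecast.
U2 > 1 indicates that they are less accurate than the naive forecast.
Whenever possible (if \(y_{t}^f \neq 0\) for all \(t\)), U2 is the preferred metric for evaluating preliminary estimates.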
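The two formulas translate directly into a few lines of base R. The helpers theil_u1() and theil_u2() below are hypothetical and follow the definitions above; for U2, the sum is assumed to run over the \(n - 1\) pairs of consecutive observations available in a series of length \(n\).

```r
# Hypothetical helpers following the U1 and U2 formulas above
theil_u1 <- function(y_h, y_f) {
  sqrt(mean((y_f - y_h)^2)) / (sqrt(mean(y_f^2)) + sqrt(mean(y_h^2)))
}

theil_u2 <- function(y_h, y_f) {
  n <- length(y_f)
  prev_final <- y_f[-n]                              # y_t^f for t = 1, ..., n - 1
  rel_revision <- (y_h[-1] - y_f[-1]) / prev_final   # scaled gap to the final value
  rel_change <- (y_f[-1] - prev_final) / prev_final  # scaled naive forecast error
  sqrt(sum(rel_revision^2)) / sqrt(sum(rel_change^2))
}

# Toy final values (all non-zero, as U2 requires)
y_f <- c(0.5, 0.3, 0.6, 0.2, 0.4)
y_naive <- c(y_f[1], y_f[-length(y_f)])  # no-change forecast: previous final value

theil_u1(y_f, y_f)        # identical series give 0
theil_u1(rep(0, 5), y_f)  # an all-zero preliminary series gives 1
theil_u2(y_f, y_f)        # identical series give 0
theil_u2(y_naive, y_f)    # the naive forecast gives 1
```

The second call reproduces the limitation of U1 noted above, and the last call shows why U2 = 1 is the natural benchmark.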
results <- get_revision_analysis(
  df,
  final_release,
  degree = 4
)
head(results)
#>
#> === Revision Analysis Summary ===
#>
#> # A tibble: 2 × 8
#> id release N Fraction of correct …¹ Fraction of correct …² `Theil's U1`
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 EA releas… 178 0.944 0.764 0.082
#> 2 EA releas… 177 0.944 0.763 0.08
#> # ℹ abbreviated names: ¹`Fraction of correct sign`,
#> # ²`Fraction of correct growth rate change`
#> # ℹ 2 more variables: `Theil's U2` <dbl>,
#> # `Seasonality (Friedman p-value)` <dbl>
#>
#> === Interpretation ===
#>
#> id=EA, release=release_0:
#> • Good forecast accuracy (Theil's U1 = 0.082 )
#> • Excellent sign prediction (94.4% correct)
#>
#> id=EA, release=release_1:
#> • Good forecast accuracy (Theil's U1 = 0.08 )
#> • Excellent sign prediction (94.4% correct)
In the above examples, we analyze GDP revisions by comparing the final release with the first and second releases, which correspond to the diagonal elements of the real-time data matrix. This approach provides insight into how initial estimates evolve over time.
Alternatively, we can compare GDP revisions between specific
publication dates. In the example below, we analyze revisions of the
GDP releases from April and July 2024, using the final release from
October 2024 as the benchmark. By default, the statistics are grouped
by the pub_date or release column; a different grouping column can be
specified via the optional function argument grouping_var.
# Keep the vintages published in April and July 2024
df <- gdp |>
  filter(pub_date %in% as.Date(c("2024-04-01", "2024-07-01")))
results <- get_revision_analysis(
  df,
  final_release,
  degree = 5
)
results
#>
#> === Revision Analysis Summary ===
#>
#> # A tibble: 2 × 39
#> id pub_date N Frequency `Bias (mean)` `Bias (p-value)`
#> <chr> <date> <dbl> <dbl> <dbl> <dbl>
#> 1 EA 2024-04-01 176 4 0.007 0.159
#> 2 EA 2024-07-01 177 4 0.002 0.52
#> # ℹ 33 more variables: `Bias (robust p-value)` <dbl>, Minimum <dbl>,
#> # Maximum <dbl>, `10Q` <dbl>, Median <dbl>, `90Q` <dbl>, MAR <dbl>,
#> # `Std. Dev.` <dbl>, `Noise/Signal` <dbl>, Correlation <dbl>,
#> # `Correlation (p-value)` <dbl>, `Autocorrelation (1st)` <dbl>,
#> # `Autocorrelation (1st p-value)` <dbl>,
#> # `Autocorrelation up to 1yr (Ljung-Box p-value)` <dbl>, `Theil's U1` <dbl>,
#> # `Theil's U2` <dbl>, `Seasonality (Friedman p-value)` <dbl>, …
#>
#> === Interpretation ===
#>
#> id=EA, pub_date=2024-04-01:
#> • Significant upward bias detected (p = 0.033 )
#> • Very low revision volatility (Noise/Signal = 0.045 )
#> • Revisions do NOT contain news (p = 0.237 )
#> • Revisions do NOT contain noise (p = 0.069 )
#> • Significant autocorrelation in revisions
#> (ρ₁ = -0.201 ): revisions are persistent
#> • Good forecast accuracy (Theil's U1 = 0.022 )
#> • Excellent sign prediction (97.2% correct)
#>
#> id=EA, pub_date=2024-07-01:
#> • No significant bias detected (p = 0.375 )
#> • Very low revision volatility (Noise/Signal = 0.031 )
#> • Revisions do NOT contain news (p = 0.579 )
#> • Revisions do NOT contain noise (p = 0.279 )
#> • Good forecast accuracy (Theil's U1 = 0.015 )
#> • Excellent sign prediction (98.9% correct)
The get_revision_analysis() function requires a single time series in final_release and compares it to one or more time series in df for the same id. To compare each publication date with the next one, the analysis has to be structured sequentially. The code below iterates over consecutive publication dates in the gdp dataset. For each pair, it extracts the corresponding data, runs get_revision_analysis(), and combines the results into a single data frame.
# Get unique sorted publication dates
pub_dates <- gdp |>
  distinct(pub_date) |>
  arrange(pub_date) |>
  pull(pub_date)
# Run the function for each pair of consecutive publication dates
results <- purrr::map_dfr(
  seq_len(length(pub_dates) - 1),
  function(i) {
    # Earlier vintage, treated as the preliminary data
    df <- gdp |>
      filter(pub_date %in% pub_dates[i])
    # Next vintage, treated as the final release
    final_release <- gdp |>
      filter(pub_date %in% pub_dates[i + 1])
    get_revision_analysis(df, final_release, degree = 5)
  }
)
head(results)