---
title: "lmForc"
author:
- name: "Nelson Rayl"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{lmForc}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
```{r}
library(lmForc)
```
![](lmForc_logo.png){width=100%}
## Sections
#### [Forecast Class]
#### [is_forc]
#### [is_forc_general]
#### [oos_realized_forc]
#### [oos_realized_forc_general]
#### [oos_vintage_forc]
#### [oos_vintage_forc_general]
#### [conditional_forc]
#### [conditional_forc_general]
#### [oos_lag_forc]
#### [historical_average_forc]
#### [random_walk_forc]
#### [autoreg_forc]
#### [performance_weighted_forc]
#### [states_weighted_forc]
#### [Subsetting Functions Overview]
#### [subset_forcs]
#### [subset_bytime]
#### [subset_identical]
#### [Transformation Functions Overview]
#### [convert_bytime]
#### [transform_bytime]
#### [convert_byh]
#### [transform_byh]
## Forecast Class
At the heart of the *lmForc* package is the `Forecast` class. Base R does not
provide a good format for working with forecasts, so *lmForc* addresses this by
introducing a new class for storing forecasts that is simple and rigorous. The
`Forecast` class is paramount to the *lmForc* philosophy of simple syntax and
realistic tests. `Forecast` is an S4 class that contains equal length vectors
with the following data:
* `origin` \ \ \ The time when the forecast was made.
* `future` \ \ \ The time that is being forecasted.
* `forecast` The forecast itself.
* `realized` \ If available, the realized value at the time being forecasted.
The `Forecast` class also includes an additional length-one slot `h_ahead` for
representing how many periods ahead are being forecasted. This slot is optional,
but becomes useful for documentation and performing out-of-sample forecast
tests.
We demonstrate the `Forecast` class by constructing a simple `Forecast` object.
This forecast contains four observations at the quarterly frequency.
```{r}
my_forecast <- Forecast(
origin = as.Date(c("2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31")),
future = as.Date(c("2011-03-31", "2011-06-30", "2011-09-30", "2011-12-31")),
forecast = c(4.21, 4.27, 5.32, 5.11),
realized = c(4.40, 4.45, 4.87, 4.77),
h_ahead = 4L
)
```
Note that we have chosen Date objects at the quarterly frequency for our
`origin` and `future` slots. This forecast is for four quarters ahead, so we
fill in the `h_ahead` slot with the integer four.
The `origin`, `future`, and `h_ahead` slots can be filled with any values. This
is where the general nature of the `Forecast` class shines. In the `origin` and
`future` slots we could use dates at a daily or yearly frequency, the `POSIXct`
class to include minutes and seconds, or integers to represent discrete periods.
We can also store different types of forecasts. In the example above, we have
forecasts made at different `origin` times for a constant four quarter ahead
forecast horizon. We could also store a forecast made at a single `origin` time
for a horizon of `future` times by setting all of the `origin` values to one
time and using `future` to represent the horizon of times that we are
forecasting. In this case the `h_ahead` slot becomes irrelevant and it is left
as `NULL`. The flexibility of these slots allows us to represent any type of
numeric forecast.
The `forecast` and `realized` slots take numeric vectors. In the `forecast` slot
we see the forecast that was made at each `origin` time and in the `realized`
slot we see the true value that was realized at each `future` time. The realized
values may not exist yet, so this slot may be partially populated or not populated
at all. If the `realized` slot is set to `NULL` then it will be
populated with a vector of `NA` values.
The `Forecast` class strikes a balance between simplicity and rigor. It is
simple enough to store any numeric forecast, but it is rigorous enough to create
a useful data structure. For example, we can quickly calculate a number of
forecast accuracy metrics for the `Forecast` object we created above using only
one argument.
```{r}
mse(my_forecast)
rmse(my_forecast)
mae(my_forecast)
mape(my_forecast)
R2(my_forecast)
```
Because the `forecast` and `realized` slots must be numeric vectors, and all
slots must be of the same length, we can calculate forecast accuracy metrics
without having to worry about input validation or coercing multiple vectors to
the correct format. The forecast accuracy metrics available in the *lmForc* package
are calculated as follows where:
$$
n = \text{forecast vector length}\\
\quad Y_i = \text{realized values}\\
\hat{Y_i} = \text{forecast values}
$$
MSE is calculated as:
$$
MSE = \frac{1}{n} \sum_{i=1}^{n}{(Y_i - \hat{Y_i})^2}
$$
RMSE is calculated as:
$$ RMSE = \sqrt{MSE} $$
MAE is calculated as:
$$ MAE = \frac{1}{n} \sum_{i=1}^{n}{|\hat{Y_i} - Y_i|} $$
MAPE is calculated as:
$$ MAPE = \frac{1}{n} \sum_{i=1}^{n}\frac{|Y_i - \hat{Y_i}|}{{Y_i}} $$
R2 is calculated as:
$$ R^2 = cor(\hat{Y_i}, \ Y_i)^2 $$
Note that these equations require two inputs, but because both inputs are
already stored in the `Forecast` object we only need to pass one argument to the
`mse()` and `rmse()` functions. Calculating forecast accuracy is a simple use
case of the `Forecast` class. More complex use cases exist in the *lmForc*
package, where many of the functions require inputs to be of the `Forecast`
class. When weighting multiple forecasts or testing a linear model that is
conditional on another forecast, the consistent structure of the class results
in simple functions that execute correctly, no matter the type of forecast
passed to the function. Furthermore, all *lmForc* functions return objects of
the `Forecast` class which creates synergy between functions. One
can take two linear models, test their performance out-of-sample with the
`oos_realized_forc()` function which returns `Forecast` objects, and then pass
these two objects to the `performance_weighted_forc()` function to find the weighted
out-of-sample performance of both models.
One fear may be that the novel `Forecast` class will not play well with
functions and packages that already exist in the R language. The *lmForc*
package provides methods for accessing all of the vectors stored in a `Forecast`
object as well as the `forc2df()` function which returns one or multiple
`Forecast` objects as a data frame. With these methods, one can easily pass the
data in a `Forecast` object to other functions.
```{r}
forc2df(my_forecast)
origin(my_forecast)
future(my_forecast)
forc(my_forecast)
realized(my_forecast)
```
## Example Dataset
Examples throughout the rest of the vignette will use a stylized dataset with a
`date` column of ten quarterly dates, a dependent variable `y`, and two
independent variables `x1` and `x2`. Equations are also written in terms of the
variables `y`, `x1`, and `x2`.
```{r}
date <- as.Date(c("2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31",
"2011-03-31", "2011-06-30", "2011-09-30", "2011-12-31",
"2012-03-31", "2012-06-30"))
y <- c(1.09, 1.71, 1.09, 2.46, 1.78, 1.35, 2.89, 2.11, 2.97, 0.99)
x1 <- c(4.22, 3.86, 4.27, 5.60, 5.11, 4.31, 4.92, 5.80, 6.30, 4.17)
x2 <- c(10.03, 10.49, 10.85, 10.47, 9.09, 10.91, 8.68, 9.91, 7.87, 6.63)
data <- data.frame(date, y, x1, x2)
head(data)
```
To demonstrate the `_general` set of functions which extend standard *lmForc* functions
to accommodate non-linear models, we will use a logit regression and a modified version
of the above dataset in which the dependent variable `y` is binary.
```{r}
date <- as.Date(c("2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31",
"2011-03-31", "2011-06-30", "2011-09-30", "2011-12-31",
"2012-03-31", "2012-06-30", "2012-09-30", "2012-12-31",
"2013-03-31", "2013-06-30", "2013-09-30", "2013-12-31"))
y <- c(1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0)
x1 <- c(8.22, 3.86, 4.27, 3.37, 5.88, 3.34, 2.92, 1.80, 3.30, 7.17, 3.22, 3.86, 4.27, 3.37, 5.88, 3.34)
x2 <- c(4.03, 2.46, 2.04, 2.44, 6.09, 2.91, 1.68, 2.91, 3.87, 1.63, 4.03, 2.46, 2.04, 2.44, 6.09, 2.91)
dataLogit <- data.frame(date, y, x1, x2)
head(dataLogit)
```
## is_forc
The `is_forc()` function produces an in-sample forecast based on a linear model.
The function takes a linear model call and an optional vector of time data
associated with the linear model as inputs. The linear model is estimated once over the
entire sample period and the coefficients are multiplied by the realized values
in each period of the sample. This is functionally identical to the `predict()`
function in the *stats* package, but it returns a `Forecast` object instead of a
numeric vector.
For all observations *i* in the sample, coefficients are estimated as:
$$
Y_i = \beta_0 + \beta_1 X1_i + \beta_2 X2_i + \epsilon_i
\qquad \text{for all } i
$$
And forecasts are estimated as:
$$
forecast_i = \beta_0 + \beta_1 X1_i + \beta_2 X2_i \qquad
$$
```{r}
is_forc(
lm_call = lm(y ~ x1 + x2, data),
time_vec = data$date
)
```
Note that because we are creating an in-sample forecast, `h_ahead` is set to 0
and the `origin` time equals the `future` time. This test evaluates how well a
linear forecast model fits the historical data.
## is_forc_general
The `is_forc_general()` function produces an in-sample forecast based on a model function
and prediction function specified by the user. The `is_forc_general` function takes a model function, a
prediction function, input data for estimating the model, and an optional vector of time
data associated with the model. The model is estimated once over the entire sample period
using the input data and model function. Model parameters are then combined with the input
data using the prediction function to generate in-sample forecasts. This is functionally similar
to the `is_forc()` function, but allows for usage of non-linear models such as GLMs, logit
regressions, decision trees, or any custom model.
For all observations *i* in the sample, coefficients are estimated as:
$$
Y_i = \text{model_function}(X_i)
\qquad \text{for all } i
$$
And forecasts are estimated as:
$$
forecast_i = \text{prediction_function}(\text{model_function}, X_i) \qquad
$$
```{r}
is_forc_general(
model_function = function(data) {glm(y ~ x1 + x2, data = data, family = binomial)},
prediction_function = function(model_function, data) {as.vector(predict(model_function, data, type = "response"))},
data = dataLogit,
realized = dataLogit$y,
time_vec = dataLogit$date
)
```
Note that because we are creating an in-sample forecast, `h_ahead` is set to 0
and the `origin` time equals the `future` time. This test evaluates how well the
model fits the historical data.
## oos_realized_forc
The `oos_realized_forc()` function produces an *h* period ahead out-of-sample
forecast that is conditioned on realized values. The function takes a linear
model call, an integer number of periods ahead to forecast, a period to end the
initial coefficient estimation and begin forecasting, an optional vector of
time data associated with the linear model, and an optional integer number of
past periods to estimate the linear model over. The linear model is originally
estimated with data up to `estimation_end` minus the number of periods specified
in the `estimation_window` argument. For instance, if the linear model is being
estimated on quarterly data and the `estimation_window` is set to `20L`, coefficients
will be estimated using five years of data up to `estimation_end`. If
`estimation_window` is set to `NULL` then the linear model is estimated with all
available data up to `estimation_end`. Coefficients are multiplied by realized
values of the covariates `h_ahead` periods ahead to create an `h_ahead` period
ahead forecast. This process is iteratively repeated for each period after
`estimation_end` with coefficients updating in each period as more information
would have become available to the forecaster. In each period, coefficients are
updated based on all available information if `estimation_window` is set to
`NULL`, or a rolling window of past periods if `estimation_window` is set to an
integer value.
In the sample *i*, for each period *p* greater than or equal to
`estimation_end`, coefficients are updated as:
$$
Y_i = \beta_0 + \beta_1 X1_i + \beta_2 X2_i + \epsilon_i
\qquad \text{for all} \quad p-w \leq i \leq \text{p}
$$
Where *w* is the `estimation_window`. `h_ahead` *h* forecasts are estimated as:
$$
forecast_{p+h} = \beta_0 + \beta_1 X1_{p+h} + \beta_2 X2_{p+h} \qquad
$$
```{r}
oos_realized_forc(
lm_call = lm(y ~ x1 + x2, data),
h_ahead = 2L,
estimation_end = as.Date("2011-03-31"),
time_vec = data$date,
estimation_window = NULL,
return_betas = FALSE
)
```
Note that the `oos_realized_forc` function returns an out-of-sample forecast
that conditions on realized values that **would not** have been available to the
forecaster at the forecast origin. This test evaluates the performance of a
linear forecast model had it been conditioned on perfect information.
## oos_realized_forc_general
The `oos_realized_forc_general()` function produces an *h* period ahead out-of-sample
forecast that is conditioned on realized values. The function takes a model function,
a prediction function, input data for estimating the model, realized values of the
dependent variable, an integer number of periods ahead to forecast, a period to end
the initial coefficient estimation and begin forecasting, a vector of time data associated
with the model, and an optional integer number of past periods to estimate the model over.
The model is originally estimated using the input data and model function with data
up to `estimation_end` minus the the number of periods specified in `estimation_window`.
If `estimation_window` is left `NULL` then the model is estimated with all
available data up to `estimation_end`. Model parameters are then combined with realized
values of the input data `h_ahead` periods ahead to generate an `h_ahead` period
ahead forecast. This process is iteratively repeated for each period after `estimation_end`
with model parameters updating in each period.
In the sample *i*, for each period *p* greater than or equal to
`estimation_end`, coefficients are updated as:
$$
Y_i = \text{model_function}(X_i)
\qquad \text{for all} \quad p-w \leq i \leq \text{p}
$$
Where *X* is the input data and *w* is the `estimation_window`. `h_ahead` *h* forecasts are estimated as:
$$
forecast_{p+h} = \text{prediction_function}(\text{model_function}, X_{p+h}) \qquad
$$
```{r}
forc <- oos_realized_forc_general(
model_function = function(data) {glm(y ~ x1 + x2, data = data, family = binomial)},
prediction_function = function(model_function, data) {
as.vector(predict(model_function, data, type = "response"))
},
data = dataLogit,
realized = dataLogit$y,
h_ahead = 2L,
estimation_end = as.Date("2012-06-30"),
time_vec = dataLogit$date,
estimation_window = NULL
)
```
Note that the `oos_realized_forc_general` function returns an out-of-sample forecast
that conditions on realized values that **would not** have been available to the
forecaster at the forecast origin. This test evaluates the performance of a
forecasting model had it been conditioned on perfect information.
## oos_vintage_forc
The `oos_vintage_forc()` function produces an out-of-sample forecast conditioned
on *h* period ahead forecasts of the linear model covariates. The function takes
a linear model call, a vector of time data associated with the linear model,
a vintage forecast for each covariate in the linear model, and an optional
integer number of past periods to estimate the linear model over. For each period
in the vintage forecasts, coefficients are updated based on information that would
have been available to the forecaster at the forecast origin. Coefficients are
estimated over information from the last `estimation_window` number of periods.
If `estimation_window` is left `NULL` then coefficients are estimated over all
of the information that would have been available to the forecaster. Coefficients
are then multiplied by vintage forecast values to produce a replication of real
time forecasts.
In the sample *i*, for each period *p* in the vintage forecasts *VF1* and *VF2*,
coefficients are updated as:
$$
Y_i = \beta_0 + \beta_1 X1_i + \beta_2 X2_i + \epsilon_i
\qquad \text{for all} \quad p-w \leq i \leq \text{p}
$$
And `h_ahead` *h* forecasts are estimated as:
$$
forecast_{p+h} = \beta_0 + \beta_1 VF1_p + \beta_2 VF2_p \qquad
$$
We introduce stylized vintage forecasts of X1 and X2 to demonstrate the
`oos_vintage_forc()` function. Using four quarter ahead forecasts of the
covariates X1 and X2, we create an out-of-sample forecast based on the
coefficients and covariate forecasts that the forecaster would have used in
each period.
```{r}
x1_forecast_vintage <- Forecast(
origin = as.Date(c("2010-09-30", "2010-12-31", "2011-03-31", "2011-06-30")),
future = as.Date(c("2011-09-30", "2011-12-31", "2012-03-31", "2012-06-30")),
forecast = c(6.30, 4.17, 5.30, 4.84),
realized = c(4.92, 5.80, 6.30, 4.17),
h_ahead = 4L
)
x2_forecast_vintage <- Forecast(
origin = as.Date(c("2010-09-30", "2010-12-31", "2011-03-31", "2011-06-30")),
future = as.Date(c("2011-09-30", "2011-12-31", "2012-03-31", "2012-06-30")),
forecast = c(7.32, 6.88, 6.82, 6.95),
realized = c(8.68, 9.91, 7.87, 6.63),
h_ahead = 4L
)
oos_vintage_forc(
lm_call = lm(y ~ x1 + x2, data),
time_vec = data$date,
x1_forecast_vintage, x2_forecast_vintage,
estimation_window = NULL,
return_betas = FALSE
)
```
This test replicates the forecasts that a linear model conditional on forecasts
of covariates would have produced in real-time. Here we see the strength of the
`Forecast` class. Because the vintage forecasts of X1 and X2 share the same data
structure, we can calculate a forecast that is conditional on these objects
without fearing inconsistency across inputs.
## oos_vintage_forc_general
The `oos_vintage_forc_general()` function produces an out-of-sample forecast conditioned
on *h* period ahead forecasts of the model parameters. The function takes
a model function, prediction function, input data for estimating the model, realized values
of the dependent variable, a vector of time data associated with the model, a forecast for
each parameter in the model, and an optional integer number of past periods to estimate the
model over. For each period in the vintage forecasts, model parameters are estimated with
data up to the current period minus the number of periods specified in `estimation_window`.
If `estimation_window` is left `NULL` then the model is estimated with all available
data up to the current period. Model parameters are then combined with vintage forecast values
to generate a forecast. *Note* that `data` input to the `prediction_function` takes
the form of a data.frame with a number of columns equal to the number of input vintage forecasts
and the `prediction_function` needs to be able to take this input format and generate
a forecast based on it. Returns an out-of-sample forecast conditional on vintage forecasts that
**would** have been available at the forecast origin. Replicates the forecasts that a conditional
forecasting model would have produced in real time.
In the sample *i*, for each period *p* in the vintage forecasts *VF1* and *VF2*,
coefficients are updated as:
$$
Y_i = \text{model_function}(X_i)
\qquad \text{for all} \quad p-w \leq i \leq \text{p}
$$
Where *X* is the input data and *w* is the `estimation_window`. `h_ahead` *h* forecasts are estimated as:
$$
forecast_{p+h} = \beta_0 + \beta_1 VF1_p + \beta_2 VF2_p \qquad
forecast_{p+h} = \text{prediction_function}(\text{model_function}, [VF1_p, VF2_p]) \qquad
$$
We introduce stylized vintage forecasts of X1 and X2 to demonstrate the
`oos_vintage_forc()` function. Using four quarter ahead forecasts of the
covariates X1 and X2, we create an out-of-sample forecast based on the
coefficients and covariate forecasts that the forecaster would have used in
each period.
```{r}
x1_forecast_vintageLogit <- Forecast(
origin = as.Date(c("2012-09-30", "2012-12-31", "2013-03-31", "2013-06-30")),
future = as.Date(c("2013-09-30", "2013-12-31", "2014-03-31", "2014-06-30")),
forecast = c(6.34, 4.17, 2.98, 1.84),
realized = c(5.88, 3.34, 2.92, 1.80),
h_ahead = 4L
)
x2_forecast_vintageLogit <- Forecast(
origin = as.Date(c("2012-09-30", "2012-12-31", "2013-03-31", "2013-06-30")),
future = as.Date(c("2013-09-30", "2013-12-31", "2014-03-31", "2014-06-30")),
forecast = c(7.32, 3.22, 2.21, 2.65),
realized = c(6.09, 2.91, 1.68, 2.91),
h_ahead = 4L
)
oos_vintage_forc_general(
model_function = function(data) {glm(y ~ x1 + x2, data = data, family = binomial)},
prediction_function = function(model_function, data) {
names(data) <- c("x1", "x2")
as.vector(predict(model_function, data, type = "response"))
},
data = dataLogit,
realized = dataLogit$y,
time_vec = dataLogit$date,
x1_forecast_vintageLogit, x2_forecast_vintageLogit,
estimation_window = NULL
)
```
*Note* that in the example above the `prediction_function` is adapted to take a specific data.frame
as input and generate a forecast based on it. The form of this specific data.frame is printed to
the console for reference. This test replicates the forecasts that a model conditional on forecasts
of the model parameters would have produced in real-time. Here we see the strength of the
`Forecast` class. Because the vintage forecasts of X1 and X2 share the same data
structure, we can calculate a forecast that is conditional on these objects
without fearing inconsistency across inputs.
## conditional_forc
The `conditional_forc()` function produces a forecast conditioned on forecasts
of the linear model covariates. The function takes a linear model call, a vector
of time data associated with the linear model, and a forecast for each covariate
in the linear model. The linear model is estimated once over the entire sample
period and the coefficients are multiplied by the forecasts of each covariate.
For all observations *i* in the sample, coefficients are estimated as:
$$
Y_i = \beta_0 + \beta_1 X1_i + \beta_2 X2_i + \epsilon_i
\qquad \text{for all } i
$$
And for all periods *p* in the covariate forecasts *F1* and *F2*, forecasts are estimated as:
$$
forecast_{p+h} = \beta_0 + \beta_1 F1_p + \beta_2 F2_p \qquad
$$
The difference between `conditional_forc()` and `oos_vintage_forc()` is that in
the `conditional_forc()` function coefficients are only estimated once over all
observations. Coefficients do not update based on what information would have been
available to the forecaster at any given point in time. We introduce stylized
forecasts of X1 and X2 to demonstrate the `conditional_forc()` function. Because
in this example we are making a conditional forecast for the future instead of testing past forecasts, we can condition on a horizon of forecasts. This is in contrast
to the `oos_vintage_forc()` example where we test the performance of four quarter
ahead vintage forecasts.
```{r}
x1_forecast <- Forecast(
origin = as.Date(c("2012-06-30", "2012-06-30", "2012-06-30", "2012-06-30")),
future = as.Date(c("2012-09-30", "2012-12-31", "2013-03-31", "2013-06-30")),
forecast = c(4.14, 4.04, 4.97, 5.12),
realized = NULL,
h_ahead = NULL
)
x2_forecast <- Forecast(
origin = as.Date(c("2012-06-30", "2012-06-30", "2012-06-30", "2012-06-30")),
future = as.Date(c("2012-09-30", "2012-12-31", "2013-03-31", "2013-06-30")),
forecast = c(6.01, 6.05, 6.55, 7.45),
realized = NULL,
h_ahead = NULL
)
conditional_forc(
lm_call = lm(y ~ x1 + x2, data),
time_vec = data$date,
x1_forecast, x2_forecast
)
```
This function is used to create a forecast for the present period or replicate
a forecast made at a specific period in the past. Note that because we are
forecasting into the future, `realized` is `NULL`. Also, because we are
forecasting a horizon of dates, `h_ahead` is `NULL`.
## conditional_forc_general
The `conditional_forc_general()` function produces a forecast conditioned on forecasts
of the model parameters. The function takes a model function, prediction function, input
data for estimating the model, a vector of time data associated with the model, and forecasts
for each parameter in the model. The model is estimated once over the entire sample period
and the model parameters are combined with the forecasts of each parameter to generate a forecast.
For all observations *i* in the sample, coefficients are estimated as:
$$
Y_i = \text{model_function}(X_i)
\qquad \text{for all } i
$$
And for all periods *p* in the covariate forecasts *F1* and *F2*, forecasts are estimated as:
$$
forecast_{p+h} = \text{prediction_function}(\text{model_function}, [VF1_p, VF2_p]) \qquad
$$
The difference between `conditional_forc_general()` and `oos_vintage_forc_general()` is that in
the `conditional_forc_general()` function coefficients are only estimated once over all
observations. Coefficients do not update based on what information would have been
available to the forecaster at any given point in time. We introduce stylized
forecasts of X1 and X2 to demonstrate the `conditional_forc_general()` function. Because
in this example we are making a conditional forecast for the future instead of testing past forecasts, we can condition on a horizon of forecasts. This is in contrast
to the `oos_vintage_forc_general()` example where we test the performance of four quarter
ahead vintage forecasts.
```{r}
# Parameter Forecasts.
x1_forecastLogit <- Forecast(
origin = as.Date(c("2013-12-31", "2013-12-31", "2013-12-31", "2013-12-31")),
future = as.Date(c("2014-03-31", "2014-06-30", "2014-09-30", "2014-12-31")),
forecast = c(2.11, 6.11, 6.75, 4.30),
realized = NULL,
h_ahead = NULL
)
x2_forecastLogit <- Forecast(
origin = as.Date(c("2013-12-31", "2013-12-31", "2013-12-31", "2013-12-31")),
future = as.Date(c("2014-03-31", "2014-06-30", "2014-09-30", "2014-12-31")),
forecast = c(1.98, 7.44, 7.86, 5.98),
realized = NULL,
h_ahead = NULL
)
conditional_forc_general(
model_function = function(data) {glm(y ~ x1 + x2, data = data, family = binomial)},
prediction_function = function(model_function, data) {
names(data) <- c("x1", "x2")
as.vector(predict(model_function, data, type = "response"))
},
data = dataLogit,
time_vec = dataLogit$date,
x1_forecastLogit, x2_forecastLogit
)
```
This function is used to create a forecast for the present period or replicate
a forecast made at a specific period in the past. Note that because we are
forecasting into the future, `realized` is `NULL`. Also, because we are
forecasting a horizon of dates, `h_ahead` is `NULL`.
## oos_lag_forc
The `oos_lag_forc()` function produces an *h* period ahead out-of-sample
forecast conditioned on present period values. The function takes a linear model
call, an integer number of periods ahead to forecast, a period to end the
initial coefficient estimation and begin forecasting, an optional vector of
time data associated with the linear model, and an optional integer number of
past periods to estimate the linear model over. The linear model data is lagged by
`h_ahead` periods and the linear model is re-estimated with data up to
`estimation_end` minus the number of periods specified in the `estimation_window`
argument to create a lagged linear model. If `estimation_window` is left `NULL`
then the linear model is estimated with all available lagged data up to
`estimation_end`. Coefficients are multiplied by present period realized values
of the covariates to create a forecast for `h_ahead` periods ahead. This process
is iteratively repeated for each period after `estimation_end` with coefficients
updating in each period as more information would have become available to the
forecaster. In each period, coefficients are updated based on all available
information if `estimation_window` is set to `NULL`, or a rolling window of past
periods if `estimation_window` is set to an integer value.
In the sample *i*, for each period *p* greater than or equal to
`estimation_end`, coefficients are updated as:
$$
Y_i = \beta_0 + \beta_1 X1_{i-h} + \beta_2 X2_{i-h} + \epsilon_i
\qquad \text{for all} \quad p-w \leq i \leq \text{p}
$$
Where *w* is the `estimation_window` and *h* is `h_ahead`. `h_ahead` forecasts are estimated as:
$$
forecast_{p+h} = \beta_0 + \beta_1 X1_{p} + \beta_2 X2_{p} \qquad
$$
```{r}
oos_lag_forc(
lm_call = lm(y ~ x1 + x2, data),
h_ahead = 2L,
estimation_end = as.Date("2011-03-31"),
time_vec = data$date,
estimation_window = NULL,
return_betas = FALSE
)
```
This test evaluates the performance of a lagged linear model had it been
conditioned on present values that **would** have been available to the
forecaster at the forecast origin. This is in contrast to conditioning on
realized values or a forecast of the covariates.
## historical_average_forc
The `historical_average_forc()` function produces an *h* period ahead forecast based
on the historical average of the series that is being forecasted. The function takes
an average function, a vector of realized values, an integer number of periods ahead
to forecast, a period to end the initial average estimation and begin forecasting,
an optional vector of time data associated with the realized values, and an optional
integer number of past periods to estimate the average over. The historical average
is originally calculated with realized values up to `estimation_end` minus the number
of periods specified in `estimation_window`. If `estimation_window` is left `NULL`
then the historical average is calculated with all available realized values up to
`estimation_end`. In each period the historical average is set as the `h_ahead` period
ahead forecast. This process is iteratively repeated for each period after `estimation_end`
with the historical average updating in each period as more information would have become
available to the forecaster.
If `avg_function` is set to `mean`, in the sample *i*, for each period *p* greater than or
equal to `estimation_end`, `h_ahead` *h* forecasts are calculated as:
$$
forecast_{p+h} = \frac{1}{p-w} \sum_{i=p-w}^{p}{Y_i} \qquad
$$
Where *Y* is the series being forecasted and *w* is the `estimation_window`.
```{r}
historical_average_forc(
avg_function = "mean",
realized_vec = data$y,
h_ahead = 2L,
estimation_end = as.Date("2011-03-31"),
time_vec = data$date,
estimation_window = 4L
)
```
`historical_average_forc()` returns a historical average forecast where the `h_ahead`
period ahead forecast is simply the historical average or rolling window average of the
series being forecasted. This replicates the historical average forecast that would
have been produced in real-time and can serve as a benchmark for other forecasting models.
## random_walk_forc
The `random_walk_forc()` function produces an *h* period ahead forecast based
on the last realized value in the series that is being forecasted. The function
takes a vector of realized values, an integer number of periods ahead to forecast,
and an optional vector of time data associated with the realized values. In each
period, the current period value of the `realized_vec` series is set as the
`h_ahead` period ahead forecast.
In the sample *i*, for each period *p*, `h_ahead` *h* forecasts are calculated as:
$$
forecast_{p+h} = Y_p
$$
Where *Y* is the series being forecasted.
```{r}
random_walk_forc(
realized_vec = data$y,
h_ahead = 6L,
time_vec = data$date
)
```
`random_walk_forc()` returns a random walk forecast where the `h_ahead`
period ahead forecast is simply the present value of the series being forecasted.
This replicates the random walk forecast that a forecaster would have produced
in real-time and can serve as a benchmark for other forecasting models.
## autoreg_forc
The `autoreg_forc()` function produces an *h* period ahead forecast based
on an autoregressive (AR) model. The function takes a vector of realized values,
an integer number of periods ahead to forecast, an integer number of lags to include
in the autoregressive model, a period to end the initial model estimation and begin
forecasting, an optional vector of time data associated with the realized values,
and an optional integer number of past periods to estimate the autoregressive model
over. An AR(`ar_lags`) autoregressive model is originally estimated with realized values up to
`estimation_end` minus the number of periods specified in `estimation_window`. If
`estimation_window` is left `NULL` then the autoregressive model is estimated with
all realized values up to `estimation_end`. The AR(`ar_lags`) model is estimated by
regressing the vector of realized values on vectors of the same realized values that
have been lagged by one to `ar_lags` steps. The AR coefficients of this model are
multiplied by lagged values and the present period realized value to create a one
period ahead forecast. If `h_ahead` is greater than one, the one period ahead forecasting
process is iteratively repeated so that the two period ahead forecast conditions on
the one period ahead forecasted value. This process of rolling one period ahead forecasts
forward continues until an `h_ahead` forecast is obtained. The `h_ahead` forecasting process
is repeated for each period after `estimation_end` with AR model coefficients updating as
more information would have become available to the forecaster. In each period, coefficients
are updated based on all available realized values if `estimation_window` is set to `NULL`,
or a rolling window of past periods if `estimation_window` is set to an integer value.
In the sample *i* with `ar_lags` set to two and `h_ahead` set to two. For each period *p*
greater than or equal to `estimation_end`, coefficients are calculated as:
$$
Y_i = \beta_0 + \beta_1 Y_{i-1} + \beta_2 Y_{i-2}
\qquad \text{for all} \quad p-w \leq i \leq \text{p}
$$
Where *Y* is the series being forecasted and *w* is the `estimation_window`.
`h_ahead` two step ahead forecasts are estimated as:
$$
Y_{p+1} = \beta_0 + \beta_1 Y_p + \beta_2 Y_{p-1} \\
forecast_{p+h} = Y_{p+2} = \beta_0 + \beta_1 Y_{p+1} + \beta_2 Y_{p}
$$
```{r}
autoreg_forc(
realized_vec = data$y,
h_ahead = 2L,
ar_lags = 2L,
estimation_end = as.Date("2011-06-30"),
time_vec = data$date,
estimation_window = NULL,
return_betas = FALSE
)
```
`autoreg_forc()` returns an autoregressive forecast based on information that would
have been available at the forecast origin. This function replicates the forecasts
than an AR model would have produced in real-time and can serve as a
benchmark for other forecasting models.
## performance_weighted_forc
The `performance_weighted_forc()` function produces a weighted average of multiple
forecasts based on the recent performance of each forecast. The function takes
two or more forecasts of the `Forecast` class, an evaluation window, and an
error function. For each forecast period, the error function is used to calculate
forecast accuracy over the past `eval_window` number of periods. The forecast
accuracy of each forecast is then used to weight forecasts based on a weighting function.
In each period, weights are calculated and used to create a weighted average forecast.
We use a stylized example in which we create a weighted forecast of two forecasts:
Y1 and Y2.
For all periods *p* in the *k* number of forecasts *Y*, weights *W* are calculated over the `eval_window` *e* as:
$$
W_k = \frac{1/MSE(Y_{ki})}{1/\sum_{k=1}^{k}MSE(Y_{ki})} \qquad \text{where}
\quad i = p-e \leq i \leq p
$$
Forecasts are estimated as:
$$
forecast_p = Y1_pW_1 + Y2_pW_2
$$
```{r}
y1_forecast <- Forecast(
origin = as.Date(c("2009-03-31", "2009-06-30", "2009-09-30", "2009-12-31",
"2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31",
"2011-03-31", "2011-06-30")),
future = as.Date(c("2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31",
"2011-03-31", "2011-06-30", "2011-09-30", "2011-12-31",
"2012-03-31", "2012-06-30")),
forecast = c(1.33, 1.36, 1.38, 1.68, 1.60, 1.55, 1.32, 1.22, 1.08, 0.88),
realized = c(1.09, 1.71, 1.09, 2.46, 1.78, 1.35, 2.89, 2.11, 2.97, 0.99),
h_ahead = 4L
)
y2_forecast <- Forecast(
origin = as.Date(c("2009-03-31", "2009-06-30", "2009-09-30", "2009-12-31",
"2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31",
"2011-03-31", "2011-06-30")),
future = as.Date(c("2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31",
"2011-03-31", "2011-06-30", "2011-09-30", "2011-12-31",
"2012-03-31", "2012-06-30")),
forecast = c(0.70, 0.88, 1.03, 1.05, 1.01, 0.82, 0.95, 1.09, 1.07, 1.06),
realized = c(1.09, 1.71, 1.09, 2.46, 1.78, 1.35, 2.89, 2.11, 2.97, 0.99),
h_ahead = 4L
)
performance_weighted_forc(
y1_forecast, y2_forecast,
eval_window = 2L,
errors = "mse",
return_weights = FALSE
)
```
The `performance_weighted_forc()` function returns a weighted forecast of the Y1 and Y2
forecasts based on performance in recent periods. The weights used in each
period can be returned to the Global Environment by setting `return_weights` to
`TRUE`. Note that although we were only weighting performance over the past two
periods, we have five `NA` forecasts. This reflects the *lmForc* philosophy of
replicating what it would be like to forecast in real-time. If a forecaster was
making a forecast at `2010-06-30`, they would only have access to realized values
up to `2010-06-30`, in this case the first two rows. This is why a weighted forecast
with an `eval_window` of two can only be computed once the forecast origin becomes
`2010-06-30` and the forecaster has access to two realized values. This function
can be used to compute a weighted forecast for the present period or to test how
a weighted forecast would have performed historically. The weighted forecasts are
based on information that **would** have been available to the forecaster at the
forecast origin.
## states_weighted_forc
The `states_weighted_forc()` function produces a weighted average of multiple
forecasts based on how each forecast performed during the past state of the world that is
most similar to the current state of the world. The function takes two or more
forecasts, a data frame, matrix, or array of matching variables, an optional
vector of time data associated with the matching variables, a matching window
size, a matching function, and an error function. The first step of the weighted
forecast process is to match the current state of the world to a similar past
state of the world. For each forecast period, the `matching_vars` are standardized
and the past `matching_window` periods of the matching variables are considered
as the current state of the world. This current state of the world is compared to
all past `matching_window` size periods of the matching variables. The current
state is matched to the past state that minimizes the user selected matching function.
For example, if `matching` is set to `euclidean` then the matched past state is
the past state which has the minimum euclidean distance to the current state of
the world. The objective is to select a past period that is similar to the current
state of the world as given by the matching variables. Once a past state has been
matched, the accuracy of each forecast is calculated over the periods of the past
state according to the user selected error function. Forecast weights are then computed
based on forecast accuracy during the past state. The objective is to give more weight to
the forecasts that perform better in conditions that reflect the current state
of the world. The forecast weights are then used to create a weighted forecast for
the current period.
We use a stylized example with one matching variable *x* as well as two forecasts
*Y1* and *Y2*.
The matching variable *x* is first standardized using the function:
$$
x_i = \frac{x_i - mean(x)}{sd(x)}
$$
For all periods *p*, the current state of the world *c* and past states of the
world *p* are calculated as:
$$
c_i = x_i \qquad \text{where} \quad p-w \leq i \leq p \\
\qquad \qquad \qquad \qquad p_{id} = x_i \qquad \text{where} \quad d-w \leq i \leq d \quad \text{for all} \quad d \lt p-w
$$
Where *w* is the `matching_window` size and *d* are all periods that occur before
the beginning of the current state.
All possible past states are passed through the matching function and the matched
past state is selected as the past state that minimizes the matching function.
If `matching` is set to `euclidean`, the matched past state *p* is the past
state that satisfies the following:
$$
p = \min \sqrt{\sum_{i=1}^{n}(c_i - p_{id})^2} \qquad \text{for all past states } d
$$
Forecast accuracy and forecast weights are computed over observations from the
matched past state *p*. If `errors` is set to `mse` then the forecast weights
*W* for each forecast *k* are calculated as.
$$
W_k = \frac{1/MSE(Y_{kp})}{1/\sum_{k=1}^{k}MSE(Y_{kp})}
$$
The current period forecast is then calculated as:
$$
forecast_p = Y1_pW_1 + Y2_pW_2
$$
```{r}
date <- as.Date(c("2010-03-31", "2010-06-30", "2010-09-30", "2010-12-31",
"2011-03-31", "2011-06-30", "2011-09-30", "2011-12-31",
"2012-03-31", "2012-06-30"))
future <- as.Date(c("2011-03-31", "2011-06-30", "2011-09-30", "2011-12-31",
"2012-03-31", "2012-06-30", "2012-09-30", "2012-12-31",
"2013-03-31", "2013-06-30"))
y <- c(1.09, 1.71, 1.09, 2.46, 1.78, 1.35, 2.89, 2.11, 2.97, 0.99)
x1 <- c(4.22, 3.86, 4.27, 5.60, 5.11, 4.31, 4.92, 5.80, 6.30, 4.17)
x2 <- c(10.03, 10.49, 10.85, 10.47, 9.09, 10.91, 8.68, 9.91, 7.87, 6.63)
data <- data.frame(date, y, x1, x2)
matching_vars <- data[, c("x1", "x2")]
y1_forecast <- Forecast(
origin = date,
future = future,
forecast = c(1.33, 1.36, 1.38, 1.68, 1.60, 1.55, 1.32, 1.22, 1.08, 0.88),
realized = c(1.78, 1.35, 2.89, 2.11, 2.97, 0.99, 1.31, 1.41, 1.02, 1.05),
h_ahead = 4L
)
y2_forecast <- Forecast(
origin = date,
future = future,
forecast = c(0.70, 0.88, 1.03, 1.05, 1.01, 0.82, 0.95, 1.09, 1.07, 1.06),
realized = c(1.78, 1.35, 2.89, 2.11, 2.97, 0.99, 1.31, 1.41, 1.02, 1.05),
h_ahead = 4L
)
states_weighted_forc(
y1_forecast, y2_forecast,
matching_vars = matching_vars,
time_vec = data$date,
matching_window = 2L,
matching = "euclidean",
errors = "mse",
return_weights = FALSE
)
```
The `states_weighted_forc()` function returns a weighted forecast of the Y1 and
Y2 forecasts based on how these forecasts performed in past states of the world
that resemble the current state of the world. The weights used in each period
and a list of the matched states can be returned to the Global Environment by
setting `return_weights` to `TRUE`. This function can be used to compute a
states weighted forecast for the present period or to test how a states weighted
forecast would have performed historically. The states weighted forecasts are
based on information that **would** have been available to the forecaster
at the forecast origin.
## Subsetting Functions Overview
The Forecast class comes with a built in method for subsetting a single Forecast object. This subsetting method takes numeric or logical values and follows subsetting conventions that are present throughout the R language.
`forc[2:4]`
`forc[origin(forc1_1h) >= as.Date("2010-12-31")]`
However, one often ends up working with multiple Forecast objects. Examples
include working with different model forecasts for the same forecast horizon, one model forecast for varying forecast horizons, or both. The lmForc convention for working with multiple Forecast objects is to put them into a list. The following functions provide a way to subset lists of Forecast objects by various conditions.
### Subsetting Example Dataset
Examples of lmForc subsetting functions utilize the following stylized dataset.
This example dataset contains one-quarter ahead forecasts produced by two different
models, `forc1` and `forc2`. Note that both forecasts have identical `future`,
`realized`, and `h_ahead` values, but that the `origin` dates of the last two forecasts differ. This becomes relevant when both forecast models are subset
to identical `origin` values.
```{r example101, message=FALSE}
forc1_1h <- Forecast(
origin = as.Date(c("2010-02-17", "2010-05-14", "2010-07-22", "2010-12-05", "2011-03-10")),
future = as.Date(c("2010-06-30", "2010-09-30", "2010-12-31", "2011-03-31", "2011-06-30")),
forecast = c(4.27, 3.36, 4.78, 5.45, 5.12),
realized = c(4.96, 4.17, 4.26, 4.99, 5.38),
h_ahead = 1
)
forc2_1h <- Forecast(
origin = as.Date(c("2010-02-17", "2010-05-14", "2010-07-22", "2010-12-22", "2011-03-27")),
future = as.Date(c("2010-06-30", "2010-09-30", "2010-12-31", "2011-03-31", "2011-06-30")),
forecast = c(4.01, 3.89, 3.31, 4.33, 4.61),
realized = c(4.96, 4.17, 4.26, 4.99, 5.38),
h_ahead = 1
)
```
## subset_forcs
The simplest way to subset multiple Forecast objects is via the `subset_forcs()`
function. This function takes a list of Forecast objects and a numeric or logical
index. All forecasts in the list are subset by the numerical or logical values
that are passed to the `index` argument.
For example, a list of Forecast objects can be subset by a condition:
`subset_forcs(forcs, origin(forc1_1h) >= as.Date("2010-12-31"))`
```{r example104, message=FALSE}
forcs <- list(forc1_1h, forc2_1h)
subset_forcs(forcs, 2:3)
```
## subset_bytime
One may want to compare forecasts over a specific time horizon. The `subset_bytime()` function allows the user to subset multiple forecasts based on `origin` or `future` values. After using the `slot` argument to choose whether to subset by `origin` or `future` values, the user passes a single time object
or vector of time objects to the `values` argument. All forecasts in the list of
Forecast objects are subset by `values`.
For example, to see all of the forecasts that were made on a specific date:
`subset_bytime(forcs, values = as.Date("2010-05-14"), slot = "origin")`
```{r example105, message=FALSE}
forcs <- list(forc1_1h, forc2_1h)
subset_bytime(
forcs,
values = as.Date(c("2010-09-30", "2010-12-31", "2011-03-31")),
slot = "future"
)
```
## subset_identical
When comparing multiple forecasts, it is imperative that forecast accuracy is
compared across an identical time period. This can become complicated if the `origin` and `future` values of multiple forecasts do not always align. The `subset_identical()` function finds all overlapping `origin` or `future` values in a list of Forecast objects and subsets each of the forecasts to these overlapping values. This leaves the user with a list of Forecasts that have either identical `origin` values or identical `future` values depending on what the user passes to the `slot` argument.
```{r example106, message=FALSE}
forcs <- list(forc1_1h, forc2_1h)
subset_identical(forcs, slot = "origin")
```
## Transformation Functions Overview
Forecasts are often produced for multiple `h_ahead` horizons into the future. For
example, a model may produce a 1-quarter ahead, 2-quarter ahead, 3-quarter ahead,
and 4-quarter ahead forecast during each quarter of the year. In this example, multiple Forecast objects are needed to capture the forecast made during each quarter. As per lmForc convention, one would work with these forecasts by putting them into a list. When working with list of forecasts for multiple `h_ahead` horizons into the future, there are two general formats in which the forecasts can be organized. These two formats are: **Time Format** and **h_ahead Format**.
### Time Format
**Time Format** consists of a list of Forecast objects where each forecast has
homogenous `origin` or `future` values. Each Forecast object in the list was made at the
same time or contains forecasts for the same future time. However, the `h_ahead`
forecast horizon differs within each Forecast object. **Time Format** is used to
represent forecasts made at a single `origin` time for multiple `h_ahead` horizons.
The following is an example of forecasts in **Time Format**. Each Forecast object
represents a set of 1-quarter, 2-quarter, and 3-quarter ahead forecasts
made at a single `origin` time during each quarter of 2010. Note that because each Forecast object contains forecasts for multiple `h_ahead` horizons, `h_ahead` is set to `NA`. We place all of these forecasts into a list of Forecast objects that is
in **Time Format** and assign it to `forcs_time_format`.
```{r example108, message=FALSE}
forc1_t1 <- Forecast(
origin = as.Date(c("2010-02-17", "2010-02-17", "2010-02-17")),
future = as.Date(c("2010-06-30", "2010-09-30", "2010-12-31")),
forecast = c(4.27, 3.77, 3.52),
realized = c(4.96, 4.17, 4.26),
h_ahead = NA
)
forc1_t2 <- Forecast(
origin = as.Date(c("2010-05-14", "2010-05-14", "2010-05-14")),
future = as.Date(c("2010-09-30", "2010-12-31", "2011-03-31")),
forecast = c(3.36, 3.82, 4.22),
realized = c(4.17, 4.26, 4.99),
h_ahead = NA
)
forc1_t3 <- Forecast(
origin = as.Date(c("2010-07-22", "2010-07-22", "2010-07-22")),
future = as.Date(c("2010-12-31", "2011-03-31", "2011-06-30")),
forecast = c(4.78, 4.53, 5.03),
realized = c(4.26, 4.99, 5.33),
h_ahead = NA
)
forc1_t4 <- Forecast(
origin = as.Date(c("2010-12-22", "2010-12-22", "2010-12-22")),
future = as.Date(c("2011-03-31", "2011-06-30", "2011-09-30")),
forecast = c(5.45, 4.89, 5.78),
realized = c(4.99, 5.33, 5.21),
h_ahead = NA
)
forcs_time_format <- list(forc1_t1, forc1_t2, forc1_t3, forc1_t4)
```
### h_ahead Format
**h_ahead Format** consists of a list of Forecast objects where each forecast has
homogenous `h_ahead` values but the `origin` and `future` values vary. The
**h_ahead Format** is useful for analyzing forecasts at a specific `h_ahead`
horizon. For example, one may want to calculate the forecast accuracy of
4-quarter ahead forecasts only. In this case it is useful to have multiple forecasts
arranged by homogenous `h_ahead` values.
The following is an example of forecasts in **h_ahead Format**. Each Forecast object
represents all of the 1-quarter, 2-quarter, and 3-quarter ahead forecasts made during different quarters of 2010. Note that because each Forecast object has a homogenous `h_ahead` horizon we can now set `h_ahead` to the appropriate value. These forecasts are collected into a list of Forecast objects that is in **h_ahead Format** and assigned to `forcs_h_ahead_format`.
```{r example109, message=FALSE}
forc1_1h <- Forecast(
origin = as.Date(c("2010-02-17", "2010-05-14", "2010-07-22", "2010-12-22")),
future = as.Date(c("2010-06-30", "2010-09-30", "2010-12-31", "2011-03-31")),
forecast = c(4.27, 3.36, 4.78, 5.45),
realized = c(4.96, 4.17, 4.26, 4.99),
h_ahead = 1
)
forc1_2h <- Forecast(
origin = as.Date(c("2010-02-17", "2010-05-14", "2010-07-22", "2010-12-22")),
future = as.Date(c("2010-09-30", "2010-12-31", "2011-03-31", "2011-06-30")),
forecast = c(3.77, 3.82, 4.53, 4.89),
realized = c(4.17, 4.26, 4.99, 5.33),
h_ahead = 2
)
forc1_3h <- Forecast(
origin = as.Date(c("2010-02-17", "2010-05-14", "2010-07-22", "2010-12-22")),
future = as.Date(c("2010-12-31", "2011-03-31", "2011-06-30", "2011-09-30")),
forecast = c(3.52, 4.22, 5.03, 5.78),
realized = c(4.26, 4.99, 5.33, 5.21),
h_ahead = 3
)
forcs_h_ahead_format <- list(forc1_1h, forc1_2h, forc1_3h)
```
## convert_bytime
Given a list of forecasts in **h_ahead Format**, one may want to convert one or
multiple of these forecasts into **Time Format**. The function `convert_bytime()`
takes a list of Forecast objects in **h_ahead Format** and converts the forecasts
made on the time specified in the `value` and `slot` arguments into Forecast objects
that are in **Time Format**. Note that because we are converting to **Time Format**, the `h_ahead` value in each Forecast object is changed to `NA`.
```{r example110, message=FALSE}
convert_bytime(
forcs_h_ahead_format,
value = as.Date(c("2010-07-22", "2010-12-22")),
slot = "origin"
)
```
## transform_bytime
Given a list of forecasts in **h_ahead Format** one can convert all of the forecasts
to **Time Format** using the `transform_bytime()` function. This function transforms
all Forecast objects in `forcs` to a list of **Time Format** Forecast objects that
have homogenous `origin` or `future` values depending on what the user specifies
in the `slot` argument. The difference between `transform_bytime()` and
`convert_bytime()` is that transforming automatically converts all forecasts
in the list while converting only converts the forecasts specified by the user.
```{r example111, message=FALSE}
transform_bytime(forcs_h_ahead_format, slot = "origin")
```
Note that the output of `transform_bytime()` above is identical to the list of
Forecast objects in `forcs_time_format`. One can continually transform
between **Time Format** and **h_ahead Format** without losing information. This is evidenced by the fact that:
`transform_bytime(forcs_h_ahead_format, slot = "origin") == forcs_time_format`
and
`transform_byh(forcs_time_format, h_aheads = c(1, 2, 3)) == forcs_h_ahead_format`
## convert_byh
The inverse of `convert_bytime()` is `convert_byh()`. Given a list of forecasts
in **Time Format** `convert_byh()` converts one or multiple of these forecasts into **h_ahead Format**. The functions takes a list of Forecast objects in **Time Format**
and converts the forecasts specified by the `index` argument into Forecast objects
in **h_ahead Format**. Because forecasts that are in **Time Format** do not have
`h_ahead` values, the function allows the user to assign `h_ahead` values to the
converted Forecast objects via the `h_aheads` argument.
```{r example112, message=FALSE}
convert_byh(forcs_time_format, index = 1:2, h_aheads = c(1, 2))
```
## transform_byh
Given a list of forecasts in **Time Format** one can convert all of the forecasts
to **h_ahead Format** using the `transform_byh()` function. This function transforms
all Forecast objects in `forcs` to a list of **h_ahead Format** Forecast objects that
have homogenous `h_ahead` values. `h_ahead` values are assigned to each converted
Forecast object based on the values passed to the `h_aheads` argument.
The difference between `transform_byh()` and `convert_byh()` is that transforming automatically converts all forecasts in the list while converting only converts the forecasts specified by the user.
```{r example113, message=FALSE}
transform_byh(forcs_time_format, h_aheads = c(1, 2, 3))
```
Note that the output of `transform_byh()` above is identical to the list of
Forecast objects in `forcs_h_ahead_format`. One can continually transform
between **Time Format** and **h_ahead Format** without losing information. This is evidenced by the fact that:
`transform_byh(forcs_time_format, h_aheads = c(1, 2, 3)) == forcs_h_ahead_format`
and
`transform_bytime(forcs_h_ahead_format, slot = "origin") == forcs_time_format`