--- title: "Replicating Agustini et al. (2026) with accuracylevel" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Replicating Agustini et al. (2026) with accuracylevel} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4 ) ``` This vignette reproduces the main results of > Agustini, M., Fithriasari, K., & Prastyo, D.D. (2026). An accuracy-level > method for robust evaluation in predictive analytics. *Decision Analytics > Journal*, 18, 100661. It covers the **simple case** (Table 4--6), the **regression-with-outliers** simulation, and the **time-series** case, plus the integration helpers for `caret`, `tidymodels`, and `forecast`. The **imputation** case study is intentionally omitted: it relies on confidential firm-level microdata from BPS-Statistics Indonesia that cannot be redistributed. ## A note on data The package ships **no datasets**. The article's data come from public sources that are referenced by link only: * simple linear regression and candy-production series on Kaggle (the candy series originates from FRED series `IPG3113N`, public domain); * firm turnover microdata from BPS-Statistics Indonesia (confidential, not redistributable). To keep this vignette fully reproducible and free of downloads, the regression and time-series sections below use **small simulated datasets generated inline** with a fixed seed. Code for downloading the real public series is shown but not executed. ```{r load} library(accuracylevel) ``` # 1. Simple case (Table 4--6) The ten observations of Table 4 are reproduced exactly. Model 1 is the baseline and the threshold uses the second quartile (Q2) of the error, where the APE is closest to 0.10. ```{r simple-data} actual <- c(7.00, 6.03, 2.02, 5.10, 9.00, 1.00, 3.00, 4.38, 1.00, 8.07) model1 <- c(6.05, 5.02, 1.32, 5.15, 8.00, 2.20, 2.70, 3.48, 1.00, 7.56) model2 <- c(8.10, 7.04, 2.12, 5.20, 9.10, 1.00, 3.08, 4.40, 1.00, 6.15) model3 <- c(7.01, 6.04, 2.09, 5.11, 9.01, 5.10, 3.01, 4.39, 1.00, 8.10) ``` ### Conventional and robust metrics (Table 5) ```{r simple-conv} conv <- rbind( Model1 = conventional_metrics(actual, model1), Model2 = conventional_metrics(actual, model2), Model3 = conventional_metrics(actual, model3) ) round(conv, 4) rob <- rbind( Model1 = robust_metrics(actual, model1), Model2 = robust_metrics(actual, model2), Model3 = robust_metrics(actual, model3) ) round(rob, 4) ``` ### Accuracy-level metrics (Table 6) ```{r simple-al} thresh <- calculate_threshold(actual, model1, error_type = "ape", quartile = 2) thresh al1 <- accuracy_level(actual, model1, threshold = thresh) al2 <- accuracy_level(actual, model2, threshold = thresh) al3 <- accuracy_level(actual, model3, threshold = thresh) al1$metrics al2$metrics al3$metrics ``` These match Table 6 of the paper exactly: Model 3 reaches 90% at Level 1 for every metric, while conventional metrics favour Model 2. ```{r simple-compare} res <- compare_models( Model1 = list(actual = actual, predicted = model1), Model2 = list(actual = actual, predicted = model2), Model3 = list(actual = actual, predicted = model3), metric = "cape", threshold = thresh ) res$optimal_model res$comparison[, c("Model", "L1", "L2", "L3", "L4")] ``` # 2. Regression with outliers The article injects outliers into a clean regression and shows that the proposed metrics degrade gradually, unlike RMSE/MAPE. Here we mimic that with a small simulated dataset. ```{r reg-sim} set.seed(2026) n <- 200 x <- runif(n, 0, 100) y <- 1.5 * x + rnorm(n, 0, 3) # clean response pred_clean <- 1.5 * x # model prediction # 5% outliers injected into the response y_out <- y idx <- sample(n, size = 0.05 * n) y_out[idx] <- y_out[idx] + 80 baseline <- calculate_threshold(y, pred_clean, error_type = "ape", quartile = 3) clean <- conventional_metrics(y, pred_clean) dirty <- conventional_metrics(y_out, pred_clean) rbind(clean = round(clean, 3), with_outliers = round(dirty, 3)) ``` Conventional metrics (especially MAPE/RMSE) jump sharply. The proposed Level-1 accuracy, evaluated against a fixed baseline threshold, moves far less: ```{r reg-al} al_clean <- accuracy_level(y, pred_clean, threshold = baseline) al_dirty <- accuracy_level(y_out, pred_clean, threshold = baseline) data.frame( scenario = c("clean", "with_outliers"), CSE_L1 = c(al_clean$metrics$CSE[1], al_dirty$metrics$CSE[1]), CAPE_L1 = c(al_clean$metrics$CAPE[1], al_dirty$metrics$CAPE[1]) ) ``` # 3. Time-series case The paper forecasts U.S. candy production (FRED `IPG3113N`). That series is public; you could load it directly: ```{r ts-download, eval = FALSE} # Public source (not run during vignette build): # https://www.kaggle.com/code/goldens/candy-production-time-series-analysis candy <- read.csv("candy_production.csv") ``` For a self-contained demonstration we simulate a seasonal series and compare two naive forecasts. ```{r ts-sim} set.seed(1) m <- 48 trend <- seq(80, 120, length.out = m) season <- 10 * sin(2 * pi * (1:m) / 12) candy <- trend + season + rnorm(m, 0, 3) fc_seasonal <- c(candy[1:12], candy[1:(m - 12)]) # seasonal-naive-like fc_mean <- rep(mean(candy), m) # mean forecast base_ts <- calculate_threshold(candy, fc_seasonal, error_type = "ape", quartile = 2) data.frame( model = c("seasonal", "mean"), CSE_L1 = c(accuracy_level(candy, fc_seasonal, threshold = base_ts)$metrics$CSE[1], accuracy_level(candy, fc_mean, threshold = base_ts)$metrics$CSE[1]) ) ``` # 4. Framework integration The integration helpers require optional packages; the chunks below run only when those packages are installed. ### caret ```{r caret, eval = requireNamespace("caret", quietly = TRUE)} library(caret) dat <- data.frame(y = y, x = x) ctrl <- trainControl(method = "cv", number = 3, summaryFunction = caret_summary()) fit <- train(y ~ x, data = dat, method = "lm", trControl = ctrl, metric = "CSE_L1", maximize = TRUE) fit$results[, c("RMSE", "CSE_L1", "CAE_L1")] ``` ### tidymodels / yardstick ```{r yardstick, eval = requireNamespace("yardstick", quietly = TRUE)} library(yardstick) df <- data.frame(truth = actual, estimate = model3) accuracy_level_metrics(df, truth, estimate) al_set <- al_metric_set(include_traditional = TRUE) al_set(df, truth = truth, estimate = estimate) ``` ### forecast ```{r forecast, eval = requireNamespace("forecast", quietly = TRUE)} library(forecast) train_ts <- ts(candy[1:36], frequency = 12) fc <- forecast(ets(train_ts), h = 12) al_forecast_accuracy(fc, candy[37:48])$metrics ``` # Session info ```{r session} sessionInfo() ```