Introduction to ArvindSt

Overview

The ArvindSt package implements the Arvind distribution together with five stochastic regression models that replace the traditional Gaussian error assumption with Arvind-distributed errors.

The Arvind distribution is a single-parameter continuous distribution on \((0, \infty)\) with parameter \(\theta > 0\) and PDF:

\[f(x; \theta) = \frac{\theta(1 + 2x + 2\theta x^2)}{(1 + \theta x)^2} \exp(-\theta x^2), \quad x > 0\]
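
Integrating this density gives a closed-form CDF, which is a convenient way to confirm that the PDF is properly normalised. The expression below is derived here from the stated PDF rather than quoted from the package documentation:

\[F(x; \theta) = 1 - \frac{\exp(-\theta x^2)}{1 + \theta x}, \quad x > 0\]

Differentiating recovers the PDF term by term:

\[\frac{d}{dx}\left[-\frac{\exp(-\theta x^2)}{1 + \theta x}\right] = \exp(-\theta x^2)\left[\frac{2\theta x}{1 + \theta x} + \frac{\theta}{(1 + \theta x)^2}\right] = \frac{\theta(1 + 2x + 2\theta x^2)}{(1 + \theta x)^2} \exp(-\theta x^2)\]

Since \(F(0) = 0\) and \(F(x) \to 1\) as \(x \to \infty\), the density integrates to one for every \(\theta > 0\).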

Distribution Functions

library(ArvindSt)

# PDF at several points
darvind(c(0.5, 1, 2), theta = 1)
#> [1] 0.86533420 0.45984930 0.02645592

# CDF
parvind(1, theta = 2)
#> [1] 0.9548882

# Quantiles
qarvind(c(0.25, 0.5, 0.75), theta = 1)
#> [1] 0.2515668 0.5223750 0.8715184

# Random generation
set.seed(42)
x <- rarvind(1000, theta = 2)
summary(x)
#>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
#> 0.0001702 0.1495542 0.3086652 0.3835834 0.5492802 1.8545692
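
To make the four functions less of a black box, the sketch below rebuilds them in base R from the PDF stated in the Overview. It is an illustration under stated assumptions, not the package's implementation: the names arvind_pdf(), arvind_cdf(), arvind_quantile() and arvind_sample() are hypothetical, the CDF is the closed form obtained by integrating the PDF, and quantiles are found by numerical root-finding, which may differ from what qarvind() and rarvind() do internally.

```r
# Hypothetical base-R helpers built from the stated PDF; these are not the
# package's internals and the names are illustrative only.
arvind_pdf <- function(x, theta) {
  ifelse(x > 0,
         theta * (1 + 2 * x + 2 * theta * x^2) / (1 + theta * x)^2 *
           exp(-theta * x^2),
         0)
}

# Closed form from integrating the PDF: 1 - exp(-theta x^2) / (1 + theta x)
arvind_cdf <- function(x, theta) {
  ifelse(x > 0, 1 - exp(-theta * x^2) / (1 + theta * x), 0)
}

# Quantiles by numerical inversion of the CDF, for 0 < p < 1.
# Because F(x) >= 1 - exp(-theta x^2), the root lies below
# sqrt(-log(1 - p) / theta), which gives a safe upper bracket.
arvind_quantile <- function(p, theta) {
  stopifnot(all(p > 0 & p < 1))
  vapply(p, function(prob) {
    uniroot(function(x) arvind_cdf(x, theta) - prob,
            lower = 0, upper = sqrt(-log1p(-prob) / theta),
            tol = 1e-10)$root
  }, numeric(1))
}

# Inverse-transform sampling: push uniform draws through the quantile function
arvind_sample <- function(n, theta) arvind_quantile(runif(n), theta)

arvind_pdf(c(0.5, 1, 2), theta = 1)   # agrees with the darvind() values above
arvind_cdf(1, theta = 2)              # agrees with the parvind() value above
arvind_quantile(c(0.25, 0.5, 0.75), theta = 1)  # close to the qarvind() values
integrate(arvind_pdf, 0, Inf, theta = 1)$value  # should be close to 1
```

Draws from arvind_sample() will not reproduce the rarvind() summary above even with the same seed, since the package's sampling method is not documented here.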

Visualising the Arvind Distribution

x_seq <- seq(0.01, 4, length.out = 300)
thetas <- c(0.5, 1, 2, 5)

plot(NULL, xlim = c(0, 4), ylim = c(0, 1.5),
     xlab = "x", ylab = "f(x)", main = "Arvind PDF Family")
cols <- c("red", "blue", "darkgreen", "purple")
for (i in seq_along(thetas)) {
  lines(x_seq, darvind(x_seq, thetas[i]), col = cols[i], lwd = 2)
}
legend("topright", paste("theta =", thetas),
       col = cols, lwd = 2, cex = 0.8)

Maximum Likelihood Estimation

set.seed(42)
x <- rarvind(500, theta = 2)
fit <- fit_arvind_mle(x)
cat("Estimated theta:", fit$theta, "\n")
#> Estimated theta: 2.236184
cat("True theta: 2\n")
#> True theta: 2
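
For readers who want to see what a maximum likelihood fit involves, here is a minimal base-R sketch. It assumes the log-likelihood implied by the PDF in the Overview and uses a one-dimensional search with optimize(); the helper names arvind_nll(), arvind_mle_sketch() and draw_arvind() are hypothetical, and the algorithm inside fit_arvind_mle() may well be different.

```r
# Hypothetical helpers; a sketch of maximum likelihood for theta in base R,
# not the algorithm used by fit_arvind_mle().

# Negative log-likelihood implied by the PDF stated in the Overview
arvind_nll <- function(theta, x) {
  -sum(log(theta) + log(1 + 2 * x + 2 * theta * x^2) -
         2 * log1p(theta * x) - theta * x^2)
}

# A one-dimensional search on log(theta) keeps theta positive
arvind_mle_sketch <- function(x, log_range = c(-10, 10)) {
  stopifnot(is.numeric(x), all(x > 0))
  opt <- optimize(function(log_theta) arvind_nll(exp(log_theta), x),
                  interval = log_range)
  list(theta = exp(opt$minimum), loglik = -opt$objective)
}

# Toy data by inverting the closed-form CDF 1 - exp(-theta x^2) / (1 + theta x)
draw_arvind <- function(n, theta) {
  vapply(runif(n), function(p) {
    uniroot(function(q) 1 - exp(-theta * q^2) / (1 + theta * q) - p,
            lower = 0, upper = sqrt(-log1p(-p) / theta), tol = 1e-10)$root
  }, numeric(1))
}

set.seed(42)
x_toy <- draw_arvind(500, theta = 2)
fit_toy <- arvind_mle_sketch(x_toy)
fit_toy$theta   # should land near the true value of 2
```

Searching on the log scale is a design choice rather than a requirement: it avoids boundary handling at theta = 0, at the cost of fixing a search range in advance.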

Model Fitting Example

# Generate simulated data
dat <- simulate_arvind_data(n = 60, seed = 1)

# Fit RW1 model
m1 <- fit_rw1(Y ~ X1 + X2 + X3, dat, seed = 42)
cat("Model:", m1$model_type, "\n")
#> Model: RW1-approx
cat("Theta:", m1$theta, "\n")
#> Theta: 0.1631118
cat("R-squared:", 1 - sum(m1$residuals^2) / sum((m1$Y - mean(m1$Y))^2), "\n")
#> R-squared: 0.935366

Diagnostics

d1 <- diagnostics_arvind(m1)
d1[, c("Model", "RMSE", "R2", "AIC", "KS_pvalue")]
#>        Model     RMSE       R2      AIC KS_pvalue
#> 1 RW1-approx 4.954889 0.935366 179.2226  0.614732
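
The sketch below shows how quantities like those in the table can be computed by hand. It is an assumption-laden illustration: fit_metrics() and arvind_cdf() are hypothetical names, the toy regression uses ordinary lm() with Gaussian noise purely to exercise the helper, and the reference distribution that diagnostics_arvind() uses for its KS test is not documented here, so the KS part only demonstrates testing a positive sample against the closed-form Arvind CDF.

```r
# Hypothetical helpers; base R only, not the package's diagnostics code.
fit_metrics <- function(y, fitted) {
  res <- y - fitted
  c(RMSE = sqrt(mean(res^2)),
    R2   = 1 - sum(res^2) / sum((y - mean(y))^2))
}

# Closed-form Arvind CDF, used as the reference distribution for the KS test
arvind_cdf <- function(x, theta) {
  ifelse(x > 0, 1 - exp(-theta * x^2) / (1 + theta * x), 0)
}

# Toy regression, only to exercise fit_metrics()
set.seed(1)
x <- rnorm(60)
y <- 2 + 3 * x + rnorm(60)
fit <- lm(y ~ x)
metrics <- fit_metrics(y, fitted(fit))
metrics

# Toy positive sample drawn by inverting the CDF, then tested against it
draws <- vapply(runif(200), function(p) {
  uniroot(function(q) arvind_cdf(q, theta = 2) - p,
          lower = 0, upper = sqrt(-log1p(-p) / 2), tol = 1e-10)$root
}, numeric(1))
ks <- ks.test(draws, arvind_cdf, theta = 2)
ks$p.value
```

When theta is estimated from the same data that is being tested, the KS p-value tends to be optimistic, so it is best read as a rough screen rather than a formal test.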

Cross-Validation

cv1 <- cv_arvind(m1, k_folds = 3, rolling = FALSE, seed = 42)
#> Warning: 'newdata' had 20 rows but variables found have 40 rows
#> Warning: 'newdata' had 20 rows but variables found have 40 rows
#> Warning: 'newdata' had 20 rows but variables found have 40 rows
cat("Mean CV RMSE:", cv1$mean_cv_rmse, "\n")
#> Mean CV RMSE: 27.39963
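
The repeated warning above is one that base R raises when predict() cannot find the model's variables in newdata and falls back to variables of a different length (here 40 training rows against 20 held-out rows). If that is what happens inside cv_arvind(), the reported CV RMSE may not reflect genuine held-out predictions and is worth treating with caution. The sketch below shows plain k-fold cross-validation where newdata carries the same column names as the formula; kfold_rmse() and the toy data are hypothetical, and lm() stands in for the package's model-fitting functions.

```r
# Hypothetical helper: k-fold CV RMSE for a linear model, base R only.
kfold_rmse <- function(formula, data, k = 3, seed = 42) {
  set.seed(seed)
  # Randomly assign each row to one of k folds of near-equal size
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  response <- all.vars(formula)[1]
  rmse <- vapply(seq_len(k), function(i) {
    train <- data[fold != i, , drop = FALSE]
    test  <- data[fold == i, , drop = FALSE]
    fit   <- lm(formula, data = train)
    # newdata must carry the same column names as the formula's predictors
    pred  <- predict(fit, newdata = test)
    sqrt(mean((test[[response]] - pred)^2))
  }, numeric(1))
  list(fold_rmse = rmse, mean_cv_rmse = mean(rmse))
}

# Toy data with the same column layout as the formula used above
set.seed(1)
toy <- data.frame(X1 = rnorm(60), X2 = rnorm(60), X3 = rnorm(60))
toy$Y <- 1 + 2 * toy$X1 - toy$X2 + 0.5 * toy$X3 + rnorm(60)

cv <- kfold_rmse(Y ~ X1 + X2 + X3, toy, k = 3)
cv$fold_rmse
cv$mean_cv_rmse
```

For time-ordered data, random folds leak future information into the training set; the rolling argument of cv_arvind() presumably addresses that case, which this sketch does not cover.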

Summary

The ArvindSt package provides:

Distribution functions: darvind(), parvind(), qarvind(), rarvind()
Simulated data generation: simulate_arvind_data()
Five model-fitting functions: fit_rw1(), fit_tvlm(), fit_simex(), fit_mixed(), fit_hmm()
Diagnostics: diagnostics_arvind(), plot_arvind()
Forecasting: forecast_arvind()
Cross-validation: cv_arvind()
Model comparison: summary_arvind()