tamd: Transcendental Algorithm for Mixtures of Distributions

Repository: https://github.com/efokoue/tamd | License: GPL v3

Dedicated to the memory of Professor D.M. Titterington (1945–2023),
University of Glasgow, whose foundational contributions to mixture
model theory made every question here worth asking.

Overview

tamd implements the Transcendental Algorithm for Mixtures of Distributions (TAMD), a penalized likelihood framework for fitting finite Gaussian mixture models. TAMD augments the EM algorithm with analytic barrier terms built from the Hellinger affinity between components. These terms diverge precisely on the singular locus of the mixture likelihood, which prevents component coalescence and weight degeneracy.
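
To make the Hellinger affinity concrete, here is a minimal base-R sketch using the standard closed form for two Gaussian densities (the Bhattacharyya coefficient). The names `hellinger_affinity` and `illustrative_barrier` are hypothetical and not part of the tamd API. The `-log(1 - A)` barrier is only an assumed stand-in that shows divergence as two components coalesce; the barrier actually used by TAMD is the one defined in the paper.

```r
# Hellinger affinity between N(mu1, S1) and N(mu2, S2), closed form:
#   A = |S1|^(1/4) |S2|^(1/4) / |Sbar|^(1/2) * exp(-(1/8) d' Sbar^{-1} d)
# with Sbar = (S1 + S2) / 2 and d = mu1 - mu2.
# Hypothetical helper, not a tamd function.
hellinger_affinity <- function(mu1, S1, mu2, S2) {
  S1 <- as.matrix(S1)
  S2 <- as.matrix(S2)
  Sbar <- (S1 + S2) / 2
  d <- mu1 - mu2
  # Work on the log scale for numerical stability.
  log_aff <- 0.25 * (determinant(S1)$modulus + determinant(S2)$modulus) -
    0.5 * determinant(Sbar)$modulus -
    0.125 * sum(d * solve(Sbar, d))
  exp(as.numeric(log_aff))
}

# Assumed illustrative barrier: finite for A < 1, infinite at A = 1.
illustrative_barrier <- function(A) -log(1 - A)

# Two of the Quick start components: means (-3, 0) and (0, 0), identity covariances.
A_far  <- hellinger_affinity(c(-3, 0), diag(2), c(0, 0), diag(2))  # exp(-9/8), about 0.325
# Identical components sit on the singular locus: affinity 1, barrier infinite.
A_same <- hellinger_affinity(c(0, 0), diag(2), c(0, 0), diag(2))
illustrative_barrier(c(A_far, A_same))
```

An affinity near 0 means the two components barely overlap; an affinity of 1 means they coincide, which is where a barrier of this kind blows up.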

Key features

Installation

# From GitHub
devtools::install_github("efokoue/tamd")

Quick start

library(tamd)

# Simulate a 3-component mixture
X <- simulate_gmm(
  n     = 400,
  K     = 3,
  pi    = c(0.3, 0.4, 0.3),
  mu    = matrix(c(-3, 0,  0, 0,  3, 0), nrow = 2),
  Sigma = array(rep(diag(2), 3), dim = c(2, 2, 3)),
  seed  = 42
)

# Fit TAMD
fit <- tamd(X, K = 3, seed = 42)
print(fit)
#  K = 3 components  |  n = 400  |  d = 2
#  Regularity index rho = 0.9412  [RELIABLE]
#  TAC = 3241.7  |  BIC = 3298.4

# Automated model selection via TAC
sel <- tamd_select(X, K_min = 1, K_max = 6, seed = 42)
cat("TAC selects K =", sel$K_hat, "\n")  # K = 3

# Visualize
plot(fit, X = X)

The regularity index ρ

The regularity index ρ ∈ (0,1) summarizes how well-separated the fitted components are (Titterington Theorem, Fokoué 2024):

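| ρ range         | Interpretation                                          |
|-----------------|---------------------------------------------------------|
| ρ > 0.90        | Reliable: trust parameter estimates and clustering      |
| 0.70 < ρ ≤ 0.90 | Moderate: verify K is appropriate                       |
| ρ ≤ 0.70        | Caution: near-singular; use for density estimation only |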
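
The thresholds above can be applied programmatically. The sketch below is a hypothetical helper (`interpret_rho` is not part of the tamd API) that maps numeric values of ρ to the three bands with base R's `cut()`. How a fitted object exposes ρ is not shown here, so the input is a plain numeric vector.

```r
# Map regularity index values to the interpretation bands listed above.
# Hypothetical helper, not a tamd function.
interpret_rho <- function(rho) {
  stopifnot(is.numeric(rho), all(rho > 0 & rho < 1))
  # right = TRUE gives the intervals (0, 0.70], (0.70, 0.90], (0.90, 1).
  cut(rho,
      breaks = c(0, 0.70, 0.90, 1),
      labels = c("Caution", "Moderate", "Reliable"),
      right  = TRUE)
}

# 0.9412 is the value printed in the Quick start; the others probe the boundaries.
interpret_rho(c(0.9412, 0.90, 0.70, 0.55))
```

Boundary values fall in the lower band, matching the "≤" in the thresholds: 0.90 is Moderate and 0.70 is Caution.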

TAC vs BIC vs WBIC

| Criterion | Penalty        | Consistent?       | Cost |
|-----------|----------------|-------------------|------|
| BIC       | (dθ/2) log n   | No (overcounts)   | Free |
| WBIC      | λ_unpen log n  | Yes (Bayesian)    | MCMC |
| TAC       | p_eff log n    | Yes (frequentist) | Free |

TAC uses a geometry-corrected, data-adaptive effective parameter count p_eff ∈ (2λ_unpen, dθ), computed directly from the Hellinger affinities at the fitted solution.
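
For a sense of scale, the nominal parameter count dθ and the BIC penalty (dθ/2) log n can be computed by hand. The sketch below assumes unconstrained (full) covariance matrices, as in the Quick start simulation; `gmm_n_params` and `bic_penalty` are hypothetical helpers, not tamd functions. It does not compute p_eff, which depends on the fitted Hellinger affinities.

```r
# Free parameters of a K-component Gaussian mixture in d dimensions,
# assuming full covariance matrices:
#   (K - 1) weights + K * d means + K * d * (d + 1) / 2 covariance entries.
gmm_n_params <- function(K, d) (K - 1) + K * d + K * d * (d + 1) / 2

# BIC penalty term (d_theta / 2) * log(n).
bic_penalty <- function(K, d, n) gmm_n_params(K, d) / 2 * log(n)

gmm_n_params(K = 3, d = 2)          # 17
bic_penalty(K = 3, d = 2, n = 400)  # about 50.9

# Nominal counts over the K range used in the Quick start model selection.
data.frame(K = 1:6, d_theta = gmm_n_params(1:6, d = 2))
```

Since p_eff lies strictly below dθ, this nominal count is an upper bound on the effective count that TAC uses.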

Package structure

tamd/
├── R/
│   ├── tamd.R          # Core fitting function
│   ├── hellinger.R     # Affinity and gradient computations
│   ├── criteria.R      # TAC, TIC, BIC, AIC, rho
│   ├── methods.R       # print/summary/plot, simulate_gmm
│   └── utils.R         # Internal helpers
├── tests/
│   └── testthat/
│       └── test-tamd.R # 18 unit tests
├── vignettes/
│   └── tamd-intro.Rmd  # Full introduction
└── inst/reproduce/     # Paper reproduction scripts
    ├── E1_regularity.R
    ├── E2_transversality.R
    ├── E3_rlct.R
    ├── E4_tac.R
    ├── E5_rho.R
    ├── E6_failures.R
    ├── real_oldfaithful.R
    └── real_galaxy.R

Reproducing the paper

All simulation experiments in Fokoué (2024) are exactly reproducible:

# From the R console:
source(system.file("reproduce/E1_regularity.R",    package = "tamd"))
source(system.file("reproduce/E4_tac.R",           package = "tamd"))
source(system.file("reproduce/real_oldfaithful.R", package = "tamd"))

Or from the command line, in the root of a clone of the repository:

Rscript inst/reproduce/E1_regularity.R
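
To run every script in one go from an installed copy, something like the following sketch could be used. `list_reproduce_scripts` is a hypothetical helper, not a tamd function; it relies only on base R's `system.file()` and `list.files()`, and it assumes the scripts are installed under `reproduce/` as in the `source()` calls above. Running all experiments may take a long time.

```r
# List the reproduction scripts shipped with an installed package.
# Returns character(0) if the package or the directory is not found.
# Hypothetical helper, not a tamd function.
list_reproduce_scripts <- function(pkg = "tamd") {
  dir <- system.file("reproduce", package = pkg)
  if (!nzchar(dir)) return(character(0))
  list.files(dir, pattern = "\\.R$", full.names = TRUE)
}

scripts <- list_reproduce_scripts()
for (s in scripts) {
  message("Running ", basename(s))
  # Evaluate each script in a fresh environment so runs do not share variables.
  source(s, local = new.env())
}
```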

Citation

@article{fokoue2024tamd,
  author  = {Fokou\'{e}, Ernest},
  title   = {The Transcendental Algorithm for Mixtures
             of Distributions},
  journal = {Annals of Statistics},
  year    = {2024},
  note    = {Submitted}
}

@Manual{tamd2024package,
  title  = {tamd: Transcendental Algorithm for Mixtures
            of Distributions},
  author = {Fokou\'{e}, Ernest},
  year   = {2024},
  note   = {R package version 1.0.0},
  url    = {https://github.com/efokoue/tamd}
}

References

Fokoué, E. (2024). The Transcendental Algorithm for Mixtures of Distributions. Annals of Statistics, submitted.

Titterington, D.M., Smith, A.F.M., and Makov, U.E. (1985). Statistical Analysis of Finite Mixture Distributions. Wiley.

Watanabe, S. (2009). Algebraic Geometry and Statistical Learning Theory. Cambridge University Press.

Yamazaki, K. and Watanabe, S. (2003). Singularities in mixture models and upper bounds of stochastic complexity. Neural Networks, 16(7), 1029–1038.