Robust Changepoint Detection for Time Series
January 29, 2026
```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  eval = FALSE,
  message = FALSE,
  warning = FALSE,
  fig.align = "center"
)
```
RegimeChange is an R package for detecting structural breaks and regime changes in time series data. It provides a unified interface to both frequentist and Bayesian methods, with particular emphasis on robust performance in challenging scenarios including low signal-to-noise ratios, subtle variance changes, autocorrelated data, and heavy-tailed distributions.
Regime change detection addresses a fundamental philosophical question:
How do we distinguish a real change from the inherent variability of the world?
Every observable system exhibits fluctuations. The central question is: is this observed fluctuation “noise” (random variation within the same regime) or “signal” (evidence that the system transitioned to a qualitatively different state)?
We assume the world can be described through “regimes” or “states” characterized by parameters \(\theta\). A system is in state \(\theta_0\) until, at some unknown moment, it transitions to \(\theta_1\). This is a discrete ontology of change: the transition is not gradual but an abrupt rupture.
We do not observe \(\theta\) directly, but noisy manifestations \(y_i\). The epistemology here is probabilistic: each observation is a “clue” about the true state, but none is conclusive by itself.
The statistic \(S = \sum s_i\) embodies a fundamental epistemological principle: evidence accumulates. Each observation contributes a “vote” in favor of \(H_0\) or \(H_1\), and the sum represents the total weight of evidence.
There exists an irresolvable epistemological tension between two types of error:
| Error | Technical Name | Consequence |
|---|---|---|
| Detecting change where there is none | False alarm (Type I) | Acting unnecessarily |
| Not detecting a real change | Delayed detection (Type II) | Failing to react in time |
The decision threshold represents precisely this negotiation: how much evidence do we demand before declaring that the world has changed?
This structure appears across diverse domains:
| Domain | \(\theta_0\) | \(\theta_1\) | Question |
|---|---|---|---|
| Economics | Expansion | Recession | When did the cycle change? |
| Medicine | Stable patient | Deterioration | When did the crisis begin? |
| Finance | Stable market | Bubble/crash | When did the regime change? |
| Ecology | Balanced ecosystem | Disturbed | When did the collapse occur? |
| Manufacturing | Process under control | Defective | When did it go out of adjustment? |
| Climatology | Stationary climate | Climate change | When did the trend begin? |
| Epidemiology | Endemic | Outbreak | When did the epidemic start? |
| Electrical engineering | Normal operation | Failure/transient | When did the event occur? |
The likelihood ratio:
\[s_i = \ln \left( \frac{p_{\theta_1}(y_i)}{p_{\theta_0}(y_i)} \right)\]
has a deep informational interpretation: it measures how surprising observation \(y_i\) is under each hypothesis. If \(s_i > 0\), the observation is “more expected” under \(\theta_1\) than under \(\theta_0\). It is, literally, a quantification of relative surprise.
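For a Gaussian mean shift this quantity has a simple closed form, \(s_i = (\mu_1 - \mu_0)\left(y_i - \tfrac{\mu_0 + \mu_1}{2}\right)/\sigma^2\), and the cumulative evidence \(S_n = \sum_{i \le n} s_i\) drifts downward before a change and upward after it. A short base-R illustration (the helper name `log_lr` is hypothetical, not part of the package):

```r
# Log-likelihood ratio s_i for a Gaussian mean shift; positive values favor
# theta_1 (mean mu1), negative values favor theta_0 (mean mu0)
log_lr <- function(y, mu0 = 0, mu1 = 1, sigma = 1) {
  dnorm(y, mu1, sigma, log = TRUE) - dnorm(y, mu0, sigma, log = TRUE)
}

set.seed(7)
y <- c(rnorm(50, 0), rnorm(50, 1))  # regime change at i = 50
S <- cumsum(log_lr(y))              # accumulated weight of evidence
# S drifts downward under theta_0 and upward after the change at i = 50
```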
Changepoint detection is a fundamental problem in time series analysis with applications in finance, quality control, climatology, and genomics. While several R packages address this problem, practitioners frequently encounter difficulties with low signal-to-noise ratios, subtle variance changes, autocorrelated series that inflate false-alarm rates, and heavy-tailed or contaminated data.
RegimeChange addresses these challenges through adaptive variance floors for numerical stability, optional pre-whitening for autocorrelated series, robust M-estimation for heavy-tailed data, and ensemble methods that leverage multiple algorithms.
| Package | Language | Strengths | Limitations |
|---|---|---|---|
| changepoint | R (C backend) | Optimized PELT, stable, mature | Frequentist only, no online mode, limited uncertainty quantification |
| tidychangepoint | R | Tidyverse interface, unifies methods | Wrapper only; adds no new algorithms |
| ruptures | Python | Very complete for offline detection | Not Bayesian, not online |
| bayesian_changepoint_detection | Python | BOCPD with PyTorch | Bayesian only; doesn’t integrate other methods |
| Kats (Meta) | Python | BOCPD, well documented | Meta ecosystem, heavy dependencies |
RegimeChange provides a unified interface to frequentist, Bayesian, and deep-learning detectors, together with robust and AR-corrected variants, built-in uncertainty quantification, and an optional high-performance Julia backend.
# Install from GitHub
# install.packages("devtools")
devtools::install_github("IsadoreNabi/RegimeChange")

library(RegimeChange)
# Generate data with a changepoint at t=200
set.seed(42)
data <- c(rnorm(200, 0, 1), rnorm(200, 2, 1))
# Detect using PELT
result <- detect_regimes(data, method = "pelt")
print(result)
# For autocorrelated data, enable AR correction
result_ar <- detect_regimes(data, method = "pelt", correct_ar = TRUE)
# For data with outliers or heavy tails, enable robust estimation
result_robust <- detect_regimes(data, method = "pelt", robust = TRUE)

The library detects changes in different statistical properties:
| Type | Parameter | Hypothesis | Example |
|---|---|---|---|
| Mean | \(\mu\) | \(H_0: \mu = \mu_0\) vs \(H_1: \mu = \mu_1\) | Level change in GDP |
| Variance | \(\sigma^2\) | \(H_0: \sigma^2 = \sigma_0^2\) vs \(H_1: \sigma^2 = \sigma_1^2\) | Volatility increase |
| Mean and Variance | \((\mu, \sigma^2)\) | Joint change | Economic regime transition |
| Trend | \(\beta\) (slope) | \(H_0: \beta = 0\) vs \(H_1: \beta \neq 0\) | Start of inflationary trend |
| Autocorrelation | \(\rho\) | \(H_0: \rho = \rho_0\) vs \(H_1: \rho = \rho_1\) | Change in persistence |
| Distribution | \(F\) | \(H_0: F = F_0\) vs \(H_1: F \neq F_0\) | General non-parametric change |
Simple change (one point):
A single changepoint \(\tau\) divides the series into Regime A and Regime B.
Multiple changes (k points):
Multiple changepoints \(\tau_1, \tau_2, \ldots, \tau_k\) divide the series into \(k+1\) regimes.
# The user can specify what to look for
detect_regimes(
data,
type = c("mean", "variance", "both", "trend", "distribution"),
n_changepoints = c("single", "multiple", "unknown"),
min_segment_length = 30 # Minimum observations per regime
)

The detection methods are organized into three main families:
The foundational method. Accumulates deviations from a target value.
Statistic: \[S_n = \sum_{i=1}^{n} (x_i - \mu_0)\]
Decision: Alarm when \(|S_n| > h\)
Complexity: \(O(n)\)
Use: Mean change detection, single changepoint.
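A literal implementation of this decision rule takes only a few lines (an illustrative sketch; `cusum_alarm` is a hypothetical name, not the package API):

```r
# Cumulative-sum statistic S_n = sum(x_i - mu0); alarm at the first n
# with |S_n| > h, or NA if the threshold is never crossed
cusum_alarm <- function(x, mu0 = 0, h = 4) {
  S <- cumsum(x - mu0)
  hit <- which(abs(S) > h)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

cusum_alarm(c(rep(0, 10), rep(1, 10)), mu0 = 0, h = 3)  # alarms at n = 14
```

After the shift at n = 10, S_n grows linearly, so the alarm fires a few steps after the change; the delay shrinks as the shift grows or h decreases.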
detect_regimes(data, method = "cusum", threshold = 4)

The gold standard for multiple changepoints.
Idea: Dynamic programming with pruning to find optimal segmentation.
Objective function: \[\sum_{i=1}^{m+1} \left[ \mathcal{C}(y_{(\tau_{i-1}+1):\tau_i}) \right] + \beta m\]
where \(\mathcal{C}\) is segment cost and \(\beta\) is the penalty for number of changes.
Complexity: \(O(n)\) in favorable cases, \(O(n^2)\) worst case.
Supported penalties: AIC, BIC, MBIC, MDL, manual.
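For intuition, the objective above can be minimized exactly by the optimal-partitioning recursion that PELT accelerates with pruning. The following is a compact unpruned O(n²) sketch with a Gaussian mean-change cost (illustrative only; `op_segment` is a hypothetical name, and real implementations add pruning and numerical safeguards):

```r
# Optimal partitioning: exact minimization of sum(segment costs) + beta * m
op_segment <- function(x, beta = 2 * log(length(x)), min_len = 2) {
  n <- length(x)
  cs <- cumsum(x); cs2 <- cumsum(x^2)
  # Gaussian mean-change cost of segment (s+1):t (residual sum of squares)
  seg_cost <- function(s, t) {
    m <- t - s
    sx <- cs[t] - if (s > 0) cs[s] else 0
    sx2 <- cs2[t] - if (s > 0) cs2[s] else 0
    sx2 - sx^2 / m
  }
  Fcost <- c(-beta, rep(Inf, n))  # Fcost[t + 1] = optimal cost of x[1:t]
  last <- integer(n)              # last changepoint before t
  for (t in seq_len(n)) {
    if (t < min_len) next
    cands <- 0:(t - min_len)
    vals <- sapply(cands, function(s) Fcost[s + 1] + seg_cost(s, t) + beta)
    Fcost[t + 1] <- min(vals)
    last[t] <- cands[which.min(vals)]
  }
  # backtrack the optimal changepoint sequence
  cps <- integer(0); t <- n
  while (t > 0) { s <- last[t]; if (s > 0) cps <- c(s, cps); t <- s }
  cps
}
```

PELT obtains the same segmentation by discarding candidate split points `s` that can never be optimal, which is what yields the O(n) average-case complexity.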
# PELT (default) - O(n) optimal partitioning
detect_regimes(data, method = "pelt", penalty = "MBIC")
# With AR correction for autocorrelated data
detect_regimes(data, method = "pelt", correct_ar = TRUE)
# With robust estimation for outliers/heavy tails
detect_regimes(data, method = "pelt", robust = TRUE)

Recursive divide and conquer.
Algorithm: find the best single split of the series, then recurse on the two resulting halves until no candidate split improves the penalized cost.
Complexity: \(O(n \log n)\)
Limitation: Does not guarantee global optimum (greedy).
detect_regimes(data, method = "binseg", n_changepoints = 3)

Improvement on Binary Segmentation with random intervals.
Idea: Instead of searching the entire interval, search M random intervals and take the best.
Advantage: More robust to nearby changes.
detect_regimes(data, method = "wbs", n_intervals = 100)

Optimized version of PELT with functional pruning.
Complexity: \(O(n)\) guaranteed for certain cost functions.
fpop_detect(data, penalty = "BIC")

Non-parametric method using energy distance.
Idea: Use the energy distance between segments to detect changes in the full distribution.
Advantage: No parametric distribution assumption.
edivisive_detect(data, min_size = 30)

Idea: Map data to a Hilbert space and detect changes in that space.
Advantage: Captures changes in complex data structure.
kernel_cpd_detect(data, kernel = "rbf")

Reference: Adams & MacKay (2007)
Central idea: Maintain a posterior distribution over the “run length” (time since last change).
Joint distribution: \[P(r_t, x_{1:t}) = \sum_{r_{t-1}} P(x_t | r_t, x^{(r)}) \cdot P(r_t | r_{t-1}) \cdot P(r_{t-1}, x_{1:t-1})\]
Components: a hazard function \(P(r_t \mid r_{t-1})\) encoding the prior probability of a change, and a posterior predictive \(P(x_t \mid r_t, x^{(r)})\) computed from the conjugate sufficient statistics of the current run.

Supported conjugate models: Normal-Gamma, Normal with known variance, Normal-Wishart (multivariate), Poisson-Gamma, and Inverse-Gamma for variance changes.
Complexity per observation: \(O(t)\) naïve, \(O(1)\) with truncation.
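The recursion can be sketched for the simplest conjugate case: Gaussian observations with known variance and a constant hazard. This is an illustrative sketch, not the package implementation; `bocpd_gaussian` and its arguments are hypothetical names.

```r
# Minimal BOCPD: Gaussian observations, known variance sigma, constant hazard.
# R[t + 1, r + 1] holds the run-length posterior P(r_t = r | x_1:t).
bocpd_gaussian <- function(x, hazard = 1/100, mu0 = 0, sigma0 = 1, sigma = 1) {
  n <- length(x)
  R <- matrix(0, n + 1, n + 1)
  R[1, 1] <- 1
  mu_vec <- mu0              # posterior mean of mu under each run length
  prec_vec <- 1 / sigma0^2   # posterior precision of mu under each run length
  for (t in 1:n) {
    # predictive density of x[t] under each run-length hypothesis
    pred <- dnorm(x[t], mean = mu_vec, sd = sqrt(1 / prec_vec + sigma^2))
    R[t + 1, 2:(t + 1)] <- R[t, 1:t] * pred * (1 - hazard)  # growth step
    R[t + 1, 1] <- sum(R[t, 1:t] * pred * hazard)           # changepoint step
    R[t + 1, ] <- R[t + 1, ] / sum(R[t + 1, ])
    # conjugate update of sufficient statistics; prepend a fresh prior for r = 0
    new_prec <- prec_vec + 1 / sigma^2
    new_mu <- (mu_vec * prec_vec + x[t] / sigma^2) / new_prec
    mu_vec <- c(mu0, new_mu)
    prec_vec <- c(1 / sigma0^2, new_prec)
  }
  R
}
```

The MAP run length at time t is `which.max(R[t + 1, ]) - 1`; a sudden drop back toward zero signals a changepoint.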
# Bayesian Online Changepoint Detection
detect_regimes(data, method = "bocpd")
# Custom prior specification
prior <- normal_gamma(mu0 = 0, kappa0 = 0.1, alpha0 = 1, beta0 = 1)
detect_regimes(data, method = "bocpd", prior = prior)

Statistic: \[R_n = \sum_{k=1}^{n} \prod_{i=k}^{n} \frac{p_{\theta_1}(x_i)}{p_{\theta_0}(x_i)}\]
Interpretation: Sum of likelihood ratios from all possible past changepoints.
Advantage: Asymptotically optimal for minimizing detection delay.
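The statistic admits the recursion \(R_n = (1 + R_{n-1}) \Lambda_n\) with \(R_0 = 0\), where \(\Lambda_n\) is the likelihood ratio of the \(n\)-th observation. A sketch for a Gaussian mean shift (illustrative; `sr_statistic` is a hypothetical name, not the package API):

```r
# Shiryaev-Roberts statistic via the recursion R_n = (1 + R_{n-1}) * LR_n
sr_statistic <- function(x, mu0 = 0, mu1 = 1, sigma = 1) {
  lr <- dnorm(x, mu1, sigma) / dnorm(x, mu0, sigma)  # per-observation LR
  R <- numeric(length(x)); r <- 0
  for (i in seq_along(x)) {
    r <- (1 + r) * lr[i]
    R[i] <- r
  }
  R  # alarm when R[n] exceeds a threshold
}
```

The recursion makes the procedure O(1) per observation, even though the defining sum runs over all possible past changepoints.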
shiryaev_roberts(data, threshold = 100)

Note: the deep learning methods that follow require optional dependencies (keras, tensorflow).
Idea: Train an autoencoder on “normal” data, then flag a change when the reconstruction error increases.
autoencoder_detect(data, window_size = 50, threshold = 2.0)

Convolutional architecture for sequence modeling.
tcn_detect(data, window_size = 100)

Attention-based architecture for capturing long-range dependencies.
transformer_detect(data, window_size = 100)

Self-supervised representation learning for change detection.
cpc_detect(data, window_size = 50)

Combine multiple deep learning detectors.
ensemble_dl_detect(data, methods = c("autoencoder", "tcn"))

Context: Retrospective analysis. Complete dataset is available.
Question: “When did the changes occur?”
Characteristics: the complete series is available, so algorithms can optimize a global segmentation criterion with no latency constraint.

Appropriate algorithms: PELT, Binary Segmentation, WBS, FPOP, E-Divisive

Use cases: retrospective analysis of historical series, such as economic cycles, climate records, or genomic segmentation.
Context: Real-time surveillance. Data arrives sequentially.
Question: “Is a change occurring now?”
Characteristics: observations arrive one at a time, decisions must be made with partial information, and the goal is to minimize detection delay at a controlled false-alarm rate.

Appropriate algorithms: CUSUM, BOCPD, Shiryaev-Roberts

Key metrics: ARL\(_0\) (average time to false alarm) and ARL\(_1\) (average detection delay).

Use cases: real-time surveillance, such as quality control, patient monitoring, or market monitoring.
# Initialize detector
detector <- regime_detector(method = "bocpd")
# Process streaming data
for (x in new_observations) {
result <- update(detector, x)
if (result$alarm) {
message("Changepoint detected at time ", result$time)
detector <- reset(detector)
}
}

# Offline mode (default)
result <- detect_regimes(data, mode = "offline", method = "pelt")
# Online mode via regime_detector
detector <- regime_detector(method = "bocpd", prior = normal_gamma())
for (x in data_stream) {
result <- update(detector, x)
if (result$alarm) {
handle_regime_change(result)
}
}

Every changepoint estimate must be accompanied by an uncertainty measure.
This is what distinguishes a scientifically rigorous library from a purely practical tool.
How confident are we that the change occurred exactly at \(\hat{\tau}\)?
Representation: a confidence interval (frequentist) or credible interval (Bayesian) around \(\hat{\tau}\).
How confident are we that there is a change at all?
Representation: a p-value or posterior probability that a change exists at all, reported as existence_probability in the result object.
How many changes are there really?
Representation: a posterior distribution over the number of changepoints, or a comparison of information criteria across candidate models.
# Conceptual sketch: block_bootstrap() and detect() stand in for internals
bootstrap_changepoint <- function(data, method, B = 1000) {
estimates <- numeric(B)
for (b in 1:B) {
# Resample blocks to preserve dependence
data_boot <- block_bootstrap(data)
estimates[b] <- detect(data_boot, method)$changepoint
}
# Confidence interval
ci <- quantile(estimates, c(0.025, 0.975))
return(list(
estimate = detect(data, method)$changepoint,
ci = ci,
se = sd(estimates),
distribution = estimates
))
}

BOCPD naturally produces \(P(r_t | x_{1:t})\), from which one can derive:
# Probability of change at each point
prob_change <- bocpd_result$posterior[, 1] # run length = 0
# Most probable changepoint
map_changepoint <- which.max(prob_change)
# Credible interval
credible_interval <- hpd_interval(prob_change, prob = 0.95)

# Every detection function returns:
result <- list(
# Point estimates
changepoints = c(45, 120, 380),
# Location uncertainty
confidence_intervals = list(
c(42, 48),
c(115, 125),
c(375, 385)
),
# Existence uncertainty
existence_probability = c(0.95, 0.88, 0.72),
# Segment information
segments = list(
list(start = 1, end = 45, params = list(mean = 2.3, var = 0.5)),
list(start = 46, end = 120, params = list(mean = 5.1, var = 0.8))
# ...
),
# Diagnostics
information_criterion = list(BIC = 234.5, AIC = 228.1),
# Complete posterior (if applicable)
posterior = matrix(...) # P(change at t) for each t
)
With \(d\) simultaneous series, changes can be synchronous (all series shift together), asynchronous (each series changes on its own schedule), or confined to the dependence structure (the correlation changes while the marginal distributions do not).
Let \(\mathbf{X}_t \in \mathbb{R}^d\) be a vector of observations at time \(t\).
Before change: \(\mathbf{X}_t \sim N(\boldsymbol{\mu}_0, \boldsymbol{\Sigma}_0)\)
After change: \(\mathbf{X}_t \sim N(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1)\)
The change can be in: the mean vector (\(\boldsymbol{\mu}_0 \to \boldsymbol{\mu}_1\)), the covariance (\(\boldsymbol{\Sigma}_0 \to \boldsymbol{\Sigma}_1\)), or both.
Use Normal-Wishart predictive distribution:
Prior: \((\boldsymbol{\mu}, \boldsymbol{\Sigma}) \sim \text{NIW}(\mu_0, \kappa_0, \nu_0, \boldsymbol{\Psi}_0)\)
Update: Conjugate formulas for updating hyperparameters.
prior <- normal_wishart(mu0 = rep(0, d), kappa0 = 1,
nu0 = d + 1, Psi0 = diag(d))
result <- bocpd(data_matrix, prior = prior)

For high dimension (\(d \gg n\)), project to sparse directions before detecting.
Idea: Find direction \(\mathbf{v}\) that maximizes change signal, with \(||\mathbf{v}||_0 \leq s\).
sparse_projection_cpd(data_matrix, sparsity = 5)

# Data: n × d matrix
data <- matrix(...)
# Detect synchronous changes
result <- detect_regimes(data,
type = "multivariate",
sync = TRUE)
# Detect changes per series
result <- detect_regimes(data,
type = "multivariate",
sync = FALSE,
share_changepoints = FALSE)
# Detect correlation change
result <- detect_regimes(data,
type = "correlation",
method = "normal_wishart")

We conducted comprehensive benchmarks comparing RegimeChange against established R packages: changepoint (Killick & Eckley, 2014), wbs (Fryzlewicz, 2014), not (Baranowski et al., 2019), and ecp (James & Matteson, 2015). Tests were performed with 50 replications per scenario, using synthetic data under maximum difficulty conditions.
17 scenarios \(\times\) 50 reps = 850 tests. Tolerance = 15 observations.
| Rank | Method | Mean F1 | Package |
|---|---|---|---|
| 1 | RC_auto | 0.804 | RegimeChange |
| 2 | ECP_pkg | 0.788 | ecp |
| 3 | CP_binseg | 0.777 | changepoint |
| 4 | RC_cusum | 0.761 | RegimeChange |
| 5 | NOT_pkg | 0.722 | not |
| 6 | CP_pelt | 0.720 | changepoint |
| 7 | RC_pelt | 0.719 | RegimeChange |
| 8 | RC_wbs_mean | 0.681 | RegimeChange |
| 9 | RC_wbs | 0.654 | RegimeChange |
| 10 | RC_binseg | 0.652 | RegimeChange |
| 11 | WBS_pkg | 0.634 | wbs |
| 12 | RC_ediv | 0.612 | RegimeChange |
| 13 | RC_not | 0.514 | RegimeChange |
RegimeChange’s automatic method selector (RC_auto)
achieves the highest overall F1 score, outperforming all reference
packages.
| Scenario | Best RC | Best Reference | Winner |
|---|---|---|---|
| Mean δ=3 (easy) | 1.000 | 1.000 | Tie |
| Mean δ=2 (medium) | 1.000 | 1.000 | Tie |
| Mean δ=1 (hard) | 0.980 | 0.980 | Tie |
| Mean δ=0.5 (very hard) | 0.773 | 0.760 | RC (+0.01) |
| Variance 4:1 | 0.980 | 0.980 | Tie |
| Variance 2:1 | 0.667 | 0.400 | RC (+0.27) |
| Multi-CP (3 changes) | 1.000 | 1.000 | Tie |
| Multi-CP (5 changes) | 1.000 | 1.000 | Tie |
| AR(0.3)+change | 1.000 | 1.000 | Tie |
| AR(0.5)+change | 1.000 | 0.983 | RC (+0.02) |
| AR(0.7)+change | 0.900 | 0.720 | RC (+0.18) |
| 2% outliers | 1.000 | 0.967 | RC (+0.03) |
| 5% outliers | 1.000 | 0.963 | RC (+0.04) |
| 10% outliers | 0.993 | 0.973 | RC (+0.02) |
| t-dist (df=5) | 1.000 | 1.000 | Tie |
| t-dist (df=3) | 1.000 | 0.993 | RC (+0.01) |
| t-dist (df=2) | 0.967 | 0.983 | Ref (+0.02) |
Summary: RegimeChange wins or ties in 16 of 17 scenarios. The only loss is marginal (0.017 in t-dist df=2).
| Scenario | RegimeChange | Reference | Improvement |
|---|---|---|---|
| Subtle variance (2:1) | 0.667 | 0.400 | +67% |
| Strong AR (ρ=0.7) | 0.900 | 0.720 | +25% |
| 5% outliers | 1.000 | 0.963 | +4% |
| Heavy tails t(df=3) | 1.000 | 0.993 | +1% |
10 scenarios \(\times\) 50 reps = 500 tests. This benchmark specifically evaluates performance under data contamination.
| Scenario | RC_pelt | CP_pelt | RC_robust | RC_ediv | ECP_pkg |
|---|---|---|---|---|---|
| Clean (baseline) | 1.00±0.00 | 1.00±0.00 | 0.98±0.08 | 0.77±0.20 | 0.99±0.07 |
| 2% outliers | 0.24±0.12 | 0.52±0.26 | 1.00±0.00 | 0.81±0.17 | 0.99±0.07 |
| 5% outliers | 0.14±0.04 | 0.23±0.12 | 0.99±0.05 | 0.82±0.17 | 0.99±0.07 |
| 10% outliers | 0.11±0.08 | 0.17±0.19 | 0.99±0.05 | 0.85±0.18 | 1.00±0.00 |
| 15% outliers | 0.12±0.09 | 0.17±0.25 | 1.00±0.00 | 0.75±0.31 | 0.96±0.12 |
| t-dist (df=5) | 0.89±0.19 | 0.97±0.12 | 1.00±0.00 | 0.78±0.18 | 1.00±0.00 |
| t-dist (df=3) | 0.50±0.21 | 0.72±0.26 | 1.00±0.00 | 0.78±0.17 | 0.97±0.12 |
| t-dist (df=2) | 0.26±0.17 | 0.38±0.32 | 1.00±0.00 | 0.82±0.18 | 0.97±0.12 |
| Mixture (10%) | 0.30±0.14 | 0.64±0.33 | 0.99±0.05 | 0.77±0.18 | 0.97±0.12 |
| 5% Cauchy | 0.41±0.20 | 0.51±0.22 | 1.00±0.00 | 0.77±0.19 | 0.97±0.10 |
| Rank | Method | Mean F1 |
|---|---|---|
| 1 | RC_robust | 0.998 |
| 2 | ECP_pkg | 0.979 |
| 3 | RC_ediv | 0.795 |
| 4 | CP_pelt | 0.479 |
| 5 | RC_pelt | 0.331 |
Key Finding: RC_robust achieves
near-perfect performance (F1 = 0.998) across all contamination
scenarios, with perfect scores (F1 = 1.00) in 6 of 9 challenging
scenarios. This represents a 108% improvement over the
best reference package (changepoint PELT at F1 = 0.479) on
contaminated data.
Where RegimeChange excels: subtle variance changes (2:1 ratio), strong autocorrelation (\(\rho = 0.7\)), outlier contamination, and heavy tails.

Where RegimeChange ties: easy and medium mean shifts, multiple changepoints, and mild autocorrelation, where both it and the reference packages score at or near 1.0.

Where references have a slight advantage: extremely heavy tails (t-distribution with df = 2), by a margin of only 0.017.
| Data Characteristics | Recommended Settings | Expected F1 |
|---|---|---|
| Clean iid data | default | 0.95+ |
| Mild contamination (2-5%) | robust = TRUE | 0.99+ |
| Heavy contamination (10%+) | robust = TRUE | 0.99+ |
| Heavy tails (t-dist, Cauchy) | robust = TRUE | 0.99+ |
| Autocorrelated (AR) | correct_ar = TRUE | 0.95+ |
| Subtle variance changes | type = "both" | 0.65+ |
| Unknown data quality | detect_regimes(data, method = "auto") | 0.80+ |
Recommendation: For real-world data where
contamination or heavy tails are possible, robust = TRUE is
strongly recommended. The computational overhead is modest, and the
robustness gains are substantial.
library(RegimeChange)
library(changepoint)
# Example 1: AR(0.7) scenario
set.seed(123)
ar_data <- as.numeric(arima.sim(list(ar = 0.7), n = 400))
ar_data[201:400] <- ar_data[201:400] + 2 # Add mean shift
cp_result <- cpt.meanvar(ar_data, method = "PELT", penalty = "MBIC")
rc_result <- detect_pelt(ar_data, "both", "MBIC", 5)
rc_result_cusum <- cusum(ar_data)
# RC_cusum typically closer to true changepoint at 200
# Example 2: Contaminated data (5% outliers)
set.seed(456)
clean_data <- c(rnorm(200, 0), rnorm(200, 2))
outlier_idx <- sample(400, 20)
contaminated <- clean_data
contaminated[outlier_idx] <- contaminated[outlier_idx] +
sample(c(-10, 10), 20, replace = TRUE)
# Standard methods fail
cp_result <- cpts(cpt.mean(contaminated, method = "PELT", penalty = "MBIC"))
# Often misses the true changepoint or finds spurious ones
# Robust method succeeds
rc_result <- detect_pelt(contaminated, "mean", "MBIC", 5, robust = TRUE)
rc_result$changepoints # Typically close to 200

Hausdorff Distance: \[d_H(\hat{\mathcal{T}}, \mathcal{T}) = \max\left( \max_{\hat{\tau} \in \hat{\mathcal{T}}} \min_{\tau \in \mathcal{T}} |\hat{\tau} - \tau|, \max_{\tau \in \mathcal{T}} \min_{\hat{\tau} \in \hat{\mathcal{T}}} |\tau - \hat{\tau}| \right)\]
Interpretation: the worst-case distance between the estimated and true changepoint sets, taken over both directions.
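The formula is a direct double max-min and transcribes to a few lines (illustrative; `hausdorff_cp` is a hypothetical name, not the package's hausdorff_distance):

```r
# Hausdorff distance between estimated and true changepoint sets
hausdorff_cp <- function(est, true) {
  d1 <- max(sapply(est, function(e) min(abs(e - true))))   # est -> nearest true
  d2 <- max(sapply(true, function(t) min(abs(t - est))))   # true -> nearest est
  max(d1, d2)
}
```

Note that a single missed true changepoint dominates the metric, which is why it is usually reported alongside F1 and covering scores.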
Rand Index: \[RI = \frac{TP + TN}{TP + TN + FP + FN}\]
where TP, TN, FP, FN are defined over pairs of time points: a pair counts as TP if both segmentations place it in the same segment, and TN if both place it in different segments.
Adjusted Rand Index: Corrected for chance.
\[Cover(\hat{\mathcal{S}}, \mathcal{S}) = \frac{1}{n} \sum_{S \in \mathcal{S}} |S| \cdot \max_{\hat{S} \in \hat{\mathcal{S}}} \frac{|S \cap \hat{S}|}{|S \cup \hat{S}|}\]
Interpretation: Weighted average of IoU (Intersection over Union) for each segment.
Given a tolerance margin \(M\), a detected changepoint counts as a true positive if it lies within \(M\) observations of a true one; precision and recall are computed from these matches and combined as:
\[F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}\]
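Under this matching convention the metric can be sketched directly (illustrative; `f1_cp` is a hypothetical name, not the package's f1_score):

```r
# Precision, recall, and F1 for changepoints with a tolerance margin
f1_cp <- function(est, true, tol = 10) {
  tp <- sum(sapply(est, function(e) any(abs(e - true) <= tol)))
  precision <- if (length(est) > 0) tp / length(est) else 0
  recall <- sum(sapply(true, function(t) any(abs(t - est) <= tol))) / length(true)
  if (precision + recall == 0) return(0)
  2 * precision * recall / (precision + recall)
}
```

With two detections, one of which matches one of two true changepoints, both precision and recall are 0.5, so F1 is 0.5.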
ARL\(_0\) (under \(H_0\)): Average time to false alarm. We want ARL\(_0\) high.
ARL\(_1\) (under \(H_1\)): Average time from change to detection. We want ARL\(_1\) low.
Trade-off: Increasing threshold \(\rightarrow\) \(\uparrow\)ARL\(_0\), \(\uparrow\)ARL\(_1\)
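The trade-off can be seen empirically by simulating the no-change regime and recording the first false alarm (an illustrative Monte Carlo sketch; `arl0` is a hypothetical helper, and the rule is the simple cumulative sum \(|S_n| > h\) described earlier):

```r
# Monte Carlo estimate of ARL0: average time to false alarm under H0
arl0 <- function(h, reps = 500, n_max = 2000) {
  mean(replicate(reps, {
    S <- cumsum(rnorm(n_max))        # no change: pure noise
    hit <- which(abs(S) > h)
    if (length(hit) > 0) hit[1] else n_max
  }))
}

set.seed(9)
a5 <- arl0(5)
a10 <- arl0(10)
# Raising the threshold h lengthens the average time to a false alarm,
# at the cost of a longer detection delay (higher ARL1) once a change occurs.
</imports-placeholder>
```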
\[P_D = P(\text{detect change} | \text{change occurred})\]
Measured at different time horizons after the change.
\[P_{FA} = P(\text{detect change} | \text{no change occurred})\]
Per time window.
# Evaluate result against ground truth
result <- detect_regimes(data, method = "pelt")
metrics <- evaluate(result, true_changepoints = c(100, 250), tolerance = 10)
print(metrics)
# Available metrics
hausdorff_distance(result$changepoints, true_cps)
f1_score(result$changepoints, true_cps, tolerance = 10)
rand_index(result, true_cps)
adjusted_rand_index(result, true_cps)
covering_metric(result, true_cps)
precision_score(result$changepoints, true_cps, tolerance = 10)
recall_score(result$changepoints, true_cps, tolerance = 10)
mean_absolute_error(result$changepoints, true_cps)
rmse_changepoints(result$changepoints, true_cps)

The library also includes a method-comparison benchmark:
# Compare multiple methods
benchmark <- compare_methods(
data = benchmark_data,
methods = c("pelt", "bocpd", "wbs", "cusum"),
true_changepoints = true_cps
)
# Automatic visualization
plot(benchmark)

Robust estimation (robust = TRUE)

When enabled, RegimeChange uses median- and MAD-based robust M-estimates of location and scale in the segment cost functions.
This provides a breakdown point of 50% (maximum theoretical) while maintaining high efficiency for clean data.
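The 50% breakdown point is the defining property of median- and MAD-type estimators: up to half the sample can be arbitrarily corrupted before they break down. An illustrative base-R comparison:

```r
# Classical vs robust location/scale under gross contamination
set.seed(3)
x <- c(rnorm(100, 0, 1), rep(100, 30))  # ~23% gross outliers at 100

c(mean(x), sd(x))       # classical estimates are pulled far from (0, 1)
c(median(x), mad(x))    # robust estimates stay close to (0, 1)
```

This is exactly why robust cost functions keep working on the contaminated benchmark scenarios where mean/variance-based costs collapse.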
AR correction (correct_ar = TRUE)

Pre-whitening transformation: \(y'_t = y_t - \hat{\rho} \cdot y_{t-1}\), where \(\hat{\rho}\) is the estimated lag-1 autocorrelation.
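The pre-whitening step can be sketched in base R (illustrative only; the package's internal estimator of \(\rho\) may differ):

```r
# Illustrative AR(1) pre-whitening: estimate rho, then filter the series
set.seed(1)
y <- as.numeric(arima.sim(list(ar = 0.7), n = 500))

rho_hat <- acf(y, lag.max = 1, plot = FALSE)$acf[2]  # lag-1 autocorrelation
y_white <- y[-1] - rho_hat * y[-length(y)]           # y'_t = y_t - rho * y_{t-1}

# The filtered series has much weaker lag-1 autocorrelation, so changepoint
# tests calibrated for independent noise no longer over-fire
acf(y_white, lag.max = 1, plot = FALSE)$acf[2]
```

Without this step, positive autocorrelation mimics short-lived level shifts and inflates the false-alarm rate of mean-change tests.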
robust = TRUE)

“R Usability, Julia Performance, Statistical Rigor”
Principles: keep the R interface simple and consistent, delegate heavy computation to an optional Julia backend, and report uncertainty alongside every point estimate.
RegimeChange/
├── DESCRIPTION
├── NAMESPACE
├── LICENSE
├── README.md
├── R/
│ ├── data.R # Dataset documentation
│ ├── detect.R # Main function detect_regimes()
│ ├── evaluation.R # Metrics and benchmarking
│ ├── imports.R # Package imports
│ ├── julia_backend.R # Julia connection via JuliaCall
│ ├── methods_advanced.R # FPOP, Kernel CPD, E-Divisive
│ ├── methods_bayesian.R # BOCPD, Shiryaev-Roberts
│ ├── methods_deeplearning.R # Autoencoder, TCN, Transformer
│ ├── methods_frequentist.R # PELT, CUSUM, BinSeg, WBS
│ ├── priors.R # Prior distributions
│ ├── visualization.R # Plotting functions
│ └── zzz.R # Package initialization
├── inst/
│ └── julia/
│ └── RegimeChangeJulia.jl
├── man/
├── tests/
│ ├── testthat.R
│ └── testthat/
├── vignettes/
│ ├── introduction.Rmd
│ ├── offline-detection.Rmd
│ └── bayesian-methods.Rmd
└── data/
The optional Julia backend provides high-performance implementations:
# Initialize Julia (optional, for better performance)
init_julia()
# Check availability
julia_available()
julia_status()
# Benchmark R vs Julia backends
benchmark_backends(data, method = "pelt")

Julia module exports: pelt_detect, fpop_detect, bocpd_detect, cusum_detect, kernel_cpd_detect, wbs_detect, segneigh_detect, pelt_multivariate
#' Detect regime changes in time series
#'
#' @param data Numeric vector, ts, or matrix for multivariate
#' @param method Algorithm: "pelt", "bocpd", "cusum", "wbs",
#' "shiryaev", etc.
#' @param type Change type: "mean", "variance", "both", "trend",
#' "distribution"
#' @param mode Operation mode: "offline" (default), "online"
#' @param penalty For offline methods: "BIC", "AIC", "MBIC", "MDL",
#' or numeric
#' @param min_segment Minimum segment length
#' @param prior List with prior specification (for Bayesian methods)
#' @param threshold Detection threshold (for online/CUSUM methods)
#' @param correct_ar Logical: apply AR(1) correction?
#' @param robust Logical: use robust estimation?
#' @param ... Additional method-specific arguments
#'
#' @return Object of class "regime_result"
detect_regimes <- function(data, method = "pelt", ...)

# Normal-Gamma prior for BOCPD
normal_gamma(mu0 = 0, kappa0 = 1, alpha0 = 1, beta0 = 1)
# Normal prior with known variance
normal_known_var(mu0 = 0, sigma0 = 1, known_var = 1)
# Normal-Wishart prior for multivariate
normal_wishart(mu0, kappa0, nu0, Psi0)
# Poisson-Gamma prior
poisson_gamma(alpha0 = 1, beta0 = 1)
# Inverse-Gamma prior for variance
inverse_gamma_var(alpha0 = 1, beta0 = 1)
# Hazard rate priors
geometric_hazard(lambda = 0.01)
constant_hazard(rate = 0.01)
negbin_hazard(alpha = 1, beta = 1)

# Create online detector
detector <- regime_detector(method = "bocpd", prior = normal_gamma())
# Update with new observation
result <- update(detector, x)
# Reset after detection
detector <- reset(detector)

# Plot detection results
plot(result) # S3 method
plot(result, type = "posterior") # Posterior probabilities
plot(result, type = "segments") # Segment visualization
# Additional plots
plot_summary(result)
plot_interactive(result) # Requires plotly
plot_compare(result1, result2) # Compare methods

# Evaluate against ground truth
evaluate(result, true_changepoints, tolerance = 5)
# Individual metrics
f1_score(detected, true, tolerance)
hausdorff_distance(detected, true)
rand_index(result, true)
adjusted_rand_index(result, true)
covering_metric(result, true)
precision_score(detected, true, tolerance)
recall_score(detected, true, tolerance)
# Compare methods
compare_methods(data, methods = c("pelt", "bocpd"),
true_changepoints = true_cps)

RegimeChange is designed to complement equation discovery workflows:
| Library | Question |
|---|---|
| RegimeChange | When do the system’s properties change? |
| EmpiricalDynamics | What is the equation governing the system? |
Potential workflow: Instead of searching for ONE equation for the entire dataset, discover different equations for each regime:
\[\text{Raw data} \rightarrow \text{RegimeChange} \rightarrow \text{Segments} \rightarrow \text{EmpiricalDynamics} \rightarrow \text{Switching system}\]
Note: robust = TRUE alone typically works better than combining both corrections.

| Term | Definition |
|---|---|
| Changepoint | Point in time where the statistical properties of a series change |
| Regime | Time period with homogeneous statistical properties |
| Run length | Time elapsed since the last changepoint |
| ARL | Average Run Length: average time until a decision |
| Hazard rate | Instantaneous probability that a change occurs |
| Posterior | Probability distribution after observing the data |
| PELT | Pruned Exact Linear Time: optimal segmentation algorithm |
| BOCPD | Bayesian Online Changepoint Detection |
| CUSUM | Cumulative Sum: cumulative statistic for detection |
@software{regimechange2024,
title = {RegimeChange: Robust Changepoint Detection for Time Series},
author = {Gómez Julián, José Mauricio},
year = {2024},
url = {https://github.com/IsadoreNabi/RegimeChange}
}

MIT © José Mauricio Gómez Julián