mcp: Regression with Multiple Change Points

mcp Github Actions status mcp Coveralls status mcp CRAN status mcp CRAN downloads

mcp does regression with one or Multiple Change Points (MCP) between Generalized and hierarchical Linear Segments using Bayesian inference. mcp aims to provide maximum flexibility for analyses with a priori knowledge about the number of change points and the form of the segments in between.

Change points are also called switch points, break points, broken line regression, broken stick regression, bilinear regression, piecewise linear regression, local linear regression, segmented regression, and (performance) discontinuity models. mcp aims to be be useful for all of them. See how mcp compares to other R packages.

Under the hood, mcp takes a formula-representation of linear segments and turns it into JAGS code. mcp leverages the power of tidybayes, bayesplot, coda, and loo to make change point analysis easy and powerful.

Install

  1. Install the latest version of JAGS. Linux users can fetch binaries here.

  2. Install from CRAN:

    or install the development version from GitHub:

At a glance

Here are some example mcp models. mcp takes a list of formulas - one for each segment. The change point(s) are the x at which data changes from being better predicted by one formula to the next. The first formula is just response ~ predictors and the most common formula for segment 2+ would be ~ predictors (more details here).

Scroll down to see brief introductions to each of these, or browse the website articles for more thorough worked examples and discussions.

Brief worked example

Fit a model

The following model infers the two change points between three segments.

library(mcp)

# Define the model
model = list(
  response ~ 1,  # plateau (int_1)
  ~ 0 + time,    # joined slope (time_2) at cp_1
  ~ 1 + time     # disjoined slope (int_3, time_3) at cp_2
)

# Get example data and fit it
ex = mcp_example("demo")
fit = mcp(model, data = ex$data)

Plot and summary

The default plot includes data, fitted lines drawn randomly from the posterior, and change point(s) posterior density for each chain:

plot(fit)

Use summary() to summarise the posterior distribution as well as sampling diagnostics. They were simulated using mcp (see mcp_example("demo")$call) so the summary include the “true” values in the column sim and the column match show whether this true value is within the interval:

summary(fit)
Family: gaussian(link = 'identity')
Iterations: 9000 from 3 chains.
Segments:
  1: response ~ 1
  2: response ~ 1 ~ 0 + time
  3: response ~ 1 ~ 1 + time

Population-level parameters:
    name match  sim  mean lower  upper Rhat n.eff
    cp_1    OK 30.0 30.27 23.19 38.760    1   384
    cp_2    OK 70.0 69.78 69.27 70.238    1  5792
   int_1    OK 10.0 10.26  8.82 11.768    1  1480
   int_3    OK  0.0  0.44 -2.49  3.428    1   810
 sigma_1    OK  4.0  4.01  3.43  4.591    1  3852
  time_2    OK  0.5  0.53  0.40  0.662    1   437
  time_3    OK -0.2 -0.22 -0.38 -0.035    1   834

rhat is the Gelman-Rubin convergence diagnostic, eff is the effective sample size. You may also want to do a posterior predictive check using pp_check(fit).

plot_pars(fit) can be used to inspect the posteriors and convergence of all parameters. See the documentation of plot_pars() for many other plotting options. Here, we plot just the (population-level) change points. They often have “strange” posterior distributions, highlighting the need for a computational approach:

plot_pars(fit, regex_pars = "cp_")

Use fitted(fit) and predict(fit) to get fits and predictions for in-sample and out-of-sample data.

Tests and model comparison

We can test (joint) probabilities in the model using hypothesis() (see more here). For example, what is the evidence (given priors) that the first change point is later than 25 against it being less than 25?

hypothesis(fit, "cp_1 > 25")
     hypothesis mean lower upper     p   BF
1 cp_1 - 25 > 0 5.27 -1.81 13.76 0.917 11.1

For model comparisons, we can fit a null model and compare the predictive performance of the two models using (approximate) leave-one-out cross-validation (see more here). Our null model omits the first plateau and change point, essentially testing the credence of that change point:

# Define the model
model_null = list(
  response ~ 1 + time,  # intercept (int_1) and slope (time_1)
  ~ 1 + time            # disjoined slope (int_2, time_1)
)

# Fit it
fit_null = mcp(model_null, ex$data)

Leveraging the power of loo::loo, we see that the two-change-points model is preferred (it is on top), but the elpd_diff / se_diff ratio ratio indicate that this preference is not very strong.

fit$loo = loo(fit)
fit_null$loo = loo(fit_null)

loo::loo_compare(fit$loo, fit_null$loo)
       elpd_diff se_diff
model1  0.0       0.0
model2 -7.6       4.6

Highlights from in-depth guides

The articles on the mcp website go in-depth with the functionality of mcp. Here is an executive summary, to give you a quick sense of what mcp can do.

About mcp models and simulating data: * Parameter names are int_i (intercepts), cp_i (change points), x_i (slopes), phi_i (autocorrelation), and sigma_* (variance). * The change point model is basically an ifelse model. * Use rel() to specify that parameters are relative to those corresponding in the previous segments. * Generate data for all supported models using fit$simulate(). See examples in, e.g., mcp_examples("demo")$call.

Using priors: * See priors in fit$prior. * Set priors using mcp(..., prior = list(cp_1 = "dnorm(0, 1)", cp_1 = "dunif(0, 45)"). * The default prior for change points is fast for estimation but is mathematically “messy”. The Dirichlet prior (cp_i = "dirichlet(1)") is slow but beautiful. * Fix parameters to specific values using cp_1 = 45. * Share parameters between segments using slope_1 = "slope_2". * Truncate priors using T(lower, upper), e.g., int_1 = "dnorm(0, 1) T(0, )". mcp applies this automatically to change point priors to enforce order restriction. This is true for varying change points too. * Do prior predictive checks using mcp(model, data, sample = "prior").

Varying change points: * Get posteriors using ranef(fit). * Plot using plot(fit, facet_by = "my_group") and plot_pars(fit, pars = "varying", type = "dens_overlay", ncol = 3). * The default priors restrict varying change points to lie between the two adjacent change points.

Supported families and link functions: * mcp currently supports specific combinations of families (gaussian(), binomial(), bernoulli(), and poisson()) and link functions (identity, logit, probit, and log). * Use informative priors to avoid issues when using non-default priors. * Use binomial(link = "logit") for binomial change points in mcp. Also relevant for bernoulli(link = "logit"). * Use poisson(link = "log") for Poisson change points in mcp. * Get results on the parameter scale rather than the observed scale using plot(fit, scale = "linear") or predict(fit, scale = "linear").

Model comparison and hypothesis testing: * Do Leave-One-Out Cross-Validation using loo(fit) and loo::loo_compare(fit1$loo, fit2$loo). * Compute Savage-Dickey density rations using hypothesis(fit, "cp_1 = 40"). * Leverage directional and conditional tests to assess interval hypotheses (hypothesis(fit, "cp_1 > 30 & cp_1 < 50")), combined other hypotheses (hypothesis(fit, "cp_1 > 30 & int_1 > int_2")), etc.

Modeling variance and autoregression: * ~ sigma(1) models an intercept change in variance. ~ sigma(0 + x) models increasing/decreasing variance. * ~ ar(N) models Nth order autoregression on residuals. ~ar(N, 0 + x) models increasing/decreasing autocorrelation. * You can model anything for sigma() and ar(). For example, ~ x + sigma(1 + x + I(x^2)) models polynomial change in variance with x on top of a slope on the mean.

Get fitted and predicted values and intervals: * fitted(fit) and predict(fit) take many arguments to predict in-sample and out-of-sample values and intervals. * Forecasting with prior knowledge about future change points.

Tips, tricks, and debugging * Speed up fitting using mcp(..., cores = 3) / options(mcp_cores = 3), and/or fewer iterations, mcp(..., adapt = 500). * Help convergence along using mcp(..., inits = list(cp_1 = 20, int_2 = -3)). * Most errors will be caused by circularly defined priors.

Some examples

mcp aims to support a wide variety of models. Here are some example models for inspiration.

Means

Find the single change point between two plateaus (simulated using mcp_example("intercepts")$call).

model = list(
    y ~ 1,  # plateau (int_1)
    ~ 1     # plateau (int_2)
)
ex = mcp_example("intercepts")
fit = mcp(model, ex$data, par_x = "x")
plot(fit)

Varying change points

Here, we find the single change point between two joined slopes. While the slopes are shared by all participants, the change point varies by id. Read more about varying change points in mcp.

model = list(
  y ~ 1 + x,          # intercept + slope
  1 + (1|id) ~ 0 + x  # joined slope, varying by id
)
ex = mcp_example("varying")
fit = mcp(model, ex$data)
plot(fit, facet_by = "id")

Summarise the varying change points using ranef() or plot them using plot_pars(fit, "varying"). Again, this data was simulated using mcp (see mcp_example("varying")$call) so the columns match and sim are added to show simulation values and whether they are inside the interval. Set the width wider for a more lenient criterion.

ranef(fit, width = 0.98)
           name match   sim  mean   lower   upper Rhat n.eff
 cp_1_id[Benny]    OK -17.5 -18.1 -21.970 -14.877    1   895
  cp_1_id[Bill]    OK -10.5  -7.6 -10.658  -4.451    1   420
  cp_1_id[Cath]    OK  -3.5  -2.8  -5.634   0.027    1   888
  cp_1_id[Erin]    OK   3.5   3.1   0.041   5.952    1  3622
  cp_1_id[John]    OK  10.5  11.3   7.577  14.989    1  2321
  cp_1_id[Rose]    OK  17.5  14.1  10.485  18.079    1  5150

Generalized linear models

mcp supports Generalized Linear Modeling. See extended examples using binomial() and poisson().

Here is a binomial change point model with three segments (see simulation code: mcp_example("binomial")$call). We plot the 95% HDI too:

model = list(
  y | trials(N) ~ 1,  # constant rate
  ~ 0 + x,            # joined changing rate
  ~ 1 + x             # disjoined changing rate
)
ex = mcp_example("binomial")
fit = mcp(model, ex$data, family = binomial())
plot(fit, q_fit = TRUE)

Use plot(fit, rate = FALSE) if you want the points and fit lines on the original scale of y rather than divided by N.

Time series

mcp allows for flexible time series analysis with autoregressive residuals of arbitrary order. Below, we model a change from a plateau with strong positive AR(2) residuals to a slope with medium AR(1) residuals. These data were simulated with mcp (see simulation code: mcp_example("ar")$call) and the generating values are in the sim column. You can also do regression on the AR coefficients themselves using e.g., ar(1, 1 + x). Read more here.

model = list(
  price ~ 1 + ar(2),
  ~ 0 + time + ar(1)
)
ex = mcp_example("ar")
fit = mcp(model, ex$data)
summary(fit)

The AR(N) parameters on intercepts are named ar[order]_[segment]. All parameters, including the change point, are well recovered:

Population-level parameters:
    name match   sim    mean     lower   upper Rhat n.eff
   ar1_1    OK   0.7   0.741  5.86e-01   0.892 1.01   713
   ar1_2    OK  -0.4  -0.478 -6.88e-01  -0.255 1.00  2151
   ar2_1    OK   0.2   0.145 -6.56e-04   0.284 1.01   798
    cp_1       120.0 117.313  1.14e+02 118.963 1.05   241
   int_1        20.0  17.558  1.51e+01  19.831 1.02   293
 sigma_1    OK   5.0   4.829  4.39e+00   5.334 1.00  3750
  time_2    OK   0.5   0.517  4.85e-01   0.553 1.00   661

The fit plot shows the inferred autocorrelated nature:

plot(fit_ar)

Variance change and prediction intervals

You can model variance by adding a sigma() term to the formula. The inside sigma() can take everything that the formulas outside do. Read more in the article on variance. The example below models two change points. The first is variance-only: variance abruptly increases and then declines linearly with x. The second change point is the stop of the variance-decline and the onset of a slope on the mean.

Effects on variance is best visualized using prediction intervals. See more in the documentation for plot.mcpfit().

model = list(
  y ~ 1,
  ~ 0 + sigma(1 + x),
  ~ 0 + x
)
ex = mcp_example("variance")
fit = mcp(model, ex$data, cores = 3, adapt = 5000, iter = 5000)
plot(fit, q_predict = TRUE)

Quadratic and other exponentiations

Write exponents as I(x^N). E.g., quadratic I(x^2), cubic I(x^3), or some other power function I(x^1.5). The example below detects the onset of linear + quadratic growth. This is often called the BLQ model (Broken Line Quadratic) in nutrition research.

model = list(
  y ~ 1,
  ~ 0 + x + I(x^2)
)
ex = mcp_example("quadratic")
fit = mcp(model, ex$data)
plot(fit)

Trigonometric and others

You can use sin(x), cos(x), and tan(x) to do trigonometry. This can be useful for seasonal trends and other periodic data. You can also do exp(x), abs(x), log(x), and sqrt(x), but beware that the two latter will currently fail in segment 2+. Raise an issue if you need this.

model = list(
  y ~ 1 + sin(x),
  ~ 1 + cos(x) + x
)
ex = mcp_example("trigonometric")
fit = mcp(model, ex$data)
plot(fit)

Using rel() and priors

Read more about formula options and priors.

Here we find the two change points between three segments. The slope and intercept of segment 2 are parameterized relative to segment 1, i.e., modeling the change in intercept and slope since segment 1. So too with the second change point (cp_2) which is now the distance from cp_1.

Some of the default priors are overwritten. The first intercept (int_1) is forced to be 10, the slopes are in segment 1 and 3 is shared. It is easy to see these effects in the (simulated) data because they violate it somewhat. The first change point has to be at x = 20 or later.

model = list(
  y ~ 1 + x,
  ~ rel(1) + rel(x),
  rel(1) ~ 0 + x
)

prior = list(
  int_1 = 10,  # Constant, not estimated
  x_3 = "x_1",  # shared slope in segment 1 and 3
  int_2 = "dnorm(0, 20)",
  cp_1 = "dunif(20, 50)"  # has to occur in this interval
)
ex = mcp_example("rel_prior")
fit = mcp(model, ex$data, prior, iter = 10000)
plot(fit)

Comparing the summary to the fitted lines in the plot, we can see that int_2 and x_2 are relative values. We also see that the “wrong” priors made it harder to recover the parameters used to simulate this code (see mcp_example("rel_prior")$simulated which is represented in the match and sim columns):

summary(fit)
Population-level parameters:
    name match   sim  mean  lower upper Rhat n.eff
    cp_1    OK  25.0 23.15  20.00 25.81 1.00   297
    cp_2        40.0 51.85  47.06 56.36 1.02   428
   int_1        25.0 10.00  10.00 10.00  NaN     0
   int_2    OK -10.0 -6.86 -21.57 11.89 1.03   190
 sigma_1         7.0  9.70   8.32 11.18 1.00  7516
     x_1         1.0  1.58   1.24  1.91 1.07   120
     x_2    OK  -3.0 -3.28  -3.61 -2.96 1.04   293
     x_3         0.5  1.58   1.24  1.91 1.07   120

Do much more with the MCMC samples

Don’t be constrained by these simple mcp functions. fit$samples is a regular mcmc.list object and all methods apply. You can work with the MCMC samples just as you would with brms, rstanarm, jags, or other samplers using the always excellent tidybayes:

library(tidybayes)

# Extract all parameters:
tidy_draws(fit$samples) %>%
  # tidybayes stuff here

# Extract some parameters:
fit$pars$model  # check out which parameters are inferred.
spread_draws(fit$samples, cp_1, cp_2, int_1, year_1) %>%
 # tidybayes stuff here

It may be convenient to use fitted(fit, summary = FALSE) or predict(fit, summary = FALSE) which return draws in tidybayes format, extended with additional columns for fits and predictions. For example:

head(fitted(fit, summary = FALSE))

Citation

This preprint formally introduces mcp. Find citation info at the link, call citation("mcp") or copy-paste this into your reference manager:

  @Article{,
    title = {mcp: An R Package for Regression With Multiple Change Points},
    author = {Jonas Kristoffer Lindeløv},
    journal = {OSF Preprints},
    year = {2020},
    doi = {10.31219/osf.io/fzqxv},
    encoding = {UTF-8},
  }