--- title: "odin" author: "Rich FitzJohn" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{odin} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ``` {r setup, echo = FALSE, results = "hide"} lang_output <- function(x, lang) { cat(c(sprintf("```%s", lang), x, "```"), sep = "\n") } c_output <- function(x) lang_output(x, "cc") r_output <- function(x) lang_output(x, "r") plain_output <- function(x) lang_output(x, "plain") knitr::opts_chunk$set( fig.width = 7, fig.height = 5) options(odin.verbose = FALSE) ``` # Single variable: Logistic growth We'll first start with a very simple model; logistic growth. We have one variable `N` which grows towards a carrying capacity `K`. The growth rate is `r` when `N` is very small (note that this equation can be easily solved analytically and it's only included here for demonstration!) ``` {r } path_logistic <- system.file("examples/logistic.R", package = "odin") ``` ``` {r echo = FALSE, results = "asis"} r_output(readLines(path_logistic)) ``` The time derivatives are indicated by `deriv(N) <- `; the right hand side of this is the differential equation. Initial conditions are always given (future versions may make them optional) and can be in terms of variables. Note that the definition of declaration is not important; the derivative calculation references parameters `r` and `K` which have not been defined and the initial conditions reference `N0`. **NOTE** while this _looks_ like R (and must *parse* as R) it is not actually R code. Copying and pasting this code into an R session would throw an error To compile this as a standalone model (not suitable for inclusion in a package) use `odin::odin`; ``` {r } generator <- odin::odin(path_logistic) ``` This returns an R6 generator that can be used to create an instance of the model: ``` {r } mod <- generator$new() mod ``` This is an `ode_system` object and can be used to run the system of differential equations, query the internal state, and so on. Most of the listed elements above do not need to be accessed ever (I would make them private but that incurs a very slight performance cost). The initial conditions: ``` {r } mod$init ``` dN/dt evaluated at the time zero, with the initial conditions: ``` {r } mod$deriv(0, mod$initial(0)) ``` or with arbitrary conditions: ``` {r } mod$deriv(0, 50) ``` To see the contents of all the variables, intermediates and parameters tracked by odin use `contents()`: ``` {r } mod$contents() ``` To run the model, provide a set of times: ``` {r } tt <- seq(0, 30, length.out = 101) y <- mod$run(tt) plot(y, xlab = "Time", ylab = "N", las = 1, main = "") ``` While initial conditions are computed, you may pass in arbitrary initial conditions; ``` {r } y2 <- mod$run(tt, 50) plot(y, xlab = "Time", ylab = "N", las = 1, main = "") lines(y2, col = "red") ``` ## Specifying parameters The above is about the simplest possible useful model I could come up with. But it's not that useful because all the parameters are hard coded into the model. Here's a slightly tweaked version where the parameters may be specified at runtime: ``` {r } generator <- odin::odin({ deriv(N) <- r * N * (1 - N / K) initial(N) <- N0 N0 <- user(1) K <- user(100) r <- user() }) ``` when used as `user(1)` (as for `N` and `K`) it means we allow a user parameter called `N0` to be specified, but if omitted it has a default value of 1. When used as `user()` (as for `r`) it means that there is no default value and `r` **must** be provided. Note this example takes the code of the model as its first argument. It can also be specified as text, but the form above should be nicer to write than text, especially in editors with support for things like autocompletion. Because `r` is required, it is an error to initialise the model without any parameters: ``` {r error = TRUE, purl = FALSE} generator$new() ``` Providing `r`: ``` {r } mod <- generator$new(r = 1) ``` The model contains the set value of r ``` {r } mod$contents()$r ``` Running the model shows the effect of doubling the growth rate: ``` {r } y3 <- mod$run(tt) plot(y, xlab = "Time", ylab = "N", las = 1, main = "") lines(y3, col = "red") ``` Once created a model can have the parameters re-set: ``` {r } mod$set_user(r = 0.25, K = 75, N0 = 10) y4 <- mod$run(tt) plot(y, xlab = "Time", ylab = "N", las = 1, main = "") lines(y3, col = "red") lines(y4, col = "blue") ``` # More than one variable: the Lorenz attractor Once more than one variable is involved, there is a bit more book-keeping to take care of. ``` {r } path_lorenz <- system.file("examples/lorenz.R", package = "odin") ``` The Lorenz attractor is a set of coupled ODEs that displays chaotic behaviour. It was found by reducing a set of equations describing atmospheric convection in the 1960s. Mostly I just think it looks pretty. ``` {r echo = FALSE, results = "asis"} r_output(readLines(path_lorenz)) ``` The system is not chaotic with all parameters. The initial conditions are fairly arbitrary here as the system will settle into its characteristic shape from most points (though with completely different realisations). Building the generator, and from that a system, is the same as the above: ``` {r } generator <- odin::odin(path_lorenz) mod <- generator$new() ``` Because of the rapid changes that characterise this model, we'll take a _lot_ of samples. ``` {r } tt <- seq(0, 100, length.out = 20000) system.time(y <- mod$run(tt)) pairs(y[, -1L], panel = lines, lwd = .2, col = "#00000055") ``` # Delay models Models with lags and delays represent a challenge in solving and representing the system, see [this wikipedia article](https://en.wikipedia.org/wiki/Delay_differential_equation) for some of the issues. In the general case this gives a set of partial differential equations, which are much harder to solve generally than ordinary differential equations. Like Berkeley Madonna, `odin` allows solving equations with lags, using the machinery available in `deSolve`. One can specify a lag in a state variable or an expression. The lag times can also be variables. ``` {r } gen <- odin::odin({ ylag <- delay(y, tau) initial(y) <- 0.5 deriv(y) <- 0.2 * ylag * 1 / (1 + ylag^10) - 0.1 * y tau <- user(10) output(ylag) <- ylag }) dde <- gen$new() ``` The first line here ```r ylag <- delay(y, tau) ``` specifies the delay; it says that `ylag` should be equal to `y` from `tau` ago. Or; `ylag(t) = y(t - tau)`. `delay` expressions must be the only expressions on the rhs of an assignment. The first argument can be a single variable (like here) or it can be an expression. Running the model will automatically use the `deSolve::dede` function. ``` {r } t <- seq(0, 300, length.out = 301) y1 <- dde$run(t) plot(y1, ylab = "y", mfrow = c(1, 2), which = 1) plot(y1[, -1L], xlab = "y", ylab = "ylag", mfrow = NULL, type = "l") ``` Increasing the lag time creates much more complicated trajectories: ``` {r } dde$set_user(tau = 20) y2 <- dde$run(t) plot(y2, ylab = "y", mfrow = c(1, 2), which = 1) plot(y2[, -1L], xlab = "y", ylab = "ylag", mfrow = NULL, type = "l") ``` # Arrays Much of the above, especially the non-delay equations, can be expressed easily in deSolve. Even where the target functions are written in R they will tend to be fast enough to solve (unless strongly non-linear) so that the gains for writing out in C become small. `odin` was really designed for cases where the variables involved are _arrays_, used here to mean vectors, matrices or 3d matrices. There are quite a few restrictions on how arrays are used, how indexing works, etc. Some of the current implementation requires creation of intermediate variables to make everything work (and naming things is one of the [two hard problems in computer science](https://martinfowler.com/bliki/TwoHardThings.html). Hopefully in future versions some of these restrictions can be relaxed, so [file issues](https://github.com/mrc-ide/odin/issues) if you find yourself writing boilerplate code that could be replaced by more intelligent generation. ## Generalised Lotka-Volterra model This example comes from the [wikipedia page](https://en.wikipedia.org/wiki/Competitive_Lotka%E2%80%93Volterra_equations#4-dimensional_example), in turn from [Vano et al 2006](https://sprott.physics.wisc.edu/pubs/paper288.pdf). This model has four non-linear differential equations that have chaotic behaviour in fairly low dimension. ``` {r } gen <- odin::odin({ deriv(y[]) <- r[i] * y[i] * (1 - sum(ay[i, ])) initial(y[]) <- y0[i] y0[] <- user() r[] <- user() a[, ] <- user() ay[, ] <- a[i, j] * y[j] dim(r) <- user() n_spp <- length(r) dim(y) <- n_spp dim(y0) <- n_spp dim(a) <- c(n_spp, n_spp) dim(ay) <- c(n_spp, n_spp) config(base) <- "lv4" }) ``` These are the parameters from the wikipedia article, with `y0` being close to the (unstable) equilibrium. ``` {r } pars <- list(r = c(1.00, 0.72, 1.53, 1.27), a = rbind(c(1.00, 1.09, 1.52, 0.00), c(0.00, 1.00, 0.44, 1.36), c(2.33, 0.00, 1.00, 0.47), c(1.21, 0.51, 0.35, 1.00)), y0 = c(0.3013, 0.4586, 0.1307, 0.3557)) ``` Note that we pass in `r` as a matrix, from which the number of species is determined. The competitive matrix `a` is passed through as a matrix -- both dimensions must be the same as the length of `r` (you can't just pass in a vector and hope it gets unpacked in the right order). ``` {r } mod <- gen$new(user = pars) t <- seq(0, 2000, length.out = 10001) y <- mod$run(t) pairs(y[, -1], panel = lines, col = "#00000055", lwd = 0.2) ``` # Interpolating functions It may be useful to have functions that are driven by user data in your model. To allow this, odin allows interpolating functions of a few different types; * step functions: constant piecewise functions, perhaps suitable for modelling shifts in regime, binary on/off switches, etc. * linear interpolation: between a set of points use a linear interpolating function. * cubic spline functions: what most people probably think of when thinking interpolating function; creates a smooth function that is twice differentiable (and smooth enough for most purposes). In contrast with the splines R uses with `spline()`, these use the *natural spline* end conditions, and are equal to `spline(t, y, method = "natural")`. The `interpolate` takes as its first argument a vector of times. If you're using constant or linear interpolating functions it is important that you include these times within times that the integrator passes through because otherwise the integrator may struggle needlessly with the discontinuities in the function or its derivatives (NOTE: future versions may handle this automatically, and this will not work at all with `dde` which does not yet support stopping at critical times). The second argument of each interpolating function is a vector, matrix or array of values for the target function. The dimensionality (rank) of the structure must be one greater than the rhs. So if you want a classic 1d function, you'd write: ```r z <- interpolate(tt, y) tt[] <- user() y[] <- user() dim(tt) <- user() dim(y) <- length(tt) ``` and `z` will be a scalar double, and `t` and `y` are both vectors with lengths that must match. The check for matching is done at runtime (when the model is initialised) so you can always just leave the dimensions as `user()`. Future versions may allow autogeneration of the boilerplate here, but for now you must specify the four lines required to initialise `t` and `y`. Note that `t` cannot be used as the first argument because it's a reserved variable! If you want to interpolate a *vector* of `y` values over time, then you'd write: ```r z[] <- interpolate(tt, y) dim(z) <- 4 tt[] <- user() y[, ] <- user() dim(tt) <- user() dim(y) <- c(length(tt), length(z)) ``` In this case, `z` is a vector (here length 4), so `y` must be a matrix with `length(t)` rows and 4 columns. This does not do anything clever - the interpolation happens for each of the four variables independently. If you want to drive the size of this off of `y` then specify: ```r dim(z) <- ncol(y) dim(z) <- user() # (other entries same as above) ``` in which case the size of `z` is determined from the number of columns of `y`. By default, `odin` assumes you want do do cubic interpolation, but you can alter that with the third argument. Specify `"constant"` to indicate constant interpolation and `"linear"` to indicate linear interpolation. All modes of interpolation use a fast lookup (bisection search that remembers where it started from on the last lookup) to try and make interpolation as fast as possible. Performance should be *O(log(n))* in the number of points used to create the interpolating function, so the penalty for using many points in your interpolation should be low. No care is taken about making sure that the integration stops at painful parts of your interpolating function. So, especially for the constant interpolation, be sure to include the switch points in the time vector (currently this will not work for `dde` which never stops at a specified point except for the start and end point). Future versions, especially once events are handled, may do this automatically. As an example, here is the flux model from the deSolve compiledCode vignette (see `vignette("compiledCode")`, section 6). The model here is: dC/dt = flux(t) - k * C ``` {r } flux_model <- odin::odin({ deriv(C) <- flux - kk * C initial(C) <- C0 flux <- interpolate(flux_t, flux_y, "linear") C0 <- user() kk <- user() output(deposition) <- kk * C ## Fair bit of boilerplate here that may be removed in future ## versions: flux_t[] <- user() flux_y[] <- user() dim(flux_t) <- user() dim(flux_y) <- user() }) ``` Here we arrange for the components of the interpolating function to be passed through as user vectors, which is the only way of specifying these models at present. The interpolation type, here "linear", is hard coded and cannot be changed at runtime. ``` {r } flux_t <- c(1, 11, 21, 41, 73, 83, 93, 103, 113, 123, 133, 143, 153, 163, 173, 183, 194, 204, 214, 224, 234, 244, 254, 264, 274, 284, 294, 304, 315, 325, 335, 345, 355, 365) flux_y <- c(0.654, 0.167, 0.06, 0.07, 0.277, 0.186, 0.14, 0.255, 0.231, 0.309, 1.127, 1.923, 1.091, 1.001, 1.691, 1.404, 1.226, 0.767, 0.893, 0.737, 0.772, 0.726, 0.624, 0.439, 0.168, 0.28, 0.202, 0.193, 0.286, 0.599, 1.889, 0.996, 0.681, 1.135) plot(flux_t, flux_y, type = "l", ylab = "Flux", xlab = "Time") ``` Following the deSolve vignette, the initial conditions are derived from the parameters and the flux over time: ``` {r } k <- 0.01 C0 <- mean(approx(flux_t, flux_y, xout = 1:365)$y) / k ``` Initialise the model: ``` {r } mod <- flux_model$new(kk = k, C0 = C0, flux_t = flux_t, flux_y = flux_y) ``` And run over one year: ``` {r } t <- seq(1, 365) y <- mod$run(t, tcrit = 365) ``` The output in all its glory: ``` {r } plot(flux_t, flux_y, type = "l", col = "red", ylab = "Flux & deposition") lines(t, y[, 3], col = "blue") ``` Alternatively, we could have used constant or spline interpolation. This (at least at present) requires re-specifying and recompiling the entire model. So, for a constant model: ``` {r } flux_model_c <- odin::odin({ deriv(C) <- flux - kk * C initial(C) <- C0 flux <- interpolate(flux_t, flux_y, "constant") C0 <- user() kk <- user() output(flux) <- flux output(deposition) <- kk * C ## Fair bit of boilerplate here that may be removed in future ## versions: flux_t[] <- user() flux_y[] <- user() dim(flux_t) <- user() dim(flux_y) <- user() }) mod_c <- flux_model_c$new(kk = k, C0 = C0, flux_t = flux_t, flux_y = flux_y) y_c <- mod_c$run(t, tcrit = 365) mod_c$output_order plot(t, y_c[, 3], type = "s", col = "red", ylab = "Flux & deposition") lines(t, y_c[, 4], col = "blue") ``` Or with cubic splines: ``` {r } flux_model_s <- odin::odin({ deriv(C) <- flux - kk * C initial(C) <- C0 flux <- interpolate(flux_t, flux_y, "spline") C0 <- user() kk <- user() output(flux) <- flux output(deposition) <- kk * C ## Fair bit of boilerplate here that may be removed in future ## versions: flux_t[] <- user() flux_y[] <- user() dim(flux_t) <- user() dim(flux_y) <- user() }) mod_s <- flux_model_s$new(kk = k, C0 = C0, flux_t = flux_t, flux_y = flux_y) y_s <- mod_s$run(t, tcrit = 365) plot(t, y_s[, 3], type = "l", col = "red", ylab = "Flux & deposition") lines(t, y_s[, 4], col = "blue") ```