---
title: "Getting started with Rage"
author:
  - Patrick Barks
  - William K. Petry
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    depth: 2
vignette: >
  %\VignetteIndexEntry{Getting started with Rage}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```


# Overview

`Rage` provides tools for manipulating and analysing matrix population models (MPMs). This vignette introduces the structure of the input and highlights key analysis functions across the package.

# Recap: Assumed knowledge
The functions in `Rage` assume that the reader is familiar with the basic structure of MPMs and how they are used to project population change and basic demographic statistics (e.g., the equilibrium population growth rate, $\lambda$). This section is intended as a recap of these concepts--readers ready to begin calculating more advanced life history metrics may want to skip ahead to [the next section](#mpms_in_r). We refer readers to [Caswell (2001)](#references) for a comprehensive introduction to MPMs.

## R.1. Basic anatomy of life cycles and MPMs
Matrix population models are a mathematical tool that integrate population dynamics and population structure to model the dynamics of populations with different 'types' of individuals, whether due to age, life stage, sex, genotype or any other attribute that causes differences in demographic rates. Consider a plant species with three different types of individuals: **seedlings** that have newly recruited into the population from seeds, non-reproductive **rosettes**, and **flowering** adults that make seeds and also reproduce asexually by vegetative budding. We can represent the biology of this life cycle graphically (Fig. 1) where each life stage is a node and the life stages are connected with arrows showing the paths (and probabilities) that individuals move between stages.

```{r,echo=FALSE,fig.align='center',fig.height=2,fig.width=6}
# hidden code to produce figures
library(DiagrammeR)
matA <- rbind(
  c(0.0, 0.0, 3.2),
  c(0.5, 0.3, 0.8),
  c(0.0, 0.4, 0.9)
)
stages <- c("seedling", "rosette", "flowering")
title <- NULL
graph <- expand.grid(to = stages, from = stages)
graph$trans <- round(c(matA), 3)
graph <- graph[graph$trans > 0, ]
nodes <- paste(paste0("'", stages, "'"), collapse = "; ")
graph$min_len <- (as.numeric(graph$to) - as.numeric(graph$from)) * 3
graph$col <- c(
  "PaleGreen4", "PaleGreen4", "PaleGreen4", "Goldenrod1",
  "MediumOrchid4", "PaleGreen4"
)
edges <- paste0("'", graph$from, "'", " -> ", "'", graph$to, "'",
  "[minlen=", graph$min_len,
  ",fontsize=", 10,
  ",color=", graph$col,
  ",xlabel=", paste("\"", graph$trans),
  "\"]\n",
  collapse = ""
)
grViz(
  paste(
    "
digraph {
  {
    graph[overlap=false];
    rank=same;
    node [shape=", "egg", ", fontsize=", 12, "];",
    nodes, "
  }",
    "ordering=out
  x [style=invis]
  x -> {", nodes, "} [style=invis]", edges,
    "labelloc=\"t\";
  label=\"", title, "\"
}"
  )
)
```
**Figure 1.** Life cycle diagram for a hypothetical plant with three life stages (nodes). Arrows are coloured by conventional groupings of demographic processes: growth and survival (green), sexual reproduction via seeds (gold) and asexual reproduction via budding (purple). The numbers along each arrow indicate the transition probability--how many individuals will there be in the stage at the arrow's end at the next time step for each individual at the arrow's start?

The **transition** or **projection matrix** (Fig. 2; called __A__ in matrix notation) for this life cycle is a square matrix where the rows and columns correspond to the life stages (nodes) and the elements correspond to transition probabilities (arrows). By convention, the columns reflect the current life stage (i.e., the arrow's start) and the rows reflect the arrow's end.

```{r,echo=FALSE,fig.align='center',fig.height=4,fig.width=4}
library(ggplot2)
ggdat <- merge(graph,
  expand.grid(to = stages, from = stages),
  by = c("to", "from"),
  all.x = TRUE, all.y = TRUE
)
ggdat$trans[is.na(ggdat$trans)] <- 0
ggdat$col[is.na(ggdat$col)] <- "transparent"
ggdat$to <- factor(ggdat$to, levels = c("flowering", "rosette", "seedling"))
ggdat$from <- factor(ggdat$from, levels = c("seedling", "rosette", "flowering"))
ggplot(ggdat, aes(x = from, y = to, label = trans)) +
  geom_tile(color = "black", fill = "white", linewidth = 0.25, show.legend = FALSE) +
  geom_text(size = 6) +
  scale_x_discrete(position = "top") +
  labs(x = "current life stage", y = "life stage at time t+1") +
  coord_equal(expand = FALSE) +
  theme_bw(base_size = 18) +
  theme(panel.border = element_blank())
```
**Figure 2.** Transition (projection) matrix corresponding to the life cycle in Fig. 1.

For basic MPM projection and equilibrium analyses, this single matrix is all that is needed. However, more advanced analyses like many of those in `Rage` use decompositions of the __A__ matrix into its components of growth and survival (__U__) and reproduction (__R__). Reproduction can be further decomposed into offspring produced sexually (__F__) and asexually/clonally (__C__; Fig. 3). The submatrices combine additively to create the transition matrix,

$$
\begin{aligned}
\mathbf{A} &= \mathbf{U} + \mathbf{R} \\
&= \mathbf{U} + \left(\mathbf{F} + \mathbf{C}\right).
\end{aligned}
$$
In our example, all of the elements of __A__ are found in only one of the submatrices. However, other life cycles may have multiple pathways from one life stage to another.

```{r,echo=FALSE,fig.align='center',fig.height=4,fig.width=9,out.width='100%'}
blankdat <- expand.grid(to = stages, from = stages, trans = 0)
blankdat$to <- factor(blankdat$to, levels = c(
  "flowering", "rosette",
  "seedling"
))
blankdat$from <- factor(blankdat$from,
  levels = c("seedling", "rosette", "flowering")
)
ggdat$col <- factor(ggdat$col,
  levels = c(
    "PaleGreen4", "Goldenrod1",
    "MediumOrchid4", "transparent"
  ),
  labels = c("U", "F", "C", "t")
)
ggplot(
  ggdat[ggdat$col != "t", ],
  aes(x = from, y = to, fill = col, label = trans)
) +
  geom_tile(
    data = blankdat,
    aes(fill = NULL), fill = "white", color = "black", linewidth = 0.25
  ) +
  geom_text(data = blankdat, aes(fill = NULL), size = 6) +
  geom_tile(color = "black", linewidth = 0.25, show.legend = FALSE) +
  geom_text(size = 6) +
  scale_x_discrete(position = "top") +
  scale_fill_manual(values = c(
    "F" = "goldenrod1",
    "C" = "mediumorchid4",
    "U" = "palegreen4",
    "t" = "white"
  )) +
  labs(x = "current life stage", y = "life stage at time t+1") +
  coord_equal(expand = FALSE) +
  facet_wrap(~col, nrow = 1) +
  theme_bw(base_size = 18) +
  theme(
    panel.border = element_blank(),
    strip.text = element_text(face = "bold"),
    strip.placement = "outside"
  )
```
**Figure 3.** Decomposition of the transition matrix, __A__, into __U__, __F__ and __C__ submatrices. The coloured elements correspond to the coloured arrows in Fig. 1.

## R.2. Projecting population change
Simple matrix multiplication allows the full transition matrix, __A__, to be projected forward in time for any starting population. We initiate the model with a *population vector*, $\mathbf{n}_{t=0}$, that contains the number of individuals in each life stage. Right multiplying the initial population vector to the transition matrix, __A__, yields the population vector at the next time step via the projection equation,

$$
\mathbf{n}_{t+1} = \mathbf{A}\mathbf{n}_{t}.
$$
Suppose we initialize the example plant population, $\mathbf{n}_{t=0}$, with five seedlings, 10 rosettes, and 15 flowering individuals. We expect that the population in the next time step will be,

$$
\begin{aligned}
\mathbf{n}_{t=1} &= \mathbf{A}\mathbf{n}_{t=0}, \\
\left[
\begin{matrix}
48.0 \\
17.5 \\
17.5
\end{matrix}
\right]
&=
\left[
\begin{matrix}
0 & 0 & 3.2 \\
0.5 & 0.3 & 0.8 \\
0 & 0.4 & 0.9
\end{matrix}
\right]
\left[
\begin{matrix}
5 \\
10 \\
15
\end{matrix}
\right].
\end{aligned}
$$
Summing the population vectors gives us the total population size ($N_{t=0} =\sum\mathbf{n}_{t=0} = 30$ and $N_{t=1} = 83$). Likewise, we can measure the instantaneous per-capita population growth rate as $N_{t+1}/N_{t} = 83/30 = 2.7\overline{6}$). Recursively substituting the new $\mathbf{n}_{t+1}$ into the projection equation in place of $\mathbf{n}_{t}$ allows us to continue projecting the population dynamics and life stage composition forward in time.

## R.3. Analysis of population equilibrium (eigenanalysis)
Provided that the transition matrix, __A__, meets certain conditions, iterative projections from any starting population vector $\mathbf{n}_{t=0}$ will eventually converge to an equilibrium per-capita population growth rate ($\lambda$) and stable stage distribution ($\mathbf{w}$). We can determine $\lambda$ and $\mathbf{w}$ by finding the dominant eigenvalue and its associated (right) eigenvector, respectively. The `popdemo` package provide functions to do these calculations (the package `popbio` also provides extensive functionality for matrix model analysis):

```{r,message=FALSE}
library(popdemo)

# define the transition matrix, A
A <- rbind(
  c(0.0, 0.0, 3.2),
  c(0.5, 0.3, 0.8),
  c(0.0, 0.4, 0.9)
)

# lambda: equilibrium per-capita population growth rate
popdemo::eigs(A = A, what = "lambda")

# w: stable stage distribution (relative frequencies)
popdemo::eigs(A = A, what = "ss")
```
From $\lambda$, we infer that the population would eventually grow at an equilibrium rate of 51% per time step. Likewise, from $\mathbf{w}$ we infer that nearly half of individuals will be seedlings, about a third will be rosettes and a fifth will be flowering at equilibrium.

Finally, we are reminded that MPM *projections* rely on consistency of the demographic vital rates (i.e., transition matrix elements) over time. Projections tell us what would happen if this strong assumption were to be met. Projections differ radically from *forecasts*, which are designed to predict what will happen. Forecasting models may use MPMs to drive population change, but will also typically require a component that predicts changes in the underlying demographic vital rates. [Keyfitz (1972)](#references) unpacks these differences further, with special emphasis on the value of projection as a window into the fundamental processes that drive the behaviour of populations. Much of the functionality of `Rage` rests on this ethos.

# Representing and loading MPMs in `Rage` {#mpms_in_r}

We'll begin by loading one of the example MPMs included in the `Rage` package, `mpm1`, which can be retrieved using the base 'data' function.

```{r}
library(Rage) # load Rage
data(mpm1) # load data object 'mpm1'
mpm1 # display the contents
```

`mpm1` is a list containing the decomposition of an MPM projection matrix into
its components of growth and survival (__U__) and reproduction (__R__ = __F__ + __C__). Many
`Rage` functions accept one or more of these components in analyses. The
biological meaning of each list element is indicated by its name:

- `matU` is the __U__ matrix, which is the __growth/survival__ component of an MPM,
containing transitions related to progression, stasis and retrogression. This is how individuals move among ages or life stages.
- `matR` is the __R__ matrix, which is the __reproductive__ component of an MPM,
containing transitions due to reproduction; either sexual, clonal, or both.
- `matF` is the __F__ matrix, which is the __sexual reproduction__ or fecundity 
component of the MPM, containing transitions due to sexual reproduction.
- `matC` is the __C__ matrix, which is the __clonal__ component of an MPM,
containing transitions due to clonal reproduction.

When the mode of reproduction is known (e.g., a species only reproduces sexually or the sexual and clonal offspring have been counted separately), `matR` can be substituted with `matF` or `matC`.

For any life history, we can reconstruct the full projection matrix, __A__, by adding together the components. For `mpm1`,  __A__ = __U__ + __F__. A population with separately counted sexual and clonal offspring production would be __A__ = __U__ + __F__ + __C__.

`Rage` functions currently accept MPM components as arguments; therefore, it is not necessary to group the matrix components together in a list. We anticipate that future package releases will define methods that accept objects of classes `matrix`, `CompadreMat`, and `CompadreDB` for compatibility with the [COM(P)ADRE databases](https://compadre-db.org/) and its accessor package [`Rcompadre`](https://github.com/jonesor/Rcompadre).

# Families of `Rage` functions for life history analysis
The functions in `Rage` fall into five broad categories, and are detailed in the subsections below:

```{r, echo=FALSE, message=FALSE, warnings=FALSE, results='asis'}
tabl <- "
| Function category                       | Stand-alone vignette             |
|-----------------------------------------|----------------------------------|
| 1. [Vital rates](#vitalrates)  | [VitalRates](a02_VitalRates.html)     |
| 2. [Life tables](#lifetable) | [AgeFromStage](a04_AgeFromStage.html) |
| 3. [Perturbation analysis](#perturb)    | n/a |
| 4. [Deriving life history traits](#lifehist) | [LifeHistoryTraits](a03_LifeHistoryTraits.html) |
| 5. [Transformation of matrices](#maniptransform) | n/a                                   |
"
cat(tabl)
```

A full list of functions by category is available at the [documentation webpage](https://jonesor.github.io/Rage/reference/index.html).

## 1. Standardized vital rates {#vitalrates}

The elements of an MPM transition matrix (__A__) generally are composites of two
or more vital rates (sometimes called ‘lower-level vital rates’). For example,
the transition from stage 1 to stage 2 (element $a_{21}$) may colloquially be thought
of as 'growth,' but importantly this transition is growth *conditional* on the 
individual surviving. Assuming a post-breeding census design, we can retroactively
partition each transition in MPM submatrices into survival (using the column sums 
of __U__) and one of the following: growth (using the lower triangle of __U__), 
shrinkage (upper triangle of __U__), stasis (diagonal of __U__), dormancy (from 
__U__ with user-indicated dormant stages), fecundity (from __F__) or clonality 
(from __C__).

The `vr_vec_*` family of functions provide a means to calculate lower-level vital 
rates for each life stage in the input matrix or matrices. For example, we can
calculate stage-specific survival and stasis from our example __U__ matrix:

```{r}
vr_vec_survival(matU = mpm1$matU)
vr_vec_stasis(matU = mpm1$matU)
```

Multiplying these two together yields the probability of stasis conditional on
survival, in other words the diagonal of our example __U__ matrix.

```{r}
# product of Pr(survival) and Pr(stasis) yields Pr(stasis|survived)
vr_vec_survival(matU = mpm1$matU) * vr_vec_stasis(matU = mpm1$matU)
diag(mpm1$matU) # equivalent to the diagonal of U matrix
```

`Rage` also provides functions to summarize these vital rates _across_ stage 
classes using the `vr_*` family of functions. These functions return a single value that
is the mean of the corresponding stage-specific vital rate vector from `vr_vec_*`.
Life stages may be excluded, allowing the user to tailor these calculations to 
the life history of the organism and their working definition of each vital rate.
Similarly, custom weights are allowed to control the contributions of life stages 
to the average. This functionality is more fully described in the 
[Vital Rates vignette](a02_VitalRates.html).

```{r}
vr_survival(matU = mpm1$matU, exclude_col = 1) # exclude 'seed' stage
mean(vr_vec_survival(mpm1$matU)[-1]) # equivalent to the mean without 'seed'
```

## 2. Deriving life tables from MPMs for age-from-stage analyses {#lifetable}

Classical demographic analyses were based on age-structured populations, in which 
surviving individuals always progress to the next age class. However, MPMs allow 
for demographic rates to be structured by any number of biological criteria, 
whether stage, sex, size, or condition. In these alternatively-structured models, 
individuals may remain in the same state or even 'regress' to states they had 
previously. Here there is no one-to-one mapping between state and age. Each 
individual still has an age, and understanding their age-specific demographic 
trajectories nevertheless offers a useful window into their life history. `Rage`
provides functions that decompose MPMs into life tables, thereby allowing for a 
wide array of demographic analyses based on age-specific trajectories (described 
below in [section 4](#lifehist)).

Following the age-from-stage methods of [Caswell (2001, ch. 5.3)](#references), 
we can convert an MPM to a full, age-structured life table or any of its component
columns.

```{r}
lt <- mpm_to_table(matU = mpm1$matU, matF = mpm1$matF) # full life table
lt
lx <- mpm_to_lx(matU = mpm1$matU) # survivorship to start of each age class
lx
```

`Rage` also supports translation between the different ways of expressing
age-specific mortality/survival, whether proportional survivorship (`lx`),
survival probability (`px`) or mortality hazard (`hx`), following an `x_to_y`
function name pattern.

```{r}
lx_to_px(lx = lx) # survivorship to survival probability
lx_to_hx(lx = lx) # survivorship to mortality hazard
```

When individuals can remain in the last life stage class, age-specific trajectories
of mortality and fertility can appear to plateau as a cohort asymptotically 
approaches its stable stage distribution (SSD). For these MPMs, we can choose
a quasi-stationary distribution (QSD) of stages that is within an arbitrarily 
close threshold of the SSD. Subsetting age-specific demographic trajectories to 
time points before a cohort reaches the QSD will minimise these plateau artefacts. 
In the example MPM, `mpm1`, a cohort of germinated individuals quickly converges 
towards the SSD:

```{r, warning=FALSE, message=FALSE, fig.align='center', fig.height=5, fig.width=6}
# project a germinated cohort through the U matrix
cohort <- popdemo::project(A = mpm1$matU, vector = c(0, 1, 0, 0, 0), time = 10)
popStructure <- vec(cohort) / rowSums(vec(cohort))

matplot(popStructure,
  type = "l", xlab = "time",
  ylab = "proportion in stage class"
)
```

Using `qsd_converge`, we can find the time at which the cohort reaches the QSD
and subset the life table or any component age-specific trajectory accordingly.

```{r}
# calculate time to QSD from the U matrix of an MPM
(q <- qsd_converge(mat = mpm1$matU, start = "small"))

# subset the life table rows to ages prior to the QSD
lt_preQSD <- lt[1:q, ]

# plot mortality trajectory from the life table subset (blue),
# showing plateau effect if the trajectory (grey) was allowed to continue to the
# QSD (dashed vertical line) and beyond
plot(qx ~ x,
  data = lt, type = "l", col = "darkgrey", ylim = c(0, 1),
  xlab = "age"
)
lines(qx ~ x, data = lt_preQSD, type = "l", col = "blue", lwd = 4)
abline(v = q, lty = "dashed")
```

We refer readers to the [AgeFromStage vignette](a04_AgeFromStage.html) for further
detail on these methods.

## 3. Perturbation analyses {#perturb}

Perturbation analyses of MPMs measure the response of demographic statistics, such 
as the equilibrium per-capita population growth rate ($\lambda$), to perturbations 
typically applied at the level of individual matrix elements and which may be perturbations of
a fixed or proportionate amount (**sensitivities** and **elasticities**, 
respectively). By extension, the vital rates (survival, growth, etc.) or
transition type (stasis, retrogression, etc.) underlying multiple matrix elements 
may be similarly perturbed. `Rage` provides functions for each scope of
perturbation, allowing the user to specify the desired demographic statistic and 
type of perturbation.

```{r}
# construct the transition matrix A = U + F (+ C when present)
mpm1$matA <- with(mpm1, matU + matF)

# sensitivity of lambda to...
# ...matrix element perturbations
perturb_matrix(
  matA = mpm1$matA,
  type = "sensitivity", demog_stat = "lambda"
)
# ...vital rate perturbations
perturb_vr(
  matU = mpm1$matU, matF = mpm1$matF,
  type = "sensitivity", demog_stat = "lambda"
)
# ...transition type perturbations
perturb_trans(
  matU = mpm1$matU, matF = mpm1$matF,
  type = "sensitivity", demog_stat = "lambda"
)
```

## 4. Deriving life history traits {#lifehist}

What is the life expectancy of an individual? At what age will it begin to 
reproduce? How likely is it to survive to reproduction? What is the generation 
time? These high-level questions address the population-level life history traits 
that emerge from aggregating individual-level demographic rates, and tracing 
trajectories through the life cycle.

`Rage` provides functions to calculate a wide range of life history traits from 
both MPMs and life tables. We illustrate the general pattern of these functions 
below, referring the reader to the [LifeHistoryTraits vignette](a03_LifeHistoryTraits.html)
and the [documentation webpage](https://jonesor.github.io/Rage/reference/index.html) 
for a complete list of functionality.

MPM-based analyses take the required submatrix (or submatrices), optionally allowing
the user to specify the life stage at which to start the calculation, choose among 
alternative methods and set cut-off thresholds.
```{r}
# post-germination time steps until post-germination survivorship falls below 5%
longevity(matU = mpm1$matU, start = "small", lx_crit = 0.05)
# expected lifetime production of 'small' offspring by a 'small' individual
net_repro_rate(
  matU = mpm1$matU, matR = mpm1$matF, start = "seed",
  method = "start"
)
```

It can sometimes be useful to calculate age-based trajectories from stage-based matrices. This can be achieved with functions like `mpm_to_lx`.

```{r}
# derive post-germination survivorship trajectory from U matrix
lx <- mpm_to_lx(matU = mpm1$matU, start = "small")
```

## 5. Transforming MPMs {#maniptransform}

`Rage` includes a variety of functions that can be used to manipulate or transform
MPMs. For example, we can collapse an MPM to a smaller number of stage classes
using `mpm_collapse()`.

```{r}
# collapse 'small', 'medium', and 'large' stages into single stage class
col1 <- mpm_collapse(
  matU = mpm1$matU, matF = mpm1$matF,
  collapse = list(1, 2:4, 5)
)
col1$matA
```

Notice that the stage names are lost during this operation. We can re-add new 
stage names using the function `name_stages`, electing to use generic or custom 
names.

```{r}
# automated stage naming
(col1_auto <- name_stages(mat = col1, prefix = "class_"))

# overwrite with custom stages
(col1_cust <- name_stages(
  mat = col1, names = c("seed", "active", "dormant"),
  prefix = NULL
))
```

The transition rates in the collapsed matrix are a weighted average of the
transition rates from the relevant stages of the original matrix, weighted by
the stable distribution at equilibrium. This process guarantees that the
collapsed MPM will retain the same population growth rate as the original.
However, other demographic and life history characteristics will not necessarily
be preserved.

```{r}
# compare population growth rate of original and collapsed MPM (preserved)
popdemo::eigs(A = mpm1$matA, what = "lambda")
popdemo::eigs(A = col1_cust$matA, what = "lambda")

# compare net reproductive rate of original and collapsed MPM (not preserved)
net_repro_rate(matU = mpm1$matU, matR = mpm1$matF)
net_repro_rate(matU = col1_cust$matU, matR = col1_cust$matF)
```

# References {#references}

Caswell, H. (2001). Matrix Population Models: Construction, Analysis, and Interpretation. 2nd edition. Sinauer Associates, Sunderland, MA. ISBN-10: 0878930965

Keyfitz, N. (1972). On future population. Journal of the American Statistical Association 67:347-363.