--- title: "BayesianPower" author: "Fayette Klaassen" # date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{BayesianPower} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ## Introduction *BayesianPower* can be used for sample size determination (using `bayes_sampsize`) and power calculation (using `bayes_power`) when Bayes factors are used to compare an inequality constrained hypothesis $H_i$ to its complement $H_c$, another inequality constrained hypothesis $H_j$ or the unconstrained hypothesis $H_u$. Power is defined as a combination of controlled error probabilities. The unconditional or conditional error probabilities can be controlled. Four approaches to control these probabilities are available in the methods of this package. **Users are advised to read this vignette and the paper available at 10.17605/OSF.IO/D9EAJ where the four available approaches are presented in detail (Klaassen, Hoijtink \& Gu, unpublished)).** ## Power calculation with `bayes_power()` `bayes_power(n, h1, h2, m1, m2, sd1=1, sd2=1, scale = 1000, bound1 = 1, bound2 = 1/bound1, datasets = 1000, nsamp = 1000, seed = 31)` ### Arguments ` n` A number. The sample size for which the error probabilities must be computed. ` h1` A constraint matrix defining H1, see below for more details. ` h2` A constraint matrix defining H2, or a character `'u'` or `'c'` for the unconstrained or complement hypothesis. `m1` A vector of expected population means under H1 (standardized), see below for more details. ` m2` A vector of expected populations means under H2 (standardized). ` m2` must be of same length as ` m1`. ` sd1` A vector of standard deviations under H1. Must be a single number (equal standard deviation under all populations), or a vector of the same length as `m1` ` sd2` A vector of standard deviations under H2. Must be a single number (equal standard deviation under all populations), or a vector of the same length as `m2` ` scale` A number or use the default `1000` to set the prior scale. ` bound1` A number. The boundary above which BF12 favors H1, see below for more details. ` bound2` A number. The boundary below which BF12 favors H2. ` datasets` A number. The number of datasets to simulate to compute the error probabilities ` nsamp` A number. The number of prior or posterior samples to determine the complexity or fit. ` seed` A number. The random seed to be set. ### Details #### Specifying hypotheses Hypotheses are defined by means of a constraint matrix, that specifies the ordered constraints between the means $\boldsymbol\mu$ using a constraint matrix $R$, such that $R \boldsymbol{\mu} > \bf{0}$, where $R$ is a matrix with $J$ columns and $K$ rows, where $J$ is the number of group means and $K$ is the number of constraints between the means, $\boldsymbol\mu$ is a vector of $J$ means and $\bf{0}$ is a vector of $K$ zeros. The constraint matrix $R$ contains a set of linear inequality constraints. Consider ```{r} R <- matrix(c(1,-1,0,0,1,-1), nrow = 2, byrow = TRUE) mu <- c(.4, .2, 0) R mu R %*% mu (R %*% mu) > 0 ``` The matrix $R$ specifies that the sum of $1 \times \mu_1$ and $-1 \times \mu_2$ and $0 \times \mu_3$ is larger than $0$, **and** the sum of $0 \times \mu_1$ and $1 \times \mu_2$ and $-1 \times \mu_3$ is larger than $0$. This can also be written as: $\mu_1 > \mu_2 > \mu_3$. For more information about the specification of constraint matrices, see for example [@hoijtink12book]. The argument `h1` has to be a constraint matrix as specified above. The argument `h2` can be either a constraint matrix, or the character `'u'` or `'c'` if the goal is to compare $H_1$ with $H_u$, the unconstrained hypothesis, or $H_c$ the complement hypothesis. #### Specifying population means Hypothesized population means have to be defined under $H_1$ and $H_2$, also if $H_u$ or $H_c$ are considered as $H_2$. The group specific standard deviations can be set under `sd1` and `sd2`, by default, all group standard deviations are $1$. #### Prior scale The prior scale can be set using `scale`. By default, a scale of `1000` is used. This implies that the prior covariance matrix is proportional to the standard errors of the sampled data, by a factor of `1000`. #### Setting bounds `bound1` and `bound2` describe the boundary used for interpreting a Bayes factor. If `bound1 = 1`, all $BF_{12} > 1$ are considered to express evidence in favor of $H_1$, if `bound1 = 3`, all $BF_{12} > 3$ are considered to express evidence in favor of $H_1$. Similarly, `bound2` is the boundary *below* which $BF_{12}$ is considered to express evidence in favor of $H_2$. ### Examples #### Example 1. $H_1$ vs $H_c$ An example where three group means are ordered in $H_1: \mu_1 > \mu_2 > \mu_3$ which is compared to its complement. The power is determined for $n = 40$ ```{r, eval = FALSE} h1 <- matrix(c(1,-1,0,0,1,-1), nrow= 2, byrow= TRUE) h2 <- 'c' m1 <- c(.4,.2,0) m2 <- c(.2,0,.1) bayes_power(40, m1, m2, h1, h2) ``` #### Example 2. H1 vs H2 An example where four group means are ordered in $H_1: \mu_1 > \mu_2 > \mu_3 > \mu_4$ and in $H_2: \mu_3 > \mu_2 > \ mu_4 > \mu_1$. Only Bayes factors larger than $3$ are considered evidence in favor of $H_1$ and only Bayes factors smaller than $1/3$ are considered evidence in favor of $H_2$. ```{r, eval = FALSE} h1 <- matrix(c(1,-1,0,0,0,1,-1,0,0,0,1,-1), nrow= 3, byrow= TRUE) h2 <- matrix(c(0,-1,1,0,0,1,0,-1,-1,0,0,1), nrow = 3, byrow= TRUE) m1 <- c(.7,.3,.1,0) m2 <- c(0,.4,.5,.1) bayes_power(34, h1, h2, m1, m2, bound1 = 3, bound2 = 1/3) ``` ## Sample size determination with `bayes_sampsize()` ` bayes_sampsize(h1, h2, m1, m2, sd1 = 1, sd2 = 1, scale = 1000, type = 1, cutoff, bound1 = 1, bound2 = 1 / bound1, datasets = 1000, nsamp = 1000, minss = 2, maxss = 1000, seed = 31) ` ### Arguments The arguments are the same as for `bayes_power()` with the addition of: `type`A character. The type of error to be controlled. The options are: `"1", "2", "de", "aoi", "med.1", "med.2"`. See below for more details. `cutoff` A number. The cutoff criterion for type. If `type` is `"1", "2", "de", "aoi"`, `cutoff` must be between $0$ and $1$. If `type` is `"med.1"` or `"med.2"`, `cutoff` must be larger than $1$. See below for more details. `minss` A number. The minimum sample size. `maxss` A number. The maximum sample size. ### Details `bayes_sampsize()` iteratively uses `bayes_power()` to determine the error probabilities for a sample size, evaluates whether the chosen error is below the cutoff, and adjusts the sample size. #### `type` [@klaassenPIH] describes in detail the different types of controlling error probabilities that can be considered. Specifying `"1"` or `"2"` indicates that the Type 1 or Type 2 error probability has to be controlled, respectively the probability of concluding $H_2$ is the best hypothesis when $H_1$ is true or concluding that $H_1$ is the best hypothesis when $H_2$ is true. Note that when $H_1$ or $H_2$ is considered the best hypothesis depends on the values chosen for `bound1` and `bound2`. Specifying `"de"` or `"aoi" ` indicates that the Decision error probability (average of Type 1 and Type 2) or the probability of Indecision has to be controlled. Finally, specifying `" med.1"` or `"med.2"` indicates the minimum desired median $BF_{12}$ when $H_1$ is true, or the minimum desired median $BF_{21}$ when $H_2$ is true. ### Examples #### Example 1. $H_1$ versus $H_c$, controlling decision error ```{r, eval = FALSE} h1 <- matrix(c(1, -1, 0, 0, 1, -1), nrow= 2, byrow= TRUE) h2 <- 'c' m1 <- c(.4, .2, 0) m2 <- c(.2, 0, .1) bayes_sampsize(h1, h2, m1, m2, type = "de", cutoff = .125) ``` #### Example 2. H1 versus H2, controlling indecision error ```{r, eval = FALSE} h1 <- matrix(c(1, -1, 0, 0, 0, 1, -1, 0, 0, 0, 1, -1), nrow= 3, byrow= TRUE) h2 <- matrix(c(0, -1, 1, 0, 0, 1, 0, -1, -1, 0, 0, 1), nrow = 3, byrow= TRUE) m1 <- c(.7, .3, .1, 0) m2 <- c(0, .4, .5, .1) bayes_sampsize(h1, h2, m1, m2, type = "aoi", cutoff = .2, minss = 2, maxss = 500) ``` #### Example 3. $H_1$ versus $H_u$, controlling median Bayes factor ```{r, eval = FALSE} h1 <- matrix(c(1, -1, 0, 0, 0, 1, -1, 0, 0, 0, 1, -1), nrow= 3, byrow= TRUE) h2 <- 'u' m1 <- c(.3, .2, 0) m2 <- c(0, 0, 0) bayes_sampsize(h1, h2, m1, m2, type = "med.1", cutoff = 3, minss = 2, maxss = 500) ``` ## References Hoijtink, H. (2012). *Informative hypotheses. Theory and practice for behavioral and social scientists.* Boca Raton: Chapman Hall/CRC. Klaassen, F., Hoijtink, H., Gu, X. (unpublished). *The power of informative hypotheses.* Pre-print available at https://doi.org/10.17605/OSF.IO/D9EAJ