\documentclass[a4paper]{article} %\VignetteIndexEntry{Short Introduction - MMCTest} \usepackage{hyperref} \usepackage{natbib} %length of the line in R-outputs <>= options(width=80) @ \title{R-package ``simctest'' \\ R-class ``mmctest'' \\ A Short Introduction} \author{Axel Gandy and Georg Hahn} \begin{document} \maketitle This document describes briefly how to use the class ``mmctest'', included in the R-package ``simctest''. It implements the methods from ``MMCTest-A Safe Algorithm for Implementing Multiple Monte Carlo Tests'', based on \cite{gandy12:mmct}. The class can be used to evaluate the statistical significance of each hypothesis in a multiple testing problem. \section{Installation} The functions described in this document are included in the R-package ``simctest''. Please see the documentation of ``simctest'' on how to install the package. \section{Usage} The package is loaded by typing <<>>= library(simctest) @ This document can be accessed via <>= vignette("simctest-mmctest-intro") @ Documentation of the most useful commands can be obtained as follows: \begin{verbatim} > ? simctest > ? mcp > ? mmctest \end{verbatim} \subsection{Implementing a Monte Carlo multiple testing problem} The following is an artificial example. Implementing a Monte Carlo multiple testing problem consists of two stages. Firstly, an interface to draw samples has to be provided. This can be done in two ways, either by implementing the generic class \texttt{mmctSamplerGeneric} or by directly providing the number $m$ of hypotheses and a function $f$ which generates samples. Both ways are described in the next section. Secondly, an object of type \texttt{mmctest} has to be created. It provides a ``run'' method which uses the \texttt{mmctest}-object and an \texttt{mmctSamplerGeneric}-object to evaluate the multiple testing problem. The algorithm used in class \texttt{mmctest} is the one introduced in \cite{gandy12:mmct}. The multiple testing problem is evaluated until at least one of four stopping criteria is satisfied, see below for a detailed description. Stopped tests can be resumed with the ``cont'' function. Printing an object of type \texttt{mmctest} or \texttt{mmctestres} will display the number of already rejected and non-rejected hypotheses, the number of undecided hypotheses and the total number of samples drawn up to the current stage. \subsubsection{Implementing the sampling interface} An interface for drawing new samples has to be provided for each multiple testing problem. If new samples are simply generated by a function $f$, the derived class \texttt{mmctSampler} provided in \texttt{simctest} can be used as a shortcut. It works as follows: Any function $f$ used to draw new samples has to be able to accept the arguments ``ind'', a vector with indices of hypotheses and a vector ``n'' containing the number of samples to be drawn for each hypothesis in vector ``ind''. The function $f$ has to return a vector containing the number of significant test statistics for each hypothesis specified in ``ind''. For instance, passing a vector ``ind'' of $(2,5)$ and a vector ``n'' of $(5,10)$ as arguments means that $5$ more samples are requested for the hypothesis with index $2$ and $10$ more for the hypothesis with index $5$. The function $f$ might need further data to evaluate the tests. Such data can be passed on to $f$ as third argument \texttt{data}. For instance, <<>>= fun <- function(ind,n,data) sapply(1:length(ind), function(i) sum(runif(n[i])<=data[ind[i]])); @ is a function which draws samples from hypotheses having p-values given in vector \texttt{data}. The package \texttt{mmctest} provides a shortcut which can be used to easily specify the interface. Given a function \texttt{fun} to draw samples and the number \texttt{num} of hypotheses, \texttt{fun} and \texttt{num} (and additional data \texttt{data}) can be passed on to the class \texttt{mmctSampler} which returns a derived object of the generic class \texttt{mmctSamplerGeneric}. For example, <<>>= s <- mmctSampler(fun,num=500,data=c(rep(0,100),runif(400))); @ returns an sampler interface \texttt{s} for the function \texttt{fun} defined above and $500$ p-values used to draw new samples. The class \texttt{mmctSamplerGeneric} can also directly be overwritten with an own sampler interface. Any sampler has to implement the two generic functions \texttt{getSamples} and \texttt{getNumber}: <<>>= # class mmctSampler1, inherited from mmctSamplerGeneric setClass("mmctSampler1", contains="mmctSamplerGeneric", representation=representation(data="numeric")) # get n[i] new samples for every index i in ind setMethod("getSamples", signature(obj="mmctSampler1"), function(obj, ind, n) { sapply(1:length(ind), function(i) { return(sum(runif(n[i])<= obj@data[ind[i]])); }); } ) # get number of hypotheses setMethod("getNumber", signature(obj="mmctSampler1"), function(obj) { return(length(obj@data)); } ) @ In this case, the sampler will be <<>>= s <- new("mmctSampler1", data=c(rep(0,100),runif(400))); @ \subsubsection{A simple run of the algorithm} After having specified the sampler, the main algorithm can be executed. This is done by creating an object of type \texttt{mmctest} using the pseudo-constructor \begin{center} \texttt{mmctest(epsilon=0.01, threshold=0.1, r=10000, h, thompson=F, R=1000)}, \end{center} where \texttt{epsilon} is the overall error on the classification being correct one is willing to spend (see \cite{gandy12:mmct}) and \texttt{threshold} is the multiple testing threshold. The \texttt{MMCTest} algorithm uses a ``spending sequence'' which controls how the overall error probability is spent on each of the $m$ hypotheses in each iteration (see \cite{gandy12:mmct}). The parameter \texttt{r} with default value $r=10000$ controls after which number of samples half the error probability has been spent and can be chosen arbitrarily. The function $h$ is the multiple testing procedure. Thompson sampling can be used to efficiently allocate each new batch of samples per iteration, see \cite{quickmmctest}: it can be activated using the switch \texttt{thompson} (default is \textit{false}), where the parameter $R$ determines the accuracy (default value $1000$ repetitions) with which weights are computed \citep{quickmmctest}. The coming subsubsections contain further details. Any function \begin{center} \texttt{h <- function(p, threshold) {...}} \end{center} can be used as a multiple testing procedure as long as it takes a vector $p$ of p-values and a threshold \texttt{threshold} as arguments and returns the indices of all rejected hypotheses as vector. The Benjamini-Hochberg procedure \texttt{hBH}, its modification by \cite{PoundsCheng2006} \texttt{hPC} and the Bonferroni correction \texttt{hBonferroni} are available by default: <<>>= s <- mmctSampler(fun,num=500,data=c(rep(0,100),runif(400))); m <- mmctest(h=hBH); @ The algorithm can now be started by calling \begin{center} \texttt{run(alg, gensample, maxsteps=list(maxit=0, maxnum=0, undecided=0, elapsedsec=0))} \end{center} which takes an object \texttt{alg} of type \texttt{mmctest}, a sampler object \texttt{gensample} to generate samples and a list \texttt{maxsteps} as stopping criterion. The list \texttt{maxsteps} can include a maximal number of iterations \texttt{maxit} after which the algorithm stops, a maximal total number of samples \texttt{maxnum} drawn until stopping, a number \texttt{undecided} of undecided hypotheses one is willing to tolerate or a time constraint \texttt{elapsedsec} in seconds. Specifying other items in list \texttt{maxsteps} will lead to an error message and an empty list will be reset to the default list \begin{center} \texttt{list(maxit=0, maxnum=0, undecided=0, elapsedsec=0)}. \end{center} As an example, the following lines evaluate the previously created multiple testing problem \texttt{m} using the Benjamini-Hochberg procedure \texttt{hBH} and the previous sampler \texttt{s}. The algorithm stops before reaching more than a total of $1000000$ samples or after all but $20$ hypotheses are classified: <<>>= m <- run(m, s, maxsteps=list(maxnum=1000000,undecided=20)); m @ Printing the object displays the number of already rejected and non-rejected hypotheses, the number of undecided hypotheses and the total number of samples drawn up to the current stage. A formatted summary of the indices belonging to rejected and undecided hypotheses can be printed via \texttt{summary.mmctestres}. All indices not printed belong to non-rejected hypotheses. <<>>= summary.mmctestres(m) @ \subsubsection{Continuing a run of the algorithm} Each run can be continued with the \texttt{cont} function using a new stopping criterion: <<>>= m <- cont(m, steps=list(undecided=10)); m @ Here, the algorithm has been applied again to the previously stopped multiple testing problem \texttt{m}. It has been resumed until all but $10$ hypotheses were classified. As before, \texttt{maxit}, \texttt{maxnum}, \texttt{undecided} and \texttt{elapsedsec} are valid stopping criteria for parameter \texttt{steps} of function \texttt{cont}. \subsubsection{Requesting the test result} The current test result can be requested from any \texttt{mmctestres} object. Calling \texttt{testResult} of class \texttt{mmctestres} will return a list containing indices of rejected hypotheses (vector `\$rejected'), nonrejected hypotheses (vector `\$nonrejected') and undecided hypotheses (vector `\$undecided'). For the previously continued run of object \texttt{m}, the result of the computation can be requested as follows: <<>>= res <- testResult(m); res$undecided length(res$rejected) length(res$nonrejected) @ In the example above, the current computation result of object \texttt{m} is stored in variable \texttt{res}. For object \texttt{m}, the algorithm has been run until all but (at least) $10$ hypotheses have been classified. The indices of the undecided hypotheses as well as the number of rejected and nonrejected hypotheses (i.e. the length of the vectors containing their indices) are displayed. \subsubsection{Confidence intervals and estimates of p-values} At any stage, p-values can be estimated based on the total number of samples drawn for each hypothesis during intial or continued runs: <<>>= estimate <- pEstimate(m); lastindex <- length(estimate); estimate[lastindex] @ The function \texttt{pEstimate} takes an object of type \texttt{mmctest} as argument and returns a vector containing estimates of all p-values. Similarly, the current confidence limits for the exact (Clopper-Pearson) confidence intervals can be requested: <<>>= l <- confidenceLimits(m); l$lowerLimits[lastindex] l$upperLimits[lastindex] @ The function \texttt{confidenceLimits} takes an object of type \texttt{mmctest} as argument and returns a list containing lower confidence limits (vector `lowerLimits') and upper confidence limits (vector `upperLimits') for each p-value. \subsubsection{Switching to QuickMMCTest} The QuickMMCTest algorithm presented in \cite{quickmmctest} is included as a special case of MMCTest. To activate MMCTest, please set the parameter \texttt{thompson} in the \texttt{mmctest} constructor to TRUE. The constructor contains a further value $R$ (default value $R=1000$) which controls the accuracy with which weights are computed in QuickMMCTest, see \cite{quickmmctest} for an explanation. Apart from setting \texttt{thompson} to TRUE, nothing needs to be done to use QuickMMCTest. Typically, QuickMMCTest is run using a maximal total effort and a maximal number of iterations as stopping criteria: if these two are specified in \texttt{maxit} and \texttt{maxnum} (see "A simple run of the algorithm"), QuickMMCTest spreads out the total number of samples over all \texttt{maxit} iterations and uses the weights to allocate samples. If QuickMMCTest is run without a maximal total effort and a maximal number of iterations, it allocates the batch of samples that MMCTest would have allocated, which is geometrically increased in each iteration, see \cite{gandy12:mmct}. QuickMMCTest thus runs until the algorithm is stopped manually or the desired number of undecided hypotheses is reached. \subsubsection{Empirical rejection probabilities} When using QuickMMCTest, hypotheses can be classified after termination using three methods: \texttt{summary.mmctestres} provides sets of rejected, non-rejected and undecided hypotheses as in MMCTest, \texttt{pEstimates} provides estimated p-values computed with a pseudo-count which can be plugged into the multiple testing procedure, and finally <<>>= rej <- rejProb(m)>0.5; rej[1] @ provides empirical rejection probabilities for QuickMMCTest as used in \cite{quickmmctest}. These are obtained by drawing $R$ sets of all the $m$ p-values from the Beta posteriors for all p-values, by evaluating the multiple testing procedure on each set of p-values and by recording the number of times each hypothesis is rejected based on the sampled p-values. The resulting proportion of rejections out of $R$ repetitions are reported by the method \texttt{rejProb} (as a vector of length $m$, one number between $0$ and $1$ per hypothesis) and give an indicator of how stable the decision on each hypothesis is: i.e.\ proportions close to one indicate a very stable decision that a hypothesis is rejected, likewise proportions close to zero indicate non-rejections and values close to $0.5$ indicate very unstable decisions. By thresholding them against, e.g.\ $0.5$, empirical rejections can be obtained as demonstrated in the above code example (all hypotheses with a rejection probability above $0.5$ are rejected). The threshold $0.5$ is arbitrary and can be replaced by higher (lower) values to be more (less) conservative. \subsubsection{An extended example} In this extended example a permutation test is used to determine if two groups $A$ and $B$ have equal means. This is done in $n_{groups}=20$ cases. Each group has size $4$ and both groups $A$ and $B$ are stored together in one row of length $n=8$ in a matrix $G$. <<>>= n <- 8; ngroups <- 20; G <- matrix(rep(0,n*ngroups), nrow=ngroups); for(j in 1:(ngroups/2)) G[j,] <- c(rnorm(n/2,mean=0,sd=0.55),rnorm(n/2,mean=1,sd=0.55)); for(j in (ngroups/2+1):ngroups) G[j,] <- rnorm(n,mean=0,sd=3); @ To implement this test as a Monte-Carlo test, we start by overwriting the generic class \texttt{mmctSamplerGeneric} to specify the sampler. The data stored in an \texttt{ExSampler} object is the matrix $G$. <<>>= # class ExSampler, inherited from mmctSamplerGeneric setClass("ExSampler", contains="mmctSamplerGeneric", representation=representation(data="matrix")) setMethod("getSamples", signature(obj="ExSampler"), function(obj, ind, n) { sapply(1:length(ind), function(i) { v <- obj@data[ind[i],]; s <- matrix(rep(v,n[i]+1), byrow=T, ncol=length(v)); for(j in 1:n[i]) s[j+1,] <- sample(v); means <- abs(rowMeans(s[,1:(length(v)/2)])- rowMeans(s[,(length(v)/2+1):length(v)])); return(sum(means>means[1])); }); } ) setMethod("getNumber", signature(obj="ExSampler"), function(obj) { return(length(obj@data[,1])); } ) @ The \texttt{getSamples} method generates \texttt{n[i]} permutations of each row \texttt{i} in the indices vector \texttt{ind} and counts how many times the generated means exceeded the data mean (stored in row $1$). The sampler is then <<>>= exsampler <- new("ExSampler", data=G); @ As before, the multiple testing problem is set up by creating an object of type \texttt{mmctest} using \texttt{hBH} as multiple testing procedure and the \texttt{exsampler} object as sampler interface. The constructor \texttt{mmctest} uses a default threshold of $0.1$. <<>>= m <- mmctest(h=hBH); m <- run(m, exsampler, maxsteps=list(undecided=0)); @ Algorithm \texttt{mmctest} has been run until all hypotheses were classified. Based on this run, the following hypotheses are rejected: <<>>= testResult(m)$rejected @ Estimates for p-values are <<>>= pEstimate(m) @ To verify this result, exact p-values are computed by enumerating all permutations of each row. This is done using algorithm \texttt{QuickPerm}. <>= quickperm <- function(a) { n <- length(a); m <- matrix(rep(0,n*gamma(n+1)),ncol=n); mi <- 1; m[mi,] <- a; p <- rep(0,n); i <- 1; while(imeans[1])/length(means); } @ Based on the exact p-values, given by <<>>= pexact @ the Benjamini-Hochberg procedure at threshold $0.1$ will give the following set of rejections: <<>>= which(hBH(pexact, threshold=0.1)) @ \bibliographystyle{plainnat} \bibliography{papers} \end{document}