--- title: "Introduction to the EAinference package" author: "Seunghyun Min" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to EAinference} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- `EAinference` package is designed to facilitate simulation based inference of lasso type estimators. The package includes - Fitting the lasso, group lasso, scaleld lasso and scaled group lasso models - Gaussian and wild multiplier bootstrap for lasso, group lasso, scaled lasso, scaled group lasso and their de-biased estimators - Importance sampler for approximating p-values in these methods - Markov chain Monte Carlo lasso sampler with applications in post-selection inference. ## Loss functions Let the unknown true model be $$ y = X\beta_0 + \epsilon,$$ where $\beta_0$ is a unknown true coefficient vector and $\epsilon_i \sim_{iid} N(0,\sigma^2)$. We use following loss functions. $$ \ell_{grlasso}(\beta ; \lambda) = \frac{1}{2}||y-X\beta||^2_2 + n\lambda\sum_j w_j||\beta_{(j)}||_2,$$ $$ \ell_{sgrlasso}(\beta, \sigma ; \lambda) = \frac{1}{2\sigma}||y-X\beta||^2_2 + \frac{\sigma}{2} + n\lambda\sum_j w_j||\beta_{(j)}||_2,$$ where $grlasso$ and $sgrlasso$ represent group lasso and scaled group lasso, respectively, and $w_j$ is the weight factor for the $j$-th group. Loss functions for lasso and scaled lasso can be treated as special cases of group lasso and group scaled lasso when every group is a singleton, respectively. ## Fit lasso, group lasso, scaled lasso and scaled group lasso models `lassoFit` computes lasso, group lasso, scaled lasso and scaled group lasso solutions. Users can either provide the value of $\lambda$ or choose to use cross-validation by setting `lbd = "cv.min"` or `lbd = "cv.1se"`. By letting `lbd = "cv.min"`, users can have $\lambda$ which gives minimum mean-squared error, while `lbd = "cv.1se"` is the largest $\lambda$ within 1 standard deviaion error bound of `"cv.min"`. ```{r} library(EAinference) set.seed(1234) n <- 30; p <- 50 Niter <- 10 group <- rep(1:(p/5), each = 5) weights <- rep(1, p/5) beta0 <- c(rep(1,5), rep(0, p-5)) X <- matrix(rnorm(n*p), n) Y <- X %*% beta0 + rnorm(n) # set lambda = .5 lassoFit(X = X, Y = Y, lbd = .5, group = group, weights = weights, type="grlasso") # use cross-validation lassoFit(X = X, Y = Y, lbd = "cv.1se", group = group, weights = weights, type="grlasso") ``` ## Parametric bootstrap sampler `PBsampler` supports two bootstrap methods; Gaussian bootstrap and wild multiplier bootstrap. Due to the fact that the sampling distirbution of $(\hat{\beta}, S)$, coefficient estimator and its subgradient, is characterized by $(\beta_0, \sigma^2, \lambda)$, users are required to provide `PE`(either the estimate of $\beta_0$ or the estimate of $E(y) = X\beta_0$), `sig2`(or estimate of $\sigma^2$) and `lbd`($\lambda$ from above loss functions). By specifying two sets of arguments, `(PE_1, sig2_1, lbd_1)` and `(PE_2, sig2_2, lbd_2)`, users can sample from the mixture distribution. In this way, samples will be drawn from `(PE_1, sig2_1, lbd_1)` with 1/2 probability and from `(PE_2, sig2_2, lbd_2)` with another 1/2 probability. 
## Fit lasso, group lasso, scaled lasso and scaled group lasso models

`lassoFit` computes lasso, group lasso, scaled lasso and scaled group lasso solutions. Users can either provide the value of $\lambda$ or use cross-validation by setting `lbd = "cv.min"` or `lbd = "cv.1se"`. With `lbd = "cv.min"`, $\lambda$ is chosen to minimize the cross-validated mean squared error, while `lbd = "cv.1se"` selects the largest $\lambda$ whose error is within one standard error of that of `"cv.min"`.

```{r}
library(EAinference)
set.seed(1234)
n <- 30; p <- 50
Niter <- 10
group <- rep(1:(p/5), each = 5)
weights <- rep(1, p/5)
beta0 <- c(rep(1, 5), rep(0, p - 5))
X <- matrix(rnorm(n * p), n)
Y <- X %*% beta0 + rnorm(n)
# set lambda = .5
lassoFit(X = X, Y = Y, lbd = .5, group = group, weights = weights, type = "grlasso")
# use cross-validation
lassoFit(X = X, Y = Y, lbd = "cv.1se", group = group, weights = weights, type = "grlasso")
```

## Parametric bootstrap sampler

`PBsampler` supports two bootstrap methods: Gaussian bootstrap and wild multiplier bootstrap. Because the sampling distribution of $(\hat{\beta}, S)$, the coefficient estimator and its subgradient, is characterized by $(\beta_0, \sigma^2, \lambda)$, users are required to provide `PE` (an estimate of either $\beta_0$ or $E(y) = X\beta_0$), `sig2` (an estimate of $\sigma^2$) and `lbd` ($\lambda$ from the loss functions above). By specifying two sets of arguments, `(PE_1, sig2_1, lbd_1)` and `(PE_2, sig2_2, lbd_2)`, users can sample from a mixture distribution: samples are drawn from `(PE_1, sig2_1, lbd_1)` with probability 1/2 and from `(PE_2, sig2_2, lbd_2)` with probability 1/2.

```{r}
set.seed(1234)
n <- 5; p <- 10
Niter <- 3
group <- rep(1:(p/5), each = 5)
weights <- rep(1, p/5)
beta0 <- c(rep(1, 5), rep(0, p - 5))
X <- matrix(rnorm(n * p), n)
Y <- X %*% beta0 + rnorm(n)
#
# Using non-mixture distribution
#
PBsampler(X = X, PE_1 = rep(0, p), sig2_1 = 1, lbd_1 = .5,
  weights = weights, group = group, type = "grlasso", niter = Niter,
  parallel = FALSE)
#
# Using mixture distribution
#
PBsampler(X = X, PE_1 = rep(0, p), sig2_1 = 1, lbd_1 = .5,
  PE_2 = rep(1, p), sig2_2 = 2, lbd_2 = .3, weights = weights,
  group = group, type = "grlasso", niter = Niter, parallel = FALSE)
```

## Importance sampler for high dimensional data

The importance sampler enables inference about extreme regions. This is done by using a proposal distribution that places more mass around the extreme region. Say we are interested in computing the expectation of a function of a random variable, $h(X)$. Let $f(x)$ be the true or target density and $g(x)$ the proposal density. We can approximate the expectation by a weighted average of samples drawn from the proposal distribution as follows:

$$
\begin{eqnarray}
E_f\Big[h(X)\Big] &=& E_g \Big[h(X)\frac{f(X)}{g(X)} \Big]\\
&\approx& \frac{1}{N}\sum_{i=1}^N h(x_i)\frac{f(x_i)}{g(x_i)}.
\end{eqnarray}
$$

With `hdIS`, one can easily compute the importance weights, that is, the ratios of the target density to the proposal density, $f(x_i)/g(x_i)$, in the equation above. Users simply draw samples from the proposal distribution with `PBsampler` and plug the result into `hdIS` together with the target distribution parameters to compute the importance weights.

```{r}
# Target distribution parameter
PETarget <- rep(0, p)
sig2Target <- .5
lbdTarget <- .37
# Proposal distribution parameter
PEProp1 <- rep(1, p)
sig2Prop1 <- .5
lbdProp1 <- 1

# Draw samples from proposal distribution
PB <- PBsampler(X = X, PE_1 = PEProp1, sig2_1 = sig2Prop1,
  lbd_1 = lbdProp1, weights = weights, group = group, niter = Niter,
  type = "grlasso", PEtype = "coeff")

# Compute importance weights
hdIS(PB, PETarget = PETarget, sig2Target = sig2Target, lbdTarget = lbdTarget,
  log = TRUE)
```
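The log-weights returned by `hdIS` can then be combined with any statistic of the bootstrap draws to approximate its expectation under the target distribution, as in the display above. A minimal sketch follows; the slot name `PB$beta` (one draw per row) and the indicator statistic are assumptions made for illustration, so check `str(PB)` for the actual structure of the sampler output.

```{r, eval = FALSE}
# Sketch: importance-sampling estimate of E_f[h(beta_hat)] for an
# illustrative statistic h. `PB$beta` is an assumed slot name for the
# sampled coefficients, not documented API; inspect str(PB) to confirm.
logW <- hdIS(PB, PETarget = PETarget, sig2Target = sig2Target,
             lbdTarget = lbdTarget, log = TRUE)
h <- apply(PB$beta, 1, function(b) as.numeric(sum(abs(b)) > 1))
mean(h * exp(logW)) # approximates E_f[ 1{||beta_hat||_1 > 1} ]
```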
## Metropolis-Hastings Lasso Sampler

In this section, we introduce `MHLS`, a Markov chain Monte Carlo (MCMC) sampler for the lasso estimator. Although bootstrapping is one of the most convenient sampling methods, it has a clear limitation: it cannot draw samples from a conditional distribution. In contrast, an MCMC sampler can do so easily. Here we introduce the `MHLS` function, which draws lasso samples with the active set fixed at $A$.

```{r}
weights <- rep(1, p)
lbd <- .37
LassoResult <- lassoFit(X = X, Y = Y, lbd = lbd, type = "lasso", weights = weights)
B0 <- LassoResult$B0
S0 <- LassoResult$S0
Result <- MHLS(X = X, PE = B0, sig2 = 1, lbd = 1,
  weights = weights, B0 = B0, S0 = S0, niter = 100, burnin = 0,
  PEtype = "coeff")
Result
```

We provide summary and plot functions for `MHLS` results.

```{r}
summary(Result)
```

```{r, fig.width = 7, fig.height = 4.5}
plot(Result, index = c(1, 4, 9))
```

## Post-selection Inference

`postInference.MHLS` is a method for post-selection inference. The inference is based on `MHLS` results from multiple chains. To make the result robust, a different `PE` value is used for each chain. After drawing samples of $(\hat{\beta}, S)$ with the MH sampler, we refit the estimator to remove the bias of the lasso estimator; the sketch after the example below illustrates this refitting step. The final output is the $(1-\alpha)$ quantile of each active coefficient.

```{r}
postInference.MHLS(X = X, Y = Y, lbd = lbd, sig2.hat = 1, alpha = .05,
  method = "coeff", nChain = 5, niterPerChain = 20,
  parallel = !(.Platform$OS.type == "windows"))
```
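For intuition, the refitting step described above amounts to an unpenalized least-squares fit restricted to the active set. The sketch below illustrates the idea with the lasso fit from the previous section; it is illustrative only, and the internal refitting in `postInference.MHLS` may differ in detail.

```{r, eval = FALSE}
# Sketch of the bias-removal refit: ordinary least squares restricted
# to the lasso active set. Illustrative only; the internal refit used
# by postInference.MHLS may differ.
A <- which(B0 != 0)                                    # active set of the lasso fit
refit <- lm(as.numeric(Y) ~ X[, A, drop = FALSE] - 1)  # unpenalized refit
coef(refit)                                            # de-biased coefficient estimates
```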