---
title: "Getting Started with DMRnet"
author: "Szymon Nowakowski"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with DMRnet}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# This document

The purpose of this vignette is to introduce readers to the `DMRnet` package, bring them up to speed by providing a simple use case example, and point interested readers towards more comprehensive material.

# `DMRnet` package

`DMRnet` is an `R` package for regression and classification with a family of model selection algorithms. The package supports both continuous and categorical predictors, and the overall number of regressors may exceed the number of observations. A selected model consists of a subset of numerical regressors and partitions of levels of factors.

The available model selection algorithms are the following:

- `DMRnet` is the default and the most comprehensive model selection algorithm in the package; it can be used both for $p < n$ and for high-dimensional $p \geq n$ problems.

# Binomial family

For classification, the response is a two-level factor and the models are fitted with `family="binomial"`:

```{r binomial}
binomial_y <- factor(y > mean(y))   # changing Miete response var y into a binomial factor with 2 classes
binomial_models <- DMRnet(X, binomial_y, family = "binomial")
gic.binomial_model <- gic.DMR(binomial_models)
gic.binomial_model$df.min
```

A corresponding `predict` call has a `type` parameter with the default value `"link"`, which returns the linear predictors. To change that behavior, substitute the default with `type="response"` to get the fitted probabilities, or with `type="class"` to get the class labels corresponding to the maximum probability. So, to get actual values compatible with a binomial `y`, `type="class"` should be used:

```{r predict-binomial}
predict(gic.binomial_model, newx = head(X), type = "class")
```

Please note that 1 is the value of the target class in the `predict` output.

# References

1. Szymon Nowakowski, Piotr Pokarowski, Wojciech Rejchel and Agnieszka Sołtys, 2023.
*Improving Group Lasso for High-Dimensional Categorical Data.* In: Computational Science – ICCS 2023. Lecture Notes in Computer Science, vol 14074, pp. 455-470. Springer, Cham.
2. Szymon Nowakowski, Piotr Pokarowski and Wojciech Rejchel, 2021. *Group Lasso Merger for Sparse Prediction with High-Dimensional Categorical Data.* arXiv [stat.ME].
3. Aleksandra Maj-Kańska, Piotr Pokarowski and Agnieszka Prochenka, 2015. *Delete or merge regressors for linear model selection.* Electronic Journal of Statistics 9(2): 1749-1778.
4. Piotr Pokarowski and Jan Mielniczuk, 2015. *Combined l1 and greedy l0 penalized least squares for linear model selection.* Journal of Machine Learning Research 16(29): 961-992.
5. Piotr Pokarowski, Wojciech Rejchel, Agnieszka Sołtys, Michał Frej and Jan Mielniczuk, 2022. *Improving Lasso for model selection and prediction.* Scandinavian Journal of Statistics 49(2): 831-863.
6. Ludwig Fahrmeir, Rita Künstler, Iris Pigeot and Gerhard Tutz, 2004. *Statistik: der Weg zur Datenanalyse* [Statistics: The Path to Data Analysis]. 5th ed. Berlin: Springer-Verlag.
7. Dean P. Foster and Edward I. George, 1994. *The Risk Inflation Criterion for Multiple Regression.* The Annals of Statistics 22(4): 1947-1975.
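
# Appendix: gaussian-family sketch

The binomial example above has a continuous-response counterpart. Below is a minimal sketch, not run in this vignette, of the analogous gaussian-family workflow; that the `miete` data of Fahrmeir et al. is bundled with `DMRnet` and that its rent response sits in the first column are assumptions here:

```r
library(DMRnet)

data("miete")   # rent data (assumed bundled with DMRnet)
y <- miete[, 1]    # continuous response: rent (assumed to be the first column)
X <- miete[, -1]   # continuous and categorical predictors

models <- DMRnet(X, y, family = "gaussian")   # fit a path of models
gic.model <- gic.DMR(models)       # select the model dimension minimizing GIC
gic.model$df.min                   # number of parameters of the selected model
predict(gic.model, newx = head(X)) # predictions for the first few observations
```

For a gaussian `y`, `predict` returns fitted values directly, so no `type` argument is needed here.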