--- title: "Warnings, Messages, and Errors" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Warnings, Messages, and Errors} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` This vignette provides documentation for the different warnings and messages that are implemented for the package. Results of calls (eg printing of data frames following `set_contrasts`) will be hidden, but warnings and messages will still be shown. Code chunks will be self-contained, so bear with a little repetitiveness. I will primarily be using the `sum_code` function because it is the shortest one to write. ```{r setup, message=FALSE, warning=FALSE} library(contrastable) library(rlang) ``` ## Common Messages User will be notified if a non-factor column is coerced to a factor with `set_contrasts` and `enlist_contrasts`: ```{r,results='hide'} set_contrasts(mtcars, cyl ~ sum_code) ``` User will be notified if there exist other factors in the dataframe that have not been set in a call to `set_contrasts` or `enlist_contrasts`. Unordered factors are color coded in blue, while ordered factors are colored red. The default contrasts from `options("contrasts")` are also displayed and color coded accordingly. ```{r,results='hide'} my_data <- mtcars my_data$gear <- ordered(my_data$gear) my_data$carb <- factor(my_data$carb) my_data$cyl <- factor(my_data$cyl) set_contrasts(my_data, cyl ~ sum_code) ``` Ordered factors by default use `contr.poly` and return coefficient estimates interpretable as nth-degree polynomial trends. Typically, contrast matrices are set for unordered factors, so ordered factors are a bit of a special case. So, user will be notified if setting the contrasts for an ordered factor with `set_contrasts` or `enlist_contrasts` ```{r,results='hide'} my_data <- mtcars my_data$gear <- ordered(my_data$gear) set_contrasts(my_data, gear ~ sum_code) ``` ## Warnings User will be warned if attempting to set contrasts on a factor with only one level. If the user sets contrasts for multiple variables, any variables with more than one level will successfully be computed. However, the variables with only one level will just show no results. Note the output in the example below. ```{r} my_data <- data.frame(foo = factor("A"), boo = factor(c("B", "C"))) enlist_contrasts(my_data, foo ~ sum_code, boo ~ sum_code) ``` Related to the above, if the user only sets contrasts on factors that have only one level, an error will be thrown. The above warning will also be shown. ```{r} my_data <- data.frame(foo = factor("A"), boo = factor(c("B", "C"))) try(enlist_contrasts(my_data, foo ~ sum_code)) ``` Contrastable provides some flexibility in how contrasts are passed to formulas. Under the hood, methods are defined for the following object classes: - matrix - symbol/name - function - hypr You'll receive a warning if you use something else and the default contrast (depending on ordered or unordered) will be used instead. ```{r,results='hide'} my_matrix <- sum_code(3) # current class is "matrix" "array" class(my_matrix) <- "foo" # now class is "foo" set_contrasts(mtcars, cyl ~ my_matrix) # idk what to do with "foo" objects ``` So long as the object *inherits* one of the above classes though it will work as expected. Accordingly, note that no class coercion happens here. ```{r, results='hide'} class(my_matrix) <- c("foo", "matrix", "array") set_contrasts(mtcars, cyl ~ my_matrix) # idk what "foo" is but i know "matrix"! ``` Note that if you accidentally quote the function name, you will receive an error instead because it thinks you're trying to set contrasts with an atomic character object. ```{r,results='hide'} try(set_contrasts(mtcars, cyl ~ "sum_code")) # sum_code shouldnt be in quotes ``` The same issue can commonly occur if you don't order the right hand side correctly. The first term should always be a contrast-generating function or a variable name that's been assigned an object of the above classes. Failure to do this can lead to other kinds of errors depending on what the first element to the right hand side is. I've tried to document a few cases of illformedness, but I can't do every case. ```{r, results='hide'} try(set_contrasts(mtcars, cyl ~ 4 + sum_code)) # bad! try(set_contrasts(mtcars, cyl ~ sum_code + 4)) # good! # These give different kinds of errors, all are ill-formed try(set_contrasts(mtcars, cyl ~ +4 + sum_code)) try(set_contrasts(mtcars, cyl ~ c("a", "b") + sum_code)) try(set_contrasts(mtcars, cyl ~ 1 + 2 + 3 + sum_code)) ``` ### hypr integration hypr is another package for contrast coding but focuses on setting the desired comparisons for a factor manually. The philosophy there is that *all* comparisons should be explicitly written down, while the philosophy with this package is that well-defined contrast schemes should be implemented in a way that is deterministic and not prone to typing errors. Accordingly, the special syntax this package provides is ignored when using hypr objects. Note that the examples here are set to not be run so that you're not forced to have the hypr package installed in order to install the package/build the vignette. ```{r,eval=FALSE, results='hide'} library(hypr) my_data <- data.frame(foo = factor(c("A", "B", "C"))) hypr_object <- hypr::hypr(A ~ B, A ~ C) set_contrasts(my_data, foo ~ hypr_object + "B" * "B" - "C") ``` ``` Warning messages: 1: In use_contrasts.hypr(factor_col = get(params[["factor_col"]], model_data), : reference_level ignored when using hypr object 2: In use_contrasts.hypr(factor_col = get(params[["factor_col"]], model_data), : set_intercept ignored when using hypr object 3: In use_contrasts.hypr(factor_col = get(params[["factor_col"]], model_data), : drop_trends ignored when using hypr object ``` hypr objects don't need access to the levels in a factor, so there's an opportunity for mismatches in level names to arise. The matrix might still work out to be what you intend, but this isn't guaranteed. ```{r,eval=FALSE, results='hide'} my_data <- data.frame(foo = factor(c("A", "B", "C"))) hypr_object <- hypr::hypr(varA ~ varB, varA ~ varC) set_contrasts(my_data, foo ~ hypr_object)$foo ``` ``` Warning message: In use_contrasts.hypr(factor_col = get(params[["factor_col"]], model_data), : Levels in hypr object not found in factor column `foo`: varA, varB, varC Contrasts may be misspecified. ``` Generally, this package isn't really designed with the hypr-user in mind. But, so long as you're mindful of the level names and the number of levels, then you can plug in hypr objects freely in conjunction with other methods. This is helpful if you need to use a custom matrix for just one variable, but the rest can use "standard" contrasts. ```{r,eval=FALSE,results='hide'} my_data <- data.frame(foo = factor(c("A", "B", "C")), boo = factor(c("A", "B", "C"))) hypr_object <- hypr::hypr(A ~ B, A ~ C) enlist_contrasts(my_data, foo ~ hypr_object, boo ~ sum_code) ``` ### glimpse_contrasts safety checks `glimpse_contrasts` has a fairly extensive set of warnings, so I'm giving it its own subsection. Broadly, these warnings relate to mismatches between what the user defines in a series of formulas and what is actually set on the provided dataframe. Special to these warnings is that they will provide written out R code that the user can copy and paste from the console to fix the issue--- where "fix" refers to "setting the contrasts to the dataframe explicitly with `set_contrasts`". User will be warned if the contrast matrix for a factor in a dataframe does not match the contrasts defined by the provided formulas. ```{r,results='hide'} my_data <- data.frame(foo = factor("A"), boo = factor(c("B", "C"))) glimpse_contrasts(my_data, boo ~ sum_code) ``` Note that you can also define the contrast formulas in a list, and this will be reflected in the warning as well. ```{r,results='hide'} my_data <- data.frame(foo = factor("A"), boo = factor(c("B", "C"))) # Define our contrasts outside the call clist <- list(boo ~ sum_code) glimpse_contrasts(my_data, clist) # Note the final line in the warning ``` Using some of the contrast-generating matrix functions in this package *with* `set_contrasts` or `enlist_contrasts` will automatically set the comparison labels (i.e., the column names of the contrast matrix). However, using these functions on their own will not specify the labels (because they merely return matrices given some arbitrary number of levels). So, if you manually set the contrasts to a factor without also defining the comparison labels, then the labels will be missing. When using the same contrast generating function in `glimpse_contrast`, the user will be warned that the *comparison labels don't match* (even though the *contrast matrices* do). Here's an example when the labels are missing: ```{r,results='hide'} my_data <- mtcars my_data$cyl <- factor(my_data$cyl) # This will erase the column names contrasts(my_data$cyl) <- helmert_code(3) glimpse_contrasts(my_data, cyl ~ helmert_code) ``` And here's an example when they exist but don't match ```{r,results='hide'} my_data <- mtcars my_data$cyl <- factor(my_data$cyl) # contr.treatment by default glimpse_contrasts(my_data, cyl ~ treatment_code | c("6vs4", "8vs4")) ``` The user will be warned when specifying contrasts with `glimpse_contrasts` for a variable that isn't a factor. So, `set_contrasts` will coerce the variable to a factor, but `glimpse_contrasts` doesn't modify the dataframe it's given. ```{r,results='hide'} my_data <- mtcars glimpse_contrasts(my_data, cyl ~ sum_code) ``` ALL of the above described warnings can occur at the same time. They are combined into a single warning. Note that when warned about the *comparison labels* not matching, the numeric matrices are nonetheless the same. Typically, when the matrices are different, the labels will *also* be different, so the latter isn't reported when the matrices differ to save space. ```{r,results='hide'} my_data <- mtcars my_data$cyl <- factor(my_data$cyl) # contr.treatment by default my_data$carb <- factor(my_data$carb) contrasts(my_data$cyl) <- sum_code(3) my_data$am <- factor(my_data$am) glimpse_contrasts(my_data, cyl ~ sum_code, carb ~ sum_code, gear ~ sum_code, am ~ treatment_code | c("diffLabel")) ``` Currently, `glimpse_contrasts` depends on being provided with the desired contrast formulas/matrices to provide a summary table correctly. If you set the contrasts, then try to summarize the dataframe without the the same formulas, then you'll get a warning that the factor (whether unordered or ordered) does not match the default contrasts that would normally be expected. In the example below, the contrasts are set to `sum_code`, but because the default contrasts for unordered factors is `contr.treatment`, the contrast matrices won't match for `my_data$cyl`. The scheme will be reported as `"???"` ```{r} my_data <- set_contrasts(mtcars, cyl ~ sum_code, verbose = FALSE) glimpse_contrasts(my_data) ``` Currently, there is not an elegant way for `glimpse_contrasts` to check what the contrast scheme is based solely on the resulting matrix. Moreover, many contrast schemes are the same when the number of levels is 2, meaning that a matrix of `+.5/-.5` could be `scaled_sum_code` or `helmert_code` (among others). I'm also reluctant to add additional attributes to the dataframe for the sole purpose of `glimpse_contrasts` being able to summarize things. The most robust solution if you want to make use of `glimpse_contrasts` is to specify a list of contrast matrices, then pass the list to `set_contrasts` and `glimpse_contrasts`. ```{r,results='hide'} contrast_list <- list(cyl ~ scaled_sum_code, carb ~ sum_code, gear ~ sum_code, am ~ treatment_code | c("diffLabel")) my_data <- set_contrasts(mtcars, contrast_list, verbose = FALSE) glimpse_contrasts(my_data, contrast_list) ``` ## Errors Here are some common errors, though not all of them are errors from contrastable per se. You'll receive an error if you use a matrix that's a different size from what the contrast matrix should be. Specifically, contrasts matrices for a factor with n levels should be size `nx(n-1)`. ```{r} my_matrix <- sum_code(4) try(set_contrasts(mtcars, cyl ~ my_matrix)) # cyl has 3 levels, not 4 ``` You'll receive an error if you try to set the reference level (or intercept) to a level that doesn't exist in the factor. ```{r} try(set_contrasts(mtcars, cyl ~ sum_code + 100)) try(set_contrasts(mtcars, cyl ~ sum_code * "blah")) ``` You'll receive an error if you accidentally use `=` instead of `~` ```{r} try(set_contrasts(mtcars, cyl = sum_code)) ``` You'll receive an error if you accidentally specify contrasts for the same variable more than once: ```{r} try(set_contrasts(mtcars, cyl ~ sum_code, cyl ~ scaled_sum_code)) try(set_contrasts(mtcars, cyl + gear ~ sum_code, cyl ~ scaled_sum_code)) try(set_contrasts(mtcars, where(is.numeric) ~ sum_code, cyl ~ scaled_sum_code)) these_vars <- c("cyl", "gear") try(set_contrasts(mtcars, all_of(these_vars) ~ sum_code, where(is.numeric) ~ scaled_sum_code)) ``` You'll receive an error if you accidentally forget to pass the dataframe: ```{r} try(set_contrasts(cyl ~ sum_code)) ``` You'll receive an error if you forget to pass any contrasts formulas: ```{r} try(set_contrasts(mtcars)) ``` Variables with only one level will be ignored, but you'll receive an error if there are no remaining variables with more than one level (i.e., if you *only* specify one-level factors). Note that you'll still get the warning message. ```{r} try(set_contrasts(data.frame(a = factor(1)), a ~ sum_code)) ```