--- title: "Group Comparisons with Extra Sum-of-Squares F-Test" author: "Brent Kaplan" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Group Comparisons with Extra Sum-of-Squares F-Test} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( echo = TRUE, collapse = TRUE, comment = "#>", dev = "png", dpi = 144, fig.width = 7, fig.height = 5, warning = FALSE, message = FALSE ) ``` # Introduction When you have multiple groups (e.g., treatment vs. control, different populations), it is often useful to compare whether separate demand curves are preferred over a single curve. The **Extra Sum-of-Squares _F_-test** addresses this by fitting a single curve to all data, computing the residual deviations, and comparing those residuals to those obtained from separate group-level curves. For background on fitting individual demand curves, see `vignette("beezdemand")`. For mixed-effects approaches to group comparisons, see `vignette("mixed-demand")`. > **Note:** For formal statistical inference on group differences, mixed-effects > models (`vignette("mixed-demand")`) are the preferred modern approach, offering > random effects and estimated marginal means. `ExtraF()` remains useful for > quick, traditional group comparisons. ```{r packages, include = FALSE, echo = FALSE} library(dplyr) library(ggplot2) library(beezdemand) ``` ## Setting Up Groups We will use the built-in `apt` dataset and manufacture random groupings for demonstration purposes. ```{r ftest} ## setting the seed initializes the random number generator so results will be ## reproducible set.seed(1234) ## manufacture random grouping apt$group <- NA apt[apt$id %in% sample(unique(apt$id), length(unique(apt$id))/2), "group"] <- "a" apt$group[is.na(apt$group)] <- "b" ## take a look at what the new groupings look like in long form knitr::kable(apt[1:20, ]) ``` ## Running the Extra Sum-of-Squares F-Test The `ExtraF()` function will determine whether a single $\alpha$ or a single $Q_0$ is better than multiple $\alpha$s or $Q_0$s. A resulting _F_ statistic will be reported along with a _p_ value. ```{r ftest2} ## in order for this to run, you will have had to run the code immediately ## preceeding (i.e., the code to generate the groups) ef <- ExtraF(dat = apt, equation = "koff", k = 2, groupcol = "group", verbose = TRUE) ``` A summary table (broken up here for ease of display) will be created when the option `verbose = TRUE`. This table can be accessed as the `dfres` object resulting from `ExtraF`. In the example above, we can access this summary table using `ef$dfres`: ```{r ftest-ouput, results = 'asis', echo=FALSE} knitr::kable(ef$dfres[, 1:5], caption = "Fitted Measures") knitr::kable(ef$dfres[, c(1, 6:8)], caption = "Uncertainty and Model Information") knitr::kable(ef$dfres[, c(1, 9:11)], caption = "Derived Measures") knitr::kable(ef$dfres[, c(1, 12, 14)], caption = "Convergence and Summary Information") ``` ## Visualizing Group Comparisons When `verbose = TRUE`, objects from the result can be used in subsequent graphing. The following code generates a plot of our two groups. We can use the predicted values already generated from the `ExtraF` function by accessing the `newdat` object. In the example above, we can access these predicted values using `ef$newdat`. Note that we keep the linear scaling of y given we used Koffarnus et al. (2015)'s equation fitted to the data. ```{r plot-ftest, warning = FALSE} ## be sure that you've loaded the tidyverse package (e.g., library(tidyverse)) ggplot(apt, aes(x = x, y = y, group = group)) + ## the predicted lines from the sum of squares f-test can be used in subsequent ## plots by calling data = ef$newdat geom_line(aes(x = x, y = y, group = group, color = group), data = ef$newdat[ef$newdat$x >= .1, ]) + stat_summary(fun.data = mean_se, aes(width = .05, color = group), geom = "errorbar") + stat_summary(fun = mean, aes(fill = group), geom = "point", shape = 21, color = "black", stroke = .75, size = 4) + scale_x_log10(limits = c(.4, 50), breaks = c(.1, 1, 10, 100)) + scale_color_discrete(name = "Group") + scale_fill_discrete(name = "Group") + labs(x = "Price per Drink", y = "Drinks Purchased") + theme(legend.position = c(.85, .75)) + ## theme_apa is a beezdemand function used to change the theme in accordance ## with American Psychological Association style theme_apa() ``` ## See Also - `vignette("beezdemand")` -- Getting started with beezdemand - `vignette("model-selection")` -- Choosing the right model class for your data - `vignette("fixed-demand")` -- Fixed-effect demand modeling - `vignette("mixed-demand")` -- Mixed-effects nonlinear demand models (NLME) - `vignette("hurdle-demand-models")` -- Two-part hurdle demand models - `vignette("cross-price-models")` -- Cross-price demand analysis - `vignette("migration-guide")` -- Migrating from `FitCurves()`