surreal

Overview

surreal implements the “Residual (Sur)realism” algorithm described by Stefanski (2007). The package generates datasets whose residual plots reveal a hidden image or message once a linear model is fitted, offering a novel way to understand and illustrate statistical concepts.

Installation

You can install the development version of surreal from GitHub with:

# install.packages("remotes")
remotes::install_github("coatless-rpkg/surreal")

Usage

First, load the package:

library(surreal)

Given an image stored as the x and y coordinates of its pixels, surreal builds a dataset that hides the image in the residual plot of a fitted linear model.

Importing Data

As an example, let’s use the built-in R logo dataset:

data("r_logo_image_data", package = "surreal")

plot(r_logo_image_data, pch = 16, main = "Original R Logo Data")

The data is a data frame with two columns, x and y, holding one point of the image per row:

str(r_logo_image_data)
#> 'data.frame':    2000 obs. of  2 variables:
#>  $ x: int  54 55 56 57 58 59 34 35 36 49 ...
#>  $ y: int  -9 -9 -9 -9 -9 -9 -10 -10 -10 -10 ...
summary(r_logo_image_data)
#>        x                y         
#>  Min.   :  5.00   Min.   :-75.00  
#>  1st Qu.: 32.00   1st Qu.:-57.00  
#>  Median : 57.00   Median :-39.00  
#>  Mean   : 55.29   Mean   :-40.48  
#>  3rd Qu.: 77.00   3rd Qu.:-24.00  
#>  Max.   :100.00   Max.   : -9.00
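
If your own image starts out as a pixel grid rather than a table of coordinates, it first needs to be reshaped into the same two-column layout. The following is a minimal base R sketch, not part of the package: the `bitmap` matrix and the `custom_image` name are hypothetical, and negating the row index so that row 1 sits at the top is an assumption suggested by the negative y values in the R logo data.

```r
# A tiny, hypothetical 5 x 5 bitmap of a plus sign; TRUE marks an "ink" pixel
bitmap <- matrix(FALSE, nrow = 5, ncol = 5)
bitmap[3, ] <- TRUE
bitmap[, 3] <- TRUE

# which(..., arr.ind = TRUE) returns the row and column index of every TRUE cell
idx <- which(bitmap, arr.ind = TRUE)

# Column index becomes x; row index is negated so that row 1 is drawn at the top
custom_image <- data.frame(x = idx[, "col"], y = -idx[, "row"])

str(custom_image)
plot(custom_image, pch = 16, main = "Custom Image Data")
```

A data frame built this way has the same x and y columns as r_logo_image_data. A real image would need far more points than this toy example.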

Applying the Surreal Method

Now, let’s apply the surreal method:

set.seed(114)
transformed_data <- surreal(r_logo_image_data)

The transformed data contains a response y and a set of predictors whose pairwise scatterplots show no apparent pattern:

pairs(y ~ ., data = transformed_data, main = "Data After Transformation")

Revealing the Hidden Image

Fit a linear model to the transformed data and plot the residuals:

model <- lm(y ~ ., data = transformed_data)
plot(fitted(model), resid(model), pch = 16,
     main = "Residual Plot: Hidden R Logo Revealed")

The residual plot reveals the original R logo, framed by a slight border that improves the image recovery.
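
To see why an ordinary least-squares fit can return a picture, it helps to build a toy version by hand. The sketch below is a simplified base R illustration of the underlying idea and is not the package's implementation or Stefanski's full algorithm. It relies on two facts: residuals are always orthogonal to the intercept and to every predictor, and fitted values always lie in the span of the predictors. All names (`img`, `E`, `Z`, `b`) are hypothetical.

```r
set.seed(1)

# Toy image: points on a circle (hypothetical stand-in for pixel coordinates)
n <- 200
theta <- seq(0, 2 * pi, length.out = n + 1)[-1]
img <- data.frame(x = cos(theta), y = sin(theta))

# Target fitted values f and target residuals r.
# Residuals must have mean zero and be uncorrelated with the fitted values,
# so remove from y whatever an intercept and x explain linearly.
f <- img$x
r <- unname(resid(lm(y ~ x, data = img)))

# Random noise predictors, each made orthogonal to r
p <- 4
E <- matrix(rnorm(n * (p - 1)), nrow = n)
E <- E - outer(r, drop(crossprod(E, r))) / sum(r^2)

# Hide f inside the predictors: the last column is chosen so that
# f equals that column plus a linear combination of the noise columns
b <- runif(p - 1, min = -1, max = 1)
Z <- cbind(E, f - E %*% b)

# The response is simply "fitted + residual"
d <- data.frame(y = f + r, Z)
fit <- lm(y ~ ., data = d)

# The regression recovers f as fitted values and r as residuals
plot(fitted(fit), resid(fit), pch = 16, main = "Toy Example: Hidden Circle")
```

Because f lies in the span of the predictors and r is orthogonal to all of them, the fit splits y back into exactly those two pieces, and plotting one against the other redraws the circle.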

Creating Custom Hidden Images

You can also create datasets with custom hidden images or text. Here’s a quick example using text:

text_data <- surreal_text("R\nis\nawesome!")
model <- lm(y ~ ., data = text_data)
plot(fitted(model), resid(model), pch = 16, main = "Custom Text in Residuals")

References

Stefanski, L. A. (2007). “Residual (Sur)realism”. The American Statistician, 61(2), 163-177. doi:10.1198/000313007X190079

Acknowledgements

This package builds upon the work of John Staudenmayer, Peter Wolf, and Ulrike Gromping, who initially brought these algorithms to R.