---
title: "Building Your Own Custom Pipeline Extensions"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Building Your Own Custom Pipeline Extensions}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

One key strength of the preprocessing framework within `eyeris` is its modularity. 

While we encourage most users to use the `glassbox()` function for simplicity and reproducibility, advanced users can create custom preprocessing steps that 
seamlessly integrate into the pipeline.

This vignette walks you through the structure required to write your own 
`eyeris`-compatible preprocessing functions.

## 🧩 How the Pipeline Works

Under the hood, each preprocessing function in `eyeris` is a wrapper around a 
core operation that gets tracked, versioned, and stored using the 
`pipeline_handler()`. 

Custom pipeline steps must conform to the `eyeris`
protocol for maximum compatibility with the downstream functions we provide.

Following the `eyeris` protocol also ensures:
- all operations follow a predictable structure, and
- that new pupil data columns based on previous operations in the chain are able
to be dynamically constructed within the core `timeseries` data frame.

For instance: 
`pupil_raw -> pupil_raw_deblink -> pupil_raw_deblink_detransient -> ...`

If you're unfamiliar with how these columns are structured and tracked, 
first check out the companion vignette: 
[📦 Anatomy of an `eyeris` Object](anatomy.html).

## 🛠️Creating a Custom Extension for `eyeris`

Let’s say you want to write a new `eyeris` extension function called `winsorize()`
  to apply [winsorization](https://en.wikipedia.org/wiki/Winsorizing) to extreme
  pupil values.
  
### 1) Write the core operation function

This function should accept a data frame `x`, a string `prev_op` (i.e., the name
of the previous pupil column), and any custom parameters.

#### To illustrate:

```{r eval=FALSE}
winsorize_pupil <- function(x, prev_op, lower = 0.01, upper = 0.99) {
  vec <- x[[prev_op]]
  q <- quantile(vec, probs = c(lower, upper), na.rm = TRUE)
  vec[vec < q[1]] <- q[1]
  vec[vec > q[2]] <- q[2]
  vec
}
```

### 2) Create the wrapper using the `eyeris::pipeline_handler()`

The `pipeline_handler()` enables your function to automatically:

- track your function within the `eyeris` list object's `params` field,
- append a new column to each block within the `timeseries` list of data frames, and
- update the object's `latest` pointer.

#### To illustrate:

```{r eval=FALSE}
#' Winsorize pupil values
#'
#' Applies winsorization to extreme pupil values within each block.
#'
#' @param eyeris An `eyeris` object created by [load_asc()].
#' @param lower Lower quantile threshold. Default is 0.01.
#' @param upper Upper quantile threshold. Default is 0.99.
#'
#' @return Updated `eyeris` object with new winsorized pupil column.
winsorize <- function(eyeris, lower = 0.01, upper = 0.99) {
  pipeline_handler(
    eyeris,
    winsorize_pupil,
    "winsorize",
    lower = lower,
    upper = upper
  )
}
```

1. Here, the first argument is always the `eyeris` class object.
2. Then, is the name of the function (the one you created in step 1 above).
3. Third, is the internal label you want `eyeris` to refer to (i.e., the one
   for the column name, plots, etc.).
4. Fourth position++ are the parameters you created after `prev_op` when
   creating the function `winsorize_pupil` in step 1.
   
## 🎉 And that's it!

You should now be able to use your new function extension as a component within 
a new custom `eyeris` pipeline declaration. To illustrate:

```{r eval=FALSE}
system.file("extdata", "memory.asc", package = "eyeris") |>
  eyeris::load_asc(block = "auto") |>
  eyeris::deblink(extend = 50) |>
  winsorize()
```

## 💪 Best Practices

- **Use consistent naming:** i.e., match your suffix (e.g. `"winsorize"`) to the 
column and log structure:
  - If you call `pipeline_handler(..., "winsorize")`
  - Then, your function names should reflect that:
    - the public facing `winsorize()` wrapper function, and
    - the private `winsorize_pupil()` logic implementation function.
  - You will then see `pupil_raw_*_winsorize` as the new output column!
- **Respect previous steps:** Your custom function should rely only on `prev_op`,
not on **_any hardcoded column names_**!
- **Return the expected data type:** Be sure that you private function always 
returns a modified vector type, as the underlying `pipeline_handler()` is
looking out for a vector it can transpose into the new column that will be added
to the `timeseries` data frame within the resulting `eyeris` object.
- **Test on individual blocks:** Always try your private function logic on a 
single data block (e.g., `eyeris$timeseries[[1]]`) to debug before integrating
it with the `eyeris` pipeline protocol.

## ✨ Summary

We hope you are now convinced at the power and extensibility the `eyeris`
protocol enables! As we demonstrated here, with just a little bit of structure,
you can create custom extension steps tailored to your specific analysis
needs -- all while preserving the reproducibility and organizational core
principles `eyeris` was designed and built around.

If you'd like to contribute new steps to `eyeris`, feel free to open a pull 
request or discussion on [GitHub](https://github.com/shawntz/eyeris)!

---

## 📚 Citing `eyeris`

<div class="alert alert-light" style="padding-bottom: 0;">
  If you use the `eyeris` package in your research, please cite it!
  
  Run the following in R to get the citation:
</div>

```{r}
citation("eyeris")
```