---
title: "Peak Finder"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Peaks-and-Troughs}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, message = FALSE}
library(PnT)
library(plotly)
mylar <- readRDS('../inst/extdata/mylar_data.rds')
```

This data set is a spectrum of the light emitted by a sample during elemental analysis. The first peak corresponds to hydrogen in the sample, and the second peak corresponds to other elements.

```{r plot, out.width="100%"}
plot_ly(mylar, x = ~Ch, y = ~CPS,
        type = 'scatter', mode = 'lines')
```

To find these peaks, we call `find_peaks()` with the data frame and the names of the x and y columns:

```{r peaks}
find_peaks(mylar, 'Ch', 'CPS')
```

Next, let's see how to find peaks in noisier data:

```{r noisy data}
X <- seq(0, 10, 0.01)
Y <- sin(X) + runif(length(X), min=-0.1, max=0.1)
data <- data.frame(X=X, Y=Y)
plot_ly(data, x = ~X, y = ~Y,
        type = 'scatter', mode = 'lines')
```
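Because `runif()` draws new random numbers on every run, the noisy series, and therefore the peaks reported below, can change each time the vignette is built. A minimal sketch of how the noise could be made reproducible, assuming an arbitrary seed (the value `42` and the names `x_demo`, `y_first`, and `y_second` are illustrative and not part of the original vignette):

```r
# Fix the random number generator state so the noise is identical on every run.
set.seed(42)  # arbitrary seed, chosen only for illustration
x_demo <- seq(0, 10, 0.01)
y_first <- sin(x_demo) + runif(length(x_demo), min = -0.1, max = 0.1)

# Resetting the same seed reproduces exactly the same noisy series.
set.seed(42)
y_second <- sin(x_demo) + runif(length(x_demo), min = -0.1, max = 0.1)

identical(y_first, y_second)  # TRUE
```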

To find the peaks, we can use the same function call:

```{r noisy peaks}
find_peaks(data, 'X', 'Y')
```

Now, if the data is even noisier, the peak finder might need stricter filter parameters.

```{r}
Y <- Y + runif(length(X), min=-0.1, max=0.1)
data <- data.frame(X=X, Y=Y)
find_peaks(data, 'X', 'Y')
```

This data has too much noise for the default filter.
So, we increase the amount of noise blocked by the filter using the parameter `minYerror`.
This parameter has a default value of 0.1.

```{r use filter}
find_peaks(data, 'X', 'Y', minYerror=0.2)
```
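One way to see why the threshold had to be raised is to look at the noise itself: adding a second uniform draw on [-0.1, 0.1] to the first doubles the worst-case noise amplitude to 0.2. The sketch below only quantifies that noise; it makes no claim about how `find_peaks` applies `minYerror` internally, and the seed and variable names are illustrative assumptions.

```r
set.seed(1)  # arbitrary seed, for a repeatable illustration only
n <- 1001    # same length as seq(0, 10, 0.01)

noise_one <- runif(n, min = -0.1, max = 0.1)              # first noise layer
noise_two <- noise_one + runif(n, min = -0.1, max = 0.1)  # both layers combined

range(noise_one)  # stays within [-0.1, 0.1]
range(noise_two)  # can reach up to [-0.2, 0.2]
```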

Another possibility is that we only want the thin, sharp peaks.

```{r}
Y <- (7/(5+(X-5)^2)) + (0.05/(0.02+(X-8)^2)) + (0.03/(0.03+(X-1)^2))
data <- data.frame(X=X, Y=Y)
plot_ly(data, x = ~X, y = ~Y,
        type = 'scatter', mode = 'lines')
```

For example, say we only want the sharp peaks (the first and last), not the wide peak in the middle.

```{r}
find_peaks(data, 'X', 'Y')
```

The default function call does not return only those peaks.
Instead, we use the parameter `minSlope`, which controls a filter that tests the slope around each peak relative
to the dimensions of the data. This argument is 0 by default. Increasing `minSlope` filters out
smooth peaks in favor of sharp ones.

```{r}
find_peaks(data, 'X', 'Y', minSlope = 1)
```
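To build some intuition for a slope-based filter, the sketch below measures how steep each peak of the same curve is. This is not the `PnT` implementation: the sharpness measure, the window size `half_width`, and the scaling by the aspect ratio of the data are all assumptions made for illustration.

```r
# Same synthetic curve as above: one wide peak (X = 5) and two sharp ones (X = 1, 8).
x <- seq(0, 10, 0.01)
y <- 7 / (5 + (x - 5)^2) + 0.05 / (0.02 + (x - 8)^2) + 0.03 / (0.03 + (x - 1)^2)

# Interior local maxima: the finite difference changes sign from + to -.
d <- diff(y)
peak_idx <- which(d[-length(d)] > 0 & d[-1] < 0) + 1

# Hypothetical sharpness measure: the steepest finite-difference slope within
# half_width points of the peak, scaled by the aspect ratio of the data.
half_width <- 50
aspect <- diff(range(y)) / diff(range(x))
steepness <- sapply(peak_idx, function(i) {
  idx <- max(1, i - half_width):min(length(y), i + half_width)
  max(abs(diff(y[idx]) / diff(x[idx]))) / aspect
})

data.frame(X = x[peak_idx], Y = y[peak_idx], steepness = steepness)
```

Under this measure the wide middle peak scores far lower than the two sharp peaks, which is the kind of separation a slope threshold can exploit.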

If, instead, the outer peaks are just edge effects and we only want the middle peak, we can use `edgeFilter`.
In this case, setting `edgeFilter` to 0.25 filters out the first and last quarter (0.25) of the data.

```{r}
find_peaks(data, 'X', 'Y', edgeFilter = 0.25)
```
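The effect of discarding the first and last quarter of the data can be sketched in base R. This is only an illustration: it assumes the fraction is applied to the range of X values, which may differ from how `edgeFilter` is implemented (for example, it could count rows instead). The names `edge_fraction` and `keep` are hypothetical.

```r
x <- seq(0, 10, 0.01)
edge_fraction <- 0.25  # plays the role of edgeFilter = 0.25

# Keep only the points that lie in the central part of the X range.
x_span <- diff(range(x))
keep <- x >= min(x) + edge_fraction * x_span &
  x <= max(x) - edge_fraction * x_span

range(x[keep])  # roughly 2.5 to 7.5, so X = 1 and X = 8 are excluded
```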

# Examples

```{r, include = FALSE}
school_data <- readRDS("../inst/extdata/school_data.rds")
jadeite_control <- readRDS("../inst/extdata/jadeite_control.rds")
```

```{r logy, message = FALSE}
logy <- function(figure){
  layout(figure, yaxis=list(type="log"))
}
```

Here, we look at some real-world examples where this package can be used.

## Air Quality Data Analysis

The first data set to analyze is air quality data from a school.

```{r school}
fig <- plot_ly(school_data, x = ~Energy, y = ~Counts,
               type = "scatter", mode = "lines")
fig
```

To find the prominent peaks, we can use the basic function call:

```{r find peaks}
find_peaks(school_data, 'Energy', 'Counts')
```

However, if we plot the data using a log plot, we see that many of the other peaks are not noise.

```{r log plot}
logy(fig)
```

The clearest example is the peak at x = 6.4 keV. In the linear plot, this peak was barely visible, but on the log scale, it is clearly not just noise. By hovering over the peak with your mouse, you can see that it rises about 60 above the surrounding graph.
Using the argument `asFraction` in conjunction with `minYerror`, you can make sure this is the smallest peak found.

```{r asFraction}
find_peaks(school_data, 'Energy', 'Counts', asFraction = FALSE, minYerror = 60)
```

In this example, setting `asFraction = FALSE` allows you to input the height of 60 you measured for the peak directly into the parameter `minYerror` without having to divide it by the total height of the data.
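The division that `asFraction = FALSE` saves you can be sketched as follows. The counts are made-up numbers, not the school data, and the sketch assumes that "total height" means the difference between the largest and smallest y value; the package may define it differently.

```r
# Hypothetical counts, for illustration only (not the school data).
counts <- c(12, 15, 300, 20, 18, 75, 16, 14)

peak_height <- 60                     # absolute height read off the plot
total_height <- diff(range(counts))   # assumed "total height": max - min = 288
fraction <- peak_height / total_height

fraction  # about 0.208
```

Under that assumption, a fractional threshold of about 0.208 and an absolute threshold of 60 would describe the same cut-off for this toy vector.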

## Jadeite Control Data

```{r jadeite control}
fig <- plot_ly(jadeite_control, x = ~Ch, y = ~CPS,
               type = "scatter", mode = "lines")
fig
```

Again, we need to use a log plot to see the smaller peaks clearly.

```{r}
logy(fig)
```

We can see that the small peak at x = 1000 has a height of about 400.
So, we can use the same approach as in the previous example to find the peaks of the graph.

```{r}
find_peaks(jadeite_control, 'Ch', 'CPS', asFraction = FALSE, minYerror = 400)
```

However, what if the peaks at 285, 323, and 366 are actually noise?
Say we want to filter out these peaks, along with other small peaks in the region where most of the peaks are larger.
In this case, we have to split the peak finding into two separate ranges.
First, we find the peaks in the range where the peaks are smaller:

```{r ROI}
smallerPeaks <- find_peaks(jadeite_control, 'Ch', 'CPS',
                           asFraction = FALSE, minYerror = 400,
                           ROI = c(500, 1050))
smallerPeaks
```

Now, we can find the peaks in the other range:

```{r}
largerPeaks <- find_peaks(jadeite_control, 'Ch', 'CPS',
                          asFraction = FALSE, minYerror = 10000,
                          ROI = c(0, 500))
largerPeaks
```

Here, the `ROI` argument restricts the peak finder to the given range of x values.
If we want to combine these peaks into one data frame, we can use `dplyr::bind_rows`.

```{r bind rows}
dplyr::bind_rows(largerPeaks, smallerPeaks)
```
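After combining, it can help to record which region each peak came from and to sort the result by position. The sketch below uses base R and two made-up data frames; the column names `Ch` and `CPS` and all values are assumptions, since the actual columns returned by `find_peaks` may differ. Because the two regions share the boundary value 500, the sketch also drops any position that appears twice, in case a peak at the boundary were reported by both calls.

```r
# Hypothetical stand-ins for largerPeaks and smallerPeaks (values are invented).
larger_demo <- data.frame(Ch = c(120, 240), CPS = c(52000, 31000))
smaller_demo <- data.frame(Ch = c(610, 1000), CPS = c(900, 400))

# Label each result with the region it was found in.
larger_demo$region <- "0-500"
smaller_demo$region <- "500-1050"

# Stack the rows, drop duplicated positions, and sort by channel.
combined <- rbind(larger_demo, smaller_demo)
combined <- combined[!duplicated(combined$Ch), ]
combined <- combined[order(combined$Ch), ]
combined
```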