--- title: "Allowing the lower asymptote parameter to vary freely" author: "Thomas Matheis, Phineus Choi, Sam Butler, Mira Terdiman, Johanna Hardin" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Allowing the lower asymptote parameter to vary freely} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` ## Introduction There may be situations where we want to estimate the lower asymptote of $h_0$ freely in our model rather than assuming it always starts at zero, which is what **sicegar** assumes by default. For this purpose, the functions `fitAndCategorize()` and `figureModelCurves()` contain the argument `use_h0` (which has a default value set to `FALSE`). Setting the argument to `TRUE` results in the same process as usual, using functions ending in `_h0` instead of their default counterparts. For example, the functions `multipleFitFunction()`, `doublesigmoidalFitFormula()`, `parameterCalculation()`, and `normalizeData()` have `_h0` counterparts, `multipleFitFunction_h0()`, `doublesigmoidalFitFormula_h0()`, `parameterCalculation_h0()`, and `normalizeData_h0()`. ```{r install_packages, echo=FALSE, warning=FALSE, results='hide',message=FALSE} ###***************************** # INITIAL COMMANDS TO RESET THE SYSTEM seedNo=14159 set.seed(seedNo) ###***************************** ###***************************** require("sicegar") require("dplyr") require("ggplot2") require("cowplot") ###***************************** ``` We will demonstrate the differences between letting $h_0$ be estimated freely and assuming it is fixed at zero, first generating data where $h_0$ is not zero: ```{r generate_data} noise_parameter <- 1 reps <- 5 time <- rep(seq(3, 24, 3), reps) mean_values <- doublesigmoidalFitFormula_h0(time, finalAsymptoteIntensityRatio = .3, maximum = 10, slope1Param = 1, midPoint1Param = 7, slope2Param = 1, midPointDistanceParam = 8, h0 = 3) intensity <- rnorm(n = length(mean_values), mean = mean_values, sd = rep(noise_parameter, length(mean_values))) dataInput <- data.frame(time, intensity) ggplot(dataInput, aes(time, intensity)) + geom_point() + scale_y_continuous(limits = c(-1, 13), expand = expansion(mult = c(0, 0))) + theme_bw() ``` ## Fitting the models to the data `fitAndCategorize()` can be applied to the data, first with default arguments and second by setting the argument `use_h0` to `TRUE`: ```{r} fitObj_zero <- fitAndCategorize(dataInput, threshold_minimum_for_intensity_maximum = 0.3, threshold_intensity_range = 0.1, threshold_t0_max_int = 1E10, use_h0 = FALSE) # Default fitObj_free <- fitAndCategorize(dataInput, threshold_minimum_for_intensity_maximum = 0.3, threshold_intensity_range = 0.1, threshold_t0_max_int = 1E10, use_h0 = TRUE) ``` Using `figureModelCurves()`, we can visualize the differences between using the default arguments and letting $h_0$ be freely estimated. ```{r zero_free_plots, fig.height=4, fig.width=8} # Double-sigmoidal fit with parameter related lines fig_a <- figureModelCurves(dataInput = fitObj_zero$normalizedInput, doubleSigmoidalFitVector = fitObj_zero$doubleSigmoidalModel, showParameterRelatedLines = TRUE, use_h0 = FALSE) # Default fig_b <- figureModelCurves(dataInput = fitObj_free$normalizedInput, doubleSigmoidalFitVector = fitObj_free$doubleSigmoidalModel, showParameterRelatedLines = TRUE, use_h0 = TRUE) plot_grid(fig_a, fig_b, ncol = 2) # function from the cowplot package ``` It is clear that in this situation, using the default arguments result in a worse fit than when $h_0$ is allowed to be estimated freely. ## Model fitting components (h0 free) To fit and plot individual models using a freely estimated $h_0$, we must directly call the `_h0` counterparts of each **sicegar** function. We have already generated the data (with $h_0 = 2$), so now we can normalize the data. ```{r normalize_data} normalizedInput_free <- normalizeData(dataInput = dataInput, dataInputName = "doubleSigmoidalSample") head(normalizedInput_free$timeIntensityData) # the normalized time and intensity data ``` We can now call `multipleFitFunction_h0()` on our data to be fitted, calculating additional parameters using `parameterCalculation_h0()`: ```{r} # Fit the double-sigmoidal model doubleSigmoidalModel_free <- multipleFitFunction_h0(dataInput=normalizedInput_free, model="doublesigmoidal") doubleSigmoidalModel_free <- parameterCalculation_h0(doubleSigmoidalModel_free) ``` Now that we have obtained a fit, we can use `figureModelCurves()` to plot: ```{r plot_raw_fit, echo=TRUE, message=FALSE, warning=FALSE, comment=FALSE, fig.height=4, fig.width=6} # double-sigmoidal fit figureModelCurves(dataInput = normalizedInput_free, doubleSigmoidalFitVector = doubleSigmoidalModel_free, showParameterRelatedLines = TRUE, use_h0 = TRUE) ``` ## Model parameters Recall that the original model parameters (which generated the data) are given as `finalAsymptoteIntensityRatio = 0.3, maximum = 10, slope1Param = 1, midPoint1Param = 7, slope2Param = 1, midPointDistanceParam = 8, h0 = 2`. We can recover the parameter estimates from both of the `doubleSigmoidalModel` objects created above. `fitObj_zero` does not return a value for $h_0$ (because it is not part of the estimation process). When $h_0$ is allowed to vary freely, the full set of parameters are estimated to be much closer to the data generating parameters (as opposed to when $h_0 = 0$ is forced). ```{r} fitObj_zero$doubleSigmoidalModel |> dplyr::select(finalAsymptoteIntensityRatio_Estimate, maximum_Estimate, slope1Param_Estimate, midPoint1Param_Estimate, slope2Param_Estimate, midPointDistanceParam_Estimate) |> c() fitObj_free$doubleSigmoidalModel |> dplyr::select(finalAsymptoteIntensityRatio_Estimate, maximum_Estimate, slope1Param_Estimate, midPoint1Param_Estimate, slope2Param_Estimate, midPointDistanceParam_Estimate, h0_Estimate) |> c() ```