Type: | Package |
Title: | Statistical Hypothesis Testing Using the Delta Method |
Version: | 0.1.0 |
Description: | Statistical hypothesis testing using the Delta method as proposed by Deng et al. (2018) <doi:10.1145/3219819.3219919>. This method replaces the standard variance estimation formula in the Z-test with an approximate formula derived via the Delta method, which can account for within-user correlation. |
License: | MIT + file LICENSE |
URL: | https://github.com/hoxo-m/deltatest |
BugReports: | https://github.com/hoxo-m/deltatest/issues |
Depends: | R (≥ 4.1.0) |
Imports: | glue, R6, rlang, stats |
Suggests: | dplyr, ggplot2, knitr, rmarkdown, testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
Language: | en-US |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-02-28 08:33:55 UTC; akagi |
Author: | Koji Makiyama [aut, cre, cph], Shinichi Takayanagi [med] |
Maintainer: | Koji Makiyama <hoxo.smile@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-03-03 12:10:15 UTC |
The Delta Method for Ratio
Description
Applies the Delta method to the ratio of two random variables, f(X,Y) = X/Y, to estimate the expected value, variance, standard error, and confidence interval.
Methods
Public methods
Method new()
Initialize a new DeltaMethodForRatio object.
Usage
DeltaMethodForRatio$new(numerator, denominator, bias_correction = FALSE)
Arguments
numerator, denominator
numeric vectors sampled from the distributions of the random variables in the numerator and denominator of the ratio.
bias_correction
logical value indicating whether correction to the mean of the metric is performed using the second-order term of the Taylor expansion. The default is FALSE.
Method get_expected_value()
Get the expected value.
Usage
DeltaMethodForRatio$get_expected_value()
Returns
numeric estimate of the expected value of the ratio.
Method get_variance()
Get the variance.
Usage
DeltaMethodForRatio$get_variance()
Returns
numeric estimate of the variance of the ratio.
Method get_squared_standard_error()
Get the squared standard error.
Usage
DeltaMethodForRatio$get_squared_standard_error()
Returns
numeric estimate of the squared standard error of the ratio.
Method get_standard_error()
Get the standard error.
Usage
DeltaMethodForRatio$get_standard_error()
Returns
numeric estimate of the standard error of the ratio.
Method get_confidence_interval()
Get the confidence interval.
Usage
DeltaMethodForRatio$get_confidence_interval( alternative = c("two.sided", "less", "greater"), conf_level = 0.95 )
Arguments
alternative
character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater", or "less". You can specify just the initial letter.
conf_level
numeric value specifying the confidence level of the interval. The default is 0.95.
Returns
numeric estimates of the lower and upper bounds of the confidence interval of the ratio.
Method get_info()
Get statistical information.
Usage
DeltaMethodForRatio$get_info( alternative = c("two.sided", "less", "greater"), conf_level = 0.95 )
Arguments
alternative
character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater", or "less". You can specify just the initial letter.
conf_level
numeric value specifying the confidence level of the interval. The default is 0.95.
Returns
numeric estimates, including the expected value, variance, standard error, and confidence interval.
Method compute_expected_value()
Class method to compute the expected value of the ratio using the Delta method.
Usage
DeltaMethodForRatio$compute_expected_value( mean1, mean2, var2, cov = 0, bias_correction = FALSE )
Arguments
mean1
numeric value of the mean numerator of the ratio.
mean2
numeric value of the mean denominator of the ratio.
var2
numeric value of the variance of the denominator of the ratio.
cov
numeric value of the covariance between the numerator and denominator of the ratio. The default is 0.
bias_correction
logical value indicating whether correction to the mean of the metric is performed using the second-order term of the Taylor expansion. The default is FALSE.
Returns
numeric estimate of the expected value of the ratio.
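As a rough illustration of what such a computation can look like, here is a minimal base R sketch of the first-order estimate with an optional second-order correction. The function name `ratio_expected_value` is hypothetical and this is not the package's source. Whether `var2` and `cov` should describe the raw observations or the sample means depends on the quantity being estimated, and the sketch leaves that choice to the caller.

```r
# Hypothetical helper (not the package's source):
# Delta-method estimate of E[X / Y] from summary statistics.
ratio_expected_value <- function(mean1, mean2, var2, cov = 0,
                                 bias_correction = FALSE) {
  # First-order term: ratio of the means.
  estimate <- mean1 / mean2
  if (bias_correction) {
    # Second-order Taylor term: mu_X * var_Y / mu_Y^3 - cov_XY / mu_Y^2.
    estimate <- estimate + mean1 * var2 / mean2^3 - cov / mean2^2
  }
  estimate
}

uncorrected <- ratio_expected_value(mean1 = 2, mean2 = 4, var2 = 1)
corrected <- ratio_expected_value(mean1 = 2, mean2 = 4, var2 = 1,
                                  cov = 0.2, bias_correction = TRUE)
```

With a positive denominator variance and a small covariance, the corrected value is slightly larger than the plain ratio of means.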
Method compute_variance()
Class method to compute the variance of the ratio using the Delta method.
Usage
DeltaMethodForRatio$compute_variance(mean1, mean2, var1, var2, cov = 0)
Arguments
mean1
numeric value of the mean numerator of the ratio.
mean2
numeric value of the mean denominator of the ratio.
var1
numeric value of the variance of the numerator of the ratio.
var2
numeric value of the variance of the denominator of the ratio.
cov
numeric value of the covariance between the numerator and denominator of the ratio. The default is 0.
Returns
numeric estimate of the variance of the ratio.
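The first-order Delta-method variance of a ratio can be sketched in base R as below. The function name `ratio_variance` is hypothetical, and the formula is the textbook approximation rather than a copy of the package's implementation.

```r
# Hypothetical helper (not the package's source):
# first-order Delta-method variance of X / Y from summary statistics.
ratio_variance <- function(mean1, mean2, var1, var2, cov = 0) {
  var1 / mean2^2 -                 # contribution of the numerator variance
    2 * mean1 * cov / mean2^3 +    # covariance term (reduces variance if cov > 0)
    mean1^2 * var2 / mean2^4       # contribution of the denominator variance
}

v <- ratio_variance(mean1 = 2, mean2 = 4, var1 = 1, var2 = 1, cov = 0.2)
```

Setting `cov = 0` treats the numerator and denominator as uncorrelated, which typically overstates the variance when the two are positively correlated.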
Method compute_confidence_interval()
Class method to compute the confidence interval of the ratio using the Delta method.
Usage
DeltaMethodForRatio$compute_confidence_interval( mean, standard_error, alternative = c("two.sided", "less", "greater"), conf_level = 0.95 )
Arguments
mean
numeric value of the estimated mean of the ratio.
standard_error
numeric value of the estimated standard error of the mean of the ratio.
alternative
character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater", or "less". You can specify just the initial letter.
conf_level
numeric value specifying the confidence level of the interval. The default is 0.95.
Returns
numeric estimates of the lower and upper bounds of the confidence interval of the ratio.
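A normal-approximation interval of this kind can be sketched in base R as follows. The function name `ratio_confidence_interval` is hypothetical. The sketch assumes the convention used by `stats::t.test`, where "less" yields an interval unbounded below and "greater" one unbounded above; the package may differ in detail.

```r
# Hypothetical helper (not the package's source):
# normal-approximation confidence interval around an estimated mean.
ratio_confidence_interval <- function(mean, standard_error,
                                      alternative = c("two.sided", "less", "greater"),
                                      conf_level = 0.95) {
  # match.arg() also accepts an unambiguous prefix such as "t", "l", or "g".
  alternative <- match.arg(alternative)
  switch(alternative,
    two.sided = mean + c(-1, 1) * stats::qnorm((1 + conf_level) / 2) * standard_error,
    less      = c(-Inf, mean + stats::qnorm(conf_level) * standard_error),
    greater   = c(mean - stats::qnorm(conf_level) * standard_error, Inf)
  )
}

ci_two_sided <- ratio_confidence_interval(0.5, 0.1)
ci_less <- ratio_confidence_interval(0.5, 0.1, alternative = "l")
```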
Method clone()
The objects of this class are cloneable with this method.
Usage
DeltaMethodForRatio$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
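A short usage sketch of the methods documented above, assuming the class is exported by the package and using only the signatures listed in this section. The simulated `pageview` and `click` vectors are hypothetical per-user data invented for illustration.

```r
library(deltatest)

set.seed(314)
# Hypothetical per-user data: page views and clicks for 1000 users.
pageview <- rpois(1000, lambda = 5) + 1
click <- rbinom(1000, size = pageview, prob = 0.1)

# Build the object from the numerator and denominator samples.
dm <- DeltaMethodForRatio$new(numerator = click, denominator = pageview)

dm$get_expected_value()
dm$get_standard_error()
dm$get_confidence_interval(alternative = "two.sided", conf_level = 0.95)
dm$get_info()
```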
References
id:sz_dr (2018). Calculating the mean and variance of the ratio of random variables using the Delta method [in Japanese]. If you are human, think more now. https://www.szdrblog.info/entry/2018/11/18/154952
Two Sample Z-Test for Ratio Metrics Using the Delta Method
Description
Performs a two-sample Z-test to compare a ratio metric between two groups using the Delta method. The Delta method is used to estimate the variance while accounting for the correlation between the numerator and denominator of the ratio metric.
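The idea behind the test can be sketched in base R without the package. The helpers `simulate_group` and `ratio_stats` and the simulated data are hypothetical, and the sketch covers only the two-sided difference case under the standard first-order variance formula; it is not the package's implementation.

```r
set.seed(1)

# Hypothetical per-user aggregates: page views and clicks for one group.
simulate_group <- function(n_user, ctr) {
  pageview <- stats::rpois(n_user, lambda = 5) + 1
  click <- stats::rbinom(n_user, size = pageview, prob = ctr)
  data.frame(click = click, pageview = pageview)
}
control <- simulate_group(1000, 0.10)
treatment <- simulate_group(1000, 0.12)

# Ratio of means and its Delta-method squared standard error.
ratio_stats <- function(x, y) {
  mx <- mean(x)
  my <- mean(y)
  variance <- stats::var(x) / my^2 -
    2 * mx * stats::cov(x, y) / my^3 +
    mx^2 * stats::var(y) / my^4
  list(estimate = mx / my, squared_se = variance / length(x))
}
ctrl <- ratio_stats(control$click, control$pageview)
trt <- ratio_stats(treatment$click, treatment$pageview)

# Z-statistic for the difference between independent groups.
difference <- trt$estimate - ctrl$estimate
z <- difference / sqrt(trt$squared_se + ctrl$squared_se)
p_value <- 2 * stats::pnorm(-abs(z))
```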
Usage
deltatest(
data,
formula,
by,
group_names = "auto",
type = c("difference", "relative_change"),
bias_correction = FALSE,
alternative = c("two.sided", "less", "greater"),
conf.level = 0.95,
na.rm = FALSE,
quiet = FALSE
)
Arguments
data |
data.frame containing the numerator and denominator columns of the ratio metric, aggregated by randomization unit. It also includes a column indicating the assigned group (control or treatment). For example, if randomizing by user while the metric is click-through rate (CTR) per page-view, the numerator is the number of clicks per user, and the denominator is the number of page views per user. |
formula |
expression representing the ratio metric. It can be written in
three styles: standard formula |
by |
character string or symbol that indicates the group column. If the
group column is specified in the |
group_names |
character vector of length 2 or |
type |
character string specifying the test type. If |
bias_correction |
logical value indicating whether correction to the mean of the metric is performed using the second-order term of the Taylor expansion. The default is FALSE. |
alternative |
character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater", or "less". You can specify just the initial letter. |
conf.level |
numeric value specifying the confidence level of the interval. The default is 0.95. |
na.rm |
logical value. If |
quiet |
logical value indicating whether messages should be displayed during the execution of the function. The default is FALSE. |
Value
A list with class "htest" containing the following components:
statistic |
the value of the Z-statistic. |
p.value |
the p-value for the test. |
conf.int |
a confidence interval for the difference or relative change appropriate to the specified alternative hypothesis. |
estimate |
the estimated means of the two groups, and the difference or relative change. |
null.value |
the hypothesized value of the difference or relative change in means under the null hypothesis. |
stderr |
the standard error of the difference or relative change. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string describing the method used. |
data.name |
the name of the data. |
References
Deng, A., Knoblich, U., & Lu, J. (2018). Applying the Delta Method in Metric Analytics: A Practical Guide with Novel Ideas. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. doi:10.1145/3219819.3219919
Examples
library(dplyr)
library(deltatest)
n_user <- 2000
set.seed(314)
df <- deltatest::generate_dummy_data(n_user) |>
  group_by(user_id, group) |>
  summarise(click = sum(metric), pageview = n(), .groups = "drop")
deltatest(df, click / pageview, by = group)
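Because the return value is a list of class "htest", the components listed under Value can presumably be read with `$`. The sketch below repeats the data preparation so that it runs on its own, and it uses only the arguments and component names documented in this section.

```r
library(dplyr)
library(deltatest)

set.seed(314)
# Aggregate page-view rows to one row per user, as in the example above.
df <- generate_dummy_data(2000) |>
  group_by(user_id, group) |>
  summarise(click = sum(metric), pageview = n(), .groups = "drop")

result <- deltatest(df, click / pageview, by = group, quiet = TRUE)

result$statistic  # Z-statistic
result$p.value    # p-value of the test
result$conf.int   # confidence interval for the difference
result$estimate   # group means and their difference
```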
Generate Dummy Data
Description
Generate random dummy data for simulation studies. For details, see Section 4.3 in Deng et al. (2017).
Usage
generate_dummy_data(
n_user,
model = c("Bernoulli", "normal"),
xi = 0,
sigma = 0,
random_unit = c("user", "session", "pageview"),
treatment_ratio = 0.5
)
Arguments
n_user |
integer value specifying the number of users included in the generated data. Since multiple rows are generated for each user, the number of rows in the data exceeds the number of users. |
model |
character string specifying the model that generates the potential outcomes. It must be one of "Bernoulli" or "normal". |
xi |
numeric value specifying the treatment effect variation (TEV) under
the Bernoulli model, where |
sigma |
numeric value specifying the treatment effect variation (TEV)
under the normal model, where |
random_unit |
character string specifying the randomization unit. It must be one of "user", "session", or "pageview". |
treatment_ratio |
numeric value specifying the ratio assigned to treatment. The default value is 0.5. |
Value
data.frame with the columns user_id, group, and metric, where each row represents a metric value for a page-view.
References
Deng, A., Lu, J., & Litz, J. (2017). Trustworthy Analysis of Online A/B Tests: Pitfalls, challenges and solutions. Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. doi:10.1145/3018661.3018677
Examples
library(deltatest)
set.seed(314)
generate_dummy_data(n_user = 2000)
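Since each row represents one page view, the data usually need to be aggregated to the randomization unit before testing. A base R sketch of that step follows, using a small hand-made data frame with the documented column names. The values in it, including the group labels, are hypothetical.

```r
# Hypothetical page-view-level rows with the documented columns.
page_views <- data.frame(
  user_id = c(1, 1, 1, 2, 2, 3),
  group   = c("control", "control", "control",
              "treatment", "treatment", "control"),
  metric  = c(1, 0, 0, 1, 1, 0)
)

# Sum of the metric per user gives clicks; row count per user gives page views.
clicks <- aggregate(metric ~ user_id + group, data = page_views, FUN = sum)
views <- aggregate(metric ~ user_id + group, data = page_views, FUN = length)

# Both aggregates use the same grouping, so their rows line up.
per_user <- data.frame(
  user_id  = clicks$user_id,
  group    = clicks$group,
  click    = clicks$metric,
  pageview = views$metric
)
```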