
The deltapif R package calculates Potential
Impact Fractions (PIF) and Population Attributable Fractions (PAF) from
aggregated data. It uses the delta method to derive confidence
intervals, providing a robust approach for quantifying the burden of
disease attributable to risk factors and the potential impact of
interventions.
You can install the development version of deltapif from GitHub with:
remotes::install_github("RodrigoZepeda/deltapif")
The package provides two core functions:
paf(): Calculates the Population Attributable Fraction.
pif(): Calculates the Potential Impact Fraction.
Both functions require:
p: The exposure prevalence in the population.
beta: The log-relative risk coefficient(s).
var_p, var_beta: Variances for the prevalence and log-relative risk estimates.
The pif() function additionally requires:
p_cft: The counterfactual exposure prevalence under an intervention scenario.
Note: A key assumption of the delta method implementation is that the relative risk and exposure prevalence estimates are independent (i.e., derived from different studies or populations).
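To make the roles of p, p_cft and beta concrete, the quantities can be written out for the simplest case. This is a sketch under stated assumptions, not a description of the package internals: a single binary exposure, a relative risk obtained as \(\text{RR} = \exp(\beta)\), and the symbols \(\sigma_p^2\) and \(\sigma_\beta^2\) standing in for var_p and var_beta.
\[ \textrm{PAF} = \dfrac{p \, (\text{RR} - 1)}{1 + p \, (\text{RR} - 1)}, \qquad \textrm{PIF} = \dfrac{(p - p_{\text{cft}}) \, (\text{RR} - 1)}{1 + p \, (\text{RR} - 1)} \]
Because the prevalence and relative risk estimates are assumed independent, a first-order delta method approximation has no covariance term:
\[ \textrm{Var}(\textrm{PIF}) \approx \left( \dfrac{\partial \, \textrm{PIF}}{\partial p} \right)^2 \sigma_p^2 + \left( \dfrac{\partial \, \textrm{PIF}}{\partial \beta} \right)^2 \sigma_\beta^2 \]
Setting \(p_{\text{cft}} = 0\) in the PIF recovers the PAF, so the same approximation applies to both.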
Lee et al. (2022) estimated the fraction of dementia cases attributable to smoking in the US. They reported:
A relative risk of 1.59 (95% CI: 1.15, 2.20)
A smoking prevalence of 8.5%
The point estimate of the PAF can be calculated using Levin’s formula:
library(deltapif)
paf(p = 0.085, beta = log(1.59), quiet = TRUE)
#>
#> ── Population Attributable Fraction: [deltapif-121642715990569] ──
#>
#> PAF = 4.776% [95% CI: 4.776% to 4.776%]
#> standard_deviation(paf %) = 0.000
To calculate confidence intervals, we need the variance of the log-relative risk. The variance can be derived from the confidence interval following the Cochrane Handbook:
var_log_rr <- ((log(2.20) - log(1.15)) / (2 * 1.96))^2
var_log_rr
#> [1] 0.0273848
We then provide the log-relative risk (log(1.59)) and its variance to paf(). With rr_link set to exp, the coefficient is converted to a relative risk by exponentiation. Since the prevalence variance was not reported, we assume var_p = 0.
paf_dementia <- paf(
  p = 0.085,
  beta = log(1.59),
  var_beta = var_log_rr,
  var_p = 0
)
paf_dementia
#>
#> ── Population Attributable Fraction: [deltapif-0305065688938981] ──
#>
#> PAF = 4.776% [95% CI: 0.717% to 8.669%]
#> standard_deviation(paf %) = 2.028
The results are close to those reported by Lee et al.: PAF = 4.9% (95% CI: 1.3–9.3).
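The interval printed above can be reproduced by hand with base R, which may help show what the delta method is doing here. The sketch below assumes the interval is built on the log(1 - PAF) scale and then transformed back; that assumption is inferred from the asymmetric interval in the output and is not taken from the package source. All variable names are illustrative.

```r
# Inputs from the dementia example
p        <- 0.085                                     # smoking prevalence
beta     <- log(1.59)                                 # log-relative risk
var_beta <- ((log(2.20) - log(1.15)) / (2 * 1.96))^2  # variance of beta

rr      <- exp(beta)
denom   <- 1 + p * (rr - 1)
paf_hat <- p * (rr - 1) / denom                       # Levin's formula

# log(1 - PAF) equals -log(denom); this is its derivative with respect to beta
grad_beta <- -p * rr / denom
se_log    <- sqrt(grad_beta^2 * var_beta)             # var_p = 0: no prevalence term

# Assumed: normal interval on the log(1 - PAF) scale, then back-transformed
z      <- qnorm(0.975)
log_ci <- log(1 - paf_hat) + c(1, -1) * z * se_log
paf_ci <- 1 - exp(log_ci)                             # lower bound, upper bound

# Standard deviation on the PAF scale, again by the delta method
paf_sd <- (1 - paf_hat) * se_log

round(100 * c(paf_hat, paf_ci, paf_sd), 3)
```

Under these assumptions the rounded values agree with the printed PAF, interval bounds and standard deviation.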
Lee et al. (2022) also considered a scenario reducing smoking prevalence by 15% (from 8.5% to 7.225%). The PIF for this intervention is:
lee_pif <- pif(
  p = 0.085,
  p_cft = 0.085 * (1 - 0.15), # 15% reduction
  beta = log(1.59),
  var_beta = var_log_rr,
  var_p = 0
)
lee_pif
#>
#> ── Potential Impact Fraction: [deltapif-0201324401270228] ──
#>
#> PIF = 0.716% [95% CI: 0.118% to 1.311%]
#> standard_deviation(pif %) = 0.304
This result is consistent with the reported estimate: PIF = 0.7% (95% CI: 0.2–1.4).
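As a cross-check, the PIF and its interval can also be computed directly in base R. This is a minimal sketch that assumes a binary exposure and an interval formed on the log(1 - PIF) scale; the variable names are illustrative and the package may organise the calculation differently.

```r
p        <- 0.085                                     # current prevalence
p_cft    <- 0.085 * (1 - 0.15)                        # counterfactual prevalence
rr       <- 1.59                                      # relative risk
var_beta <- ((log(2.20) - log(1.15)) / (2 * 1.96))^2  # variance of log(rr)

# Point estimate: share of cases averted by moving from p to p_cft
pif_hat <- (p - p_cft) * (rr - 1) / (1 + p * (rr - 1))

# Derivative of log(1 - PIF) with respect to beta = log(rr)
grad_beta <- p_cft * rr / (1 + p_cft * (rr - 1)) - p * rr / (1 + p * (rr - 1))
se_log    <- abs(grad_beta) * sqrt(var_beta)

# Assumed: normal interval on the log(1 - PIF) scale, then back-transformed
pif_ci <- 1 - exp(log(1 - pif_hat) + c(1, -1) * qnorm(0.975) * se_log)

round(100 * c(pif_hat, pif_ci), 3)
```

The rounded values agree with the printed PIF of 0.716% and its interval of 0.118% to 1.311%.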
Attributable and averted cases can be calculated with the attributable_cases() and averted_cases() functions. For example, Dhana et al. estimate the number of people with Alzheimer’s disease in New York, USA, at 426.5 (400.2, 452.7) thousand. The variance used in the examples below is computed as ((452.7 - 400.2) / 2*qnorm(0.975))^2 = 2647.005.
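Operator precedence matters when deriving a variance from a confidence interval. In R, `/` and `*` have equal precedence and are evaluated left to right, so the expression as printed divides the interval width by 2 and then multiplies by the normal quantile. The sketch below compares it with the half-width divided by the quantile, which mirrors the var_log_rr calculation earlier; the variable names are illustrative.

```r
lower <- 400.2
upper <- 452.7
z     <- qnorm(0.975)

# As printed: (width / 2) * z, then squared
var_as_printed <- ((upper - lower) / 2 * z)^2

# Half-width divided by z, then squared (same form as var_log_rr)
var_half_width <- ((upper - lower) / (2 * z))^2

c(as_printed = var_as_printed, half_width = var_half_width)
```

The first value is 2647.005 and the second is about 179.4. The examples that follow use 2647.005 as given, so their intervals reflect that value.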
The number of cases (in thousands) that would be averted if smoking were reduced by 15%, assuming the smoking prevalence in New York is identical to that of the rest of the US, is given by:
averted_cases(426.5, lee_pif, variance = 2647.005)
#>
#> ── Averted cases: [deltapif-0201324401270228] ──
#>
#> Averted cases = 3.055 [95% CI: 0.394 to 5.716]
#> standard_deviation(averted cases) = 135.779
Attributable cases can likewise be estimated using the previous PAF (paf_dementia):
attributable_cases(426.5, paf_dementia, variance = 2647.005)
#>
#> ── Attributable cases: [deltapif-0305065688938981] ──
#>
#> Attributable cases = 20.368 [95% CI: 2.626 to 38.109]
#> standard_deviation(attributable cases) = 905.195
Multiple fractions can be combined into totals and ensembles. For example, the fractions among men and women can be combined into an overall fraction by specifying the distribution of the subgroups in the population:
paf_men <- paf(p = 0.41, beta = 0.31, var_p = 0.001,
               var_beta = 0.14,
               label = "Men")
paf_women <- paf(p = 0.37, beta = 0.35, var_p = 0.001,
                 var_beta = 0.16,
                 label = "Women")
Assuming the distribution is 51% women and 49% men:
paf_total(paf_men, paf_women, weights = c(0.49, 0.51))
#>
#> ── Population Attributable Fraction: [deltapif-0997466165469617] ──
#>
#> PAF = 13.201% [95% CI: 10.473% to 15.845%]
#> standard_deviation(paf %) = 11.187
#> ────────────────────────────────── Components: ─────────────────────────────────
#> • 12.968% (sd %: 15.867) --- [Men]
#> • 13.424% (sd %: 15.773) --- [Women]
#> ────────────────────────────────────────────────────────────────────────────────
This is equivalent to calculating:
\[ \textrm{PAF}_{\text{All}} = 0.49 \cdot \text{PAF}_{\text{Men}} + 0.51 \cdot \text{PAF}_{\text{Women}} \]
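The weighted sum can be checked by hand. The sketch below recomputes the point estimates only, using a hypothetical helper, paf_levin(), that is not part of deltapif and assumes a binary exposure with relative risk exp(beta).

```r
# Hypothetical helper: Levin's formula for a binary exposure
paf_levin <- function(p, beta) {
  rr <- exp(beta)
  p * (rr - 1) / (1 + p * (rr - 1))
}

paf_men_hat   <- paf_levin(p = 0.41, beta = 0.31)
paf_women_hat <- paf_levin(p = 0.37, beta = 0.35)

# Population shares: 49% men, 51% women
paf_all_hat <- 0.49 * paf_men_hat + 0.51 * paf_women_hat

round(100 * c(Men = paf_men_hat, Women = paf_women_hat, All = paf_all_hat), 3)
```

The rounded values agree with the component fractions and the total printed by paf_total().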
Fractions from disjoint risks can be calculated as an ensemble. For example, the fraction attributable to lead exposure and the fraction attributable to asbestos exposure:
paf_lead <- paf(p = 0.41, beta = 0.31, var_p = 0.001,
                var_beta = 0.014,
                label = "Lead")
paf_absts <- paf(p = 0.61, beta = 0.15, var_p = 0.001,
                 var_beta = 0.001,
                 label = "Asbestus")
A fraction of environmental exposure considering both can be calculated by multiplying the complements of the weighted fractions and subtracting the product from one, assuming a commonality correction (say, weights of c(0.1, 0.2)):
paf_ensemble(paf_lead, paf_absts, weights = c(0.1, 0.2))
#>
#> ── Population Attributable Fraction: [deltapif-132925867320251] ──
#>
#> PAF = 3.070% [95% CI: 3.033% to 3.108%]
#> standard_deviation(paf %) = 0.625
#> ────────────────────────────────── Components: ─────────────────────────────────
#> • 12.968% (sd %: 5.085) --- [Lead]
#> • 8.985% (sd %: 1.904) --- [Asbestus]
#> ────────────────────────────────────────────────────────────────────────────────
where this quantity estimates:
\[ \textrm{PAF}_{\text{Ensemble}} = 1 - (1 - 0.1 \cdot \textrm{PAF}_{\text{Lead}}) \cdot (1 - 0.2 \cdot \textrm{PAF}_{\text{Asbestus}}) \]
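The ensemble point estimate follows directly from this product. The sketch below recomputes it in base R with a hypothetical helper, paf_levin(), that is not part of deltapif and assumes a binary exposure with relative risk exp(beta).

```r
# Hypothetical helper: Levin's formula for a binary exposure
paf_levin <- function(p, beta) {
  rr <- exp(beta)
  p * (rr - 1) / (1 + p * (rr - 1))
}

fractions <- c(Lead = paf_levin(0.41, 0.31), Asbestus = paf_levin(0.61, 0.15))
weights   <- c(0.1, 0.2)

# One minus the product of the complements of the weighted fractions
paf_ensemble_hat <- 1 - prod(1 - weights * fractions)

round(100 * paf_ensemble_hat, 3)
```

The rounded value agrees with the 3.070% printed by paf_ensemble().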
Adjusting for commonality is usually performed when different risks can be concurrent. In the previous example, exposure to lead and to asbestos can happen at the same time. Mukadam et al. propose individual weighted (adjusted) fractions based on commonality weights, which represent the proportion of the variance shared among risk factors. To calculate the adjusted fractions one needs to estimate:
\[ \textrm{PIF}_k^{\text{Adjusted}} = \dfrac{\text{PIF}_k}{\sum_j \text{PIF}_j} \cdot \textrm{PIF}^{\text{Overall}} \]
where
\[ \textrm{PIF}^{\text{Overall}} = 1 - \prod\limits_j (1 - w_j \text{PIF}_j) \]
with
\[ w_j = 1 - \text{commonality}_j \]
The adjusted fractions can be calculated with the weighted_adjusted_paf() function:
weighted_adjusted_paf(paf_lead, paf_absts, weights = c(0.2, 0.3))
#> $Lead
#>
#> ── Population Attributable Fraction: [Lead_adj] ──
#>
#> PAF = 3.083% [95% CI: 0.964% to 5.202%]
#> standard_deviation(paf %) = 1.081
#>
#> $Asbestus
#>
#> ── Population Attributable Fraction: [Asbestus_adj] ──
#>
#> PAF = 2.136% [95% CI: 1.294% to 2.978%]
#> standard_deviation(paf %) = 0.429
which returns a named list of the adjusted fractions.
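The adjusted point estimates can be reproduced from the formulas above. The sketch below is base R only and uses a hypothetical helper, paf_levin(), that is not part of deltapif; it assumes a binary exposure with relative risk exp(beta) and treats the weights c(0.2, 0.3) as the w values directly.

```r
# Hypothetical helper: Levin's formula for a binary exposure
paf_levin <- function(p, beta) {
  rr <- exp(beta)
  p * (rr - 1) / (1 + p * (rr - 1))
}

fractions <- c(Lead = paf_levin(0.41, 0.31), Asbestus = paf_levin(0.61, 0.15))
w         <- c(0.2, 0.3)

# Overall fraction from the weighted complements
overall <- 1 - prod(1 - w * fractions)

# Each risk receives a share of the overall fraction proportional to its own fraction
adjusted <- fractions / sum(fractions) * overall

round(100 * adjusted, 3)
```

The rounded values agree with the adjusted fractions printed above (3.083% for Lead and 2.136% for Asbestus), and they sum to the overall fraction by construction.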
For more detailed examples, see the Examples article where we reproduce some results from the literature.
There is additional information on the package’s website.
Contributions are welcome! Please file issues and pull requests on GitHub.