Maintainer: Steven E. Pav <shabbychef@gmail.com>
Version: 0.1.6
Date: 2017-03-17
License: LGPL-3
Title: PDQ Functions via Gram Charlier, Edgeworth, and Cornish Fisher Approximations
BugReports: https://github.com/shabbychef/PDQutils/issues
Description: A collection of tools for approximating the 'PDQ' functions (respectively, the cumulative distribution, density, and quantile) of probability distributions via classical expansions involving moments and cumulants.
Depends: R (≥ 3.0.2)
Imports: orthopolynom, moments
Suggests: ggplot2, reshape2, testthat, knitr
URL: https://github.com/shabbychef/PDQutils
VignetteBuilder: knitr
Collate: 'cornish_fisher.r' 'edgeworth.r' 'gram_charlier.r' 'moments.r' 'PDQutils.r'
RoxygenNote: 5.0.1
NeedsCompilation: no
Packaged: 2017-03-18 03:55:40 UTC; spav
Author: Steven E. Pav [aut, cre]
Repository: CRAN
Date/Publication: 2017-03-18 08:37:53 UTC

PDQ Functions via Gram Charlier, Edgeworth, and Cornish Fisher Approximations

Description

PDQ Functions via Gram-Charlier, Edgeworth, and Cornish Fisher Approximations

Gram Charlier and Edgeworth Expansions

Given the raw moments of a probability distribution, we can approximate the probability density function, or the cumulative distribution function, via a Gram-Charlier 'A' expansion on the standardized distribution.

Suppose f(x) is the probability density of some random variable, and let F(x) be the cumulative distribution function. Let He_j(x) be the jth probabilist's Hermite polynomial. These polynomials form an orthogonal basis, with respect to the function w(x) of the Hilbert space of functions which are square w-weighted integrable. The weighting function is w(x) = e^{-x^2/2} = \sqrt{2\pi}\phi(x). The orthogonality relationship is

\int_{-\infty}^{\infty} He_i(x) He_j(x) w(x) \mathrm{d}x = \sqrt{2\pi} j! \delta_{ij}.

Expanding the density f(x) in terms of these polynomials in the usual way (abusing orthogonality) one has

f(x) \approx \sum_{0\le j} \frac{He_j(x)}{j!} \phi(x) \int_{-\infty}^{\infty} f(z) He_j(z) \mathrm{d}z.

The cumulative distribution function is 'simply' the integral of this expansion. Abusing certain facts regarding the PDF and CDF of the normal distribution and the probabilist's Hermite polynomials, the CDF has the representation

F(x) = \Phi(x) - \sum_{1\le j} \frac{He_{j-1}(x)}{j!} \phi(x) \int_{-\infty}^{\infty} f(z) He_j(z) \mathrm{d}z.

These series contain coefficients defined by the probability distribution under consideration. They take the form

c_j = \frac{1}{j!}\int_{-\infty}^{\infty} f(z) He_j(z) \mathrm{d}z.

Using linearity of the integral, these coefficients are easily computed in terms of the coefficients of the Hermite polynomials and the raw, uncentered moments of the probability distribution under consideration. Note that it may be the case that the computation of these coefficients suffers from bad numerical cancellation for some distributions, and that an alternative formulation may be more numerically robust.

Generalized Gram Charlier Expansions

The Gram Charlier 'A' expansion is most appropriate for random variables which are vaguely like the normal distribution. For those which are like another distribution, the same general approach can be pursued. One needs only define a weighting function, w, which is the density of the 'parent' probability distribution, then find polynomials, p_n(x) which are orthogonal with respect to w over its support. One has

f(x) = \sum_{0\le j} p_j(x) w(x) \frac{1}{h_j} \int_{-\infty}^{\infty} f(z) p_j(z) \mathrm{d}z.

Here h_j is the normalizing constant:

h_j = \int w(z)p_j^2(z)\mathrm{d}z.

One must then use facts about the orthogonal polynomials to approximate the CDF.

Another approach to arrive at the same computation is described by Berberan-Santos.

Cornish Fisher Approximation

The Cornish Fisher approximation is the Legendre inversion of the Edgeworth expansion of a distribution, but ordered in a way that is convenient when used on the mean of a number of independent draws of a random variable.

Suppose x_1, x_2, \ldots, x_n are n independent draws from some probability distribution. Letting

X = \frac{1}{\sqrt{n}} \sum_{1 \le i \le n} x_i,

the Central Limit Theorem assures us that, assuming finite variance,

X \rightarrow \mathcal{N}(\sqrt{n}\mu, \sigma),

with convergence in n.

The Cornish Fisher approximation gives a more detailed picture of the quantiles of X, one that is arranged in decreasing powers of \sqrt{n}. The quantile function is the function q(p) such that P\left(X \le q(p)\right) = q(p). The Cornish Fisher expansion is

q(p) = \sqrt{n}\mu + \sigma \left(z + \sum_{3 \le j} c_j f_j(z)\right),

where z = \Phi^{-1}(p), and c_j involves standardized cumulants of the distribution of x_i of order j and higher. Moreover, the c_j feature decreasing powers of \sqrt{n}, giving some justification for truncation. When n=1, however, the ordering is somewhat arbitrary.

Legal Mumbo Jumbo

PDQutils is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

Note

This package is maintained as a hobby.

Author(s)

Steven E. Pav shabbychef@gmail.com

References

Lee, Y-S., and Lin, T-K. "Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 41, no. 1 (1992): 233-240. http://www.jstor.org/stable/2347649

Lee, Y-S., and Lin, T-K. "Correction to Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 42, no. 1 (1993): 268-269. http://www.jstor.org/stable/2347433

AS 269. http://lib.stat.cmu.edu/apstat/269

Jaschke, Stefan R. "The Cornish-Fisher-expansion in the context of Delta-Gamma-normal approximations." No. 2001, 54. Discussion Papers, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes, 2001. http://www.jaschke-net.de/papers/CoFi.pdf

S. Blinnikov and R. Moessner. "Expansions for nearly Gaussian distributions." Astronomy and Astrophysics Supplement 130 (1998): 193-205. http://arxiv.org/abs/astro-ph/9711239

M. N. Berberan-Santos. "Expressing a Probability Density Function in Terms of another PDF: A Generalized Gram-Charlier Expansion." Journal of Mathematical Chemistry 42, no 3 (2007): 585-594. http://web.ist.utl.pt/ist12219/data/115.pdf


Higher order Cornish Fisher approximation.

Description

Lee and Lin's Algorithm AS269 for higher order Cornish Fisher quantile approximation.

Usage

AS269(z,cumul,order.max=NULL,all.ords=FALSE)

Arguments

z

the quantiles of the normal distribution. an atomic vector.

cumul

the standardized cumulants of order 3, 4, ..., k. an atomic vector.

order.max

the maximum order approximation, must be greater than length(cumul)+2. We assume the cumulants have been adjusted to reflect that the random variable has unit variance ('standardized cumulants')

all.ords

a logical value. If TRUE, then results are returned as a matrix, with a column for each order of the approximation. Otherwise the results are a matrix with a single column of the highest order approximation.

Details

The Cornish Fisher approximation is the Legendre inversion of the Edgeworth expansion of a distribution, but ordered in a way that is convenient when used on the mean of a number of independent draws of a random variable.

Suppose x_1, x_2, \ldots, x_n are n independent draws from some probability distribution. Letting

X = \frac{1}{\sqrt{n}} \sum_{1 \le i \le n} x_i,

the Central Limit Theorem assures us that, assuming finite variance,

X \rightarrow \mathcal{N}(\sqrt{n}\mu, \sigma),

with convergence in n.

The Cornish Fisher approximation gives a more detailed picture of the quantiles of X, one that is arranged in decreasing powers of \sqrt{n}. The quantile function is the function q(p) such that P\left(X \le q(p)\right) = q(p). The Cornish Fisher expansion is

q(p) = \sqrt{n}\mu + \sigma \left(z + \sum_{3 \le j} c_j f_j(z)\right),

where z = \Phi^{-1}(p), and c_j involves standardized cumulants of the distribution of x_i of order up to j. Moreover, the c_j include decreasing powers of \sqrt{n}, giving some justification for truncation. When n=1, however, the ordering is somewhat arbitrary.

Value

A matrix, which is, depending on all.ords, either with one column per order of the approximation, or a single column giving the maximum order approximation. There is one row per value in z.

Invalid arguments will result in return value NaN with a warning.

Note

A warning will be thrown if any of the z are greater than 3.719017274 in absolute value; the traditional AS269 errors out in this case.

Author(s)

Steven E. Pav shabbychef@gmail.com

References

Lee, Y-S., and Lin, T-K. "Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 41, no. 1 (1992): 233-240. http://www.jstor.org/stable/2347649

Lee, Y-S., and Lin, T-K. "Correction to Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 42, no. 1 (1993): 268-269. http://www.jstor.org/stable/2347433

AS 269. http://lib.stat.cmu.edu/apstat/269

Jaschke, Stefan R. "The Cornish-Fisher-expansion in the context of Delta-Gamma-normal approximations." No. 2001, 54. Discussion Papers, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes, 2001. http://www.jaschke-net.de/papers/CoFi.pdf

See Also

qapx_cf

Examples

foo <- AS269(seq(-2,2,0.01),c(0,2,0,4))
# test with the normal distribution:
s.cumul <- c(0,0,0,0,0,0,0,0,0)
pv <- seq(0.001,0.999,0.001)
zv <- qnorm(pv)
apq <- AS269(zv,s.cumul,all.ords=FALSE)
err <- zv - apq

# test with the exponential distribution
rate <- 0.7
n <- 18
# these are 'raw' cumulants'
cumul <- (rate ^ -(1:n)) * factorial(0:(n-1))
# standardize and chop
s.cumul <- cumul[3:length(cumul)] / (cumul[2]^((3:length(cumul))/2))
pv <- seq(0.001,0.999,0.001)
zv <- qnorm(pv)
apq <- cumul[1] + sqrt(cumul[2]) * AS269(zv,s.cumul,all.ords=TRUE)
truq <- qexp(pv, rate=rate)
err <- truq - apq
colSums(abs(err))

# an example from Wikipedia page on CF, 
# \url{https://en.wikipedia.org/wiki/Cornish%E2%80%93Fisher_expansion}
s.cumul <- c(5,2)
apq <- 10 + sqrt(25) * AS269(qnorm(0.95),s.cumul,all.ords=TRUE)


News for package ‘PDQutils’:

Description

News for package ‘PDQutils’.

PDQutils Version 0.1.6 (2017-03-18)

PDQutils Version 0.1.5 (2016-09-18)

PDQutils Version 0.1.4 (2016-03-03)

PDQutils Version 0.1.3 (2016-01-04)

PDQutils Version 0.1.2 (2015-06-15)

PDQutils Version 0.1.1 (2015-02-26)

PDQutils Initial Version 0.1.0 (2015-02-14)


Convert raw cumulants to moments.

Description

Conversion of a vector of raw cumulatnts to moments.

Usage

cumulant2moment(kappa)

Arguments

kappa

a vector of the raw cumulants. The first element is the first cumulant, which is also the first moment.

Details

The 'raw' cumulants \kappa_i are connected to the 'raw' (uncentered) moments, \mu_i' via the equation

\mu_n' = \kappa_n + \sum_{m=1}^{n-1} {n-1 \choose m-1} \kappa_m \mu_{n-m}'

Value

a vector of the raw moments. The first element of the input shall be the same as the first element of the output.

Author(s)

Steven E. Pav shabbychef@gmail.com

See Also

moment2cumulant

Examples

# normal distribution, mean 0, variance 1
n.mom <- cumulant2moment(c(0,1,0,0,0,0))
# normal distribution, mean 1, variance 1
n.mom <- cumulant2moment(c(1,1,0,0,0,0))

Approximate density and distribution via Edgeworth expansion.

Description

Approximate the probability density or cumulative distribution function of a distribution via its raw cumulants.

Usage

dapx_edgeworth(x, raw.cumulants, support=c(-Inf,Inf), log=FALSE)

papx_edgeworth(q, raw.cumulants, support=c(-Inf,Inf), lower.tail=TRUE, log.p=FALSE)

Arguments

x

where to evaluate the approximate density.

raw.cumulants

an atomic array of the 1st through kth raw cumulants of the probability distribution. The first cumulant is the mean, the second is the variance. The third is not the typical unitless skew.

support

the support of the density function. It is assumed that the density is zero on the complement of this open interval.

log

logical; if TRUE, densities f are given as \mbox{log}(f).

q

where to evaluate the approximate distribution.

log.p

logical; if TRUE, probabilities p are given as \mbox{log}(p).

lower.tail

whether to compute the lower tail. If false, we approximate the survival function.

Details

Given the raw cumulants of a probability distribution, we can approximate the probability density function, or the cumulative distribution function, via an Edgeworth expansion on the standardized distribution. The derivation of the Edgeworth expansion is rather more complicated than that of the Gram Charlier approximation, involving the characteristic function and an expression of the higher order derivatives of the composition of functions; see Blinnikov and Moessner for more details. The Edgeworth expansion can be expressed succinctly as

\sigma f(\sigma x) = \phi(x) + \phi(x)\sum_{1 \le s}\sigma^s \sum_{\{k_m\}} He_{s+2r}(x) c_{k_m},

where the second sum is over some partitions, and the constant c involves cumulants up to order s+2. Unlike the Gram Charlier expansion, of which it is a rearrangement, the Edgeworth expansion is arranged in increasing powers of the standard deviation \sigma.

Value

The approximate density at x, or the approximate CDF at q.

Note

Monotonicity of the CDF is not guaranteed.

Author(s)

Steven E. Pav shabbychef@gmail.com

References

S. Blinnikov and R. Moessner. "Expansions for nearly Gaussian distributions." Astronomy and Astrophysics Supplement 130 (1998): 193-205. http://arxiv.org/abs/astro-ph/9711239

See Also

the Gram Charlier expansions, dapx_gca, papx_gca

Examples

# normal distribution, for which this is silly
xvals <- seq(-2,2,length.out=501)
d1 <- dapx_edgeworth(xvals, c(0,1,0,0,0,0))
d2 <- dnorm(xvals)
d1 - d2

qvals <- seq(-2,2,length.out=501)
p1 <- papx_edgeworth(qvals, c(0,1,0,0,0,0))
p2 <- pnorm(qvals)
p1 - p2

Approximate density and distribution via Gram-Charlier A expansion.

Description

Approximate the probability density or cumulative distribution function of a distribution via its raw moments.

Usage

dapx_gca(x, raw.moments, support=NULL, 
 basis=c('normal','gamma','beta','arcsine','wigner'), 
 basepar=NULL, log=FALSE)

papx_gca(q, raw.moments, support=NULL, 
 basis=c('normal','gamma','beta','arcsine','wigner'), 
 basepar=NULL, lower.tail=TRUE, log.p=FALSE)

Arguments

x

where to evaluate the approximate density.

raw.moments

an atomic array of the 1st through kth raw moments of the probability distribution.

support

the support of the density function. It is assumed that the density is zero on the complement of this open interval. This defaults to c(-Inf,Inf) for the normal basis, c(0,Inf) for the gamma basis, and c(0,1) for the Beta, and c(-1,1) for the arcsine and wigner.

basis

the basis under which to perform the approximation. 'normal' gives the classical 'A' series expansion around the PDF and CDF of the normal distribution via Hermite polynomials. 'gamma' expands around a gamma distribution with parameters basepar$shape and basepar$scale. 'beta' expands around a beta distribution with parameters basepar$shape1 and basepar$shape2.

basepar

the parameters for the base distribution approximation. If NULL, the shape and rate are inferred from the first two moments and/or from the support as appropriate.

log

logical; if TRUE, densities f are given as \mbox{log}(f).

q

where to evaluate the approximate distribution.

log.p

logical; if TRUE, probabilities p are given as \mbox{log}(p).

lower.tail

whether to compute the lower tail. If false, we approximate the survival function.

Details

Given the raw moments of a probability distribution, we can approximate the probability density function, or the cumulative distribution function, via a Gram-Charlier expansion on the standardized distribution. This expansion uses some weighting function, w, typically the density of some 'parent' probability distribution, and polynomials, p_n which are orthogonal with respect to that weighting:

\int_{-\infty}^{\infty} p_n(x) p_m(x) w(x) \mathrm{d}x = h_n \delta_{mn}.

Let f(x) be the probability density of some random variable, with cumulative distribution function F(x). We express

f(x) = \sum_{n \ge 0} c_n p_n(x) w(x)

The constants c_n can be computed from the known moments of the distribution.

For the Gram Charlier 'A' series, the weighting function is the PDF of the normal distribution, and the polynomials are the (probabilist's) Hermite polynomials. As a weighting function, one can also use the PDF of the gamma distribution (resulting in generalized Laguerre polynomials), or the PDF of the Beta distribution (resulting in Jacobi polynomials).

Value

The approximate density at x, or the approximate CDF at

Note

Monotonicity of the CDF is not guaranteed.

Author(s)

Steven E. Pav shabbychef@gmail.com

References

Jaschke, Stefan R. "The Cornish-Fisher-expansion in the context of Delta-Gamma-normal approximations." No. 2001, 54. Discussion Papers, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes, 2001. http://www.jaschke-net.de/papers/CoFi.pdf

S. Blinnikov and R. Moessner. "Expansions for nearly Gaussian distributions." Astronomy and Astrophysics Supplement 130 (1998): 193-205. http://arxiv.org/abs/astro-ph/9711239

See Also

qapx_cf

Examples

# normal distribution:
xvals <- seq(-2,2,length.out=501)
d1 <- dapx_gca(xvals, c(0,1,0,3,0), basis='normal')
d2 <- dnorm(xvals)
# they should match:
d1 - d2

qvals <- seq(-2,2,length.out=501)
p1 <- papx_gca(qvals, c(0,1,0,3,0))
p2 <- pnorm(qvals)
p1 - p2

xvals <- seq(-6,6,length.out=501)
mu <- 2
sigma <- 3
raw.moments <- c(2,13,62,475,3182)
d1 <- dapx_gca(xvals, raw.moments, basis='normal')
d2 <- dnorm(xvals,mean=mu,sd=sigma)
## Not run: 
plot(xvals,d1)
lines(xvals,d2,col='red')

## End(Not run)
p1 <- papx_gca(xvals, raw.moments, basis='normal')
p2 <- pnorm(xvals,mean=mu,sd=sigma)
## Not run: 
plot(xvals,p1)
lines(xvals,p2,col='red')

## End(Not run)

# for a one-sided distribution, like the chi-square
chidf <- 30
ords <- seq(1,9)
raw.moments <- exp(ords * log(2) + lgamma((chidf/2) + ords) - lgamma(chidf/2))
xvals <- seq(0.3,10,length.out=501)
d1g <- dapx_gca(xvals, raw.moments, support=c(0,Inf), basis='gamma')
d2 <- dchisq(xvals,df=chidf)
## Not run: 
plot(xvals,d1g)
lines(xvals,d2,col='red')

## End(Not run)

p1g <- papx_gca(xvals, raw.moments, support=c(0,Inf), basis='gamma')
p2 <- pchisq(xvals,df=chidf)
## Not run: 
plot(xvals,p1g)
lines(xvals,p2,col='red')

## End(Not run)

# for a one-sided distribution, like the log-normal
mu <- 2
sigma <- 1
ords <- seq(1,8)
raw.moments <- exp(ords * mu + 0.5 * (sigma*ords)^2)
xvals <- seq(0.5,10,length.out=501)
d1g <- dapx_gca(xvals, raw.moments, support=c(0,Inf), basis='gamma')
d2 <- dnorm(log(xvals),mean=mu,sd=sigma) / xvals
## Not run: 
	plot(xvals,d1g)
	lines(xvals,d2,col='red')

## End(Not run)

Convert moments to raw cumulants.

Description

Conversion of a vector of moments to raw cumulants.

Usage

moment2cumulant(moms)

Arguments

moms

a vector of the moments. The first element is the first moment (the mean). If centered moments are given, the first moment must be zero. If raw moments are given, the first moment must be the mean.

Details

The 'raw' cumulants \kappa_i are connected to the 'raw' (uncentered) moments, \mu_i' via the equation

\kappa_n = \mu_n' - \sum_{m=1}^{n-1} {n-1 \choose m-1} \kappa_m \mu_{n-m}'

Note that this formula also works for central moments, assuming the distribution has been normalized to zero mean.

Value

a vector of the cumulants. The first element of the input shall be the same as the first element of the output.

Note

The presence of a NA or infinite value in the input will propagate to the output.

Author(s)

Steven E. Pav shabbychef@gmail.com

See Also

cumulant2moment

Examples

# normal distribution, mean 0, variance 1
n.cum <- moment2cumulant(c(0,1,0,3,0,15))
# normal distribution, mean 1, variance 1
n.cum <- moment2cumulant(c(1,2,4,10,26))
# exponential distribution
lambda <- 0.7
n <- 1:6
e.cum <- moment2cumulant(factorial(n) / (lambda^n))

Approximate quantile via Cornish-Fisher expansion.

Description

Approximate the quantile function of a distribution via its cumulants.

Usage

qapx_cf(p, raw.cumulants, support=c(-Inf,Inf), lower.tail = TRUE, log.p = FALSE)

Arguments

p

where to evaluate the approximate distribution.

raw.cumulants

an atomic array of the 1st through kth raw cumulants. The first value is the mean of the distribution, the second should be the variance of the distribution, the remainder are raw cumulants.

support

the support of the density function. It is assumed that the density is zero on the complement of this open interval. This defaults to c(-Inf,Inf) for the normal basis, c(0,Inf) for the gamma basis, and c(0,1) for the Beta, and c(-1,1) for the arcsine and wigner.

lower.tail

whether to compute the lower tail. If false, we approximate the survival function.

log.p

logical; if TRUE, probabilities p are given as \mbox{log}(p).

Details

Given the cumulants of a probability distribution, we approximate the quantile function via a Cornish-Fisher expansion.

Value

The approximate quantile at p.

Note

Monotonicity of the quantile function is not guaranteed.

Author(s)

Steven E. Pav shabbychef@gmail.com

References

Lee, Y-S., and Lin, T-K. "Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 41, no. 1 (1992): 233-240. http://www.jstor.org/stable/2347649

Lee, Y-S., and Lin, T-K. "Correction to Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 42, no. 1 (1993): 268-269. http://www.jstor.org/stable/2347433

AS 269. http://lib.stat.cmu.edu/apstat/269

Jaschke, Stefan R. "The Cornish-Fisher-expansion in the context of Delta-Gamma-normal approximations." No. 2001, 54. Discussion Papers, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes, 2001. http://www.jaschke-net.de/papers/CoFi.pdf

See Also

dapx_gca, papx_gca, AS269, rapx_cf

Examples

# normal distribution:
pvals <- seq(0.001,0.999,length.out=501)
q1 <- qapx_cf(pvals, c(0,1,0,0,0,0,0))
q2 <- qnorm(pvals)
q1 - q2

Approximate random generation via Cornish-Fisher expansion.

Description

Approximate random generation via approximate quantile function.

Usage

rapx_cf(n, raw.cumulants, support=c(-Inf,Inf))

Arguments

n

number of observations. If 'length(n) > 1', the length is taken to be the number required.

raw.cumulants

an atomic array of the 1st through kth raw cumulants. The first value is the mean of the distribution, the second should be the variance of the distribution, the remainder are raw cumulants.

support

the support of the density function. It is assumed that the density is zero on the complement of this open interval. This defaults to c(-Inf,Inf) for the normal basis, c(0,Inf) for the gamma basis, and c(0,1) for the Beta, and c(-1,1) for the arcsine and wigner.

Details

Given the cumulants of a probability distribution, we approximate the quantile function via a Cornish-Fisher expansion.

Value

A vector of approximate draws.

Author(s)

Steven E. Pav shabbychef@gmail.com

See Also

qapx_cf

Examples

# normal distribution:
r1 <- rapx_cf(1000, c(0,1,0,0,0,0,0))