Maintainer: | Steven E. Pav <shabbychef@gmail.com> |
Version: | 0.1.6 |
Date: | 2017-03-17 |
License: | LGPL-3 |
Title: | PDQ Functions via Gram Charlier, Edgeworth, and Cornish Fisher Approximations |
BugReports: | https://github.com/shabbychef/PDQutils/issues |
Description: | A collection of tools for approximating the 'PDQ' functions (respectively, the cumulative distribution, density, and quantile) of probability distributions via classical expansions involving moments and cumulants. |
Depends: | R (≥ 3.0.2) |
Imports: | orthopolynom, moments |
Suggests: | ggplot2, reshape2, testthat, knitr |
URL: | https://github.com/shabbychef/PDQutils |
VignetteBuilder: | knitr |
Collate: | 'cornish_fisher.r' 'edgeworth.r' 'gram_charlier.r' 'moments.r' 'PDQutils.r' |
RoxygenNote: | 5.0.1 |
NeedsCompilation: | no |
Packaged: | 2017-03-18 03:55:40 UTC; spav |
Author: | Steven E. Pav [aut, cre] |
Repository: | CRAN |
Date/Publication: | 2017-03-18 08:37:53 UTC |
PDQ Functions via Gram Charlier, Edgeworth, and Cornish Fisher Approximations
Description
PDQ Functions via Gram-Charlier, Edgeworth, and Cornish Fisher Approximations
Gram Charlier and Edgeworth Expansions
Given the raw moments of a probability distribution, we can approximate the probability density function, or the cumulative distribution function, via a Gram-Charlier 'A' expansion on the standardized distribution.
Suppose f(x)
is the probability density of some random
variable, and let F(x)
be the cumulative distribution function.
Let He_j(x)
be the j
th probabilist's Hermite
polynomial. These polynomials form an orthogonal basis, with respect to the
function w(x)
of the Hilbert space of functions which are square
w
-weighted integrable. The weighting function is
w(x) = e^{-x^2/2} = \sqrt{2\pi}\phi(x)
.
The orthogonality relationship is
\int_{-\infty}^{\infty} He_i(x) He_j(x) w(x) \mathrm{d}x = \sqrt{2\pi} j! \delta_{ij}.
Expanding the density f(x)
in terms of these polynomials in the
usual way (abusing orthogonality) one has
f(x) \approx \sum_{0\le j} \frac{He_j(x)}{j!} \phi(x) \int_{-\infty}^{\infty} f(z) He_j(z) \mathrm{d}z.
The cumulative distribution function is 'simply' the integral of this expansion. Abusing certain facts regarding the PDF and CDF of the normal distribution and the probabilist's Hermite polynomials, the CDF has the representation
F(x) = \Phi(x) - \sum_{1\le j} \frac{He_{j-1}(x)}{j!} \phi(x) \int_{-\infty}^{\infty} f(z) He_j(z) \mathrm{d}z.
These series contain coefficients defined by the probability distribution under consideration. They take the form
c_j = \frac{1}{j!}\int_{-\infty}^{\infty} f(z) He_j(z) \mathrm{d}z.
Using linearity of the integral, these coefficients are easily computed in terms of the coefficients of the Hermite polynomials and the raw, uncentered moments of the probability distribution under consideration. Note that it may be the case that the computation of these coefficients suffers from bad numerical cancellation for some distributions, and that an alternative formulation may be more numerically robust.
Generalized Gram Charlier Expansions
The Gram Charlier 'A' expansion is most appropriate for random variables
which are vaguely like the normal distribution. For those which are like
another distribution, the same general approach can be pursued. One needs
only define a weighting function, w
, which is the density of the
'parent' probability distribution, then find polynomials,
p_n(x)
which are orthogonal with respect to w
over
its support. One has
f(x) = \sum_{0\le j} p_j(x) w(x) \frac{1}{h_j} \int_{-\infty}^{\infty} f(z) p_j(z) \mathrm{d}z.
Here h_j
is the normalizing constant:
h_j = \int w(z)p_j^2(z)\mathrm{d}z.
One must then use facts about the orthogonal polynomials to approximate the CDF.
Another approach to arrive at the same computation is described by Berberan-Santos.
Cornish Fisher Approximation
The Cornish Fisher approximation is the Legendre inversion of the Edgeworth expansion of a distribution, but ordered in a way that is convenient when used on the mean of a number of independent draws of a random variable.
Suppose x_1, x_2, \ldots, x_n
are n
independent
draws from some probability distribution.
Letting
X = \frac{1}{\sqrt{n}} \sum_{1 \le i \le n} x_i,
the Central Limit Theorem assures us that, assuming finite variance,
X \rightarrow \mathcal{N}(\sqrt{n}\mu, \sigma),
with convergence in n
.
The Cornish Fisher approximation gives a more detailed picture of the
quantiles of X
, one that is arranged in decreasing powers of
\sqrt{n}
. The quantile function is the function q(p)
such that P\left(X \le q(p)\right) = q(p)
. The
Cornish Fisher expansion is
q(p) = \sqrt{n}\mu + \sigma \left(z + \sum_{3 \le j} c_j f_j(z)\right),
where z = \Phi^{-1}(p)
, and c_j
involves
standardized cumulants of the distribution of x_i
of order
j
and higher. Moreover, the c_j
feature decreasing powers
of \sqrt{n}
, giving some justification for truncation.
When n=1
, however, the ordering is somewhat arbitrary.
Legal Mumbo Jumbo
PDQutils is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
Note
This package is maintained as a hobby.
Author(s)
Steven E. Pav shabbychef@gmail.com
References
Lee, Y-S., and Lin, T-K. "Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 41, no. 1 (1992): 233-240. http://www.jstor.org/stable/2347649
Lee, Y-S., and Lin, T-K. "Correction to Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 42, no. 1 (1993): 268-269. http://www.jstor.org/stable/2347433
AS 269. http://lib.stat.cmu.edu/apstat/269
Jaschke, Stefan R. "The Cornish-Fisher-expansion in the context of Delta-Gamma-normal approximations." No. 2001, 54. Discussion Papers, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes, 2001. http://www.jaschke-net.de/papers/CoFi.pdf
S. Blinnikov and R. Moessner. "Expansions for nearly Gaussian distributions." Astronomy and Astrophysics Supplement 130 (1998): 193-205. http://arxiv.org/abs/astro-ph/9711239
M. N. Berberan-Santos. "Expressing a Probability Density Function in Terms of another PDF: A Generalized Gram-Charlier Expansion." Journal of Mathematical Chemistry 42, no 3 (2007): 585-594. http://web.ist.utl.pt/ist12219/data/115.pdf
Higher order Cornish Fisher approximation.
Description
Lee and Lin's Algorithm AS269 for higher order Cornish Fisher quantile approximation.
Usage
AS269(z,cumul,order.max=NULL,all.ords=FALSE)
Arguments
z |
the quantiles of the normal distribution. an atomic vector. |
cumul |
the standardized cumulants of order 3, 4, ..., k. an atomic vector. |
order.max |
the maximum order approximation, must be greater than
|
all.ords |
a logical value. If |
Details
The Cornish Fisher approximation is the Legendre inversion of the Edgeworth expansion of a distribution, but ordered in a way that is convenient when used on the mean of a number of independent draws of a random variable.
Suppose x_1, x_2, \ldots, x_n
are n
independent
draws from some probability distribution.
Letting
X = \frac{1}{\sqrt{n}} \sum_{1 \le i \le n} x_i,
the Central Limit Theorem assures us that, assuming finite variance,
X \rightarrow \mathcal{N}(\sqrt{n}\mu, \sigma),
with convergence in n
.
The Cornish Fisher approximation gives a more detailed picture of the
quantiles of X
, one that is arranged in decreasing powers of
\sqrt{n}
. The quantile function is the function q(p)
such that P\left(X \le q(p)\right) = q(p)
. The
Cornish Fisher expansion is
q(p) = \sqrt{n}\mu + \sigma \left(z + \sum_{3 \le j} c_j f_j(z)\right),
where z = \Phi^{-1}(p)
, and c_j
involves
standardized cumulants of the distribution of x_i
of order
up to j
. Moreover, the c_j
include decreasing
powers of \sqrt{n}
, giving some justification for truncation.
When n=1
, however, the ordering is somewhat arbitrary.
Value
A matrix, which is, depending on all.ords
, either with one column per
order of the approximation, or a single column giving the maximum order
approximation. There is one row per value in z
.
Invalid arguments will result in return value NaN
with a warning.
Note
A warning will be thrown if any of the z are greater than 3.719017274 in absolute value; the traditional AS269 errors out in this case.
Author(s)
Steven E. Pav shabbychef@gmail.com
References
Lee, Y-S., and Lin, T-K. "Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 41, no. 1 (1992): 233-240. http://www.jstor.org/stable/2347649
Lee, Y-S., and Lin, T-K. "Correction to Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 42, no. 1 (1993): 268-269. http://www.jstor.org/stable/2347433
AS 269. http://lib.stat.cmu.edu/apstat/269
Jaschke, Stefan R. "The Cornish-Fisher-expansion in the context of Delta-Gamma-normal approximations." No. 2001, 54. Discussion Papers, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes, 2001. http://www.jaschke-net.de/papers/CoFi.pdf
See Also
Examples
foo <- AS269(seq(-2,2,0.01),c(0,2,0,4))
# test with the normal distribution:
s.cumul <- c(0,0,0,0,0,0,0,0,0)
pv <- seq(0.001,0.999,0.001)
zv <- qnorm(pv)
apq <- AS269(zv,s.cumul,all.ords=FALSE)
err <- zv - apq
# test with the exponential distribution
rate <- 0.7
n <- 18
# these are 'raw' cumulants'
cumul <- (rate ^ -(1:n)) * factorial(0:(n-1))
# standardize and chop
s.cumul <- cumul[3:length(cumul)] / (cumul[2]^((3:length(cumul))/2))
pv <- seq(0.001,0.999,0.001)
zv <- qnorm(pv)
apq <- cumul[1] + sqrt(cumul[2]) * AS269(zv,s.cumul,all.ords=TRUE)
truq <- qexp(pv, rate=rate)
err <- truq - apq
colSums(abs(err))
# an example from Wikipedia page on CF,
# \url{https://en.wikipedia.org/wiki/Cornish%E2%80%93Fisher_expansion}
s.cumul <- c(5,2)
apq <- 10 + sqrt(25) * AS269(qnorm(0.95),s.cumul,all.ords=TRUE)
News for package ‘PDQutils’:
Description
News for package ‘PDQutils’.
PDQutils Version 0.1.6 (2017-03-18)
Package maintenance–no new features.
move github figures to location CRAN understands.
PDQutils Version 0.1.5 (2016-09-18)
Package maintenance–no new features.
Remove errant files from test directory.
PDQutils Version 0.1.4 (2016-03-03)
Package maintenance–no new features.
Incompatibilities in vignette with ggplot2 release.
PDQutils Version 0.1.3 (2016-01-04)
Package maintenance–no new features.
PDQutils Version 0.1.2 (2015-06-15)
Generalized Gram Charlier expansions.
bugfixes.
PDQutils Version 0.1.1 (2015-02-26)
Edgeworth expansions.
PDQutils Initial Version 0.1.0 (2015-02-14)
first CRAN release.
Convert raw cumulants to moments.
Description
Conversion of a vector of raw cumulatnts to moments.
Usage
cumulant2moment(kappa)
Arguments
kappa |
a vector of the raw cumulants. The first element is the first cumulant, which is also the first moment. |
Details
The 'raw' cumulants \kappa_i
are connected
to the 'raw' (uncentered) moments, \mu_i'
via
the equation
\mu_n' = \kappa_n + \sum_{m=1}^{n-1} {n-1 \choose m-1} \kappa_m \mu_{n-m}'
Value
a vector of the raw moments. The first element of the input shall be the same as the first element of the output.
Author(s)
Steven E. Pav shabbychef@gmail.com
See Also
Examples
# normal distribution, mean 0, variance 1
n.mom <- cumulant2moment(c(0,1,0,0,0,0))
# normal distribution, mean 1, variance 1
n.mom <- cumulant2moment(c(1,1,0,0,0,0))
Approximate density and distribution via Edgeworth expansion.
Description
Approximate the probability density or cumulative distribution function of a distribution via its raw cumulants.
Usage
dapx_edgeworth(x, raw.cumulants, support=c(-Inf,Inf), log=FALSE)
papx_edgeworth(q, raw.cumulants, support=c(-Inf,Inf), lower.tail=TRUE, log.p=FALSE)
Arguments
x |
where to evaluate the approximate density. |
raw.cumulants |
an atomic array of the 1st through kth raw cumulants of the probability distribution. The first cumulant is the mean, the second is the variance. The third is not the typical unitless skew. |
support |
the support of the density function. It is assumed that the density is zero on the complement of this open interval. |
log |
logical; if TRUE, densities |
q |
where to evaluate the approximate distribution. |
log.p |
logical; if TRUE, probabilities p are given
as |
lower.tail |
whether to compute the lower tail. If false, we approximate the survival function. |
Details
Given the raw cumulants of a probability distribution, we can approximate the probability density function, or the cumulative distribution function, via an Edgeworth expansion on the standardized distribution. The derivation of the Edgeworth expansion is rather more complicated than that of the Gram Charlier approximation, involving the characteristic function and an expression of the higher order derivatives of the composition of functions; see Blinnikov and Moessner for more details. The Edgeworth expansion can be expressed succinctly as
\sigma f(\sigma x) = \phi(x) + \phi(x)\sum_{1 \le s}\sigma^s \sum_{\{k_m\}} He_{s+2r}(x) c_{k_m},
where the second sum is over some partitions, and the constant c
involves cumulants up to order s+2
. Unlike the Gram Charlier
expansion, of which it is a rearrangement, the Edgeworth expansion
is arranged in increasing powers of the standard deviation
\sigma
.
Value
The approximate density at x
, or the approximate CDF at
q
.
Note
Monotonicity of the CDF is not guaranteed.
Author(s)
Steven E. Pav shabbychef@gmail.com
References
S. Blinnikov and R. Moessner. "Expansions for nearly Gaussian distributions." Astronomy and Astrophysics Supplement 130 (1998): 193-205. http://arxiv.org/abs/astro-ph/9711239
See Also
the Gram Charlier expansions, dapx_gca, papx_gca
Examples
# normal distribution, for which this is silly
xvals <- seq(-2,2,length.out=501)
d1 <- dapx_edgeworth(xvals, c(0,1,0,0,0,0))
d2 <- dnorm(xvals)
d1 - d2
qvals <- seq(-2,2,length.out=501)
p1 <- papx_edgeworth(qvals, c(0,1,0,0,0,0))
p2 <- pnorm(qvals)
p1 - p2
Approximate density and distribution via Gram-Charlier A expansion.
Description
Approximate the probability density or cumulative distribution function of a distribution via its raw moments.
Usage
dapx_gca(x, raw.moments, support=NULL,
basis=c('normal','gamma','beta','arcsine','wigner'),
basepar=NULL, log=FALSE)
papx_gca(q, raw.moments, support=NULL,
basis=c('normal','gamma','beta','arcsine','wigner'),
basepar=NULL, lower.tail=TRUE, log.p=FALSE)
Arguments
x |
where to evaluate the approximate density. |
raw.moments |
an atomic array of the 1st through kth raw moments of the probability distribution. |
support |
the support of the density function. It is assumed
that the density is zero on the complement of this open interval.
This defaults to |
basis |
the basis under which to perform the approximation. |
basepar |
the parameters for the base distribution approximation.
If |
log |
logical; if TRUE, densities |
q |
where to evaluate the approximate distribution. |
log.p |
logical; if TRUE, probabilities p are given
as |
lower.tail |
whether to compute the lower tail. If false, we approximate the survival function. |
Details
Given the raw moments of a probability distribution, we can approximate the probability
density function, or the cumulative distribution function, via a Gram-Charlier
expansion on the standardized distribution. This expansion uses
some weighting function, w
, typically the density of some 'parent'
probability distribution, and polynomials, p_n
which are
orthogonal with respect to that weighting:
\int_{-\infty}^{\infty} p_n(x) p_m(x) w(x) \mathrm{d}x = h_n \delta_{mn}.
Let f(x)
be the probability density of some random variable,
with cumulative distribution function F(x)
. We express
f(x) = \sum_{n \ge 0} c_n p_n(x) w(x)
The constants c_n
can be computed from the known moments
of the distribution.
For the Gram Charlier 'A' series, the weighting function is the PDF of the normal distribution, and the polynomials are the (probabilist's) Hermite polynomials. As a weighting function, one can also use the PDF of the gamma distribution (resulting in generalized Laguerre polynomials), or the PDF of the Beta distribution (resulting in Jacobi polynomials).
Value
The approximate density at x
, or the approximate CDF at
Note
Monotonicity of the CDF is not guaranteed.
Author(s)
Steven E. Pav shabbychef@gmail.com
References
Jaschke, Stefan R. "The Cornish-Fisher-expansion in the context of Delta-Gamma-normal approximations." No. 2001, 54. Discussion Papers, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes, 2001. http://www.jaschke-net.de/papers/CoFi.pdf
S. Blinnikov and R. Moessner. "Expansions for nearly Gaussian distributions." Astronomy and Astrophysics Supplement 130 (1998): 193-205. http://arxiv.org/abs/astro-ph/9711239
See Also
Examples
# normal distribution:
xvals <- seq(-2,2,length.out=501)
d1 <- dapx_gca(xvals, c(0,1,0,3,0), basis='normal')
d2 <- dnorm(xvals)
# they should match:
d1 - d2
qvals <- seq(-2,2,length.out=501)
p1 <- papx_gca(qvals, c(0,1,0,3,0))
p2 <- pnorm(qvals)
p1 - p2
xvals <- seq(-6,6,length.out=501)
mu <- 2
sigma <- 3
raw.moments <- c(2,13,62,475,3182)
d1 <- dapx_gca(xvals, raw.moments, basis='normal')
d2 <- dnorm(xvals,mean=mu,sd=sigma)
## Not run:
plot(xvals,d1)
lines(xvals,d2,col='red')
## End(Not run)
p1 <- papx_gca(xvals, raw.moments, basis='normal')
p2 <- pnorm(xvals,mean=mu,sd=sigma)
## Not run:
plot(xvals,p1)
lines(xvals,p2,col='red')
## End(Not run)
# for a one-sided distribution, like the chi-square
chidf <- 30
ords <- seq(1,9)
raw.moments <- exp(ords * log(2) + lgamma((chidf/2) + ords) - lgamma(chidf/2))
xvals <- seq(0.3,10,length.out=501)
d1g <- dapx_gca(xvals, raw.moments, support=c(0,Inf), basis='gamma')
d2 <- dchisq(xvals,df=chidf)
## Not run:
plot(xvals,d1g)
lines(xvals,d2,col='red')
## End(Not run)
p1g <- papx_gca(xvals, raw.moments, support=c(0,Inf), basis='gamma')
p2 <- pchisq(xvals,df=chidf)
## Not run:
plot(xvals,p1g)
lines(xvals,p2,col='red')
## End(Not run)
# for a one-sided distribution, like the log-normal
mu <- 2
sigma <- 1
ords <- seq(1,8)
raw.moments <- exp(ords * mu + 0.5 * (sigma*ords)^2)
xvals <- seq(0.5,10,length.out=501)
d1g <- dapx_gca(xvals, raw.moments, support=c(0,Inf), basis='gamma')
d2 <- dnorm(log(xvals),mean=mu,sd=sigma) / xvals
## Not run:
plot(xvals,d1g)
lines(xvals,d2,col='red')
## End(Not run)
Convert moments to raw cumulants.
Description
Conversion of a vector of moments to raw cumulants.
Usage
moment2cumulant(moms)
Arguments
moms |
a vector of the moments. The first element is the first moment (the mean). If centered moments are given, the first moment must be zero. If raw moments are given, the first moment must be the mean. |
Details
The 'raw' cumulants \kappa_i
are connected
to the 'raw' (uncentered) moments, \mu_i'
via
the equation
\kappa_n = \mu_n' - \sum_{m=1}^{n-1} {n-1 \choose m-1} \kappa_m \mu_{n-m}'
Note that this formula also works for central moments, assuming the distribution has been normalized to zero mean.
Value
a vector of the cumulants. The first element of the input shall be the same as the first element of the output.
Note
The presence of a NA
or infinite value in the input
will propagate to the output.
Author(s)
Steven E. Pav shabbychef@gmail.com
See Also
Examples
# normal distribution, mean 0, variance 1
n.cum <- moment2cumulant(c(0,1,0,3,0,15))
# normal distribution, mean 1, variance 1
n.cum <- moment2cumulant(c(1,2,4,10,26))
# exponential distribution
lambda <- 0.7
n <- 1:6
e.cum <- moment2cumulant(factorial(n) / (lambda^n))
Approximate quantile via Cornish-Fisher expansion.
Description
Approximate the quantile function of a distribution via its cumulants.
Usage
qapx_cf(p, raw.cumulants, support=c(-Inf,Inf), lower.tail = TRUE, log.p = FALSE)
Arguments
p |
where to evaluate the approximate distribution. |
raw.cumulants |
an atomic array of the 1st through kth raw cumulants. The first value is the mean of the distribution, the second should be the variance of the distribution, the remainder are raw cumulants. |
support |
the support of the density function. It is assumed
that the density is zero on the complement of this open interval.
This defaults to |
lower.tail |
whether to compute the lower tail. If false, we approximate the survival function. |
log.p |
logical; if TRUE, probabilities p are given
as |
Details
Given the cumulants of a probability distribution, we approximate the quantile function via a Cornish-Fisher expansion.
Value
The approximate quantile at p
.
Note
Monotonicity of the quantile function is not guaranteed.
Author(s)
Steven E. Pav shabbychef@gmail.com
References
Lee, Y-S., and Lin, T-K. "Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 41, no. 1 (1992): 233-240. http://www.jstor.org/stable/2347649
Lee, Y-S., and Lin, T-K. "Correction to Algorithm AS269: High Order Cornish Fisher Expansion." Appl. Stat. 42, no. 1 (1993): 268-269. http://www.jstor.org/stable/2347433
AS 269. http://lib.stat.cmu.edu/apstat/269
Jaschke, Stefan R. "The Cornish-Fisher-expansion in the context of Delta-Gamma-normal approximations." No. 2001, 54. Discussion Papers, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes, 2001. http://www.jaschke-net.de/papers/CoFi.pdf
See Also
dapx_gca, papx_gca, AS269, rapx_cf
Examples
# normal distribution:
pvals <- seq(0.001,0.999,length.out=501)
q1 <- qapx_cf(pvals, c(0,1,0,0,0,0,0))
q2 <- qnorm(pvals)
q1 - q2
Approximate random generation via Cornish-Fisher expansion.
Description
Approximate random generation via approximate quantile function.
Usage
rapx_cf(n, raw.cumulants, support=c(-Inf,Inf))
Arguments
n |
number of observations. If 'length(n) > 1', the length is taken to be the number required. |
raw.cumulants |
an atomic array of the 1st through kth raw cumulants. The first value is the mean of the distribution, the second should be the variance of the distribution, the remainder are raw cumulants. |
support |
the support of the density function. It is assumed
that the density is zero on the complement of this open interval.
This defaults to |
Details
Given the cumulants of a probability distribution, we approximate the quantile function via a Cornish-Fisher expansion.
Value
A vector of approximate draws.
Author(s)
Steven E. Pav shabbychef@gmail.com
See Also
Examples
# normal distribution:
r1 <- rapx_cf(1000, c(0,1,0,0,0,0,0))