Title: | Visualizations of Distributions and Uncertainty |
Version: | 3.3.3 |
Date: | 2025-04-20 |
Maintainer: | Matthew Kay <mjskay@northwestern.edu> |
Description: | Provides primitives for visualizing distributions using 'ggplot2' that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as frequentist confidence distributions or Bayesian priors) and distributions represented as samples (such as bootstrap distributions or Bayesian posterior samples) are easily visualized. Visualization primitives include but are not limited to: points with multiple uncertainty intervals, eye plots (Spiegelhalter D., 1999) https://ideas.repec.org/a/bla/jorssa/v162y1999i1p45-58.html, density plots, gradient plots, dot plots (Wilkinson L., 1999) <doi:10.1080/00031305.1999.10474474>, quantile dot plots (Kay M., Kola T., Hullman J., Munson S., 2016) <doi:10.1145/2858036.2858558>, complementary cumulative distribution function barplots (Fernandes M., Walls L., Munson S., Hullman J., Kay M., 2018) <doi:10.1145/3173574.3173718>, and fit curves with multiple uncertainty ribbons. |
Depends: | R (≥ 4.0.0) |
Imports: | grid, ggplot2 (≥ 3.5.0), scales, rlang (≥ 0.3.0), cli, tibble, vctrs, withr, glue, gtable, distributional (≥ 0.3.2), numDeriv, quadprog, Rcpp |
Suggests: | tidyselect, dplyr (≥ 1.0.0), fda, posterior (≥ 1.4.0), beeswarm (≥ 0.4.0), rmarkdown, knitr, testthat (≥ 3.0.0), vdiffr (≥ 1.0.0), svglite (≥ 2.1.0), fontquiver, sysfonts, showtext, mvtnorm, covr, broom (≥ 0.5.6), patchwork, tidyr (≥ 1.0.0), ragg (≥ 1.3.0), pkgdown |
License: | GPL (≥ 3) |
Language: | en-US |
BugReports: | https://github.com/mjskay/ggdist/issues |
URL: | https://mjskay.github.io/ggdist/, https://github.com/mjskay/ggdist/ |
VignetteBuilder: | knitr |
RoxygenNote: | 7.3.2 |
LazyData: | true |
Encoding: | UTF-8 |
Collate: | "ggdist-package.R" "util.R" "compat.R" "rd.R" "RcppExports.R" "abstract_geom.R" "abstract_stat.R" "abstract_stat_slabinterval.R" "auto_partial.R" "binning_methods.R" "bounder.R" "curve_interval.R" "cut_cdf_qi.R" "data.R" "density.R" "distributions.R" "draw_key_slabinterval.R" "geom.R" "geom_slabinterval.R" "geom_dotsinterval.R" "geom_blur_dots.R" "geom_interval.R" "geom_lineribbon.R" "geom_pointinterval.R" "geom_slab.R" "geom_spike.R" "geom_swarm.R" "guide_rampbar.R" "interval_widths.R" "lkjcorr_marginal.R" "parse_dist.R" "partial_colour_ramp.R" "point_interval.R" "position_dodgejust.R" "pr.R" "rd_density.R" "rd_dotsinterval.R" "rd_slabinterval.R" "rd_spike.R" "rd_lineribbon.R" "scale_colour_ramp.R" "scale_thickness.R" "scale_side_mirrored.R" "scale_.R" "smooth.R" "stat.R" "stat_slabinterval.R" "stat_dotsinterval.R" "stat_mcse_dots.R" "stat_pointinterval.R" "stat_interval.R" "stat_lineribbon.R" "stat_spike.R" "student_t.R" "subguide.R" "subscale.R" "testthat.R" "theme_ggdist.R" "thickness.R" "tidy_format_translators.R" "weighted_ecdf.R" "weighted_hist.R" "weighted_quantile.R" "deprecated.R" |
Config/testthat/edition: | 3 |
LinkingTo: | Rcpp |
NeedsCompilation: | yes |
Packaged: | 2025-04-22 23:25:44 UTC; matth |
Author: | Matthew Kay [aut, cre], Brenton M. Wiernik [ctb] |
Repository: | CRAN |
Date/Publication: | 2025-04-23 00:20:02 UTC |
Visualizations of Distributions and Uncertainty
Description
ggdist is an R package that aims to make it easy to integrate
popular Bayesian modeling methods into a tidy data + ggplot workflow.
Details
ggdist is an R package that provides a flexible set of ggplot2 geoms and stats designed
especially for visualizing distributions and uncertainty. It is designed for both
frequentist and Bayesian uncertainty visualization, taking the view that uncertainty
visualization can be unified through the perspective of distribution visualization:
for frequentist models, one visualizes confidence distributions or bootstrap distributions
(see vignette("freq-uncertainty-vis")); for Bayesian models, one visualizes probability
distributions (see vignette("tidybayes", package = "tidybayes")).
The geom_slabinterval() / stat_slabinterval() family (see vignette("slabinterval")) makes it
easy to visualize point summaries and intervals, eye plots, half-eye plots, ridge plots,
CCDF bar plots, gradient plots, histograms, and more.
The geom_dotsinterval() / stat_dotsinterval() family (see vignette("dotsinterval")) makes
it easy to visualize dot+interval plots, Wilkinson dotplots, beeswarm plots, and quantile dotplots.
The geom_lineribbon() / stat_lineribbon() family (see vignette("lineribbon"))
makes it easy to visualize fit lines with an arbitrary number of uncertainty bands.
Author(s)
Maintainer: Matthew Kay mjskay@northwestern.edu
Other contributors:
Brenton M. Wiernik brenton@wiernik.org [contributor]
See Also
Useful links:
Report bugs at https://github.com/mjskay/ggdist/issues
Base ggproto classes for ggdist
Description
Base ggproto classes for ggdist
See Also
Probability expressions in ggdist aesthetics
Description
Experimental probability-like expressions that can be used in place of
some after_stat() expressions in aesthetic assignments in ggdist stats.
Usage
Pr_(x)
p_(x)
Arguments
x |
<bare language> Expressions. See Probability expressions, below. |
Details
Pr_() and p_() are an experimental mini-language for specifying aesthetic values
based on probabilities and probability densities derived from distributions
supplied to ggdist stats (e.g., in stat_slabinterval(), stat_dotsinterval(), etc.).
They generate expressions that use after_stat() and the computed variables of the
stat (such as cdf and pdf; see e.g. the Computed Variables section of
stat_slabinterval()) to compute the desired probabilities or densities.
For example, one way to map the density of a distribution onto the alpha
aesthetic of a slab is to use after_stat(pdf):
ggplot() + stat_slab(aes(xdist = distributional::dist_normal(), alpha = after_stat(pdf)))
ggdist probability expressions offer an alternative, equivalent syntax:
ggplot() + stat_slab(aes(xdist = distributional::dist_normal(), alpha = !!p_(x)))
Here, p_(x) is the probability density function. The use of !! is
necessary to splice the generated expression into the aes() call; for
more information, see quasiquotation.
Probability expressions
Probability expressions consist of a call to Pr_() or p_() containing
a small number of valid combinations of operators and variable names.
Valid variables in probability expressions include:
- x, y, or value: values along the x or y axis. value is the orientation-neutral form.
- xdist, ydist, or dist: distributions mapped along the x or y axis. dist is the orientation-neutral form. X and Y can also be used as synonyms for xdist and ydist.
- interval: the smallest interval containing the current x/y value.
Pr_() generates expressions for probabilities, e.g. cumulative distribution
functions (CDFs). Valid operators inside Pr_() are:
- <, <=, >, >=: generates values of the cumulative distribution function (CDF) or complementary CDF by comparing one of {x, y, value} to one of {xdist, ydist, dist, X, Y}. For example, Pr_(xdist <= x) gives the CDF and Pr_(xdist > x) gives the CCDF.
- %in%: currently can only be used with interval on the right-hand side: gives the probability of {x, y, value} (left-hand side) being in the smallest interval the stat generated that contains the value; e.g. Pr_(x %in% interval).
p_() generates expressions for probability density functions or probability mass
functions (depending on whether the underlying distribution is continuous or
discrete). It currently does not allow any operators in the expression, and
must be passed one of x, y, or value.
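As a rough illustration of the quantities these expressions stand for, the following base R sketch computes the analogous values by hand for a standard normal distribution. It does not use ggdist; the variable names are illustrative only, and the correspondence to Pr_() and p_() noted in the comments follows the description above rather than the package internals.

```r
# Grid of x values at which a stat might evaluate the slab
x = seq(-3, 3, length.out = 7)

# Analogue of Pr_(xdist <= x): the CDF of a standard normal
cdf = pnorm(x)

# Analogue of Pr_(xdist > x): the complementary CDF
ccdf = pnorm(x, lower.tail = FALSE)

# Analogue of p_(x): the probability density function
pdf = dnorm(x)

data.frame(x, cdf, ccdf, pdf)
```

In a ggdist stat, the corresponding values would come from the cdf and pdf computed variables rather than from direct calls to pnorm() and dnorm().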
See Also
The Computed Variables section of stat_slabinterval() (especially
cdf and pdf) and the after_stat() function.
Examples
library(ggplot2)
library(distributional)
df = data.frame(
d = c(dist_normal(2.7, 1), dist_lognormal(1, 1/3)),
name = c("normal", "lognormal")
)
# map density onto alpha of the fill
ggplot(df, aes(y = name, xdist = d)) +
stat_slabinterval(aes(alpha = !!p_(x)))
# map CCDF onto thickness (like stat_ccdfinterval())
ggplot(df, aes(y = name, xdist = d)) +
stat_slabinterval(aes(thickness = !!Pr_(xdist > x)))
# map containing interval onto fill
ggplot(df, aes(y = name, xdist = d)) +
stat_slabinterval(aes(fill = !!Pr_(x %in% interval)))
# the color scale in the previous example is not great, so turn the
# probability into an ordered factor and adjust the fill scale.
# Though, see also the `level` computed variable in `stat_slabinterval()`,
# which is probably easier to use to create this style of chart.
ggplot(df, aes(y = name, xdist = d)) +
stat_slabinterval(aes(fill = ordered(!!Pr_(x %in% interval)))) +
scale_fill_brewer(direction = -1)
Thinned subset of posterior sample from a Bayesian analysis of perception of correlation.
Description
Data from Kay and Heer (2016), primarily used for testing and examples.
Details
For more details, see Kay and Heer (2016) or the GitHub repository describing the analysis: https://github.com/mjskay/ranking-correlation. The original experiment (but not this analysis of it) is described in Harrison et al. (2014).
data("RankCorr") is a substantially thinned version of the original posterior sample and omits several
parameters to keep it a more manageable size.
data("RankCorr_u_tau") is used for testing and examples and is roughly the equivalent of the following:
data("RankCorr")
RankCorr_u_tau = tidybayes::spread_draws(RankCorr, u_tau[i])
References
Kay, Matthew, and Jeffrey Heer. (2016). "Beyond Weber's law: A second look at ranking visualizations of correlation." IEEE transactions on visualization and computer graphics 22(1): 469-478. doi:10.1109/TVCG.2015.2467671
Harrison, Lane, Fumeng Yang, Steven Franconeri, and Remco Chang. (2014). "Ranking visualizations of correlation using Weber's law." IEEE transactions on visualization and computer graphics 20(12): 1943-1952. doi:10.1109/TVCG.2014.2346979
Break (bin) alignment methods
Description
Methods for aligning breaks (bins) in histograms, as used in the align
argument to density_histogram().
Supports automatic partial function application with waived arguments.
Usage
align_none(breaks)
align_boundary(breaks, at = 0)
align_center(breaks, at = 0)
Arguments
breaks |
<numeric> A sorted vector of breaks (bin edges). |
at |
<scalar numeric> The alignment point.
|
Details
These functions take a sorted vector of equally-spaced breaks giving
bin edges and return a numeric offset which, if subtracted from breaks,
will align them as desired:
- align_none() performs no alignment (it always returns 0).
- align_boundary() ensures that a bin edge lines up with at.
- align_center() ensures that a bin center lines up with at.
For align_boundary() (respectively align_center()), if no bin edge (or center) in the
range of breaks would line up with at, it ensures that at is an integer
multiple of the bin width away from a bin edge (or center).
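One way to obtain offsets with the properties described above is modular arithmetic on the first break. The following base R sketch is an assumption for illustration, not the ggdist implementation; offset_boundary() and offset_center() are hypothetical names.

```r
# Hypothetical helpers (not the ggdist implementations): offsets that,
# when subtracted from equally-spaced breaks, align the bins with `at`
offset_boundary = function(breaks, at = 0) {
  width = breaks[2] - breaks[1]
  (breaks[1] - at) %% width
}
offset_center = function(breaks, at = 0) {
  width = breaks[2] - breaks[1]
  (breaks[1] - at + width / 2) %% width
}

breaks = c(0.3, 1.3, 2.3, 3.3)
breaks - offset_boundary(breaks)  # edges fall on integers, so one edge is at 0
breaks - offset_center(breaks)    # edges fall on half-integers, so a bin is centered on 0
```

Because the offset is taken modulo the bin width, at ends up an integer number of bin widths away from an edge (or center) even when it lies outside the range of breaks.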
Value
A scalar numeric giving the offset to be subtracted from breaks.
See Also
Examples
library(ggplot2)
set.seed(1234)
x = rnorm(200, 1, 2)
# If we manually specify a bin width using breaks_fixed(), the default
# alignment (align_none()) will not align bin edges to any "pretty" numbers.
# Here is a comparison of the three alignment methods on such a histogram:
ggplot(data.frame(x), aes(x)) +
stat_slab(
aes(y = "align_none()\nor 'none'"),
density = "histogram",
breaks = breaks_fixed(width = 1),
outline_bars = TRUE,
# no need to specify align; align_none() is the default
color = "black",
) +
stat_slab(
aes(y = "align_center(at = 0)\nor 'center'"),
density = "histogram",
breaks = breaks_fixed(width = 1),
align = align_center(at = 0), # or align = "center"
outline_bars = TRUE,
color = "black",
) +
stat_slab(
aes(y = "align_boundary(at = 0)\nor 'boundary'"),
density = "histogram",
breaks = breaks_fixed(width = 1),
align = align_boundary(at = 0), # or align = "boundary"
outline_bars = TRUE,
color = "black",
) +
geom_point(aes(y = 0.7), alpha = 0.5) +
labs(
subtitle = "ggdist::stat_slab(density = 'histogram', ...)",
y = "align =",
x = NULL
) +
geom_vline(xintercept = 0, linetype = "22", color = "red")
Automatic partial function application in ggdist
Description
Several ggdist functions support automatic partial application: when called without all of their required arguments, the function returns a modified version of itself that uses the arguments passed to it so far as defaults. Technically speaking, these functions are essentially "Curried" with respect to their required arguments, but I think "automatic partial application" gets the idea across more clearly.
Functions supporting automatic partial application include:
- The point_interval() family, such as median_qi(), mean_qi(), mode_hdi(), etc.
- The smooth_ family, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), and smooth_bar().
- The density_ family, such as density_bounded(), density_unbounded() and density_histogram().
- The align family.
- The breaks family.
- The bandwidth family.
- The blur family.
Partial application makes it easier to supply custom parameters to these
functions when using them inside other functions, such as geoms and stats.
For example, smoothers for geom_dots() can be supplied in one of three
ways:
- as a suffix: geom_dots(smooth = "bounded")
- as a function: geom_dots(smooth = smooth_bounded)
- as a partially-applied function with options: geom_dots(smooth = smooth_bounded(kernel = "cosine"))
Many other common arguments for ggdist functions work similarly; e.g.
density, align, breaks, bandwidth, and point_interval arguments.
These function families (except point_interval()) also support passing
waivers to their optional arguments: if waiver() is passed to any
of these arguments, their default value (or the most
recently-partially-applied non-waiver value) is used instead.
Use the auto_partial() function to create new functions that support
automatic partial application.
Usage
auto_partial(f, name = NULL, waivable = TRUE)
Arguments
f |
<function> Function to automatically partially-apply. |
name |
<string> Name of the function, to be used when printing. |
waivable |
<scalar logical> If |
Value
A modified version of f that will automatically be partially
applied if not all of its required arguments are given.
Examples
set.seed(1234)
x = rnorm(100)
# the first required argument, `x`, of the density_ family is the vector
# to calculate a kernel density estimate from. If it is not provided, the
# function is partially applied and returned as-is
density_unbounded()
# we could create a new function that uses half the default bandwidth
density_half_bw = density_unbounded(adjust = 0.5)
density_half_bw
# we can overwrite partially-applied arguments
density_quarter_bw_trimmed = density_half_bw(adjust = 0.25, trim = TRUE)
density_quarter_bw_trimmed
# when we eventually call the function and provide the required argument
# `x`, it is applied using the arguments we have "saved up" so far
density_quarter_bw_trimmed(x)
# create a custom automatically partially applied function
f = auto_partial(function(x, y, z = 3) (x + y) * z)
f()
f(1)
g = f(y = 2)(z = 4)
g
g(1)
# pass waiver() to optional arguments to use existing values
f(z = waiver())(1, 2) # uses default z = 3
f(z = 4)(z = waiver())(1, 2) # uses z = 4
Bandwidth estimators
Description
Bandwidth estimators for densities, used in the bandwidth argument
to density functions (e.g. density_bounded(), density_unbounded()).
Supports automatic partial function application with waived arguments.
Usage
bandwidth_nrd0(x, ...)
bandwidth_nrd(x, ...)
bandwidth_ucv(x, ...)
bandwidth_bcv(x, ...)
bandwidth_SJ(x, ...)
bandwidth_dpi(x, ...)
Arguments
x |
<numeric> Vector containing a sample. |
... |
Arguments passed on to
|
Details
These are loose wrappers around the corresponding bw.-prefixed functions
in stats. See, for example, bw.SJ().
bandwidth_dpi(), which is the default bandwidth estimator in ggdist,
is the Sheather-Jones direct plug-in estimator, i.e. bw.SJ(..., method = "dpi").
With the exception of bandwidth_nrd0(), these estimators may fail in some
cases, often when a sample contains many duplicates. If they do, they will
automatically fall back to bandwidth_nrd0() with a warning. However, these
failures are typically symptomatic of situations where you should not want to
use a kernel density estimator in the first place (e.g. data with duplicates
and/or discrete data). In these cases consider using a dotplot (geom_dots())
or histogram (density_histogram()) instead.
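The fallback behavior described above can be approximated in base R with tryCatch(). This is a minimal sketch under stated assumptions, not the ggdist source; bandwidth_with_fallback() is a hypothetical name, and only stats::bw.SJ() and stats::bw.nrd0() are real functions.

```r
# Hypothetical helper (not part of ggdist): try the Sheather-Jones direct
# plug-in estimator and fall back to bw.nrd0() with a warning if it fails
bandwidth_with_fallback = function(x) {
  tryCatch(
    stats::bw.SJ(x, method = "dpi"),
    error = function(e) {
      warning("bw.SJ() failed; falling back to bw.nrd0(): ", conditionMessage(e))
      stats::bw.nrd0(x)
    }
  )
}

set.seed(1234)
bandwidth_with_fallback(rnorm(100))
```

The result is a single positive number that could be passed as the bw argument to stats::density().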
Value
A single number giving the bandwidth.
See Also
density_bounded(), density_unbounded().
Bin data values using a dotplot algorithm
Description
Bins the provided data values using one of several dotplot algorithms.
Usage
bin_dots(
x,
y,
binwidth,
heightratio = 1,
stackratio = 1,
layout = c("bin", "weave", "hex", "swarm", "bar"),
side = c("topright", "top", "right", "bottomleft", "bottom", "left", "topleft",
"bottomright", "both"),
orientation = c("horizontal", "vertical", "y", "x"),
overlaps = "nudge"
)
Arguments
x |
<numeric> x values. |
y |
<numeric> y values (same length as |
binwidth |
<scalar numeric> Bin width. |
heightratio |
<scalar numeric> Ratio of bin width to dot height |
stackratio |
<scalar numeric> Ratio of dot height to vertical distance between dot centers |
layout |
<string> The layout method used for the dots. One of:
|
side |
Which side to place the slab on. |
orientation |
<string> Whether the dots are laid out horizontally
or vertically. Follows the naming scheme of
For compatibility with the base ggplot naming scheme for |
overlaps |
<string> How to handle overlapping dots or bins in the
|
Value
A data.frame with three columns:
- x: the x position of each dot
- y: the y position of each dot
- bin: a unique number associated with each bin (supplied but not used when layout = "swarm")
See Also
find_dotplot_binwidth() for an algorithm that finds good bin widths
to use with this function; geom_dotsinterval() for geometries that use
these algorithms to create dotplots.
Examples
library(dplyr)
library(ggplot2)
x = qnorm(ppoints(20))
bin_df = bin_dots(x = x, y = 0, binwidth = 0.5, heightratio = 1)
bin_df
# we can manually plot the binning above, though this is only recommended
# if you are using find_dotplot_binwidth() and bin_dots() to build your own
# grob. For practical use it is much easier to use geom_dots(), which will
# automatically select good bin widths for you (and which uses
# find_dotplot_binwidth() and bin_dots() internally)
bin_df %>%
ggplot(aes(x = x, y = y)) +
geom_point(size = 4) +
coord_fixed()
Blur functions for blurry dot plots
Description
Methods for constructing blurs, as used in the blur argument to
geom_blur_dots() or stat_mcse_dots().
Supports automatic partial function application with waived arguments.
Usage
blur_gaussian(x, r, sd)
blur_interval(x, r, sd, .width = 0.95)
Arguments
x |
<numeric> Vector of positive distances from the center of the dot (assumed to be 0) to evaluate blur function at. |
r |
<scalar numeric> Radius of the dot that is being blurred. |
sd |
<scalar numeric> Standard deviation of the dot that is being blurred. |
.width |
<scalar numeric> For |
Details
These functions are passed x, r, and sd when geom_blur_dots()
draws in order to create a radial gradient representing each dot in the
dotplot. They return values between 0 and 1 giving the opacity of the
dot at each value of x.
blur_gaussian() creates a dot with radius r that has a Gaussian blur with
standard deviation sd applied to it. It does this by calculating
\alpha(x; r, \sigma), the opacity at distance x from the center
of a dot with radius r that has had a Gaussian blur with standard
deviation \sigma = sd applied to it:
\alpha(x; r, \sigma) = \Phi \left(\frac{x + r}{\sigma} \right) -
\Phi \left(\frac{x - r}{\sigma} \right)
blur_interval() creates an interval-type representation around the
dot at 50% opacity, where the interval is a Gaussian quantile interval with
mass equal to .width and standard deviation sd.
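The opacity formula given above for blur_gaussian() can be evaluated directly with pnorm(). The following base R sketch is a literal transcription of that formula; blur_gaussian_sketch() is a hypothetical name, and the sketch assumes sd is greater than 0.

```r
# Hypothetical re-implementation of the Gaussian blur opacity formula
# alpha(x; r, sigma) = Phi((x + r) / sigma) - Phi((x - r) / sigma)
blur_gaussian_sketch = function(x, r, sd) {
  pnorm((x + r) / sd) - pnorm((x - r) / sd)
}

x = seq(0, 3, length.out = 7)   # distances from the center of the dot
blur_gaussian_sketch(x, r = 1, sd = 0.5)
```

The returned opacities lie between 0 and 1 and decrease as the distance from the center of the dot grows.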
Value
A vector with the same length as x giving the opacity of the radial
gradient representing the dot at each x value.
See Also
geom_blur_dots() and stat_mcse_dots() for geometries making use of
blur functions.
Examples
# see examples in geom_blur_dots()
Estimate bounds of a distribution using the CDF of its order statistics
Description
Estimate the bounds of the distribution a sample came from using the CDF of
the order statistics of the sample. Use with the bounder argument to density_bounded().
Supports automatic partial function application with waived arguments.
Usage
bounder_cdf(x, p = 0.01)
Arguments
x |
<numeric> Sample to estimate the bounds of. |
p |
<scalar numeric> in |
Details
bounder_cdf() uses the distribution of the order statistics of
X to estimate where the first and last order statistics (i.e. the
min and max) of this distribution would be, assuming the sample x is the
distribution. Then, it adjusts the boundary outwards from min(x) (or max(x))
by the distance between min(x) (or max(x)) and the nearest estimated
order statistic.
Taking X = x, the distributions of the first and last order statistics are:
\begin{array}{rcl}
F_{X_{(1)}}(x) &=& 1 - \left[1 - F_X(x)\right]^n\\
F_{X_{(n)}}(x) &=& F_X(x)^n
\end{array}
Re-arranging, we can get the inverse CDFs (quantile functions) of each
order statistic in terms of the quantile function of X (which we
can estimate from the data), giving us an estimate for the minimum
and maximum order statistic:
\begin{array}{rcrcl}
\hat{x_1} &=& F_{X_{(1)}}^{-1}(p) &=& F_X^{-1}\left[1 - (1 - p)^{1/n}\right]\\
\hat{x_n} &=& F_{X_{(n)}}^{-1}(p) &=& F_X^{-1}\left[p^{1/n}\right]
\end{array}
Then the estimated bounds are:
\left[2\min(x) - \hat{x_1}, 2\max(x) - \hat{x_n} \right]
These bounds depend on p, the percentile of the distribution of the order
statistic used to form the estimate. While p = 0.5 (the median) might be
a reasonable choice (and gives results similar to bounder_cooke()), this tends
to be a bit too aggressive in "detecting" bounded distributions, especially with
small sample sizes. Thus, we use a default of p = 0.01, which tends to
be very conservative in small samples (in that it usually gives results
roughly equivalent to an unbounded distribution), but which still performs
well on bounded distributions when sample sizes are larger (in the thousands).
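The formulas above can be transcribed into base R as follows. This is a sketch under stated assumptions rather than the package implementation: bounder_cdf_sketch() is a hypothetical name, and the quantile function of X is estimated with quantile() at its default settings, which may differ from what ggdist uses.

```r
# Hypothetical transcription of the bounds formulas above. Assumption:
# F_X^{-1} is estimated with quantile() and its default settings.
bounder_cdf_sketch = function(x, p = 0.01) {
  n = length(x)
  # estimated first and last order statistics at percentile p
  x_1_hat = quantile(x, 1 - (1 - p)^(1 / n), names = FALSE)
  x_n_hat = quantile(x, p^(1 / n), names = FALSE)
  # reflect each estimate around the sample minimum / maximum
  c(2 * min(x) - x_1_hat, 2 * max(x) - x_n_hat)
}

set.seed(1234)
x = runif(1000)   # sample from a distribution bounded on [0, 1]
range(x)
bounder_cdf_sketch(x)
```

Because both estimated order statistics lie within the range of the sample, the resulting bounds are never narrower than range(x).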
Value
A length-2 numeric vector giving an estimate of the minimum and maximum bounds
of the distribution that x came from.
See Also
The bounder argument to density_bounded().
Other bounds estimators: bounder_cooke(), bounder_range()
Estimate bounds of a distribution using Cooke's method
Description
Estimate the bounds of the distribution a sample came from using Cooke's method.
Use with the bounder argument to density_bounded().
Supports automatic partial function application with waived arguments.
Usage
bounder_cooke(x)
Arguments
x |
<numeric> Sample to estimate the bounds of. |
Details
Estimate the bounds of a distribution using the method from Cooke (1979); i.e. method 2.3 from Loh (1984). These bounds are:
\left[\begin{array}{l}
2X_{(1)} - \sum_{i = 1}^n \left[\left(1 - \frac{i - 1}{n}\right)^n -
\left(1 - \frac{i}{n}\right)^n \right] X_{(i)}\\
2X_{(n)} - \sum_{i = 1}^n \left[\left(1 - \frac{n - i}{n}\right)^n -
\left(1 - \frac{n + 1 - i}{n} \right)^n\right] X_{(i)}
\end{array}\right]
Where X_{(i)} is the ith order statistic of x (i.e. its ith-smallest value).
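The two weighted sums above translate into a few lines of base R. The following is a sketch of the formula as written here, not the package source; bounder_cooke_sketch() is a hypothetical name.

```r
# Hypothetical re-implementation of the Cooke (1979) bounds given above
bounder_cooke_sketch = function(x) {
  x = sort(x)                      # order statistics X_(1), ..., X_(n)
  n = length(x)
  i = seq_len(n)
  # weights on the order statistics for the lower and upper bound
  w_lower = (1 - (i - 1) / n)^n - (1 - i / n)^n
  w_upper = (1 - (n - i) / n)^n - (1 - (n + 1 - i) / n)^n
  c(2 * x[1] - sum(w_lower * x), 2 * x[n] - sum(w_upper * x))
}

set.seed(1234)
x = runif(200)   # sample from a distribution bounded on [0, 1]
range(x)
bounder_cooke_sketch(x)
```

Each set of weights sums to 1, so each weighted sum lies within the sample range and the estimated bounds fall at or outside min(x) and max(x).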
Value
A length-2 numeric vector giving an estimate of the minimum and maximum bounds
of the distribution that x came from.
References
Cooke, P. (1979). Statistical inference for bounds of random variables. Biometrika 66(2), 367–374. doi:10.1093/biomet/66.2.367.
Loh, W. Y. (1984). Estimating an endpoint of a distribution with resampling methods. The Annals of Statistics 12(4), 1543–1550. doi:10.1214/aos/1176346811
See Also
The bounder argument to density_bounded().
Other bounds estimators: bounder_cdf(), bounder_range()
Estimate bounds of a distribution using the range of the sample
Description
Estimate the bounds of the distribution a sample came from using the range of the sample.
Use with the bounder argument to density_bounded().
Supports automatic partial function application with waived arguments.
Usage
bounder_range(x)
Arguments
x |
<numeric> Sample to estimate the bounds of. |
Details
Estimate the bounds of a distribution using range(x).
Value
A length-2 numeric vector giving an estimate of the minimum and maximum bounds
of the distribution that x came from.
See Also
The bounder argument to density_bounded().
Other bounds estimators: bounder_cdf(), bounder_cooke()
Break (bin) selection algorithms for histograms
Description
Methods for determining breaks (bins) in histograms, as used in the breaks
argument to density_histogram().
Supports automatic partial function application with waived arguments.
Usage
breaks_fixed(x, weights = NULL, width = 1)
breaks_Sturges(x, weights = NULL)
breaks_Scott(x, weights = NULL)
breaks_FD(x, weights = NULL, digits = 5)
breaks_quantiles(x, weights = NULL, max_n = "Scott", min_width = 0.5)
Arguments
x |
<numeric> Sample values. |
weights |
<numeric | NULL> Optional weights to apply to |
width |
<scalar numeric> For |
digits |
<scalar numeric> For |
max_n |
<scalar numeric | function | string>
For |
min_width |
<scalar numeric> For |
Details
These functions take a sample and its weights and return a value suitable for
the breaks argument to density_histogram() that will determine the histogram
breaks.
- breaks_fixed() allows you to manually specify a fixed bin width.
- breaks_Sturges(), breaks_Scott(), and breaks_FD() implement weighted versions of their corresponding base functions. They return a scalar numeric giving the number of bins. See nclass.Sturges(), nclass.scott(), and nclass.FD().
- breaks_quantiles() constructs irregularly-sized bins using max_n + 1 (possibly weighted) quantiles of x. The final number of bins is at most max_n, as small bins (ones whose bin width is less than half the range of the data divided by max_n times min_width) will be merged into adjacent bins.
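For reference, the unweighted base functions named above can be called directly to see the kind of value (a number of bins) these methods return. The fixed-width edges at the end are an assumption for illustration and do not necessarily match the edges breaks_fixed() produces.

```r
set.seed(1234)
x = rnorm(2000, 1, 2)

# Unweighted base R counterparts of breaks_Sturges(), breaks_Scott(), breaks_FD()
n_bins = c(
  Sturges = grDevices::nclass.Sturges(x),
  Scott = grDevices::nclass.scott(x),
  FD = grDevices::nclass.FD(x)
)
n_bins

# Assumption for illustration: a fixed bin width of 0.5 can be turned into
# explicit bin edges that cover the range of the data like this
width = 0.5
edges = seq(floor(min(x) / width) * width, ceiling(max(x) / width) * width, by = width)
range(edges)
```

Either form, a single bin count or a vector of bin edges, matches the two kinds of return value listed under Value.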
Value
Either a single number (giving the number of bins) or a vector giving the edges between bins.
See Also
Examples
library(ggplot2)
set.seed(1234)
x = rnorm(2000, 1, 2)
# Let's compare the different break-selection algorithms on this data:
ggplot(data.frame(x), aes(x)) +
stat_slab(
aes(y = "breaks_fixed(width = 0.5)"),
density = "histogram",
breaks = breaks_fixed(width = 0.5),
outline_bars = TRUE,
color = "black",
) +
stat_slab(
aes(y = "breaks_Sturges()\nor 'Sturges'"),
density = "histogram",
breaks = "Sturges",
outline_bars = TRUE,
color = "black",
) +
stat_slab(
aes(y = "breaks_Scott()\nor 'Scott'"),
density = "histogram",
breaks = "Scott",
outline_bars = TRUE,
color = "black",
) +
stat_slab(
aes(y = "breaks_FD()\nor 'FD'"),
density = "histogram",
breaks = "FD",
outline_bars = TRUE,
color = "black",
) +
stat_slab(
aes(y = "breaks_quantiles()\nor 'quantiles'"),
density = "histogram",
breaks = "quantiles",
outline_bars = TRUE,
color = "black",
) +
geom_point(aes(y = 0.7), alpha = 0.5) +
labs(
subtitle = "ggdist::stat_slab(density = 'histogram', ...)",
y = "breaks =",
x = NULL
)
Curvewise point and interval summaries for tidy data frames of draws from distributions
Description
Translates draws from distributions in a grouped data frame into a set of point and interval summaries using a curve boxplot-inspired approach.
Usage
curve_interval(
.data,
...,
.along = NULL,
.width = 0.5,
na.rm = FALSE,
.interval = c("mhd", "mbd", "bd", "bd-mbd")
)
## S3 method for class 'matrix'
curve_interval(
.data,
...,
.along = NULL,
.width = 0.5,
na.rm = FALSE,
.interval = c("mhd", "mbd", "bd", "bd-mbd")
)
## S3 method for class 'rvar'
curve_interval(
.data,
...,
.along = NULL,
.width = 0.5,
na.rm = FALSE,
.interval = c("mhd", "mbd", "bd", "bd-mbd")
)
## S3 method for class 'data.frame'
curve_interval(
.data,
...,
.along = NULL,
.width = 0.5,
na.rm = FALSE,
.interval = c("mhd", "mbd", "bd", "bd-mbd"),
.simple_names = TRUE,
.exclude = c(".chain", ".iteration", ".draw", ".row")
)
Arguments
.data |
<data.frame | rvar | matrix> One of:
|
... |
<bare language> Bare column names or expressions that, when evaluated in the context of
|
.along |
<tidyselect> Which columns are the input values to the function
describing the curve (e.g., the "x" values). Intervals are calculated jointly with
respect to these variables, conditional on all other grouping variables in the data frame. The default
( |
.width |
<numeric> Vector of probabilities to use that determine the widths of the resulting
intervals. If multiple probabilities are provided, multiple rows per group are generated, each with
a different probability interval (and value of the corresponding |
na.rm |
<scalar logical> Should |
.interval |
<string> The method used to calculate the intervals. Currently, all
methods rank the curves using some measure of data depth, then create envelopes containing the
|
.simple_names |
<scalar logical> When |
.exclude |
<character> Vector of names of columns to be excluded from summarization if no column names are specified to be summarized. Default ignores several meta-data column names used in ggdist and tidybayes. |
Details
Intervals are calculated by ranking the curves using some measure of data depth, then
using binary search to find a cutoff k such that an envelope containing the k% "deepest"
curves also contains .width% of the curves, for each value of .width (note that k
and .width are not necessarily the same). This is in contrast to most functional boxplot
or curve boxplot approaches, which tend to simply take the .width% deepest curves, and
are generally quite conservative (i.e. they may contain more than .width% of the curves).
See Mirzargar et al. (2014) or Juul et al. (2020) for an accessible introduction to data depth and curve boxplots / functional boxplots.
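The procedure described above can be sketched in base R on a matrix of curves. This is an illustration under several assumptions, not the curve_interval() implementation: curve_envelope_sketch() is a hypothetical name, depth is measured as the mean over x of a pointwise halfspace depth, and a simple linear search replaces the binary search.

```r
# Hypothetical sketch: curves is a numeric matrix with one row per curve and
# one column per x value; width is the target share of curves to enclose
curve_envelope_sketch = function(curves, width = 0.5) {
  n = nrow(curves)
  # pointwise halfspace depth: how far each value is from the extremes at its x
  ranks = apply(curves, 2, rank)
  pointwise_depth = pmin(ranks, n + 1 - ranks) / n
  depth = rowMeans(pointwise_depth)   # mean depth of each curve
  by_depth = order(depth, decreasing = TRUE)

  # grow the set of deepest curves until its envelope encloses enough curves
  for (m in seq_len(n)) {
    deepest = curves[by_depth[seq_len(m)], , drop = FALSE]
    lower = apply(deepest, 2, min)
    upper = apply(deepest, 2, max)
    inside = apply(curves, 1, function(curve) all(curve >= lower & curve <= upper))
    if (mean(inside) >= width) break
  }
  list(lower = lower, upper = upper, n_deepest = m, coverage = mean(inside))
}

set.seed(1234)
curves = t(replicate(50, cumsum(rnorm(20))))   # 50 random-walk curves over 20 x values
env = curve_envelope_sketch(curves, width = 0.5)
env$n_deepest / nrow(curves)   # share of deepest curves used to build the envelope
env$coverage                   # share of curves that lie entirely inside the envelope
```

The share of deepest curves used and the resulting coverage are reported separately because, as noted above, the two need not be equal.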
Value
A data frame containing point summaries and intervals, with at least one column corresponding
to the point summary, one to the lower end of the interval, one to the upper end of the interval, the
width of the interval (.width), the type of point summary (.point), and the type of interval (.interval).
Author(s)
Matthew Kay
References
Fraiman, Ricardo and Graciela Muniz. (2001). "Trimmed means for functional data". Test 10: 419–440. doi:10.1007/BF02595706.
Sun, Ying and Marc G. Genton. (2011). "Functional Boxplots". Journal of Computational and Graphical Statistics, 20(2): 316-334. doi:10.1198/jcgs.2011.09224
Mirzargar, Mahsa, Ross T Whitaker, and Robert M Kirby. (2014). "Curve Boxplot: Generalization of Boxplot for Ensembles of Curves". IEEE Transactions on Visualization and Computer Graphics. 20(12): 2654-2663. doi:10.1109/TVCG.2014.2346455
Juul, Jonas, Kaare Græsbøll, Lasse Engbo Christiansen, and Sune Lehmann. (2020). "Fixed-time descriptive statistics underestimate extremes of epidemic curve ensembles". arXiv e-print. arXiv:2007.05035
See Also
point_interval() for pointwise intervals. See vignette("lineribbon") for more examples
and discussion of the differences between pointwise and curvewise intervals.
Examples
library(dplyr)
library(ggplot2)
# generate a set of curves
k = 11 # number of curves
n = 201
df = tibble(
.draw = rep(1:k, n),
mean = rep(seq(-5,5, length.out = k), n),
x = rep(seq(-15,15,length.out = n), each = k),
y = dnorm(x, mean, 3)
)
# see pointwise intervals...
df %>%
group_by(x) %>%
median_qi(y, .width = c(.5)) %>%
ggplot(aes(x = x, y = y)) +
geom_lineribbon(aes(ymin = .lower, ymax = .upper)) +
geom_line(aes(group = .draw), alpha=0.15, data = df) +
scale_fill_brewer() +
ggtitle("50% pointwise intervals with point_interval()") +
theme_ggdist()
# ... compare them to curvewise intervals
df %>%
group_by(x) %>%
curve_interval(y, .width = c(.5)) %>%
ggplot(aes(x = x, y = y)) +
geom_lineribbon(aes(ymin = .lower, ymax = .upper)) +
geom_line(aes(group = .draw), alpha=0.15, data = df) +
scale_fill_brewer() +
ggtitle("50% curvewise intervals with curve_interval()") +
theme_ggdist()
Categorize values from a CDF into quantile intervals
Description
Given a vector of probabilities from a cumulative distribution function (CDF)
and a list of desired quantile intervals, return a vector categorizing each
element of the input vector according to which quantile interval it falls into.
NOTE: While this function can be used for (and was originally designed for)
drawing slabs with intervals overlaid on the density, this can now be
done more easily by mapping the .width or level computed variable to
slab fill or color. See Examples.
Usage
cut_cdf_qi(p, .width = c(0.66, 0.95, 1), labels = NULL)
Arguments
p |
<numeric> Vector of values from a cumulative distribution function,
such as values returned by |
.width |
<numeric> Vector of probabilities to use that determine the widths of the resulting intervals. |
labels |
<character | function | NULL> One of:
|
Value
An ordered factor of the same length as p giving the quantile interval to
which each value of p belongs.
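As a small illustration of the categorization described above, the following sketch applies cut_cdf_qi() to CDF values obtained from pnorm(). It assumes ggdist is installed; the evaluation points are arbitrary values chosen for this example.

```r
library(ggdist)

# CDF values of a standard normal at a few (arbitrarily chosen) points
p = pnorm(c(-3, -1, 0, 1, 3))

# categorize each CDF value by the narrowest central interval that
# contains it, using widths of 66%, 95%, and 100%
intervals = cut_cdf_qi(p, .width = c(0.66, 0.95, 1))

# the result is an ordered factor with one element per element of p
is.ordered(intervals)
data.frame(p = p, interval = intervals)
```

Values near the median of the distribution should land in the narrowest interval, and values in the tails in the widest.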
See Also
See stat_slabinterval() and its shortcut stats, which generate cdf
aesthetics that can be used with cut_cdf_qi() to draw slabs colored by their intervals.
Examples
library(ggplot2)
library(dplyr)
library(scales)
library(distributional)
theme_set(theme_ggdist())
# NOTE: cut_cdf_qi() used to be the recommended way to do intervals overlaid
# on densities, like this...
tibble(x = dist_normal(0, 1)) %>%
ggplot(aes(xdist = x)) +
stat_slab(
aes(fill = after_stat(cut_cdf_qi(cdf)))
) +
scale_fill_brewer(direction = -1)
# ... however this is now more easily and flexibly accomplished by directly
# mapping .width or level onto fill:
tibble(x = dist_normal(0, 1)) %>%
ggplot(aes(xdist = x)) +
stat_slab(
aes(fill = after_stat(level)),
.width = c(.66, .95, 1)
) +
scale_fill_brewer()
# See vignette("slabinterval") for more examples. The remaining examples
# below using cut_cdf_qi() are kept for posterity.
# With a halfeye (or other geom with slab and interval), NA values will
# show up in the fill scale from the CDF function applied to the internal
# interval geometry data and can be ignored, hence na.translate = FALSE
tibble(x = dist_normal(0, 1)) %>%
ggplot(aes(xdist = x)) +
stat_halfeye(aes(
fill = after_stat(cut_cdf_qi(cdf, .width = c(.5, .8, .95, 1)))
)) +
scale_fill_brewer(direction = -1, na.translate = FALSE)
# we could also use the labels parameter to apply nicer formatting
# and provide a better name for the legend, and omit the 100% interval
# if desired
tibble(x = dist_normal(0, 1)) %>%
ggplot(aes(xdist = x)) +
stat_halfeye(aes(
fill = after_stat(cut_cdf_qi(
cdf,
.width = c(.5, .8, .95),
labels = percent_format(accuracy = 1)
))
)) +
labs(fill = "Interval") +
scale_fill_brewer(direction = -1, na.translate = FALSE)
Bounded density estimator using the reflection method
Description
Bounded density estimator using the reflection method.
Supports automatic partial function application with waived arguments.
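To give a rough idea of what the reflection method does, here is a minimal base R sketch. It is not the implementation used by density_bounded(): the helper name reflect_density(), the use of stats::bw.nrd0() as the bandwidth selector, and the fixed, known bounds are all assumptions made for this illustration.

```r
# Reflection-method sketch (illustrative only; not ggdist's implementation).
# reflect_density() is a hypothetical helper name.
reflect_density = function(x, lower = 0, upper = 1, n = 512) {
  # augment the sample with mirror images of itself across each bound
  x_aug = c(2 * lower - x, x, 2 * upper - x)
  # choose the bandwidth from the original sample (assumed selector)
  bw = stats::bw.nrd0(x)
  # estimate the density of the augmented sample, evaluated within the bounds
  d = stats::density(x_aug, bw = bw, from = lower, to = upper, n = n)
  # the augmented sample has three times as many points, so rescale so
  # that the estimate integrates to approximately 1 on [lower, upper]
  d$y = d$y * 3
  d
}

set.seed(123)
x = rbeta(5000, 1, 3)
d = reflect_density(x)

# compare with an unbounded estimate near the lower bound, where the
# true Beta(1, 3) density equals 3
d$y[1]
stats::density(x, from = 0, to = 1)$y[1]
```

Kernel mass that would otherwise leak outside the bounds is folded back in by the mirrored copies, which is why the reflected estimate stays close to the true density at the boundary while the unbounded one drops to roughly half of it.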
Usage
density_bounded(
x,
weights = NULL,
n = 501,
bandwidth = "dpi",
adjust = 1,
kernel = "gaussian",
trim = TRUE,
bounds = c(NA, NA),
bounder = "cdf",
adapt = 1,
na.rm = FALSE,
...,
range_only = FALSE
)
Arguments
x |
<numeric> Sample to compute a density estimate for. |
weights |
|
n |
<scalar numeric> The number of grid points to evaluate the density estimator at. |
bandwidth |
<scalar numeric | function | string> Bandwidth of the density estimator. One of:
|
adjust |
<scalar numeric> Value to multiply the bandwidth of the density estimator by. Default |
kernel |
<string> The smoothing kernel to be used. This must partially
match one of |
trim |
<scalar logical> Should the density estimate be trimmed to the range of the data? Default |
bounds |
<length-2 numeric> Min and max bounds. If a bound is |
bounder |
<function | string> Method to use to find missing
(
|
adapt |
<positive integer> (very experimental) The name and interpretation of this argument
are subject to change without notice. If |
na.rm |
<scalar logical> Should missing ( |
... |
Additional arguments (ignored). |
range_only |
<scalar logical> If |
Value
An object of class "density", mimicking the output format of
stats::density(), with the following components:
- x: The grid of points at which the density was estimated.
- y: The estimated density values.
- bw: The bandwidth.
- n: The sample size of the x input argument.
- call: The call used to produce the result, as a quoted expression.
- data.name: The deparsed name of the x input argument.
- has.na: Always FALSE (for compatibility).
- cdf: Values of the (possibly weighted) empirical cumulative distribution function at x. See weighted_ecdf().
This allows existing methods for density objects, like print() and plot(), to work if desired.
This output format (and in particular, the x and y components) is also
the format expected by the density argument of the stat_slabinterval()
and the smooth_ family of functions.
References
Cooke, P. (1979). Statistical inference for bounds of random variables. Biometrika 66(2), 367–374. doi:10.1093/biomet/66.2.367.
Loh, W. Y. (1984). Estimating an endpoint of a distribution with resampling methods. The Annals of Statistics 12(4), 1543–1550. doi:10.1214/aos/1176346811
See Also
Other density estimators: density_histogram(), density_unbounded()
Examples
library(distributional)
library(dplyr)
library(ggplot2)
# For compatibility with existing code, the return type of density_bounded()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_bounded(x)
d
# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_bounded() can also be used with
# base::plot():
plot(d)
# here we'll use the same data as above, but pick either density_bounded()
# or density_unbounded() (which is equivalent to stats::density()). Notice
# how the unbounded density (green) is biased near the boundary of the support,
# while the bounded density (orange) is not.
data.frame(x) %>%
ggplot() +
stat_slab(
aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
alpha = 0.25
) +
stat_slab(aes(x), density = "bounded", fill = NA, color = "#d95f02", alpha = 0.5) +
stat_slab(aes(x), density = "unbounded", fill = NA, color = "#1b9e77", alpha = 0.5) +
scale_thickness_shared() +
theme_ggdist()
# We can also supply arguments to the density estimators by using their
# full function names instead of the string suffix; e.g. we can supply
# the exact bounds of c(0,1) rather than using the bounds of the data.
data.frame(x) %>%
ggplot() +
stat_slab(
aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
alpha = 0.25
) +
stat_slab(
aes(x), fill = NA, color = "#d95f02", alpha = 0.5,
density = density_bounded(bounds = c(0,1))
) +
scale_thickness_shared() +
theme_ggdist()
Histogram density estimator
Description
Histogram density estimator.
Supports automatic partial function application with waived arguments.
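For readers unfamiliar with histogram density estimation, the base R sketch below shows the underlying calculation using hist(). It does not call density_histogram() and makes no claim about its internals; it only shows how bin counts become density values.

```r
set.seed(123)
x = rbeta(5000, 1, 3)

# bin the sample without plotting, using Scott's rule for the breaks
h = hist(x, breaks = "Scott", plot = FALSE)

# density in each bin = count / (sample size * bin width)
bin_width = diff(h$breaks)
dens = h$counts / (length(x) * bin_width)

# the bars integrate to 1, as a density should
sum(dens * bin_width)
```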
Usage
density_histogram(
x,
weights = NULL,
breaks = "Scott",
align = "none",
outline_bars = FALSE,
right_closed = TRUE,
outermost_closed = TRUE,
na.rm = FALSE,
...,
range_only = FALSE
)
Arguments
x |
<numeric> Sample to compute a density estimate for. |
weights |
|
breaks |
<numeric | function | string> Determines the breakpoints defining bins. Default
For example, |
align |
<scalar numeric | function | string> Determines how to align the breakpoints defining bins. Default
For example, |
outline_bars |
<scalar logical> Should outlines in between the bars (i.e. density values of 0) be included? |
right_closed |
<scalar logical> Should the right edge of each bin be closed? For
a bin with endpoints
Equivalent to the |
outermost_closed |
<scalar logical> Should values on the edges of the outermost (first
or last) bins always be included in those bins? If Equivalent to the |
na.rm |
<scalar logical> Should missing ( |
... |
Additional arguments (ignored). |
range_only |
<scalar logical> If |
Value
An object of class "density", mimicking the output format of
stats::density(), with the following components:
- x: The grid of points at which the density was estimated.
- y: The estimated density values.
- bw: The bandwidth.
- n: The sample size of the x input argument.
- call: The call used to produce the result, as a quoted expression.
- data.name: The deparsed name of the x input argument.
- has.na: Always FALSE (for compatibility).
- cdf: Values of the (possibly weighted) empirical cumulative distribution function at x. See weighted_ecdf().
This allows existing methods for density objects, like print() and plot(), to work if desired.
This output format (and in particular, the x and y components) is also
the format expected by the density argument of the stat_slabinterval()
and the smooth_ family of functions.
See Also
Other density estimators: density_bounded(), density_unbounded()
Examples
library(distributional)
library(dplyr)
library(ggplot2)
# For compatibility with existing code, the return type of density_histogram()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_histogram(x)
d
# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_histogram() can also be used with
# base::plot():
plot(d)
# here we'll use the same data as above with stat_slab():
data.frame(x) %>%
ggplot() +
stat_slab(
aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
alpha = 0.25
) +
stat_slab(aes(x), density = "histogram", fill = NA, color = "#d95f02", alpha = 0.5) +
scale_thickness_shared() +
theme_ggdist()
Unbounded density estimator
Description
Unbounded density estimator using stats::density().
Supports automatic partial function application with waived arguments.
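The partial application mentioned above can be sketched as follows, assuming ggdist is installed. The variable name half_bw and the choice of adjust = 0.5 are arbitrary choices for this example.

```r
library(ggdist)

set.seed(123)
x = rbeta(5000, 1, 3)

# calling the estimator without `x` returns a partially-applied function ...
half_bw = density_unbounded(adjust = 0.5)

# ... which can later be applied to a sample (or passed as the `density`
# argument of stat_slabinterval() and its shortcut stats)
d = half_bw(x)
class(d)
```

This is the same mechanism used when a call such as density_bounded(bounds = c(0, 1)) is supplied as the density argument of a stat.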
Usage
density_unbounded(
x,
weights = NULL,
n = 501,
bandwidth = "dpi",
adjust = 1,
kernel = "gaussian",
trim = TRUE,
adapt = 1,
na.rm = FALSE,
...,
range_only = FALSE
)
Arguments
x |
<numeric> Sample to compute a density estimate for. |
weights |
|
n |
<scalar numeric> The number of grid points to evaluate the density estimator at. |
bandwidth |
<scalar numeric | function | string> Bandwidth of the density estimator. One of:
|
adjust |
<scalar numeric> Value to multiply the bandwidth of the density estimator by. Default |
kernel |
<string> The smoothing kernel to be used. This must partially
match one of |
trim |
<scalar logical> Should the density estimate be trimmed to the range of the data? Default |
adapt |
<positive integer> (very experimental) The name and interpretation of this argument
are subject to change without notice. If |
na.rm |
<scalar logical> Should missing ( |
... |
Additional arguments (ignored). |
range_only |
<scalar logical> If |
Value
An object of class "density", mimicking the output format of
stats::density(), with the following components:
- x: The grid of points at which the density was estimated.
- y: The estimated density values.
- bw: The bandwidth.
- n: The sample size of the x input argument.
- call: The call used to produce the result, as a quoted expression.
- data.name: The deparsed name of the x input argument.
- has.na: Always FALSE (for compatibility).
- cdf: Values of the (possibly weighted) empirical cumulative distribution function at x. See weighted_ecdf().
This allows existing methods for density objects, like print() and plot(), to work if desired.
This output format (and in particular, the x and y components) is also
the format expected by the density argument of the stat_slabinterval()
and the smooth_ family of functions.
See Also
Other density estimators: density_bounded(), density_histogram()
Examples
library(distributional)
library(dplyr)
library(ggplot2)
# For compatibility with existing code, the return type of density_unbounded()
# is the same as stats::density(), ...
set.seed(123)
x = rbeta(5000, 1, 3)
d = density_unbounded(x)
d
# ... thus, while designed for use with the `density` argument of
# stat_slabinterval(), output from density_unbounded() can also be used with
# base::plot():
plot(d)
# here we'll use the same data as above, but pick either density_bounded()
# or density_unbounded() (which is equivalent to stats::density()). Notice
# how the unbounded density (green) is biased near the boundary of the support,
# while the bounded density (orange) is not.
data.frame(x) %>%
ggplot() +
stat_slab(
aes(xdist = dist), data = data.frame(dist = dist_beta(1, 3)),
alpha = 0.25
) +
stat_slab(aes(x), density = "bounded", fill = NA, color = "#d95f02", alpha = 0.5) +
stat_slab(aes(x), density = "unbounded", fill = NA, color = "#1b9e77", alpha = 0.5) +
scale_thickness_shared() +
theme_ggdist()
Dynamically select a good bin width for a dotplot
Description
Searches for a nice-looking bin width to use to draw a dotplot such that
the height of the dotplot fits within a given space (maxheight).
Usage
find_dotplot_binwidth(
x,
maxheight,
heightratio = 1,
stackratio = 1,
layout = c("bin", "weave", "hex", "swarm", "bar")
)
Arguments
x |
<numeric> Data values. |
maxheight |
<scalar numeric> Maximum height of the dotplot. |
heightratio |
<scalar numeric> Ratio of bin width to dot height. |
stackratio |
<scalar numeric> Ratio of dot height to vertical distance between dot centers |
layout |
<string> The layout method used for the dots. One of:
|
Details
This dynamic bin selection algorithm uses a binary search over the number of
bins to find a bin width such that if the input data (x) is binned
using a Wilkinson-style dotplot algorithm the height of the tallest bin
will be less than maxheight.
This algorithm is used by geom_dotsinterval() (and its variants) to automatically
select bin widths. Unless you are manually implementing your own dotplot grob
or geom, you probably do not need to use this function directly.
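The idea of a binary search over the number of bins can be sketched in base R as below. This is a deliberately simplified illustration, not the algorithm used by find_dotplot_binwidth(): the helper names tallest_stack() and sketch_binwidth() are hypothetical, it uses equal-width bins rather than Wilkinson-style binning, and it assumes each dot is exactly as tall as the bin is wide.

```r
# Height of the tallest stack when x is split into `nbins` equal-width bins
# and every dot is as tall as the bin is wide (an assumption of this sketch).
tallest_stack = function(x, nbins) {
  binwidth = diff(range(x)) / nbins
  # bin index of each value; the maximum of x is folded into the last bin
  bin = pmin(floor((x - min(x)) / binwidth) + 1, nbins)
  max(tabulate(bin, nbins)) * binwidth
}

# Binary search for the smallest number of bins (i.e. the largest bin width)
# whose tallest stack fits within maxheight. Assumes stack height shrinks as
# bins are added and that one bin per data point always fits.
sketch_binwidth = function(x, maxheight) {
  lo = 1
  hi = length(x)
  while (lo < hi) {
    mid = (lo + hi) %/% 2
    if (tallest_stack(x, mid) <= maxheight) {
      hi = mid      # fits: try fewer, wider bins
    } else {
      lo = mid + 1  # too tall: need more, narrower bins
    }
  }
  diff(range(x)) / lo
}

x = qnorm(ppoints(20))
binwidth = sketch_binwidth(x, maxheight = 4)
binwidth
```

Searching over the number of bins rather than over the bin width directly keeps the search space finite: there are at most length(x) candidate values to consider.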
Value
A suitable bin width such that a dotplot created with this bin width
and heightratio should have its tallest bin be less than or equal to maxheight.
See Also
bin_dots() for an algorithm that can bin dots using bin widths selected
by this function; geom_dotsinterval() for geometries that use
these algorithms to create dotplots.
Examples
library(dplyr)
library(ggplot2)
x = qnorm(ppoints(20))
binwidth = find_dotplot_binwidth(x, maxheight = 4, heightratio = 1)
binwidth
bin_df = bin_dots(x = x, y = 0, binwidth = binwidth, heightratio = 1)
bin_df
# we can manually plot the binning above, though this is only recommended
# if you are using find_dotplot_binwidth() and bin_dots() to build your own
# grob. For practical use it is much easier to use geom_dots(), which will
# automatically select good bin widths for you (and which uses
# find_dotplot_binwidth() and bin_dots() internally)
bin_df %>%
ggplot(aes(x = x, y = y)) +
geom_point(size = 4) +
coord_fixed()
Blurry dot plot (geom)
Description
Variant of geom_dots() for creating blurry dotplots. Accepts an sd
aesthetic that gives the standard deviation of the blur applied to the dots.
Requires a graphics engine supporting radial gradients. Unlike geom_dots(),
this geom only supports circular and square shapes.
Usage
geom_blur_dots(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
blur = "gaussian",
binwidth = NA,
dotsize = 1.07,
stackratio = 1,
layout = "bin",
overlaps = "nudge",
smooth = "none",
overflow = "warn",
verbose = FALSE,
orientation = NA,
subguide = "slab",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
blur |
<function | string> Blur function to apply to dots. One of:
|
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
layout |
<string> The layout method used for the dots. One of:
|
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
The dots family of stats and geoms is similar to ggplot2::geom_dotplot()
but with a number of differences:
- Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
- Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
- Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
- The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
- geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
- geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Uses side = "both" by default, and sets the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
- stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
- geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
- stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
- geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
- stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument,
are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty
using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
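A quantile dotplot of the kind described above might be built along these lines. This is a minimal sketch assuming ggplot2 and ggdist are installed; the simulated data and the choice of 100 quantiles are arbitrary.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(x = rnorm(5000))

# plotting all 5000 values would give very small dots; with quantiles = 100
# the dotplot instead shows 100 quantiles of x
p = ggplot(df, aes(x = x)) +
  stat_dots(quantiles = 100)
p
```

With 100 dots, each dot can be read as roughly a 1-in-100 chance, which is the frequency framing referred to above.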
Value
A ggplot2::Geom representing a blurry dot geometry which can
be added to a ggplot() object.
Aesthetics
The dots+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the dots (aka the slab), the
point, and the interval.
Positional aesthetics
- x: x position of the geometry
- y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
- sd: The standard deviation (in data units) of the blur associated with each dot.
- order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
- datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
- slab_fill: Override for fill: the fill color of the slab.
- slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
- slab_alpha: Override for alpha: the opacity of the slab.
- slab_linewidth: Override for linewidth: the width of the outline of the slab.
- slab_linetype: Override for linetype: the line type of the outline of the slab.
- slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
- point_fill: Override for fill: the fill color of the point.
- point_colour: (or point_color) Override for colour/color: the outline color of the point.
- point_alpha: Override for alpha: the opacity of the point.
- point_size: Override for size: the size of the point.
Deprecated aesthetics
- slab_size: Use slab_linewidth.
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
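As a brief illustration of a few of the aesthetics listed above, the sketch below sets side, scale, fill, and slab_colour to fixed values. It uses geom_dots(), which shares these aesthetics with the rest of the dots family, so that it does not depend on a graphics device with radial gradient support. The data and the particular values are arbitrary choices for this example.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = rep(c("a", "b"), each = 50),
  value = rnorm(100)
)

p = ggplot(df, aes(x = value, y = group)) +
  geom_dots(
    side = "both",          # mirror the dots on both sides, as in a violin plot
    scale = 0.5,            # use half of the region allocated to each group
    fill = "skyblue",       # fill color of the dots
    slab_colour = "gray30"  # outline color of the dots
  )
p
```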
References
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See Also
See geom_dotsinterval() for the geometry this shortcut is based on.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms: geom_dots(), geom_dotsinterval(), geom_swarm(), geom_weave()
Examples
library(dplyr)
library(ggplot2)
theme_set(theme_ggdist())
set.seed(1234)
x = rnorm(1000)
# manually calculate quantiles and their MCSE
# this could also be done more succinctly with stat_mcse_dots()
p = ppoints(100)
df = data.frame(
q = quantile(x, p),
se = posterior::mcse_quantile(x, p)
)
df %>%
ggplot(aes(x = q, sd = se)) +
geom_blur_dots()
df %>%
ggplot(aes(x = q, sd = se)) +
# or blur = blur_interval(.width = .95) to set the interval width
geom_blur_dots(blur = "interval")
Dot plot (shortcut geom)
Description
Shortcut version of geom_dotsinterval()
for creating dot plots.
Geoms based on geom_dotsinterval()
create dotplots that automatically
ensure the plot fits within the available space.
Roughly equivalent to:
geom_dotsinterval(show_point = FALSE, show_interval = FALSE)
Usage
geom_dots(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
binwidth = NA,
dotsize = 1.07,
stackratio = 1,
layout = "bin",
overlaps = "nudge",
smooth = "none",
overflow = "warn",
verbose = FALSE,
orientation = NA,
subguide = "slab",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
layout |
<string> The layout method used for the dots. One of:
|
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
The dots family of stats and geoms is similar to ggplot2::geom_dotplot()
but with a number of differences:
- Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
- Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
- Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
- The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
- geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
- geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Uses side = "both" by default, and sets the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
- stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
- geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
- stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
- geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
- stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument,
are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty
using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
Value
A ggplot2::Geom representing a dot geometry which can
be added to a ggplot() object.
Aesthetics
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
Positional aesthetics
- x: x position of the geometry
- y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
- family: The font family used to draw the dots.
- order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
- datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
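The side and scale aesthetics above can be sketched as follows, passing them as constants to the layer. The data frame df is a made-up example mirroring the one used in the Examples section. Where the interval ends up when justification is left as NULL follows the rule described above and is not set explicitly here.

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
# Illustrative data only: two groups of normal draws
df = data.frame(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# dots mirrored on both sides (as in a violin plot), using only half of
# the region allocated to each group
p_both = ggplot(df, aes(x = value, y = g)) +
  geom_dots(side = "both", scale = 0.5)

# dots drawn below the interval; justification is left at its default (NULL)
p_bottom = ggplot(df, aes(x = value, y = g)) +
  stat_dotsinterval(side = "bottom")
```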
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
- slab_fill: Override for fill: the fill color of the slab.
- slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
- slab_alpha: Override for alpha: the opacity of the slab.
- slab_linewidth: Override for linewidth: the width of the outline of the slab.
- slab_linetype: Override for linetype: the line type of the outline of the slab.
- slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
- point_fill: Override for fill: the fill color of the point.
- point_colour: (or point_color) Override for colour/color: the outline color of the point.
- point_alpha: Override for alpha: the opacity of the point.
- point_size: Override for size: the size of the point.
Deprecated aesthetics
- slab_size: Use slab_linewidth.
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
References
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See Also
See stat_dots() for the stat version, intended for use on sample data or analytical distributions.
See geom_dotsinterval() for the geometry this shortcut is based on.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms: geom_blur_dots(), geom_dotsinterval(), geom_swarm(), geom_weave()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_dots()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_dots()
Automatic dotplot + point + interval meta-geom
Description
This meta-geom supports drawing combinations of dotplots, points, and intervals. Geoms and stats based on geom_dotsinterval() create dotplots that automatically determine a bin width that ensures the plot fits within the available space. They also ensure dots do not overlap, and allow the generation of quantile dotplots using the quantiles argument to stat_dotsinterval()/stat_dots().
This meta-geom generally follows the naming scheme and arguments of the geom_slabinterval() and stat_slabinterval() family of geoms and stats.
Usage
geom_dotsinterval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  binwidth = NA,
  dotsize = 1.07,
  stackratio = 1,
  layout = "bin",
  overlaps = "nudge",
  smooth = "none",
  overflow = "warn",
  verbose = FALSE,
  orientation = NA,
  interval_size_domain = c(1, 6),
  interval_size_range = c(0.6, 1.4),
  fatten_point = 1.8,
  arrow = NULL,
  show_slab = TRUE,
  show_point = TRUE,
  show_interval = TRUE,
  subguide = "slab",
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
layout |
<string> The layout method used for the dots. One of:
|
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
interval_size_domain |
<length-2 numeric> Minimum and maximum of the values of the |
interval_size_range |
<length-2 numeric> This geom scales the raw size aesthetic values when
drawing interval and point sizes, as they tend to be too thick when using
the default settings of |
fatten_point |
<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the
size of the thickest interval line. If you wish to specify point sizes directly, you can also use
the |
arrow |
<arrow | NULL> Type of arrow heads to use on the interval, or |
show_slab |
<scalar logical> Should the slab portion of the geom be drawn? |
show_point |
<scalar logical> Should the point portion of the geom be drawn? |
show_interval |
<scalar logical> Should the interval portion of the geom be drawn? |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
The dots family of stats and geoms are similar to ggplot2::geom_dotplot() but with a number of differences:
- Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
- Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
- Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
- The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
- geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
- geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Use side = "both" by default, and set the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
- stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
- geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
- stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
- geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
- stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
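To make the differences between family members more concrete, here is a small sketch that builds three variants from the same made-up data frame (df is illustrative only). It also shows the two ways of changing dot shape mentioned above: shape for the dots family and slab_shape for the dotsinterval family.

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
# Illustrative data only: two groups of normal draws
df = data.frame(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# dots family: dot size shrinks as needed to fit; shape sets the dot shape
p_dots = ggplot(df, aes(x = value, y = g)) +
  geom_dots(shape = 23)

# beeswarm-style defaults: mirrored layout with point-sized dots
p_swarm = ggplot(df, aes(x = g, y = value)) +
  geom_swarm()

# dotsinterval family: shape refers to the point summary, so the dots
# are changed through slab_shape instead
p_dotsinterval = ggplot(df, aes(x = value, y = g)) +
  stat_dotsinterval(slab_shape = 23)
```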
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
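The two routes described above can be sketched as follows. The data frames params and priors are hypothetical. The second plot assumes that parse_dist() adds a .dist_obj column holding distribution objects alongside .dist and .args, which is the behavior in recent ggdist versions.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical parameter table: one Normal distribution per group
params = data.frame(
  group = c("a", "b"),
  mu = c(0, 3),
  sigma = c(0.75, 1)
)

# dist_normal() is vectorized, so columns can be passed to it inside aes()
p_xdist = ggplot(params, aes(y = group, xdist = dist_normal(mu, sigma))) +
  stat_dotsinterval(quantiles = 100)

# Hypothetical human-readable specs, parsed into distribution columns
priors = data.frame(
  group = c("a", "b"),
  prior = c("normal(0, 0.75)", "normal(3, 1)")
)
parsed = parse_dist(priors, prior)

p_parsed = ggplot(parsed, aes(y = group, xdist = .dist_obj)) +
  stat_dotsinterval(quantiles = 100)
```

Using xdist rather than dist keeps orientation detection working, which is the reason given above for preferring it.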
Value
A ggplot2::Geom or ggplot2::Stat representing a dotplot or combined dotplot+interval geometry which can be added to a ggplot() object.
Aesthetics
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
Positional aesthetics
- x: x position of the geometry
- y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
- family: The font family used to draw the dots.
- order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
- datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
- slab_fill: Override for fill: the fill color of the slab.
- slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
- slab_alpha: Override for alpha: the opacity of the slab.
- slab_linewidth: Override for linewidth: the width of the outline of the slab.
- slab_linetype: Override for linetype: the line type of the outline of the slab.
- slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
- point_fill: Override for fill: the fill color of the point.
- point_colour: (or point_color) Override for colour/color: the outline color of the point.
- point_alpha: Override for alpha: the opacity of the point.
- point_size: Override for size: the size of the point.
Deprecated aesthetics
- slab_size: Use slab_linewidth.
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
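As a rough illustration of the override aesthetics in this section, the sketch below styles the three sub-geometries of a single layer independently by passing constants. The data frame df and the specific color choices are made up for the example.

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
# Illustrative data only: two groups of normal draws
df = data.frame(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

p_overrides = ggplot(df, aes(x = value, y = g)) +
  stat_dotsinterval(
    slab_fill = "skyblue",       # fill of the dots only
    slab_colour = "steelblue",   # outline of the dots only
    interval_colour = "gray30",  # interval line only
    point_colour = "black",      # point summary only
    point_size = 3               # bypasses the interval_size_* / fatten_point scaling
  )
```

The same override aesthetics can also be mapped inside aes() when they should vary with the data.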
Author(s)
Matthew Kay
References
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See Also
See the stat_slabinterval() family for other stats built on top of geom_slabinterval().
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms: geom_blur_dots(), geom_dots(), geom_swarm(), geom_weave()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_dotsinterval()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_dotsinterval()

# stat_dots can summarize quantiles, creating quantile dotplots
data(RankCorr_u_tau, package = "ggdist")

RankCorr_u_tau %>%
  ggplot(aes(x = u_tau, y = factor(i))) +
  stat_dots(quantiles = 100)

# color and fill aesthetics can be mapped within the geom
# dotsinterval adds an interval
RankCorr_u_tau %>%
  ggplot(aes(x = u_tau, y = factor(i), fill = after_stat(x > 6))) +
  stat_dotsinterval(quantiles = 100)
Multiple-interval plot (shortcut geom)
Description
Shortcut version of geom_slabinterval() for creating multiple-interval plots.
Roughly equivalent to:
geom_slabinterval(
  aes(datatype = "interval", side = "both"),
  interval_size_range = c(1, 6),
  show_slab = FALSE,
  show_point = FALSE
)
Usage
geom_interval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  orientation = NA,
  interval_size_range = c(1, 6),
  interval_size_domain = c(1, 6),
  arrow = NULL,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
interval_size_range |
<length-2 numeric> This geom scales the raw size aesthetic values when
drawing interval and point sizes, as they tend to be too thick when using
the default settings of |
interval_size_domain |
<length-2 numeric> Minimum and maximum of the values of the |
arrow |
<arrow | NULL> Type of arrow heads to use on the interval, or |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
This geom wraps geom_slabinterval() with defaults designed to produce multiple-interval plots. Default aesthetic mappings are applied if the .width column is present in the input data (e.g., as generated by the point_interval() family of functions), making this geom often more convenient than vanilla ggplot2 geometries when used with functions like median_qi(), mean_qi(), mode_hdi(), etc.
Specifically, if .width is present in the input, geom_interval() acts as if its default aesthetics are aes(colour = forcats::fct_rev(ordered(.width))).
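To show what that default mapping amounts to, here is a sketch that writes the colour mapping out by hand. It uses a base R expression that reverses the level order in the same way as forcats::fct_rev(ordered(.width)), so forcats is not needed. The data frame draws is made up for illustration.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(1)
# Illustrative data only: 500 draws for each of two groups
draws = data.frame(
  g = rep(c("a", "b"), each = 500),
  value = rnorm(1000, mean = rep(c(0, 2), each = 500))
)

# median_qi() adds .lower, .upper, and .width columns, one row per
# group and interval width
intervals = draws %>%
  group_by(g) %>%
  median_qi(value, .width = c(.5, .8, .95))

# explicit version of the default colour mapping: an ordered factor of
# .width with the widest interval as the first level
p_interval = ggplot(
  intervals,
  aes(
    y = g, x = value, xmin = .lower, xmax = .upper,
    colour = ordered(.width, levels = sort(unique(.width), decreasing = TRUE))
  )
) +
  geom_interval() +
  scale_colour_brewer()
```

In practice the colour mapping can simply be omitted, since geom_interval() applies the equivalent default whenever .width is present.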
Value
A ggplot2::Geom representing a multiple-interval geometry which can be added to a ggplot() object.
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
Positional aesthetics
- x: x position of the geometry
- y: y position of the geometry
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Deprecated aesthetics
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See Also
See stat_interval() for the stat version, intended for use on sample data or analytical distributions.
See geom_slabinterval() for the geometry this shortcut is based on.
Other slabinterval geoms: geom_pointinterval(), geom_slab(), geom_spike()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

data(RankCorr_u_tau, package = "ggdist")

# orientation is detected automatically based on
# use of xmin/xmax or ymin/ymax
RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.5, .8, .95, .99)) %>%
  ggplot(aes(y = i, x = u_tau, xmin = .lower, xmax = .upper)) +
  geom_interval() +
  scale_color_brewer()

RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.5, .8, .95, .99)) %>%
  ggplot(aes(x = i, y = u_tau, ymin = .lower, ymax = .upper)) +
  geom_interval() +
  scale_color_brewer()
Line + multiple-ribbon plots (ggplot geom)
Description
A combination of geom_line() and geom_ribbon() with default aesthetics designed for use with output from point_interval().
Usage
geom_lineribbon(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  step = FALSE,
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed to |
step |
<scalar logical | string> Should the line/ribbon be drawn as a step function? One of:
|
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
geom_lineribbon() is a combination of a geom_line() and geom_ribbon() designed for use with output from point_interval(). This geom sets some default aesthetics equal to the .width column generated by the point_interval() family of functions, making it often more convenient than a vanilla geom_ribbon() + geom_line().
Specifically, geom_lineribbon() acts as if its default aesthetics are aes(fill = forcats::fct_rev(ordered(.width))).
Value
A ggplot2::Geom representing a combined line + multiple-ribbon geometry which can be added to a ggplot() object.
Aesthetics
The line+ribbon stats and geoms have a wide variety of aesthetics that control the appearance of their two sub-geometries: the line and the ribbon.
Positional aesthetics
- x: x position of the geometry
- y: y position of the geometry
Ribbon-specific aesthetics
xmin
: Left edge of the ribbon sub-geometry (iforientation = "horizontal"
).xmax
: Right edge of the ribbon sub-geometry (iforientation = "horizontal"
).ymin
: Lower edge of the ribbon sub-geometry (iforientation = "vertical"
).ymax
: Upper edge of the ribbon sub-geometry (iforientation = "vertical"
).order
: The order in which ribbons are drawn. Ribbons with the smallest mean value oforder
are drawn first (i.e., will be drawn below ribbons with larger mean values oforder
). Iforder
is not supplied togeom_lineribbon()
,-abs(xmax - xmin)
or-abs(ymax - ymax)
(depending onorientation
) is used, having the effect of drawing the widest (on average) ribbons on the bottom.stat_lineribbon()
usesorder = after_stat(level)
by default, causing the ribbons generated from the largest.width
to be drawn on the bottom.
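The default drawing order can be illustrated outside of ggplot. The sketch below computes the mean of -abs(ymax - ymin) for each interval width and sorts by it; it is only an illustration of the rule described above, not the package's internal code, and the names ribbon_order and mean_order are hypothetical.

```r
library(dplyr)
library(ggdist)

set.seed(12345)
intervals = tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  group_by(x) %>%
  median_qi(y, .width = c(.5, .8, .95))

# one row per ribbon: the mean of -abs(ymax - ymin) across x;
# the smallest (most negative) value belongs to the widest ribbon
ribbon_order = intervals %>%
  ungroup() %>%
  group_by(.width) %>%
  summarise(mean_order = mean(-abs(.upper - .lower))) %>%
  arrange(mean_order)

ribbon_order
```

Rows appear in drawing order, so the first row is the ribbon drawn on the bottom.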
Color aesthetics
colour: (or color) The color of the line sub-geometry.
fill: The fill color of the ribbon sub-geometry.
alpha: The opacity of the line and ribbon sub-geometries.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of line. In ggplot2 < 3.4, was called size.
linetype: Type of line (e.g., "solid", "dashed", etc)
Other aesthetics (these work as in standard geoms)
group
See examples of some of these aesthetics in action in vignette("lineribbon").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
Author(s)
Matthew Kay
See Also
See stat_lineribbon() for a version that does summarizing of samples into points and intervals within ggplot. See geom_pointinterval() for a similar geom intended for point summaries and intervals. See geom_line() and geom_ribbon() for the geoms this is based on.
Examples
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  group_by(x) %>%
  median_qi(.width = c(.5, .8, .95)) %>%
  ggplot(aes(x = x, y = y, ymin = .lower, ymax = .upper)) +
  # automatically uses aes(fill = forcats::fct_rev(ordered(.width)))
  geom_lineribbon() +
  scale_fill_brewer()
Point + multiple-interval plot (shortcut geom)
Description
Shortcut version of geom_slabinterval() for creating point + multiple-interval plots.
Roughly equivalent to:
geom_slabinterval(
  aes(datatype = "interval", side = "both"),
  show_slab = FALSE,
  show.legend = c(size = FALSE)
)
Usage
geom_pointinterval(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
orientation = NA,
interval_size_domain = c(1, 6),
interval_size_range = c(0.6, 1.4),
fatten_point = 1.8,
arrow = NULL,
na.rm = FALSE,
show.legend = c(size = FALSE),
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
interval_size_domain |
<length-2 numeric> Minimum and maximum of the values of the |
interval_size_range |
<length-2 numeric> This geom scales the raw size aesthetic values when
drawing interval and point sizes, as they tend to be too thick when using
the default settings of |
fatten_point |
<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the
size of the thickest interval line. If you wish to specify point sizes directly, you can also use
the |
arrow |
<arrow | NULL> Type of arrow heads to use on the interval, or |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends?
Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
This geom wraps geom_slabinterval() with defaults designed to produce point + multiple-interval plots. Default aesthetic mappings are applied if the .width column is present in the input data (e.g., as generated by the point_interval() family of functions), making this geom often more convenient than vanilla ggplot2 geometries when used with functions like median_qi(), mean_qi(), mode_hdi(), etc.
Specifically, if .width is present in the input, geom_pointinterval() acts as if its default aesthetics are aes(size = -.width).
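As a minimal sketch of what that default amounts to, the mapping can be written out explicitly. This assumes a summary table produced by median_qi(), as in the Examples section; the object name summaries is hypothetical.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

data(RankCorr_u_tau, package = "ggdist")

# one row per group and interval width, with .lower, .upper, and .width columns
summaries = RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(u_tau, .width = c(.8, .95))

# size = -.width gives narrower intervals a larger raw size, so the 80%
# interval is drawn thicker than the 95% interval
p = ggplot(summaries, aes(y = i, x = u_tau, xmin = .lower, xmax = .upper, size = -.width)) +
  geom_pointinterval()
```

Writing the mapping explicitly should only be necessary when the width column has a different name or a different size encoding is wanted.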
Value
A ggplot2::Geom representing a point + multiple-interval geometry which can be added to a ggplot() object.
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See Also
See stat_pointinterval() for the stat version, intended for use on sample data or analytical distributions.
See geom_slabinterval() for the geometry this shortcut is based on.
Other slabinterval geoms: geom_interval(), geom_slab(), geom_spike()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)

data(RankCorr_u_tau, package = "ggdist")

# orientation is detected automatically based on
# use of xmin/xmax or ymin/ymax

RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.8, .95)) %>%
  ggplot(aes(y = i, x = u_tau, xmin = .lower, xmax = .upper)) +
  geom_pointinterval()

RankCorr_u_tau %>%
  group_by(i) %>%
  median_qi(.width = c(.8, .95)) %>%
  ggplot(aes(x = i, y = u_tau, ymin = .lower, ymax = .upper)) +
  geom_pointinterval()
Slab (ridge) plot (shortcut geom)
Description
Shortcut version of geom_slabinterval() for creating slab (ridge) plots.
Roughly equivalent to:
geom_slabinterval(
  show_point = FALSE,
  show_interval = FALSE
)
Usage
geom_slab(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
orientation = NA,
subscale = "thickness",
normalize = "all",
fill_type = "segments",
subguide = "slab",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subscale |
<function | string> Sub-scale used to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
fill_type |
<string> What type of fill to use when the fill color or alpha varies within a slab. One of:
|
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Value
A ggplot2::Geom representing a slab (ridge) geometry which can be added to a ggplot() object.
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
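The effect of side is easiest to see by setting it as a constant on the layer. The following is a minimal sketch reusing the manually computed normal densities from the Examples section; the object names p_violin and p_bottom are hypothetical.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

# manually computed densities for three normal distributions
df = expand.grid(
  mean = 1:3,
  input = seq(-2, 6, length.out = 100)
) %>%
  mutate(
    group = letters[4 - mean],
    density = dnorm(input, mean, 1)
  )

# side = "both" mirrors each slab around its baseline (violin-like)
p_violin = ggplot(df, aes(y = group, x = input, thickness = density)) +
  geom_slab(side = "both")

# side = "bottom" draws each slab below its baseline instead of above it
p_bottom = ggplot(df, aes(y = group, x = input, thickness = density)) +
  geom_slab(side = "bottom")
```

Printing p_violin and p_bottom side by side shows how the same thickness values are placed relative to each group's position on the y axis.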
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Deprecated aesthetics
slab_size: Use slab_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See Also
See stat_slab() for the stat version, intended for use on sample data or analytical distributions.
See geom_slabinterval() for the geometry this shortcut is based on.
Other slabinterval geoms: geom_interval(), geom_pointinterval(), geom_spike()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

# we will manually demonstrate plotting a density with geom_slab(),
# though generally speaking this is easier to do using stat_slab(), which
# will determine sensible limits automatically and correctly adjust
# densities when using scale transformations
df = expand.grid(
  mean = 1:3,
  input = seq(-2, 6, length.out = 100)
) %>%
  mutate(
    group = letters[4 - mean],
    density = dnorm(input, mean, 1)
  )

# orientation is detected automatically based on
# use of x or y
df %>%
  ggplot(aes(y = group, x = input, thickness = density)) +
  geom_slab()

df %>%
  ggplot(aes(x = group, y = input, thickness = density)) +
  geom_slab()

# RIDGE PLOTS
# "ridge" plots can be created by increasing the slab height and
# setting the slab color
df %>%
  ggplot(aes(y = group, x = input, thickness = density)) +
  geom_slab(height = 2, color = "black")
Slab + point + interval meta-geom
Description
This meta-geom supports drawing combinations of functions (as slabs, aka ridge plots or joy plots), points, and intervals. It acts as a meta-geom for many other ggdist geoms that are wrappers around this geom, including eye plots, half-eye plots, CCDF barplots, and point+multiple interval plots, and supports both horizontal and vertical orientations, dodging (via the position argument), and relative justification of slabs with their corresponding intervals.
Usage
geom_slabinterval(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
orientation = NA,
subscale = "thickness",
normalize = "all",
fill_type = "segments",
interval_size_domain = c(1, 6),
interval_size_range = c(0.6, 1.4),
fatten_point = 1.8,
arrow = NULL,
show_slab = TRUE,
show_point = TRUE,
show_interval = TRUE,
subguide = "slab",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subscale |
<function | string> Sub-scale used to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
fill_type |
<string> What type of fill to use when the fill color or alpha varies within a slab. One of:
|
interval_size_domain |
<length-2 numeric> Minimum and maximum of the values of the |
interval_size_range |
<length-2 numeric> This geom scales the raw size aesthetic values when
drawing interval and point sizes, as they tend to be too thick when using
the default settings of |
fatten_point |
<scalar numeric> A multiplicative factor used to adjust the size of the point relative to the
size of the thickest interval line. If you wish to specify point sizes directly, you can also use
the |
arrow |
<arrow | NULL> Type of arrow heads to use on the interval, or |
show_slab |
<scalar logical> Should the slab portion of the geom be drawn? |
show_point |
<scalar logical> Should the point portion of the geom be drawn? |
show_interval |
<scalar logical> Should the interval portion of the geom be drawn? |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
geom_slabinterval() is a flexible meta-geom that you can use directly or through a variety of "shortcut" geoms that represent useful combinations of the various parameters of this geom. In many cases you will want to use the shortcut geoms instead as they create more useful mnemonic primitives, such as eye plots, half-eye plots, point+interval plots, or CCDF barplots.
The slab portion of the geom is much like a ridge or "joy" plot: it represents the value of a function scaled to fit between values on the x or y axis (depending on the value of orientation). Values of the functions are specified using the thickness aesthetic and are scaled to fit into scale times the distance between points on the relevant axis. E.g., if orientation is "horizontal", scale is 0.9, and y is a discrete variable, then the thickness aesthetic specifies the value of some function of x that is drawn for every y value and scaled to fit into 0.9 times the distance between points on the y axis.
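The arithmetic in that example can be sketched in a few lines of base R. The helper slab_extent() below is hypothetical and not part of ggdist; it assumes thickness values are normalized so that the largest value maps to 1, and that adjacent discrete positions are 1 unit apart. The package's actual scaling depends on the subscale and normalize parameters and may differ.

```r
# hypothetical helper, for illustration only: how far a slab extends from its
# baseline, in axis units, given raw thickness values
slab_extent = function(thickness, scale = 0.9, spacing = 1) {
  # assumption: thickness is rescaled so that its maximum maps to 1
  thickness / max(thickness) * scale * spacing
}

x = seq(-3, 3, length.out = 101)
extent = slab_extent(dnorm(x))

# the tallest part of the slab reaches scale * spacing above the baseline
max(extent)
```

With scale = 1 the peak would reach exactly the next discrete position, which is why slabs using the maximum range just touch each other at that setting.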
For the interval portion of the geom, x and y aesthetics specify the location of the point, and ymin/ymax or xmin/xmax (depending on the value of orientation) specify the endpoints of the interval. A scaling factor for interval line width and point size is applied through the interval_size_domain, interval_size_range, and fatten_point parameters. These scaling factors are designed to give multiple uncertainty intervals reasonable scaling at the default settings for scale_size_continuous().
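One way to picture the size scaling is as a linear map from interval_size_domain onto interval_size_range. The function below is a hypothetical sketch of such a map, not the package's internal implementation, which may clamp or otherwise adjust values; its defaults are taken from the Usage section above.

```r
# hypothetical sketch: linearly rescale raw size values from `domain` to `range`
rescale_interval_size = function(size, domain = c(1, 6), range = c(0.6, 1.4)) {
  (size - domain[1]) / (domain[2] - domain[1]) * (range[2] - range[1]) + range[1]
}

# the ends of the domain map to the ends of the range,
# and the midpoint of the domain maps to the midpoint of the range
rescaled = rescale_interval_size(c(1, 3.5, 6))
rescaled
```

Under this reading, widening interval_size_range increases the contrast in line width between intervals, while fatten_point separately controls how much larger the point is than the thickest interval line.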
As a combination geom, this geom expects a datatype aesthetic specifying which part of the geom a given row in the input data corresponds to: "slab" or "interval". However, specifying this aesthetic manually is typically only necessary if you use this geom directly; the numerous wrapper geoms will usually set this aesthetic for you as needed, and their use is recommended unless you have a very custom use case.
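For the direct-use case, the following is a minimal sketch of how input data might be laid out: slab rows carry thickness, interval rows carry xmin and xmax, and datatype tells the geom which is which. The data frame and column names (slab_rows, interval_rows, combined, g) are hypothetical, and the layout is an assumption modeled on the description above rather than a documented recipe.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

# rows describing the slab: a standard normal density evaluated on a grid
slab_rows = tibble(
  g = "a",
  x = seq(-3, 3, length.out = 101),
  thickness = dnorm(x),
  datatype = "slab"
)

# one row describing the point (x) and a 95% interval (xmin, xmax)
interval_rows = tibble(
  g = "a",
  x = 0,
  xmin = qnorm(0.025),
  xmax = qnorm(0.975),
  datatype = "interval"
)

# columns missing from one part are filled with NA in the other
combined = bind_rows(slab_rows, interval_rows)

p = ggplot(combined, aes(
  y = g, x = x, xmin = xmin, xmax = xmax,
  thickness = thickness, datatype = datatype
)) +
  geom_slabinterval()
```

In practice stat_slabinterval() and its shortcuts build this kind of table automatically, which is why they are usually the more convenient entry point.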
Wrapper geoms include:
In addition, the stat_slabinterval() family of stats uses geoms from the geom_slabinterval() family, and is often easier to use than using these geoms directly. Typically, the geom_* versions are meant for use with already-summarized data (such as intervals) and the stat_* versions summarize the data themselves (usually draws from a distribution) to produce the geom.
Value
A ggplot2::Geom representing a slab or combined slab+interval geometry which can be added to a ggplot() object.
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
Author(s)
Matthew Kay
See Also
See geom_lineribbon() for a combination geom designed for fit curves plus probability bands.
See geom_dotsinterval() for a combination geom designed for plotting dotplots with intervals.
See stat_slabinterval() for families of stats built on top of this geom for common use cases (like stat_halfeye()).
See vignette("slabinterval") for a variety of examples of use.
Examples
# geom_slabinterval() is typically not that useful on its own.
# See vignette("slabinterval") for a variety of examples of the use of its
# shortcut geoms and stats, which are more useful than using
# geom_slabinterval() directly.
Spike plot (ggplot2 geom)
Description
Geometry for drawing "spikes" (optionally with points on them) on top of geom_slabinterval() geometries: this geometry understands the scaling and positioning of the thickness aesthetic from geom_slabinterval(), which allows you to position spikes and points along a slab.
Usage
geom_spike(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
subguide = "spike",
orientation = NA,
subscale = "thickness",
normalize = "all",
arrow = NULL,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
subguide |
<function | string> Sub-guide used to annotate the
|
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subscale |
<function | string> Sub-scale used to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
arrow |
<arrow | NULL> Type of arrow heads to use on the spike, or |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
This geometry consists of a "spike" (vertical/horizontal line segment) and a "point" (at the end of the line segment). It uses the thickness aesthetic to determine where the endpoint of the line is, which allows it to be used with geom_slabinterval() geometries for labeling specific values of the thickness function.
Value
A ggplot2::Geom representing a spike geometry which can be added to a ggplot() object.
Aesthetics
The spike geom has a wide variety of aesthetics that control the appearance of its two sub-geometries: the spike and the point.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Spike-specific (aka Slab-specific) aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
Color aesthetics
colour: (or color) The color of the spike and point sub-geometries.
fill: The fill color of the point sub-geometry.
alpha: The opacity of the spike and point sub-geometries.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the spike sub-geometry.
size: Size of the point sub-geometry.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the spike.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See Also
See stat_spike() for the stat version, intended for
use on sample data or analytical distributions.
Other slabinterval geoms:
geom_interval(), geom_pointinterval(), geom_slab()
Examples
library(ggplot2)
library(distributional)
library(dplyr)

# geom_spike is easiest to use with distributional or
# posterior::rvar objects
df = tibble(
  d = dist_normal(1:2, 1:2), g = c("a", "b")
)

# annotate the density at the mean of a distribution
df %>% mutate(
  mean = mean(d),
  density(d, list(density_at_mean = mean))
) %>%
  ggplot(aes(y = g)) +
  stat_slab(aes(xdist = d)) +
  geom_spike(aes(x = mean, thickness = density_at_mean)) +
  # need shared thickness scale so that stat_slab and geom_spike line up
  scale_thickness_shared()

# annotate the endpoints of intervals of a distribution
# here we'll use an arrow instead of a point by setting size = 0
arrow_spec = arrow(angle = 45, type = "closed", length = unit(4, "pt"))
df %>% mutate(
  median_qi(d, .width = 0.9),
  density(d, list(density_lower = .lower, density_upper = .upper))
) %>%
  ggplot(aes(y = g)) +
  stat_halfeye(aes(xdist = d), .width = 0.9, color = "gray35") +
  geom_spike(
    aes(x = .lower, thickness = density_lower),
    size = 0, arrow = arrow_spec, color = "blue", linewidth = 0.75
  ) +
  geom_spike(
    aes(x = .upper, thickness = density_upper),
    size = 0, arrow = arrow_spec, color = "red", linewidth = 0.75
  ) +
  scale_thickness_shared()
Beeswarm plot (shortcut geom)
Description
Shortcut version of geom_dotsinterval() for creating beeswarm plots.
Geoms based on geom_dotsinterval() create dotplots that automatically
ensure the plot fits within the available space.
Roughly equivalent to:
geom_dots(
  aes(side = "both"),
  overflow = "compress",
  binwidth = unit(1.5, "mm"),
  layout = "swarm"
)
Usage
geom_swarm(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
overflow = "compress",
binwidth = unit(1.5, "mm"),
layout = "swarm",
dotsize = 1.07,
stackratio = 1,
overlaps = "nudge",
smooth = "none",
verbose = FALSE,
orientation = NA,
subguide = "slab",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
layout |
<string> The layout method used for the dots. One of:
|
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
The dots family of stats and geoms is similar to ggplot2::geom_dotplot()
but with a number of differences:
- Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
- Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
- Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
- The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
- geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
- geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. These use side = "both" by default and set the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
- stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
- geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
- stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
- geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
- stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument,
are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty
using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
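The quantile dotplot idea mentioned above can be sketched as follows. This is a minimal illustration, assuming ggplot2 and ggdist are installed; the simulated data, the column name arrival_min, and the choice of 20 quantiles are assumptions made for the example, not package defaults.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical sample, e.g. draws from a predictive distribution
draws <- data.frame(arrival_min = rlnorm(2000, meanlog = log(10), sdlog = 0.3))

# quantiles = 20 summarizes the sample with 20 quantiles, so each dot
# can be read as roughly a 1-in-20 chance
p_quantile_dots <- ggplot(draws, aes(x = arrival_min)) +
  stat_dots(quantiles = 20) +
  labs(x = "Minutes until arrival", y = NULL)

p_quantile_dots
```

Fewer quantiles give larger, countable dots at the cost of resolution; more quantiles approach the shape of a density.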
Value
A ggplot2::Geom representing a beeswarm geometry which can
be added to a ggplot()
object.
Aesthetics
The dots+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the dots (aka the slab), the
point, and the interval.
Positional aesthetics
- x: x position of the geometry
- y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
- family: The font family used to draw the dots.
- order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right", justification is set to 0; when side is "bottom"/"left", justification is set to 1; and when side is "both", justification is set to 0.5.
- datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
- slab_fill: Override for fill: the fill color of the slab.
- slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
- slab_alpha: Override for alpha: the opacity of the slab.
- slab_linewidth: Override for linewidth: the width of the outline of the slab.
- slab_linetype: Override for linetype: the line type of the outline of the slab.
- slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
- point_fill: Override for fill: the fill color of the point.
- point_colour: (or point_color) Override for colour/color: the outline color of the point.
- point_alpha: Override for alpha: the opacity of the point.
- point_size: Override for size: the size of the point.
Deprecated aesthetics
- slab_size: Use slab_linewidth.
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
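As a rough illustration of how the general and sub-geometry override aesthetics listed above combine, the sketch below sets several of them as constants on stat_dotsinterval(). The data frame and the particular values are assumptions chosen only for the example.

```r
library(ggplot2)
library(ggdist)

set.seed(42)
# hypothetical two-group sample
df <- data.frame(
  g = rep(c("a", "b"), each = 100),
  value = rnorm(200, mean = rep(c(0, 2), each = 100))
)

p_overrides <- ggplot(df, aes(x = value, y = g)) +
  stat_dotsinterval(
    side = "bottom",         # draw the dots on the bottom side
    scale = 0.6,             # use 60% of the region allotted to each group
    slab_fill = "gray70",    # fill of the dots (slab) only
    interval_color = "red",  # color of the interval only
    point_color = "black",   # outline color of the point only
    point_size = 3           # point size set directly via the override
  )

p_overrides
```

Setting colour or fill instead of the prefixed versions would affect several sub-geometries at once, which is why the overrides exist.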
References
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See Also
See geom_dotsinterval() for the geometry this shortcut is based on.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms:
geom_blur_dots(), geom_dots(), geom_dotsinterval(), geom_weave()
Examples
library(dplyr)
library(ggplot2)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_swarm()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_swarm()
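To make the "roughly equivalent" relationship from the Description concrete, the following sketch builds the same plot with the shortcut and with an explicit geom_dots() call. It assumes ggplot2 and ggdist are installed; the object names p_shortcut and p_explicit are illustrative. Drawing a "swarm" layout appears to rely on the suggested beeswarm package, so the plots are only constructed here, not printed.

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
df <- data.frame(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# the shortcut geom
p_shortcut <- ggplot(df, aes(x = value, y = g)) +
  geom_swarm()

# the explicit call that the documentation describes as roughly equivalent
p_explicit <- ggplot(df, aes(x = value, y = g)) +
  geom_dots(
    aes(side = "both"),
    overflow = "compress",
    binwidth = unit(1.5, "mm"),
    layout = "swarm"
  )
```

Printing either object should produce a similar beeswarm plot; the explicit form is convenient when only one of the defaults needs to change.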
Dot-weave plot (shortcut geom)
Description
Shortcut version of geom_dotsinterval() for creating dot-weave plots.
Geoms based on geom_dotsinterval() create dotplots that automatically
ensure the plot fits within the available space.
Roughly equivalent to:
geom_dots(
  aes(side = "both"),
  layout = "weave",
  overflow = "compress",
  binwidth = unit(1.5, "mm")
)
Usage
geom_weave(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
layout = "weave",
overflow = "compress",
binwidth = unit(1.5, "mm"),
dotsize = 1.07,
stackratio = 1,
overlaps = "nudge",
smooth = "none",
verbose = FALSE,
orientation = NA,
subguide = "slab",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to |
layout |
<string> The layout method used for the dots. One of:
|
overflow |
<string> How to handle overflow of dots beyond the extent of the geom
when a minimum
If you find the default layout has dots that are too small, and you are okay
with dots overlapping, consider setting |
binwidth |
<numeric | unit> The bin width to use for laying out the dots. One of:
If the value is numeric, it is assumed to be in units of data. The bin width
(or its bounds) can also be specified using |
dotsize |
<scalar numeric> The width of the dots relative to the |
stackratio |
<scalar numeric> The distance between the center of the dots in the same
stack relative to the dot height. The default, |
overlaps |
<string> How to handle overlapping dots or bins in the
|
smooth |
<function | string> Smoother to apply to dot positions. One of:
Smoothing is most effective when the smoother is matched to the support of
the distribution; e.g. using |
verbose |
<scalar logical> If |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
subguide |
<function | string> Sub-guide used to annotate the
|
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
The dots family of stats and geoms is similar to ggplot2::geom_dotplot()
but with a number of differences:
- Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
- Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
- Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
- The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
- geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
- geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. These use side = "both" by default and set the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
- stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
- geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
- stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
- geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
- stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument,
are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty
using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
Value
A ggplot2::Geom representing a dot-weave geometry which can
be added to a ggplot()
object.
Aesthetics
The dots+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the dots (aka the slab), the
point, and the interval.
Positional aesthetics
- x: x position of the geometry
- y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
- family: The font family used to draw the dots.
- order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right", justification is set to 0; when side is "bottom"/"left", justification is set to 1; and when side is "both", justification is set to 0.5.
- datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
- slab_fill: Override for fill: the fill color of the slab.
- slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
- slab_alpha: Override for alpha: the opacity of the slab.
- slab_linewidth: Override for linewidth: the width of the outline of the slab.
- slab_linetype: Override for linetype: the line type of the outline of the slab.
- slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
- point_fill: Override for fill: the fill color of the point.
- point_colour: (or point_color) Override for colour/color: the outline color of the point.
- point_alpha: Override for alpha: the opacity of the point.
- point_size: Override for size: the size of the point.
Deprecated aesthetics
- slab_size: Use slab_linewidth.
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
References
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See Also
See geom_dotsinterval() for the geometry this shortcut is based on.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms:
geom_blur_dots(), geom_dots(), geom_dotsinterval(), geom_swarm()
Examples
library(dplyr)
library(ggplot2)

theme_set(theme_ggdist())

set.seed(12345)
df = tibble(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# orientation is detected automatically based on
# which axis is discrete
df %>%
  ggplot(aes(x = value, y = g)) +
  geom_weave()

df %>%
  ggplot(aes(y = value, x = g)) +
  geom_weave()
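One way to see what the "weave" layout changes is to compare it against the "bin" layout on the same data. The sketch below is an illustration under assumptions: the object names are hypothetical, and the description in the comments reflects my understanding that "weave" keeps dots at their data values along the data axis while "bin" snaps them to bin centers.

```r
library(ggplot2)
library(ggdist)

set.seed(12345)
df <- data.frame(
  g = rep(c("a", "b"), 200),
  value = rnorm(400, c(0, 3), c(0.75, 1))
)

# mirrored dotplot with the "bin" layout: dots are stacked in columns
p_bin <- ggplot(df, aes(x = value, y = g)) +
  geom_dots(aes(side = "both"), layout = "bin")

# the shortcut geom, which uses layout = "weave" and a fixed binwidth
p_weave <- ggplot(df, aes(x = value, y = g)) +
  geom_weave()

p_bin
p_weave
```

Viewing the two plots side by side (for example with the patchwork package, if available) makes the difference in dot placement easier to judge.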
Deprecated functions and arguments in ggdist
Description
Deprecated functions and arguments and their alternatives are listed below.
Deprecated stats and geoms
The stat_sample_... and stat_dist_... families of stats were merged in ggdist 3.1.
This means:
- stat_dist_... is deprecated. For any code using stat_dist_XXX(), you should now be able to use stat_XXX() instead without additional modifications in almost all cases.
- stat_sample_slabinterval() is deprecated. You should be able to use stat_slabinterval() instead without additional modifications in almost all cases.
The old stat_dist_... names are currently kept as aliases, but may be removed in the future.
Deprecated arguments
Deprecated parameters for stat_slabinterval() and family:
- The .prob argument, which is a long-deprecated alias for .width, was removed in ggdist 3.1.
- The limits_function argument: this was a parameter for determining the function to compute limits of the slab in stat_slabinterval() and its derived stats. This function is really an internal function only needed by subclasses of the base class, yet added a lot of noise to the documentation, so it was replaced with AbstractStatSlabInterval$compute_limits().
- The limits_args argument: extra stat parameters are now passed through to the ... arguments to AbstractStatSlabInterval$compute_limits(); use these instead.
- The slab_function argument: this was a parameter for determining the function to compute slabs in stat_slabinterval() and its derived stats. This function is really an internal function only needed by subclasses of the base class, yet added a lot of noise to the documentation, so it was replaced with AbstractStatSlabInterval$compute_slab().
- The slab_args argument: extra stat parameters are now passed through to the ... arguments to AbstractStatSlabInterval$compute_slab(); use these instead.
- The slab_type argument: instead of setting the slab type, either adjust the density argument (e.g. use density = "histogram" to replace slab_type = "histogram") or use the pdf or cdf computed variables mapped onto an appropriate aesthetic (e.g. use aes(thickness = after_stat(cdf)) to create a CDF).
- The interval_function and fun.data arguments: these were parameters for determining the function to compute intervals in stat_slabinterval() and its derived stats. This function is really an internal function only needed by subclasses of the base class, yet added a lot of noise to the documentation, so it was replaced with AbstractStatSlabInterval$compute_interval().
- The interval_args and fun.args arguments: to pass extra arguments to a point_interval, replace the value of the point_interval argument with a simple wrapper; e.g. stat_halfeye(point_interval = \(...) point_interval(..., extra_arg = XXX))
Deprecated parameters for geom_slabinterval() and family:
- The size_domain and size_range arguments, which are long-deprecated aliases for interval_size_domain and interval_size_range, were removed in ggdist 3.1.
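The replacements described above can be sketched as a small before/after migration. This is an illustrative example, assuming ggplot2, ggdist, distributional, and tibble are installed; the data and object names are made up, and the "before" forms appear only in comments.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# hypothetical analytical distributions, one per group
df <- tibble::tibble(
  g = c("a", "b"),
  d = dist_normal(c(0, 1), c(1, 2))
)

# before: a stat_dist_...() stat such as stat_dist_halfeye()
# after:  the merged stat_halfeye()
p_halfeye <- ggplot(df, aes(y = g, xdist = d)) +
  stat_halfeye()

# before: slab_type = "cdf"
# after:  map the cdf computed variable onto thickness
p_cdf <- ggplot(df, aes(y = g, xdist = d)) +
  stat_slab(aes(thickness = after_stat(cdf)))

# before: slab_type = "histogram"
# after:  density = "histogram"
set.seed(1)
samples <- data.frame(value = rnorm(500))
p_hist <- ggplot(samples, aes(x = value)) +
  stat_slab(density = "histogram")
```

Each "after" form uses only arguments named in the list above, so existing code can usually be migrated one argument at a time.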
Author(s)
Matthew Kay
Continuous guide for colour ramp scales (ggplot2 guide)
Description
A colour ramp bar guide that shows continuous colour ramp scales mapped onto
values as a smooth gradient. Designed for use with scale_fill_ramp_continuous()
and scale_colour_ramp_continuous(). Based on guide_colourbar().
Usage
guide_rampbar(
...,
to = "gray65",
available_aes = c("fill_ramp", "colour_ramp")
)
Arguments
... |
Arguments passed on to
|
to |
<string> The color to ramp to in the guide. Corresponds to |
available_aes |
<character> Vector listing the aesthetics for which a |
Details
This guide creates smooth gradient color bars for use with scale_fill_ramp_continuous()
and scale_colour_ramp_continuous(). The color to ramp from is determined by the from
argument of the scale_* function, and the color to ramp to is determined by the to
argument to guide_rampbar().
Guides can be specified in each scale_* function or in guides().
guide = "rampbar" in scale_* is syntactic sugar for guide = guide_rampbar();
e.g. scale_colour_ramp_continuous(guide = "rampbar"). For how to specify
the guide for each scale in more detail, see guides().
Value
A guide object.
Author(s)
Matthew Kay
See Also
Other colour ramp functions:
partial_colour_ramp(), ramp_colours(), scale_colour_ramp
Examples
library(dplyr)
library(ggplot2)
library(distributional)

# The default guide for ramp scales is guide_legend(), which creates a
# discrete style scale:
tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
  scale_fill_ramp_continuous(from = "red")

# We can use guide_rampbar() to instead create a continuous guide, but
# it does not know what color to ramp to (defaults to "gray65"):
tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
  scale_fill_ramp_continuous(from = "red", guide = guide_rampbar())

# We can tell the guide what color to ramp to using the `to` argument:
tibble(d = dist_uniform(0, 1)) %>%
  ggplot(aes(y = 0, xdist = d)) +
  stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
  scale_fill_ramp_continuous(from = "red", guide = guide_rampbar(to = "blue"))
Nicely-spaced sets of interval widths
Description
Create nicely-spaced sets of nested interval widths for use with (e.g.)
the .width parameter of point_interval(), stat_slabinterval(), or
stat_lineribbon():
- interval_widths(n) creates a sequence of n interval widths p_1, \ldots, p_n, where 0 < p_i \le \textrm{max} < 1, corresponding to the masses of nested intervals that are evenly-spaced on a reference distribution (by default a Normal distribution). This generalizes the idea behind the default ~66% and 95% intervals in stat_slabinterval() and 50%, 80%, and 95% intervals in stat_lineribbon(): when applied to a Normal distribution, those intervals are roughly evenly-spaced and allow one to see deviations from the reference distribution (such as excess kurtosis) when the resulting intervals are not evenly spaced.
- pretty_widths(n) is a variant of interval_widths() with defaults for max and precision that make the resulting intervals more human-readable, for labeling purposes.
Intervals should be evenly-spaced on any symmetric reference distribution
when applied to data from distributions with the same shape. If dist
is not symmetric, intervals may only be approximately evenly-spaced above the
median.
Usage
interval_widths(n, dist = dist_normal(), max = 1 - 0.1/n, precision = NULL)
pretty_widths(
n,
dist = dist_normal(),
max = if (n <= 4) 0.95 else 1 - 0.1/n,
precision = if (n <= 4) 0.05 else 0.01
)
Arguments
n |
<numeric> in |
dist |
<distribution>: Reference distribution. |
max |
<numeric> in |
precision |
<numeric | NULL>: If not |
Details
Given the cumulative distribution function F_\textrm{dist}(q)
and the quantile function F^{-1}_\textrm{dist}(p)
of dist
, the
following is a sequence of n + 1
evenly-spaced quantiles of dist
that could represent upper limits of nested intervals, where
q_i = q_0 + i\frac{q_n - q_0}{n}
:
\begin{array}{rcl}
q_0, \ldots, q_n &=& F^{-1}_\textrm{dist}(0.5), \ldots, F^{-1}_\textrm{dist}(0.5 + \frac{\textrm{max}}{2})
\end{array}
interval_widths(n)
returns the n
interval widths corresponding to the
upper interval limits q_1, \ldots, q_n
:
2\cdot\left[F_\textrm{dist}(q_1) - 0.5\right], \ldots, 2\cdot\left[F_\textrm{dist}(q_n) - 0.5\right]
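The computation above can be sketched in base R for the default Normal reference distribution. This is a minimal illustration, not the package's implementation: the name interval_widths_sketch is hypothetical, and the sketch ignores the dist and precision arguments.

```r
# Sketch of the computation above for a standard Normal reference
# distribution. `interval_widths_sketch` is a hypothetical name, not a
# ggdist function.
interval_widths_sketch <- function(n, max = 1 - 0.1 / n) {
  q_0 <- qnorm(0.5)                         # median of the reference distribution
  q_n <- qnorm(0.5 + max / 2)               # upper limit of the widest interval
  q <- q_0 + seq_len(n) * (q_n - q_0) / n   # q_1, ..., q_n, evenly spaced
  2 * (pnorm(q) - 0.5)                      # mass of each nested interval
}

interval_widths_sketch(2)
```

Because the upper limits are evenly spaced in quantile space rather than in probability space, the returned widths bunch together towards max as n grows.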
Value
A length-n
numeric vector of interval widths (masses) between
0
and 1
(exclusive) in increasing order.
See Also
The .width
argument to point_interval()
, stat_slabinterval()
,
stat_lineribbon()
, etc.
Examples
library(ggplot2)
library(distributional)
interval_widths(1) # 0.9
# this is roughly +/- 1 SD and +/- 2 SD
interval_widths(2) # 0.672..., 0.95
interval_widths(3) # 0.521..., 0.844..., 0.966...
# "pretty" widths may be useful for legends with a small number of widths
pretty_widths(1) # 0.95
pretty_widths(2) # 0.65, 0.95
pretty_widths(3) # 0.50, 0.80, 0.95
# larger numbers of intervals can be useful for plots
ggplot(data.frame(x = 1:20/20)) +
aes(x, ydist = dist_normal((x * 5)^2, 1 + x * 5)) +
stat_lineribbon(.width = pretty_widths(10))
# large numbers of intervals can be used to create gradients -- particularly
# useful if you shade ribbons according to density (not interval width)
# (this is currently experimental)
withr::with_options(list(ggdist.experimental.slab_data_in_intervals = TRUE), print(
ggplot(data.frame(x = 1:20/20)) +
aes(x, ydist = dist_normal((x * 5)^2, 1 + x * 5)) +
stat_lineribbon(
aes(fill_ramp = after_stat(ave(pdf_min, level))),
.width = interval_widths(40),
fill = "gray50"
) +
theme_ggdist()
))
Marginal distribution of a single correlation from an LKJ distribution
Description
Marginal distribution for the correlation in a single cell from a correlation matrix distributed according to an LKJ distribution.
Usage
dlkjcorr_marginal(x, K, eta, log = FALSE)
plkjcorr_marginal(q, K, eta, lower.tail = TRUE, log.p = FALSE)
qlkjcorr_marginal(p, K, eta, lower.tail = TRUE, log.p = FALSE)
rlkjcorr_marginal(n, K, eta)
Arguments
x , q |
vector of quantiles. |
K |
<numeric> Dimension of the correlation matrix. Must be greater than or equal to 2. |
eta |
<numeric> Parameter controlling the shape of the distribution |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities are
|
p |
vector of probabilities. |
n |
number of observations. If |
Details
The LKJ distribution is a distribution over correlation matrices with a single parameter, \eta
.
For a given \eta
and a K \times K
correlation matrix R
:
R \sim \textrm{LKJ}(\eta)
Each off-diagonal entry of R
, r_{ij}: i \ne j
, has the
following marginal distribution (Lewandowski, Kurowicka, and Joe 2009):
\frac{r_{ij} + 1}{2} \sim \textrm{Beta}\left(\eta - 1 + \frac{K}{2}, \eta - 1 + \frac{K}{2}\right)
In other words, r_{ij}
is marginally distributed according to the above Beta
distribution scaled into (-1,1)
.
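The scaled Beta relationship above can be checked with a short base R sketch. The helper name dlkj_marginal_sketch is hypothetical and is not the package's dlkjcorr_marginal(); it only illustrates the change of variables from (0,1) to (-1,1).

```r
# Sketch only: `dlkj_marginal_sketch` is a hypothetical helper, not the
# package's dlkjcorr_marginal() implementation.
dlkj_marginal_sketch <- function(x, K, eta) {
  shape <- eta - 1 + K / 2
  # (x + 1) / 2 ~ Beta(shape, shape); dividing by 2 is the Jacobian of the
  # linear map from (0, 1) to (-1, 1)
  dbeta((x + 1) / 2, shape, shape) / 2
}

# With K = 2 and eta = 1 both shape parameters equal 1, so the marginal is
# uniform on (-1, 1)
dlkj_marginal_sketch(c(-0.5, 0, 0.5), K = 2, eta = 1)
```

Larger values of eta or K increase both shape parameters, which concentrates the marginal around a correlation of zero.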
Value
-
dlkjcorr_marginal
gives the density -
plkjcorr_marginal
gives the cumulative distribution function (CDF) -
qlkjcorr_marginal
gives the quantile function (inverse CDF) -
rlkjcorr_marginal
generates random draws.
The length of the result is determined by n
for rlkjcorr_marginal
, and is the maximum of the lengths of
the numerical arguments for the other functions.
The numerical arguments other than n
are recycled to the length of the result. Only the first elements
of the logical arguments are used.
References
Lewandowski, D., Kurowicka, D., & Joe, H. (2009). Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 100(9), 1989–2001. doi:10.1016/j.jmva.2009.04.008.
See Also
parse_dist()
and marginalize_lkjcorr()
for parsing specs that use the
LKJ correlation distribution and the stat_slabinterval()
family of stats for visualizing them.
Examples
library(dplyr)
library(ggplot2)
theme_set(theme_ggdist())
expand.grid(
eta = 1:6,
K = 2:6
) %>%
ggplot(aes(y = ordered(eta), dist = "lkjcorr_marginal", arg1 = K, arg2 = eta)) +
stat_slab() +
facet_grid(~ paste0(K, "x", K)) +
scale_y_discrete(limits = rev) +
labs(
title = paste0(
"Marginal correlation for LKJ(eta) prior on different matrix sizes:\n",
"dlkjcorr_marginal(K, eta)"
),
subtitle = "Correlation matrix size (KxK)",
y = "eta",
x = "Marginal correlation"
) +
theme(axis.title = element_text(hjust = 0))
Turn spec for LKJ distribution into spec for marginal LKJ distribution
Description
Turns specs for an LKJ correlation matrix distribution as returned by
parse_dist()
into specs for the marginal distribution of
a single cell in an LKJ-distributed correlation matrix (i.e., lkjcorr_marginal()
).
Useful for visualizing prior correlations from LKJ distributions.
Usage
marginalize_lkjcorr(
data,
K,
predicate = NULL,
dist = ".dist",
args = ".args",
dist_obj = ".dist_obj"
)
Arguments
data |
<data.frame> A data frame containing a column with distribution names ( |
K |
<numeric> Dimension of the correlation matrix. Must be greater than or equal to 2. |
predicate |
<bare language | NULL> Expression for selecting the rows of If |
dist |
<string> The name of the column containing distribution names. See |
args |
<string> The name of the column containing distribution arguments. See |
dist_obj |
<string> The name of the output column to contain a distributional
object representing the distribution. See |
Details
The LKJ(eta) prior on a correlation matrix induces a marginal prior on each correlation
in the matrix that depends on both the value of eta
and K
, the dimension
of the K \times K
correlation matrix. Thus to visualize the marginal prior
on the correlations, it is necessary to specify the value of K
, which depends
on what your model specification looks like.
Given a data frame representing parsed distribution specifications (such
as returned by parse_dist()
), this function updates any rows with .dist == "lkjcorr"
so that the first argument to the distribution (stored in .args
) is equal to the specified dimension
of the correlation matrix (K
), changes the distribution name in .dist
to "lkjcorr_marginal"
,
and assigns a distributional object representing this distribution to .dist_obj
.
This allows the distribution to be easily visualized using the stat_slabinterval()
family of ggplot2 stats.
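The row update described above can be illustrated on a hand-built data frame. This sketch assumes .args is a list column holding one argument list per row; the specs data frame is a hypothetical stand-in for parse_dist() output, and the creation of .dist_obj is omitted.

```r
# Hypothetical stand-in for the output of parse_dist(): one row per prior
specs <- data.frame(.dist = c("lkjcorr", "norm"))
specs$.args <- list(list(3), list(0, 1))

K <- 2
is_lkj <- specs$.dist == "lkjcorr"

# K becomes the first argument; eta stays as the second
specs$.args[is_lkj] <- lapply(specs$.args[is_lkj], function(a) c(list(K), a))
specs$.dist[is_lkj] <- "lkjcorr_marginal"

specs
```

Rows for other distributions are left untouched, which is also what makes a predicate useful when several LKJ priors need different values of K.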
Value
A data frame of the same size and column names as the input, with the dist
, args
,
and dist_obj
columns modified on rows where dist == "lkjcorr"
such that they represent a
marginal LKJ correlation distribution with name lkjcorr_marginal
and args
having
K
equal to the input value of K
.
See Also
parse_dist()
, lkjcorr_marginal()
Examples
library(dplyr)
library(ggplot2)
# Say we have an LKJ(3) prior on a 2x2 correlation matrix. We can visualize
# its marginal distribution as follows...
data.frame(prior = "lkjcorr(3)") %>%
parse_dist(prior) %>%
marginalize_lkjcorr(K = 2) %>%
ggplot(aes(y = prior, xdist = .dist_obj)) +
stat_halfeye() +
xlim(-1, 1) +
xlab("Marginal correlation for LKJ(3) prior on 2x2 correlation matrix")
# If our prior list has multiple LKJ priors on correlation matrices
# of different sizes, we can supply a predicate expression to select
# only those rows we want to modify
data.frame(coef = c("a", "b"), prior = "lkjcorr(3)") %>%
parse_dist(prior) %>%
marginalize_lkjcorr(K = 2, coef == "a") %>%
marginalize_lkjcorr(K = 4, coef == "b")
Parse distribution specifications into columns of a data frame
Description
Parses simple string distribution specifications, like "normal(0, 1)"
, into two columns of
a data frame, suitable for use with the dist
and args
aesthetics of stat_slabinterval()
and its shortcut stats (like stat_halfeye()
). This format is output
by brms::get_prior
, making it particularly useful for visualizing priors from
brms models.
Usage
parse_dist(
object,
...,
dist = ".dist",
args = ".args",
dist_obj = ".dist_obj",
package = NULL,
to_r_names = TRUE
)
## Default S3 method:
parse_dist(object, ...)
## S3 method for class 'data.frame'
parse_dist(
object,
dist_col,
...,
dist = ".dist",
args = ".args",
dist_obj = ".dist_obj",
package = NULL,
lb = "lb",
ub = "ub",
to_r_names = TRUE
)
## S3 method for class 'character'
parse_dist(
object,
...,
dist = ".dist",
args = ".args",
dist_obj = ".dist_obj",
package = NULL,
to_r_names = TRUE
)
## S3 method for class 'factor'
parse_dist(
object,
...,
dist = ".dist",
args = ".args",
dist_obj = ".dist_obj",
package = NULL,
to_r_names = TRUE
)
## S3 method for class 'brmsprior'
parse_dist(
object,
dist_col = prior,
...,
dist = ".dist",
args = ".args",
dist_obj = ".dist_obj",
package = NULL,
to_r_names = TRUE
)
r_dist_name(dist_name)
Arguments
object |
<character | data.frame> One of:
|
... |
Arguments passed to other implementations of |
dist |
<string> The name of the output column to contain the distribution name. |
args |
<string> The name of the output column to contain the arguments to the distribution. |
dist_obj |
<string> The name of the output column to contain a distributional object representing the distribution. |
package |
<string | environment | NULL> The package or environment to search for
distribution functions in. Passed to
|
to_r_names |
<scalar logical> If |
dist_col |
<bare language> Column or column expression of |
lb |
<string> The name of an input column (for |
ub |
<string> The name of an input column (for |
dist_name |
<character> For |
Details
parse_dist()
can be applied to character vectors or to a data frame together with the bare name of the
column to parse, and returns a data frame with ".dist"
and ".args"
columns added.
parse_dist()
uses r_dist_name()
to translate distribution names into names recognized
by R.
r_dist_name()
takes a character vector of names and translates common names into R
distribution names. Names are first made into valid R names using make.names()
,
then translated (ignoring character case, "."
, and "_"
). Thus, "lognormal"
,
"LogNormal"
, "log_normal"
, "log-Normal"
, and any number of other variants
all get translated into "lnorm"
.
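The normalization steps described above can be sketched in base R. The function r_dist_name_sketch and the two-entry lookup table are hypothetical; the real r_dist_name() recognizes many more names, and passing unknown names through unchanged is an assumption of this sketch.

```r
# Hypothetical lookup table; the real r_dist_name() knows many more names
known_names <- c(lognormal = "lnorm", normal = "norm")

r_dist_name_sketch <- function(dist_name) {
  key <- make.names(dist_name)            # "log-Normal" -> "log.Normal"
  key <- gsub("[._]", "", tolower(key))   # ignore case, ".", and "_"
  translated <- unname(known_names[key])
  # assumption of this sketch: unknown names are passed through unchanged
  ifelse(is.na(translated), dist_name, translated)
}

r_dist_name_sketch(c("lognormal", "LogNormal", "log_normal", "log-Normal"))
```

Running make.names() first is what turns the hyphen in "log-Normal" into a ".", which the second step then strips.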
Value
-
parse_dist
returns a data frame containing at least two columns named after the dist
and args
parameters. If the input is a data frame, the output is a data frame of the same length with those two columns added. If the input is a character vector or factor, the output is a two-column data frame with the same number of rows as the length of the input.
-
r_dist_name
returns a character vector the same length as the input containing translations of the input names into distribution names R can recognize.
See Also
See stat_slabinterval()
and its shortcut stats, which can easily make use of
the output of this function using the dist
and args
aesthetics.
Examples
library(dplyr)
# parse dist can operate on strings directly...
parse_dist(c("normal(0,1)", "student_t(3,0,1)"))
# ... or on columns of a data frame, where it adds the
# parsed specs back on as columns
data.frame(prior = c("normal(0,1)", "student_t(3,0,1)")) %>%
parse_dist(prior)
# parse_dist is particularly useful with the output of brms::prior(),
# which follows the same format as above
Partial colour ramp (datatype)
Description
A representation of a partial ramp between two colours: the origin colour
(from
) and the distance from the origin colour to the target colour
(amount
, a value between 0
and 1
). The target colour of the ramp
can be filled in later using ramp_colours()
, producing a colour.
Usage
partial_colour_ramp(amount = double(), from = "white")
Arguments
amount |
<numeric> Vector of values between |
from |
<character> Vector giving colours to ramp from. |
Details
This datatype is used by scale_colour_ramp to create ramped colours in
ggdist geoms. It is a vctrs::rcrd datatype with two fields:
"amount"
, the amount to ramp, and "from"
, the colour to ramp from.
Colour ramps can be applied (i.e. translated into colours) using
ramp_colours()
, which can be used with partial_colour_ramp()
to implement geoms that make use of colour_ramp
or fill_ramp
scales.
Value
A vctrs::rcrd of class "ggdist_partial_colour_ramp"
with fields
"amount"
and "from"
.
Author(s)
Matthew Kay
See Also
Other colour ramp functions:
guide_rampbar()
,
ramp_colours()
,
scale_colour_ramp
Examples
pcr = partial_colour_ramp(c(0, 0.25, 0.75, 1), "red")
pcr
ramp_colours("blue", pcr)
Point and interval summaries for tidy data frames of draws from distributions
Description
Translates draws from distributions in a (possibly grouped) data frame into point and interval summaries (or set of point and interval summaries, if there are multiple groups in a grouped data frame).
Supports automatic partial function application.
Usage
point_interval(
.data,
...,
.width = 0.95,
.point = median,
.interval = qi,
.simple_names = TRUE,
na.rm = FALSE,
.exclude = c(".chain", ".iteration", ".draw", ".row"),
.prob
)
## Default S3 method:
point_interval(
.data,
...,
.width = 0.95,
.point = median,
.interval = qi,
.simple_names = TRUE,
na.rm = FALSE,
.exclude = c(".chain", ".iteration", ".draw", ".row"),
.prob
)
## S3 method for class 'tbl_df'
point_interval(.data, ...)
## S3 method for class 'numeric'
point_interval(
.data,
...,
.width = 0.95,
.point = median,
.interval = qi,
.simple_names = FALSE,
na.rm = FALSE,
.exclude = c(".chain", ".iteration", ".draw", ".row"),
.prob
)
## S3 method for class 'rvar'
point_interval(
.data,
...,
.width = 0.95,
.point = median,
.interval = qi,
.simple_names = TRUE,
na.rm = FALSE
)
## S3 method for class 'distribution'
point_interval(
.data,
...,
.width = 0.95,
.point = median,
.interval = qi,
.simple_names = TRUE,
na.rm = FALSE
)
qi(x, .width = 0.95, .prob, na.rm = FALSE)
ll(x, .width = 0.95, na.rm = FALSE)
ul(x, .width = 0.95, na.rm = FALSE)
hdi(
x,
.width = 0.95,
na.rm = FALSE,
...,
density = density_bounded(trim = TRUE),
n = 4096,
.prob
)
Mode(x, na.rm = FALSE, ...)
## Default S3 method:
Mode(
x,
na.rm = FALSE,
...,
density = density_bounded(trim = TRUE),
n = 2001,
weights = NULL
)
## S3 method for class 'rvar'
Mode(x, na.rm = FALSE, ...)
## S3 method for class 'distribution'
Mode(x, na.rm = FALSE, ...)
hdci(x, .width = 0.95, na.rm = FALSE)
mean_qi(.data, ..., .width = 0.95)
median_qi(.data, ..., .width = 0.95)
mode_qi(.data, ..., .width = 0.95)
mean_ll(.data, ..., .width = 0.95)
median_ll(.data, ..., .width = 0.95)
mode_ll(.data, ..., .width = 0.95)
mean_ul(.data, ..., .width = 0.95)
median_ul(.data, ..., .width = 0.95)
mode_ul(.data, ..., .width = 0.95)
mean_hdi(.data, ..., .width = 0.95)
median_hdi(.data, ..., .width = 0.95)
mode_hdi(.data, ..., .width = 0.95)
mean_hdci(.data, ..., .width = 0.95)
median_hdci(.data, ..., .width = 0.95)
mode_hdci(.data, ..., .width = 0.95)
Arguments
.data |
<data.frame | grouped_df> Data frame (or grouped
data frame as returned by |
... |
<bare language> Column names or expressions that, when evaluated in the context of
|
.width |
<numeric> vector of probabilities to use that determine the widths of
the resulting intervals. If multiple probabilities are provided, multiple rows per
group are generated, each with a different probability interval (and value of the
corresponding |
.point |
<function> Point summary function, which takes a vector and returns a single
value, e.g. |
.interval |
<function> Interval function, which takes a vector and a probability
( |
.simple_names |
<scalar logical> When |
na.rm |
<scalar logical> Should |
.exclude |
<character> Vector of names of columns to be excluded from summarization
if no column names are specified to be summarized in |
.prob |
Deprecated. Use |
x |
<numeric> Vector to summarize (for interval functions: |
density |
<function | string> For |
n |
<scalar numeric> For |
weights |
<numeric | NULL> For |
Details
If .data
is a data frame, then ...
is a list of bare names of
columns (or expressions derived from columns) of .data
, on which
the point and interval summaries are derived. Column expressions are processed
using the tidy evaluation framework (see rlang::eval_tidy()
).
For a column named x
, the resulting data frame will have a column
named x
containing its point summary. If there is a single
column to be summarized and .simple_names
is TRUE
, the output will
also contain columns .lower
(the lower end of the interval),
.upper
(the upper end of the interval).
Otherwise, for every summarized column x
, the output will contain
x.lower
(the lower end of the interval) and x.upper
(the upper
end of the interval). Finally, the output will have a .width
column
containing the probability for the interval on each output row.
If .data
includes groups (see e.g. dplyr::group_by()
),
the points and intervals are calculated within the groups.
If .data
is a vector, ...
is ignored and the result is a
data frame with one row per value of .width
and the following columns:
y
(the point summary), ymin
(the lower end of the interval),
ymax
(the upper end of the interval), and .width
, the probability
corresponding to the interval. This behavior allows point_interval
and its derived functions (like median_qi
, mean_qi
, mode_hdi
, etc)
to be easily used to plot intervals in ggplot stats using methods like
stat_eye()
, stat_halfeye()
, or stat_summary()
.
median_qi
, mode_hdi
, etc are short forms for
point_interval(..., .point = median, .interval = qi)
, etc.
qi
yields the quantile interval (also known as the percentile interval or
equi-tailed interval) as a 1x2 matrix.
hdi
yields the highest-density interval(s) (also known as the highest posterior
density interval). Note: If the distribution is multimodal, hdi
may return multiple
intervals for each probability level (these will be spread over rows). You may wish to use
hdci
(below) instead if you want a single highest-density interval, with the caveat that when
the distribution is multimodal hdci
is not a highest-density interval.
hdci
yields the highest-density continuous interval, also known as the shortest
probability interval. Note: If the distribution is multimodal, this may not actually
be the highest-density interval (there may be a higher-density
discontinuous interval, which can be found using hdi
).
ll
and ul
yield lower limits and upper limits, respectively (where the opposite
limit is set to either Inf
or -Inf
).
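The interval types described above can be sketched in base R for a numeric vector of draws and a single .width. The *_sketch names are hypothetical, and the package functions handle more cases (such as NA removal, rvar and distribution inputs). The hdci sketch uses a simple shortest-window search over sorted draws, which is an assumption and may differ from the package's method.

```r
# Equi-tailed (quantile) interval as a 1x2 matrix
qi_sketch <- function(x, .width = 0.95) {
  p <- (1 - .width) / 2
  matrix(quantile(x, c(p, 1 - p), names = FALSE), ncol = 2)
}

# Lower limit: the upper end is set to Inf
ll_sketch <- function(x, .width = 0.95) {
  matrix(c(quantile(x, 1 - .width, names = FALSE), Inf), ncol = 2)
}

# Upper limit: the lower end is set to -Inf
ul_sketch <- function(x, .width = 0.95) {
  matrix(c(-Inf, quantile(x, .width, names = FALSE)), ncol = 2)
}

# Shortest window of sorted draws that covers a fraction .width of them
hdci_sketch <- function(x, .width = 0.95) {
  x <- sort(x)
  k <- ceiling(.width * length(x))        # number of draws per window
  starts <- seq_len(length(x) - k + 1)
  best <- which.min(x[starts + k - 1] - x[starts])
  matrix(c(x[best], x[best + k - 1]), ncol = 2)
}

draws <- c(1, 2, 3, 4, 5, 50, 60, 70, 80, 90)
qi_sketch(draws, 0.5)
hdci_sketch(draws, 0.5)
```

On skewed draws like these, the quantile interval and the shortest interval differ noticeably, which is the main reason to choose between qi and hdci deliberately.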
Value
A data frame containing point summaries and intervals, with at least one column corresponding
to the point summary, one to the lower end of the interval, one to the upper end of the interval, the
width of the interval (.width
), the type of point summary (.point
), and the type of interval (.interval
).
Author(s)
Matthew Kay
Examples
library(dplyr)
library(ggplot2)
set.seed(123)
rnorm(1000) %>%
median_qi()
data.frame(x = rnorm(1000)) %>%
median_qi(x, .width = c(.50, .80, .95))
data.frame(
x = rnorm(1000),
y = rnorm(1000, mean = 2, sd = 2)
) %>%
median_qi(x, y)
data.frame(
x = rnorm(1000),
group = "a"
) %>%
rbind(data.frame(
x = rnorm(1000, mean = 2, sd = 2),
group = "b")
) %>%
group_by(group) %>%
median_qi(.width = c(.50, .80, .95))
multimodal_draws = data.frame(
x = c(rnorm(5000, 0, 1), rnorm(2500, 4, 1))
)
multimodal_draws %>%
mode_hdi(.width = c(.66, .95))
multimodal_draws %>%
ggplot(aes(x = x, y = 0)) +
stat_halfeye(point_interval = mode_hdi, .width = c(.66, .95))
Dodge overlapping objects side-to-side, preserving justification
Description
A justification-preserving variant of ggplot2::position_dodge()
which preserves the
vertical position of a geom while adjusting the horizontal position (or vice
versa when in a horizontal orientation). Unlike ggplot2::position_dodge()
,
position_dodgejust()
attempts to preserve the "justification" of x
positions relative to the bounds containing them (xmin
/xmax
) (or y
positions relative to ymin
/ymax
when in a horizontal orientation). This
makes it useful for dodging annotations to geoms and stats from the
geom_slabinterval()
family, which also preserve the justification of their
intervals relative to their slabs when dodging.
Usage
position_dodgejust(
width = NULL,
preserve = c("total", "single"),
justification = NULL
)
Arguments
width |
Dodging width, when different to the width of the individual elements. This is useful when you want to align narrow geoms with wider geoms. See the examples. |
preserve |
Should dodging preserve the |
justification |
<scalar numeric> Justification of the point position ( |
Examples
library(dplyr)
library(ggplot2)
library(distributional)
dist_df = tribble(
~group, ~subgroup, ~mean, ~sd,
1, "h", 5, 1,
2, "h", 7, 1.5,
3, "h", 8, 1,
3, "i", 9, 1,
3, "j", 7, 1
)
# An example with normal "dodge" positioning
# Notice how dodge points are placed in the center of their bounding boxes,
# which can cause slabs to be positioned outside their bounds.
dist_df %>%
ggplot(aes(
x = factor(group), ydist = dist_normal(mean, sd),
fill = subgroup
)) +
stat_halfeye(
position = "dodge"
) +
geom_rect(
aes(xmin = group, xmax = group + 1, ymin = 2, ymax = 13, color = subgroup),
position = "dodge",
data = . %>% filter(group == 3),
alpha = 0.1
) +
geom_point(
aes(x = group, y = 7.5, color = subgroup),
position = position_dodge(width = 1),
data = . %>% filter(group == 3),
shape = 1,
size = 4,
stroke = 1.5
) +
scale_fill_brewer(palette = "Set2") +
scale_color_brewer(palette = "Dark2")
# This same example with "dodgejust" positioning. For the points we
# supply a justification parameter to position_dodgejust which mimics the
# justification parameter of stat_halfeye, ensuring that they are
# placed appropriately. On slabinterval family geoms, position_dodgejust()
# will automatically detect the appropriate justification.
dist_df %>%
ggplot(aes(
x = factor(group), ydist = dist_normal(mean, sd),
fill = subgroup
)) +
stat_halfeye(
position = "dodgejust"
) +
geom_rect(
aes(xmin = group, xmax = group + 1, ymin = 2, ymax = 13, color = subgroup),
position = "dodgejust",
data = . %>% filter(group == 3),
alpha = 0.1
) +
geom_point(
aes(x = group, y = 7.5, color = subgroup),
position = position_dodgejust(width = 1, justification = 0),
data = . %>% filter(group == 3),
shape = 1,
size = 4,
stroke = 1.5
) +
scale_fill_brewer(palette = "Set2") +
scale_color_brewer(palette = "Dark2")
Apply partial colour ramps
Description
Given vectors of colours and partial_colour_ramp
s, ramps the colours
according to the parameters of the partial colour ramps, returning
a vector of the same length as the inputs giving the transformed
(ramped) colours.
Usage
ramp_colours(colour, ramp)
Arguments
colour |
<character> Vector of colours to ramp to. |
ramp |
<partial_colour_ramp> Vector of colour ramps (same length as
|
Details
Takes vectors of colours and partial_colour_ramp
s and produces
colours by interpolating between each from
colour and the target colour
the specified amount
(where amount
and from
are the corresponding
fields of the ramp
).
For example, to add support for the fill_ramp
aesthetic to a geometry,
this line could be used inside the draw_group()
or draw_panel()
method
of a geom:
data$fill = ramp_colours(data$fill, data$fill_ramp)
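To make the interpolation concrete, here is a minimal base R sketch that ramps from a from colour towards a target colour by a given amount. It assumes linear interpolation in RGB space; ramp_sketch is a hypothetical helper, and ramp_colours() may interpolate in a different colour space and so return different intermediate colours.

```r
# Sketch only: linear interpolation in RGB space. `ramp_sketch` is a
# hypothetical helper; ramp_colours() may interpolate differently.
ramp_sketch <- function(colour, amount, from = "white") {
  one <- function(to, amt, frm) {
    rgb_val <- grDevices::colorRamp(c(frm, to))(amt)   # 1x3 matrix of r, g, b
    grDevices::rgb(rgb_val, maxColorValue = 255)
  }
  # recycle colour, amount, and from against each other, like vector arguments
  unname(mapply(one, colour, amount, from))
}

ramp_sketch("blue", c(0, 0.25, 0.75, 1), from = "red")
```

An amount of 0 returns the from colour and an amount of 1 returns the target colour, matching the meaning of the amount field.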
Value
A character vector of colours.
Author(s)
Matthew Kay
See Also
Other colour ramp functions:
guide_rampbar()
,
partial_colour_ramp()
,
scale_colour_ramp
Examples
pcr = partial_colour_ramp(c(0, 0.25, 0.75, 1), "red")
pcr
ramp_colours("blue", pcr)
Secondary color scale that ramps from another color (ggplot2 scale)
Description
This scale creates a secondary scale that modifies the fill
or color
scale of
geoms that support it (geom_lineribbon()
and geom_slabinterval()
) to "ramp"
from a secondary color (by default white) to the primary fill color (determined
by the standard color
or fill
aesthetics). It uses the
partial_colour_ramp()
data type.
Usage
scale_colour_ramp_continuous(
from = "white",
...,
limits = function(l) c(min(0, l[[1]]), l[[2]]),
range = c(0, 1),
guide = "legend",
aesthetics = "colour_ramp"
)
scale_color_ramp_continuous(
from = "white",
...,
limits = function(l) c(min(0, l[[1]]), l[[2]]),
range = c(0, 1),
guide = "legend",
aesthetics = "colour_ramp"
)
scale_colour_ramp_discrete(
from = "white",
...,
range = c(0.2, 1),
aesthetics = "colour_ramp"
)
scale_color_ramp_discrete(
from = "white",
...,
range = c(0.2, 1),
aesthetics = "colour_ramp"
)
scale_fill_ramp_continuous(..., aesthetics = "fill_ramp")
scale_fill_ramp_discrete(..., aesthetics = "fill_ramp")
Arguments
from |
<string> The color to ramp from. Corresponds to |
... |
Arguments passed to underlying scale or guide functions. E.g.
|
limits |
One of:
|
range |
<length-2 numeric> Minimum and maximum
values after the scale transformation. These values should be between |
guide |
<Guide | string> A function used to create a guide or its name. For
|
aesthetics |
<character> Names of aesthetics to set scales for. |
Details
These scales transform data into partial_colour_ramp
s. Each partial_colour_ramp
is a pair of two values: a from
colour and a numeric amount
between 0
and 1
representing a distance between from
and the target color (where 0
indicates the from
color and 1
the target color).
The target color is determined by the corresponding aesthetic: for example,
the colour_ramp
aesthetic creates ramps between from
and whatever the
value of the colour
aesthetic is; the fill_ramp
aesthetic creates ramps
between from
and whatever the value of the fill
aesthetic is. When the
colour_ramp
aesthetic is set, ggdist geometries will modify their
colour
by applying the colour ramp between from
and colour
(and
similarly for fill_ramp
and fill
).
Colour ramps can be applied (i.e. translated into colours) using
ramp_colours()
, which can be used with partial_colour_ramp()
to implement geoms that make use of colour_ramp
or fill_ramp
scales.
Value
A ggplot2::Scale representing a scale for the colour_ramp
and/or fill_ramp
aesthetics for ggdist
geoms. Can be added to a ggplot()
object.
Author(s)
Matthew Kay
See Also
Other ggdist scales:
scale_side_mirrored()
,
scale_thickness
,
sub-geometry-scales
Other colour ramp functions:
guide_rampbar()
,
partial_colour_ramp()
,
ramp_colours()
Examples
library(dplyr)
library(ggplot2)
library(distributional)
tibble(d = dist_uniform(0, 1)) %>%
ggplot(aes(y = 0, xdist = d)) +
stat_slab(aes(fill_ramp = after_stat(x)))
tibble(d = dist_uniform(0, 1)) %>%
ggplot(aes(y = 0, xdist = d)) +
stat_slab(aes(fill_ramp = after_stat(x)), fill = "blue") +
scale_fill_ramp_continuous(from = "red")
# you can invert the order of `range` to change the order of the blend
tibble(d = dist_normal(0, 1)) %>%
ggplot(aes(y = 0, xdist = d)) +
stat_slab(aes(fill_ramp = after_stat(cut_cdf_qi(cdf))), fill = "blue") +
scale_fill_ramp_discrete(from = "red", range = c(1, 0))
Side scale for mirrored slabs (ggplot2 scale)
Description
This scale creates mirrored slabs for the side
aesthetic of the geom_slabinterval()
and geom_dotsinterval()
family of geoms and stats. It works on discrete variables
of two or three levels.
Usage
scale_side_mirrored(start = "topright", ..., aesthetics = "side")
Arguments
start |
<string> The side to start from. Can be any valid value of the |
... |
Arguments passed on to
|
aesthetics |
<character> Names of aesthetics to set scales for. |
Value
A ggplot2::Scale representing a scale for the side
aesthetic for ggdist geoms. Can be added to a ggplot()
object.
Author(s)
Matthew Kay
See Also
Other ggdist scales:
scale_colour_ramp
,
scale_thickness
,
sub-geometry-scales
Examples
library(dplyr)
library(ggplot2)
set.seed(1234)
data.frame(
x = rnorm(400, c(1,4)),
g = c("a","b")
) %>%
ggplot(aes(x, fill = g, side = g)) +
geom_weave(linewidth = 0, scale = 0.5) +
scale_side_mirrored()
Slab thickness scale (ggplot2 scale)
Description
This ggplot2 scale linearly scales all thickness
values of geoms
that support the thickness
aesthetic (such as geom_slabinterval()
). It
can be used to align the thickness
scales across multiple geoms (by default,
thickness
is normalized on a per-geom level instead of as a global scale).
For a comprehensive discussion and examples of slab scaling and normalization,
see the thickness
scale article.
Usage
scale_thickness_shared(
name = waiver(),
breaks = waiver(),
labels = waiver(),
limits = function(l) c(min(0, l[[1]]), l[[2]]),
renormalize = FALSE,
oob = scales::oob_keep,
guide = "none",
expand = c(0, 0),
...
)
scale_thickness_identity(..., guide = "none")
Arguments
name |
The name of the scale. Used as the axis or legend title. If
|
breaks |
One of:
|
labels |
One of:
|
limits |
One of:
|
renormalize |
<scalar logical> When mapping values to the |
oob |
One of:
|
guide |
A function used to create a guide or its name. See
|
expand |
<numeric> Vector of limit expansion constants of length
2 or 4, following the same format used by the |
... |
Arguments passed on to
|
Details
By default, normalization/scaling of slab thicknesses is controlled by geometries,
not by a ggplot2 scale function. This allows various functionality not
otherwise possible, such as (1) allowing different geometries to have different
thickness scales and (2) allowing the user to control at what level of aggregation
(panels, groups, the entire plot, etc) thickness scaling is done via the normalize
parameter to geom_slabinterval()
.
However, this default approach has one drawback: two different geoms will always
have their own scaling of thickness
. scale_thickness_shared()
offers an
alternative approach: when added to a chart, all geoms will use the same
thickness
scale, and geom-level normalization (via their normalize
parameters)
is ignored. This is achieved by "marking" thickness values as already
normalized by wrapping them in the thickness()
data type (this can be
disabled by setting renormalize = TRUE
).
Note: while a slightly more typical name for scale_thickness_shared()
might
be scale_thickness_continuous()
, the latter name would cause this scale
to be applied to all thickness
aesthetics by default according to the rules
ggplot2 uses to find default scales. Thus, to retain the usual behavior
of stat_slabinterval()
(per-geom normalization of thickness
), this scale
is called scale_thickness_shared()
.
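The difference between per-geom normalization and a shared scale can be illustrated with plain density values. This sketch is not ggdist internals: it uses Normal densities on a grid as stand-ins for slab thicknesses and scales them by a maximum, which is an assumption about how normalization behaves in the simplest case.

```r
# Density values on a grid stand in for slab thicknesses
x <- seq(-4, 4, length.out = 801)
prior <- dnorm(x, 0, 1)
posterior <- dnorm(x, 0.1, 0.5)

# Per-geom normalization: each slab is scaled by its own maximum
per_geom <- list(
  prior = prior / max(prior),
  posterior = posterior / max(posterior)
)

# Shared scale: both slabs are scaled by the same (global) maximum
shared_max <- max(prior, posterior)
shared <- list(
  prior = prior / shared_max,
  posterior = posterior / shared_max
)

sapply(per_geom, max)  # both slabs peak at the same height
sapply(shared, max)    # the prior peaks at about half the posterior's height
```

Only under the shared scaling do the two curves keep equal areas relative to each other, which is what a prior/posterior comparison usually needs.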
Value
A ggplot2::Scale representing a scale for the thickness
aesthetic for ggdist
geoms. Can be added to a ggplot()
object.
Author(s)
Matthew Kay
See Also
The thickness datatype.
The thickness
aesthetic of geom_slabinterval()
.
subscale_thickness()
, for setting a thickness
sub-scale within
a single geom_slabinterval()
.
Other ggdist scales:
scale_colour_ramp
,
scale_side_mirrored()
,
sub-geometry-scales
Examples
library(distributional)
library(ggplot2)
library(dplyr)
prior_post = data.frame(
  prior = dist_normal(0, 1),
  posterior = dist_normal(0.1, 0.5)
)
# By default, separate geoms have their own thickness scales, which means
# distributions plotted using two separate geoms will not have their slab
# functions drawn on the same scale (thus here, the two distributions have
# different areas under their density curves):
prior_post %>%
  ggplot() +
  stat_halfeye(aes(xdist = posterior)) +
  stat_slab(aes(xdist = prior), fill = NA, color = "red")
# For this kind of prior/posterior chart, it makes more sense to have the
# densities on the same scale; thus, the areas under both would be the same.
# We can do that using scale_thickness_shared():
prior_post %>%
  ggplot() +
  stat_halfeye(aes(xdist = posterior)) +
  stat_slab(aes(xdist = prior), fill = NA, color = "#e41a1c") +
  scale_thickness_shared()
Smooth dot positions in a dotplot using a kernel density estimator ("density dotplots")
Description
Smooths x values using a density estimator, returning new x of the same
length. Can be used with a dotplot (e.g. geom_dots(smooth = ...)) to create
"density dotplots".
Supports automatic partial function application with waived arguments.
Usage
smooth_bounded(
x,
density = "bounded",
bounds = c(NA, NA),
bounder = "cooke",
trim = FALSE,
...
)
smooth_unbounded(x, density = "unbounded", trim = FALSE, ...)
Arguments
x |
<numeric> Values to smooth. |
density |
<function | string> Density estimator to use for smoothing. One of:
|
bounds |
<length-2 numeric> Min and max bounds. If a bound is |
bounder |
<function | string> Method to use to find missing
(
|
trim |
<scalar logical> Passed to |
... |
Arguments passed to the density estimator specified by |
Details
Applies a kernel density estimator (KDE) to x, then uses weighted quantiles
of the KDE to generate a new set of x values with smoothed values. Plotted
using a dotplot (e.g. geom_dots(smooth = "bounded") or
geom_dots(smooth = smooth_bounded(...))), these values create a variation on
a "density dotplot" (Zvinca 2018).
Such plots are recommended only for very large samples, where the precise positions of individual values are not particularly meaningful. In small samples, normal dotplots should generally be used.
Two variants are supplied by default:
- smooth_bounded(), which uses density_bounded(). Passes the bounds argument to the estimator.
- smooth_unbounded(), which uses density_unbounded().
It is generally recommended to pick the smooth based on the known bounds of
your data, e.g. by using smooth_bounded() with the bounds parameter if
there are finite bounds, or smooth_unbounded() if both bounds are infinite.
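The smoothers can also be called directly on a numeric vector, which may help when checking what they do to a dataset before plotting it. The following is a minimal sketch under the assumption that ggdist is installed; the variable names (x_smooth, narrow_smooth, x_narrow) are illustrative only.

```r
library(ggdist)

set.seed(1234)
x = rbeta(500, 0.5, 0.5)   # data known to lie in [0, 1]

# Direct call: returns one smoothed value per input value
x_smooth = smooth_bounded(x, bounds = c(0, 1))
length(x_smooth) == length(x)

# Omitting x returns a partially-applied smoother, which can be stored
# and later passed as geom_dots(smooth = narrow_smooth)
narrow_smooth = smooth_unbounded(adjust = 0.5)
x_narrow = narrow_smooth(rnorm(500))
```

Storing a partially-applied smoother this way keeps the estimator settings in one place when the same smooth is reused across several layers.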
Value
A numeric vector of length(x), where each entry is a smoothed version of
the corresponding entry in x.
If x is missing, returns a partial application of itself. See automatic-partial-functions.
References
Zvinca, Daniel (2018). "In the pursuit of diversity in data visualization. Jittering data to access details." https://www.linkedin.com/pulse/pursuit-diversity-data-visualization-jittering-access-daniel-zvinca/.
See Also
Other dotplot smooths: smooth_discrete(), smooth_none()
Examples
library(ggplot2)
set.seed(1234)
x = rnorm(1000)
# basic dotplot is noisy
ggplot(data.frame(x), aes(x)) +
  geom_dots()
# density dotplot is smoother, but does move points (most noticeable
# in areas of low density)
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "unbounded")
# you can adjust the kernel and bandwidth...
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = smooth_unbounded(kernel = "triangular", adjust = 0.5))
# for bounded data, you should use the bounded smoother
x_beta = rbeta(1000, 0.5, 0.5)
ggplot(data.frame(x_beta), aes(x_beta)) +
  geom_dots(smooth = smooth_bounded(bounds = c(0, 1)))
Smooth dot positions in a dotplot of discrete values ("bar dotplots")
Description
Note: Better-looking bar dotplots are typically easier to achieve using
layout = "bar" with the geom_dotsinterval() family instead of
smooth = "bar" or smooth = "discrete".
Smooths x values where x is presumed to be discrete, returning a new x
of the same length. Both smooth_discrete() and smooth_bar() use the
resolution() of the data to apply smoothing around unique values in the
dataset; smooth_discrete() uses a kernel density estimator and smooth_bar()
places values in an evenly-spaced grid. Can be used with a dotplot
(e.g. geom_dots(smooth = ...)) to create "bar dotplots".
Supports automatic partial function application with waived arguments.
Usage
smooth_discrete(
x,
kernel = c("rectangular", "gaussian", "epanechnikov", "triangular", "biweight",
"cosine", "optcosine"),
width = 0.7,
...
)
smooth_bar(x, width = 0.7, ...)
Arguments
x |
<numeric> Values to smooth. |
kernel |
<string> The smoothing kernel to be used. This must partially
match one of |
width |
<scalar numeric> approximate width of the bars as a fraction
of data |
... |
additional parameters; |
Details
smooth_discrete() applies a kernel density estimator (default: rectangular)
to x. It automatically sets the bandwidth so that the kernel's
width (for each kernel type) is approximately width times the resolution()
of the data. This means it essentially creates smoothed bins around each
unique value. It calls down to smooth_unbounded().
smooth_bar() generates an evenly-spaced grid of values spanning +/- width/2
around each unique value in x.
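To see the difference between the two approaches outside of a plot, both functions can be applied to a vector of counts and the spread around a single original value compared. This is a small sketch, assuming ggdist is installed; x_bar and x_disc are illustrative names.

```r
library(ggdist)

set.seed(1234)
x = rpois(200, 2)

# Evenly-spaced grid around each unique value
x_bar = smooth_bar(x, width = 0.7)

# Kernel-based smoothing around each unique value
x_disc = smooth_discrete(x, kernel = "epanechnikov", width = 0.7)

# Compare how far the smoothed values spread around the original value 2
range(x_bar[x == 2])
range(x_disc[x == 2])
```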
Value
A numeric vector of length(x), where each entry is a smoothed version of
the corresponding entry in x.
If x is missing, returns a partial application of itself. See automatic-partial-functions.
See Also
Other dotplot smooths: smooth_density, smooth_none()
Examples
library(ggplot2)
set.seed(1234)
x = rpois(1000, 2)
# automatic binwidth in basic dotplot on large counts in discrete
# distributions is very small
ggplot(data.frame(x), aes(x)) +
  geom_dots()
# NOTE: It is now recommended to use layout = "bar" instead of
# smooth = "discrete" or smooth = "bar"; the latter are retained because
# they can sometimes be useful in combination with other layouts for
# more specialized (but finicky) applications.
ggplot(data.frame(x), aes(x)) +
  geom_dots(layout = "bar")
# smooth_discrete() constructs wider bins of dots
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "discrete")
# smooth_bar() is an alternative approach to rectangular layouts
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = "bar")
# adjust the shape by changing the kernel or the width. epanechnikov
# works well with side = "both"
ggplot(data.frame(x), aes(x)) +
  geom_dots(smooth = smooth_discrete(kernel = "epanechnikov", width = 0.8), side = "both")
Apply no smooth to a dotplot
Description
Default smooth for dotplots: no smooth. Simply returns the input values.
Supports automatic partial function application with waived arguments.
Usage
smooth_none(x, ...)
Arguments
x |
<numeric> Values to smooth. |
... |
ignored |
Details
This is the default value for the smooth argument of geom_dotsinterval().
Value
x
If x is missing, returns a partial application of itself. See automatic-partial-functions.
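As a quick check of the behavior described above, the input should come back unchanged. A minimal sketch, assuming ggdist is installed:

```r
library(ggdist)

x = c(3.2, 1.5, 4.8)

# smooth_none() is the identity smooth: the values are returned as-is
identical(smooth_none(x), x)
```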
See Also
Other dotplot smooths: smooth_density, smooth_discrete()
CCDF bar plot (shortcut stat)
Description
Shortcut version of stat_slabinterval() with geom_slabinterval() for
creating CCDF bar plots.
Roughly equivalent to:
stat_slabinterval(
  aes(
    thickness = after_stat(thickness(1 - cdf, 0, 1)),
    justification = after_stat(0.5),
    side = after_stat("topleft")
  ),
  normalize = "none",
  expand = TRUE
)
Usage
stat_ccdfinterval(
mapping = NULL,
data = NULL,
geom = "slabinterval",
position = "identity",
...,
normalize = "none",
expand = TRUE,
p_limits = c(NA, NA),
density = "bounded",
adjust = waiver(),
trim = waiver(),
breaks = waiver(),
align = waiver(),
outline_bars = waiver(),
point_interval = "median_qi",
limits = NULL,
n = waiver(),
.width = c(0.66, 0.95),
orientation = NA,
na.rm = FALSE,
show.legend = c(size = FALSE),
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
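The parse_dist() route described above can be sketched as follows. The priors data frame is hypothetical, standing in for specs printed by a modelling package, and the sketch assumes ggplot2 and ggdist are installed.

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specifications in human-readable form
priors = data.frame(
  prior = c("normal(0,1)", "normal(2,0.5)")
)

# parse_dist() adds .dist and .args columns describing each spec
parsed = parse_dist(priors, prior)
parsed$.dist

# Map the parsed columns onto the dist and args aesthetics
ggplot(parsed, aes(y = prior, dist = .dist, args = .args)) +
  stat_ccdfinterval()
```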
Value
A ggplot2::Stat representing a CCDF bar geometry which can
be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat()
function or the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
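One way to explore these computed variables is to map one of them with after_stat() and then inspect the built layer with ggplot2::layer_data(). The sketch below assumes ggplot2, ggdist, and distributional are installed; dist_df and the choice of fill as the target aesthetic are illustrative.

```r
library(ggplot2)
library(ggdist)
library(distributional)

dist_df = data.frame(
  group = c("a", "b"),
  mean = c(5, 7),
  sd = c(1, 1.5)
)

p = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  # map a computed variable onto an aesthetic: shade each bar by its CDF value
  stat_ccdfinterval(aes(fill = after_stat(cdf)))

# Inspect which of the documented computed variables are present
computed = layer_data(p)
intersect(c("x", "pdf", "cdf", ".width", "level"), names(computed))
```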
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval())
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
- thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
- datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
- slab_fill: Override for fill: the fill color of the slab.
- slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
- slab_alpha: Override for alpha: the opacity of the slab.
- slab_linewidth: Override for linewidth: the width of the outline of the slab.
- slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
- point_fill: Override for fill: the fill color of the point.
- point_colour: (or point_color) Override for colour/color: the outline color of the point.
- point_alpha: Override for alpha: the opacity of the point.
- point_size: Override for size: the size of the point.
Deprecated aesthetics
- slab_size: Use slab_linewidth.
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
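As an illustration of how the shared and sub-geometry override aesthetics combine, the following sketch sets a few of them as constants on a CCDF bar plot. It assumes ggplot2 and ggdist are installed; the data and the particular colors and sizes are arbitrary choices for demonstration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

p = ggplot(df, aes(x = value, y = group)) +
  stat_ccdfinterval(
    fill = "gray75",              # shared fill: slab and point
    slab_alpha = 0.6,             # opacity of the slab only
    interval_colour = "#e41a1c",  # color of the interval only
    point_colour = "black",       # outline color of the point only
    point_size = 3                # point size, set directly
  )
p
```

Setting the shared aesthetic first and overriding only where needed keeps the layer call short when most sub-geometries should look alike.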
See Also
See geom_slabinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(distributional)
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_ccdfinterval() +
  expand_limits(x = 0)
# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_ccdfinterval() +
  expand_limits(x = 0)
CDF bar plot (shortcut stat)
Description
Shortcut version of stat_slabinterval() with geom_slabinterval() for
creating CDF bar plots.
Roughly equivalent to:
stat_slabinterval(
  aes(
    thickness = after_stat(thickness(cdf, 0, 1)),
    justification = after_stat(0.5),
    side = after_stat("topleft")
  ),
  normalize = "none",
  expand = TRUE
)
Usage
stat_cdfinterval(
mapping = NULL,
data = NULL,
geom = "slabinterval",
position = "identity",
...,
normalize = "none",
expand = TRUE,
p_limits = c(NA, NA),
density = "bounded",
adjust = waiver(),
trim = waiver(),
breaks = waiver(),
align = waiver(),
outline_bars = waiver(),
point_interval = "median_qi",
limits = NULL,
n = waiver(),
.width = c(0.66, 0.95),
orientation = NA,
na.rm = FALSE,
show.legend = c(size = FALSE),
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
normalize |
<string> Groups within which to scale values of the
For a comprehensive discussion and examples of slab scaling and normalization, see the
|
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
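The two ways of specifying an analytical distribution described above can be compared side by side. This is a sketch, assuming ggplot2, ggdist, and distributional are installed; p_chr and p_obj are illustrative names.

```r
library(ggplot2)
library(ggdist)
library(distributional)

dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

# Character form: "norm" resolves to pnorm()/qnorm()/dnorm(), and
# arg1 / arg2 supply their first two distribution arguments (mean and sd)
p_chr = ggplot(dist_df, aes(y = group, dist = "norm", arg1 = mean, arg2 = sd)) +
  stat_cdfinterval()

# Object form: the same distributions as a vectorized distributional object
p_obj = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_cdfinterval()
```

The object form is the one the text recommends, since xdist and ydist also tell the stat which axis the distribution lies on.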
Value
A ggplot2::Stat representing a CDF bar geometry which can
be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat()
function or the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
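The recommendation to map computed variables directly, rather than going through the deprecated f, might look like the following sketch. It overlays the PDF and CDF of the same distributions using two stat_slab() layers; it assumes ggplot2, ggdist, and distributional are installed, and the styling choices are arbitrary.

```r
library(ggplot2)
library(ggdist)
library(distributional)

dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

p = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  # slab thickness follows the density
  stat_slab(aes(thickness = after_stat(pdf)), fill = "gray75") +
  # slab thickness follows the cumulative distribution function
  stat_slab(aes(thickness = after_stat(cdf)), fill = NA, colour = "black")
p
```

Because these are two separate layers, each has its own thickness scaling by default, so the two curves are not drawn on a common vertical scale.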
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval())
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
- thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
- datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
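The interaction between side, justification, and scale can be tried out on a CDF bar plot as in the sketch below. It assumes ggplot2 and ggdist are installed; the data and the specific values are illustrative.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Mirror the CDF bars on both sides and shrink them slightly
p_both = ggplot(df, aes(x = value, y = group)) +
  stat_cdfinterval(side = "both", scale = 0.8)

# Draw the bars on the bottom; justification = 1 is the value the text
# above associates with side = "bottom" when justification is left NULL
p_bottom = ggplot(df, aes(x = value, y = group)) +
  stat_cdfinterval(side = "bottom", justification = 1)
```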
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
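As a minimal sketch of the override aesthetics listed above, the following sets the slab, interval, and point appearance independently on stat_cdfinterval(). The data frame and the specific colors are illustrative assumptions, not taken from the package documentation.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 500 draws for each of three groups
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p <- ggplot(df, aes(x = value, y = group)) +
  stat_cdfinterval(
    slab_fill = "gray85",          # applies to the slab (the CDF bar) only
    interval_colour = "steelblue", # applies to the interval line only
    point_colour = "firebrick",    # applies to the point only
    point_size = 3                 # point size, set directly
  )
p
```

Setting the unprefixed colour or fill instead would change several sub-geometries at once, which is why the prefixed overrides are handy when only one part should differ.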
See Also
See geom_slabinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_cdfinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_cdfinterval()
Dot plot (shortcut stat)
Description
A combination of stat_slabinterval() and geom_dotsinterval() with sensible defaults for making dot plots. While geom_dotsinterval() is intended for use on data frames that have already been summarized using a point_interval() function, stat_dots() is intended for use directly on data frames of draws or of analytical distributions, and will perform the summarization using a point_interval() function. Geoms based on geom_dotsinterval() create dotplots that automatically determine a bin width that ensures the plot fits within the available space. They can also ensure dots do not overlap.
Roughly equivalent to:
stat_dotsinterval(
  aes(size = NULL),
  geom = "dots",
  show_point = FALSE,
  show_interval = FALSE,
  show.legend = NA
)
Usage
stat_dots(
  mapping = NULL,
  data = NULL,
  geom = "dots",
  position = "identity",
  ...,
  quantiles = NA,
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
quantiles |
<scalar numeric> Number of quantiles to plot in the dotplot. Use |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
The dots family of stats and geoms is similar to ggplot2::geom_dotplot(), with a number of differences:
- Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
- Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
- Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
- The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
- geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
- geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Uses side = "both" by default, and sets the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
- stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
- geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
- stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
- geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
- stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
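A quantile dotplot along these lines might look like the sketch below. The bus-arrival data are made up for illustration, and the choice of 20 quantiles is an assumption made so that each dot corresponds to roughly a 1-in-20 chance.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical predictive draws of bus arrival times (minutes) for two routes
draws <- data.frame(
  route = rep(c("Route 5", "Route 9"), each = 2000),
  minutes = c(rlnorm(2000, log(10), 0.25), rlnorm(2000, log(14), 0.15))
)

# Summarize each route's 2000 draws into 20 quantile dots
p <- ggplot(draws, aes(x = minutes, y = route)) +
  stat_dots(quantiles = 20)
p
```

A reader can then count dots to estimate a probability, for example how many of the 20 dots for a route fall beyond 15 minutes.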
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
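The parse_dist() workflow described above can be sketched as follows. The prior strings are hypothetical, and the sketch assumes a version of ggdist in which parse_dist() also returns a .dist_obj column of distributional objects alongside .dist and .args.

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specifications written as human-readable strings
priors <- data.frame(
  prior = c("normal(0,1)", "student_t(3,0,2)")
)

# parse_dist() adds .dist and .args columns; the .dist_obj column
# (assumed to be available here) can be mapped to xdist / ydist
parsed <- parse_dist(priors, prior)

p <- ggplot(parsed, aes(y = prior, xdist = .dist_obj)) +
  stat_dots(quantiles = 100)
p
```

Mapping .dist_obj to xdist rather than using dist and args keeps orientation detection working as described above.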
Value
A ggplot2::Stat representing a dot geometry which can be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
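As an illustration of using a computed variable, the sketch below maps after_stat(x > 6) to fill so that dots above a cutoff are colored differently. The data and the cutoff of 6 are arbitrary assumptions chosen for the example.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 500 draws for each of two groups
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = rnorm(1000, mean = rep(c(5, 7), each = 500))
)

# In a horizontal layout the slab's input values are exposed as `x`,
# so after_stat(x > 6) is TRUE for dots placed above the cutoff
p <- ggplot(df, aes(x = value, y = group)) +
  stat_dots(aes(fill = after_stat(x > 6)), quantiles = 50)
p
```

In a vertical layout the same idea would use after_stat(y > 6), since the computed position variable follows the orientation.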
Aesthetics
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_dots()) the following aesthetics are supported by the underlying geom:
Dots-specific (aka Slab-specific) aesthetics
family: The font family used to draw the dots.
order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right", justification is set to 0; when side is "bottom"/"left", justification is set to 1; and when side is "both", justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
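A short sketch of the side and scale aesthetics described above, set as constants on the layer. The data are hypothetical, and the values 0.6 and "both" are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 300 draws for each of three groups
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 300),
  value = rnorm(900, mean = rep(c(5, 7, 9), each = 300))
)

# side = "both" mirrors the dots around each group's baseline;
# scale = 0.6 uses 60% of the space allocated to each group
p <- ggplot(df, aes(x = value, y = group)) +
  stat_dots(side = "both", scale = 0.6, quantiles = 50)
p
```

With side = "both", the automatic justification becomes 0.5, so an interval (if one is drawn) would sit in the middle of the mirrored dots.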
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab outline). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
References
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See Also
See geom_dots() for the geom underlying this stat.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval stats: stat_dotsinterval(), stat_mcse_dots()
Examples
library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_dots()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_dots(quantiles = 50)
Dots + point + interval plot (shortcut stat)
Description
A combination of stat_slabinterval() and geom_dotsinterval() with sensible defaults for making dots + point + interval plots. While geom_dotsinterval() is intended for use on data frames that have already been summarized using a point_interval() function, stat_dotsinterval() is intended for use directly on data frames of draws or of analytical distributions, and will perform the summarization using a point_interval() function. Geoms based on geom_dotsinterval() create dotplots that automatically determine a bin width that ensures the plot fits within the available space. They can also ensure dots do not overlap.
Usage
stat_dotsinterval(
  mapping = NULL,
  data = NULL,
  geom = "dotsinterval",
  position = "identity",
  ...,
  quantiles = NA,
  point_interval = "median_qi",
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
quantiles |
<scalar numeric> Number of quantiles to plot in the dotplot. Use |
point_interval |
<function | string> A function from the |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
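The point_interval and .width arguments from the usage above can be sketched as follows. The data are hypothetical, and mean_qi with 50% and 90% intervals is an arbitrary alternative to the defaults shown in the usage (median_qi with 66% and 95%).

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 500 draws for each of two groups
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = rnorm(1000, mean = rep(c(5, 7), each = 500))
)

# Summarize each group with its mean and 50% / 90% quantile intervals,
# and draw the samples as 50 quantile dots
p <- ggplot(df, aes(x = value, y = group)) +
  stat_dotsinterval(
    point_interval = "mean_qi",
    .width = c(0.5, 0.9),
    quantiles = 50
  )
p
```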
Details
The dots family of stats and geoms is similar to ggplot2::geom_dotplot(), with a number of differences:
- Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
- Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
- Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
- The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
- geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
- geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Uses side = "both" by default, and sets the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
- stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
- geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
- stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
- geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
- stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
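The character-name form of dist described above can be sketched as follows. The parameter data frame is hypothetical, and orientation is set explicitly here on the assumption that automatic detection may be less reliable with the dist aesthetic, as noted above.

```r
library(ggplot2)
library(ggdist)

# Hypothetical parameters for two Normal distributions
params <- data.frame(
  group = c("a", "b"),
  mu = c(0, 2),
  sigma = c(1, 0.5)
)

# "norm" is resolved through pnorm() / qnorm() / dnorm();
# arg1 and arg2 are passed on as the mean and sd arguments
p <- ggplot(params, aes(y = group, dist = "norm", arg1 = mu, arg2 = sigma)) +
  stat_dotsinterval(quantiles = 50, orientation = "horizontal")
p
```

The equivalent and generally preferable form would be aes(y = group, xdist = dist_normal(mu, sigma)) using the distributional package.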
Value
A ggplot2::Stat representing a dots + point + interval geometry which can be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
Aesthetics
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_dotsinterval()) the following aesthetics are supported by the underlying geom:
Dots-specific (aka Slab-specific) aesthetics
family: The font family used to draw the dots.
order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on whether orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on whether orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right", justification is set to 0; when side is "bottom"/"left", justification is set to 1; and when side is "both", justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab outline). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override forfill
: the fill color of the slab.slab_colour
: (orslab_color
) Override forcolour
/color
: the outline color of the slab.slab_alpha
: Override foralpha
: the opacity of the slab.slab_linewidth
: Override forlinwidth
: the width of the outline of the slab.slab_linetype
: Override forlinetype
: the line type of the outline of the slab.slab_shape
: Override forshape
: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha
: Override for alpha: the opacity of the interval.
interval_linetype
: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill: the fill color of the point.
point_colour
: (or point_color) Override for colour/color: the outline color of the point.
point_alpha
: Override for alpha: the opacity of the point.
point_size
: Override for size: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth.
interval_size
: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
References
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See Also
See geom_dotsinterval() for the geom underlying this stat.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval stats: stat_dots(), stat_mcse_dots()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)
theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_dotsinterval()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_dotsinterval(quantiles = 50)
Eye (violin + interval) plot (shortcut stat)
Description
Shortcut version of stat_slabinterval() with geom_slabinterval() for creating eye (violin + interval) plots.
Roughly equivalent to:
stat_slabinterval(aes(side = after_stat("both")))
Usage
stat_eye(
mapping = NULL,
data = NULL,
geom = "slabinterval",
position = "identity",
...,
p_limits = c(NA, NA),
density = "bounded",
adjust = waiver(),
trim = waiver(),
breaks = waiver(),
align = waiver(),
outline_bars = waiver(),
expand = FALSE,
point_interval = "median_qi",
limits = NULL,
n = waiver(),
.width = c(0.66, 0.95),
orientation = NA,
na.rm = FALSE,
show.legend = c(size = FALSE),
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc.) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
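To make the "p", "q", and "d" naming convention concrete, the base R sketch below looks up those functions for a distribution name and evaluates them with a list of arguments. The helper dist_fun() is hypothetical and exists only for illustration; it is not part of ggdist and is not a description of how ggdist resolves distribution names internally.

```r
# Hypothetical helper (not part of ggdist): find the "d", "p", or "q"
# function for a distribution name such as "norm".
dist_fun <- function(prefix, dist) {
  match.fun(paste0(prefix, dist))
}

dist <- "norm"
args <- list(mean = 0, sd = 2)  # plays the role of the `args` list column

# Density at 0, cumulative probability at 0, and the median of Normal(0, 2)
density_at_0 <- do.call(dist_fun("d", dist), c(list(0), args))
cdf_at_0     <- do.call(dist_fun("p", dist), c(list(0), args))
median_value <- do.call(dist_fun("q", dist), c(list(0.5), args))
```

Any name for which all three prefixed functions can be found (for example "gamma" or "beta" in base R) fits the same pattern.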
Value
A ggplot2::Stat representing an eye (violin + interval) geometry which can be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
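As an illustration of using a computed variable, the sketch below maps level onto fill so that each part of the slab is colored by the smallest interval containing it. The data frame df and the choice of .width values are assumptions made for this example; adding 1 to .width is one way to make sure the tails also fall inside an interval.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Color each part of the slab by the smallest interval containing it.
p <- ggplot(df, aes(x = value, y = group)) +
  stat_eye(aes(fill = after_stat(level)), .width = c(0.66, 0.95, 1)) +
  scale_fill_brewer(na.translate = FALSE)
```

Printing p draws the plot; other computed variables such as pdf or cdf can be mapped in the same way.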
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y
: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight
: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args
: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness
: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side
: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification
: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype
: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
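The default rule for justification described above can be written out as a small lookup. This is only a sketch: default_justification() is a hypothetical helper, not a ggdist function, and it covers only the side values whose defaults are stated in the description of justification (treating "topright" and "bottomleft" as the synonyms listed under side).

```r
# Hypothetical helper (not part of ggdist): the documented default
# justification for a given value of `side`.
default_justification <- function(side) {
  switch(side,
    top = , right = , topright = 0,       # slab above/right: interval at the bottom/left edge
    bottom = , left = , bottomleft = 1,   # slab below/left: interval at the top/right edge
    both = 0.5,                           # mirrored slab: interval in the middle
    stop("no default stated here for side = \"", side, "\"")
  )
}

eye_default     <- default_justification("both")
halfeye_default <- default_justification("top")
```

This is why an eye plot (side = "both") centers the interval in the slab, while a slab drawn on the top or right has the interval along its base.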
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp
: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size
: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill: the fill color of the slab.
slab_colour
: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha
: Override for alpha: the opacity of the slab.
slab_linewidth
: Override for linewidth: the width of the outline of the slab.
slab_linetype
: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha
: Override for alpha: the opacity of the interval.
interval_linetype
: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill: the fill color of the point.
point_colour
: (or point_color) Override for colour/color: the outline color of the point.
point_alpha
: Override for alpha: the opacity of the point.
point_size
: Override for size: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth.
interval_size
: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
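The override aesthetics can also be set as constants on the layer to style each sub-geometry independently. The sketch below is one possible combination; the data frame df and the specific colors and sizes are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Style the slab, interval, and point separately using the override
# aesthetics instead of the shared fill / colour / size aesthetics.
p <- ggplot(df, aes(x = value, y = group)) +
  stat_eye(
    slab_fill = "gray85",
    slab_colour = "gray40",
    slab_linewidth = 0.5,
    interval_colour = "steelblue4",
    point_colour = "firebrick",
    point_size = 3
  )
```

Because point_size is an override, it sets the point size directly rather than going through the size transformation described under the line aesthetics.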
See Also
See geom_slabinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)
theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
# `group` is recycled to length 1500, matching the recycled means and sds
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_eye()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_eye()
Gradient + interval plot (shortcut stat)
Description
Shortcut version of stat_slabinterval() with geom_slabinterval() for creating gradient + interval plots.
Roughly equivalent to:
stat_slabinterval(
  aes(
    justification = after_stat(0.5),
    thickness = after_stat(thickness(1)),
    slab_alpha = after_stat(f)
  ),
  fill_type = "auto",
  show.legend = c(size = FALSE, slab_alpha = FALSE)
)
If your graphics device supports it, it is recommended to use this stat with fill_type = "gradient" (see the description of that parameter). On R >= 4.2, support for fill_type = "gradient" should be auto-detected based on the graphics device you are using.
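A minimal sketch of requesting gradient fills explicitly, rather than relying on auto-detection, is shown below. The data frame df is made up for the example, and whether the gradient actually renders depends on the graphics device in use.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Ask for true gradient fills; the default, fill_type = "auto",
# leaves the choice to ggdist.
p <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(fill_type = "gradient")
```

Leaving fill_type at its default is usually the safer choice when the output device is not known in advance.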
Usage
stat_gradientinterval(
mapping = NULL,
data = NULL,
geom = "slabinterval",
position = "identity",
...,
fill_type = "auto",
p_limits = c(NA, NA),
density = "bounded",
adjust = waiver(),
trim = waiver(),
breaks = waiver(),
align = waiver(),
outline_bars = waiver(),
expand = FALSE,
point_interval = "median_qi",
limits = NULL,
n = waiver(),
.width = c(0.66, 0.95),
orientation = NA,
na.rm = FALSE,
show.legend = c(size = FALSE, slab_alpha = FALSE),
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
fill_type |
<string> What type of fill to use when the fill color or alpha varies within a slab. One of:
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc.) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
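As a sketch of the parse_dist() workflow mentioned above, the example below parses two human-readable specs and plots them. The priors data frame and its prior column are invented for illustration, and the example assumes the default output column names of parse_dist() (.dist, .args, and .dist_obj).

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specifications, written as human-readable strings
priors <- data.frame(
  prior = c("normal(0,1)", "student_t(3,0,2)")
)

# Adds columns describing each distribution: its R name (.dist), its
# arguments (.args), and a distribution object (.dist_obj)
priors <- parse_dist(priors, prior)

p <- ggplot(priors, aes(y = prior, xdist = .dist_obj)) +
  stat_gradientinterval()
```

The same parsed data frame could instead be used with aes(dist = .dist, args = .args), though the xdist form is the one recommended in the text above.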
Value
A ggplot2::Stat representing a gradient + interval geometry which can be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
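Following the recommendation to map a computed variable directly rather than the deprecated f, the sketch below maps pdf onto slab_alpha explicitly. The data frame df is made up for the example, and the mapping is shown only to illustrate the pattern, not as a required setting.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)

# Map the computed density directly onto the slab opacity,
# instead of going through the deprecated `f` variable.
p <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(aes(slab_alpha = after_stat(pdf)))
```

Swapping pdf for cdf or 1 - cdf in the same mapping would shade the slab by the CDF or CCDF instead.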
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
x
: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y
: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight
: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist
: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist
: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist
: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args
: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness
: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side
: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale
: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification
: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype
: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax
: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin
: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax
: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill
: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha
: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp
: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp
: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size
: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke
: Width of the outline around the point sub-geometry.
linetype
: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override for fill: the fill color of the slab.
slab_colour
: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha
: Override for alpha: the opacity of the slab.
slab_linewidth
: Override for linewidth: the width of the outline of the slab.
slab_linetype
: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour
: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha
: Override for alpha: the opacity of the interval.
interval_linetype
: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override for fill: the fill color of the point.
point_colour
: (or point_color) Override for colour/color: the outline color of the point.
point_alpha
: Override for alpha: the opacity of the point.
point_size
: Override for size: the size of the point.
Deprecated aesthetics
slab_size
: Use slab_linewidth.
interval_size
: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See Also
See geom_slabinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)
theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
# `group` is recycled to length 1500, matching the recycled means and sds
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_gradientinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_gradientinterval()
Half-eye (density + interval) plot (shortcut stat)
Description
Equivalent to stat_slabinterval(), whose default settings create half-eye (density + interval) plots.
Usage
stat_halfeye(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection. These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
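As a minimal sketch of the parse_dist() workflow described above: the data frame `priors` and its `prior` column are made up for illustration (they are not produced by any real model here), and the example assumes parse_dist() adds the `.dist` and `.args` columns under their default names.

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specs, written in the human-readable style brms uses
priors = data.frame(
  prior = c("normal(0,1)", "student_t(3,0,2)"),
  stringsAsFactors = FALSE
)

# parse_dist() adds .dist (R distribution name) and .args (list column of arguments)
parsed = parse_dist(priors, prior)

# Map the parsed name and arguments onto the dist and args aesthetics
ggplot(parsed, aes(y = prior, dist = .dist, args = .args)) +
  stat_halfeye()
```

Keeping the original `prior` string on the y axis labels each half-eye with the spec it was parsed from.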
Value
A ggplot2::Stat representing a half-eye (density + interval) geometry which can
be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
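The computed variables can be mapped with after_stat(). The sketch below is illustrative only: the simulated data frame `df` is an assumption, and it maps the computed `level` variable onto `fill` so that each part of the slab is shaded by the smallest interval containing it.

```r
library(ggplot2)
library(ggdist)

# Simulated sample data (illustrative only)
set.seed(1234)
df = data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p = ggplot(df, aes(x = value, y = group)) +
  # shade the slab by the smallest interval containing each x value
  stat_halfeye(aes(fill = after_stat(level)), .width = c(0.66, 0.95)) +
  # parts of the slab outside every interval have level NA; drop NA from the legend
  scale_fill_brewer(na.translate = FALSE)
p
```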
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval())
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
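To make the sub-geometry override aesthetics concrete, here is a small sketch that sets several of them as constants on one layer. The data and the particular colors and sizes are arbitrary choices for illustration, not recommended defaults.

```r
library(ggplot2)
library(ggdist)

# Simulated sample data (illustrative only)
set.seed(1234)
df = data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p = ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(
    side = "bottom",              # draw the slab below the interval
    scale = 0.6,                  # use 60% of the space allocated to each group
    slab_fill = "skyblue",        # overrides fill for the slab only
    slab_colour = "gray30",       # outline of the slab
    slab_linewidth = 0.5,
    interval_colour = "gray30",   # overrides colour for the interval only
    point_colour = "firebrick",   # overrides colour for the point only
    point_size = 3
  )
p
```

Because each override targets a single sub-geometry, the slab, interval, and point can be styled independently without splitting the layer into separate geoms.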
See Also
See geom_slabinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)
theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_halfeye()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_halfeye()
Histogram + interval plot (shortcut stat)
Description
Shortcut version of stat_slabinterval() with geom_slabinterval() for
creating histogram + interval plots.
Roughly equivalent to:
stat_slabinterval(
  density = "histogram"
)
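A brief sketch of how the histogram-related arguments might be used, assuming `breaks` accepts a single number of bins and `outline_bars` toggles the outlines between bars (both are passed on to the histogram density estimator). The data and the choice of 30 bins are illustrative.

```r
library(ggplot2)
library(ggdist)

# Simulated sample data (illustrative only)
set.seed(1234)
df = data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p = ggplot(df, aes(x = value, y = group)) +
  stat_histinterval(
    breaks = 30,             # requested number of bins
    outline_bars = TRUE,     # outline each bar, not just the histogram silhouette
    slab_colour = "gray45"   # outlines are only visible with a slab colour set
  )
p
```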
Usage
stat_histinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  density = "histogram",
  p_limits = c(NA, NA),
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  point_interval = "median_qi",
  limits = NULL,
  n = waiver(),
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
density |
<function | string> Density estimator for sample data. One of:
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection. These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
Value
A ggplot2::Stat representing a histogram + interval geometry which can
be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
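One use of the computed pdf and n variables is to make histogram thickness reflect group size rather than density alone. The sketch below assumes two simulated groups of deliberately unequal size (200 and 800 draws); mapping `pdf * n` onto thickness is one possible choice, not the stat's default.

```r
library(ggplot2)
library(ggdist)

# Two simulated groups of unequal size (illustrative only)
set.seed(1234)
df = data.frame(
  group = rep(c("small", "large"), times = c(200, 800)),
  value = c(rnorm(200, mean = 5), rnorm(800, mean = 7))
)

p = ggplot(df, aes(x = value, y = group)) +
  # pdf * n scales each histogram by its sample size, so the larger group is taller
  stat_histinterval(aes(thickness = after_stat(pdf * n)))
p
```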
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval())
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
See Also
See geom_slabinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)
theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_histinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_histinterval()
Multiple-interval plot (shortcut stat)
Description
Shortcut version of stat_slabinterval() with geom_interval() for
creating multiple-interval plots.
Roughly equivalent to:
stat_slabinterval(
  aes(
    colour = after_stat(level),
    size = NULL
  ),
  geom = "interval",
  show_point = FALSE,
  .width = c(0.5, 0.8, 0.95),
  show_slab = FALSE,
  show.legend = NA
)
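Because the interval colour is mapped to the computed level by default, a discrete colour scale is usually all that is needed to tell the nested intervals apart. The following is a sketch with simulated data; the choice of scale_colour_brewer() is illustrative.

```r
library(ggplot2)
library(ggdist)

# Simulated sample data (illustrative only)
set.seed(1234)
df = data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

p = ggplot(df, aes(x = value, y = group)) +
  # three nested intervals per group, coloured by after_stat(level)
  stat_interval(.width = c(0.5, 0.8, 0.95)) +
  scale_colour_brewer()
p
```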
Usage
stat_interval(
  mapping = NULL,
  data = NULL,
  geom = "interval",
  position = "identity",
  ...,
  .width = c(0.5, 0.8, 0.95),
  point_interval = "median_qi",
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
.width |
<numeric> The |
point_interval |
<function | string> A function from the |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection. These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
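The character form of dist can be sketched as follows, using "norm" with arg1 and arg2 standing in for the mean and sd arguments of pnorm(), qnorm(), and dnorm(). The data frame and its column names are hypothetical; the xdist form is generally preferable, as noted above.

```r
library(ggplot2)
library(ggdist)

# Hypothetical parameters of two Normal distributions
dist_df = data.frame(
  group = c("a", "b"),
  mu = c(5, 7),
  sigma = c(1, 1.5)
)

# dist names the distribution; arg1 and arg2 are its first two arguments
# (mean and sd for "norm")
p = ggplot(dist_df, aes(y = group, dist = "norm", arg1 = mu, arg2 = sigma)) +
  stat_interval()
p
```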
Value
A ggplot2::Stat representing a multiple-interval geometry which can
be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
-
x
ory
: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it isx
ory
depends onorientation
-
xmin
orymin
: For intervals, the lower end of the interval from the interval function. -
xmax
orymax
: For intervals, the upper end of the interval from the interval function. -
.width
: For intervals, the interval width as a numeric value in[0, 1]
. For slabs, the width of the smallest interval containing that value of the slab. -
level
: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab. -
pdf
: For slabs, the probability density function (PDF). Ifoptions("ggdist.experimental.slab_data_in_intervals")
isTRUE
: For intervals, the PDF at the point summary; intervals also havepdf_min
andpdf_max
for the PDF at the lower and upper ends of the interval. -
cdf
: For slabs, the cumulative distribution function. Ifoptions("ggdist.experimental.slab_data_in_intervals")
isTRUE
: For intervals, the CDF at the point summary; intervals also havecdf_min
andcdf_max
for the CDF at the lower and upper ends of the interval.
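One way to see the computed variables listed above is to build a plot and inspect its layer data with ggplot2::layer_data(). The sketch below uses simulated data and hypothetical column names; it assumes the computed columns are carried through to the built layer unchanged.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 500 draws for each of two groups
df <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = rnorm(1000, mean = rep(c(5, 7), each = 500))
)

p <- ggplot(df, aes(x = value, y = group)) +
  stat_interval(.width = c(0.5, 0.95))

# One row per group and interval width; xmin/xmax are the interval ends,
# .width is numeric and level is the same width as an ordered factor.
computed <- layer_data(p)
computed[, c("xmin", "x", "xmax", ".width", "level")]
```

Any of these columns can then be mapped with after_stat(), e.g. aes(color = after_stat(level)).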
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_interval()) the following aesthetics are supported by the underlying geom:
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Deprecated aesthetics
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See Also
See geom_interval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_pointinterval(), stat_slab(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_interval() +
  scale_color_brewer()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_interval() +
  scale_color_brewer()
Line + multiple-ribbon plot (shortcut stat)
Description
A combination of stat_slabinterval() and geom_lineribbon() with sensible defaults for making line + multiple-ribbon plots. While geom_lineribbon() is intended for use on data frames that have already been summarized using a point_interval() function, stat_lineribbon() is intended for use directly on data frames of draws or of analytical distributions, and will perform the summarization using a point_interval() function.
Roughly equivalent to:
stat_slabinterval(
  aes(
    group = after_stat(level),
    fill = after_stat(level),
    order = after_stat(level),
    size = NULL
  ),
  geom = "lineribbon",
  .width = c(0.5, 0.8, 0.95),
  show_slab = FALSE,
  show.legend = NA
)
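The contrast between the stat and the geom can be sketched on simulated draws (the data frame draws below is hypothetical). The first plot lets stat_lineribbon() do the summarization; the second summarizes first with median_qi(), one of the point_interval() family, and passes the result to geom_lineribbon(), which is left to pick its default fill mapping.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(12345)
# Hypothetical draws: 100 samples of y at each of 10 x positions
draws <- tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
)

# Route 1: the stat summarizes the draws itself
p_stat <- ggplot(draws, aes(x = x, y = y)) +
  stat_lineribbon(.width = c(0.5, 0.8, 0.95)) +
  scale_fill_brewer()

# Route 2: summarize first, then hand the intervals to the geom.
# median_qi() returns y (the median), .lower, .upper, and .width.
summarized <- draws %>%
  group_by(x) %>%
  median_qi(y, .width = c(0.5, 0.8, 0.95))

p_geom <- ggplot(summarized, aes(x = x, y = y, ymin = .lower, ymax = .upper)) +
  geom_lineribbon() +
  scale_fill_brewer()
```

Route 2 is mainly useful when the summaries already exist or are expensive to recompute; otherwise the stat is the shorter path.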
Usage
stat_lineribbon(
mapping = NULL,
data = NULL,
geom = "lineribbon",
position = "identity",
...,
.width = c(0.5, 0.8, 0.95),
point_interval = "median_qi",
orientation = NA,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
.width |
<numeric> The |
point_interval |
<function | string> A function from the |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.

See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
Value
A ggplot2::Stat representing a line + multiple-ribbon geometry which can be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
Aesthetics
The line+ribbon stats and geoms have a wide variety of aesthetics that control the appearance of their two sub-geometries: the line and the ribbon.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_lineribbon()) the following aesthetics are supported by the underlying geom:
Ribbon-specific aesthetics
- xmin: Left edge of the ribbon sub-geometry (if orientation = "horizontal").
- xmax: Right edge of the ribbon sub-geometry (if orientation = "horizontal").
- ymin: Lower edge of the ribbon sub-geometry (if orientation = "vertical").
- ymax: Upper edge of the ribbon sub-geometry (if orientation = "vertical").
- order: The order in which ribbons are drawn. Ribbons with the smallest mean value of order are drawn first (i.e., will be drawn below ribbons with larger mean values of order). If order is not supplied to geom_lineribbon(), -abs(xmax - xmin) or -abs(ymax - ymin) (depending on orientation) is used, having the effect of drawing the widest (on average) ribbons on the bottom. stat_lineribbon() uses order = after_stat(level) by default, causing the ribbons generated from the largest .width to be drawn on the bottom.
Color aesthetics
- colour: (or color) The color of the line sub-geometry.
- fill: The fill color of the ribbon sub-geometry.
- alpha: The opacity of the line and ribbon sub-geometries.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of line. In ggplot2 < 3.4, was called size.
- linetype: Type of line (e.g., "solid", "dashed", etc)
Other aesthetics (these work as in standard geoms)
- group
See examples of some of these aesthetics in action in vignette("lineribbon").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See Also
See geom_lineribbon() for the geom underlying this stat.
Other lineribbon stats: stat_ribbon()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(12345)
tibble(
  x = rep(1:10, 100),
  y = rnorm(1000, x)
) %>%
  ggplot(aes(x = x, y = y)) +
  stat_lineribbon() +
  scale_fill_brewer()

# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
) %>%
  ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
  stat_lineribbon() +
  scale_fill_brewer()
Blurry MCSE dot plot (stat)
Description
Variant of stat_dots() for creating blurry dotplots of quantiles. Uses posterior::mcse_quantile() to calculate the Monte Carlo Standard Error of each quantile computed for the dotplot, yielding an se computed variable that is by default mapped onto the sd aesthetic of geom_blur_dots().
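To make the description concrete, the following sketch computes quantiles and their Monte Carlo Standard Errors by hand and passes them to geom_blur_dots(), then builds what should be a comparable plot with stat_mcse_dots(). The choice of probabilities via ppoints(100) is an assumption made for illustration; the exact probabilities the stat uses internally are not stated in this section. The posterior package must be installed.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
x <- rnorm(1000)

# Assumed for illustration: 100 evenly spaced probabilities
probs <- ppoints(100)

# One row per quantile: its value and the MCSE of that quantile
manual <- data.frame(
  q  = quantile(x, probs),
  se = posterior::mcse_quantile(x, probs)
)

# Manual route: the MCSE is mapped onto the sd (blur) aesthetic
p_manual <- ggplot(manual, aes(x = q, sd = se)) +
  geom_blur_dots()

# Shortcut route: the stat computes quantiles and se itself
p_stat <- ggplot(data.frame(x = x), aes(x = x)) +
  stat_mcse_dots(quantiles = 100)
```

Dots in the tails are typically blurrier than dots near the center, since extreme quantiles are estimated less precisely from a finite sample.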
Usage
stat_mcse_dots(
mapping = NULL,
data = NULL,
geom = "blur_dots",
position = "identity",
...,
quantiles = NA,
orientation = NA,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
quantiles |
<scalar numeric> Number of quantiles to plot in the dotplot. Use |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
The dots family of stats and geoms is similar to ggplot2::geom_dotplot() but with a number of differences:
- Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
- Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
- Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
- The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
Stats and geoms in this family include:
- geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
- geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. Uses side = "both" by default, and sets the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
- stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s.
- geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly).
- stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you).
- geom_blur_dots(): blurry dotplots that allow the standard deviation of a blur applied to each dot to be specified using the sd aesthetic.
- stat_mcse_dots(): blurry dotplots of quantiles using the Monte Carlo Standard Error of each quantile.
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
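A quantile dotplot of the kind described above can be sketched with stat_dots() and its quantiles argument. The simulated arrival-time predictions and the choice of 20 quantiles are hypothetical, picked only so that each dot stands for a 1-in-20 chance.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical predictive draws, e.g. minutes until a bus arrives
draws <- data.frame(minutes = rlnorm(5000, meanlog = log(10), sdlog = 0.3))

# Summarize the 5000 draws into 20 quantile dots
p_quantile_dots <- ggplot(draws, aes(x = minutes)) +
  stat_dots(quantiles = 20)
```

With 20 dots, a reader can count dots to the left of a cutoff to estimate a probability in steps of 5%.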
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.

See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
Value
A ggplot2::Stat representing a blurry MCSE dot geometry which can be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
- se: For dots, the Monte Carlo Standard Error of the quantile corresponding to each dot.
Aesthetics
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_blur_dots()) the following aesthetics are supported by the underlying geom:
Dots-specific (aka Slab-specific) aesthetics
- sd: The standard deviation (in data units) of the blur associated with each dot.
- order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
- datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacity separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
- slab_fill: Override for fill: the fill color of the slab.
- slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
- slab_alpha: Override for alpha: the opacity of the slab.
- slab_linewidth: Override for linewidth: the width of the outline of the slab.
- slab_linetype: Override for linetype: the line type of the outline of the slab.
- slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
- point_fill: Override for fill: the fill color of the point.
- point_colour: (or point_color) Override for colour/color: the outline color of the point.
- point_alpha: Override for alpha: the opacity of the point.
- point_size: Override for size: the size of the point.
Deprecated aesthetics
- slab_size: Use slab_linewidth.
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("dotsinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
References
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092–5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See Also
See geom_blur_dots() for the geom underlying this stat.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval stats: stat_dots(), stat_dotsinterval()
Examples
library(dplyr)
library(ggplot2)
library(ggdist)

theme_set(theme_ggdist())

set.seed(1234)
data.frame(x = rnorm(1000)) %>%
  ggplot(aes(x = x)) +
  stat_mcse_dots(quantiles = 100, layout = "weave")
Point + multiple-interval plot (shortcut stat)
Description
Shortcut version of stat_slabinterval() with geom_pointinterval() for creating point + multiple-interval plots.
Roughly equivalent to:
stat_slabinterval(
  geom = "pointinterval",
  show_slab = FALSE
)
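As a brief sketch of how the shortcut is used, the example below overrides the default point summary and interval widths. The data frame and the choice of "mean_qi" with 50% and 90% intervals are hypothetical, picked only to show that both point_interval and .width can be changed.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 500 draws for each of three groups
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 500),
  value = rnorm(1500, mean = rep(c(5, 7, 9), each = 500))
)

# Mean with 50% and 90% quantile intervals instead of the defaults
# (median with 66% and 95% intervals)
p <- ggplot(df, aes(x = value, y = group)) +
  stat_pointinterval(point_interval = "mean_qi", .width = c(0.5, 0.9))
```

The point_interval argument is given as a string here, matching the form shown in the Usage signature; a function from the point_interval() family should work as well.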
Usage
stat_pointinterval(
mapping = NULL,
data = NULL,
geom = "pointinterval",
position = "identity",
...,
point_interval = "median_qi",
.width = c(0.66, 0.95),
orientation = NA,
na.rm = FALSE,
show.legend = c(size = FALSE),
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
point_interval |
<function | string> A function from the |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc.) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
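As a minimal sketch of the character-vector form, parse_dist() can turn human-readable specs into dist and args columns that are then mapped onto the dist and args aesthetics. The data frame priors and its prior column are hypothetical names made up for this illustration; .dist and .args are the default output column names of parse_dist().

```r
library(ggplot2)
library(ggdist)

# Hypothetical prior specifications, written as human-readable strings
priors = data.frame(
  prior = c("normal(0,1)", "student_t(3,0,1)")
)

# parse_dist() adds a .dist column (the R distribution name, e.g. "norm")
# and an .args list column (the arguments of each distribution)
parsed = parse_dist(priors, prior)

# Map the parsed columns onto the dist and args aesthetics
p = ggplot(parsed, aes(y = prior, dist = .dist, args = .args)) +
  stat_pointinterval()
```

Printing p draws one point with two intervals per prior. The xdist aesthetic remains the recommended route when a distribution object is available.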
Value
A ggplot2::Stat representing a point + multiple-interval geometry which can
be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
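One way to see what these computed variables look like is to build a plot and inspect the layer with ggplot2::layer_data(). The sketch below is illustrative only; dist_df is a made-up data frame mirroring the one in the Examples section.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Made-up analytical distributions, one per group
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

p = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_pointinterval(.width = c(0.66, 0.95))

# layer_data() returns the data the stat produced for the first layer,
# including the point summary (x) and the interval ends (xmin, xmax)
computed = layer_data(p)
computed[, c("x", "xmin", "xmax", ".width")]
```

Each group contributes one row per requested interval width, so the same x (the point summary) appears alongside a different xmin/xmax pair for each .width.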
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_pointinterval())
the following aesthetics are supported by the underlying geom:
Interval-specific aesthetics
- xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
- xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
- ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
- ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
- shape: Shape type used to draw the point sub-geometry.
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Interval-specific color and line override aesthetics
- interval_colour: (or interval_color) Override for colour/color: the color of the interval.
- interval_alpha: Override for alpha: the opacity of the interval.
- interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
- point_fill: Override for fill: the fill color of the point.
- point_colour: (or point_color) Override for colour/color: the outline color of the point.
- point_alpha: Override for alpha: the opacity of the point.
- point_size: Override for size: the size of the point.
Deprecated aesthetics
- interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
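The override aesthetics can also be set to constants when the layer is created. The following is a small sketch rather than a recommended style; dist_df and the particular colors and sizes are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Made-up analytical distributions, one per group
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

p = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_pointinterval(
    shape = 21,                     # a point shape with both a fill and an outline
    point_fill = "white",           # overrides fill for the point only
    point_colour = "black",         # overrides colour for the point outline only
    point_size = 3,                 # point size, set directly
    interval_colour = "steelblue",  # overrides colour for the interval only
    interval_linetype = "solid"     # overrides linetype for the interval only
  )
```

Because each override targets a single sub-geometry, the interval can be recolored without touching the point, which plain colour would not allow.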
See Also
See geom_pointinterval() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_slab(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(distributional)
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
group = c("a", "b", "c"),
value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
ggplot(aes(x = value, y = group)) +
stat_pointinterval()
# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
group = c("a", "b", "c"),
mean = c( 5, 7, 8),
sd = c( 1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
stat_pointinterval()
Multiple-ribbon plot (shortcut stat)
Description
A combination of stat_slabinterval() and geom_lineribbon() with sensible defaults
for making multiple-ribbon plots. While geom_lineribbon() is intended for use on data
frames that have already been summarized using a point_interval() function,
stat_ribbon() is intended for use directly on data frames of draws or of
analytical distributions, and will perform the summarization using a point_interval()
function.
Roughly equivalent to:
stat_lineribbon( show_point = FALSE )
Usage
stat_ribbon(
mapping = NULL,
data = NULL,
geom = "lineribbon",
position = "identity",
...,
.width = c(0.5, 0.8, 0.95),
point_interval = "median_qi",
orientation = NA,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
.width |
<numeric> The |
point_interval |
<function | string> A function from the |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends?
|
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc.) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
Value
A ggplot2::Stat representing a multiple-ribbon geometry which can
be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
Aesthetics
The line+ribbon stats and geoms have a wide variety of aesthetics that control
the appearance of their two sub-geometries: the line and the ribbon.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_lineribbon())
the following aesthetics are supported by the underlying geom:
Ribbon-specific aesthetics
- xmin: Left edge of the ribbon sub-geometry (if orientation = "horizontal").
- xmax: Right edge of the ribbon sub-geometry (if orientation = "horizontal").
- ymin: Lower edge of the ribbon sub-geometry (if orientation = "vertical").
- ymax: Upper edge of the ribbon sub-geometry (if orientation = "vertical").
- order: The order in which ribbons are drawn. Ribbons with the smallest mean value of order are drawn first (i.e., will be drawn below ribbons with larger mean values of order). If order is not supplied to geom_lineribbon(), -abs(xmax - xmin) or -abs(ymax - ymin) (depending on orientation) is used, having the effect of drawing the widest (on average) ribbons on the bottom. stat_lineribbon() uses order = after_stat(level) by default, causing the ribbons generated from the largest .width to be drawn on the bottom.
Color aesthetics
- colour: (or color) The color of the line sub-geometry.
- fill: The fill color of the ribbon sub-geometry.
- alpha: The opacity of the line and ribbon sub-geometries.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Other aesthetics (these work as in standard geoms)
- group
See examples of some of these aesthetics in action in vignette("lineribbon").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
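To illustrate the fill_ramp aesthetic, the sketch below keeps a single base fill color and lets the computed level variable ramp it, so that wider intervals are drawn lighter. The data frame ribbon_df and the color "steelblue" are arbitrary choices made for this example.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Made-up analytical distributions: one Normal per x position
ribbon_df = data.frame(
  x = 1:10,
  sd = seq(1, 3, length.out = 10)
)

p = ggplot(ribbon_df, aes(x = x, ydist = dist_normal(x, sd))) +
  stat_ribbon(
    aes(fill_ramp = after_stat(level)),  # ramp the fill according to interval level
    fill = "steelblue",                  # single base color for all ribbons
    .width = c(0.5, 0.8, 0.95)
  )
```

This leaves the fill scale free to encode something else, such as a grouping variable, while the ramp still distinguishes the interval widths.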
See Also
See geom_lineribbon() for the geom underlying this stat.
Other lineribbon stats: stat_lineribbon()
Examples
library(dplyr)
library(ggplot2)
library(distributional)
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(12345)
tibble(
x = rep(1:10, 100),
y = rnorm(1000, x)
) %>%
ggplot(aes(x = x, y = y)) +
stat_ribbon() +
scale_fill_brewer()
# ON ANALYTICAL DISTRIBUTIONS
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
tibble(
x = 1:10,
sd = seq(1, 3, length.out = 10)
) %>%
ggplot(aes(x = x, ydist = dist_normal(x, sd))) +
stat_ribbon() +
scale_fill_brewer()
Slab (ridge) plot (shortcut stat)
Description
Shortcut version of stat_slabinterval() with geom_slab() for
creating slab (ridge) plots.
Roughly equivalent to:
stat_slabinterval( aes(size = NULL), geom = "slab", show_point = FALSE, show_interval = FALSE, show.legend = NA )
Usage
stat_slab(
mapping = NULL,
data = NULL,
geom = "slab",
position = "identity",
...,
p_limits = c(NA, NA),
density = "bounded",
adjust = waiver(),
trim = waiver(),
breaks = waiver(),
align = waiver(),
outline_bars = waiver(),
expand = FALSE,
limits = NULL,
n = waiver(),
orientation = NA,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override
the default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc.) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
Value
A ggplot2::Stat representing a slab (ridge) geometry which can
be added to a ggplot() object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
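The recommendation to map a computed variable directly, rather than going through slab_type and f, can be sketched as follows. Here 1 - cdf is mapped onto the thickness aesthetic to draw a complementary CDF; dist_df is a made-up data frame used only for illustration.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Made-up analytical distributions, one per group
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

# Map the complementary CDF (1 - cdf) onto slab thickness
p = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab(aes(thickness = after_stat(1 - cdf)))
```

Swapping 1 - cdf for cdf or pdf changes which function the slab displays without any other change to the layer.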
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slab())
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
- thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
- justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
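As a brief illustration of the side and scale aesthetics, the sketch below draws mirrored (violin-like) slabs that use only half of the space allocated to each group. The data frame dist_df and the value 0.5 are arbitrary choices for the example.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Made-up analytical distributions, one per group
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean = c(5, 7, 8),
  sd = c(1, 1.5, 1)
)

p = ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_slab(
    side = "both",  # mirror the slab on both sides, as in a violin plot
    scale = 0.5     # use half of the region allocated to each slab
  )
```

With side = "both" and no explicit justification, the slabs are centered on their group positions, per the justification rule described above.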
Color aesthetics
- colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
- fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
- alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
- size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
- slab_fill: Override for fill: the fill color of the slab.
- slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
- slab_alpha: Override for alpha: the opacity of the slab.
- slab_linewidth: Override for linewidth: the width of the outline of the slab.
- slab_linetype: Override for linetype: the line type of the outline of the slab.
Deprecated aesthetics
- slab_size: Use slab_linewidth.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
See Also
See geom_slab() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(distributional)
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
group = c("a", "b", "c"),
value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
ggplot(aes(x = value, y = group)) +
stat_slab()
# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
group = c("a", "b", "c"),
mean = c( 5, 7, 8),
sd = c( 1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
stat_slab()
# RIDGE PLOTS
# "ridge" plots can be created by expanding the slabs to the limits of the plot
# (expand = TRUE), allowing the density estimator to be nonzero outside the
# limits of the data (trim = FALSE), and increasing the height of the slabs.
data.frame(
group = letters[1:3],
value = rnorm(3000, 3:1)
) %>%
ggplot(aes(y = group, x = value)) +
stat_slab(color = "black", expand = TRUE, trim = FALSE, height = 2)
Slab + interval plots for sample data and analytical distributions (ggplot stat)
Description
"Meta" stat for computing distribution functions (densities or CDFs) + intervals for use with
geom_slabinterval(). Useful for creating eye plots, half-eye plots, CCDF bar plots,
gradient plots, histograms, and more. Sample data can be supplied to the x and y
aesthetics or analytical distributions (in a variety of formats) can be supplied to the
xdist and ydist aesthetics.
See Details.
Usage
stat_slabinterval(
mapping = NULL,
data = NULL,
geom = "slabinterval",
position = "identity",
...,
p_limits = c(NA, NA),
density = "bounded",
adjust = waiver(),
trim = waiver(),
breaks = waiver(),
align = waiver(),
outline_bars = waiver(),
expand = FALSE,
point_interval = "median_qi",
limits = NULL,
n = waiver(),
.width = c(0.66, 0.95),
orientation = NA,
na.rm = FALSE,
show.legend = c(size = FALSE),
inherit.aes = TRUE,
check.aes = TRUE,
check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override the
default connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
point_interval |
<function | string> A function from the |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
.width |
<numeric> The |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
A highly configurable stat for generating a variety of plots that combine a "slab" that describes a distribution plus a point summary and any number of intervals. Several "shortcut" stats are provided which combine multiple options to create useful geoms, particularly eye plots (a violin plot of density plus interval), half-eye plots (a density plot plus interval), CCDF bar plots (a complementary CDF plus interval), and gradient plots (a density encoded in color alpha plus interval).
The shortcut stats include:
- stat_eye(): Eye plots (violin + interval)
- stat_halfeye(): Half-eye plots (density + interval)
- stat_ccdfinterval(): CCDF bar plots (CCDF + interval)
- stat_cdfinterval(): CDF bar plots (CDF + interval)
- stat_gradientinterval(): Density gradient + interval plots
- stat_slab(): Density plots
- stat_histinterval(): Histogram + interval plots
- stat_pointinterval(): Point + interval plots
- stat_interval(): Interval plots
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
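The naming convention for character-vector distribution names can be illustrated in base R. The sketch below is not ggdist's internal code; dist_functions() is a hypothetical helper that looks up the "d", "p", and "q" functions for a name, and the args list plays the role of the arg1, arg2 (or args) aesthetics.

```r
# Hypothetical helper (not part of ggdist): find the density, CDF, and
# quantile functions that R defines for a distribution name such as "norm".
dist_functions <- function(name) {
  list(
    d = get(paste0("d", name), mode = "function"),
    p = get(paste0("p", name), mode = "function"),
    q = get(paste0("q", name), mode = "function")
  )
}

norm_funs <- dist_functions("norm")

# Distribution arguments, analogous to arg1 = 1, arg2 = 2
args <- list(mean = 1, sd = 2)

# Evaluate the quantile function, PDF, and CDF with those arguments
median_value <- do.call(norm_funs$q, c(list(0.5), args))
density_at_median <- do.call(norm_funs$d, c(list(median_value), args))
cdf_at_median <- do.call(norm_funs$p, c(list(median_value), args))
```

A name like "foo" would only work under this convention if dfoo(), pfoo(), and qfoo() are all visible on the search path.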
Value
A ggplot2::Stat representing a slab or combined slab+interval geometry which can
be added to a ggplot()
object.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
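The meaning of .width for slabs ("the width of the smallest interval containing that value") can be made concrete with a small base R sketch. It assumes equal-tailed quantile intervals, where a slab value with CDF value p lies inside an interval of width w exactly when |1 - 2p| <= w. The function name smallest_qi_width() and the widths used are hypothetical, and other interval types (such as highest-density intervals) would need different logic.

```r
# Hypothetical helper (not part of ggdist): for CDF values `p`, return the
# smallest equal-tailed quantile interval width in `widths` that contains
# the corresponding slab value, or NA if none of the intervals contains it.
smallest_qi_width <- function(p, widths = c(0.66, 0.95)) {
  needed <- abs(1 - 2 * p)          # narrowest equal-tailed width reaching p
  widths <- sort(widths)
  # index of the first width that is >= needed
  idx <- findInterval(needed, widths, left.open = TRUE) + 1
  widths[idx]
}

# CDF values at the median, in the lower tail, and in the far lower tail
slab_widths <- smallest_qi_width(c(0.5, 0.1, 0.01))
```

Here the median falls in the 0.66 interval, the 10th percentile only in the 0.95 interval, and the 1st percentile in neither.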
Aesthetics
The slab+interval stats and geoms have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_slabinterval()
)
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size will also determine the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
See Also
See geom_slabinterval()
for more information on the geom these stats
use by default and some of the options it has.
See vignette("slabinterval")
for a variety of examples of use.
Examples
library(dplyr)
library(ggplot2)
library(distributional)
theme_set(theme_ggdist())
# EXAMPLES ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c", "c", "c"),
  value = rnorm(2500, mean = c(5, 7, 9, 9, 9), sd = c(1, 1.5, 1, 1, 1))
)
# here are vertical eyes:
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye()
# note the sample size is not automatically incorporated into the
# area of the densities in case one wishes to plot densities against
# a reference (e.g. a prior distribution).
# But you may wish to account for sample size if using these geoms
# for something other than visualizing posteriors; in which case
# you can use after_stat(pdf*n):
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye(aes(thickness = after_stat(pdf*n)))
# EXAMPLES ON ANALYTICAL DISTRIBUTIONS
dist_df = tribble(
  ~group, ~subgroup, ~mean, ~sd,
  "a",    "h",       5,     1,
  "b",    "h",       7,     1.5,
  "c",    "h",       8,     1,
  "c",    "i",       9,     1,
  "c",    "j",       7,     1
)
# Using functions from the distributional package (like dist_normal()) with the
# xdist and ydist aesthetics can lead to more compact/expressive specifications
dist_df %>%
  ggplot(aes(x = group, ydist = dist_normal(mean, sd), fill = subgroup)) +
  stat_eye(position = "dodge")
# using the old character vector + args approach
dist_df %>%
  ggplot(aes(x = group, dist = "norm", arg1 = mean, arg2 = sd, fill = subgroup)) +
  stat_eye(position = "dodge")
# the stat_slabinterval family applies a Jacobian adjustment to densities
# when plotting on transformed scales in order to plot them correctly.
# It determines the Jacobian using symbolic differentiation if possible,
# using stats::D(). If symbolic differentiation fails, it falls back
# to numericDeriv(), which is less reliable; therefore, it is
# advisable to use scale transformation functions that are defined in
# terms of basic math functions so that their derivatives can be
# determined analytically (most of the transformation functions in the
# scales package currently have this property).
# For example, here is a log-Normal distribution plotted on the log
# scale, where it will appear Normal:
data.frame(dist = "lnorm", logmean = log(10), logsd = 2*log(10)) %>%
  ggplot(aes(y = 1, dist = dist, arg1 = logmean, arg2 = logsd)) +
  stat_halfeye() +
  scale_x_log10(breaks = 10^seq(-5,7, by = 2))
# see vignette("slabinterval") for many more examples.
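The Jacobian adjustment mentioned in the comments above can be checked by hand in base R. This is a sketch of the underlying change-of-variables formula, not ggdist's internal implementation; density_on_log10_scale() and the variable names are hypothetical. For y = log10(x), the density of y is the density of x evaluated at 10^y multiplied by |d(10^y)/dy|, and stats::D() can produce that derivative symbolically.

```r
# Inverse of the log10 transformation: x expressed as a function of y
inverse_expr <- quote(10^y)

# Symbolic derivative of the inverse transformation with respect to y
jacobian_expr <- stats::D(inverse_expr, "y")

# Hypothetical helper: density of Y = log10(X), given the density of X
density_on_log10_scale <- function(y, d_x) {
  x <- eval(inverse_expr, list(y = y))
  jacobian <- eval(jacobian_expr, list(y = y))
  d_x(x) * abs(jacobian)
}

# Same log-Normal parameters as in the example above
meanlog <- log(10)
sdlog <- 2 * log(10)
y <- seq(-3, 5, length.out = 9)

adjusted <- density_on_log10_scale(y, function(x) dlnorm(x, meanlog, sdlog))

# On the log10 scale the distribution is Normal, so the adjusted density
# should match a Normal density with rescaled parameters
expected <- dnorm(y, mean = meanlog / log(10), sd = sdlog / log(10))
```

Without the Jacobian term, the curve would not integrate to 1 on the transformed scale and would not have the Normal shape.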
Spike plot (ggplot2 stat)
Description
Stat for drawing "spikes" (optionally with points on them) at specific points
on a distribution (numerical or determined as a function of the distribution),
intended for annotating stat_slabinterval()
geometries.
Usage
stat_spike(
  mapping = NULL,
  data = NULL,
  geom = "spike",
  position = "identity",
  ...,
  at = "median",
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = waiver(),
  breaks = waiver(),
  align = waiver(),
  outline_bars = waiver(),
  expand = FALSE,
  limits = NULL,
  n = waiver(),
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  check.aes = TRUE,
  check.param = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
<Geom | string> Use to override the default
connection between |
position |
<Position | string> Position adjustment,
either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
at |
<numeric | function | character | list> The points at which to evaluate the PDF and CDF of the distribution. One of:
The values of |
p_limits |
<length-2 numeric> Probability limits. Used to determine the lower and upper
limits of analytical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample and the value of the |
density |
<function | string> Density estimator for sample data. One of:
|
adjust |
<scalar numeric | waiver> Passed to |
trim |
<scalar logical | waiver> Passed to |
breaks |
<numeric | function | string | waiver> Passed to
For example, |
align |
<scalar numeric | function | string | waiver> Passed to
For example, |
outline_bars |
<scalar logical | waiver> Passed to |
expand |
<logical> For sample data, should the slab be expanded to the limits of the scale? Default |
limits |
<length-2 numeric> Manually-specified limits for the slab, as
a vector of length two. These limits are combined with those computed based on
|
n |
<scalar numeric> Number of points at which to evaluate the function that defines the slab. Also
passed to |
orientation |
<string> Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
<scalar logical> If |
show.legend |
<logical> Should this layer be included in the legends? Default is |
inherit.aes |
If |
check.aes , check.param |
If |
Details
This stat computes slab values (i.e. PDF and CDF values) at specified locations
on a distribution, as determined by the at
parameter.
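As a rough base R illustration of what "slab values at specified locations" means, the sketch below evaluates the PDF and CDF of a Normal(1, 2) distribution at two locations: its median and a fixed numeric value. The distribution, the locations, and the data frame layout are illustrative assumptions, not the stat's internal representation.

```r
# Hypothetical distribution parameters
mu <- 1
sigma <- 2

# Locations to annotate: one derived from the distribution, one fixed
at <- c(median = qnorm(0.5, mu, sigma), fixed = 3)

# One row per spike, with the PDF and CDF evaluated at each location
spikes <- data.frame(
  at  = names(at),
  x   = unname(at),
  pdf = unname(dnorm(at, mu, sigma)),
  cdf = unname(pnorm(at, mu, sigma))
)
```

Mapping the pdf column to thickness would give each spike the height of the density at its location.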
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
Value
A ggplot2::Stat representing a spike geometry which can be added to a ggplot()
object.
Aesthetics
The spike geom has a wide variety of aesthetics that control
the appearance of its two sub-geometries: the spike and the point.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_spike()
)
the following aesthetics are supported by the underlying geom:
Spike-specific (aka Slab-specific) aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
Color aesthetics
colour: (or color) The color of the spike and point sub-geometries.
fill: The fill color of the point sub-geometry.
alpha: The opacity of the spike and point sub-geometries.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the spike sub-geometry.
size: Size of the point sub-geometry.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the spike.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
- at: For spikes, a character vector of names of the functions or expressions used to determine the points at which the slab functions were evaluated to create spikes. Values of this computed variable are determined by the at parameter; see its description above.
See Also
See geom_spike()
for the geom underlying this stat.
See stat_slabinterval()
for the stat this shortcut is based on.
Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab()
Examples
library(ggplot2)
library(distributional)
library(dplyr)
df = tibble(
  d = c(dist_normal(1), dist_gamma(2,2)), g = c("a", "b")
)
# annotate the density at the mode of a distribution
df %>%
  ggplot(aes(y = g, xdist = d)) +
  stat_slab(aes(xdist = d)) +
  stat_spike(at = "Mode") +
  # need shared thickness scale so that stat_slab and geom_spike line up
  scale_thickness_shared()
# annotate the endpoints of intervals of a distribution
# here we'll use an arrow instead of a point by setting size = 0
arrow_spec = arrow(angle = 45, type = "closed", length = unit(4, "pt"))
df %>%
  ggplot(aes(y = g, xdist = d)) +
  stat_halfeye(point_interval = mode_hdci) +
  stat_spike(
    at = function(x) hdci(x, .width = .66),
    size = 0, arrow = arrow_spec, color = "blue", linewidth = 0.75
  ) +
  scale_thickness_shared()
# annotate quantiles of a sample
set.seed(1234)
data.frame(x = rnorm(1000, 1:2), g = c("a","b")) %>%
  ggplot(aes(x, g)) +
  stat_slab() +
  stat_spike(at = function(x) quantile(x, ppoints(10))) +
  scale_thickness_shared()
Scaled and shifted Student's t distribution
Description
Density, distribution function, quantile function and random generation for the
scaled and shifted Student's t distribution, parameterized by degrees of freedom (df),
location (mu), and scale (sigma).
Usage
dstudent_t(x, df, mu = 0, sigma = 1, log = FALSE)
pstudent_t(q, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE)
qstudent_t(p, df, mu = 0, sigma = 1, lower.tail = TRUE, log.p = FALSE)
rstudent_t(n, df, mu = 0, sigma = 1)
Arguments
x , q |
vector of quantiles. |
df |
degrees of freedom ( |
mu |
<numeric> Location parameter (median). |
sigma |
<numeric> Scale parameter. |
log , log.p |
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities are
|
p |
vector of probabilities. |
n |
number of observations. If |
Value
- dstudent_t gives the density
- pstudent_t gives the cumulative distribution function (CDF)
- qstudent_t gives the quantile function (inverse CDF)
- rstudent_t generates random draws.
The length of the result is determined by n for rstudent_t, and is the maximum of the lengths of
the numerical arguments for the other functions.
The numerical arguments other than n are recycled to the length of the result. Only the first elements
of the logical arguments are used.
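The location-scale parameterization can be sketched with base R's standard t functions. The functions below are hypothetical stand-ins written for illustration, not the package's own source; they assume the usual location-scale construction, in which a standard t variable is multiplied by sigma and shifted by mu.

```r
# Hypothetical stand-ins for dstudent_t(), pstudent_t(), qstudent_t(), and
# rstudent_t(), built from dt(), pt(), qt(), and rt()
d_scaled_t <- function(x, df, mu = 0, sigma = 1) dt((x - mu) / sigma, df) / sigma
p_scaled_t <- function(q, df, mu = 0, sigma = 1) pt((q - mu) / sigma, df)
q_scaled_t <- function(p, df, mu = 0, sigma = 1) mu + sigma * qt(p, df)
r_scaled_t <- function(n, df, mu = 0, sigma = 1) mu + sigma * rt(n, df)

# The median equals the location parameter
median_check <- q_scaled_t(0.5, df = 5, mu = 2, sigma = 3)

# The density still integrates to 1 after scaling and shifting
area_check <- integrate(d_scaled_t, -Inf, Inf, df = 5, mu = 2, sigma = 3)$value

# The CDF and quantile function are inverses of each other
round_trip <- p_scaled_t(q_scaled_t(0.9, df = 5, mu = 2, sigma = 3),
                         df = 5, mu = 2, sigma = 3)
```

The division by sigma in the density is what keeps the total area at 1 when the distribution is stretched.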
See Also
parse_dist()
and parsing distribution specs and the stat_slabinterval()
family of stats for visualizing them.
Examples
library(dplyr)
library(ggplot2)
expand.grid(
  df = c(3,5,10,30),
  scale = c(1,1.5)
) %>%
  ggplot(aes(y = 0, dist = "student_t", arg1 = df, arg2 = 0, arg3 = scale, color = ordered(df))) +
  stat_slab(p_limits = c(.01, .99), fill = NA) +
  scale_y_continuous(breaks = NULL) +
  facet_grid( ~ scale) +
  labs(
    title = "dstudent_t(x, df, 0, sigma)",
    subtitle = "Scale (sigma)",
    y = NULL,
    x = NULL
  ) +
  theme_ggdist() +
  theme(axis.title = element_text(hjust = 0))
Sub-geometry scales for geom_slabinterval (ggplot2 scales)
Description
These scales allow more specific aesthetic mappings to be made when using geom_slabinterval()
and stats/geoms based on it (like eye plots).
Usage
scale_point_colour_discrete(..., aesthetics = "point_colour")
scale_point_color_discrete(..., aesthetics = "point_colour")
scale_point_colour_continuous(
...,
aesthetics = "point_colour",
guide = guide_colourbar2()
)
scale_point_color_continuous(
...,
aesthetics = "point_colour",
guide = guide_colourbar2()
)
scale_point_fill_discrete(..., aesthetics = "point_fill")
scale_point_fill_continuous(
...,
aesthetics = "point_fill",
guide = guide_colourbar2()
)
scale_point_alpha_continuous(..., range = c(0.1, 1))
scale_point_alpha_discrete(..., range = c(0.1, 1))
scale_point_size_continuous(..., range = c(1, 6))
scale_point_size_discrete(..., range = c(1, 6), na.translate = FALSE)
scale_interval_colour_discrete(..., aesthetics = "interval_colour")
scale_interval_color_discrete(..., aesthetics = "interval_colour")
scale_interval_colour_continuous(
...,
aesthetics = "interval_colour",
guide = guide_colourbar2()
)
scale_interval_color_continuous(
...,
aesthetics = "interval_colour",
guide = guide_colourbar2()
)
scale_interval_alpha_continuous(..., range = c(0.1, 1))
scale_interval_alpha_discrete(..., range = c(0.1, 1))
scale_interval_size_continuous(..., range = c(1, 6))
scale_interval_size_discrete(..., range = c(1, 6), na.translate = FALSE)
scale_interval_linetype_discrete(..., na.value = "blank")
scale_interval_linetype_continuous(...)
scale_slab_colour_discrete(..., aesthetics = "slab_colour")
scale_slab_color_discrete(..., aesthetics = "slab_colour")
scale_slab_colour_continuous(
...,
aesthetics = "slab_colour",
guide = guide_colourbar2()
)
scale_slab_color_continuous(
...,
aesthetics = "slab_colour",
guide = guide_colourbar2()
)
scale_slab_fill_discrete(..., aesthetics = "slab_fill")
scale_slab_fill_continuous(
...,
aesthetics = "slab_fill",
guide = guide_colourbar2()
)
scale_slab_alpha_continuous(
...,
limits = function(l) c(min(0, l[[1]]), l[[2]]),
range = c(0, 1)
)
scale_slab_alpha_discrete(..., range = c(0.1, 1))
scale_slab_size_continuous(..., range = c(1, 6))
scale_slab_size_discrete(..., range = c(1, 6), na.translate = FALSE)
scale_slab_linewidth_continuous(..., range = c(1, 6))
scale_slab_linewidth_discrete(..., range = c(1, 6), na.translate = FALSE)
scale_slab_linetype_discrete(..., na.value = "blank")
scale_slab_linetype_continuous(...)
scale_slab_shape_discrete(..., solid = TRUE)
scale_slab_shape_continuous(...)
guide_colourbar2(...)
guide_colorbar2(...)
Arguments
... |
Arguments passed to underlying scale or guide functions. E.g. |
aesthetics |
<character> Names of aesthetics to set scales for. |
guide |
|
range |
<length-2 numeric> The minimum and maximum size of the plotting symbol after transformation. |
na.translate |
<scalar logical> In discrete scales, should we show missing values? |
na.value |
<linetype> When |
limits |
One of:
|
solid |
Should the shapes be solid, |
Details
The following additional scales / aesthetics are defined for use with geom_slabinterval()
and
related geoms:
scale_point_color_*: Point color
scale_point_fill_*: Point fill color
scale_point_alpha_*: Point alpha level / opacity
scale_point_size_*: Point size
scale_interval_color_*: Interval line color
scale_interval_alpha_*: Interval alpha level / opacity
scale_interval_linetype_*: Interval line type
scale_slab_color_*: Slab outline color
scale_slab_fill_*: Slab fill color
scale_slab_alpha_*: Slab alpha level / opacity. The default settings of scale_slab_alpha_continuous differ from scale_alpha_continuous() and are designed for gradient plots (e.g. stat_gradientinterval()) by ensuring that densities of 0 get mapped to 0 in the output.
scale_slab_linewidth_*: Slab outline line width
scale_slab_linetype_*: Slab outline line type
scale_slab_shape_*: Slab dot shape (for geom_dotsinterval())
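The effect of the default limits function shown in the scale_slab_alpha_continuous() usage can be traced in base R. The limits function is copied from the usage above; the density range and the linear rescaling helper to_alpha() are hypothetical, and the rescaling only approximates how a continuous scale maps its limits onto range.

```r
# Default limits function from the scale_slab_alpha_continuous() usage
slab_alpha_limits <- function(l) c(min(0, l[[1]]), l[[2]])

# Hypothetical range of density values observed in the data
density_range <- c(0.2, 1.5)

# The lower limit is pulled down to 0 even though no density of 0 was seen
limits <- slab_alpha_limits(density_range)

# Hypothetical linear mapping of values within `limits` onto range = c(0, 1)
to_alpha <- function(x, limits, range = c(0, 1)) {
  range[1] + (x - limits[1]) / (limits[2] - limits[1]) * diff(range)
}

alphas <- to_alpha(c(0, 0.2, 1.5), limits)
```

Because the lower limit is anchored at 0 and the output range starts at 0, a density of 0 is fully transparent, while the smallest observed density keeps a small non-zero opacity.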
See the corresponding scale documentation in ggplot for more information; e.g.
scale_color_discrete()
,
scale_color_continuous()
, etc.
Other scale functions can be used with the aesthetics/scales defined here by using the aesthetics
argument to that scale function. For example, to use color brewer scales with the point_color
aesthetic:
scale_color_brewer(..., aesthetics = "point_color")
With continuous color scales, you may also need to provide a guide as the default guide does not work properly;
this is what guide_colorbar2
is for:
scale_color_distiller(..., guide = "colorbar2", aesthetics = "point_color")
These scales have been deprecated:
scale_interval_size_*: Use scale_linewidth_*
scale_slab_size_*: Use scale_slab_linewidth_*
Value
A ggplot2::Scale representing one of the aesthetics used to target the appearance of specific parts of composite
ggdist
geoms. Can be added to a ggplot()
object.
Author(s)
Matthew Kay
See Also
Other ggplot2 scales: scale_color_discrete()
,
scale_color_continuous()
, etc.
Other ggdist scales:
scale_colour_ramp
,
scale_side_mirrored()
,
scale_thickness
Examples
library(dplyr)
library(ggplot2)
# This plot shows how to set multiple specific aesthetics
# NB it is very ugly and is only for demo purposes.
data.frame(distribution = "Normal(1,2)") %>%
  parse_dist(distribution) %>%
  ggplot(aes(y = distribution, xdist = .dist, args = .args)) +
  stat_halfeye(
    shape = 21, # this point shape has a fill and outline
    point_color = "red",
    point_fill = "black",
    point_alpha = .1,
    point_size = 6,
    stroke = 2,
    interval_color = "blue",
    # interval line widths are scaled from [1, 6] onto [0.6, 1.4] by default
    # see the interval_size_range parameter in help("geom_slabinterval")
    linewidth = 8,
    interval_linetype = "dashed",
    interval_alpha = .25,
    # fill sets the fill color of the slab (here the density)
    slab_color = "green",
    slab_fill = "purple",
    slab_linewidth = 3,
    slab_linetype = "dotted",
    slab_alpha = .5
  )
Axis sub-guide for thickness scales
Description
This is a sub-guide intended for annotating the thickness
and dot-count
subscales in ggdist. It can be used with the subguide
parameter of
geom_slabinterval()
and geom_dotsinterval()
.
Supports automatic partial function application with waived arguments.
Usage
subguide_axis(
values,
title = NULL,
breaks = waiver(),
labels = waiver(),
position = 0,
just = 0,
label_side = "topright",
orientation = "horizontal",
theme = theme_get()
)
subguide_inside(..., label_side = "inside")
subguide_outside(..., label_side = "outside", just = 1)
subguide_integer(..., breaks = scales::breaks_extended(Q = c(1, 5, 2, 4, 3)))
subguide_count(..., breaks = scales::breaks_width(1))
subguide_slab(values, ...)
subguide_dots(values, ...)
subguide_spike(values, ...)
Arguments
values |
<numeric> Values used to construct the scale used for this guide.
Typically provided automatically by |
title |
<string> The title of the scale shown on the sub-guide's axis. |
breaks |
One of:
|
labels |
One of:
|
position |
<scalar numeric> Value between |
just |
<scalar numeric> Value between |
label_side |
<string> Which side of the axis to draw the ticks and labels on.
|
orientation |
<string> Orientation of the geometry this sub-guide is for. One
of |
theme |
<theme> Theme used to determine the style that the
sub-guide elements are drawn in. The title label is drawn using the
|
... |
Arguments passed to other functions, typically back to
|
Details
subguide_inside()
is a shortcut for drawing labels inside of the chart
region.
subguide_outside()
is a shortcut for drawing labels outside of the chart
region.
subguide_integer()
only draws breaks that are integer values, useful for
labeling counts in geom_dots()
.
subguide_count()
is a shortcut for drawing labels where every whole number
is labeled, useful for labeling counts in geom_dots()
. If your max count is
large, subguide_integer()
may be better.
subguide_slab()
, subguide_dots()
, and subguide_spike()
are aliases
for subguide_none()
that allow you to change the default subguide used
for the geom_slabinterval()
, geom_dotsinterval()
, and geom_spike()
families. If you overwrite these in the global environment, you can set
the corresponding default subguide. For example:
subguide_slab = ggdist::subguide_inside(position = "right")
This will cause geom_slabinterval()s to default to having a guide on the
right side of the geom.
See Also
The thickness datatype.
The thickness
aesthetic of geom_slabinterval()
.
scale_thickness_shared()
, for setting a thickness
scale across
all geometries using the thickness
aesthetic.
subscale_thickness()
, for setting a thickness
sub-scale within
a single geom_slabinterval()
.
Other sub-guides:
subguide_none()
Examples
library(ggplot2)
library(distributional)
df = data.frame(d = dist_normal(2:3, 2:3), g = c("a", "b"))
# subguides allow you to label thickness axes
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = "inside")
# they respect normalization and use of scale_thickness_shared()
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = "inside", normalize = "groups")
# they can also be positioned outside the plot area, though
# this typically requires manually adjusting plot margins
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(subguide = subguide_outside(title = "density", position = "right")) +
  theme(plot.margin = margin(5.5, 50, 5.5, 5.5))
# any of the subguide types will also work to indicate bin counts in
# geom_dots(); subguide_integer() and subguide_count() can be useful for
# dotplots as they only label integers / whole numbers:
df = data.frame(d = dist_gamma(2:3, 2:3), g = c("a", "b"))
ggplot(df, aes(xdist = d, y = g)) +
  stat_dots(subguide = subguide_count(label_side = "left", title = "count")) +
  scale_y_discrete(expand = expansion(add = 0.1)) +
  scale_x_continuous(expand = expansion(add = 0.5))
Empty sub-guide for thickness scales
Description
This is a blank sub-guide that omits annotations for the thickness
and
dot-count sub-scales in ggdist. It can be used with the subguide
parameter of geom_slabinterval()
and geom_dotsinterval()
.
Supports automatic partial function application with waived arguments.
Usage
subguide_none(values, ...)
Arguments
values |
<numeric> Values used to construct the scale used for this guide.
Typically provided automatically by |
... |
ignored. |
See Also
Other sub-guides:
subguide_axis()
Identity sub-scale for thickness aesthetic
Description
This is an identity sub-scale for the thickness
aesthetic
in ggdist. It returns its input as a thickness vector without
rescaling. It can be used with the subscale
parameter of
geom_slabinterval()
.
Usage
subscale_identity(x)
Arguments
x |
<numeric> Vector to be rescaled.
Typically provided automatically by |
Value
A thickness vector of the same length as x
, with infinite
values in x
squished into the data range.
See Also
Other sub-scales:
subscale_thickness()
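Examples
A minimal sketch of calling subscale_identity() directly, outside of a geom, to inspect what it returns. The input values below are made up for illustration; within a plot the input is normally supplied for you.

```r
library(ggdist)

# hypothetical values that are already on the desired thickness scale
x = c(0, 0.25, 1)

# wrap the values in the thickness datatype without rescaling them
th = subscale_identity(x)
th

# the result has one element per input value
length(th) == length(x)
```

Because the result is already marked as thickness, the function can be supplied as the subscale of a slab geometry when the mapped values should be used as-is.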
Sub-scale for thickness aesthetic
Description
This is a sub-scale intended for adjusting the scaling of the thickness
aesthetic at a geometry (or sub-geometry) level in ggdist. It can be
used with the subscale
parameter of geom_slabinterval()
.
Supports automatic partial function application with waived arguments.
Usage
subscale_thickness(
x,
limits = function(l) c(min(0, l[1]), l[2]),
expand = c(0, 0)
)
Arguments
x |
<numeric> Vector to be rescaled.
Typically provided automatically by |
limits |
<length-2 numeric | function | NULL> One of:
|
expand |
<numeric> Vector of limit expansion constants of length
2 or 4, following the same format used by the |
Details
You can overwrite subscale_thickness
in the global environment to set
the default properties of the thickness subscale. For example:
subscale_thickness = ggdist::subscale_thickness(expand = expansion(c(0, 0.05)))
This will cause geom_slabinterval()s to default to a thickness subscale
that expands by 5% at the top of the scale. Always prefix such a
definition with ggdist:: to avoid infinite loops caused by recursion.
Value
A thickness vector of the same length as x
scaled to be between
0
and 1
.
See Also
The thickness datatype.
The thickness
aesthetic of geom_slabinterval()
.
scale_thickness_shared()
, for setting a thickness
scale across
all geometries using the thickness
aesthetic.
Other sub-scales:
subscale_identity()
Examples
library(ggplot2)
library(distributional)
df = data.frame(d = dist_normal(2:3, 1), g = c("a", "b"))
# breaks on thickness subguides are always limited to the bounds of the
# subscale, which may leave labels off near the edge of the subscale
# (e.g. here `0.4` is omitted because the max value is approx `0.39`)
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(
    subguide = "inside"
  )
# We can use the subscale to expand the upper limit of the thickness scale
# by 5% (similar to the default for positional scales), allowing breaks just
# beyond the maximum value, like `0.4`, to be shown.
ggplot(df, aes(xdist = d, y = g)) +
  stat_slabinterval(
    subguide = "inside",
    subscale = subscale_thickness(expand = expansion(c(0, 0.05)))
  )
Simple, light ggplot2 theme for ggdist and tidybayes
Description
A simple, relatively minimalist ggplot2 theme, and some helper functions to go with it.
Usage
theme_ggdist(
base_size = 11,
base_family = "",
base_line_size = base_size/22,
base_rect_size = base_size/22
)
theme_tidybayes(
base_size = 11,
base_family = "",
base_line_size = base_size/22,
base_rect_size = base_size/22
)
facet_title_horizontal()
axis_titles_bottom_left()
facet_title_left_horizontal()
facet_title_right_horizontal()
Arguments
base_size |
base font size, given in pts. |
base_family |
base font family |
base_line_size |
base size for line elements |
base_rect_size |
base size for rect elements |
Details
This is a relatively minimalist ggplot2 theme, intended to be used for making publication-ready plots.
It is currently based on ggplot2::theme_light()
.
A word of warning: this theme may (and very likely will) change in the future as I tweak it to my taste.
theme_ggdist()
and theme_tidybayes()
are aliases.
Value
A named list in the format of ggplot2::theme()
Author(s)
Matthew Kay
See Also
ggplot2::theme()
, ggplot2::theme_set()
Examples
library(ggplot2)
theme_set(theme_ggdist())
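The helper functions listed under Usage have no example above. The sketch below assumes they return theme modifications that can be added to a plot after theme_ggdist(); the simulated data frame and its column names are made up for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# hypothetical data: two groups of 50 draws each
df = data.frame(
  g = rep(c("a", "b"), each = 50),
  x = rnorm(100)
)

# apply the theme to a single plot, then adjust facet strip titles
# with one of the helpers (assumed to behave like a theme() call)
p = ggplot(df, aes(x = x)) +
  stat_halfeye() +
  facet_grid(g ~ .) +
  theme_ggdist() +
  facet_title_horizontal()
p
```

Adding the theme to one plot, rather than calling theme_set(), leaves the session-wide default theme unchanged.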
Thickness (datatype)
Description
A representation of the thickness of a slab: a scaled value (x
) where
0
is the base of the slab and 1
is its maximum extent, and the lower
(lower
) and upper (upper
) limits of the slab values in their original
data units.
Usage
thickness(x = double(), lower = NA_real_, upper = NA_real_)
Arguments
x |
<coercible-to-numeric> A numeric vector or an object
coercible to a numeric (via |
lower |
<numeric> The original lower bounds of thickness values before scaling.
May be |
upper |
<numeric> The original upper bounds of thickness values before scaling.
May be |
Details
This datatype is used by scale_thickness_shared()
and subscale_thickness()
to represent numeric()
-like objects marked as being in units of slab "thickness".
Unlike regular numeric()s, thickness() values mapped onto the thickness
aesthetic are not rescaled by scale_thickness_shared()
or geom_slabinterval().
In most cases thickness()
is not useful directly; though it can be used to
mark values that should not be rescaled—see the definitions of
stat_ccdfinterval()
and stat_gradientinterval()
for some example usages.
thickness objects with unequal lower or upper limits may not be combined.
However, thickness objects with NA
limits may be combined with
thickness objects with non-NA
limits. This allows (e.g.) specifying
locations on the thickness scale that are independent of data limits.
Value
A vctrs::rcrd of class "ggdist_thickness"
with fields
"x"
, "lower"
, and "upper"
.
Author(s)
Matthew Kay
See Also
The thickness
aesthetic of geom_slabinterval()
.
scale_thickness_shared()
, for setting a thickness
scale across
all geometries using the thickness
aesthetic.
subscale_thickness()
, for setting a thickness
sub-scale within
a single geom_slabinterval()
.
Examples
thickness(0:1)
thickness(0:1, 0, 10)
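The combination rules described in Details can be tried directly. This is a sketch, not a statement of the internals: the variable names are illustrative, and the mismatched case is wrapped in tryCatch() so the script keeps running however the incompatibility is signalled.

```r
library(ggdist)

# a thickness value with no recorded data limits (lower and upper are NA)
free = thickness(0.5)

# thickness values whose original data limits were 0 and 10
bounded = thickness(0:1, 0, 10)

# NA limits are documented as compatible with non-NA limits
combined = c(free, bounded)
combined

# unequal non-NA limits are documented as not combinable; capture the
# error message (if one is raised) instead of stopping
mismatch = tryCatch(
  c(thickness(0, 0, 1), thickness(0, 0, 10)),
  error = function(e) conditionMessage(e)
)
mismatch
```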
Translate between different tidy data frame formats for draws from distributions
Description
These functions translate ggdist/tidybayes-style data frames to/from different data frame formats (each format using a different naming scheme for its columns).
Usage
to_broom_names(data)
from_broom_names(data)
to_ggmcmc_names(data)
from_ggmcmc_names(data)
Arguments
data |
<data.frame> A data frame to translate. |
Details
Functions prefixed with to_
translate from the ggdist/tidybayes format to another format; functions
prefixed with from_
translate from that format back to the ggdist/tidybayes format. Formats include:
to_broom_names() / from_broom_names():
- .variable <-> term
- .value <-> estimate
- .prediction <-> .fitted
- .lower <-> conf.low
- .upper <-> conf.high
to_ggmcmc_names() / from_ggmcmc_names():
- .chain <-> Chain
- .iteration <-> Iteration
- .variable <-> Parameter
- .value <-> value
Value
A data frame with (possibly) new names in some columns, according to the translation scheme described in Details.
Author(s)
Matthew Kay
Examples
library(dplyr)
data(RankCorr_u_tau, package = "ggdist")
df = RankCorr_u_tau %>%
  dplyr::rename(.variable = i, .value = u_tau) %>%
  group_by(.variable) %>%
  median_qi(.value)
df
df %>%
  to_broom_names()
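A smaller sketch of a round trip through the broom naming scheme, using a hand-built one-row summary. The values are made up; only the column names matter here.

```r
library(ggdist)

# hypothetical summary using ggdist/tidybayes column names
tidy_df = data.frame(.variable = "mu", .value = 1.2, .lower = 0.4, .upper = 2.1)

# translate to broom-style names
broom_df = to_broom_names(tidy_df)
names(broom_df)

# translate back to ggdist/tidybayes-style names
back = from_broom_names(broom_df)
names(back)
```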
A waived argument
Description
A flag indicating that the default value of an argument should be used.
Usage
waiver()
Details
A waiver()
is a flag passed to a function argument that indicates the
function should use the default value of that argument. It is used in two
cases:
- ggplot2 functions use it to distinguish between "nothing" (NULL) and a default value calculated elsewhere (waiver()).
- ggdist turns ggplot2's convention into a standardized method of argument-passing: any named argument with a default value in an automatically partially-applied function can be passed waiver() when calling the function. This will cause the default value (or the most recently partially-applied value) of that argument to be used instead.
Note: due to historical limitations, waiver() cannot currently be used on arguments to the point_interval() family of functions.
See Also
auto_partial()
, ggplot2::waiver()
Examples
f = auto_partial(function(x, y = "b") {
  c(x = x, y = y)
})
f("a")
# uses the default value of `y` ("b")
f("a", y = waiver())
# partially apply `f`
g = f(y = "c")
g
# uses the last partially-applied value of `y` ("c")
g("a", y = waiver())
Weighted empirical cumulative distribution function
Description
A variation of ecdf()
that can be applied to weighted samples.
Usage
weighted_ecdf(x, weights = NULL, na.rm = FALSE)
Arguments
x |
<numeric> Sample values. |
weights |
<numeric | NULL> Weights for the sample. One of:
|
na.rm |
<scalar logical> If |
Details
Generates a weighted empirical cumulative distribution function, F(x)
.
Given the sorted sample values x_i (derived from x) and w_i, the weight
corresponding to x_i, F(x) is a step function with steps at each x_i,
with F(x_i) equal to the sum of all weights up to and including w_i.
Value
weighted_ecdf()
returns a function of class "weighted_ecdf"
, which also
inherits from the stepfun()
class. Thus, it also has plot()
and print()
methods. Like ecdf()
, weighted_ecdf()
also provides a quantile()
method,
which dispatches to weighted_quantile()
.
See Also
Examples
weighted_ecdf(1:3, weights = 1:3)
plot(weighted_ecdf(1:3, weights = 1:3))
quantile(weighted_ecdf(1:3, weights = 1:3), 0.4)
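The step-function description in Details can be checked by hand. The sketch below compares the returned function with cumulative sums computed manually; it assumes weights are normalized to sum to 1, as in the weighted ECDF definition used by weighted_quantile().

```r
library(ggdist)

x = c(1, 2, 3)
w = c(1, 2, 3)

Fw = weighted_ecdf(x, weights = w)

# evaluate the step function at, between, and outside the sample values
Fw(c(0, 1, 1.5, 2, 3, 4))

# hand-computed cumulative sums of normalized weights at each x_i,
# shown next to Fw(x) for comparison (normalization is an assumption)
manual = cumsum(w) / sum(w)
rbind(ecdf = Fw(x), manual = manual)
```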
Weighted sample quantiles
Description
A variation of quantile()
that can be applied to weighted samples.
Usage
weighted_quantile(
x,
probs = seq(0, 1, 0.25),
weights = NULL,
n = NULL,
na.rm = FALSE,
names = TRUE,
type = 7,
digits = 7
)
weighted_quantile_fun(x, weights = NULL, n = NULL, na.rm = FALSE, type = 7)
Arguments
x |
<numeric> Sample values. |
probs |
<numeric> Vector of probabilities in |
weights |
<numeric | NULL> Weights for the sample. One of:
|
n |
<scalar numeric> Presumed effective sample size. If this is greater than 1 and
continuous quantiles (
|
na.rm |
<scalar logical> If |
names |
<scalar logical> If |
type |
<scalar integer> Value between 1 and 9: determines the type of quantile estimator to be used. Types 1 to 3 are for discontinuous quantiles, types 4 to 9 are for continuous quantiles. See Details. |
digits |
<scalar numeric> The number of digits to use to format percentages
when |
Details
Calculates weighted quantiles using generalizations of the quantile types
defined by quantile().
Type 1–3 (discontinuous) quantiles are directly a function of the inverse CDF as a step function, and so can be directly translated to the weighted case using the natural definition of the weighted ECDF as the cumulative sum of the normalized weights.
Type 4–9 (continuous) quantiles require some translation from the definitions
in quantile(). quantile() defines continuous estimators in terms of x_k,
which is the kth order statistic, and p_k, which is a function of k and n
(the sample size). In the weighted case, we instead take x_k as the kth
smallest value of x in the weighted sample (not necessarily an order statistic,
because of the weights). Then we can re-write the formulas for p_k in terms of
F(x_k) (the empirical CDF at x_k, i.e. the cumulative sum of normalized
weights) and f(x_k) (the normalized weight at x_k), by using the
fact that, in the unweighted case, k = F(x_k) \cdot n and 1/n = f(x_k):
- Type 4: p_k = \frac{k}{n} = F(x_k)
- Type 5: p_k = \frac{k - 0.5}{n} = F(x_k) - \frac{f(x_k)}{2}
- Type 6: p_k = \frac{k}{n + 1} = \frac{F(x_k)}{1 + f(x_k)}
- Type 7: p_k = \frac{k - 1}{n - 1} = \frac{F(x_k) - f(x_k)}{1 - f(x_k)}
- Type 8: p_k = \frac{k - 1/3}{n + 1/3} = \frac{F(x_k) - f(x_k)/3}{1 + f(x_k)/3}
- Type 9: p_k = \frac{k - 3/8}{n + 1/4} = \frac{F(x_k) - f(x_k) \cdot 3/8}{1 + f(x_k)/4}
Then the quantile function (inverse CDF) is the piece-wise linear function
defined by the points (p_k, x_k)
.
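Since the weighted formulas reduce to the usual ones when every observation has the same weight, an unweighted call should agree with stats::quantile() for the same type. The sketch below checks this for type 7 and then shows a weighted call; the sample values and weights are made up for illustration.

```r
library(ggdist)

x = c(1, 3, 4, 10, 12)
probs = c(0.1, 0.5, 0.9)

# unweighted case: compare against base R for the same quantile type
unweighted = weighted_quantile(x, probs, type = 7)
base = quantile(x, probs, type = 7)
all.equal(unname(unweighted), unname(base))

# weighted case: half of the total weight is placed on the largest value
w = c(1, 1, 1, 1, 4)
weighted_quantile(x, probs, weights = w, type = 7)
```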
Value
weighted_quantile()
returns a numeric vector of length(probs)
with the
estimate of the corresponding quantile from probs
.
weighted_quantile_fun()
returns a function that takes a single argument,
a vector of probabilities, which itself returns the corresponding quantile
estimates. It may be useful when weighted_quantile()
needs to be called
repeatedly for the same sample, re-using some pre-computation.
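A short sketch of the reuse pattern described above: build the quantile function once, then evaluate it at several sets of probabilities. The sample and weights are made up for illustration.

```r
library(ggdist)

x = c(1, 3, 4, 10, 12)
w = c(1, 1, 1, 1, 4)

# build the quantile function once for this weighted sample
q = weighted_quantile_fun(x, weights = w)

# then evaluate it as often as needed
q(0.5)
q(c(0.05, 0.95))

# compare with the one-shot interface (names dropped for the comparison)
all.equal(
  unname(q(c(0.05, 0.95))),
  unname(weighted_quantile(x, c(0.05, 0.95), weights = w))
)
```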