--- title: "Polychrome: Creating New Palettes" author: "Kevin R. Coombes" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Creating Palettes with Polychrome} %\VignetteKeywords{OOMPA,Polychrome,Color Palettes,Palettes} %\VignetteDepends{Polychrome} %\VignettePackage{Polychrome} %\VignetteEngine{knitr::rmarkdown} --- ```{r opts,echo=FALSE} knitr::opts_chunk$set(fig.width=7, fig.height=5) ``` Although the Polychrome package ships with several pre-built multicolored palettes, it also contains tools to build more. In this vignette, we expain how to use the createPalette function for this purpose. # Getting Started As usual, we start by loading the package: ```{r lib} library(Polychrome) ``` Here are the arguments of the main function. ```{r args} args(createPalette) ``` + N is the number of colors to include in the palette + seedcolors is a list of one or more starting colors that you would like included in the palette. + prefix is a character string used to create default names for the palette colors. + range is a numeric vector of length 2 indicating the minimum and maximum desired luminance for the palette. + M is an an integer representing the number of random colors to be generated in order to create the palette. ## Recommendations 1. The seed colors usually have a small effect on the overall palette, but they can change the order. White and black appear to be exceptions; they tend not to be included in the palette unless forced. One reason to use more seed colors would be to aumgent an existing small palette by adding distinguishable colors. 2. We show below that an upper bound on a good set of colors is likely to be around 40. It may be better in the long run to build a palette that you like of about this size, and then just use colors from the beginning if you don't need that many. 3. The speed of the algorithm mainly depends on M. You get a richer selection of colors with larger M. We have been successful using the default of 50000. 4. The range depends on whether you expect to use a light background (take range = c(10, 60)), a dark background (take range = c(55, 95) )or both (take range = c(30, 80)). 5. Unless you really need colors called X1, ..., X37, you are probably better off using the colorNames function to assign meaningful names after creating the palette. # A Small Palette We'll start with a simple (but fast) example. ```{r first10} set.seed(723451) # for reproducibility firstpal <- createPalette(10, c("#010101", "#ff0000"), M=1000) firstpal ``` The probably surprising result is that the seed colors are actually _not_ included in the final palette. There are two reasons for this. First, the colors 'black' and 'red' are outside the specified (default) luminance range. So, they are replaced by the closest randomly generated colors that are inside the range. But that leads us to the second reason. We specified a small value of M. The algorithm operates by randomly selecting a set of potential colors in the desired range, and will eventually assemble the palette from this set. With only 1000 candidates, there may not be anything that looks close to pure red or pure black in the set of candidates. ```{r firstswatch, fig.cap="First palette"} swatch(firstpal) ``` ## Do the seeds matter? Here is the palette that we get if we just start with the color black. Note that we keep resetting the random seed -- the one for the random number generator -- to make sure we are choosing from the same set of randomly generated candidates. ```{r second, fig.cap="A pallete from a single black seed."} set.seed(723451) # for reproducibility second <- createPalette(10, c("#010101"), M=1000) swatch(second) ``` This palette is essentially indistinguishable from the first one. However, here is what happens if we just start with the color red. ```{r third, fig.cap="A palette from a single red seed."} set.seed(723451) # for reproducibility third <- createPalette(10, c("#ff0000"), M=1000) swatch(third) ``` The red, green, and blue triumvirate stays near the top of the list, but black has disappeared. What happens if we instead start with green and blue? ```{r fourth, fig.cap="A blue and green seeded palette."} set.seed(723451) # for reproducibility fourth <- createPalette(10, c("#00ff00", "#0000ff"), M=1000) swatch(fourth) ``` Except for the forced reordering at the start, we get nearly the same palette. So, let's instead try starting with cyan, magenta, and yellow. ```{r fifth, fig.cap="A CMY seeded palete."} set.seed(723451) # for reproducibility fifth <- createPalette(10, c("#00ffff", "#ff00ff", "#ffff00"), M=1000) swatch(fifth) ``` We see that the initial seed colors have a relatively small effect. This result should not be unexpected based on the "greedy" nature of the underlying algorithm, which tries at each step to find the color that is most distant from everything already included in the list. The vertices of the polyhedron in three-dimensional LUV space include the usual primary colors. # How many good colors can we find? Now we'd like to examine how big we think we can realistically make N, the number of distinguishable colors. We can estimate this by first estimating the volume of the visible region of LUV space. We can start by determining the convex hull of the UV projection. We saw in the manuscript that our 36-color palette uses almost all of the visible region. So (for speed as well as simplicity), we are going to work with that palette. First, we have to get the LUV coordinates of the colors. ```{r luv36} library(colorspace) p36 <- palette36.colors(36) luvmat <- as(hex2RGB(p36), "LUV") y <- luvmat@coords ``` Now we can use a function from the grDevices package to compute the convex hull. ```{r hull, fig.width=6,fig.height=6,fig.cap="The convex hull of the UV coordinates of the 36-color palette."} hull <- y[chull(y[,2], y[,3]), 2:3] uvscatter(p36) polygon(hull[,1], hull[,2], lwd=2) ``` The next step is to compute the area of the convex hull. While there are R packages that include functions to do this computation, the algorithm is sufficiently easy that it hardly seems worth it to load another package. ```{r area} hullaug <- rbind(hull, hull[1,]) N <- nrow(hullaug) fst <-sum(hullaug[1:(N-1), 1]*hullaug[2:N, 2] ) snd <- sum(hullaug[2:N, 1]*hullaug[1:(N-1), 2] ) area <- (snd - fst)/2 area ``` So, a crude (over)estimate of the volume of the viewable region of LUV space is to take the range of L values (0--100) and multiple it times the UV area. This is an overestimate for two reasons. First, we can't actually use the entire viewable range of luminances agains any single background, so the factor of 100 is probably closer to 60. Second, not every combination of L-U-V values in this region designates an actual color. ```{r uppervol} overvol <- 100*area overvol ``` Now, we want colors to be 40 units apart to be reliably distinguishable by most viewers. So, each time we pick a color, it should "occupy" a sphere of radius 20 to avoid picking indistinguishable colors. The volume of such a sphere is: ```{r volsphere} sphere <- 4/3*pi*20^3 sphere ``` To get the most generous (over)estimate of the number of colors, we pretend for a moment that we can pack spheres into the space with minimal waste. By the Kepler conjecture, we know that this means that the spheres can occupy about 74% of the volume. ```{r fat} 100 * area/sphere *0.74 60 * area/sphere *0.74 ``` If we could use the full range of luminances, we could get about 82 colors. If we can only use a range of luminance values about 60 units wide, then we can at most hope to get 50 distinct colors. ### Improving the estimate We can empirically improve our estimate of the volume of the visible portion of LUV space. We start by generating a tight, regularly spaced grid of values. ```{r what} L <- 0:100 U <- -60:150 V <- -150:150 luv <- cbind(L = rep(L, length(U)*length(V)), U = rep(rep(U, each=length(L)), times=length(V)), V = rep(V, each=length(L)*length(U)))*1.0 ``` Next, we convert the LUV coordinates into hexadecimal representations in RGB space. ```{r hexify} tested <- LUV(luv) co <- tested@coords ``` Whenever the conversion fails (yielding an NA value), the LUV coordinates were not visible. So, we can estimate the fraction of the regular grid that is visible: ```{r } he <- hex(tested) summary(is.na(he)) mean(!is.na(he)) ``` That is roughly 20%. Now we look at how many spheres we could pack into this space. ```{r} VOL <- diff(range(L)) * diff(range(U)) * diff(range(V)) * 0.205 VOL VOL/sphere * 0.74 ``` Hmm. This computation suggest that we can only find about 28 distinguishable colors. Since we have examples of palettes with more colors than this, we must now somehow be _underestimating_ the value. But it's clear why. Many of the spheres can be placed on the boundary, that is, along the faces, edges, or even vertices of the polyhedron. In that case, half or more of their volume sticks outsde the polyhedron. If we crudely estimate that half of the volume of half of the spheres can be ignored, then this adjustment roughly undoes the correction for the sphere packing density. ```{r final} VOL/sphere ``` So, our final estimate is that the maximum number of distinguishable colors is roughly 38 or 39. Being generous (and tending to favor round numbers), we're going to take it to be about 40. # Computational Complexity The final parameter we want to look at is M, the number of candidate colors that we search through. In general, we expect the time needed to run the algorithm to be approximately N*M, since we must search through all M candidates for each of the N desired colors. The trade-off, of course, is that we can find more colors that are distinguishable with more candidates. Here we increase M from 1000 to 10000. ```{r last10, fig.cap="Palette with more candidate colors."} set.seed(723451) # for reproducibility lastpal <- createPalette(10, c("#010101", "#ff0000"), M=10000) lastpal swatch(firstpal) swatch(lastpal) ```