The R package bootSVD can be used to implement fast, exact bootstrap principal component analysis and singular value decompositions for high-dimensional data, where the number of measurements per subject is much larger than the number of subjects. This package is based on the methodology outlined by Fisher et al. (2014), who demonstrate the method on a dataset of 352 brain magnetic resonance images (MRIs), with approximately 3 million measurements per subject.
The primary function in this package is the bootSVD function, for which we include a documented example based on simulated sleep electroencephalogram (EEG) data. When the data is too large to store in memory, functions in this package can also be applied to objects of class ff. These ff objects have a representation in memory, but store their primary contents on disk (see the ff package).
Speed improvements are driven by the fact that the sample size (n) is much smaller than the sample dimension, so an n-dimensional representation of the sample is sufficient for many calculations.
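The idea can be illustrated with a minimal base-R sketch. It does not use the bootSVD package and is not the package's implementation. The matrix sizes, the variable names, and the choice to skip recentering of each bootstrap sample are all assumptions made for illustration. After one SVD of the full p-by-n data matrix, each subject is described by n coordinates, so the SVD of a bootstrap sample only requires decomposing an n-by-n matrix.

```r
set.seed(1)
p <- 2000  # measurements per subject (illustrative size only)
n <- 20    # number of subjects
Y <- matrix(rnorm(p * n), nrow = p, ncol = n)  # columns are subjects

# One SVD of the full sample: Y = V %*% diag(d) %*% t(U)
s   <- svd(Y)
V   <- s$u                      # p x n orthonormal basis
DUt <- diag(s$d) %*% t(s$v)     # n x n: each column holds one subject's coordinates

# Draw one bootstrap sample of subjects (with replacement)
idx <- sample(n, replace = TRUE)

# Fast route: decompose only the n x n matrix of resampled coordinates,
# then map the left singular vectors back to p dimensions
s_low       <- svd(DUt[, idx])
V_boot_fast <- V %*% s_low$u
d_boot_fast <- s_low$d

# Slow route, for comparison: decompose the p x n resampled data directly
s_direct <- svd(Y[, idx])

# The two routes should agree (singular vectors only up to sign)
K <- 3
max_sv_diff <- max(abs(d_boot_fast - s_direct$d))
alignment   <- abs(colSums(V_boot_fast[, 1:K] * s_direct$u[, 1:K]))
```

Here `max_sv_diff` should be near zero and each entry of `alignment` should be near one, indicating that the leading components match. Only the first K components are compared because resampling with replacement duplicates some columns, which leaves the trailing singular values at zero and their vectors not uniquely defined.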
To install:

## if needed
install.packages("devtools")

## main package
library(devtools)
install_github('aaronjfisher/bootSVD')
library(bootSVD)

## to access help pages
help(package=bootSVD)
?bootSVD

References:
Aaron Fisher, Brian Caffo, and Vadim Zipunnikov. Fast, Exact Bootstrap Principal Component Analysis for p>1 million. Working Paper, 2014. http://arxiv.org/abs/1405.0922