% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gt_dapc.R
\name{gt_dapc}
\alias{gt_dapc}
\title{Discriminant Analysis of Principal Components for gen_tibble}
\usage{
gt_dapc(x, pop = NULL, n_pca = NULL, n_da = NULL, loadings_by_locus = TRUE)
}
\arguments{
\item{x}{an object of class \code{gt_pca}, or its subclass \code{gt_cluster_pca}}

\item{pop}{either a factor indicating the group membership of individuals; or
an integer defining the desired \emph{k} if \code{x} is a \code{gt_cluster_pca}; or NULL, if
\code{x} is a \code{gt_cluster_pca} and contains an element 'best_k', usually
generated with \code{\link[=gt_cluster_pca_best_k]{gt_cluster_pca_best_k()}}, which will be used to select the
clustering level.}

\item{n_pca}{number of principal components to be used in the Discriminant
Analysis. If NULL, k-1 will be used.}

\item{n_da}{an integer indicating the number of axes retained in the
Discriminant Analysis step.}

\item{loadings_by_locus}{a logical indicating whether the loadings and
contribution of each locus should be stored (TRUE, default) or not (FALSE).
Such output can be useful, but can also create large matrices when there
are a lot of loci and many dimensions.}
}
\value{
an object of class \link[adegenet:dapc]{adegenet::dapc}
}
\description{
This function implements the Discriminant Analysis of Principal Components
(DAPC, Jombart et al. 2010). This method describes the diversity between
pre-defined groups. When groups are unknown, use \code{\link[=gt_cluster_pca]{gt_cluster_pca()}} to infer
genetic clusters. See 'details' section for a succinct description of the
method, and the vignette in the package \code{adegenet} ("adegenet-dapc") for a
tutorial.
}
\details{
The Discriminant Analysis of Principal Components (DAPC) is designed to
investigate the genetic structure of biological populations. This
multivariate method consists of a two-step procedure. First, genetic data
are transformed (centred, possibly scaled) and submitted to a Principal
Component Analysis (PCA). Second, the principal components of the PCA are
submitted to a Linear Discriminant Analysis (LDA). A simple matrix operation
allows the discriminant functions to be expressed as linear combinations of
alleles, which in turn makes it possible to compute allele contributions.
More details about the computation of DAPC can be found in the indicated
reference.
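
As a rough illustration of the two steps (not the code used by
\code{gt_dapc()}), the sketch below runs a PCA followed by an LDA on a small
simulated genotype matrix, using base R and \code{MASS::lda()}. The simulated
data, the object names and the choice of three retained components are all
assumptions made for this example.

\preformatted{
library(MASS)

set.seed(1)
# Simulated allele counts (0/1/2): 60 individuals, 20 loci, 3 groups
# with different allele frequencies (purely illustrative data)
grp <- factor(rep(c("a", "b", "c"), each = 20))
freq <- c(a = 0.2, b = 0.5, c = 0.8)
geno <- matrix(
  rbinom(60 * 20, size = 2, prob = rep(freq[as.character(grp)], 20)),
  nrow = 60
)

# Step 1: PCA on centred genotypes, keeping the first 3 components
n_pca <- 3
pca <- prcomp(geno, center = TRUE, scale. = FALSE)
pc_scores <- pca$x[, 1:n_pca]

# Step 2: LDA on the retained principal components
da <- lda(pc_scores, grouping = grp)

# Express the discriminant functions as linear combinations of alleles:
# (loci x PCs) times (PCs x discriminant axes) = loci x discriminant axes
allele_loadings <- pca$rotation[, 1:n_pca] \%*\% da$scaling
dim(allele_loadings)
}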

Results can be visualised with \code{\link[=autoplot.gt_dapc]{autoplot.gt_dapc()}}; see the help of that
method for the available plots. There are also \link{gt_dapc_tidiers} for
manipulating the results. For the moment, this function returns objects of
class \code{\link[adegenet:dapc]{adegenet::dapc}}, which are
compatible with methods from \code{adegenet}; graphical methods for DAPC are
documented in \link[adegenet:dapcGraphics]{adegenet::scatter.dapc} (see ?scatter.dapc). This is likely
to change in the future, so do not rely on the objects remaining
compatible.

Note that there is currently no method to predict scores for
individuals not included in the original analysis. This is because we
do not yet have a mechanism to store the PCA information in the
object, and that information is needed for prediction.
}
\examples{
# Create a gen_tibble of lobster genotypes
bed_file <-
  system.file("extdata", "lobster", "lobster.bed", package = "tidypopgen")
lobsters <- gen_tibble(bed_file,
  backingfile = tempfile("lobsters"),
  quiet = TRUE
)

# Remove monomorphic loci and impute
lobsters <- lobsters \%>\% select_loci_if(loci_maf(genotypes) > 0)
lobsters <- gt_impute_simple(lobsters, method = "mode")

# Create PCA object
pca <- gt_pca_partialSVD(lobsters)

# Run DAPC on the `gt_pca` object, providing `pop` as factor
populations <- as.factor(lobsters$population)
gt_dapc(pca, n_pca = 6, n_da = 2, pop = populations)

# Run clustering on the first 10 PCs
cluster_pca <- gt_cluster_pca(
  x = pca,
  n_pca = 10,
  k_clusters = c(1, 5),
  method = "kmeans",
  n_iter = 1e5,
  n_start = 10,
  quiet = FALSE
)

# Find best k
cluster_pca <- gt_cluster_pca_best_k(cluster_pca,
  stat = "BIC",
  criterion = "min"
)

# Run DAPC on the `gt_cluster_pca` object
gt_dapc(cluster_pca, n_pca = 10, n_da = 2)
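
# A hedged sketch of inspecting the result. The element names used below
# (`ind.coord`, `assign`) are those of an `adegenet::dapc` object, which is
# the documented return class; `lobsters_dapc` is a name chosen for this
# example only.
lobsters_dapc <- gt_dapc(cluster_pca, n_pca = 10, n_da = 2)

# Coordinates of the first individuals on the discriminant axes
head(lobsters_dapc$ind.coord)

# Number of individuals assigned to each cluster
table(lobsters_dapc$assign)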

}
\references{
Jombart T, Devillard S and Balloux F (2010) Discriminant analysis
of principal components: a new method for the analysis of genetically
structured populations. BMC Genetics 11:94. doi:10.1186/1471-2156-11-94

Thia JA (2023) Guidelines for standardizing the application of
discriminant analysis of principal components to genotype data. Molecular
Ecology Resources 23:523–538. doi:10.1111/1755-0998.13706
}
