\documentclass[article,shortnames, nojss]{jss}
%\VignetteIndexEntry{Robust Tools for Three-way Component Analysis of Compositional Data: The R package rrcov3way}
%\VignetteKeywords{robustness, multi-way analysis, PARAFAC, TUCKER3}
%\VignettePackage{rrcov3way}
\usepackage[ansinew]{inputenc}
\usepackage{multirow}
\usepackage{color}
\usepackage{mathtools}
\usepackage{fixltx2e}
\usepackage{float}
\floatplacement{figure}{H}
\usepackage{amsmath} %% need for subequations
\usepackage{amssymb}
\usepackage{amsfonts}
\usepackage{hyperref}
\usepackage[english]{babel}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% declarations for jss.cls
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\vv}[1]{\mbox{\boldmath$#1$}}
\newcommand{\xv}{\mbox{\boldmath$x$}}
\newcommand{\yv}{\mbox{\boldmath$y$}}
\newcommand{\zv}{\mbox{\boldmath$z$}}
\newcommand{\mv}{\mbox{\boldmath$m$}}
\newcommand{\tv}{\mbox{\boldmath$t$}}
\newcommand{\xbarv}{\mbox{\boldmath$\bar x$}}
\newcommand{\uv}{\mbox{\boldmath$u$}}
\newcommand{\muv}{\mbox{\boldmath$\mu$}}
\newcommand{\nuv}{\mbox{\boldmath$\nu$}}
\newcommand{\deltav}{\mbox{\boldmath$\delta$}}
\newcommand{\chiv}{\mbox{\boldmath$\chi$}}
\newcommand{\ov}{\mbox{\boldmath$0$}}
\newcommand\MCD{\mathit{MCD}}
\newcommand\MVE{\mathit{MVE}}
\newcommand\OGK{\mathit{OGK}}
\newcommand\diag{\mathrm{diag}}
%\newcommand\det{\mathrm{det}}
\newcommand{\R}{\textbf{\textsf{R}}}
\newcommand{\class}[1]{``\code{#1}''}
\newcommand{\fct}[1]{\code{#1()}}
%% almost as usual
%% almost as usual
\author{Valentin Todorov\\UNIDO \And
        Violetta Simonacci\\University of Naples-L'Orientale \AND
        Maria Anna Di Palma\\University of Naples-L'Orientale \AND
        Michele Gallo\\University of Naples-L'Orientale}
\title{Robust Tools for Three-way Component Analysis of Compositional Data: The \proglang{R} package \pkg{rrcov3way}}
%% for pretty printing and a nice hypersummary also set:
\Plainauthor{Valentin Todorov, Violetta Simonacci, Maria Anna Di Palma, Michele Gallo} %% comma-separated
\Plaintitle{rrcov3way: An R Package for Robust Three-way Analysis} %% without formatting
%% \Shorttitle{R package for three-way analysis} %% a short title (if necessary)
\Abstract{
The \proglang{R} package \pkg{rrcov3way} provides a set of tools for fitting the Tucker3 and PARAFAC models to multidimensional arrays by use of classical, robust and compositional estimating procedures. Two main functions are implemented, Parafac and Tucker3, which include alternative options to the standard least squares algorithm in order to properly calculate the models in case of data with outliers and/or characterized by a biased covariance structure. A comprehensive collection of three-way plots, diagnostics and data processing functions is also made available.
A brief overview of multilinear tools, robustification procedures and compositional data analysis for three-way arrays is followed by a detailed presentation on the use and applicability of the implemented functions by means of real data examples included in the package.}
\Keywords{Parafac, Tucker3, compositional data, closed data, log-ratio, robustness, outlier}
%%\Plainkeywords{Parafac, Tucker3, compositional data, closed data, log-ratio, robustness, outlier} %% without formatting
%% at least one keyword must be supplied
%% publication information
%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
%% \Volume{13}
%% \Issue{9}
%% \Month{September}
%% \Year{2004}
%% \Submitdate{2004-09-29}
%% \Acceptdate{2004-09-29}
%% The address of (at least) one author should be given
%% in the following format:
\Address{
  Valentin Todorov\\
  Department of Policy, Research and Statistics\\
  United Nations Industrial Development Organization (UNIDO)\\
  Vienna International Centre\\
  P.O.Box 300, 1400, Vienna, Austria\\
  E-mail: \email{valentin@todorov.at}\\
  % URL: \url{http://statmath.wu-wien.ac.at/~zeileis/}
  Violetta Simonacci\\
  Department of Human and Social Sciences\\
  L'Orientale--University of Naples\\
  P.zza S.Giovanni Maggiore, 30\\
  Naples, 80134, Italy\\
  E-mail: \email{vsimonacci@unior.it}\\
  Maria Anna Di Palma\\
  Department of Human and Social Sciences\\
  L'Orientale--University of Naples\\
  P.zza S.Giovanni Maggiore, 30\\
  Naples, 80134, Italy\\
  E-mail: \email{madipalma@unior.it}\\
  Michele Gallo\\
  Department of Human and Social Sciences\\
  L'Orientale--University of Naples\\
  P.zza S.Giovanni Maggiore, 30\\
  Naples, 80134, Italy\\
  E-mail: \email{mgallo@unior.it}\\
}
%% It is also possible to add a telephone and fax number
%% before the e-mail in the following format:
%% Telephone: +43/1/31336-5053
%% Fax: +43/1/31336-734
%% for those who use Sweave please include the following line (with % symbols):
%% need no \usepackage{Sweave.sty}
%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\bi}{\begin{itemize}}
\newcommand{\ei}{\end{itemize}}
\newcommand{\be}{\begin{enumerate}}
\newcommand{\ee}{\end{enumerate}}
\renewcommand{\v}[1]{\mathbf{#1}}
%% Sweave options for the complete document
%%\SweaveOpts{keep.source=TRUE}
\SweaveOpts{prefix.string=rrcov3way}
\SweaveOpts{width=60}
\SweaveOpts{concordance=FALSE}

\begin{document}
%% include your article here, just as usual
%% Note that you should use the \pkg{}, \proglang{} and \code{} commands.

\section[Introduction]{Introduction}
\label{sec:intro}
The most usual way to start exploring data on $m$ objects in $p$ continuous variables is by principal component analysis (PCA). This procedure summarizes the most important information in the data by representing the objects and the variables simultaneously by a limited number of optimally selected components. The optimality of these components is understood in the sense that they explain the maximal possible variance. The components remain optimal even after rotation; thus, it is possible to search for a structure which allows for easier interpretation. If these data are measured at $k$ occasions (conditions, times, locations), generalizations of PCA that can handle three-way data are needed. The first such generalization that comes to mind is PARAFAC (parallel factor analysis), alternatively called CANDECOMP (canonical decomposition) \citep{Harshman:1970, Carroll:1970}.
PARAFAC summarizes simultaneously the objects, variables and occasions in an optimally selected lower dimensional (with a limited number of components) representation. The solution provided by PARAFAC is unique, i.e., it does not allow for rotation. Another popular trilinear decomposition method is the Tucker3 model, which was suggested already in 1966 by \citet{Tucker:1966} for solving three-way problems in the field of psychometrics. The decomposition provided by the Tucker3 model results in a core array and three eigenvector matrices. In contrast to the PARAFAC model, the solution provided by Tucker3 is not unique. Any arbitrary nonsingular rotation of all components simultaneously will retain the model representation, producing a new solution.\\\\
While in the rest of this paper we will limit the discussion to three-way data, it is obvious that extension to higher orders is straightforward. With today's advanced data capture methods across a variety of applications producing complex data structures, multiway methods, with their ability to uncover the underlying structures, have gained wide popularity. Originating in psychometrics, these models are now applied in chemometrics, social networks, process control and econometrics. This has resulted in a large number of publications; see \citet{acar:2009} for a review. \citet{kroonenberg:2016} provides a very personal view on the development and the recent status of multiway analysis.\\\\
To make the models readily available for practical use, a number of software tools were developed, particularly in \proglang{MATLAB}, with the most comprehensive package being the \pkg{NWay} toolbox \citep{bro:2000}. A graphical interface to \pkg{NWay} is provided by \pkg{CuBatch} \citep{cubatch:2005}. The robust version of PARAFAC introduced by \citet{Engelen11} is provided in the \proglang{MATLAB} toolbox for robust statistics \pkg{LIBRA} \citep{verboven-libra:2010}. Currently several \proglang{R} packages are available at CRAN: \pkg{PTAk} of \citet{leibovici:2010}, \pkg{ThreeWay} of \citet{threeway:2014}, \pkg{multiway} of \citet{multiway:2016} and \pkg{rTensor} of \citet{rTensor:2018}. In this work another package for three-way modeling is presented, the package \pkg{rrcov3way}, which aims to introduce additional tools not yet available in \proglang{R}. Specifically, its content responds to the need for flexible functions suited for appropriately dealing with problematic features such as the presence of outliers and/or of a compositional structure in the data. The standard Tucker3 and PARAFAC estimating procedure, alternating least squares (ALS), is extremely sensitive to the presence of anomalous observations, which may artificially inflate the variance and skew the model away from the real underlying solution. For these reasons robust alternatives to classical PARAFAC and Tucker3 models have been proposed. A robust version of Tucker3 based on the use of the minimum covariance determinant (MCD) estimator \citep{Rousseeuw99} was introduced by \citet{pravdova2001robust}. Later on \citet{Engelen11} extended this methodology to the PARAFAC model by also revealing and correcting some of its inefficiencies. These inefficiencies were corrected also for the robust version of the Tucker3 model by \cite{todorov:tucker3} and at the same time a version for handling compositional data was proposed. The robust functions provided in \pkg{rrcov3way} for both Tucker3 and PARAFAC follow these two latter approaches.\\\\
Modeling compositional data also requires special tools.
Compositions or CoDa (compositional data) are positive vectors that carry relative information and can be expressed as proportions of a whole. The elements of such vectors are bounded by an explicit or implicit sum constraint, which imposes a negative covariance bias and contradicts the typical assumptions of classical multivariate statistics. Geometrically, this constraint translates into compositions having one redundant dimension; therefore they are confined to a subset of real space called the simplex, characterized by a specific geometry called Aitchison geometry \citep{Ego05,Ego06,Ego03,egozcue2011elements}. Standard models are designed to work without any bias within a Euclidean framework \citep{PawEgo,Bill01} and would return distorted results on compositions. The preferred strategy for modeling CoDa is to project them onto Euclidean space by means of a transformation into log-ratio coordinates. Subsequently, classical analysis can be carried out, while results must be interpreted in compositional terms. When the row vectors of a three-way array are compositional in nature, i.e., observations are expressed as the proportion of a total recorded at several occasions, this approach must be followed before trilinear models can be fitted to the data. For this reason, compositional versions of the Tucker3 and PARAFAC models based on log-ratios were developed. For details see \citet{gallo:2015, gallo:2013, Engle13}. Recently, a robust version of the PARAFAC model for compositions was also developed by \citet{dipalma:2018}. These compositional and robust-compositional variants of classical trilinear models are made available in the \pkg{rrcov3way} package. \\\\
In brief, the main purpose of this work is to illustrate the unique tools introduced in the \proglang{R} package \pkg{rrcov3way} for modeling three-way data and three-way compositions with or without outliers. This is achieved by demonstrating through real data examples the relevance and correct use of the standard, robust, compositional and compositional-robust procedures included in the package and of other functions provided for the proper treatment and representation of results. In detail, the paper is organized in the following manner. In Section~\ref{sec:why} the innovative features of the \pkg{rrcov3way} package with respect to other existing packages in \proglang{R} and \proglang{MATLAB} are illustrated. In Section~\ref{sec:theory} the theoretical background is introduced by recalling some basic notions of standard trilinear models, robust procedures and compositional data analysis for three-way data. Section~\ref{sec:ex-intro} presents a quick ``getting started'' example demonstrating the main functions and necessary steps for a complete three-way data analysis. For this purpose the \emph{OECD Electronics Industries Data} from \citet{kroonenberg:2008} are used. As is usual in developing \proglang{R} packages, \pkg{rrcov3way} contains many example data sets which can be used to illustrate different features of the considered methodology. Examples based on several of these data sets are presented in Section~\ref{sec:examples}. Finally, Section~\ref{sec:conclusions} concludes with a discussion of the package features, its limitations and the possible further extensions.

\section{Why do we need rrcov3way?}
\label{sec:why}
In the previous section the main \proglang{R} packages for the analysis of three-way arrays were recalled, namely \pkg{PTAk}, \pkg{ThreeWay}, \pkg{multiway} and \pkg{rTensor}.
They all implement general functions for carrying out standard models, but at the same time, by focusing on specific modeling needs, they also provide different tools to the community. The \pkg{PTAk} package aims to deal with the modeling of data characterized by a spatio-temporal context. It introduces the multidimensional method Principal Tensor Analysis on k-Modes, but it also provides standard PARAFAC and Tucker3 functions. There are, however, some limitations on the minimal number of components which can be extracted. This package also offers the advantage of providing procedures which support non-identity metrics and penalization. The \pkg{ThreeWay} package provides a more complete suite of functions for modeling three-way data. Here the emphasis is on the PARAFAC and Tucker3 models. The main functions are also available as interactive tools which guide the user step by step through the analysis, offering the possibility to choose constraints, pre-processing and post-processing alternatives. Other useful features include basic plotting tools and the possibility to obtain bootstrap percentile intervals. The \pkg{multiway} package, on the other hand, concentrates on the generalization of component models to $n$-dimensional arrays; in addition, it offers the option to impose non-negativity constraints and provides other functions such as \emph{Individual Differences Scaling}, \emph{Multiway Covariates Regression} and \emph{Simultaneous Component Analysis}. Lastly, the \pkg{rTensor} package offers a framework for handling and analyzing $n$-th order tensors by providing the S4 class \code{Tensor} and common tensor operations and decompositions, including the PARAFAC/CANDECOMP decomposition, Generalized Low Rank Approximations of Matrices, Multilinear Principal Component Analysis, Population Value Decomposition and all Tucker models. The $t$-product and the $t$-singular value decomposition are also implemented for three-mode cases.\\\\
It is clear that none of the existing \proglang{R} packages provides robust versions of standard decompositions for dealing with three-way data contaminated by outliers. Such tools are available only in \proglang{MATLAB}. Here the \pkg{LIBRA} package implements a robust PARAFAC model while the \pkg{CuBatch} interface allows for outlier identification analysis by providing an advanced version of the \pkg{NWay} toolbox, a compendium of all multi-way functions in the environment. In this perspective the first important contribution of the \pkg{rrcov3way} package is to integrate the classical functions for the decomposition of three-mode data with robust tools. Moreover, it adds another advanced feature which none of the above packages provides: multi-way modeling of compositional data. Like other statistical tools, the PARAFAC and Tucker3 models cannot be applied directly to compositions due to the spurious correlations issue; therefore a specific methodology must be followed. Attention to compositional data analysis has increased in the past years and its usage has spread to various disciplines such as chemistry, geosciences, biology, agriculture, but also human and behavioral sciences. This increase in applicability is also due to the broadening of the concept of composition, which now also includes implicit sum constraints and problems with both dimensional and relative aspects \citep{pawlowsky2015modeling, pawlowsky2015tools}. Tailoring standard multilinear tools to this methodology can, thus, prove of great interest.
For this reason an option to carry out a compositional version of the model functions is included in the package. Specifically, the included \code{Parafac()} and \code{Tucker3()} functions present the option of carrying out four different types of analysis by means of two parameters, \code{robust=TRUE/FALSE} and \code{coda.transform="none"} (default) or \code{coda.transform="ilr"}:
\begin{enumerate}
\item Classical Analysis: the model is estimated on the original array entries (with or without standard preprocessing) using the alternating least squares (ALS) algorithm (\code{robust=FALSE, coda.transform="none"});
\item Robust Analysis: the original array is robustified and then the ALS algorithm is performed. Outliers are identified and accommodated without affecting the solution (\code{robust=TRUE, coda.transform="none"});
\item Compositional Analysis: data are transformed into log-ratio coordinates, the ALS algorithm is performed and the output is expressed in log-contrasts (\code{robust=FALSE, coda.transform="ilr" or coda.transform="clr"});
\item Robust and Compositional Analysis: data are first expressed in log-ratio coordinates, then the transformed array is robustified and ultimately the ALS algorithm is performed (\code{robust=TRUE, coda.transform="ilr"}).
\end{enumerate}
These four types of analysis are provided in \pkg{rrcov3way} for both PARAFAC and Tucker3 models. Compared to the other existing software packages, \pkg{ThreeWay} and the other already mentioned \proglang{R} packages as well as the \proglang{MATLAB} packages \pkg{NWay} and \pkg{CuBatch} cover only (1), while the robustness \proglang{MATLAB} package \pkg{LIBRA} covers only (2) for the PARAFAC model. The compositional versions (3) and (4) as well as the robust version of Tucker3 from (2) are available only in \pkg{rrcov3way}. \\\\
The package \pkg{rrcov3way} relies on several functions \emph{imported} from the package \pkg{ThreeWay} as well as on functions for robust PCA imported from package \pkg{rrcov} \citep{todorov-oof} and functions for robust preprocessing imported from package \pkg{robustbase} \citep{robustbase}. The package \pkg{nnls} \citep{mullen-nnls:2012} is used for the Lawson-Hanson algorithm for non-negative least squares (NNLS) and several utility mathematical functions are imported from package \pkg{pracma} \citep{borchers-pracma:2021}.\\\\
In addition, the package includes a useful set of plotting tools which allow for quick and correct visualization of the main results. Details on the treatment of compositions and robustification procedures are provided in the following section.

\section{Methods and algorithms for three-way data analysis}
\label{sec:theory}
In standard multivariate analysis the main data structure is a matrix of two dimensions, say $\vv{X}$ of order $I \times J$ with $I$ and $J$ denoting the number of objects and variables, while in three-way analysis we deal with data collected in three dimensional arrays (tensors) $\vv{\underline{X}}$ of order $I \times J \times K$. A three dimensional array can be visualized as a box consisting of $k=1, \ldots, K$ frontal slices $\mathbf{X}_k$, containing the $I \times J$ object by variable matrices, one matrix for each occasion. Similarly, if the objects or variables index is fixed instead of the one for occasions, the array can be seen also as a collection of horizontal slices $\mathbf{X}_i$ of dimension $K \times J$ or vertical slices $\mathbf{X}_j$ of dimension $I \times K$.
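In \proglang{R} these slices correspond directly to indexing a three dimensional array. The following minimal sketch (using a small randomly generated array rather than one of the package data sets, and not evaluated here) illustrates how the different slices can be extracted:
<<eval=FALSE>>=
X <- array(rnorm(4 * 3 * 2), dim = c(4, 3, 2))  # I = 4 objects, J = 3 variables, K = 2 occasions
X[, , 2]    # frontal slice for occasion k = 2 (an I x J matrix)
X[1, , ]    # slice for object i = 1 (R returns it as a J x K matrix)
X[, 3, ]    # slice for variable j = 3 (an I x K matrix)
@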
In general, rather than speaking about objects, variables and occasions, the three entities of the array are defined as modes, namely the A, B and C modes, respectively. A three-way array can be converted into a two-way matrix if two-way slices are juxtaposed by row or column \citep[p.7]{kroonenberg:2008}. This operation is defined as matricization or unfolding. In particular, an array is matricized with respect to the first mode if frontal slices are put next to each other row-wise. In this manner an $(I\times JK)$ wide matrix ${\bf X}_A = [{\bf X}_1|\ldots|{\bf X}_k|\ldots|{\bf X}_K]$ is obtained, which is often used to write decomposition models in a simplified manner.\\\\
Three-way data are generally modeled by standard multilinear tools, namely the PARAFAC and Tucker3 models, which can be defined as different generalizations of PCA to higher order arrays, inheriting distinct features of this technique. The Tucker3 model aims to provide the best approximation of a three-way array within a joint low-dimensional subspace. Consequently, it is characterized by subspace uniqueness and rotational freedom, but, unlike PCA, it does not provide a unique solution or an equal number of components across modes. The PARAFAC model yields the best low-rank approximation of the array by including the additional constraint that each component in a given mode is related to only one component in the other modes; thus a unique solution can be identified, but the model cannot be rotated without loss of fit. Here the two models are briefly recalled for completeness; for more details on their properties see \cite{harshman1994parafac}, \cite{smilde:2004} and \cite{kroonenberg:2008}.

\subsection{Tucker3 model}
The Tucker3 model \citep[see][]{Tucker:1966, bro:2000} with $P$, $Q$ and $R$ components decomposes the data array $\vv{\underline{X}}$ into three orthogonal loading matrices $\vv{A}$ $(I \times P)$, $\vv{B}$ $(J\times Q)$, $\vv{C}$ $(K\times R)$ and an array of component interactions called the core array $\vv{G}$ $(P\times Q \times R)$ which describes the relation between combinations of components. The Tucker3 model can be written formally as
\begin{equation} \label{eq:tucker3}
\vv{X}_A=\vv{AG}_A(\vv{C}\otimes \vv{B})^\top + \vv{E}_A
\end{equation}
where $\vv{X}_A$, $\vv{G}_A$ $(P \times QR)$ and $\vv{E}_A$ $(I\times JK)$ are the original array, the core array and the error array matricized with respect to mode A. The symbol $\otimes$ represents the Kronecker product between two matrices. To estimate the optimal component matrices the residual sum of squares
\begin{equation} \label{eq:RSS}
||\vv{E}_A||^2=\sum_{i=1}^I\sum_{j=1}^J\sum_{k=1}^K(x_{ijk}-\hat x_{ijk})^2 = \sum_{i=1}^I||\vv{x}_i - \vv{\hat x}_i ||^2 = \sum_{i=1}^I RD_i^2
\end{equation}
is minimized. The residual distance for observation $i$ is thus given by
\begin{equation} \label{eq:rdist}
RD_i = ||\vv{x}_i - \vv{\hat x}_i || = \sqrt{\sum_{j=1}^J\sum_{k=1}^K(x_{ijk}-\hat x_{ijk})^2}
\end{equation}
and the estimation is equivalent to the minimization of the sum of the squared distances. Three-way models are usually fitted by an iterative procedure based on alternating least squares, TUCKALS3 \citep{kroonenberg1980principal}. The component matrices are estimated one at a time, keeping the estimates of the other component matrices fixed, i.e., we start with initial estimates of $\vv{B}$ and $\vv{C}$ and find an estimate for $\vv{A}$ conditional on $\vv{B}$ and $\vv{C}$ by minimizing the objective function.
Estimates for $\vv{B}$ and $\vv{C}$ are found analogously. The iteration continues until the relative change in the model fit is smaller than a predefined constant.

\subsection{PARAFAC model}
The PARAFAC/CANDECOMP model \citep[see][]{Carroll:1970, Harshman:1970} decomposes the 3-way data array $\vv{\underline{X}}$ into three loading matrices $\vv{A}$ $(I \times F)$, $\vv{B}$ $(J\times F)$, $\vv{C}$ $(K\times F)$ with $F$ components (using the same number for each mode). The PARAFAC model can be written formally as
\begin{equation} \label{parafac}
\vv{X}_A = \vv{AI}_A(\vv{C} \otimes \vv{B})^\top + \vv{E}_A,
\end{equation}
where $\vv{X}_A$, $\vv{E}_A$ and $\otimes$ are defined as in Equation~\ref{eq:tucker3} and $\vv{I}_A$ $(F \times FF)$ is the superdiagonal three-way identity array matricized in mode A, which substitutes the core array of the Tucker3 model, because here the latent structure is the same across modes. Specifically, the PARAFAC model can be considered a constrained version of Tucker3 with the same number of components for each mode $(P = Q = R = F)$, in which no interaction between the components is allowed. Similarly to the Tucker3 model, a PARAFAC-ALS estimation algorithm can be constructed by minimizing the residual sum of squares. Analogously, residual distances $RD_i$ are defined as given in Equation~\ref{eq:rdist}.

\subsection{Robust algorithms}
\label{sec:robust}
It is well-known that algorithms which rely on least squares break down in the presence of outliers. This is the case for PCA, for which this problem was extensively studied and a number of robust algorithms, resistant to outliers, were proposed. Similarly, the ALS procedure is badly affected by anomalous observations in the data and robust algorithms for three-way procedures are also needed. A robust version of Tucker3 was proposed by \citet{pravdova2001robust}, later improved and adapted for PARAFAC by \citet{Engelen11}. The idea of a robust version of PARAFAC or Tucker3 is to identify enough ``good'' observations and then to perform the classical ALS procedure on these observations. Thereafter, results are used to compute the residual distances for all observations according to Equation~\ref{eq:rdist} and select as ``good'' observations those that have smaller residual distances. This is repeated until no significant change is observed. Finally, a reweighting step is carried out to improve the efficiency of the estimates. In order to initially identify the ``good'' observations, a robust version of principal component analysis, e.g., ROBPCA \citep{hubert-ROBPCA:2005, todorov-oof}, applied to the matricized array, is used. The convergence of the algorithm can be ensured \citep{Engelen11}; however, it is not guaranteed that a global optimum will be found. In order to label extreme points once the robust ALS procedure is performed, two distances are computed: the robust residual distance calculated as in Equation~\ref{eq:rdist}, which indicates how well the fitted data correspond to the observations, and the robust score distance, a Mahalanobis-type distance in the scores space defined as:
\begin{equation} \label{sdist}
\mbox{SD}_i = \sqrt{(\vv{\hat{a}}_i - \vv{\hat{\mu}}) ^\top \vv{\hat{\Sigma}}^{-1}(\vv{\hat{a}}_i - \vv{\hat{\mu}})},
\end{equation}
where $\vv{\hat{a}}_i$ is the $i$-th row ($i$-th score) of the matrix $\vv{\hat{A}}$, and $\vv{\hat{\mu}}$ and $\vv{\hat{\Sigma}}$ are estimates of location and covariance, respectively, of $\vv{\hat{A}}$.
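As a small illustration of how such score distances can be obtained in \proglang{R}, the following sketch computes them from a score matrix \code{Ahat} (a hypothetical placeholder for an estimated mode A component matrix) using the MCD estimates from \pkg{robustbase}; the actual computations inside \pkg{rrcov3way} may differ in detail.
<<eval=FALSE>>=
library("robustbase")
## Ahat: I x F matrix of estimated scores (hypothetical input, e.g., the A component matrix)
mcd <- covMcd(Ahat)                # robust location and covariance of the scores
SD  <- sqrt(mahalanobis(Ahat, center = mcd$center, cov = mcd$cov))
@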
For more details on the theoretical aspects of the robust PARAFAC algorithm included in \pkg{rrcov3way} see \citet{Engelen11}.

\subsection{Compositional analysis}
\label{sec:compositional}
A three-way compositional array can be organized in such a way that the rows of $\vv{\underline{X}}$ consist of $K$ concatenated $J$-part compositional vectors in the simplex sample space:
\begin{equation}
S_{k}^J=\biggl\{\vv{x}_{ik}=(x_{i1k},\ldots,x_{ijk},\ldots,x_{iJk}),~x_{ijk}> 0,~\forall j,~{\displaystyle{\sum_{j=1}^J x_{ijk}}}=\kappa\biggr\},
\end{equation}
where the constant $\kappa$ can assume any value, for example 1 for unit data. Before applying PARAFAC or Tucker3 algorithms (robust or classical), compositions have to be transformed to Euclidean space coordinates. Following the approach proposed by Aitchison, an isometric mapping between $S_{k}^J$ and a hyperplane of ${\mathbb R}^{J}$ is given by the centered log-ratio (\textit{clr}) coordinates \citep{Ait86}:
\begin{equation} \label{clrtrafo}
\vv{y}_{ik} = clr(\vv{x}_{ik})=(y_{i1k},\ldots ,y_{iJk})= \left(\ln {\frac{x_{i1k}}{\sqrt[J]{\prod_{j=1}^J {x_{ijk}}}}},\ldots, \ln {\frac{x_{iJk}}{\sqrt[J]{\prod_{j=1}^J {x_{ijk}}}}}\right).
\end{equation}
The vectors $\vv{y}_{ik}$ ($i=1,\ldots ,I$) can be collected as rows in a matrix $\vv{Y}_k$ for each occasion $k=1,\ldots, K$, namely the \textit{clr}-coordinates version of the slice $\vv{X}_k$, which can alternatively be thought of as the \textit{k}-th slice of a three-way array $\vv{\underline{Y}}$. The new \textit{clr} coordinates sum to zero by construction \citep{Ait86}, resulting in perfect data collinearity. This is not an issue for classic PARAFAC and Tucker3 but creates problems in case of anomalous observations. Outliers in a compositional framework are not necessarily represented by extreme values but rather by anomalous ratios between elements. An outlier in three-way compositional data occurs if a composition is very different from the others in terms of ratio between parts for one occasion or for combined occasions. Standard robust estimators are inapplicable on \textit{clr} coordinates due to the singular covariance matrices \citep{maronna:2006}; consequently, it becomes necessary to formulate the problem using an alternative coordinate system.\\\\
The isometric log-ratio (\textit{ilr}) transformation was introduced by \citet{Ego03}, who demonstrated that the \textit{ilr}-values of a composition represent an isometric mapping from $S_{k}^J$ to ${\mathbb R}^{J-1}$. Here the formalization is reported for a three-way case:
\begin{equation} \label{ilrtrafo}
\vv{z}_{ik}=ilr(\vv{x}_{ik})=(z_{i1k},\ldots ,z_{i,J-1,k}), \qquad z_{ijk}= \sqrt{\frac{J-j}{J-j+1}}\,\ln\frac{x_{ijk}}{\sqrt[J-j]{\prod_{l=j+1}^J x_{ilk}}},
\end{equation}
with $j=1,\dots ,J-1$. For full details on the Tucker3 or PARAFAC analysis for CoDa see \citet{gallo:2015,Engle13}. \\\\
Robust PARAFAC and Tucker3 models can be carried out on arrays of \textit{ilr} coordinates without further concerns. Therefore, this is the approach implemented in the \pkg{rrcov3way} functions for CoDa. For more details on robust PARAFAC analysis for CoDa see \citet{dipalma:2018}.\\\\
The \textit{ilr} coordinates present the disadvantage of being hard to interpret and plot. However, with easy computations, results can be translated into \textit{clr} terms.
Specifically, the loading matrices $\vv{A}$ and $\vv{C}$ computed by both models are invariant under a change of basis, which means that their estimates are the same whether \textit{ilr} or \textit{clr} coordinates are used. The only difference is recorded in the second mode. In particular, the loading matrices $\vv{B}_{clr}$ and $\vv{B}_{ilr}$ are different, but are linked by the relation $\vv{B}_{clr}= \vv{\Psi} \vv{B}_{ilr}$ with $\vv{\Psi}^\top\vv{\Psi}=\vv{I}_{J-1}$ and $\vv{\Psi}\vv{\Psi}^\top=\vv{I}_{J}-\vv{1}_J\vv{1}_J^\top/J$. Hence, in \pkg{rrcov3way}, CoDa functions yield both $\vv{B}_{clr}$ and $\vv{B}_{ilr}$ as standard outputs to facilitate interpretation.

\section{Getting started with rrcov3way}
\label{sec:ex-intro}
%% set the prompt to "R> " and the continuation to "+ "
<<>>=
options(prompt = "R> ", continue = "+ ", width = 70, useFancyQuotes = FALSE)
@
The package \pkg{rrcov3way} contains several built-in data sets which can be used for demonstrating its functionality; therefore we start by loading the package and one of the available data sets, \code{elind}, the OECD Electronics Industries Data from \cite{kroonenberg:2008}.
<<>>=
library("rrcov3way")
data("elind")
@
The Organisation for Economic Co-operation and Development (OECD) publishes comparative statistics of the export size of various sectors of the electronics industry: information science, telecommunication products, radio and television equipment, components and parts, electro-medical equipment and scientific equipment. The data consist of specialization indices of electronics industries of 23 European countries for the years 1973--1979. The specialization index is defined as the proportion of the monetary value of an electronics industry in the total export value of manufactured goods of a country, compared to the similar proportion for the world as a whole \citep[see][]{elind}. Let us look at what is in the data set: its dimensions and the labels of each of the modes.
<<>>=
dim(elind)
@
The data set is a three-way array with dimension $23 \times 6 \times 7$. The frontal slices (mode C) are matrices with dimension $23 \times 6$ representing the data for all countries and all industries in one year. The labels are:
<<>>=
rownames(elind[, , 1])
colnames(elind[, , 1])
@
The lateral slices (mode B) represent the data for all countries and years for one industry (matrices with dimension $23 \times 7$) and the labels are:
<<>>=
rownames(elind[, 1, ])
colnames(elind[, 1, ])
@
First of all we center and scale the data, using the default procedures for centering and scaling.
<<>>=
elind <- do3Scale(elind, center = TRUE, scale = TRUE)
@
For this purpose we use the function \code{do3Scale()}, which by default will center the data across mode A by subtracting the arithmetic mean column-wise and will scale within mode B to sum of squares equal to 1. Next we perform classical PARAFAC analysis with three components (\code{ncomp = 3}; the default would be \code{ncomp = 2}).
<<>>=
res <- Parafac(elind, ncomp = 3)
res
@
The function returns an \code{S3} object containing a fit value expressed as a percentage and the component matrices for the different modes. The default \code{print()} method specifies the model and shows the fit value in per cent. It is possible to inspect any of the component matrices directly:
<<>>=
head(res$A)
res$B
@
Next we can produce different plots using the standard method \code{plot()}. The default plot is a distance-distance plot presenting the \emph{residual distances} against the \emph{score distances}.
The first distance expresses the closeness between a single point and its projection in the subspace spanned by the extracted components, while the second measure expresses how far a score point is from the majority of the data (the score matrix) projected in the component subspace; it is comparable to the classical Mahalanobis distance. Outliers are labeled by analyzing the distribution of the residual distance (RD) and the score distance (SD): an observation is declared to be an extreme point if it exceeds a specific threshold (dashed line), and the cut-off value differs between the two measures. The robust and classical versions of both distances, their computation and the identification of the cut-off values are presented in detail in Section~\ref{sec:visual}. In the left panel of Fig.~\ref{fig:intro-dd} the classical distances are presented, since we have not selected the \emph{robust} option. We see that Turkey is marked as an outlier. Compare with \cite{kroonenberg:2008}, where Turkey is identified as an outlier even with the residual plots introduced there. The analysis is then repeated using the robust version of the PARAFAC model and the results are presented in the right panel of Fig.~\ref{fig:intro-dd} in a distance-distance plot (the default plot).
<<>>=
(resr <- Parafac(elind, ncomp = 3, robust = TRUE))
@
\begin{figure}[H]
\centering
<<fig=TRUE>>=
oldpar <- par(mfrow = c(1,2))
plot(res, main = "Classical distance-distance plot")
plot(resr, main = "Robust distance-distance plot")
par(oldpar)
@
\caption{Classical and robust distance-distance plot for the PARAFAC model with 3 components of the OECD data.}
\label{fig:intro-dd}
\end{figure}
Now not only Turkey but also Poland is flagged as a bad leverage point. Greece, New Zealand, Denmark and Switzerland are shown as residual outliers.\\\\
Another plot can be selected using the argument \code{which="comp"} while the argument \code{mode} selects the mode (A, B or C) to display. By default the component plot for mode A will be shown. If we want to present a paired component plot for other modes, we use for example the argument \code{mode="B"} for mode B. An example of a paired component plot for a robust three component PARAFAC model of the OECD data is shown in Fig.~\ref{fig:intro-comp}.
\begin{figure}[H]
\centering
<<fig=TRUE>>=
oldpar <- par(mfrow = c(1,2))
plot(resr, which = "comp", main = "Paired component plot (mode A)")
plot(resr, which = "comp", mode = "B", main = "Paired component plot (mode B)")
par(oldpar)
@
\caption{Paired component plot for a robust three component PARAFAC model of the OECD data. The left panel presents mode A and the right one mode B.}
\label{fig:intro-comp}
\end{figure}
The available plots will be described in Section~\ref{sec:visual} and will be demonstrated in the example Section~\ref{sec:examples}.

\section{Software approach}
\label{sec:software}
In this section, first the data structures and their manipulation will be considered and then the main model fitting interface will be described. This includes the main estimation functions, their control options and the objects returned. Further, the important steps in the three-way analysis like preprocessing and postprocessing will be presented.

\subsection{Data structure and data manipulation}
\label{sec:structure}
The main data structure used throughout the package is a three dimensional array. The \code{dimnames} are the labels of the three modes.
We can think of the three dimensional array as consisting of $K$ frontal, $I$ horizontal and $J$ lateral slices, as shown in the left-hand side of Fig.~\ref{fig-matricization} \citep[see][]{kiers2000towards}. All data sets provided with the package are in this format.\\\\
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{matricization}
\caption{Matricization of a three-way array into mode A matrix $\vv{X}_A$, mode B matrix $\vv{X}_B$ and mode C matrix $\vv{X}_C$}
\label{fig-matricization}
\end{figure}
In some cases, particularly in most of the computational algorithms, it is more convenient to represent the three-way array as a two dimensional matrix by joining the slices in a specific way. This procedure is called \emph{matricization} or \emph{unfolding} of the three dimensional array and can be done in one of the three possible ways shown in Fig.~\ref{fig-matricization} on the right. These matrices will be denoted in the code as \code{Xa}, \code{Xb} and \code{Xc} respectively. To convert a three-way array to one of these matrices the function \code{unfold()} can be used, specifying the required mode (defaults to \code{mode="A"}). To restore a matricized array back to a three dimensional array the function \code{toArray()} is used. Let us consider as an example \citep[][Section 2.6]{kiers2000towards} a $4 \times 3 \times 2$ array consisting of the following two frontal slices:\\\\
$\begin{pmatrix} 1 & 2 & 0\\ 0 & 1 & 0\\ 1 & 0 & -1\\ -1 & 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 1 & 2 & 0\\ 0 & 0 & 1 \end{pmatrix}$
<<>>=
x <- c(1, 0, 1, -1, 2, 1, 0, 0, 0, 0, -1, 0, 0, 1, 1, 0, 1, 0, 2, 0, 0, 0, 0, 1)
X <- array(x, dim = c(4, 3, 2))
dimnames(X) <- list(1:4, 1:3, 1:2)
X
(Xa <- unfold(X))
(Xb <- unfold(X, mode = "B"))
(Xc <- unfold(X, mode = "C"))
@
These three forms of matricization are related to each other by a simple cyclic permutation of the modes, which is executed by the function \code{permute()}. Another use of the matricized format is to read three-way arrays from files and write them to files.

\subsection{Model fitting interface}
\label{sec:interface}
The modeling interface of the \pkg{rrcov3way} package consists of two main functions, \code{Parafac()} and \code{Tucker3()}, for estimating the corresponding models. These functions take as input a three-dimensional data array and return an \code{S3} object of class \code{parafac} or \code{tucker3}, respectively. These objects are basically lists containing the estimated loading matrices \code{A}, \code{B} and \code{C} (plus the core array \code{GA} in case of Tucker3), different estimation options and results, like the fit value and the number of iterations performed, as well as diagnostic output like the residual distances and the outlier flags. On these objects a number of standard \proglang{R}~functions can be applied: \code{print()}, \code{show()}, \code{summary()}, \code{plot()} as well as a number of postprocessing functions which will be described in the next section. Whether to perform robust analysis is controlled by the parameter \code{robust=FALSE} (default) or \code{robust=TRUE}. Similarly, whether to take into account the closure of the data (compositional or not) is controlled by the parameter \code{coda.transform="none"} or \code{coda.transform="ilr"}.
Of course this parameter could also have been logical, like \code{robust}, but in this way we leave the door open for implementing other transformations for compositional data, like the \emph{pivot coordinates} or \emph{weighted pivot coordinates} \citep[see][]{filzmoser-book:2018}.

\subsubsection{Initialization of the algorithms}
For starting the ALS estimation it is necessary to provide initial values of the loading matrices. Good starting values can potentially speed up the algorithm and help ensure that the global minimum is found. By default the initial values are based on an approximate solution obtained from the generalized singular value decomposition. Alternatively, the user can select random start matrices by setting \code{start="random"} or provide a list of three matrices to be used as starting values.\\\\
For stopping the iterations the relative change in fit between two iterations is used: the algorithm stops if it is below a certain value given by the parameter \code{conv}, by default \code{conv=1e-6}.

\subsubsection{Constrained estimation for PARAFAC}
Sometimes it is necessary to constrain the PARAFAC solution in order to improve the interpretability or for other reasons. In psychometrics, orthogonality constraints are often used as a means of overcoming problems with unstable solutions. In chemometrics, the preferred constraint is nonnegativity of the solution, producing nonnegative loadings. A general method to find the least squares loading vector given a nonnegativity constraint has been proposed by \citet{lawson:nnls}, which is implemented in the \proglang{R}~package \pkg{nnls}. Constraints are requested from the \code{Parafac()} function using the parameter \code{const}. The default is \code{const="none"} for no constraints. Orthogonality constraints for all modes are requested by \code{const="orth"}, nonnegativity constraints by \code{const="nonneg"} and zero correlation constraints by \code{const="zerocor"}. Examples for using nonnegativity constraints in PARAFAC can be found in Section~\ref{sec:ex-amino}.

\subsubsection{Parameters for robust estimation}
If \code{robust=TRUE} is selected in the call of the modeling function (\code{Parafac()} or \code{Tucker3()}), robust estimation will be performed, for which several more parameters can be set.\\\\
First of all, the level of robustness, the so-called breakdown point, can be set between 0.5 (maximal robustness) and 0 (no robustness). This is done through the parameter \code{alpha} which is passed to the function \code{PcaHubert()}. The default value is \code{alpha=0.75}, which corresponds to a breakdown point of 25\% and acceptable efficiency. It can be changed to a value in the range from \code{alpha=0.5} for maximal robustness to \code{alpha=1} for no robustness.\\\\
The next parameter related to robustness is the number of components $k$ to use in the robust PCA carried out in the first step of the robust PARAFAC or Tucker3 algorithms. By default \code{ncomp.rpca=0}, which forces the algorithm to find $k$ such that $l_k/l_1 \ge 10^{-3}$ and $\sum_{j=1}^k l_j/\sum_{j=1}^r l_j \ge 0.8$, where $l_1, \ldots, l_r$ are the eigenvalues of the sample covariance matrix of the data. Alternatively, the number of principal components $k$ can be specified by the user after inspecting the scree plot.
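As a simplified illustration of this default rule (not the actual internal code of \code{PcaHubert()}), the selection could be sketched as follows for a given vector of eigenvalues:
<<eval=FALSE>>=
## l: eigenvalues of the sample covariance matrix, sorted in decreasing order
select_k <- function(l) {
    r <- sum(l / l[1] >= 1e-3)           # discard negligible eigenvalues
    ## smallest k whose cumulative share of the retained eigenvalues reaches 80%
    min(which(cumsum(l[1:r]) / sum(l[1:r]) >= 0.8))
}
select_k(c(5, 2, 1, 0.5, 0.001))         # returns 2 for this illustrative input
@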
Both \code{Parafac()} and \code{Tucker3()} functions return a PCA object \code{pcaobj} which can be used to visualize the scree plot as follows:
<<robust-pca, fig=TRUE, include=FALSE>>=
data("elind")
(o <- Parafac(elind, robust = TRUE, ncomp.rpca = 11))
rrcov::screeplot(o$pcaobj, main = "Screeplot: elind data")
o1 <- Parafac(elind, robust = TRUE)
cat("\n Selected number of components: ", o1$pcaobj$k, "\n")
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-robust-pca}\\
\caption{Selecting the number of components for robust PCA: The screeplot.}
\label{fig-robust-pca}
\end{figure}
The scree plot suggests 4 components, which is also the number of components that will be selected by the default method described above. For more details on robust PCA in \proglang{R} see \citet{todorov-oof}.\\\\
Another parameter related to robust estimation is \code{robiter}, which specifies the maximal number of ALS iterations allowed for achieving a robust fit. By default \code{robiter=100}, which in most cases is sufficient.\\\\
The last robustness parameter is \code{crit}, which specifies the quantile for identifying outliers, for example in the outlier map plot shown in Fig.~\ref{fig-girls-ddplot}.

\subsection{Preprocessing}
\label{sec:preprocessing}
In three-way analysis, in the same way as in multivariate analysis, it is important to prepare the data before starting the actual analysis. Often this preprocessing is a complex procedure, specific for the application domain \citep[see for example][]{Engelen:2007}. The most general preprocessing functions are removing the possible offset (centering) and bringing the variables to the same scale (scaling or normalizing). While in multivariate analysis these procedures are straightforward, in three-way analysis we have more options (modes) of doing this. The most standard preprocessing procedure is to center across the first (A) mode and to normalize within the second (B) mode. This will (in most cases) eliminate differences in levels and scales which are not natural to the data. It is important to note that the multivariate methods, including PARAFAC and Tucker3, require ratio-scale data. Ratio-scale data are represented by proportional values and the lack of the measured property is represented by zero. Usually the data at hand are interval scaled and the centering procedure will turn them into ratio-scaled data. If data are not approximately ratio-scaled, centering is mandatory. On the other hand, scaling is related to the estimation procedure and is intended to make the data compatible with the least squares estimation.\\\\
Centering across a mode (by default mode A is selected) is performed by first matricizing the array into the selected mode and then subtracting a column measure of location from the columns of the matrix (same as in two-way analysis). If the selected location is the arithmetic mean the centering for mode A is done as follows:
\[z_{ijk}=x_{ijk}-\bar x_{.jk}=x_{ijk} - \frac{1}{I}\sum \limits_{i=1}^I x_{ijk} \]
Since only a single mode is affected, this is usually called single centering. In a similar way centering is applied to any other mode. In the package \pkg{rrcov3way} centering is done by the function \code{do3Scale()}, setting the parameter \code{center=TRUE}. We try this on the OECD data set (we could skip \code{center.mode="A"} since this is the default). Setting \code{center=TRUE} will by default subtract the arithmetic mean, but it is possible to specify any suitable function, for example \code{center=median} will subtract the median.
<<>>=
elind.cA <- do3Scale(elind, center = TRUE, center.mode = "A")
round(colMeans(elind.cA[, , 1]), 10)
@
If it is necessary to center across several modes, this is done sequentially, providing the result of one operation to the next. The first centering (say, across mode A) will not be destroyed.
<<>>=
elind.cAB <- do3Scale(elind.cA, center = TRUE, center.mode = "B")
round(colMeans(elind.cAB[, , 1]), 10)
@
Differently from centering, it is not appropriate to scale the unfolded array column-wise, but rather whole submatrices of the array should be scaled. Mathematically scaling within mode B can be described as follows:
\[ z_{ijk}=\frac{x_{ijk}}{\sqrt{\left( \sum \limits_{i=1}^I \sum \limits_{k=1}^K x_{ijk}^2 \right)} } \]
The scaling is also done by the function \code{do3Scale()}, setting \code{scale=TRUE}. The default mode is B. If it is required to scale not by the sum of squares but by the standard deviation, as usually done in profile analysis \citep[see][]{kroonenberg:2009}, we simply set the scale parameter to the corresponding function: \code{scale=sd}.
<<>>=
elind.cAsB <- do3Scale(elind, center = TRUE, scale = TRUE)
@
If robust scaling is required, we could use some robust alternative of the scale function, like $MAD$ or $Q_n$.
<<>>=
elind.cAsB <- do3Scale(elind, center = TRUE, scale = mad)
@
Let us consider the following very simple simulated example \citep{bro:1998-2008}, where, with the matrices $A=B=C= (1, 2, 3, 4 )^\top$ (shifted by a constant offset in the code below), a PARAFAC model $X_A=A (C \otimes B)^\top$ is constructed. The estimated model with one component has a 100\% fit and it obtains the same fit if the data are centered across the first mode (with or without additional scaling). If we center by subtracting the grand average, the fit drops dramatically.
<<>>=
A <- B <- C <- matrix(c(1, 2, 3, 4)) + 10
X0 <- A %*% t(krp(C, B))
X <- toArray(X0, 4, 4, 4)
Parafac(X, ncomp = 1)$fp
Parafac(do3Scale(X, center = TRUE), ncomp = 1)$fp
Parafac(X - mean(X), ncomp = 1)$fp
Parafac(do3Scale(X, center = TRUE, scale = TRUE), ncomp = 1)$fp
@
Instead of doing the centering and scaling in advance, it is possible to instruct the \code{Parafac()} and \code{Tucker3()} functions to do this by setting the parameters \code{center=TRUE} and/or \code{scale=TRUE}.

\subsection{Postprocessing}
\label{sec:postprocessing}
Once the parameters are estimated and the model is built, different procedures can be applied to the solution in order to facilitate the interpretation. These can be different transformations and rotations, rescaling of components and/or core array, reflecting the sign or reordering the components.
\paragraph{Renormalization.} Once the solution is obtained, it could be necessary to scale the components in order to make them comparable across modes or to facilitate plotting. \cite{kroonenberg:2008} considers three ways of adjusting the components for subsequent analysis and visualization and the most basic of them is \emph{renormalization}, i.e., bringing the components to unit length such that $\sum_i a_{ip}^2 = 1$ for each component $p=1, \ldots, P$. The components returned by the Tucker3 algorithm implemented in the function \code{Tucker3()} are already normalized, but it is possible that one wants the core array to be normalized. The components returned by the PARAFAC algorithm are by default not normalized. It is important to note that it is necessary to compensate the scaling of the components by the inverse scaling of the core array or of the other components. Renormalization is done by the S3 method \code{do3Scale()} for Tucker3 or PARAFAC respectively.
With the following script the core array of the Tucker3 model will be normalized to unit row sums of squares and the compensation will be absorbed by the A mode (default). With \code{renorm.mode="B"} we can choose another mode to absorb the compensation.
<<>>=
t3 <- Tucker3(elind, 3, 2, 2)
t3.norm <- do3Scale(t3, renorm.mode = "A")
rowSums(t3.norm$GA ^ 2)
@
Similarly, in PARAFAC with for example \code{renorm.mode="A"} (default) we choose to renormalize the components of the modes B and C to unit length while the components of mode A will absorb the compensation.
<<>>=
cp <- Parafac(elind, ncomp = 3)
cp.norm <- do3Scale(cp, renorm.mode = "A")
colSums(cp.norm$B ^ 2)
colSums(cp.norm$C ^ 2)
@
\paragraph{Rotation.} As already mentioned in the introduction, the decomposition provided by the Tucker3 model is not unique. Any arbitrary nonsingular rotation of all components simultaneously will retain the model representation and thus will produce a new solution. \citet{Tucker:1966} showed that postmultiplying $\vv{A}, \vv{B}$, and $\vv{C}$ by nonsingular matrices, say $\vv{L}, \vv{M}$, and $\vv{N}$, can always be compensated for by applying the inverse of these matrices to the core array. It can be verified that
\begin{equation}
\vv{X}_A = \vv{A}\vv{G}_A(\vv{C} \otimes \vv{B})^\top + \vv{E}_A= \tilde{\vv{A}}\tilde{\vv{G}}_A(\tilde{\vv{C}} \otimes \tilde{\vv{B}})^\top + \vv{E}_A
\end{equation}
with $\tilde{\vv{A}}=\vv{AL}^{-1}$, $\tilde{\vv{B}}=\vv{BM}^{-1}$, $\tilde{\vv{C}}=\vv{CN}^{-1}$ and $\tilde{\vv{G}}_A=\vv{L}\vv{G}_A (\vv{N} \otimes \vv{M})^\top$. This rotational freedom of the Tucker3 model can be exploited to obtain a simplified, easier to interpret form of the core array. Since such rotation can distort the simplicity of the component matrices, a trade-off can be searched for, as proposed by \citet{kiers1998orthomax}. This is a general procedure for joint rotation of the core array and the component matrices, allowing any combination of relative weights to be attached to the simplicity of the core array versus the simplicity of the component matrices.\\\\
In \pkg{rrcov3way} this method is implemented in the function \code{do3Rotate()}, which transforms a Tucker3 solution into a simpler one using the relative weights provided as parameters. By default the rotation is done on all modes: \code{rotate=c("A", "B", "C")}. To illustrate the use of this function, the OECD data \code{elind} will be used. In the following example the Tucker3 solution with $3 \times 2 \times 2$ components will be simplified by rotating all three component matrices using equal relative weights $w_A=w_B=w_C=3$. The function will update the solution \code{t3} with the rotated core \code{t3\$GA} and the component matrices \code{A}, \code{B} and \code{C}. The varimax values will be returned (in \code{vvalue}) as well as the rotation matrices \code{S}, \code{T} and \code{U}.
<<>>=
data("elind")
t3 <- Tucker3(elind, 3, 2, 2)
xout <- do3Rotate(t3, c(3, 3, 3), rotate = c("A", "B", "C"))
xout$vvalue
@
<<>>=
xout$vvalue
@
<<echo=FALSE>>=
w <- c(NA, 0, 0.5, 1, 2.5, 3, 3.5, 4, 5, 10, Inf)
res <- matrix(NA, nrow = length(w), ncol = 7)
for(i in 1:length(w)) {
    res[i, 1] <- res[i, 2] <- res[i, 3] <- w[i]
    if(is.na(w[i]))
        x <- do3Rotate(t3, rotate = c())
    else if(is.finite(w[i]))
        x <- do3Rotate(t3, rep(w[i], 3))
    else
        x <- do3Rotate(t3, rep(1e18, 3))
    res[i, 4:7] <- round(x$vvalue, 3)
}
colnames(res) <- c("w(A)", "w(B)", "w(C)", "Core", "A", "B", "C")
rownames(res) <- 1:nrow(res)
@
With relative weights all equal to 0, $w_A=w_B=w_C=0$, maximal simplicity of the core array is achieved.
On the other hand, all weights equal to infinity yield maximal simplicity of the component matrices. By choosing an appropriate combination of the weights, reasonable simplicity of both the core array and the component matrices can be obtained. The following Table~\ref{tab:varimax} shows the varimax values for the core array and the component matrices for different joint varimax rotations (given by different relative weights) applied to the Tucker3 solution with $3 \times 2 \times 2$ components of the OECD data.
<<results=tex, echo=FALSE>>=
library("xtable")
print(xtable(res, label = "tab:varimax",
    caption = "Varimax values for the core array and the component matrices for different joint varimax rotations to the Tucker3 solution with 3x2x2 components of the OECD data"),
    table.placement = "H")
@
From this table we can see that the solution with all relative weights equal to 3 is the best compromise between the simplicity of the core array and that of the component matrices. The next four tables show the unrotated solution and the solution rotated with relative weights (3, 3, 3), and allow one to observe to what extent the interpretability of the model was improved.
<<>>=
x2 <- do3Rotate(t3, rep(3, 3), rotate = c("A", "B", "C"))
@
<<results=tex, echo=FALSE>>=
print(xtable(cbind(t3$A, x2$x$A), label = "tab:varimax-A",
    caption = "Component matrix A from the unrotated and rotated solution with relative weights (3, 3, 3)"),
    table.placement = "H")
@
<<results=tex, echo=FALSE>>=
print(xtable(cbind(t3$B, x2$x$B), label = "tab:varimax-B",
    caption = "Component matrix B from the unrotated and rotated solution with relative weights (3, 3, 3)"),
    table.placement = "H")
@
<<results=tex, echo=FALSE>>=
print(xtable(cbind(t3$C, x2$x$C), label = "tab:varimax-C",
    caption = "Component matrix C from the unrotated and rotated solution with relative weights (3, 3, 3)"),
    table.placement = "H")
@
<<results=tex, echo=FALSE>>=
print(xtable(rbind(t3$GA, rep(NA, 4), x2$x$GA), label = "tab:varimax-core",
    caption = "Core array from the unrotated and rotated solution with relative weights (3, 3, 3)"),
    table.placement = "H")
@
\paragraph{Reflection (or sign reversal).} The sign reversal does not change the model (PARAFAC or Tucker3) as long as it is done consistently \citep[see][p.200]{smilde:2004}. The flip of the sign (multiplying by $-1$) in PARAFAC has to be done simultaneously on two modes (A and B or A and C or B and C), while in Tucker3 this is more complicated because of the presence of the core array. The sign reversal will cause a mirroring of the scatter plot (see Section~\ref{sec:visual}). In \pkg{rrcov3way} reflection is done by one of the methods \code{reflect()} or \code{do3Postprocess()}. In the following example the sign of the second component of mode A of a PARAFAC model will be flipped (and this change will be absorbed by mode B).
<<>>=
res <- Parafac(elind)
head(res$A)
head(res$B)
res1 <- do3Postprocess(res, reflectA = c(1, -1))
head(res1$A)
head(res1$B)
@
\paragraph{Reordering the components.} Similarly to the sign reversal, reordering of the components will not change the model if done consistently (for all modes simultaneously in PARAFAC and for the selected mode and the core array in Tucker3). In \pkg{rrcov3way} reordering is done by one of the methods \code{reorder()} or \code{do3Postprocess()}. The parameter \code{order} can be either a vector containing the new order of the components or a logical \code{TRUE}. In the latter case the components will be arranged in decreasing order of the component standardized weights (explained variability).
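A brief sketch of this reordering interface, based on the description above and not evaluated here (see the package documentation for the exact behavior), could look as follows:
<<eval=FALSE>>=
cp <- Parafac(elind, ncomp = 3)
cp1 <- reorder(cp, order = c(2, 1, 3))   # explicitly put the second component first
cp2 <- reorder(cp, order = TRUE)         # sort by decreasing standardized weights
@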
\subsection{Visual tools for three-way analysis}
\label{sec:visual}
The results from a three-way analysis can be presented in several different ways \citep[see][]{kroonenberg:2009}, the first one being tables of the coefficients or loadings for each mode, either rotated or not. While it is important to inspect the numerical output of the methods for analysis of three-way data (the component matrices and the core array) in order to properly interpret the results, it can be of great help to use different visual representations of these outcomes. The most typical plots are:
\bi
\item Distance-distance plot for presenting robust models and identifying outliers,
\item Pair-wise graphs of the components for each mode separately,
\item All-components plots, which show all components of a single mode using the levels of the mode as the X-axis,
\item Per-component plot, showing a single component on all modes simultaneously in the same plot.
\ei
As we will see further, although all these plots can be applied to both PARAFAC and Tucker3 models, some of them (e.g., the per-component plots) are more suitable for the results of PARAFAC, while others are more suitable for Tucker3 (e.g., the joint biplots).\\\\
One of the features of the package \pkg{rrcov3way} which is not to be found in other \proglang{R} packages for three-way analysis is the availability of a variety of plotting procedures. These procedures are flexible enough to give the user the possibility to design the graphs according to the needs and the data at hand, but at the same time provide suitable default parameters which facilitate their use. These procedures are mainly based on \citet{kiers2000some} and \citet{kroonenberg:2008} and the reader is referred to these publications for more details.\\\\
To illustrate the plotting procedures in the package the data set \code{girls} \citep{kroonenberg:2008} will be used. It is available in the package and we start by loading it.
<>=
data("girls")
dim(girls)
head(girls[, , 1])
sum(girls ^ 2)
@
The data are from a French auxological study carried out in the years 1953--1975 \citep{sempe:1987} with the goal of gaining insight into the physical growth patterns of children from ages four to fifteen. Thirty girls were selected and they were measured yearly between the ages 4 and 15 on the following eight variables:
\bi
\item Weight,
\item Length,
\item Crown-rump length,
\item Head circumference,
\item Chest circumference,
\item Arm,
\item Calf,
\item Pelvis.
\ei
The data are stored as a three-way array of size 30 (girls) $\times$ 8 (variables) $\times$ 12 (years). The data are preprocessed by centering across mode A (the 30 girls), and by normalizing each of the variables (i.e., across girls and years) such that for each variable the sum of squares is unity. We do the centering and normalization with the function \code{do3Scale()}, setting \code{only.data=FALSE} in order to also obtain the center that was removed, i.e., the profile of the ``average girl''. This average profile will be used in Fig.~\ref{fig-average-girl} to plot the average growth curves. To equalize the range of the variables, several variables were divided by 10 (in the legend of the plot these variables are marked by an asterisk).
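The same preprocessing can also be written out by hand with base \proglang{R}. The following minimal sketch assumes exactly the conventions just described (centering over the girls for each variable--year combination and scaling each variable to unit total sum of squares), so its result should agree with \code{do3Scale()} up to these conventions; the actual call to \code{do3Scale()} follows in the next chunk.
<>=
## by-hand sketch of the preprocessing described above
Xc <- sweep(girls, c(2, 3), apply(girls, c(2, 3), mean))  # center across mode A (girls)
ss <- sqrt(apply(Xc^2, 2, sum))                           # total sum of squares per variable
Xm <- sweep(Xc, 2, ss, "/")                               # unit sum of squares per variable
round(apply(Xm^2, 2, sum), 6)
@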
<>=
X <- do3Scale(girls, center = TRUE, scale = TRUE, only.data = FALSE)
center <- X$center
X <- X$x
average.girl <- as.data.frame(matrix(center, ncol = 8, byrow = TRUE))
dimnames(average.girl) <- list(dimnames(X)[[3]], dimnames(X)[[2]])
average.girl$weight <- average.girl$weight / 10
average.girl$length <- average.girl$length / 10
average.girl$crrump <- average.girl$crrump / 10
sum(X ^ 2)
@
<>=
p <- ncol(average.girl)
plot(rownames(average.girl), average.girl[, 1], ylim = c(50, 1200),
    type = "n", xlab = "Age", ylab = "")
for(i in 1:p) {
    lines(rownames(average.girl), average.girl[, i], lty = i, col = i)
    points(rownames(average.girl), average.girl[, i], pch = i, col = i)
}
## mark the three rescaled variables (weight, length, crown-rump length)
## with an asterisk in the legend
legend <- colnames(average.girl)
legend[1] <- paste0(legend[1], "*")
legend[2] <- paste0(legend[2], "*")
legend[3] <- paste0(legend[3], "*")
legend("topleft", legend = legend, col = 1:p, lty = 1:p, pch = 1:p)
@
\begin{figure}[H]
\centering
\includegraphics{rrcov3way-girls-average-plot}\\
\caption{Average growth curves for the girls data set. The variables with an asterisk were divided by 10 to make the ranges similar.}
\label{fig-average-girl}
\end{figure}
Tucker3 analysis with $P=3, Q=3$ and $R=2$ components and PARAFAC analysis with $F=3$ components will be applied to the preprocessed data, as was done in \citet{kroonenberg:2008}.
<>=
(t3 <- Tucker3(X, 3, 3, 2))
(cp <- Parafac(X, ncomp = 3))
@
Both models obtain a fit of 77.4\%. We also run the robust procedures by setting the argument \code{robust=TRUE}. For the underlying robust principal components for both Tucker3 and PARAFAC, we select the number of components to be \code{ncomp.rpca=3}, as suggested by the unfolded data array. In this case the fit becomes 81\% for the robust Tucker3 and 82\% for the robust PARAFAC model.
<>=
(t3r <- Tucker3(X, 3, 3, 2, robust = TRUE, ncomp.rpca = 3))
(cpr <- Parafac(X, ncomp = 3, robust = TRUE, ncomp.rpca = 3))
@
\paragraph{Distance-Distance plot.} In the context of PCA \cite{hubert-ROBPCA:2005} defined a \emph{diagnostic plot} or \emph{outlier map} which helps to distinguish between regular observations and different types of outliers. As in robust regression \cite[see][]{Rousseeuw-van-Zomeren}, each observation is characterized by two distances: the residual distance (RD) defined in Equation~\ref{eq:rdist} and the score distance (SD). The score distance of an observation $\vv{X}_i$ is the robust version of the Mahalanobis distance of the score $\vv{a}_i$ to the center of the score matrix $\vv{A}$:
\begin{equation}
\label{sdist}
\mbox{SD}_i=\sqrt{(\vv{a}_i - \vv{m})^\top \vv{S}^{-1}(\vv{a}_i - \vv{m})} ,
\end{equation}
where $\vv{m}$ and $\vv{S}$ are taken as the robust minimum covariance determinant (MCD) estimates of the center and covariance matrix of the scores, respectively. The diagnostic plot is constructed by plotting the score distances on the horizontal axis, the residual distances on the vertical axis and drawing two cutoff lines which help to classify the observations. The cutoff value on the horizontal axis (for the score distances) is taken as the square root of the 97.5\% quantile of the $\chi^2$ distribution with $F$ degrees of freedom, i.e., $c_{horizontal}=\sqrt{\chi^2_{0.975, F}}$, assuming that the scores are approximately normally distributed, which gives $\chi^2$-distributed squared score distances.
For the cutoff value on the vertical axis (for the residual distances) the Wilson-Hilferty transformation for a $\chi^2$ distribution is used (which assumes that the $\text{RD}_i$ to the power of 2/3 are approximately normally distributed). The parameters $\mu$ and $\sigma$ of this normal distribution can be robustly estimated by the univariate MCD of the values $\text{RD}_i^{2/3}$ as $m$ and $s$, and the critical value can be taken as $c_{vertical}=(m + sz_{0.975})^{3/2}$, where $z_{0.975}$ is the 97.5\% quantile of the standard normal distribution. Using these cutoff values the observations can be classified into regular observations (small RD$_i$ and small SD$_i$), residual outliers (large RD$_i$ and small SD$_i$), bad leverage points (large RD$_i$ and large SD$_i$) and good leverage points (small RD$_i$ and large SD$_i$).\\\\
Similarly, a classical diagnostic plot can be generated by computing the distances as well as their cutoff values using the classical estimates instead of the robust ones. This allows us to visually compare the classical and robust PARAFAC and Tucker3 methods. Both the robust and the classical plot can be used for visualizing compositional data.\\\\
The distance-distance plot is the default plot for both Tucker3 and PARAFAC objects. Therefore we can simply call the plot function, passing the corresponding object as a parameter. The most extreme outliers are identified by their labels. It is possible to specify the number of outliers to be identified as well as the labels which are to be used. Fig.~\ref{fig-girls-ddplot} shows the classical (left) and the robust (right) distance-distance plots for the PARAFAC model with \code{ncomp=3} of the girls data. In the classical plot no bad leverage points are identified; only the girl with label 27 is flagged as a residual outlier, and girl 3 is possibly a border case. Girl 30 is shown as a good leverage point. The robust plot confirms the outlying position of girls 27 and 3, shows girl 26 as a border case, and in addition identifies girl 30 as a bad leverage point. Two other girls are flagged as good leverage points (14 and 18).
<>=
oldpar <- par(mfrow = c(1, 2))
plot(cp, main = "Classical outlier map")
plot(cpr, main = "Robust outlier map")
par(oldpar)
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-girls-ddplot}
\caption{Classical and robust distance-distance plot for a three component PARAFAC model of the girls data set.}
\label{fig-girls-ddplot}
\end{figure}
\paragraph{Paired components plots} (or single mode plots) are plots in which pairs of components in a single mode are plotted against each other in two-dimensional scatter plots. \citet{kroonenberg:2008} differentiates between normalized and principal coordinates, the former being the coordinates returned by the estimation procedure and the latter being scaled by the square root of the explained variance, and argues that the principal coordinates are to be preferred since they make the differences in explained variances less dramatic. \citet{kiers2000some} proposes a generic procedure, equivalent to the principal coordinates, which we follow here. To generate this plot we use the argument \code{which="comp"} in the generic plot function. The default mode is A (it can be changed by setting the argument \code{mode}, e.g., \code{mode="B"} or \code{mode="C"}) and the default components to plot are the first two, but any combination of components can be chosen by the argument \code{choices}, e.g., \code{choices=c(2,3)}.
In the case of \code{mode="B"} the ``variables'' will be plotted as arrows, with their initial point at the origin. If this is not desired, the arrows can be suppressed by \code{arrows=FALSE}. The paired components plots for mode A are shown in Fig.~\ref{fig-girls-paired} and those for mode B (with arrows) in Fig.~\ref{fig-girls-paired-B}.
<>=
oldpar <- par(mfrow = c(1, 2))
plot(t3, which = "comp", choices = 1L:2L)
plot(t3, which = "comp", choices = c(1L, 3L))
par(oldpar)
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-girls-paired}
\caption{Side-by-side paired components plots of mode A of the girls growth curves data.}
\label{fig-girls-paired}
\end{figure}
<>=
oldpar <- par(mfrow = c(1, 2))
plot(t3, which = "comp", choices = 1L:2L, mode = "B")
plot(t3, which = "comp", choices = c(1L, 3L), mode = "B")
par(oldpar)
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-girls-paired-B}
\caption{Side-by-side paired components plots of mode B of the girls growth curves data. The variables are presented as arrows.}
\label{fig-girls-paired-B}
\end{figure}
These displays are available for both Tucker3 and PARAFAC models, and work in exactly the same way; this follows from the fact that it is sufficient to replace the core array in Tucker3 by the identity to obtain PARAFAC. However, it is necessary to note that in the case of a PARAFAC model the components are orthonormalized in accordance with Kiers' procedure \citep{kiers2000some}.\\\\
If the data are compositional, the orthonormalization will not affect the results.\\\\
\paragraph{Per-component plot.} This plot presents a single component across all modes. In some cases one is not interested in the relation between the different subjects, which can be presented in one mode, but rather in the relation of the components across the different modes. The per-component plot is constructed by plotting the coefficients for all modes along a single line. Although this plot could also be used for Tucker3, it is most suitable for PARAFAC. To facilitate the presentation of the different modes on a single plot the components should be either orthonormalized or scaled to unit mean square. The per-component plot for the girls growth curves data is shown in Fig.~\ref{fig-girls-percomp}.
<>=
cp <- Parafac(X, ncomp = 2)
cp <- do3Postprocess(cp, reflectA = -1, reflectC = -1)
plot(cp, which = "percomp")
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-girls-percomp}
\caption{Per-component plot for the girls growth curves data from a PARAFAC analysis with 2 components.}
\label{fig-girls-percomp}
\end{figure}
It is important to note that the variable means were removed for each time point, which means that a girl with zero components on the girls mode is the average girl and has average growth curves. All variables and all years have positive coefficients. The differences in the variable `Length' are more dramatic than the differences in `Arm circumference'. The time points present something like a trajectory: up to the 13th year the differences grow and after that they fall again.
\paragraph{All components plots.} These plots are also single mode plots and are useful when one of the modes has a natural ordering, such as points in time or wavelengths of emission/excitation spectra (the latter is almost a standard plot in chemometrics). The two components from the age mode (mode C) of the Tucker3 model of the girls growth curves data are presented in Fig.~\ref{fig-girls-allcomp}.
<>=
plot(t3, which = "allcomp")
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-girls-allcomp}
\caption{All components plot for the girls growth curves data with the age on the horizontal axis (mode C) and the component scores on the vertical axis.}
\label{fig-girls-allcomp}
\end{figure}
The second example is from chemometrics and shows an \emph{all components plot} of a PARAFAC model with three components of a fluorescence spectroscopy experiment (Fig.~\ref{fig-amino-allcomp}). The data set is \code{amino}, containing 5 samples (mode A) of emission--excitation data. The plot presents the excitation mode (C) over 61 wavelengths from 240 to 300 nm. More about this data set can be found in Section~\ref{sec:ex-amino}.
<>=
data("amino")
amino.cp <- Parafac(amino, ncomp = 3, const = "nonneg")
plot(amino.cp, which = "allcomp", xlab = "Wavelength", ylab = "Intensity",
    mode = "C", points = FALSE, legend.position = NULL)
@
\begin{figure}
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-amino-allcomp}
\caption{All components plot for the Anderson data---plot of PARAFAC components of the excitation mode (mode C).}
\label{fig-amino-allcomp}
\end{figure}
\paragraph{Joint biplots.} In (two-way) multivariate analysis the biplot, introduced by \cite{gabriel} in the context of principal component analysis, represents both the observations and the variables in the plane of (the first) two principal components, allowing the visualization of the magnitude and sign of each variable's contribution to these principal components. Each observation (row of scores) is represented as a point in the biplot and each variable is represented as an arrow. The arrows graphically indicate the proportion of the original variance explained by the (first) two principal components and their direction indicates the relative loadings on these components. The biplot is constructed from the singular value decomposition (SVD) of a data matrix $\vv{Y}=\vv{U}\vv{\Lambda}\vv{V}^\top$, where $\vv{U}$ and $\vv{V}$ are the orthonormal matrices of the left and right singular vectors and $\vv{\Lambda}$ is a diagonal matrix of the ordered singular values (i.e., the square roots of the eigenvalues). From this decomposition two matrices $\vv{A}$ and $\vv{B}$ are obtained, which contain coordinates for plotting the subjects and the variables on the same display.\\\\
In three-way analysis we have three modes and it is necessary to choose two of them as \emph{display} modes and use the third one as a \emph{reference} mode. A joint biplot can then be constructed for each slice of the core array, i.e., for each component of the reference mode. The \emph{joint biplot axes} are given by
\begin{equation}
\label{eq:JB-axes-1}
\tilde{\vv{A}}_r = (I/J)^{0.25}\vv{A}\vv{U}_r\vv{\Lambda}_r^{\alpha}
\end{equation}
and
\begin{equation}
\label{eq:JB-axes-2}
\tilde{\vv{B}}_r = (J/I)^{0.25}\vv{B}\vv{V}_r\vv{\Lambda}_r^{1-\alpha},
\end{equation}
where $\vv{U}_r\vv{\Lambda}_r\vv{V}_r^\top$ is the singular value decomposition of the $r$-th slice $\vv{G}_r$ of the core array $\vv{G}$. For all the details of the construction of the joint biplot of the Tucker3 model see \cite{kroonenberg:2008}, p.~273 and the references therein.\\\\
Let us now consider an example joint biplot using the Tucker3 model of the girls' growth curves data.
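For concreteness, the joint biplot axes in Equations~\ref{eq:JB-axes-1} and~\ref{eq:JB-axes-2} can be computed by hand from the fitted object, here taking mode C as the reference mode (as in the plot call below). This is only a sketch: it assumes that the mode~A matricized core \code{t3$GA} has its columns grouped by the components of mode C and uses the symmetric choice $\alpha=0.5$, so it should agree with the \code{which="jbplot"} display only up to such conventions and possible reflections.
<>=
nI <- dim(X)[1]
nJ <- dim(X)[2]
G1 <- t3$GA[, 1:3]          # slice of the core for the first component of mode C
sv <- svd(G1)
alpha <- 0.5
At <- (nI/nJ)^0.25 * t3$A %*% sv$u %*% diag(sv$d^alpha)        # coordinates for the girls
Bt <- (nJ/nI)^0.25 * t3$B %*% sv$v %*% diag(sv$d^(1 - alpha))  # coordinates for the variables
@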
As a reference mode we choose the time mode (see the components of the time mode depicted in the all-components plot in Fig.~\ref{fig-girls-allcomp}). In the first component we can observe a steady increase up to the thirteenth year and then a decline in the last two years. The straightforward interpretation of the first component would be the overall variability, meaning that the differentiation between the girls increases up to the thirteenth year and decreases in the last two years. The joint biplot of the girls and variables modes, taking the age mode as the reference, is constructed by the plot command using the argument \code{which="jbplot"}. By default the third mode is selected as the reference mode and, again by default, the slicing of the core array is done by the first component. The resulting plot is shown in Fig.~\ref{fig-girls-jointbiplot}. The first component of the age mode is presented. The first striking feature in the plot is that the variables fall into two main groups: length variables and circumference (soft tissue) variables.
<>=
t3x <- do3Postprocess(t3, reflectC = -1)
plot(t3x, which = "jbplot")
@
\begin{figure}
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-girls-jointbiplot}
\caption{Joint biplot for the girls growth curves data from a Tucker3 analysis with $3\times3\times2$ components for the first age component.}
\label{fig-girls-jointbiplot}
\end{figure}
\paragraph{Trajectory plot.} In some cases one of the modes has the meaning of a sequence; this can be time (like the age in the girls data example) or some kind of repeated measures. Then we would be interested in what happens to the combination of the other two modes over time, i.e., if we have measurements of objects on a set of variables, we would be interested in how the objects move over time in the subspace spanned by the variables. The result will be \emph{trajectories} followed by the objects in the space of the variables. Since the full-dimensional space cannot be used for displaying the trajectories, we can make use of the projections on the subspaces produced by the applied Tucker3 model. How the trajectories can be computed is described in detail in \cite{kiers2000some}.\\\\
Let us now illustrate the procedure on the girls data set using the visualization functions provided by \pkg{rrcov3way}. As usual, we call the function \code{plot()} on the object returned by the \code{Tucker3()} function. To display the trajectory plot we specify \code{which="tjplot"}. If the centering of the variables was done by subject-occasion combination, the origin of the plot represents the average profile of the subjects (with the average values on each variable) for all occasions \citep[see][]{kroonenberg:2008}. Then the patterns presented by the trajectories will describe the deviations from this average profile. We see that most of the girls start nearer to the origin and diverge from it with each year. However, later, in the last several observed years, the trajectories turn back. The arrows shown on the plot represent the variables. Since all variables point to the right, girls on the right side of the plot increase their values with respect to the origin (like girls 21 and 19), while girls on the left side of the plot lag behind (like girls 9 and 11).
<>=
t3x <- do3Postprocess(t3, reflectB = -1)
plot(t3x, which = "tjplot", arrows = TRUE)
@
\begin{figure}[H]
\centering
\includegraphics[width=0.6\textwidth]{rrcov3way-girls-tjplot}
\caption{Trajectory plot for the girls growth curves data from a Tucker3 analysis with $3\times3\times2$ components. The trajectories represent the development of the subjects (girls) over time with respect to the variables.}
\label{fig-girls-tjplot}
\end{figure}
We have only 30 objects (girls) at 12 time points and can display them all on the plot. However, in the case of more objects and/or more time points, the trajectories can be displayed selectively using the parameter \code{choices}. To illustrate this option, Fig.~\ref{fig-girls-tjplot2} displays the trajectories of girls 9, 11, 25 and 2. The display of the variables is also suppressed by setting \code{arrows=FALSE}.
<>=
plot(t3x, which = "tjplot", choices = c(9, 11, 25, 2), arrows = FALSE)
@
\begin{figure}[H]
\centering
\includegraphics[width=0.6\textwidth]{rrcov3way-girls-tjplot2}
\caption{Trajectory plot for several objects in the girls growth curves data from a Tucker3 analysis with $3\times3\times2$ components.}
\label{fig-girls-tjplot2}
\end{figure}
\section{Examples with data sets}
\label{sec:examples}
\subsection{Kojima data set: Judging Parents' Behaviour}
\label{sec:ex-kojima}
These data are drawn from a study \citep{kojima:1975} of the perception of parental behaviour by parents and their children and consist of two separate data sets: boys and girls. The data present ratings expressing the judgements of parents with regard to their own behaviour towards their children and judgements of the children with respect to their parents. This results in four conditions, which in the case of the girls data set are
\bi
\item Father-Own behavior (F-F),
\item Mother-Own behavior (M-M),
\item Daughter-Father (D-F) and
\item Daughter-Mother (D-M).
\ei
The judgements were made by 150 middle-class Japanese eighth-grade boys (153 girls) on 18 subscales and the corresponding three-way data sets are 150 (boys) $\times$ 18 (scales) $\times$ 4 (conditions) and 153 (girls) $\times$ 18 (scales) $\times$ 4 (conditions), respectively.\\\\
The boys data were analyzed in \citet{kroonenberg:2008} and the girls data were analyzed in \citet{kroonenberg:2009}. Here we will build a PARAFAC model for the Kojima girls data set.
<>=
data("Kojima")
dim(Kojima.girls)
head(dimnames(Kojima.girls)[[1]])
dimnames(Kojima.girls)[[2]]
dimnames(Kojima.girls)[[3]]
@
A standard way of preprocessing three-way profile data is to center across individuals (per column in mode A) and to normalize within variables (per row in mode B).
<>=
X <- do3Scale(Kojima.girls, center = TRUE, scale = sd)
@
Now we perform a classical PARAFAC analysis on the centered and normalized girls data using the function \code{Parafac()}, selecting 3 components as was done in \cite{kroonenberg:2009}.
<>=
cp <- Parafac(X, ncomp = 3, const = c("orth", "none", "none"), conv = 1e-10)
cp
@
The fitted sum of squares explained by the model is 39\%. According to \cite{kroonenberg:2008} it is possible to apply orthogonality constraints on the first mode (individuals) to improve the solution; however, this improvement is negligible. Furthermore, it is necessary that at least one of the component matrices has orthogonal columns in order to be able to compute the explained variance per component. The orthogonality constraint on mode A is set by the parameter \code{const=c("orth", "none", "none")}, which sets the constraints on modes B and C to \code{"none"}.
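Since mode A is estimated under the \code{"orth"} constraint, its component matrix is expected to have orthonormal columns. A quick sanity check (assuming the constraint indeed yields columnwise orthonormal components):
<>=
## cross-product of the mode A component matrix -- should be close to the identity
round(crossprod(cp$A), 4)
@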
As usual in the analysis of profile data, the interpretation starts with the variables (i.e., mode B, the scales in our example) and we want to display these as principal coordinates \citep[see][9.4, page 219]{kroonenberg:2008}. We reflect the solution (change the sign) and rearrange the components as in Table 1 in \cite{kroonenberg:2009}. To renormalize the solution we use the generic function \code{do3Scale()}; for a PARAFAC solution this function will normalize two of the modes to unit sum of squares and will compensate in the third one. Since we first want to look at the variables and conditions, we choose to compensate the normalization in mode A. Since the purpose of this example is just to illustrate the application of the package \pkg{rrcov3way} for profile data analysis, we will not go further into detail; the interested reader is referred to \cite{kroonenberg:2009}.
<>=
cp.norm <- do3Scale(cp, mode = "A")
cp.norm <- do3Postprocess(cp.norm, reflectB = c(-1, -1, -1))
cp.norm <- do3Postprocess(cp.norm, reorder = c(2, 3, 1))
b.pc <- coordinates(cp.norm, mode = "B", type = "principal")
@
Next we display the results: mode B, the variables, in principal coordinates, the standardized weights (the explained variance per component) and the third mode, the conditions, in normalized coordinates.
<>=
round(b.pc, 2)
round(weights(cp.norm), 2)
round(coordinates(cp.norm, mode = "C"), 2)
@
The most revealing plot for this analysis is the per-component plot representing the first component across all modes, which is shown in Fig.~\ref{fig-Kojima-percomp}.
\begin{figure}[H]
\centering
<>=
plot(cp.norm, which = "percomp")
@
\caption{Per-component plot for the Kojima girls profile analysis from a PARAFAC analysis with 3 components.}
\label{fig-Kojima-percomp}
\end{figure}
Using the Tucker congruence coefficient (see~\ref{sec:appendix}), which is the accepted way of comparing components, we can investigate the relation between PARAFAC models with different numbers of components \citep[see][Table 2]{kroonenberg:2009}.
<>=
cp1 <- Parafac(X, ncomp = 1, const = c("orth", "none", "none"),
    maxit = 10000, conv = 1e-10)
cp2 <- Parafac(X, ncomp = 2, const = c("orth", "none", "none"),
    maxit = 10000, conv = 1e-10)
cp2 <- do3Postprocess(cp2, reorder = TRUE)
cp3 <- Parafac(X, ncomp = 3, const = c("orth", "none", "none"),
    maxit = 10000, conv = 1e-10)
cp3 <- do3Postprocess(cp3, reorder = TRUE)
round(congruence(cp1$A, cp1$A), 2)
round(congruence(cp2$A, cp1$A), 2)
round(congruence(cp3$A, cp1$A), 2)
round(congruence(cp1$A, cp2$A), 2)
round(congruence(cp2$A, cp2$A), 2)
round(congruence(cp3$A, cp2$A), 2)
round(congruence(cp1$A, cp3$A), 2)
round(congruence(cp2$A, cp3$A), 2)
round(congruence(cp3$A, cp3$A), 2)
@
Most of the congruence coefficients are far from one, which means that the orthogonal subject spaces of the Kojima girls data are nearly unrelated.
\subsection{Analyzing water quality data}
\label{sec:ex-arno}
The data set \code{Arno} contains data from the study of several physico-chemical processes, natural or attributable to anthropogenic phenomena, present along the Arno river (central-northern Apennines, Italy).
The impact of those variables was modeled by \cite{gallo-buccianti:2013} through a weighted principal component analysis with the aim of assessing the impact of water pollution on the environmental and ecological characteristics of the river across time and space. The data in the three-way array were obtained by sampling carried out between May 2002 and October 2003, at 23 locations along the river, at different distances from the springs. The main chemical composition of the water, measured by 11 compositional parts (HCO\textsubscript{3}, Cl, SO\textsubscript{4}, NH\textsubscript{4}, NO\textsubscript{2}, NO\textsubscript{3}, Na, K, Ca, Mg, SiO\textsubscript{2}), was recorded for 92 compositions (23 locations at four occasions). The 23 compositions, associated with different spatial coordinates indicating the distance from the spring, were arranged by rows, the 11 parts by columns and the 4 occasions by tubes, thus resulting in a $23 \times 11 \times 4$ array. All the details on the data collection can be found in \cite{gallo-buccianti:2013}. One of the front slices (the first occasion) is presented below in the original measurement units:
\footnotesize
<>=
options(width = 120)
@
<>=
data("Arno")
Arno[, , 1]
@
<>=
options(width = 60)
@
\normalsize
The original data are expressed in mg/L (equivalent to ppm if the data are multiplied by the density of the water in g/cm\textsuperscript{3}). The intrinsic character of this data set suggests the use of compositional data analysis.\\\\
To perform compositional Tucker3 analysis we call the function \code{Tucker3()} using the parameter \code{coda.transform="ilr"}. At the same time the preprocessing is invoked by setting \code{center=TRUE} and \code{center.mode="AB"}. Thus the data were first \emph{ilr}-transformed (losing one dimension in the second mode) and then centered across the first and second mode so that the average of each row and each column was zero. \cite{gallo-buccianti:2013} suggest the number of components for the three modes to be chosen as ($2 \times 2 \times 1$), which results in a fit of 60.02\%.
<>=
(Arnot3 <- Tucker3(Arno, P = 2, Q = 2, R = 1, center = TRUE,
    center.mode = "AB", coda.transform = "ilr"))
@
The different visual representations of these outcomes can be of great help for the proper interpretation of the results. First of all let us look at the pair-wise graphs of the components for the first two modes separately, shown in Fig.~\ref{fig-Arno-t3-comp-pair-wise}. It should be recalled that the analysis is carried out on \textit{ilr} coordinates while the different plots display the \textit{clr} coordinates, allowing better interpretability of the individuals (locations) and variables (compositional parts). In Fig.~\ref{fig-Arno-t3-comp-pair-wise}(b) the lengths of the arrows give information about the standard deviations of the corresponding centered log-ratios. It is more important, however, to look at the distances between the tips of the arrows, known as \emph{links}, which estimate the standard deviations of the ratios between chemical elements. The correlation between the parts (variables) is displayed in terms of the angles between the arrows.
<>=
plot(Arnot3, which = "comp", main = "(a) Paired component plot (mode A)")
@
<>=
plot(Arnot3, which = "comp", mode = "B", main = "(b) Paired component plot (mode B)")
@
\begin{figure}[H]
\centering
\includegraphics[width=0.45\textwidth]{rrcov3way-Arno-t3-comp-modeA}
\includegraphics[width=0.45\textwidth]{rrcov3way-Arno-t3-comp-modeB}
\caption{Pair-wise component plot for the Tucker3 decomposition of the Arno data. The left-hand panel (a) corresponds to mode A and the right-hand panel (b) to mode B.}
\label{fig-Arno-t3-comp-pair-wise}
\end{figure}
To investigate the relationship between the elements of different modes, the decomposition can be presented in a \emph{joint biplot}, which shows both compositions and parts in the same plot (assuming that the third mode, the time, is chosen as the reference mode); see Fig.~\ref{fig-Arno-t3-comp-joint}.
<>=
plot(Arnot3, which = "jbplot", main = "Joint biplot")
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-Arno-t3-comp-joint}
\caption{Joint component plot for the Tucker3 decomposition of the Arno data.}
\label{fig-Arno-t3-comp-joint}
\end{figure}
The most important variation is shown by Cl and Na, followed by NO\textsubscript{2} and NH\textsubscript{4}. For Cl the high variation is attributable to different phenomena such as evaporation, pollution or the contribution of sea water intrusion perturbing the samples near the mouth of the river, all contributions that are affected by seasonality. Na shows the same pattern. The high variability of NO\textsubscript{2} and NH\textsubscript{4} is attributable to anthropogenic sources changing in time and space and to the chemical reactions that transformed the reduced forms into the more stable species, NO\textsubscript{3}, in an oxidizing environment. Considering the links, which represent the ratios between chemical parts, we see that the ratio Na\textsuperscript{+}/Cl\textsuperscript{-}, displayed as a very small segment joining these chemical parts, is almost constant, indicating that the environmental conditions affecting the behavior of these variables are preserved. A similar conclusion holds for SO\textsubscript{4}\textsuperscript{2-} and K\textsuperscript{+} and for SiO\textsubscript{2} and Ca\textsuperscript{2+}. Projecting the composition ``La Casina'' on the ratio Mg\textsuperscript{2+}/NH\textsubscript{4}\textsuperscript{+} reveals for the same variables a possible perturbation due to different sources (an increase of Mg\textsuperscript{2+} or a decrease of NH\textsubscript{4}\textsuperscript{+}, caused by seasonality). Considering the compositions, we notice that ``LaCasina'', ``Rosano'', ``PisaP'' and ``Arno'' are located on the opposite side with respect to ``Camaioni'' and ``Montelupo'', which can be explained by the fact that the latter are mainly characterized by an important contribution of NH\textsubscript{4}\textsuperscript{+} along the river's path, indicating that these sites suffered higher pollution during the sampling period than most of the others. If we look at the third mode as a time variable and draw the trajectory plot as shown in Fig.~\ref{fig-Arno-t3-comp-trajectory}, it is possible to analyze the effect of seasonality and to point out different behavior for the different spatial positions.
<>=
plot(Arnot3, which = "tjplot", main = "Trajectory biplot")
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-Arno-t3-comp-trajectory}
\caption{Trajectory plot for the Tucker3 decomposition of the Arno data.}
\label{fig-Arno-t3-comp-trajectory}
\end{figure}
It is clear that most of the samples were located on a linear pattern, which in part corresponded to the distance from the source or the distance from the different sub-basins occurring along the river's path.
Some differences are observed for two sites, whose results are mostly influenced by the closeness to the mouth (influence of sea water intrusion), which modified the chemical composition.
\subsection{Decomposition of fluorescence data}
\label{sec:ex-amino}
The purpose of this example is to demonstrate the use of nonnegativity constraints in PARAFAC. The data set \citep{bro:1997} is available in \proglang{MATLAB} format at \url{http://www.models.life.ku.dk/Amino_Acid_fluo}. It consists of five simple laboratory-made samples, where each sample contains different amounts of tyrosine, tryptophan and phenylalanine dissolved in phosphate buffered water. The samples were measured by fluorescence (excitation 240--300 nm, emission 250--450 nm, 1 nm intervals) on a PE LS50B spectrofluorometer. The array to be decomposed is thus $5 \times 201 \times 61$. First we plot the emission and excitation measurements of one sample. The result is shown in Fig.~\ref{fig-amino-surface}.
<>=
data("amino")
x <- as.numeric(dimnames(amino)[[2]])
y <- as.numeric(dimnames(amino)[[3]])
persp(x, y, amino[2, , ], ticktype = "detailed", theta = -15, phi = 30,
    col = "lightblue", xlab = "Emission wavelength",
    ylab = "Excitation wavelength", zlab = "Intensity")
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-amino-surface}
\caption{Fluorescence landscape of one sample.}
\label{fig-amino-surface}
\end{figure}
Two excitation spectra (at two fixed emission wavelengths) are shown in Fig.~\ref{fig-amino-excitation}. These data have been investigated several times, always using PARAFAC with three components. We expect that ideally the data should be describable with three PARAFAC components because each individual amino acid gives a rank-one contribution to the data.
<>=
library("ggplot2")
library("reshape2")
mamino1 <- melt(amino[, 1, ], value.name = "Intensity")
mamino2 <- melt(amino[, 30, ], value.name = "Intensity")
mamino <- rbind(cbind(id = dimnames(amino)[[2]][1], mamino1),
    cbind(id = dimnames(amino)[[2]][30], mamino2))
colnames(mamino)[2] <- "Sample"
colnames(mamino)[3] <- "Excitation Wavelength"
p <- ggplot(data = mamino, aes(x = `Excitation Wavelength`, y = Intensity)) +
    geom_line(aes(colour = Sample)) +
    facet_wrap(~id, nrow = 2, scales = "free") +
    theme_bw()
p
@
\begin{figure}[H]
\centering
\includegraphics[width=0.7\textwidth]{rrcov3way-amino-excitation}
\caption{Plot of two excitation spectra.}
\label{fig-amino-excitation}
\end{figure}
Therefore we run the default PARAFAC (no constraints) with three components and then the same model with nonnegativity constraints, setting the parameter \code{const="nonneg"}. The results of the two versions are plotted as \emph{all components plots} using the plot function with the parameter \code{which="allcomp"} and are shown in Fig.~\ref{fig-amino-PARAFAC}.
<>=
set.seed(1234)
(p1 <- Parafac(amino, 3, const = "nonneg", start = "random"))
(p2 <- Parafac(amino, 3, const = "none", start = "random"))
@
<>=
plot(p1, which = "allcomp", mode = "B", points = FALSE,
    legend.position = NULL, xlab = "Wavelength", main = "(b)")
@
<>=
plot(p2, which = "allcomp", mode = "B", points = FALSE,
    legend.position = NULL, xlab = "Wavelength", main = "(a)")
@
\begin{figure}[H]
\centering
\includegraphics[width=0.45\textwidth]{rrcov3way-amino-PARAFAC-none}
\includegraphics[width=0.45\textwidth]{rrcov3way-amino-PARAFAC-nonneg}
\caption{Emission spectra decomposition: (a) Unconstrained PARAFAC with three components and (b) Nonnegative PARAFAC with three components.}
\label{fig-amino-PARAFAC}
\end{figure}
\section{Summary and Conclusions}
\label{sec:conclusions}
Multi-way analysis has received growing attention in chemistry, chemometrics, economics and other disciplines in recent decades, with the most prominent three-way models being Tucker3 and PARAFAC. This gave rise to a number of software tools implementing these models and different variations of them. While most of these tools were developed for \proglang{MATLAB}, a number of \proglang{R}~packages have recently appeared as well. With the development of our \proglang{R}~package we address two important issues in three-way analysis which so far were not covered by \proglang{R}~or other software, namely: (i) the robustness of the estimation procedures to outliers in the data and (ii) the handling of compositional data. Thus, the main purpose of this work is to illustrate the unique tools introduced in the \proglang{R}~package \pkg{rrcov3way} for modeling three-way data and three-way compositions with or without outliers. This is achieved by demonstrating, through real data examples, the relevance and correct use of the standard, robust, compositional and compositional-robust procedures included in the package and of other functions provided for the proper treatment and representation of results. In addition, the package includes a useful set of plotting tools which allow for quick and correct visualization of the main results.\\\\
We plan to expand the models implemented in \pkg{rrcov3way} and presented in this paper to supervised methods like three-way partial least squares and discriminant analysis. The addition of more diagnostic and model selection tools will contribute to the usefulness of the package. Also, interactive visualization features seem to offer a promising path for visual diagnostics, which is another subject for future work.
\bibliography{references}
\section*{Appendix 1: Useful formulas and procedures}
\label{sec:appendix}
Let us consider matrices $\vv{A} = (a_{ij})$ and $\vv{C} = (c_{ij})$ of order $m \times n$ and a matrix $\vv{B} = (b_{kl})$ of order $p \times q$.
\subsection*{Hadamard product}
\label{app:Hadamard}
The Hadamard product (or elementwise product) is defined as follows:
\[
\vv{A} \circ \vv{C} = (a_{ij} c_{ij})_{ij},
\]
where the scalar $a_{ij} c_{ij}$ is the $ij$-th element of the resulting matrix $\vv{A} \circ \vv{C}$, which is of order $m \times n$.
\subsection*{Kronecker product}
\label{app:Kronecker}
\[
\vv{A} \otimes \vv{B} = (a_{ij} \vv{B})_{ij},
\]
where the resulting matrix $\vv{A} \otimes \vv{B}$ is of order $mp \times nq$ and $a_{ij}\vv{B}$ is the $ij$-th submatrix of order $p \times q$.
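Both products are directly available in base \proglang{R}: the Hadamard product of two equal-sized matrices is simply the elementwise product \code{A * C}, and the Kronecker product is computed by \code{kronecker(A, B)}. A tiny numeric sketch:
<>=
A <- matrix(1:4, 2, 2)
C <- matrix(5:8, 2, 2)
B <- matrix(1:6, 3, 2)
A * C                  # Hadamard product: 2 x 2, elementwise
dim(kronecker(A, B))   # Kronecker product: (2 * 3) x (2 * 2) = 6 x 4
@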
\subsection*{Khatri-Rao product}
\label{app:kr}
The Khatri-Rao product of two matrices $\vv{A}$ and $\vv{B}$, of dimensions $I \times K$ and $J \times K$ respectively, is defined as
\[
\vv{A} \odot \vv{B} = \left(\vv{a}_1 \otimes \vv{b}_1, \; \vv{a}_2 \otimes \vv{b}_2, \; \ldots, \; \vv{a}_K \otimes \vv{b}_K\right),
\]
where $\vv{a}_k$ and $\vv{b}_k$ denote the $k$-th columns of $\vv{A}$ and $\vv{B}$. The result is an $IJ \times K$ matrix formed by the matching \emph{column-wise Kronecker products}. In \pkg{rrcov3way} the Khatri-Rao product is computed by the function \code{krp(A, B)}.
\subsection*{Tucker congruence coefficient}
\label{app:congruence}
The congruence coefficient is an index of the similarity between factors which is similar to a correlation coefficient, except that the components are not centered around their means. It is known as ``Tucker's coefficient'' \citep{Tucker:1951}; however, it was first introduced by \citet{Burt:1948}. The congruence coefficient $\phi$ between two components $\vv{x}$ and $\vv{y}$ is given by the following formula:
\[
\phi_{xy} = \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2 \, \sum_i y_i^2}}.
\]
In \pkg{rrcov3way} Tucker's congruence coefficient can be computed by the function \code{congruence(x, y)}. The input can be either two vectors, one matrix (by default \code{y=NULL}) or two matrices. If one matrix \code{X} is passed to the function, the congruence coefficients between its columns will be computed; the result is a symmetric matrix with ones on the diagonal. If two matrices are provided, they must have the same number of rows, and the result is a matrix containing the congruence coefficients between all pairs of columns of the two matrices.
\end{document}