| Type: | Package |
| Title: | Aggregated Latent Space Index for Binary, Ordinal, and Continuous Data |
| Version: | 0.2.0 |
| Date: | 2026-03-03 |
| Description: | Provides three stability-validated pipelines for computing an Aggregated Latent Space Index (ALSI): a binary MCA pipeline (alsi_workflow()), an ordinal pipeline using homals alternating least squares optimal scaling (alsi_workflow_ordinal()), and a continuous ipsatized SVD pipeline (calsi_workflow()). All three pipelines share a common bootstrap dual-criterion stability framework (principal angles and Tucker congruence phi) for determining the number of dimensions to retain before index construction. The package is designed to complement Segmented Profile Analysis (SEPA) and is intended for psychometric scale construction and dimensional reduction in survey and clinical research. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| LazyData: | true |
| RoxygenNote: | 7.3.3 |
| Imports: | homals, stats, graphics, utils |
| Suggests: | paran, readxl, openxlsx, testthat (≥ 3.0.0), knitr, rmarkdown, spelling |
| Depends: | R (≥ 4.1.0) |
| Language: | en-US |
| NeedsCompilation: | no |
| Packaged: | 2026-03-04 00:56:11 UTC; sekangkim |
| Author: | Se-Kang Kim |
| Maintainer: | Se-Kang Kim <se-kang.kim@bcm.edu> |
| Repository: | CRAN |
| Date/Publication: | 2026-03-04 08:40:18 UTC |
Fit homals and return person scores, stacked category scores, eigenvalues
Description
Converts ordered-factor columns to plain integers before calling homals. Passing ordered factors directly causes ALS to collapse to a trivial zero-discrimination solution (verified in homals 1.0.11).
Usage
.alsi_fit_homals(X, ndim, suppress_warnings = FALSE, itermax = 1000L)
Arguments
X |
Ordered-factor data frame. |
ndim |
Integer. Number of dimensions to extract. |
suppress_warnings |
Logical. Muffle "Loss function increases" warnings (expected on permuted data; meaningful on real data). |
itermax |
Integer. Maximum ALS iterations (default 1000). |
Value
List: Z [n x ndim], C [P x ndim], lambda [ndim], fit (raw object).
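The conversion described above can be sketched in base R. This is an illustration only: `to_integer_codes()` is a hypothetical helper, not the package's internal code, and it assumes every column is an ordered factor whose level order matches the response scale.

```r
# Hypothetical helper (not part of the package): replace each ordered-factor
# column with its integer level codes before handing the data to homals.
to_integer_codes <- function(X) {
  stopifnot(is.data.frame(X))
  as.data.frame(lapply(X, as.integer))
}

X <- data.frame(
  item1 = factor(c(1, 3, 5, 2), levels = 1:5, ordered = TRUE),
  item2 = factor(c(2, 2, 4, 5), levels = 1:5, ordered = TRUE)
)
X_int <- to_integer_codes(X)
str(X_int)
```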
Create disjunctive (indicator) matrix from binary data
Description
Create disjunctive (indicator) matrix from binary data
Usage
.alsi_make_disjunctive(X01)
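A disjunctive matrix for 0/1 data has two indicator columns per variable, one for each category. The sketch below shows the idea in base R; the function name, the `_0`/`_1` column labels, and the column order are assumptions made for illustration and may differ from the internal function.

```r
# Hypothetical sketch: expand each 0/1 variable into an "absent" and a
# "present" indicator column.
make_disjunctive_sketch <- function(X01) {
  X01 <- as.matrix(X01)
  stopifnot(all(X01 %in% c(0, 1)))
  Z <- cbind(1 - X01, X01)
  colnames(Z) <- c(paste0(colnames(X01), "_0"), paste0(colnames(X01), "_1"))
  Z
}

X01 <- data.frame(MDD = c(1, 0, 1), GAD = c(0, 0, 1))
Z <- make_disjunctive_sketch(X01)
Z
```

Each row of `Z` sums to the number of variables, because every person falls in exactly one category per variable.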
Perform Multiple Correspondence Analysis on binary indicator matrix
Description
Perform Multiple Correspondence Analysis on binary indicator matrix
Usage
.alsi_mca_indicator(Xbin01)
Arguments
Xbin01 |
Data frame or matrix with binary (0/1) variables |
Value
List containing MCA results with eigenvalues, coordinates, and masses
Read Excel file with fallback options
Description
Read Excel file with fallback options
Usage
.alsi_read_xlsx(path)
Summarize matrix columns (median and quantiles)
Description
Summarize matrix columns (median and quantiles)
Usage
.alsi_summarise_matrix(X, probs = c(0.05, 0.95))
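A column-wise summary of this kind can be written with `apply()`. The following is a minimal sketch, assuming the output is one row per column holding the median and the two requested quantiles; the function name and output layout are hypothetical.

```r
# Hypothetical sketch: median plus lower/upper quantile for every column.
summarise_matrix_sketch <- function(X, probs = c(0.05, 0.95)) {
  X <- as.matrix(X)
  med <- apply(X, 2, stats::median, na.rm = TRUE)
  qs <- apply(X, 2, stats::quantile, probs = probs, na.rm = TRUE, names = FALSE)
  data.frame(median = med, lower = qs[1, ], upper = qs[2, ])
}

X <- matrix(as.numeric(1:20), nrow = 10, ncol = 2)
s <- summarise_matrix_sketch(X)
s
```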
Perform ipsatized SVD on continuous data (SEPA-style)
Description
Perform ipsatized SVD on continuous data (SEPA-style)
Usage
.alsi_svd_ipsatized(X, K = NULL)
Arguments
X |
Data frame or matrix with continuous variables (persons x domains) |
K |
Number of dimensions to retain (default: all) |
Value
List containing SVD results with eigenvalues and coordinates
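Ipsatizing removes each person's mean level across domains, so the SVD describes profile shape rather than overall elevation. The sketch below assumes row-centering only (no column centering or scaling) and takes eigenvalues as squared singular values; the function name and the returned element names are hypothetical, and the internal function may differ on each point.

```r
# Hypothetical sketch of an ipsatized SVD (persons x domains).
svd_ipsatized_sketch <- function(X, K = NULL) {
  X <- as.matrix(X)
  X_ips <- X - rowMeans(X)              # subtract each person's own mean
  s <- svd(X_ips)
  if (is.null(K)) K <- length(s$d)      # default: keep all dimensions
  idx <- seq_len(K)
  list(
    eig = s$d[idx]^2,                   # assumed: squared singular values
    row_coords = s$u[, idx, drop = FALSE] %*% diag(s$d[idx], nrow = K),
    col_coords = s$v[, idx, drop = FALSE]
  )
}

set.seed(1)
X <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)
fit <- svd_ipsatized_sketch(X)
round(fit$eig, 3)
```

Because every row of the ipsatized matrix sums to zero, the last eigenvalue is numerically zero: p domains give at most p - 1 informative dimensions.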
Convert various formats to binary 0/1
Description
Convert various formats to binary 0/1
Usage
.alsi_to01(x)
ANR2: Binary Psychiatric Comorbidity Dataset
Description
A binary indicator dataset recording the presence (1) or absence (0) of
nine psychiatric diagnoses for a sample of patients. The dataset is
included as the primary example dataset for the binary MCA pipeline
(alsi_workflow).
Usage
data(ANR2)
Format
A data frame with 13 columns:
- MDD
Major Depressive Disorder (0/1)
- DYS
Dysthymia (0/1)
- DEP
Depressive disorder NOS (0/1)
- PTSD
Post-Traumatic Stress Disorder (0/1)
- OCD
Obsessive-Compulsive Disorder (0/1)
- GAD
Generalized Anxiety Disorder (0/1)
- ANX
Anxiety disorder NOS (0/1)
- SOPH
Social Phobia (0/1)
- ADHD
Attention Deficit Hyperactivity Disorder (0/1)
- pre_EDI
Pre-treatment EDI score (numeric)
- post_EDI
Post-treatment EDI score (numeric)
- pre_bmi
Pre-treatment BMI (numeric)
- post_bmi
Post-treatment BMI (numeric)
Examples
data(ANR2)
vars <- c("MDD", "DYS", "DEP", "PTSD", "OCD", "GAD", "ANX", "SOPH", "ADHD")
results <- alsi_workflow(ANR2, vars = vars, B_pa = 100, B_boot = 100)
Compute Aggregated Latent Space Index (ALSI)
Description
Calculates ALSI as a variance-weighted Euclidean norm of row principal coordinates within a retained K-dimensional MCA subspace.
Usage
alsi(Fmat, eig, K)
Arguments
Fmat |
Matrix of row principal coordinates (N x K or larger) |
eig |
Vector of eigenvalues (inertias) |
K |
Integer, number of dimensions to aggregate |
Value
S3 object of class alsi containing:
alpha |
Numeric vector of ALSI values (length N), representing each individual's variance-weighted distance from the centroid in the retained MCA subspace |
w |
Variance weights (length K), computed as the proportion of retained inertia for each dimension |
alpha_vec |
Aggregated direction vector (length K), equal to sqrt(w), used for projecting category coordinates |
K |
Number of dimensions used in aggregation |
Examples
# Create example data
set.seed(123)
Fmat <- matrix(rnorm(100 * 4), nrow = 100, ncol = 4)
eig <- c(0.5, 0.3, 0.15, 0.05)
# Compute ALSI
a <- alsi(Fmat, eig, K = 3)
print(a)
hist(a$alpha, main = "Distribution of ALSI")
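The quantities in the Value section can be reproduced by hand. This sketch assumes that "variance-weighted Euclidean norm" means the square root of the weighted sum of squared coordinates, with weights equal to each dimension's share of the retained inertia; `alsi_sketch()` is a hypothetical stand-in, not the package function.

```r
# Hypothetical re-implementation for illustration.
alsi_sketch <- function(Fmat, eig, K) {
  idx <- seq_len(K)
  w <- eig[idx] / sum(eig[idx])                      # share of retained inertia
  alpha <- sqrt(as.vector(Fmat[, idx, drop = FALSE]^2 %*% w))
  list(alpha = alpha, w = w, alpha_vec = sqrt(w), K = K)
}

Fmat <- rbind(c(3, 4, 9), c(0, 0, 9))
eig <- c(0.5, 0.5, 0.1)
b <- alsi_sketch(Fmat, eig, K = 2)
b$alpha   # first row: sqrt(0.5 * 9 + 0.5 * 16); second row: 0
```

The third column of `Fmat` is ignored because only the first K dimensions enter the index.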
Complete ALSI Analysis Workflow
Description
Runs a complete ALSI analysis including parallel analysis for dimensionality assessment, bootstrap stability evaluation, ALSI computation, and visualization.
Usage
alsi_workflow(
data,
vars,
B_pa = 2000,
B_boot = 2000,
q = 0.95,
seed = 20260123
)
Arguments
data |
Data frame or path to .xlsx file |
vars |
Character vector of binary variable names |
B_pa |
Number of permutations for parallel analysis (default: 2000) |
B_boot |
Number of bootstrap resamples (default: 2000) |
q |
Quantile for parallel analysis (default: 0.95) |
seed |
Random seed for reproducibility |
Value
List (returned invisibly) containing all analysis objects:
pa |
Parallel analysis results (class |
boot |
Bootstrap stability results (class |
fit |
MCA fit object (class |
alsi |
ALSI values (class |
K |
Number of dimensions retained based on parallel analysis |
Examples
data(ANR2)
vars <- c("MDD", "DYS", "DEP", "PTSD", "OCD", "GAD", "ANX", "SOPH", "ADHD")
results <- alsi_workflow(
data = ANR2,
vars = vars,
B_pa = 100,
B_boot = 100
)
results$pa
results$boot
results$alsi
Ordinal ALSI pipeline via homals ALS optimal scaling
Description
Runs the four-stage ordinal ALSI pipeline:
1. Permutation parallel analysis (column-wise shuffle preserves marginals, destroys inter-item structure) determines K_PA.
2. Reference homals fit followed by varimax rotation on the stacked category score matrix (the loading analogue in homogeneity analysis). The same rotation matrix is applied to person scores.
3. Bootstrap dual-criterion stability. For each resample, homals is refitted and the category score matrix is Procrustes-aligned to the reference. Principal angle and Tucker congruence phi are computed on the same post-Procrustes matrix. K* is the largest k where ALL dimensions 1..k satisfy BOTH criteria simultaneously.
4. Eigenvalue-weighted linear ALSI index from K* retained rotated person scores (result can be negative; z-standardized version also returned).
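The K* rule in the bootstrap stage can be sketched as follows. The inputs are assumed to be one summary value per dimension (for example, a bootstrap median angle and a median Tucker phi), and the comparisons are assumed to be inclusive; neither detail is stated above, and `select_k_star()` is a hypothetical helper.

```r
# Hypothetical sketch: largest k such that dimensions 1..k all pass both criteria.
select_k_star <- function(angle_deg, tucker_phi,
                          angle_threshold_deg = 20, tucker_threshold = 0.85) {
  pass <- angle_deg <= angle_threshold_deg & tucker_phi >= tucker_threshold
  first_fail <- which(!pass)[1]
  if (is.na(first_fail)) length(pass) else first_fail - 1L
}

# Dimension 3 fails the angle criterion, so K* stops at 2 even though
# dimension 4 would pass on its own.
k_star <- select_k_star(angle_deg = c(5, 12, 25, 8),
                        tucker_phi = c(0.97, 0.90, 0.92, 0.95))
k_star
```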
Usage
alsi_workflow_ordinal(
data,
items,
reversed_items = character(0L),
scale_min = 1L,
scale_max = 5L,
n_permutations = 100L,
pa_percentile = 95,
B_boot = 1000L,
angle_threshold_deg = 20,
tucker_threshold = 0.85,
seed = 12345L,
itermax = 1000L,
verbose = TRUE
)
Arguments
data |
A |
items |
Character vector of item column names. |
reversed_items |
Character vector of items to reverse-score
( |
scale_min |
Integer. Lowest valid response value (default 1). |
scale_max |
Integer. Highest valid response value (default 5). |
n_permutations |
Integer. Permutation replicates for Stage 1 (default 100). |
pa_percentile |
Numeric. Null-distribution percentile cutoff (default 95). |
B_boot |
Integer. Bootstrap replicates for Stage 3 (default 1000). |
angle_threshold_deg |
Numeric. Max principal angle in degrees for a dimension to pass the stability criterion (default 20). |
tucker_threshold |
Numeric. Min Tucker congruence phi for a dimension to pass the replicability criterion (default 0.85). |
seed |
Integer. Random seed (default 12345). |
itermax |
Integer. Max ALS iterations passed to homals (default 1000). |
verbose |
Logical. Print progress messages (default TRUE). |
Value
An S3 object of class "alsi_ordinal" with components:
- ALSI_index
Numeric vector (n). Raw eigenvalue-weighted linear composite. Can be negative.
- ALSI_z
Numeric vector (n). Z-standardized ALSI.
- K_PA
Integer. Dimensions retained by parallel analysis.
- K_star
Integer. Final model order after dual-criterion selection.
- Z_ref
Matrix n x K_PA. Varimax-rotated person scores.
- C_ref
Matrix P x K_PA. Varimax-rotated stacked category scores.
- lambda_rot
Numeric vector (K_PA). Eigenvalues (invariant to varimax rotation).
- stability_table
Data frame. Per-dimension stability metrics (eigenvalue, angle, Tucker phi, pass/fail, grade).
- pa_table
Data frame. Parallel analysis results per dimension.
- n_skipped
Integer. Bootstrap replicates discarded due to non-convergence or degenerate resamples.
- call
The matched call.
References
de Leeuw, J., & Mair, P. (2009). Gifi methods for optimal scaling in R: The package homals. Journal of Statistical Software, 31(4), 1-21.
Gifi, A. (1990). Nonlinear multivariate analysis. Wiley.
Lorenzo-Seva, U., & ten Berge, J. M. F. (2006). Tucker's congruence coefficient as a meaningful index of factor similarity. Methodology, 2, 57-64.
Takane, Y., Young, F. W., & de Leeuw, J. (1977). Nonmetric individual differences multidimensional scaling: An alternating least squares method with optimal scaling features. Psychometrika, 42, 7-67.
Compute Continuous Aggregated Latent Space Index (cALSI)
Description
Calculates cALSI as a variance-weighted Euclidean norm of row coordinates within a retained K-dimensional ipsatized SVD subspace.
Usage
calsi(F, eig, K)
Arguments
F |
Matrix of row coordinates (N x K or larger) |
eig |
Vector of eigenvalues |
K |
Integer, number of dimensions to aggregate |
Value
S3 object of class calsi containing:
alpha |
Numeric vector of cALSI values (length N) |
w |
Variance weights (length K) |
alpha_vec |
Aggregated direction vector (sqrt of weights) |
K |
Number of dimensions used |
Demonstrate what cALSI adds beyond SEPA
Description
Demonstrate what cALSI adds beyond SEPA
Usage
calsi_vs_sepa_demo(data, K = 4, B_boot = 2000, seed = 20260206)
Arguments
data |
Data matrix |
K |
Number of dimensions |
B_boot |
Bootstrap samples for stability |
seed |
Random seed |
Value
List with comparison results
Complete cALSI Workflow for Continuous Data
Description
Integrates parallel analysis, bootstrap stability, and cALSI computation.
Usage
calsi_workflow(
data,
B_pa = 2000,
B_boot = 2000,
q = 0.95,
seed = 20260206,
K_override = NULL
)
Arguments
data |
Data frame or matrix of continuous variables |
B_pa |
Number of permutations for parallel analysis |
B_boot |
Number of bootstrap resamples |
q |
Quantile for parallel analysis |
seed |
Random seed |
K_override |
Optional: override parallel analysis K* with specified value |
Value
List containing all analysis objects
Compare SEPA plane-wise summaries with cALSI
Description
Compare SEPA plane-wise summaries with cALSI
Usage
compare_sepa_calsi(fit, K, target_ids = NULL)
Arguments
fit |
SVD fit object |
K |
Number of dimensions |
target_ids |
Optional vector of person IDs to highlight |
Value
Data frame comparing SEPA and cALSI person-level indices
Align MCA solution via Procrustes rotation with sign anchoring
Description
Performs orthogonal Procrustes rotation to align a set of category coordinates to a reference solution, then applies sign correction to maximize agreement with the reference on each dimension.
Usage
mca_align(G, Gref)
Arguments
G |
Matrix of category coordinates to align (M x K) |
Gref |
Reference matrix of category coordinates (M x K) |
Value
List containing:
G_aligned |
Matrix of aligned category coordinates (M x K), rotated and sign-corrected to match the reference |
R |
Orthogonal rotation matrix (K x K) used for alignment |
Examples
# Create example matrices
set.seed(123)
Gref <- matrix(rnorm(20), nrow = 10, ncol = 2)
G <- Gref %*% matrix(c(0.8, 0.6, -0.6, 0.8), 2, 2)
# Align G to Gref
aligned <- mca_align(G, Gref)
print(aligned$G_aligned)
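For readers who want to see the mechanics, an orthogonal Procrustes rotation can be computed from the SVD of the cross-product matrix. The sketch below is a generic base-R version under that standard solution; `procrustes_align_sketch()` is a hypothetical name and the package's own sign-anchoring rule may differ.

```r
# Hypothetical sketch: rotate G toward Gref, then flip any column that is
# negatively related to its reference column.
procrustes_align_sketch <- function(G, Gref) {
  s <- svd(crossprod(G, Gref))          # t(G) %*% Gref = U D V'
  R <- s$u %*% t(s$v)                   # rotation minimizing ||G R - Gref||
  G_rot <- G %*% R
  signs <- sign(colSums(G_rot * Gref))
  signs[signs == 0] <- 1
  # After an exact Procrustes step these inner products are already
  # non-negative, so the sign step acts only as a safeguard here.
  list(G_aligned = sweep(G_rot, 2, signs, `*`), R = R)
}

set.seed(123)
Gref <- matrix(rnorm(20), nrow = 10, ncol = 2)
G <- Gref %*% matrix(c(0.8, 0.6, -0.6, 0.8), 2, 2)   # rotated copy of Gref
out <- procrustes_align_sketch(G, Gref)
max(abs(out$G_aligned - Gref))                        # numerically zero
```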
Bootstrap-Based Subspace Stability Assessment
Description
Evaluates reproducibility of retained MCA dimensions via bootstrap resampling. Quantifies stability using Procrustes principal angles (subspace-level) and Tucker's congruence coefficients (dimension-level).
Usage
mca_bootstrap(data, vars, K, B = 2000, seed = 20260123, verbose = TRUE)
Arguments
data |
Data frame or path to .xlsx file |
vars |
Character vector of binary variable names |
K |
Integer, number of dimensions to retain and assess |
B |
Integer, number of bootstrap resamples (default: 2000) |
seed |
Integer, random seed for reproducibility |
verbose |
Logical, print progress messages |
Value
S3 object of class mca_bootstrap containing:
ref |
Reference MCA fit object (class |
K |
Number of dimensions assessed |
B |
Number of bootstrap resamples performed |
angles |
Matrix of principal angles in degrees (B x K), measuring subspace similarity between bootstrap and reference solutions |
tucker |
Matrix of Tucker congruence coefficients (B x K), measuring dimension-level similarity after Procrustes alignment |
angles_summary |
Summary statistics (median, 5th, 95th percentiles) for principal angles |
tucker_summary |
Summary statistics (median, 5th, 95th percentiles) for Tucker congruence coefficients |
Examples
# Using included ANR2 dataset
data(ANR2)
vars <- c("MDD", "DYS", "DEP", "PTSD", "OCD", "GAD", "ANX", "SOPH", "ADHD")
boot <- mca_bootstrap(ANR2, vars = vars, K = 3, B = 100)
print(boot)
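The two stability measures can be illustrated with generic textbook definitions. The sketch below computes Tucker's congruence for a pair of vectors and principal angles between two column spaces; both helper names are hypothetical, and the package may compute these quantities on different matrices or with different conventions.

```r
# Tucker's congruence coefficient between two coordinate vectors.
tucker_phi <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Principal angles (degrees) between the column spaces of A and B:
# orthonormalize each basis, then take acos of the singular values of Qa'Qb.
principal_angles_deg <- function(A, B) {
  Qa <- qr.Q(qr(A))
  Qb <- qr.Q(qr(B))
  sv <- svd(crossprod(Qa, Qb))$d
  acos(pmin(pmax(sv, 0), 1)) * 180 / pi   # clamp to [0, 1] against rounding
}

A <- cbind(c(1, 0, 0), c(0, 1, 0))   # the x-y plane
B <- cbind(c(1, 0, 0), c(0, 0, 1))   # the x-z plane
angles <- principal_angles_deg(A, B) # shares one axis, orthogonal on the other
phi_same <- tucker_phi(c(1, 2, 3), c(2, 4, 6))
phi_flip <- tucker_phi(c(1, 2, 3), -c(1, 2, 3))
```

Small angles and phi values near 1 both indicate that a bootstrap solution resembles the reference; phi is sensitive to sign, which is why alignment precedes it.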
Parallel Analysis for MCA Dimensionality Assessment
Description
Compares observed MCA eigenvalues against reference distributions from permuted data to identify statistically meaningful dimensions.
Usage
mca_pa(
data,
vars,
B = 2000,
q = 0.95,
seed = 20260123,
max_dims = 20,
verbose = TRUE
)
Arguments
data |
Data frame or path to .xlsx file |
vars |
Character vector of binary variable names |
B |
Integer, number of permutations (default: 2000) |
q |
Numeric, reference quantile for retention (default: 0.95) |
seed |
Integer, random seed for reproducibility |
max_dims |
Integer, maximum dimensions to display in plot |
verbose |
Logical, print progress messages |
Value
S3 object of class mca_pa containing:
eig_obs |
Observed eigenvalues from the MCA of the original data |
eig_q |
Reference quantiles from permutation distribution |
eig_perm |
Matrix of permutation eigenvalues (B x dimensions) |
K_star |
Suggested number of dimensions to retain (dimensions whose observed eigenvalue exceeds the reference quantile) |
fit |
MCA fit object (class |
q |
Quantile threshold used for comparison |
B |
Number of permutations performed |
Examples
# Using included ANR2 dataset
data(ANR2)
vars <- c("MDD", "DYS", "DEP", "PTSD", "OCD", "GAD", "ANX", "SOPH", "ADHD")
pa <- mca_pa(ANR2, vars = vars, B = 100)
print(pa$K_star)
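Two ingredients of permutation parallel analysis can be sketched without the package: shuffling each column independently, which keeps every variable's marginal frequencies but breaks associations between variables, and counting the dimensions whose observed eigenvalue exceeds the reference quantile. Both helpers are hypothetical, and counting only the leading run of exceedances is an assumption about the retention rule.

```r
# Shuffle every column independently (marginals preserved).
permute_columns <- function(X) {
  X[] <- lapply(X, function(col) col[sample.int(length(col))])
  X
}

# Count leading dimensions with observed eigenvalue above the reference.
count_retained <- function(eig_obs, eig_q) {
  exceed <- eig_obs > eig_q
  first_fail <- which(!exceed)[1]
  if (is.na(first_fail)) length(exceed) else first_fail - 1L
}

set.seed(1)
X <- data.frame(a = c(1, 1, 0, 0, 1), b = c(0, 1, 0, 0, 0))
Xp <- permute_columns(X)
colSums(Xp)                     # same prevalences as colSums(X)

k <- count_retained(eig_obs = c(0.30, 0.20, 0.10),
                    eig_q   = c(0.18, 0.15, 0.12))
k
```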
Plot Category Projections in MCA Space
Description
Visualizes category coordinates in a 2D MCA subspace and optionally displays projections onto the aggregated ALSI direction.
Usage
plot_category_projections(
fit,
K,
alpha_vec = NULL,
dim_pair = c(1, 2),
cex = 0.8,
top_n = 15
)
Arguments
fit |
MCA fit object (class |
K |
Number of dimensions in retained subspace |
alpha_vec |
Optional aggregated direction vector (from |
dim_pair |
Integer vector of length 2, dimensions to plot (default: c(1,2)) |
cex |
Character expansion for labels |
top_n |
Number of top categories to display by projection (default: 15) |
Value
No return value, called for side effects. The function creates
a scatter plot of category coordinates in the specified 2D subspace,
with category labels displayed. If alpha_vec is provided, it also
prints the top categories ranked by their absolute projection onto the
ALSI direction to the console.
Examples
data(ANR2)
vars <- c("MDD", "DYS", "DEP", "PTSD", "OCD", "GAD", "ANX", "SOPH", "ADHD")
pa <- mca_pa(ANR2, vars = vars, B = 100, verbose = FALSE)
fit <- pa$fit
plot_category_projections(fit, K = pa$K_star)
Plot Domain Loadings in SVD Space
Description
Visualizes domain loadings in a 2D subspace (biplot-style).
Usage
plot_domain_loadings(fit, dim_pair = c(1, 2), cex = 1)
Arguments
fit |
SVD fit object (class |
dim_pair |
Integer vector of length 2, dimensions to plot |
cex |
Character expansion for labels |
Plot Subspace Stability Diagnostics
Description
Creates diagnostic plots showing distributions of principal angles and Tucker congruence coefficients across bootstrap resamples.
Usage
plot_subspace_stability(boot_obj)
Arguments
boot_obj |
Object of class |
Value
No return value, called for side effects. The function creates a two-panel figure with: (1) boxplots of principal angles (left panel), showing the distribution of subspace similarity across bootstrap resamples for each dimension; and (2) boxplots of Tucker congruence coefficients (right panel), showing dimension-level replicability with reference lines at phi = 0.85 (good) and phi = 0.95 (excellent).
Examples
data(ANR2)
vars <- c("MDD", "DYS", "DEP", "PTSD", "OCD", "GAD", "ANX", "SOPH", "ADHD")
boot <- mca_bootstrap(ANR2, vars = vars, K = 3, B = 100)
plot_subspace_stability(boot)
Plot Subspace Stability Diagnostics for Continuous Data
Description
Plot Subspace Stability Diagnostics for Continuous Data
Usage
plot_subspace_stability_cont(boot_obj)
Arguments
boot_obj |
Object of class |
Align SVD solution via Procrustes rotation with sign anchoring
Description
Align SVD solution via Procrustes rotation with sign anchoring
Usage
svd_align(B, Bref)
Arguments
B |
Matrix of domain loadings to align |
Bref |
Reference matrix of domain loadings |
Value
List with aligned coordinates and rotation matrix
Bootstrap-Based Subspace Stability Assessment for Ipsatized SVD
Description
Evaluates reproducibility of retained dimensions via bootstrap resampling. Uses Procrustes principal angles (subspace-level) and Tucker's congruence coefficients (dimension-level).
Usage
svd_bootstrap(data, K, B = 2000, seed = 20260206, verbose = TRUE)
Arguments
data |
Data frame or matrix of continuous variables |
K |
Integer, number of dimensions to assess |
B |
Integer, number of bootstrap resamples (default: 2000) |
seed |
Integer, random seed for reproducibility |
verbose |
Logical, print progress messages |
Value
S3 object of class svd_bootstrap
Parallel Analysis for Ipsatized SVD Dimensionality Assessment
Description
Uses the paran package (Horn's parallel analysis with Longman-Allen-Chabassol bias adjustment) for dimensionality assessment, ensuring compatibility with SEPA methodology. Falls back to a built-in method if paran is unavailable.
Usage
svd_pa(data, B = 2000, q = 0.95, seed = 20260206, graph = TRUE, verbose = TRUE)
Arguments
data |
Data frame or matrix of continuous variables |
B |
Integer, number of iterations for paran (default: 2000) |
q |
Numeric, centile for retention threshold (default: 0.95) |
seed |
Integer, random seed for reproducibility |
graph |
Logical, whether to display the scree plot (default: TRUE) |
verbose |
Logical, print progress messages |
Details
This function primarily uses the paran package, which implements Horn's parallel analysis with the bias adjustment described in Longman, Cota, Holden, & Fekken (1989). This is the same method used in SEPA.
The paran package should be installed: install.packages("paran")
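Before running an analysis, it can be useful to check which branch will be taken. The snippet below uses base R's `requireNamespace()`; whether svd_pa() performs its own check in exactly this way is an assumption, and `pa_method` is a name introduced here for illustration.

```r
# Check whether the suggested package paran is installed.
pa_method <- if (requireNamespace("paran", quietly = TRUE)) "paran" else "fallback"
message("Parallel analysis method expected: ", pa_method)
```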
Value
S3 object of class svd_pa containing:
eig_obs |
Observed eigenvalues |
eig_adj |
Adjusted eigenvalues (from paran) |
eig_rand |
Random eigenvalues (threshold) |
K_star |
Number of dimensions to retain |
fit |
SVD fit object for downstream cALSI computation |
method |
Method used ("paran" or "fallback") |