rsdv: Synthetic Tabular Data Generation with Gaussian Copulas

Generates synthetic tabular data from real datasets using Gaussian copula models, with parametric marginal selection for numerical columns and a cumulative-frequency embedding that brings categorical and boolean columns into the same joint copula. Includes a metadata system with column types and primary keys, declarative constraints enforced via rejection sampling, conditional sampling, and quality, validity and privacy reports modeled on those of the 'SDMetrics' library. Inspired by the Python 'SDV' (Synthetic Data Vault) library by 'DataCebo'; see Patki, Wedge and Veeramachaneni (2016) "The Synthetic Data Vault" <doi:10.1109/DSAA.2016.49>.
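The cumulative-frequency embedding mentioned above can be illustrated with a minimal sketch. It is not rsdv's API or implementation: it is written in Python with only the standard library, the function names (fit_intervals, embed, decode) are hypothetical, and the choice of the interval midpoint as the representative point is an assumption made for illustration. Each category is given a slice of [0, 1] whose width equals its relative frequency; a point in that slice is pushed through the standard normal quantile function so the column can sit alongside numerical columns in a Gaussian copula, and the mapping is reversed at sampling time.

```python
from bisect import bisect_right
from collections import Counter
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: the latent scale of a Gaussian copula


def fit_intervals(values):
    """Return (categories, upper_edges). Each category owns a slice of [0, 1]
    whose width equals its relative frequency in `values`."""
    counts = Counter(values)
    categories = sorted(counts)  # fixed order keeps the mapping reproducible
    total = len(values)
    upper_edges, running = [], 0
    for cat in categories:
        running += counts[cat]  # integer running sum, so the last edge is exactly 1.0
        upper_edges.append(running / total)
    return categories, upper_edges


def embed(value, categories, upper_edges):
    """Map a category to a latent normal score via its interval midpoint
    (the midpoint is an illustrative assumption, not a documented choice)."""
    i = categories.index(value)
    lower = upper_edges[i - 1] if i > 0 else 0.0
    midpoint = (lower + upper_edges[i]) / 2
    return _STD_NORMAL.inv_cdf(midpoint)


def decode(z, categories, upper_edges):
    """Map a latent normal score back to the category whose interval contains it."""
    u = _STD_NORMAL.cdf(z)
    i = bisect_right(upper_edges, u)  # index of the first upper edge above u
    return categories[min(i, len(categories) - 1)]  # clamp in case u rounds to 1.0
```

For example, with the column ["a", "a", "b", "c"], fit_intervals assigns "a" the slice [0, 0.5), "b" the slice [0.5, 0.75) and "c" the slice [0.75, 1], so more frequent categories cover more of the latent scale and are drawn more often when the copula is sampled.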

Version: 0.1.0
Depends: R (≥ 4.3.0)
Imports: copula (≥ 1.1-0), generics (≥ 0.1.3), jsonlite (≥ 1.8.0), ggplot2 (≥ 3.4.0), tibble (≥ 3.2.0), FNN (≥ 1.1.3), rpart (≥ 4.1.0), scales (≥ 1.2.0), stats, utils
Suggests: testthat (≥ 3.0.0), withr, knitr (≥ 1.40), rmarkdown (≥ 2.20)
Published: 2026-06-08
DOI: 10.32614/CRAN.package.rsdv (may not be active yet)
Author: Kailas Venkitasubramanian [aut, cre]
Maintainer: Kailas Venkitasubramanian <kailasv at gmail.com>
BugReports: https://github.com/kvenkita/rsdv/issues
License: MIT + file LICENSE
URL: https://kvenkita.github.io/rsdv/, https://github.com/kvenkita/rsdv
NeedsCompilation: no
Language: en-US
Materials: README, NEWS
CRAN checks: rsdv results

Documentation:

Reference manual: rsdv.html, rsdv.pdf
Vignettes: Getting Started with rsdv: A Practitioner's Guide to Synthetic Data Generation (source, R code); Migrating from synthpop (source, R code)

Downloads:

Package source: rsdv_0.1.0.tar.gz
Windows binaries: r-devel: not available, r-release: not available, r-oldrel: not available
macOS binaries: r-release (arm64): not available, r-oldrel (arm64): not available, r-release (x86_64): not available, r-oldrel (x86_64): not available

Linking:

Please use the canonical form https://CRAN.R-project.org/package=rsdv to link to this page.