Create Sample Data

TestGenerator takes as input an Excel file in which each sheet represents a table in the OMOP CDM. The following example (testPatientsRSV.xlsx) describes a population of 10 patients, some of whom have RSV. Its person sheet is printed below.
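A minimal sketch of how the person sheet printed below can be read. It assumes that the workbook ships in the package's extdata folder and that the sheet is named "person"; adjust the path if your copy of the file lives elsewhere.

```r
# Assumption: testPatientsRSV.xlsx is installed under inst/extdata of TestGenerator.
filePath <- system.file("extdata", "testPatientsRSV.xlsx", package = "TestGenerator")

# Read the sheet that holds the OMOP CDM person table and print it.
person <- readxl::read_xlsx(filePath, sheet = "person")
person
```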

#> # A tibble: 10 × 5
#>    person_id gender_concept_id year_of_birth race_concept_id
#>        <dbl>             <dbl>         <dbl>           <dbl>
#>  1         1              8532          1980               0
#>  2         2              8507          1980               0
#>  3         3              8532          1965               0
#>  4         4              8532          2010               0
#>  5         5              8532          1936               0
#>  6         6              8532          1970               0
#>  7         7              8532          1988               0
#>  8         8              8507          1998               0
#>  9         9              8507          1990               0
#> 10        10              8532          1945               0
#> # ℹ 1 more variable: ethnicity_concept_id <dbl>

The workbook only needs to contain the tables that are relevant to the analysis. This example has six sheets.
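The sheet names printed below can be listed with readxl::excel_sheets(). This is a sketch that assumes the workbook is found in the package's extdata folder.

```r
# Assumption: testPatientsRSV.xlsx is installed under inst/extdata of TestGenerator.
filePath <- system.file("extdata", "testPatientsRSV.xlsx", package = "TestGenerator")

# Each sheet name corresponds to one OMOP CDM table supplied by the user.
sheets <- readxl::excel_sheets(filePath)
sheets
```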

#> [1] "person"               "observation_period"   "condition_occurrence"
#> [4] "visit_occurrence"     "visit_detail"         "death"

TestGenerator::readPatients() converts the Excel file to JSON and saves it in the project. TestGenerator::patientsCDM() then pushes the sample data to a blank CDM.
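A sketch of the two calls. The argument names (filePath, testName, outputPath, pathJson) are assumptions that should be checked against the installed version of TestGenerator, and a temporary folder stands in for the project directory. The test name "test" matches the message printed below.

```r
# Assumption: testPatientsRSV.xlsx is installed under inst/extdata of TestGenerator.
filePath <- system.file("extdata", "testPatientsRSV.xlsx", package = "TestGenerator")

# Temporary folder for the JSON file (illustrative; a project would use its own folder).
outputPath <- file.path(tempdir(), "testCases")
dir.create(outputPath, showWarnings = FALSE)

# Convert the workbook to a JSON unit test definition named "test".
TestGenerator::readPatients(filePath = filePath,
                            testName = "test",
                            outputPath = outputPath)

# Push the patients described in the JSON file to a blank CDM.
cdm <- TestGenerator::patientsCDM(pathJson = outputPath, testName = "test")
cdm
```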

#> Unit Test Definition created successfully: test
#> Patients pushed to blank CDM successfully
#> # OMOP CDM reference (tbl_duckdb_connection)
#> 
#> Tables: person, observation_period, visit_occurrence, visit_detail, condition_occurrence, drug_exposure, procedure_occurrence, device_exposure, measurement, observation, death, note, note_nlp, specimen, fact_relationship, location, care_site, provider, payer_plan_period, cost, drug_era, dose_era, condition_era, metadata, cdm_source, concept, vocabulary, domain, concept_class, concept_relationship, relationship, concept_synonym, concept_ancestor, source_to_concept_map, drug_strength, cohort_definition, attribute_definition

patientsCDM() returns a CDM reference object that can now be used in unit tests. Its person table, for example, contains the sample patients.
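As an illustration of such a unit test, the sketch below builds the CDM inside a testthat test and checks two facts stated in this section: the population has 10 patients, and only the gender concept IDs 8507 and 8532 occur. The argument names of readPatients() and patientsCDM() and the temporary output folder are assumptions, not confirmed by the text above.

```r
library(testthat)

# Temporary folder for the JSON test definition (illustrative choice).
outputPath <- file.path(tempdir(), "testCases")
dir.create(outputPath, showWarnings = FALSE)

test_that("sample population is loaded into the CDM", {
  # Assumption: the workbook is installed under inst/extdata of TestGenerator.
  filePath <- system.file("extdata", "testPatientsRSV.xlsx", package = "TestGenerator")
  TestGenerator::readPatients(filePath = filePath,
                              testName = "test",
                              outputPath = outputPath)
  cdm <- TestGenerator::patientsCDM(pathJson = outputPath, testName = "test")

  # Pull the lazy person table into memory before inspecting it.
  person <- dplyr::collect(cdm$person)
  expect_equal(nrow(person), 10)
  expect_true(all(person$gender_concept_id %in% c(8507, 8532)))
})
```

Building the CDM inside the test keeps each test independent of objects created elsewhere in the session.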

#> # Source:   table<person> [?? x 18]
#> # Database: DuckDB v0.9.1 [cbarboza@Windows 10 x64:R 4.3.1/C:\Users\cbarboza\AppData\Local\Temp\Rtmp6Jtt0B\file92c81df13da1.duckdb]
#>    person_id gender_concept_id year_of_birth month_of_birth day_of_birth
#>        <int>             <int>         <int>          <int>        <int>
#>  1         1              8532          1980             NA           NA
#>  2         2              8507          1980             NA           NA
#>  3         3              8532          1965             NA           NA
#>  4         4              8532          2010             NA           NA
#>  5         5              8532          1936             NA           NA
#>  6         6              8532          1970             NA           NA
#>  7         7              8532          1988             NA           NA
#>  8         8              8507          1998             NA           NA
#>  9         9              8507          1990             NA           NA
#> 10        10              8532          1945             NA           NA
#> # ℹ more rows
#> # ℹ 13 more variables: birth_datetime <dttm>, race_concept_id <int>,
#> #   ethnicity_concept_id <int>, location_id <int>, provider_id <int>,
#> #   care_site_id <int>, person_source_value <chr>, gender_source_value <chr>,
#> #   gender_source_concept_id <int>, race_source_value <chr>,
#> #   race_source_concept_id <int>, ethnicity_source_value <chr>,
#> #   ethnicity_source_concept_id <int>