Working with ICU datasets in R, especially publicly available ones such
as those provided by PhysioNet, is facilitated by ricu, which provides
data access, a level of abstraction for encoding clinical concepts in a
data source agnostic way, and classes and utilities for working with the
resulting types of time series datasets.
Currently, installation is only possible directly from GitHub, using
the remotes package if it is installed
remotes::install_github("eth-mds/ricu")
or by sourcing the required installation code from GitHub by running
rem <- source(
  paste0("https://raw.githubusercontent.com/r-lib/remotes/main/",
         "install-github.R")
)
rem$value("eth-mds/ricu")
To make sure that some useful utility packages are installed as well,
consider also installing the packages marked as Suggests by running
remotes::install_github("eth-mds/ricu", dependencies = TRUE)
instead, or by installing some of the utility packages (relevant for downloading and preprocessing PhysioNet datasets)
install.packages("xml2")
and demo dataset packages
install.packages(c("mimic.demo", "eicu.demo"),
repos = "https://eth-mds.github.io/physionet-demo")
explicitly.
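Before moving on, it can help to confirm which of the packages mentioned above are actually present. The following is a minimal sketch using only base R; the variable names `pkgs`, `available` and `not_installed` are illustrative and not part of ricu.

```r
# Packages named in the installation notes: ricu itself, the xml2 utility
# package, and the two demo dataset packages.
pkgs <- c("ricu", "xml2", "mimic.demo", "eicu.demo")

# requireNamespace() returns TRUE if a package can be loaded and FALSE
# otherwise, without attaching it to the search path.
available <- vapply(pkgs, requireNamespace, logical(1L), quietly = TRUE)

not_installed <- names(available)[!available]

if (length(not_installed) > 0L) {
  message("Not installed: ", paste(not_installed, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```

Any package reported as not installed can then be installed with the corresponding command given above.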
Out of the box (provided the two data packages mimic.demo and eicu.demo
are available), ricu provides access to the demo datasets corresponding
to the PhysioNet Clinical Databases eICU and MIMIC-III. Tables are
available as
mimic_demo$admissions
#> # <mimic_tbl>: [129 ✖ 19]
#> # ID options: subject_id (patient) < hadm_id (hadm) < icustay_id (icustay)
#> # Defaults: `admission_type` (val)
#> # Time vars: `admittime`, `dischtime`, `deathtime`, `edregtime`, `edouttime`
#> row_id subject_id hadm_id admittime dischtime
#> <int> <int> <int> <dttm> <dttm>
#> 1 12258 10006 142345 2164-10-23 21:09:00 2164-11-01 17:15:00
#> 2 12263 10011 105331 2126-08-14 22:32:00 2126-08-28 18:59:00
#> 3 12265 10013 165520 2125-10-04 23:36:00 2125-10-07 15:13:00
#> 4 12269 10017 199207 2149-05-26 17:19:00 2149-06-03 18:42:00
#> 5 12270 10019 177759 2163-05-14 20:43:00 2163-05-15 12:00:00
#> …
#> 125 41055 44083 198330 2112-05-28 15:45:00 2112-06-07 16:50:00
#> 126 41070 44154 174245 2178-05-14 20:29:00 2178-05-15 09:45:00
#> 127 41087 44212 163189 2123-11-24 14:14:00 2123-12-30 14:31:00
#> 128 41090 44222 192189 2180-07-19 06:55:00 2180-07-20 13:00:00
#> 129 41092 44228 103379 2170-12-15 03:14:00 2170-12-24 18:00:00
#> # ℹ 124 more rows
#> # ℹ 14 more variables: deathtime <dttm>, admission_type <chr>,
#> # admission_location <chr>, discharge_location <chr>, insurance <chr>,
#> # language <chr>, religion <chr>, marital_status <chr>, ethnicity <chr>,
#> # edregtime <dttm>, edouttime <dttm>, diagnosis <chr>,
#> # hospital_expire_flag <int>, has_chartevents_data <int>
and data can be loaded into an R session, for example, using
load_ts("labevents", "mimic_demo", itemid == 50862L,
        cols = c("valuenum", "valueuom"))
#> # A `ts_tbl`: 299 ✖ 4
#> # Id var: `icustay_id`
#> # Index var: `charttime` (1 hours)
#> icustay_id charttime valuenum valueuom
#> <int> <drtn> <dbl> <chr>
#> 1 201006 0 hours 2.4 g/dL
#> 2 203766 -18 hours 2 g/dL
#> 3 203766 4 hours 1.7 g/dL
#> 4 204132 7 hours 3.6 g/dL
#> 5 204201 9 hours 2.3 g/dL
#> …
#> 295 298685 130 hours 1.9 g/dL
#> 296 298685 154 hours 2 g/dL
#> 297 298685 203 hours 2 g/dL
#> 298 298685 272 hours 2.2 g/dL
#> 299 298685 299 hours 2.5 g/dL
#> # ℹ 294 more rows
which returns time series data as a ts_tbl object.
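As a rough sketch of how the returned object might be inspected, the snippet below repeats the `load_ts()` call from above and queries its metadata. The accessors `is_ts_tbl()`, `id_vars()`, `index_var()` and `interval()` are used here on the assumption that they are exported by the installed version of ricu; check the package documentation if any of them is unavailable. The guard variable `has_demo` is illustrative, and the code only runs when both ricu and mimic.demo are installed.

```r
# Only run the example if ricu and the MIMIC-III demo data are available.
has_demo <- requireNamespace("ricu", quietly = TRUE) &&
  requireNamespace("mimic.demo", quietly = TRUE)

if (has_demo) {
  # Same call as in the text above.
  dat <- ricu::load_ts("labevents", "mimic_demo", itemid == 50862L,
                       cols = c("valuenum", "valueuom"))

  # Assumed ricu accessors for ts_tbl metadata.
  print(ricu::is_ts_tbl(dat))  # whether dat is a ts_tbl
  print(ricu::id_vars(dat))    # name of the ID column
  print(ricu::index_var(dat))  # name of the time index column
  print(ricu::interval(dat))   # time resolution of the index
}
```

The printed values should correspond to the header lines of the output shown above (ID variable, index variable and its time step).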
This work was supported by grant #2017-110 of the Strategic Focal Area “Personalized Health and Related Technologies (PHRT)” of the ETH Domain for the SPHN/PHRT Driver Project “Personalized Swiss Sepsis Study”.