Type: | Package |
Title: | D-Score for Child Development |
Version: | 1.10.0 |
Description: | The D-score summarizes the child's performance on a set of milestones into a single number. The package implements four Rasch model keys to convert milestone scores into a D-score. It provides tools to calculate the D-score and its precision from the child's milestone scores, to convert the D-score into the Development-for-Age Z-score (DAZ) using age-conditional references, and to map milestone names into a generic 9-position item naming convention. |
Depends: | R (≥ 4.1.0) |
Imports: | dplyr (≥ 1.0.0), Rcpp, stats, stringi, tidyr (≥ 1.0.0) |
LinkingTo: | Rcpp, RcppArmadillo |
Suggests: | ggplot2, kableExtra, knitr, lme4, patchwork, rmarkdown, testthat |
Encoding: | UTF-8 |
License: | AGPL-3 |
LazyData: | TRUE |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
URL: | https://github.com/d-score/dscore, https://d-score.org/dscore/, https://d-score.org/dbook1/ |
BugReports: | https://github.com/d-score/dscore/issues |
Copyright: | Stef van Buuren, Iris Eekhout, Arjan Huizing |
RoxygenNote: | 7.3.2 |
Packaged: | 2025-06-05 05:35:54 UTC; buurensv |
Author: | Stef van Buuren [cre, aut], Iris Eekhout [aut], Arjan Huizing [aut], Jonathan Seiden [aut] |
Maintainer: | Stef van Buuren <stef.vanbuuren@tno.nl> |
Repository: | CRAN |
Date/Publication: | 2025-06-05 05:50:02 UTC |
D-score for child development
Description
The dscore package implements the tools needed to calculate the D-score, a numerical score that summarizes early child development in one number.
User functions
The available functions are:
Function | Description |
get_itemnames() | Extract item names from an itemtable |
order_itemnames() | Order item names |
sort_itemnames() | Sort item names |
decompose_itemnames() | Get four components from itemname |
get_itemtable() | Get a subset from the itemtable |
get_labels() | Get labels for items |
rename_gcdg_gsed() | Rename gcdg into gsed lexicon |
dscore() | Estimate D-score and DAZ |
dscore_posterior() | Calculate full posterior of D-score |
get_tau() | Get difficulty parameters from item bank |
daz() | Transform to age-adjusted standardized D-score |
zad() | Inverse of daz() |
get_reference() | Get D-score reference tables |
get_age_equivalent() | Translate difficulty to age |
Built-in data
The package contains the following built-in data:
Data | Description |
builtin_keys() | Available keys for calculating the D-score |
builtin_itembank() | Collection of items fitting the Rasch model |
builtin_itemtable() | Collection of items from instruments measuring early child development |
builtin_references() | Collection of age-conditional reference distributions |
milestones() | Dataset with PASS/FAIL responses for 27 preterms |
gsample | Sample of 10 children from the GSED Phase 1 study, gsed lexicon |
sample_sf | Sample of 10 children from GSED Short Form (GSED-SF) |
sample_lf | Sample of 10 children from GSED Long Form (GSED-LF) |
sample_hf | Sample of 10 children from GSED Household Form (GSED-HF) |
Acknowledgements
The authors wish to recognize the principal investigators and their study team members for their generous contribution of the data that made this tool possible and the members of the Ki team who directly or indirectly contributed to the study: Amina Abubakar, Claudia R. Lindgren Alves, Orazio Attanasio, Maureen M. Black, Maria Caridad Araujo, Susan M. Chang-Lopez, Gary L. Darmstadt, Bernice M. Doove, Wafaie Fawzi, Lia C.H. Fernald, Günther Fink, Emanuela Galasso, Melissa Gladstone, Sally M. Grantham-McGregor, Cristina Gutierrez de Pineres, Pamela Jervis, Jena Derakhshani Hamadani, Charlotte Hanlon, Simone M. Karam, Gillian Lancaster, Betzy Lozoff, Gareth McCray, Jeffrey R Measelle, Girmay Medhin, Ana M. B. Menezes, Lauren Pisani, Helen Pitchik, Muneera Rasheed, Lisy Ratsifandrihamanana, Sarah Reynolds, Linda Richter, Marta Rubio-Codina, Norbert Schady, Limbika Sengani, Chris Sudfeld, Marcus Waldman, Susan P. Walker, Ann M. Weber and Aisha K. Yousafzai.
This study was supported by the Bill & Melinda Gates Foundation. The contents are the sole responsibility of the authors and may not necessarily represent the official views of the Bill & Melinda Gates Foundation or other agencies that may have supported the primary data studies used in the present study.
Author(s)
Maintainer: Stef van Buuren stef.vanbuuren@tno.nl
Authors:
Iris Eekhout iris.eekhout@tno.nl
Arjan Huizing arjan.huizing@tno.nl
Jonathan Seiden jseiden@g.harvard.edu
References
Jacobusse, G., S. van Buuren, and P.H. Verkerk. 2006. “An Interval Scale for Development of Children Aged 0-2 Years.” Statistics in Medicine 25 (13): 2272–83. https://stefvanbuuren.name/publication/jacobusse-2006/
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf
GSED team (Maureen Black, Kieran Bromley, Vanessa Cavallera (lead author), Jorge Cuartas, Tarun Dua (corresponding author), Iris Eekhout, Gunther Fink, Melissa Gladstone, Katelyn Hepworth, Magdalena Janus, Patricia Kariger, Gillian Lancaster, Dana McCoy, Gareth McCray, Abbie Raikes, Marta Rubio-Codina, Stef van Buuren, Marcus Waldman, Susan Walker and Ann Weber). 2019. “The Global Scale for Early Development (GSED).” Early Childhood Matters. https://earlychildhoodmatters.online/2019/the-global-scale-for-early-development-gsed/
See Also
Useful links:
Report bugs at https://github.com/d-score/dscore/issues
Collection of items fitting the Rasch model
Description
A data frame with administrative information per item with difficulty estimates (tau) from the Rasch model. The item bank provides the basic information to calculate D-scores. The items in the item bank are a subset of all items as collected in builtin_itemtable.
Usage
builtin_itembank
Format
A data.frame with variables:
Name | Label |
key | String indicating a specific Rasch model |
item | Item name, gsed lexicon |
tau | Difficulty estimate |
label | Label (English) |
instrument | Instrument code |
domain | Domain code |
mode | Administration mode |
number | Item number |
Details
The difficulties were estimated by a Rasch model. The key indicates the specific Rasch model used to estimate the difficulty. Strictly speaking, one can only compare D-scores calculated from the same key.
Note
Updates:
Dec 01, 2022 - Overwrite labels of gto by correct item order.
Dec 05, 2022 - Adds key gsed2212, adding instruments gl1 and gs1, and defining correct order for gto
Jan 05, 2023 - Adds instrument gh1 to key gsed2212
See Also
dscore(), get_tau(), builtin_itemtable()
Examples
# count number of items per instrument in each key
table(builtin_itembank$instrument, builtin_itembank$key)
Collection of items from instruments measuring early child development
Description
The built-in variable builtin_itemtable contains the name and label of items for measuring early child development.
Usage
builtin_itemtable
Format
A data.frame with variables:
Name | Label |
item | Item name, gsed lexicon |
equate | Equate group |
label | Label (English) |
Details
The builtin_itemtable is created by script data-raw/R/save_builtin_itemtable.R.
Updates:
May 30, 2022 - added gto (LF) and gpa (SF) items
June 1, 2022 - added seven gsd items
Nov 24, 2022 - Added instruments gs1, gs2
Dec 01, 2022 - Labels of gto replaced by correct order. Incorrect item order affects analyses done on LF between 20220530 - 20221201 !!!
Dec 05, 2022 - Redefines gs1 and instrument for Phase 2, removes gs2 (139) Adds gl1 (Long Form Phase 2 items 155)
Jan 05, 2023 - Adds 55 items from GSED-HF
Author(s)
Compiled by Stef van Buuren using different sources
Available keys for calculating the D-score
Description
A key contains the item difficulty estimates from a given Rasch model.
The difficulty estimates (tau) are used to calculate D-scores.
D-scores can only be compared when calculated with the same key.
Usage
builtin_keys
Format
builtin_keys is a data.frame with variables:
Name | Label |
key | String. Name of the key indicating the Rasch model |
base_population | String. Name of the base population for the key |
n_items | Number of items in the key |
n_instruments | Number of instruments in the key |
intercept | Intercept to convert logit into D-score |
slope | Slope to convert logit into D-score |
from | Starting value of the quadrature points |
to | Stopping value of the quadrature points |
by | Increment of the quadrature points |
retired | Has the key been retired? |
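The intercept and slope columns describe a linear transform from the logit scale to the D-score scale, and from, to and by define an equally spaced grid of quadrature points. The base R sketch below illustrates both ideas. The helper names logit_to_d() and d_to_logit() and all numeric values are hypothetical; they are not taken from any built-in key.

```r
# Hypothetical key settings, for illustration only (not read from builtin_keys)
key <- list(intercept = 50, slope = 10, from = -10, to = 100, by = 1)

# Linear transform from the logit scale to the D-score scale
logit_to_d <- function(logit, intercept, slope) intercept + slope * logit

# Inverse transform from the D-score scale back to the logit scale
d_to_logit <- function(d, intercept, slope) (d - intercept) / slope

# Equally spaced quadrature points defined by from, to and by
qp <- seq(from = key$from, to = key$to, by = key$by)

d <- logit_to_d(c(-2, 0, 1.5), key$intercept, key$slope)
logit <- d_to_logit(d, key$intercept, key$slope)
print(d)
print(logit)
print(length(qp))
```

Because the transform depends on the key, a D-score converted with one key's intercept and slope is not comparable with a D-score converted with another's.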
Note
20240609 SvB: Added builtin_keys table by data-raw\data\R\save_builtin_keys.R
Collection of age-conditional reference distributions
Description
A data frame containing the age-dependent distribution of the D-score for children aged 0-5 years. The distribution is modelled after the LMS distribution (Cole & Green, 1992) or BCT model (Stasinopoulos & Rigby, 2022) and is equal for both boys and girls. The LMS/BCT values can be used to graph reference charts and to calculate age-conditional Z-scores, also known as the Development-for-Age Z-score (DAZ).
Usage
builtin_references
Format
A data.frame with the following variables:
Name | Label |
population | Name of the reference population |
key | D-score key, e.g., "dutch" , "gcdg" or "gsed" |
distribution | Distribution family: "LMS" or "BCT" |
age | Decimal age in years |
mu | M-curve, median D-score, P50 |
sigma | S-curve, spread expressed as coefficient of variation |
nu | L-curve, the lambda coefficient of the LMS/BCT model for skewness |
tau | Kurtosis parameter in the BCT model |
P3 | P3 percentile |
P10 | P10 percentile |
P25 | P25 percentile |
P50 | P50 percentile |
P75 | P75 percentile |
P90 | P90 percentile |
P97 | P97 percentile |
SDM2 | -2SD centile |
SDM1 | -1SD centile |
SD0 | 0SD centile, median |
SDP1 | +1SD centile |
SDP2 | +2SD centile |
Details
Here are more details on the reference population:
The "dutch" references were calculated from the SMOCC data and cover the age range 0-2.5 years (van Buuren, 2014).
The "gcdg" references were calculated from the 15 cohorts of the GCDG study and cover the age range 0-5 years (Weber, 2019).
The "phase1" references were calculated from the GSED Phase 1 validation data (GSED-BGD, GSED-PAK, GSED-TZA) and cover the age range 2 weeks to 3.5 years. The age range 3.5-5 years is linearly extrapolated and is only indicative.
The "preliminary_standards" were calculated from the GSED Phase 1 validation data (GSED-BGD, GSED-PAK, GSED-TZA) using a subset of children with covariates indicating healthy development.
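As an illustration of how mu, sigma and nu yield an age-conditional Z-score, the base R sketch below applies the standard LMS formula of Cole & Green (1992) and its inverse. The function names lms_z() and lms_y() and the parameter values are hypothetical. The sketch covers the LMS case only; it ignores the BCT kurtosis parameter tau and is not the code used by daz() or zad().

```r
# Z-score of measurement y under the LMS model:
#   z = ((y / mu)^nu - 1) / (nu * sigma)   if nu != 0
#   z = log(y / mu) / sigma                if nu == 0
lms_z <- function(y, mu, sigma, nu) {
  nu <- rep_len(nu, length(y))
  near_zero <- abs(nu) < 1e-8
  z <- ((y / mu)^nu - 1) / (nu * sigma)
  z[near_zero] <- (log(y / mu) / sigma)[near_zero]
  z
}

# Inverse: measurement y that corresponds to Z-score z
lms_y <- function(z, mu, sigma, nu) {
  nu <- rep_len(nu, length(z))
  near_zero <- abs(nu) < 1e-8
  y <- mu * (1 + nu * sigma * z)^(1 / nu)
  y[near_zero] <- (mu * exp(sigma * z))[near_zero]
  y
}

# Hypothetical reference values at a single age
y <- c(42, 50, 58)
z <- lms_z(y, mu = 50, sigma = 0.08, nu = 1.5)
print(z)
print(lms_y(z, mu = 50, sigma = 0.08, nu = 1.5))
```

A measurement equal to the median mu maps to a Z-score of zero, and applying lms_y() to the result of lms_z() recovers the original values.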
References
Cole TJ, Green PJ (1992). Smoothing reference centile curves: The LMS method and penalized likelihood. Statistics in Medicine, 11(10), 1305-1319.
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf
Stasinopoulos M, Rigby R (2022). gamlss.dist: Distributions for Generalized Additive Models for Location Scale and Shape, R package version 6.0-3, https://CRAN.R-project.org/package=gamlss.dist
See Also
Examples
# get an overview of available references per key
table(builtin_references$population, builtin_references$key)
Calculate posterior of ability
Description
If the difficulty tau_j of item j is not within the relevance interval, which runs from rello to relhi around the dynamic EAP, the procedure ignores the score on item j.
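The relevance check can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper name is_relevant() is hypothetical, the interval is assumed to be [EAP + rello, EAP + relhi] with rello negative and relhi positive, and the end points are assumed to be inclusive.

```r
# TRUE when the item difficulty lies inside the assumed relevance
# interval [eap + rello, eap + relhi] around the current EAP
is_relevant <- function(tau_j, eap, rello = -Inf, relhi = Inf) {
  tau_j >= eap + rello & tau_j <= eap + relhi
}

# Three hypothetical item difficulties evaluated against an EAP of 45
narrow <- is_relevant(tau_j = c(30, 44, 58), eap = 45, rello = -5, relhi = 5)
wide <- is_relevant(tau_j = c(30, 44, 58), eap = 45)
print(narrow)
print(wide)
```

With the infinite default bounds every item counts, which matches the idea that an unbounded relevance interval switches the filter off.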
Usage
calculate_posterior(scores, tau, qp, scale, mu, sd, relhi, rello)
Arguments
scores |
A vector with PASS/FAIL observations.
Scores are coded numerically as |
tau |
A vector containing the item difficulties for the item
scores in |
qp |
Numeric vector of equally spaced quadrature points. |
scale |
Scale expansion |
mu |
Numeric scalar. The mean of the prior. |
sd |
Numeric scalar. Standard deviation of the prior. |
relhi |
Positive numeric scalar. Upper end of the relevance interval |
rello |
Negative numeric scalar. Lower end of the relevance interval |
Value
A list with three elements:
Name | Label |
eap | Mean of the posterior |
gp | Vector of quadrature points |
posterior | Vector with posterior distribution. |
Since dscore V40.1 the function does not return the "start" element.
Author(s)
Stef van Buuren, Arjan Huizing, 2020
Median D-score from the default references for the given key
Description
Returns the age-interpolated median of the D-score of the default reference for a given key.
Usage
count_mu(t, key, prior_mean_NA = NA_real_)
Arguments
t |
Decimal age, numeric vector |
key |
Character, key of the reference population |
prior_mean_NA |
Numeric, prior mean when age is missing |
Details
Do not use this function if you want the median D-score for a specific reference.
DEPRECATED in dscore 1.9.6
Value
A vector of length length(t) with the median of the default reference population for the key.
Median of Dutch references
Description
Returns the age-interpolated median of the Dutch references (van Buuren 2014). The working range is 0-3 years. This function is used to set the prior mean under key "dutch".
Usage
count_mu_dutch(t)
Arguments
t |
Decimal age, numeric vector |
Value
A vector of length length(t) with the median of the Dutch references.
Note
Internal function. Called by dscore()
Examples
dscore:::count_mu_dutch(0:2)
Median of GCDG references
Description
Returns the age-interpolated median of the GCDG references (Weber et al., 2019). The working range is 0-4 years. This function is used to set the prior mean under keys "gcdg" and "gsed1912".
Usage
count_mu_gcdg(t)
Arguments
t |
Decimal age, numeric vector |
Value
A vector of length length(t) with the median of the GCDG references.
Note
Internal function. Called by dscore()
Examples
dscore:::count_mu_gcdg(0:2)
Median of phase1 references
Description
Returns the age-interpolated median of the phase1 references based on LF & SF in GSED-BGD, GSED-PAK, GSED-TZA. This function is used to set the prior mean under keys "293_0" and "gsed2212".
Usage
count_mu_phase1(t)
Arguments
t |
Decimal age, numeric vector |
Details
The interpolation is done in two rounds. First round: Calculate D-scores using .gcdg prior-mean, calculate reference, estimate round 1 parameters used in this function. Round 2: Calculate D-score using round 1 estimates as the prior mean (most differences are within 0.1 D-score points), recalculate references, estimate round 2 parameters used in this function.
In the formulas below, t is decimal age in years and log is the natural logarithm.
Round 1:
Count model, age <= 9 months: 21.3449 + 26.4916 t + 7.0251 log(t + 0.2)
Count model, age > 9 months and <= 3.5 years: 14.69947 - 12.18636 t + 69.11675 log(t + 0.92)
Linear model, age > 3.5 years: 61.40956 + 3.80904 t
Round 2:
Count model, age <= 9 months: 20.5883 + 27.3376 t + 6.4254 log(t + 0.2)
Count model, age > 9 months and <= 3.5 years: 14.63748 - 12.11774 t + 69.05463 log(t + 0.92)
Linear model, age > 3.5 years: 61.37967 + 3.83513 t
The working range is 0-3.5 years. After the age of 3.5 years, the function will increase at an arbitrary rate of 3.8 D-score points per year.
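The Round 2 formulas can be written out as a piecewise function in base R. The sketch below is a reading of the documented coefficients, not the source of count_mu_phase1(). The function name count_mu_sketch() is hypothetical. The sketch assumes that the third term of each Count model is a natural logarithm (an assumption that makes the pieces join almost continuously at 9 months and 3.5 years) and that the break points belong to the younger age group.

```r
# Piecewise median D-score by decimal age t (years), Round 2 coefficients
count_mu_sketch <- function(t) {
  mu <- rep(NA_real_, length(t))
  young <- !is.na(t) & t <= 0.75            # up to 9 months
  mid   <- !is.na(t) & t > 0.75 & t <= 3.5  # 9 months to 3.5 years
  old   <- !is.na(t) & t > 3.5              # linear extrapolation
  mu[young] <- 20.5883 + 27.3376 * t[young] + 6.4254 * log(t[young] + 0.2)
  mu[mid]   <- 14.63748 - 12.11774 * t[mid] + 69.05463 * log(t[mid] + 0.92)
  mu[old]   <- 61.37967 + 3.83513 * t[old]
  mu
}

# Median curve at a few ages; missing ages stay missing
print(count_mu_sketch(c(0, 0.75, 2, 3.5, 5, NA)))
```

Comparing the pieces on either side of each break point is a quick way to check that the coefficients were transcribed correctly.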
Value
A vector of length length(t) with the median of the phase1 references.
Note
Internal function. Called by dscore()
Author(s)
Stef van Buuren, on behalf of GSED project
Examples
dscore:::count_mu_phase1(0:5)
Median of preliminary_standards
Description
Returns the age-interpolated median of the preliminary_standards based on LF & SF in GSED-BGD, GSED-PAK, GSED-TZA. This function is used to set the prior mean under key "gsed2406".
Usage
count_mu_preliminary_standards(t)
Arguments
t |
Decimal age, numeric vector |
Value
A vector of length length(t) with the median of the preliminary_standards.
Note
Internal function. Called by dscore()
Author(s)
Stef van Buuren, on behalf of GSED project
Examples
dscore:::count_mu_preliminary_standards(0:5)
Calculate Development-for-Age Z-score (DAZ)
Description
The daz() function calculates the Development-for-Age Z-score (DAZ). The DAZ represents a child's D-score after adjusting for age by an external age-conditional reference.
Usage
daz(d, x, reference_table = NULL, dec = 3, verbose = FALSE)
zad(z, x, reference_table = NULL, dec = 2, verbose = FALSE)
Arguments
d |
Vector of D-scores |
x |
Vector of ages (decimal age) |
reference_table |
A |
dec |
The number of decimals (default |
verbose |
Print out the used reference table (default |
z |
Vector of standard deviation scores (DAZ) |
Details
The zad() function is the inverse of daz(): given age and the Z-score, it finds the raw D-score.
Note 1: The Box-Cox Cole and Green (BCCG) and Box-Cox t (BCT) distributions model only positive D-score values. To increase robustness, the daz() and zad() functions will round up any D-scores lower than 1.0 to 1.0.
Note 2: The daz() and zad() functions call modified versions of the pBCT() and qBCT() functions from gamlss for better handling of NA's and rounding.
Value
daz() returns an unnamed numeric vector with Z-scores of length length(d).
zad() returns an unnamed numeric vector with D-scores of length length(z).
Author(s)
Stef van Buuren
References
Cole TJ, Green PJ (1992). Smoothing reference centile curves: The LMS method and penalized likelihood. Statistics in Medicine, 11(10), 1305-1319.
See Also
Examples
# using default reference and key
daz(d = c(35, 50), x = c(0.5, 1.0))
# print out names of the used reference table
daz(d = c(35, 50), x = c(0.5, 1.0), verbose = TRUE)
# using the default reference in key gcdg
reftab <- get_reference(key = "gcdg")
daz(d = c(35, 50), x = c(0.5, 1.0), reference_table = reftab)
# using Dutch reference in default key
reftab <- get_reference(population = "dutch", verbose = TRUE)
daz(d = c(35, 50), x = c(0.5, 1.0), reference_table = reftab)
# population median at ages 0.5, 1 and 2 years, default reference
zad(z = rep(0, 3), x = c(0.5, 1, 2))
# population median at ages 0.5, 1 and 2 years, gcdg key
reftab <- get_reference(key = "gcdg", verbose = TRUE)
zad(z = rep(0, 3), x = c(0.5, 1, 2), reference_table = reftab)
# population median at ages 0.5, 1 and 2 years, dutch key
reftab <- get_reference(key = "dutch", verbose = TRUE)
zad(z = rep(0, 3), x = c(0.5, 1, 2), reference_table = reftab)
Decomposes item names into their four components
Description
This utility function decomposes item names into components: instrument, domain, mode and number
Usage
decompose_itemnames(x)
Arguments
x |
A character vector containing item names (gsed lexicon) |
Details
The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes the mode (direct/caregiver/message), and positions 7-9 hold the item sequence number.
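The positional convention can be sketched in base R with substr(). The helper name split_itemnames() is hypothetical, and the sketch performs no validation of malformed names, which decompose_itemnames() may handle differently.

```r
# Split 9-position item names into their four components by position
split_itemnames <- function(x) {
  data.frame(
    instrument = substr(x, 1, 3),  # positions 1-3
    domain     = substr(x, 4, 5),  # positions 4-5
    mode       = substr(x, 6, 6),  # position 6
    number     = substr(x, 7, 9),  # positions 7-9
    stringsAsFactors = FALSE
  )
}

parts <- split_itemnames(c("aqigmc028", "by1mdd157"))
print(parts)
```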
Value
A data.frame with length(x) rows and four columns, named: instrument, domain, mode, and number.
Author(s)
Stef van Buuren
References
https://docs.google.com/spreadsheets/d/1zLsSW9CzqshL8ubb7K5R9987jF4YGDVAW_NBw1hR2aQ/edit#gid=0
See Also
Examples
itemnames <- c("aqigmc028", "grihsd219", "", "by1mdd157", "mdsgmd006")
decompose_itemnames(itemnames)
D-score estimation
Description
The dscore() function estimates the following quantities: the D-score, a numeric score that quantifies child development by one number; the Development-for-Age Z-score (DAZ), which corrects the D-score for age; and the standard error of measurement (SEM) of the D-score.
Usage
dscore(
data,
items = names(data),
key = NULL,
population = NULL,
xname = "age",
xunit = c("decimal", "days", "months"),
prepend = NULL,
itembank = NULL,
metric = c("dscore", "logit"),
prior_mean = NULL,
prior_mean_NA = NULL,
prior_sd = NULL,
prior_sd_NA = NULL,
transform = NULL,
qp = NULL,
dec = c(2L, 3L),
relevance = c(-Inf, Inf),
algorithm = c("current", "1.8.7"),
verbose = FALSE
)
dscore_posterior(
data,
items = names(data),
key = NULL,
population = NULL,
xname = "age",
xunit = c("decimal", "days", "months"),
prepend = NULL,
itembank = NULL,
metric = c("dscore", "logit"),
prior_mean = NULL,
prior_mean_NA = NULL,
prior_sd = NULL,
prior_sd_NA = NULL,
transform = NULL,
qp = NULL,
dec = c(2L, 3L),
relevance = c(-Inf, Inf),
algorithm = c("current", "1.8.7"),
verbose = FALSE
)
Arguments
data |
A |
items |
A character vector containing names of items to be
included into the D-score calculation. Milestone scores are coded
numerically as |
key |
String. The key identifies 1) the difficulty estimates
pertaining to a particular Rasch model, and 2) the prior mean and standard
deviation of the prior distribution for calculating the D-score.
The default key |
population |
String. The name of the reference population to calculate
DAZ.
Use |
xname |
A string with the name of the age variable in
|
xunit |
A string specifying the unit in which age is measured
(either |
prepend |
Character vector with column names in |
itembank |
A |
metric |
A string, either |
prior_mean |
|
prior_mean_NA |
|
prior_sd |
|
prior_sd_NA |
|
transform |
Numeric vector, length 2, containing the intercept
and slope of the linear transform from the logit scale into the
D-score scale. The default ( |
qp |
Numeric vector of equally spaced quadrature points.
This vector should span the range of all D-score or logit values.
The default ( |
dec |
A vector of two integers specifying the number of
decimals for rounding the D-score and DAZ, respectively.
The default is |
relevance |
A numeric vector of length 2 with the lower and
upper bounds of the relevance interval. The procedure calculates
a dynamic EAP for each item. If the difficulty level (tau) of the
next item is outside the relevance interval around EAP, the procedure
ignores the score on the item. The default is |
algorithm |
Computational method, for backward compatibility.
Either |
verbose |
Logical. Print settings. |
Details
The scoring algorithm is based on the method by Bock and Mislevy (1982). The method uses Bayes rule to update a prior ability into a posterior ability.
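The Bayesian updating idea can be sketched in base R as below. This is a minimal illustration on the logit scale, assuming a Rasch model and a normal prior evaluated on equally spaced quadrature points. The function name eap_rasch() and the example values are hypothetical. The sketch leaves out the scale expansion, the logit-to-D-score transform and the relevance interval, and it is not the package's implementation.

```r
# Expected a posteriori (EAP) estimate under a Rasch model
# scores: 0/1 responses (NA = not administered), tau: item difficulties,
# qp: quadrature points, mu and sd: mean and SD of the normal prior
eap_rasch <- function(scores, tau, qp, mu = 0, sd = 1) {
  # Discretized prior, normalized to sum to 1
  posterior <- dnorm(qp, mean = mu, sd = sd)
  posterior <- posterior / sum(posterior)
  for (j in seq_along(scores)) {
    if (is.na(scores[j])) next              # skip missing responses
    p_pass <- plogis(qp - tau[j])           # Rasch probability of a pass
    lik <- if (scores[j] == 1) p_pass else 1 - p_pass
    posterior <- posterior * lik            # Bayes rule: prior times likelihood
    posterior <- posterior / sum(posterior) # posterior becomes the next prior
  }
  eap <- sum(qp * posterior)                        # posterior mean
  sem <- sqrt(sum((qp - eap)^2 * posterior))        # posterior SD
  list(eap = eap, sem = sem, posterior = posterior)
}

qp <- seq(-6, 6, by = 0.1)
none <- eap_rasch(c(NA, NA), tau = c(0, 1), qp = qp)
pass <- eap_rasch(c(1, 1), tau = c(0, 1), qp = qp)
fail <- eap_rasch(c(0, 0), tau = c(0, 1), qp = qp)
print(c(none = none$eap, pass = pass$eap, fail = fail$eap))
```

Without valid responses the estimate stays at the prior mean; passes pull it upward and fails pull it downward, while the posterior standard deviation shrinks as responses accumulate.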
The item names should correspond to the "gsed" lexicon.
A key is defined by the set of estimated item difficulties.
Key | Model | Quadrature | Instruments | Direct/Caregiver | Reference |
"dutch" | 75_0 | -10:80 | 1 | direct | Van Buuren, 2014/2020 |
"gcdg" | 565_18 | -10:100 | 13 | direct | Weber, 2019 |
"gsed1912" | 807_17 | -10:100 | 21 | mixed | GSED Team, 2019 |
"293_0" | 293_0 | -10:100 | 2 | mixed | GSED Team, 2022 |
"gsed2212" | 818_6 | -10:100 | 27 | mixed | GSED Team, 2022 |
"gsed2406" | 818_6 | -10:100 | 27 | mixed | GSED Team, 2024 |
As a general rule, one should only compare D-scores that are calculated using the same key and the same set of quadrature points. For calculating D-scores on new data, the advice is to use the default, which currently is "gsed2406".
The default starting prior is a mean calculated from a so-called "Count model" that describes mean D-score as a function of age. The Count models are implemented in the function get_mu(). By default, the spread of the starting prior is 5 D-score points around the mean D-score, which corresponds to approximately 1.5 to 2 times the normal spread of children of a given age. The starting prior is informative for very short tests (say <5 items), but has little impact on the posterior for larger tests.
Value
The dscore() function returns a data.frame with nrow(data) rows. Optionally, the first block of columns can be copied to the result by using prepend. The second block consists of the following columns:
Name | Label |
a | Decimal age (years) |
n | Number of items with valid (0/1) data |
p | Percentage of passed milestones |
d | D-score, mean of posterior distribution |
sem | Standard error of measurement, standard deviation of the posterior |
daz | D-score corrected for age, calculated in Z-scale (for metric "dscore" ) |
The D-score in column d is a linear scale, with values usually ranging from 0 to 100. The D-score is NA if age is missing or if age is lower than -1/12. It is possible to calculate D-scores for cases with missing ages by setting prior_mean_NA and prior_sd_NA to some reasonable value, e.g., prior_mean_NA = 50 and prior_sd_NA = 20, for the sample at hand.
The SEM is a positive number that quantifies the uncertainty of the D-score. It is NA if the D-score is NA.
The DAZ in column daz is a Z-score that corrects the D-score for age. It is NA when there are no reference values for the given age, or when the D-score is extremely unlikely to be valid at the given age.
Advanced applications: The dscore_posterior() function returns a data frame with nrow(data) rows and length(qp) plus prepended columns with the full posterior density of the D-score at each quadrature point. If no valid responses are found, dscore_posterior() returns the prior density. Versions prior to 1.8.5 returned a matrix (instead of a data.frame). Code that depends on the result being a matrix may break and may need adaptation.
Author(s)
Stef van Buuren, Iris Eekhout, Arjan Huizing (2022)
References
Bock DD, Mislevy RJ (1982). Adaptive EAP Estimation of Ability in a Microcomputer Environment. Applied Psychological Measurement, 6(4), 431-444.
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf
See Also
builtin_keys(), builtin_itembank(), builtin_itemtable(), builtin_references(), get_tau(), posterior(), milestones()
Examples
# using all defaults and properly formatted data
ds <- dscore(milestones)
head(ds)
# step-by-step example
data <- data.frame(
  id = c(
    "Jane", "Martin", "ID-3", "No. 4", "Five", "6",
    NA_character_, as.character(8:10)
  ),
  age = rep(round(21 / 365.25, 4), 10),
  ddifmd001 = c(NA, NA, 0, 0, 0, 1, 0, 1, 1, 1),
  ddicmm029 = c(NA, NA, NA, 0, 1, 0, 1, 0, 1, 1),
  ddigmd053 = c(NA, 0, 0, 1, 0, 0, 1, 1, 0, 1)
)
items <- names(data)[3:5]
# third item is not part of the default key
get_tau(items, verbose = TRUE)
# calculate D-score
dscore(data)
# prepend id variable to output
dscore(data, prepend = "id")
# or prepend all data
# dscore(data, prepend = colnames(data))
# calculate full posterior
p <- dscore_posterior(data)
# check that rows sum to 1
rowSums(p)
# plot full posterior for measurement 7
barplot(as.matrix(p[7, 12:36]),
  names = 1:25,
  xlab = "D-score", ylab = "Density", col = "grey",
  main = "Full D-score posterior for measurement in row 7",
  sub = "D-score (EAP) = 11.58, SEM = 3.99")
# plot P10, P50 and P90 of D-score references
g <- expand.grid(age = seq(0.1, 4, 0.1), p = c(0.1, 0.5, 0.9))
d <- zad(z = qnorm(g$p), x = g$age, verbose = TRUE)
matplot(
  x = matrix(g$age, ncol = 3), y = matrix(d, ncol = 3), type = "l",
  lty = 1, col = "blue", xlab = "Age (years)", ylab = "D-score",
  main = "D-score preliminary standards: P10, P50 and P90")
abline(h = seq(10, 80, 10), v = seq(0, 4, 0.5), col = "gray", lty = 2)
# add measurements made on very preterms, ga < 32 weeks
ds <- dscore(milestones)
points(x = ds$a, y = ds$d, pch = 19, col = "red")
Get age equivalents of items that have a difficulty estimate
Description
This function calculates the ages at which a certain percent in the reference population passes the items.
Usage
get_age_equivalent(
items,
pct = c(10, 50, 90),
key = NULL,
population = NULL,
transform = NULL,
itembank = dscore::builtin_itembank,
xunit = c("decimal", "days", "months"),
verbose = FALSE
)
Arguments
items |
A character vector containing names of items to be
included into the D-score calculation. Milestone scores are coded
numerically as |
pct |
Numeric vector with requested percentiles (0-100). The
default is |
key |
String. The key identifies 1) the difficulty estimates
pertaining to a particular Rasch model, and 2) the prior mean and standard
deviation of the prior distribution for calculating the D-score.
The default key |
population |
String. The name of the reference population to calculate
DAZ.
Use |
transform |
Numeric vector, length 2, containing the intercept
and slope of the linear transform from the logit scale into the
D-score scale. The default ( |
itembank |
A |
xunit |
A string specifying the unit in which age is measured
(either |
verbose |
Logical. Print settings. |
Value
A data.frame with four columns: item, d (D-score), pct (percentile), and a (age-equivalent, in xunit units).
Note
The function internally defines a scale factor given the key.
Examples
get_age_equivalent(c("gpagmc018", "gtogmd026", "ddicmm050"))
Extract item names
Description
The get_itemnames() function matches names against the 9-code template. This is useful for quickly selecting names of items from a larger set of names.
Usage
get_itemnames(
x,
instrument = NULL,
domain = NULL,
mode = NULL,
number = NULL,
strict = FALSE,
itemtable = NULL,
order = "idnm"
)
Arguments
x |
A character vector, |
instrument |
A character vector with 3-position codes of instruments
that should match. The default |
domain |
A character vector with 2-position codes of domains
that should match. The default |
mode |
A character vector with 1-position codes of the mode
of administration. The default |
number |
A numeric or character vector with item numbers.
The default |
strict |
A logical specifying whether the resulting item
names must conform to one of the built-in names. The default is
|
itemtable |
A |
order |
A four-letter string specifying the sorting order.
The four letters are: |
Details
The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes the mode (direct/caregiver/message), and positions 7-9 hold the item sequence number.
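Matching against a 9-position template can be approximated with a regular expression, as in the base R sketch below. The pattern is an assumption made for illustration (three lowercase letters or digits for the instrument, two letters for the domain, one letter for the mode, three digits for the number). It is not the rule that get_itemnames() applies, and it does not check names against the built-in item table the way strict = TRUE is described to do.

```r
# Assumed pattern for the 9-position template (illustrative only)
template <- "^[a-z0-9]{3}[a-z]{2}[a-z][0-9]{3}$"

x <- c("aqigmc028", "grihsd219", "", "age", "mdsgmd999")

# Keep only the names that fit the positional template
candidates <- x[grepl(template, x)]
print(candidates)
```

The empty string and the variable name "age" drop out because they do not have nine positions of the expected kind.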
Value
A vector with names of items
Author(s)
Stef van Buuren 2020
See Also
Examples
itemnames <- c("aqigmc028", "grihsd219", "", "age", "mdsgmd999")
# filter out impossible names
get_itemnames(itemnames)
get_itemnames(itemnames, strict = TRUE)
# only items from specific instruments
get_itemnames(itemnames, instrument = c("aqi", "mds"))
get_itemnames(itemnames, instrument = c("aqi", "mds"), strict = TRUE)
# get all items from the se domain of iyo instrument
get_itemnames(domain = "se", instrument = "iyo")
# get all item from the se domain with direct assessment mode
get_itemnames(domain = "se", mode = "d")
# get all item numbers 70 and 73 from gm domain
get_itemnames(number = c(70, 73), domain = "gm")
Get a subset of items from the itemtable
Description
The builtin_itemtable object in the dscore package contains basic meta-information about items: a name, the equate group, and the item label. The get_itemtable() function returns a subset of items in the itemtable.
Usage
get_itemtable(items = NULL, itemtable = NULL, decompose = FALSE)
Arguments
items |
A logical or character vector of item names to return. The
default ( |
itemtable |
A |
decompose |
If |
Value
A data.frame with seven columns.
See Also
Examples
head(get_itemtable(), 3)
get_itemtable(LETTERS[1:3], "")
Get labels for items
Description
The get_labels() function obtains the item labels for a specified set of items.
Usage
get_labels(items = NULL, trim = NULL, itemtable = NULL)
Arguments
items |
A character vector of item names to return. The
default ( |
trim |
The maximum number of characters in the label. The
default |
itemtable |
A |
Value
A named character vector with length(items) elements with item labels, in the same order as in items.
See Also
builtin_itemtable(), get_itemnames()
Examples
# get labels of first two MacArthur items
get_labels(get_itemnames(instrument = "mac", number = 1:2), trim = 40)
Median D-score from the base population for a given key
Description
Returns the age-interpolated median of the D-score of the default reference for a given key.
Usage
get_mu(t, key, prior_mean_NA = NA_real_)
Arguments
t |
Decimal age, numeric vector |
key |
Character, key of the reference population |
prior_mean_NA |
Numeric, prior mean when age is missing |
Details
Use get_reference()
for more options.
Value
A vector of length length(t)
with the median of the default reference
population for the key.
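To make "age-interpolated" concrete, here is a minimal sketch using base R. It assumes linear interpolation, which this page does not confirm, and uses an invented reference table; `interp_mu()` and the numbers in `ref` are hypothetical.

```r
# Hypothetical reference table: decimal age and median D-score.
# The values are invented for illustration only.
ref <- data.frame(age = c(0.5, 1.0, 2.0), mu = c(30, 45, 60))

# Linearly interpolate the median at the requested ages (assumption).
# Missing ages receive the fallback value, mirroring prior_mean_NA.
interp_mu <- function(t, ref, prior_mean_NA = NA_real_) {
  out <- rep(prior_mean_NA, length(t))
  ok <- !is.na(t)
  out[ok] <- approx(x = ref$age, y = ref$mu, xout = t[ok])$y
  out
}

mu <- interp_mu(c(0.75, 1.5, NA), ref, prior_mean_NA = 50)
mu
```

Ages outside the range of the table yield NA in this sketch, because approx() does not extrapolate by default.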
Get D-score reference
Description
The get_reference()
function selects the D-score reference
distribution.
Usage
get_reference(
population = NULL,
key = NULL,
references = dscore::builtin_references,
verbose = FALSE,
...
)
Arguments
population |
String. The name of the reference population to calculate
DAZ.
Use |
key |
String. The key identifies 1) the difficulty estimates
pertaining to a particular Rasch model, and 2) the prior mean and standard
deviation of the prior distribution for calculating the D-score.
The default key |
references |
A |
verbose |
Logical. Print settings. |
... |
Used to test whether the call contained the deprecated argument
|
Value
A data.frame
with the LMS reference values.
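LMS values are the ingredients for converting a measurement into an age-conditional Z-score. The sketch below assumes the standard LMS (Box-Cox) transformation and takes L, M and S as plain arguments, since the column names of the returned table are not listed here. `lms_z()` and the parameter values are hypothetical; within the package, dscore() calculates the DAZ.

```r
# Hypothetical helper: standard LMS transformation to a Z-score.
# L = Box-Cox power, M = median, S = coefficient of variation.
lms_z <- function(y, L, M, S) {
  ifelse(
    abs(L) < 1e-8,
    log(y / M) / S,                # limiting case when L is zero
    ((y / M)^L - 1) / (L * S)
  )
}

z_median <- lms_z(y = 50, L = 1, M = 50, S = 0.1)  # score at the median
z_above  <- lms_z(y = 55, L = 1, M = 50, S = 0.1)  # score above the median
c(z_median, z_above)
```

A score equal to the median M maps to a Z-score of zero regardless of L and S.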
Note
No references for population "gsed" exist.
The function will silently rewrite population = "gsed" into population = "gsed".
The "dutch" reference was published in Van Buuren (2014).
The "gcdg" reference was calculated from 15 cohorts with direct observations (Weber, 2019).
The "phase1" references were calculated from the GSED Phase 1 validation data (GSED-BGD, GSED-PAK, GSED-TZA) and cover the age range 2 weeks to 3.5 years. The age range 3.5-5 years is linearly extrapolated and is only indicative.
The "preliminary_standards" references were calculated from the GSED Phase 1 validation data using a subset of children with healthy development.
References
Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368.
Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4(6), e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf.
See Also
Examples
# see key-population combinations of builtin_references
table(builtin_references$key, builtin_references$population)
# get the default reference
reftab <- get_reference()
head(reftab, 2)
# get the default reference for the key "gsed2212"
reftab <- get_reference(key = "gsed2212", verbose = TRUE)
# get dutch reference for default key
reftab <- get_reference(population = "dutch", verbose = TRUE)
# loading a non-existing reference yields zero rows
reftab <- get_reference(population = "france", verbose = TRUE)
nrow(reftab)
Obtain difficulty parameters from item bank
Description
Searches the item bank for matching items, and returns the difficulty estimates. Matching is done by item name. Comparisons are done in lower case.
Usage
get_tau(
items,
key = NULL,
itembank = dscore::builtin_itembank,
verbose = FALSE
)
Arguments
items |
A character vector containing names of items to be
included into the D-score calculation. Milestone scores are coded
numerically as |
key |
String. The key identifies 1) the difficulty estimates
pertaining to a particular Rasch model, and 2) the prior mean and standard
deviation of the prior distribution for calculating the D-score.
The default key |
itembank |
A |
verbose |
Logical. Print settings. |
Value
A named vector with the difficulty estimate per item with
length(items)
elements.
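The lookup described above (match by item name, compare in lower case, return one value per requested item) can be sketched in base R. The miniature item bank, its column names and the difficulty values are invented for this example, and `lookup_tau()` is a hypothetical stand-in, not the package's implementation.

```r
# Hypothetical miniature item bank; difficulty values are invented.
bank <- data.frame(
  item = c("ddifmd001", "ddigmd052"),
  tau  = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

# Match requested names to the bank after lower-casing both sides.
# Unmatched items yield NA; the result is named by the requested items.
lookup_tau <- function(items, bank) {
  idx <- match(tolower(items), tolower(bank$item))
  setNames(bank$tau[idx], items)
}

taus <- lookup_tau(c("ddifmd001", "DDigmd052", "xyz"), bank)
taus
```

Returning NA for unknown names keeps the result aligned with the input vector, so it always has length(items) elements.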
Author(s)
Stef van Buuren 2020
See Also
Examples
# difficulty levels in the GHAP lexicon
get_tau(items = c("ddifmd001", "DDigmd052", "xyz"))
Sample of 10 children from the GSED Phase 1 study
Description
A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.
Usage
gsample
Format
A data.frame
with 10 rows and 295 variables:
Name | Label |
id | Integer, child ID |
agedays | Integer, age in days |
gpalac001 | Integer, Cry when hungry...: 1 = yes, 0 = no, NA = not administered |
gpalac002 | Integer, Look at/focus...: 1 = yes, 0 = no, NA = not administered |
... | and so on.. |
There are 138 gpa items (item gpamoc008 (clench fists) removed) from GSED SF and 155 gto items from GSED LF.
See Also
Examples
head(gsample)
Outcomes on developmental milestones for preterm-born children
Description
A demo dataset with developmental scores at the item level for a set of 27 preterm children.
Usage
milestones
Format
A data.frame
with 100 rows and 62 variables:
Name | Label |
id | Integer, child ID |
agedays | Integer, age in days |
age | Numeric, decimal age in years |
sex | Character, "male", "female" |
gagebrth | Integer, gestational age in days |
ddifmd001 | Integer, Fixates eyes: 1 = yes, 0 = no |
... | and so on.. |
See Also
Examples
head(milestones)
Normalize distribution
Description
Normalizes the distribution so that the total mass equals 1.
Usage
normalize(d, qp)
Arguments
d |
A vector with |
qp |
Vector of equally spaced quadrature points. |
Value
A vector of length(d) elements with the prior density estimate at each quadrature point.
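One way to read "total mass equals 1" on an equally spaced grid is that the density values times the grid spacing sum to one. The sketch below follows that reading; it is an assumption about the convention, and `normalize_sketch()` is a hypothetical function, not the internal one.

```r
# Hypothetical sketch: rescale d so that sum(d) * spacing equals 1.
normalize_sketch <- function(d, qp) {
  stopifnot(length(d) == length(qp), length(qp) >= 2)
  width <- qp[2] - qp[1]   # spacing of the equally spaced grid
  d / (sum(d) * width)
}

dens <- normalize_sketch(c(5, 10, 5), qp = c(0, 1, 2))
dens
sum(dens) * 1              # total mass on a grid with spacing 1
```

With a grid spacing of 1, as in this example, the normalized values themselves sum to one.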
Note
Internal function
Examples
dscore:::normalize(c(5, 10, 5), qp = c(0, 1, 2))
sum(dscore:::normalize(rnorm(5), qp = 1:5))
Calculate posterior for one item given score, difficulty and prior
Description
Calculate posterior for one item given score, difficulty and prior
Usage
posterior(score, tau, prior, qp, scale)
Arguments
score |
Integer, either 0 (fail) or 1 (pass) |
tau |
Numeric, difficulty parameter |
prior |
Vector of prior values on quadrature points |
qp |
vector of equally spaced quadrature points |
scale |
expansion relative to the logit scale |
Details
This function assumes that the difficulties have been estimated by
a binary Rasch model, e.g. by rasch.pairwise.itemcluster()
of
the sirt
package.
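Under a binary Rasch model, the update for one item multiplies the prior by the probability of the observed score at each quadrature point. The following is a minimal sketch of that idea, not the package's internal code: `posterior_sketch()` is hypothetical, the grid and prior are invented, and renormalizing to a sum of one is an assumption about the convention.

```r
# Hypothetical sketch of a one-item Bayesian update on a grid.
posterior_sketch <- function(score, tau, prior, qp, scale) {
  p_pass <- plogis((qp - tau) / scale)          # Rasch pass probability
  lik <- if (score == 1) p_pass else 1 - p_pass # likelihood of the score
  post <- prior * lik
  post / sum(post)                              # renormalize (assumption)
}

qp <- seq(-10, 80, by = 1)                      # invented quadrature grid
prior <- dnorm(qp, mean = 30, sd = 5)
prior <- prior / sum(prior)

post <- posterior_sketch(score = 1, tau = 35, prior = prior, qp = qp, scale = 2)
sum(qp * post) > sum(qp * prior)                # a pass shifts the mean upward
```

Dividing by `scale` expresses the difference between ability and difficulty on the logit scale, in line with the description of the scale argument.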
Value
A vector of length length(prior)
Note
Internal function
Author(s)
Stef van Buuren, Arjan Huizing, 2020
See Also
Rename items from gcdg into gsed lexicon
Description
Function rename_gcdg_gsed()
translates item names in the
gcdg lexicon to item names in the gsed lexicon.
Usage
rename_gcdg_gsed(x, copy = TRUE)
Arguments
x |
A character vector containing item names in the gcdg lexicon |
copy |
A logical indicating whether any unmatched names should
be copied ( |
Details
The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes the mode (direct/caregiver/message), and positions 7-9 are the item sequence number.
The function currently supports ASQ-I (aqi), Barrera Moncada (bar), Battelle (bat), Bayley I (by1), Bayley II (by2), Bayley III (by3), Dutch Development Instrument (ddi), Denver (den), Griffiths (gri), MacArthur (mac), WHO milestones (mds), Mullen (mul), pegboard (peg), South African Griffiths (sgr), Stanford Binet (sbi), Tepsi (tep), Vineland (vin).
In cases where the domain of the items isn't clear (vin, bar), the domain is coded as 'xx'.
Value
A character vector of length length(x) with gcdg item names replaced by gsed item names.
Author(s)
Iris Eekhout, Stef van Buuren
References
https://docs.google.com/spreadsheets/d/1zLsSW9CzqshL8ubb7K5R9987jF4YGDVAW_NBw1hR2aQ/edit#gid=0
Examples
from <- c(
"ag28", "gh2_19", "a14ps4", "b1m157", "mil6",
"bm19", "a16fm4", "n22", "ag9", "gh6_5"
)
to <- rename_gcdg_gsed(from, copy = FALSE)
to
Sample of 10 children from GSED HF
Description
A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.
Usage
sample_hf
Format
A data.frame
with 10 rows and 57 variables:
Name | Label |
subjid | Integer, child ID |
agedays | Integer, age in days |
hf001 | Integer, ...: 1 = yes, 0 = no, NA = not administered |
hf002 | Integer, ...: 1 = yes, 0 = no, NA = not administered |
... | and so on.. |
Sample data for 55 gpa
items forming GSED HF V1
See Also
Examples
head(sample_hf)
Sample of 10 children from gto (LF)
Description
A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.
Usage
sample_lf
Format
A data.frame
with 10 rows and 157 variables:
Name | Label |
subjid | Integer, child ID |
agedays | Integer, age in days |
lf001 | Integer, ...: 1 = yes, 0 = no, NA = not administered |
lf002 | Integer, ...: 1 = yes, 0 = no, NA = not administered |
... | and so on.. |
Sample data for 155 gto
items from GSED LF
See Also
Examples
head(sample_lf)
Sample of 10 children from gpa (SF)
Description
A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.
Usage
sample_sf
Format
A data.frame
with 10 rows and 141 variables:
Name | Label |
subjid | Integer, child ID |
agedays | Integer, age in days |
sf001 | Integer, Cry when hungry...: 1 = yes, 0 = no, NA = not administered |
sf002 | Integer, Look at/focus...: 1 = yes, 0 = no, NA = not administered |
... | and so on.. |
Sample data for 139 gpa
items from GSED SF
See Also
Examples
head(sample_sf)
Sorts item names according to user-specified priority
Description
This function sorts the item names according to instrument, domain, mode and number. The user can specify the sorting order.
Usage
sort_itemnames(x, order = "idnm")
order_itemnames(x, order = "idnm")
Arguments
x |
A character vector containing item names (gsed lexicon) |
order |
A four-letter string specifying the sorting order.
The four letters are: |
Value
sort_itemnames() returns a character vector with length(x) sorted elements.
order_itemnames() returns an integer vector of length length(x) with positions of the sorted elements.
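The relation between the two return values can be sketched with base R: the sorted vector is the input indexed by the ordering positions. The example assumes well-formed 9-character names and a fixed key priority (instrument, domain, mode, number). It does not reproduce how the order argument maps letters to keys, nor how empty or malformed names are handled.

```r
x <- c("mdsgmd006", "aqigmc028", "by1mdd157", "grihsd219")

# Sort keys taken from fixed positions of the 9-character name
instrument <- substr(x, 1, 3)
domain     <- substr(x, 4, 5)
mode       <- substr(x, 6, 6)
number     <- as.integer(substr(x, 7, 9))

# Positions of the sorted elements (what an "order" function returns)
ord <- order(instrument, domain, mode, number)
ord

# Sorted elements (what a "sort" function returns)
x[ord]
```

Changing the argument order in the call to order() changes the priority of the keys.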
Author(s)
Stef van Buuren
See Also
Examples
itemnames <- c("aqigmc028", "grihsd219", "", "by1mdd157", "mdsgmd006")
sort_itemnames(itemnames)