Help for package dscore

Type:

Package

Title:

D-Score for Child Development

Version:

1.10.0

Description:

The D-score summarizes the child's performance on a set of milestones into a single number. The package implements four Rasch model keys to convert milestone scores into a D-score. It provides tools to calculate the D-score and its precision from the child's milestone scores, to convert the D-score into the Development-for-Age Z-score (DAZ) using age-conditional references, and to map milestone names into a generic 9-position item naming convention.

Depends:

R (≥ 4.1.0)

Imports:

dplyr (≥ 1.0.0), Rcpp, stats, stringi, tidyr (≥ 1.0.0)

LinkingTo:

Rcpp, RcppArmadillo

Suggests:

ggplot2, kableExtra, knitr, lme4, patchwork, rmarkdown, testthat

Encoding:

UTF-8

License:

AGPL-3

LazyData:

TRUE

VignetteBuilder:

knitr

NeedsCompilation:

yes

URL:

https://github.com/d-score/dscore, https://d-score.org/dscore/, https://d-score.org/dbook1/

BugReports:

https://github.com/d-score/dscore/issues

Stef van Buuren, Iris Eekhout, Arjan Huizing

RoxygenNote:

7.3.2

Packaged:

2025-06-05 05:35:54 UTC; buurensv

Author:

Stef van Buuren [cre, aut], Iris Eekhout [aut], Arjan Huizing [aut], Jonathan Seiden [aut]

Maintainer:

Stef van Buuren <stef.vanbuuren@tno.nl>

Repository:

CRAN

Date/Publication:

2025-06-05 05:50:02 UTC

D-score for child development

Description

The dscore package implements the tools needed to calculate the D-score, a single number that summarizes early child development.

User functions

The available functions are:

Function	Description
`get_itemnames()`	Extract item names from an itemtable
`order_itemnames()`	Order item names
`sort_itemnames()`	Sort item names
`decompose_itemnames()`	Get four components from itemname

`get_itemtable()`	Get a subset from the itemtable
`get_labels()`	Get labels for items
`rename_gcdg_gsed()`	Rename gcdg into gsed lexicon

`dscore()`	Estimate D-score and DAZ
`dscore_posterior()`	Calculate full posterior of D-score
`get_tau()`	Get difficulty parameters from item bank

`daz()`	Transform to age-adjusted standardized D-score
`zad()`	Inverse of `daz()`
`get_reference()`	Get D-score reference tables
`get_age_equivalent()`	Translate difficulty to age

Built-in data

The package contains the following built-in data:

Data	Description
`builtin_keys`	Available keys for calculating the D-score
`builtin_itembank`	Collection of items fitting the Rasch model
`builtin_itemtable`	Collection of items from instruments measuring early child development
`builtin_references`	Collection of age-conditional reference distributions

`milestones`	Dataset with PASS/FAIL responses for 27 preterms
`gsample`	Sample of 10 children from the GSED Phase 1 study, gsed lexicon
`sample_sf`	Sample of 10 children from GSED Short Form (GSED-SF)
`sample_lf`	Sample of 10 children from GSED Long Form (GSED-LF)
`sample_hf`	Sample of 10 children from GSED Household Form (GSED-HF)

Acknowledgements

The authors wish to recognize the principal investigators and their study team members for their generous contribution of the data that made this tool possible and the members of the Ki team who directly or indirectly contributed to the study: Amina Abubakar, Claudia R. Lindgren Alves, Orazio Attanasio, Maureen M. Black, Maria Caridad Araujo, Susan M. Chang-Lopez, Gary L. Darmstadt, Bernice M. Doove, Wafaie Fawzi, Lia C.H. Fernald, Günther Fink, Emanuela Galasso, Melissa Gladstone, Sally M. Grantham-McGregor, Cristina Gutierrez de Pineres, Pamela Jervis, Jena Derakhshani Hamadani, Charlotte Hanlon, Simone M. Karam, Gillian Lancaster, Betzy Lozoff, Gareth McCray, Jeffrey R Measelle, Girmay Medhin, Ana M. B. Menezes, Lauren Pisani, Helen Pitchik, Muneera Rasheed, Lisy Ratsifandrihamanana, Sarah Reynolds, Linda Richter, Marta Rubio-Codina, Norbert Schady, Limbika Sengani, Chris Sudfeld, Marcus Waldman, Susan P. Walker, Ann M. Weber and Aisha K. Yousafzai.

This study was supported by the Bill & Melinda Gates Foundation. The contents are the sole responsibility of the authors and may not necessarily represent the official views of the Bill & Melinda Gates Foundation or other agencies that may have supported the primary data studies used in the present study.

Author(s)

Maintainer: Stef van Buuren stef.vanbuuren@tno.nl

Authors:

Iris Eekhout iris.eekhout@tno.nl
Arjan Huizing arjan.huizing@tno.nl
Jonathan Seiden jseiden@g.harvard.edu

References

Jacobusse, G., S. van Buuren, and P.H. Verkerk. 2006. “An Interval Scale for Development of Children Aged 0-2 Years.” Statistics in Medicine 25 (13): 2272–83. https://stefvanbuuren.name/publication/jacobusse-2006/

Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/

Weber AM, Rubio-Codina M, Walker SP, van Buuren S, Eekhout I, Grantham-McGregor S, Caridad Araujo M, Chang SM, Fernald LCH, Hamadani JD, Hanlon A, Karam SM, Lozoff B, Ratsifandrihamanana L, Richter L, Black MM (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health 4: e001724. https://gh.bmj.com/content/bmjgh/4/6/e001724.full.pdf.

GSED team (Maureen Black, Kieran Bromley, Vanessa Cavallera (lead author), Jorge Cuartas, Tarun Dua (corresponding author), Iris Eekhout, Gunther Fink, Melissa Gladstone, Katelyn Hepworth, Magdalena Janus, Patricia Kariger, Gillian Lancaster, Dana McCoy, Gareth McCray, Abbie Raikes, Marta Rubio-Codina, Stef van Buuren, Marcus Waldman, Susan Walker and Ann Weber). 2019. “The Global Scale for Early Development (GSED).” Early Childhood Matters. https://earlychildhoodmatters.online/2019/the-global-scale-for-early-development-gsed/

Collection of items fitting the Rasch model

Description

A data frame with administrative information per item with difficulty estimates (tau) from the Rasch model. The item bank provides the basic information to calculate D-scores. The items in the item bank are a subset of all items as collected in builtin_itemtable.

Usage

builtin_itembank

Format

A data.frame with variables:

Name	Label
`key`	String indicating a specific Rasch model
`item`	Item name, gsed lexicon
`tau`	Difficulty estimate
`label`	Label (English)
`instrument`	Instrument code
`domain`	Domain code
`mode`	Administration mode
`number`	Item number

Details

The difficulty estimates were obtained from a Rasch model. The key indicates the specific Rasch model used to estimate the difficulty. Strictly speaking, one can only compare D-scores calculated from the same key.

Note

Updates:

Dec 01, 2022 - Overwrite labels of gto by correct item order.
Dec 05, 2022 - Adds key gsed2212, adding instruments gl1 and gs1, and defining correct order for gto
Jan 05, 2023 - Adds instrument gh1 to key gsed2212

Examples

# count number of items per instrument in each key
table(builtin_itembank$instrument, builtin_itembank$key)

Collection of items from instruments measuring early child development

Description

The built-in variable builtin_itemtable contains the name and label of items for measuring early child development.

Usage

builtin_itemtable

Format

A data.frame with variables:

Name	Label
`item`	Item name, gsed lexicon
`equate`	Equate group
`label`	Label (English)

Details

The builtin_itemtable is created by script data-raw/R/save_builtin_itemtable.R.

Updates:

May 30, 2022 - added gto (LF) and gpa (SF) items
June 1, 2022 - added seven gsd items
Nov 24, 2022 - Added instruments gs1, gs2
Dec 01, 2022 - Labels of gto replaced by correct order. Incorrect item order affects analyses done on LF between 20220530 - 20221201 !!!
Dec 05, 2022 - Redefines gs1 and instrument for Phase 2, removes gs2 (139) Adds gl1 (Long Form Phase 2 items 155)
Jan 05, 2023 - Adds 55 items from GSED-HF

Author(s)

Compiled by Stef van Buuren using different sources

Available keys for calculating the D-score

Description

A key contains the item difficulty estimates from a given Rasch model. The difficulty estimates (tau) are used to calculate D-scores. D-scores can only be compared when calculated with the same key.

Usage

builtin_keys

Format

builtin_keys is a data.frame with variables:

Name	Label
`key`	String. Name of the key indicating the Rasch model
`base_population`	String. Name of the base population for the key
`n_items`	Number of items in the key
`n_instruments`	Number of instruments in the key
`intercept`	Intercept to convert logit into D-score
`slope`	Slope to convert logit into D-score
`from`	Starting value of the quadrature points
`to`	Stopping value of the quadrature points
`by`	Increment of the quadrature points
`retired`	Has the key been retired?
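
The intercept, slope and quadrature fields can be read as follows. The sketch below uses invented values for one hypothetical key (they are not taken from builtin_keys) to show how a logit maps to the D-score scale and how an equally spaced quadrature grid is built. The helper names logit_to_d and d_to_logit are illustrative only.

```r
# Hypothetical key settings; the real values live in builtin_keys
key <- list(intercept = 50, slope = 4, from = -10, to = 100, by = 1)

# Linear transform between the logit scale and the D-score scale
logit_to_d <- function(logit, intercept, slope) intercept + slope * logit
d_to_logit <- function(d, intercept, slope) (d - intercept) / slope

logit <- c(-2, 0, 1.5)
d <- logit_to_d(logit, key$intercept, key$slope)

# Equally spaced quadrature points spanning the D-score range
qp <- seq(key$from, key$to, by = key$by)
```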

Note

20240609 SvB: Added builtin_keys table by data-raw\data\R\save_builtin_keys.R

Collection of age-conditional reference distributions

Description

A data frame containing the age-dependent distribution of the D-score for children aged 0-5 years. The distribution is modelled after the LMS distribution (Cole & Green, 1992) or BCT model (Stasinopoulos & Rigby, 2022) and is equal for both boys and girls. The LMS/BCT values can be used to graph reference charts and to calculate age-conditional Z-scores, also known as the Development-for-Age Z-score (DAZ).

Usage

builtin_references

Format

A data.frame with the following variables:

Name	Label
`population`	Name of the reference population
`key`	D-score key, e.g., `"dutch"`, `"gcdg"` or `"gsed"`
`distribution`	Distribution family: `"LMS"` or `"BCT"`
`age`	Decimal age in years
`mu`	M-curve, median D-score, P50
`sigma`	S-curve, spread expressed as coefficient of variation
`nu`	L-curve, the lambda coefficient of the LMS/BCT model for skewness
`tau`	Kurtosis parameter in the BCT model
`P3`	P3 percentile
`P10`	P10 percentile
`P25`	P25 percentile
`P50`	P50 percentile
`P75`	P75 percentile
`P90`	P90 percentile
`P97`	P97 percentile
`SDM2`	-2SD centile
`SDM1`	-1SD centile
`SD0`	0SD centile, median
`SDP1`	+1SD centile
`SDP2`	+2SD centile

Details

Details on the reference populations:

The "dutch" references were calculated from the SMOCC data and cover the age range 0-2.5 years (van Buuren, 2014).
The "gcdg" references were calculated from the 15 cohorts of the GCDG study and cover the age range 0-5 years (Weber, 2019).
The "phase1" references were calculated from the GSED Phase 1 validation data (GSED-BGD, GSED-PAK, GSED-TZA) and cover the age range 2 weeks to 3.5 years. Values for ages 3.5-5 years are linearly extrapolated and are only indicative.
The "preliminary_standards" were calculated from the GSED Phase 1 validation data (GSED-BGD, GSED-PAK, GSED-TZA) using a subset of children with covariates indicating healthy development.
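
Because the references are tabulated on an age grid, the parameters for a child of a given age have to be looked up between rows. The sketch below shows one way to do that with linear interpolation in base R. The table values are invented, the helper interpolate_ref is hypothetical, and whether the package interpolates linearly is an assumption.

```r
# Hypothetical reference rows (values invented for illustration)
ref <- data.frame(
  age   = c(0.5, 1.0, 1.5, 2.0),
  mu    = c(30, 45, 54, 61),
  sigma = c(0.12, 0.08, 0.07, 0.06),
  nu    = c(1.5, 1.3, 1.2, 1.1)
)

# Linearly interpolate each parameter at the child's exact age;
# approx() returns NA outside the tabulated age range
interpolate_ref <- function(ref, age) {
  data.frame(
    age   = age,
    mu    = approx(ref$age, ref$mu, xout = age)$y,
    sigma = approx(ref$age, ref$sigma, xout = age)$y,
    nu    = approx(ref$age, ref$nu, xout = age)$y
  )
}

at_age <- interpolate_ref(ref, c(0.75, 1.8, 3))
```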

References

Cole TJ, Green PJ (1992). Smoothing reference centile curves: The LMS method and penalized likelihood. Statistics in Medicine, 11(10), 1305-1319.

Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/

Stasinopoulos M, Rigby R (2022). gamlss.dist: Distributions for Generalized Additive Models for Location Scale and Shape, R package version 6.0-3, https://CRAN.R-project.org/package=gamlss.dist

Examples

# get an overview of available references per key
table(builtin_references$population, builtin_references$key)

Calculate posterior of ability

Description

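Calculates the posterior distribution of ability from a vector of item scores. If the difficulty of item j (tau_j) falls outside the relevance interval (rello, relhi) around the dynamic EAP, the procedure ignores the score on item j.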

Usage

calculate_posterior(scores, tau, qp, scale, mu, sd, relhi, rello)

Arguments

scores

A vector with PASS/FAIL observations. Scores are coded numerically as pass = 1 and fail = 0.

tau

A vector containing the item difficulties for the item scores in scores estimated from the Rasch model in the preferred metric/scale.

qp

Numeric vector of equally spaced quadrature points.

scale

Scale expansion

mu

Numeric scalar. The mean of the prior.

sd

Numeric scalar. Standard deviation of the prior.

relhi

Positive numeric scalar. Upper end of the relevance interval

rello

Negative numeric scalar. Lower end of the relevance interval

Value

A list with three elements:

Name	Label
`eap`	Mean of the posterior
`gp`	Vector of quadrature points
`posterior`	Vector with posterior distribution.

Since dscore V40.1 the function does not return the "start" element.
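
The updating scheme can be sketched in base R. This is an illustration only, not the package's implementation. It assumes a normal prior on the quadrature points, a Rasch pass probability of plogis((qp - tau) / scale), and items processed in the order given. The function name posterior_sketch and the name of the returned qp element are hypothetical.

```r
# Sequential EAP update with a relevance interval (illustrative sketch)
posterior_sketch <- function(scores, tau, qp, scale, mu, sd,
                             relhi = Inf, rello = -Inf) {
  post <- dnorm(qp, mean = mu, sd = sd)
  post <- post / sum(post)
  eap <- sum(qp * post)
  for (j in seq_along(scores)) {
    if (is.na(scores[j]) || is.na(tau[j])) next
    # skip item j when its difficulty lies outside the relevance interval
    if (tau[j] < eap + rello || tau[j] > eap + relhi) next
    p <- plogis((qp - tau[j]) / scale)
    lik <- if (scores[j] == 1) p else 1 - p
    post <- post * lik
    post <- post / sum(post)
    eap <- sum(qp * post)   # dynamic EAP after item j
  }
  list(eap = eap, qp = qp, posterior = post)
}

qp <- seq(-10, 100, by = 1)
# Without a relevance interval, all three items are used
res_all <- posterior_sketch(c(1, 1, 0), tau = c(20, 25, 60), qp = qp,
                            scale = 2, mu = 30, sd = 5)
# With relevance (-10, +10), the item with tau = 60 is ignored
res_rel <- posterior_sketch(c(1, 1, 0), tau = c(20, 25, 60), qp = qp,
                            scale = 2, mu = 30, sd = 5,
                            relhi = 10, rello = -10)
```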

Author(s)

Stef van Buuren, Arjan Huizing, 2020

Median D-score from the default references for the given key

Description

Returns the age-interpolated median of the D-score of the default reference for a given key.

Usage

count_mu(t, key, prior_mean_NA = NA_real_)

Arguments

t

Decimal age, numeric vector

key

Character, key of the reference population

prior_mean_NA

Numeric, prior mean when age is missing

Details

Do not use this function if you want the median D-score for a specific reference.

DEPRECATED in dscore 1.9.6

Value

A vector of length length(t) with the median of the default reference population for the key.

Median of Dutch references

Description

Returns the age-interpolated median of the Dutch references (van Buuren 2014). The working range is 0-3 years. This function is used to set prior mean under key "dutch".

Usage

count_mu_dutch(t)

Arguments

t

Decimal age, numeric vector

Value

A vector of length length(t) with the median of the Dutch references.

Note

Internal function. Called by dscore()

Examples

dscore:::count_mu_dutch(0:2)

Median of GCDG references

Description

Returns the age-interpolated median of the GCDG references (Weber et al, 2019). The working range is 0-4 years. This function is used to set prior mean under keys "gcdg" and "gsed1912".

Usage

count_mu_gcdg(t)

Arguments

t

Decimal age, numeric vector

Value

A vector of length length(t) with the median of the GCDG references.

Note

Internal function. Called by dscore()

Examples

dscore:::count_mu_gcdg(0:2)

Median of phase1 references

Description

Returns the age-interpolated median of the phase1 references based on LF & SF in GSED-BGD, GSED-PAK, GSED-TZA. This function is used to set prior mean under keys "293_0" and "gsed2212".

Usage

count_mu_phase1(t)

Arguments

t

Decimal age, numeric vector

Details

The interpolation is done in two rounds. First round: Calculate D-scores using .gcdg prior-mean, calculate reference, estimate round 1 parameters used in this function. Round 2: Calculate D-score using round 1 estimates as the prior mean (most differences are within 0.1 D-score points), recalculate references, estimate round 2 parameters used in this function.

In the formulas below, t is decimal age in years and log is the natural logarithm.

Round 1:

Count model, age <= 9 months: 21.3449 + 26.4916 t + 7.0251 log(t + 0.2)
Count model, age > 9 months and <= 3.5 years: 14.69947 - 12.18636 t + 69.11675 log(t + 0.92)
Linear model, age > 3.5 years: 61.40956 + 3.80904 t

Round 2:

Count model, age < 9 months: 20.5883 + 27.3376 t + 6.4254 log(t + 0.2)
Count model, age > 9 months and < 3.5 years: 14.63748 - 12.11774 t + 69.05463 log(t + 0.92)
Linear model, age > 3.5 years: 61.37967 + 3.83513 t

The working range is 0-3.5 years. After the age of 3.5 years, the function will increase at an arbitrary rate of 3.8 D-score points per year.
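
The Round 2 model can be written out as a small base R function. This is a sketch under two assumptions: the bracketed terms are natural logarithms (with that reading the three segments nearly join at 9 months and at 3.5 years), and ages exactly on a boundary are assigned to the lower segment. The name count_mu_sketch is hypothetical; the package's own function is count_mu_phase1().

```r
# Round 2 Count model for the phase1 median (illustrative sketch).
# Assumption: the bracketed terms are natural logarithms, log(t + c).
count_mu_sketch <- function(t) {
  mu <- rep(NA_real_, length(t))
  young  <- !is.na(t) & t <= 0.75             # up to 9 months
  middle <- !is.na(t) & t > 0.75 & t <= 3.5   # 9 months to 3.5 years
  old    <- !is.na(t) & t > 3.5               # linear continuation
  mu[young]  <- 20.5883 + 27.3376 * t[young] + 6.4254 * log(t[young] + 0.2)
  mu[middle] <- 14.63748 - 12.11774 * t[middle] +
    69.05463 * log(t[middle] + 0.92)
  mu[old]    <- 61.37967 + 3.83513 * t[old]
  mu
}

count_mu_sketch(c(0, 0.75, 2, 3.5, 5, NA))
```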

Value

A vector of length length(t) with the median of the phase1 references.

Note

Internal function. Called by dscore()

Author(s)

Stef van Buuren, on behalf of GSED project

Examples

dscore:::count_mu_phase1(0:5)

Median of preliminary_standards

Description

Returns the age-interpolated median of the preliminary_standards based on LF & SF in GSED-BGD, GSED-PAK, GSED-TZA. This function is used to set prior mean under key "gsed2406".

Usage

count_mu_preliminary_standards(t)

Arguments

t

Decimal age, numeric vector

Value

A vector of length length(t) with the median of the preliminary_standards.

Note

Internal function. Called by dscore()

Author(s)

Stef van Buuren, on behalf of GSED project

Examples

dscore:::count_mu_preliminary_standards(0:5)

Calculate Development-for-Age Z-score (DAZ)

Description

The daz() function calculates the Development-for-Age Z-score (DAZ). The DAZ represents a child's D-score after adjusting for age by an external age-conditional reference.

Usage

daz(d, x, reference_table = NULL, dec = 3, verbose = FALSE)

zad(z, x, reference_table = NULL, dec = 2, verbose = FALSE)

Arguments

d

Vector of D-scores

x

Vector of ages (decimal age)

reference_table

A data.frame with the LMS or BCT reference values. The default NULL selects the default reference belonging to the key, as specified in the base_population field in dscore::builtin_keys.

dec

The number of decimals (default dec = 3 for daz() and dec = 2 for zad()).

verbose

Print out the used reference table (default verbose = FALSE).

z

Vector of standard deviation scores (DAZ)

Details

The zad() is the inverse of daz(): Given age and the Z-score, it finds the raw D-score.

Note 1: The Box-Cox Cole and Green (BCCG) and Box-Cox t (BCT) distributions model only positive D-score values. To increase robustness, the daz() and zad() functions will round up any D-scores lower than 1.0 to 1.0.

Note 2: The daz() and zad() functions call modified versions of the pBCT() and qBCT() functions from gamlss for better handling of NAs and rounding.
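
For the LMS family, the relation between D-score and Z-score at a single age can be sketched with the Cole and Green (1992) formulas. The code below is not the package's implementation (which relies on modified gamlss functions and also covers BCT). The helper names lms_z and lms_d and the parameter values are hypothetical. It does mirror Note 1 by raising D-scores below 1.0 to 1.0.

```r
# LMS Z-score at one age, given mu (M), sigma (S) and nu (L)
lms_z <- function(d, mu, sigma, nu) {
  d <- pmax(d, 1)   # D-scores below 1.0 are raised to 1.0
  if (abs(nu) < 1e-8) {
    log(d / mu) / sigma
  } else {
    ((d / mu)^nu - 1) / (nu * sigma)
  }
}

# Inverse: D-score at Z-score z
lms_d <- function(z, mu, sigma, nu) {
  if (abs(nu) < 1e-8) {
    mu * exp(sigma * z)
  } else {
    mu * (1 + nu * sigma * z)^(1 / nu)
  }
}

# Hypothetical parameters at a single age
mu <- 45; sigma <- 0.08; nu <- 1.2
z <- lms_z(c(40, 45, 0.2), mu, sigma, nu)
back <- lms_d(z, mu, sigma, nu)
```

The round trip recovers the input D-scores, except that the value 0.2 comes back as 1 because of the floor applied before the transformation.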

Value

daz(): Unnamed numeric vector with Z-scores of length length(d).

zad(): Unnamed numeric vector with D-scores of length length(z).

Author(s)

Stef van Buuren

References

Cole TJ, Green PJ (1992). Smoothing reference centile curves: The LMS method and penalized likelihood. Statistics in Medicine, 11(10), 1305-1319.

Examples

# using default reference and key
daz(d = c(35, 50), x = c(0.5, 1.0))

# print out names of the used reference table
daz(d = c(35, 50), x = c(0.5, 1.0), verbose = TRUE)

# using the default reference in key gcdg
reftab <- get_reference(key = "gcdg")
daz(d = c(35, 50), x = c(0.5, 1.0), reference_table = reftab)

# using Dutch reference in default key
reftab <- get_reference(population = "dutch", verbose = TRUE)
daz(d = c(35, 50), x = c(0.5, 1.0), reference_table = reftab)

# population median at ages 0.5, 1 and 2 years, default reference
zad(z = rep(0, 3), x = c(0.5, 1, 2))

# population median at ages 0.5, 1 and 2 years, gcdg key
reftab <- get_reference(key = "gcdg", verbose = TRUE)
zad(z = rep(0, 3), x = c(0.5, 1, 2), reference_table = reftab)

# population median at ages 0.5, 1 and 2 years, dutch key
reftab <- get_reference(key = "dutch", verbose = TRUE)
zad(z = rep(0, 3), x = c(0.5, 1, 2), reference_table = reftab)

Decomposes item names into their four components

Description

This utility function decomposes item names into four components: instrument, domain, mode and number.

Usage

decompose_itemnames(x)

Arguments

x

A character vector containing item names (gsed lexicon)

Details

The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes direct/caregiver/message, and positions 7-9 code an item sequence number.
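
The positional convention can be illustrated with base R string functions. This is a sketch, not the package's decompose_itemnames(). The helper name decompose_sketch is hypothetical, and returning empty strings for names that are not exactly nine characters long is an assumption.

```r
# Split 9-position item names into their four components
decompose_sketch <- function(x) {
  ok <- !is.na(x) & nchar(x) == 9
  part <- function(first, last) ifelse(ok, substr(x, first, last), "")
  data.frame(
    instrument = part(1, 3),
    domain     = part(4, 5),
    mode       = part(6, 6),
    number     = part(7, 9),
    stringsAsFactors = FALSE
  )
}

res <- decompose_sketch(c("aqigmc028", "grihsd219", "", "by1mdd157"))
```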

Value

A data.frame with length(x) rows and four columns, named: instrument, domain, mode, and number.

Author(s)

Stef van Buuren

References

https://docs.google.com/spreadsheets/d/1zLsSW9CzqshL8ubb7K5R9987jF4YGDVAW_NBw1hR2aQ/edit#gid=0

Examples

itemnames <- c("aqigmc028", "grihsd219", "", "by1mdd157", "mdsgmd006")
decompose_itemnames(itemnames)

D-score estimation

Description

The dscore() function estimates the following quantities:

D-score, a numeric score that quantifies child development by one number
Development-for-Age Z-score (DAZ), which corrects the D-score for age
Standard error of measurement (SEM) of the D-score

Usage

dscore(
  data,
  items = names(data),
  key = NULL,
  population = NULL,
  xname = "age",
  xunit = c("decimal", "days", "months"),
  prepend = NULL,
  itembank = NULL,
  metric = c("dscore", "logit"),
  prior_mean = NULL,
  prior_mean_NA = NULL,
  prior_sd = NULL,
  prior_sd_NA = NULL,
  transform = NULL,
  qp = NULL,
  dec = c(2L, 3L),
  relevance = c(-Inf, Inf),
  algorithm = c("current", "1.8.7"),
  verbose = FALSE
)

dscore_posterior(
  data,
  items = names(data),
  key = NULL,
  population = NULL,
  xname = "age",
  xunit = c("decimal", "days", "months"),
  prepend = NULL,
  itembank = NULL,
  metric = c("dscore", "logit"),
  prior_mean = NULL,
  prior_mean_NA = NULL,
  prior_sd = NULL,
  prior_sd_NA = NULL,
  transform = NULL,
  qp = NULL,
  dec = c(2L, 3L),
  relevance = c(-Inf, Inf),
  algorithm = c("current", "1.8.7"),
  verbose = FALSE
)

Arguments

data

A data.frame or matrix with the data. A row collects all observations made on a child on a set of milestones administered at a given age. The function calculates a D-score for each row. Different rows can correspond to different children or ages.

items

A character vector containing names of items to be included into the D-score calculation. Milestone scores are coded numerically as 1 (pass) and 0 (fail). By default, D-score calculation is done on all items found in the data that have a difficulty parameter under the specified key.

key

String. The key identifies 1) the difficulty estimates pertaining to a particular Rasch model, and 2) the prior mean and standard deviation of the prior distribution for calculating the D-score. The default key NULL sets key = "gsed2406". View builtin_keys for an overview of the available keys.

population

String. The name of the reference population to calculate DAZ. Use with(builtin_references, table(key, population)) to see which built-in references are available for key - population combinations. If not specified, the function sets the default population as builtin_keys$base_population[key == builtin_keys$key].

xname

A string with the name of the age variable in data. The default is "age". Do not round age.

xunit

A string specifying the unit in which age is measured (either "decimal", "days" or "months"). The default "decimal" corresponds to decimal age in years.

prepend

Character vector with column names in data that will be prepended to the returned data frame. This is useful for copying columns from data into the result, e.g., for matching.

itembank

A data.frame with at least three columns named key, item and tau. By default, the function uses dscore::builtin_itembank. If you specify your own itembank, then you should also provide the relevant transform and qp arguments.

metric

A string, either "dscore" (default) or "logit", signalling the metric in which ability is estimated. daz is not calculated for the logit scale.

prior_mean

NULL (default), a string, a numeric scalar, or a numeric vector with nrow(data) elements. The default value NULL will consult the base_population field in builtin_keys, and use the corresponding median of that reference as prior mean for the D-score. The string should refer to a column name in data that contains user-supplied values of the prior mean for each observation. A numeric scalar will be expanded to all observations. A numeric vector will be used as is.

prior_mean_NA

NULL (default) or a scalar numeric, representing the prior mean for observations with missing ages. By default, D-scores with missing ages will be NA. We suggest setting prior_mean_NA = 50 as a reasonable choice for samples between 0-3 years. The argument is ignored if prior_mean is specified per observation, which gives you full control of priors for observations with missing ages.

prior_sd

NULL (default), a string, a numeric scalar, or a numeric vector with nrow(data) elements. The default (NULL) uses a value of 5 for all ages. The string should refer to a column name in data that contains user-supplied values of the prior sd for each observation. A numeric scalar will be expanded to all observations. A numeric vector will be used as is.

prior_sd_NA

NULL (default) or a scalar numeric, representing the prior sd for observations with missing ages. By default, D-scores with missing ages will be NA. We suggest setting prior_sd_NA = 20 as a reasonable choice for samples between 0-3 years. The argument is ignored if prior_sd is specified per observation, which gives you full control of priors for observations with missing ages.

transform

Numeric vector, length 2, containing the intercept and slope of the linear transform from the logit scale into the D-score scale. The default (NULL) searches builtin_keys for intercept and slope values.

qp

Numeric vector of equally spaced quadrature points. This vector should span the range of all D-score or logit values. The default (NULL) creates seq(from, to, by), taking the arguments from builtin_keys.

dec

A vector of two integers specifying the number of decimals for rounding the D-score and DAZ, respectively. The default is dec = c(2L, 3L).

relevance

A numeric vector of length 2 with the lower and upper bounds of the relevance interval. The procedure calculates a dynamic EAP for each item. If the difficulty level (tau) of the next item is outside the relevance interval around the EAP, the procedure ignores the score on that item. The default, c(-Inf, Inf), does not ignore any scores.

algorithm

Computational method, for backward compatibility. Either "current" (default) or "1.8.7" (deprecated).

verbose

Logical. Print settings.
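
The xunit argument implies that ages are brought to decimal years before scoring. A minimal sketch of such a conversion follows, assuming 365.25 days and 12 months per year. The helper name to_decimal_age is hypothetical, and the package's internal conversion may differ.

```r
# Convert age in days or months to decimal years (illustrative sketch)
to_decimal_age <- function(age, xunit = c("decimal", "days", "months")) {
  xunit <- match.arg(xunit)
  switch(xunit,
    decimal = age,
    days    = age / 365.25,
    months  = age / 12
  )
}

to_decimal_age(21, "days")     # three weeks old
to_decimal_age(18, "months")   # one and a half years
```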

Details

The scoring algorithm is based on the method by Bock and Mislevy (1982). The method uses Bayes rule to update a prior ability into a posterior ability.

The item names should correspond to the "gsed" lexicon.

A key is defined by the set of estimated item difficulties.

Key	Model	Quadrature	Instruments	Direct/Caregiver	Reference
`"dutch"`	`75_0`	`-10:80`	1	direct	Van Buuren, 2014/2020
`"gcdg"`	`565_18`	`-10:100`	13	direct	Weber, 2019
`"gsed1912"`	`807_17`	`-10:100`	21	mixed	GSED Team, 2019
`"293_0"`	`293_0`	`-10:100`	2	mixed	GSED Team, 2022
`"gsed2212"`	`818_6`	`-10:100`	27	mixed	GSED Team, 2022
`"gsed2406"`	`818_6`	`-10:100`	27	mixed	GSED Team, 2024

As a general rule, one should only compare D-scores that are calculated using the same key and the same set of quadrature points. For calculating D-scores on new data, the advice is to use the default, which currently is "gsed2406".

The default starting prior is a mean calculated from a so-called "Count model" that describes mean D-score as a function of age. The Count models are implemented in the function get_mu(). By default, the spread of the starting prior is 5 D-score points around the mean D-score, which corresponds to approximately 1.5 to 2 times the normal spread of children of a given age. The starting prior is informative for very short tests (say, fewer than 5 items), but has little impact on the posterior for larger tests.
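
The influence of the prior on short versus longer tests can be checked with a small grid calculation. The sketch below is illustrative only. It assumes a normal prior with standard deviation 5, a Rasch pass probability of plogis((qp - tau) / scale) with an invented scale of 2, and made-up item difficulties and responses. The function name eap_sketch is hypothetical.

```r
# EAP and posterior SD on a grid (illustrative sketch)
eap_sketch <- function(scores, tau, mu, sd = 5, scale = 2,
                       qp = seq(-10, 100, by = 1)) {
  post <- dnorm(qp, mu, sd)
  for (j in seq_along(scores)) {
    p <- plogis((qp - tau[j]) / scale)
    post <- post * (if (scores[j] == 1) p else 1 - p)
  }
  post <- post / sum(post)
  eap <- sum(qp * post)
  c(eap = eap, sem = sqrt(sum((qp - eap)^2 * post)))
}

# A 3-item test and a 30-item test, each scored under two prior means
tau_short <- c(38, 40, 42)
y_short   <- c(1, 1, 0)
tau_long  <- seq(30, 59, by = 1)
y_long    <- as.numeric(tau_long <= 44)

# How far does the EAP move when the prior mean moves from 35 to 45?
shift_short <- eap_sketch(y_short, tau_short, mu = 45)["eap"] -
  eap_sketch(y_short, tau_short, mu = 35)["eap"]
shift_long <- eap_sketch(y_long, tau_long, mu = 45)["eap"] -
  eap_sketch(y_long, tau_long, mu = 35)["eap"]
```

Under these assumptions the EAP of the 3-item test moves more with the prior mean than the EAP of the 30-item test, and the posterior standard deviation (sem) is smaller for the longer test.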

Value

The dscore() function returns a data.frame with nrow(data) rows. Optionally, the first block of columns can be copied to the result by using prepend. The second block consists of the following columns:

Name	Label
`a`	Decimal age (years)
`n`	Number of items with valid (0/1) data
`p`	Percentage of passed milestones
`d`	D-score, mean of posterior distribution
`sem`	Standard error of measurement, standard deviation of the posterior
`daz`	D-score corrected for age, calculated in Z-scale (for metric `"dscore"`)

The D-score in column d is a linear scale, with values usually ranging from 0 to 100. The D-score is NA if age is missing or if age is lower than -1/12. It is possible to calculate D-scores for cases with missing ages by setting prior_mean_NA and prior_sd_NA to some reasonable value, e.g., prior_mean_NA = 50 and prior_sd_NA = 20, for the sample at hand.

The SEM is a positive number that quantifies the uncertainty of the D-score. It is NA if the D-score is NA.

The DAZ in column daz is a Z-score that corrects the D-score for age. It is NA when there are no reference values for the given age, or when the D-score is extremely unlikely to be valid at the given age.

Advanced applications: The dscore_posterior() function returns a data frame with nrow(data) rows and length(qp) plus prepended columns with the full posterior density of the D-score at each quadrature point. If no valid responses are found, dscore_posterior() returns the prior density. Versions prior to 1.8.5 returned a matrix (instead of a data.frame). Code that depends on the result being a matrix may break and may need adaptation.

Author(s)

Stef van Buuren, Iris Eekhout, Arjan Huizing (2022)

References

Bock DD, Mislevy RJ (1982). Adaptive EAP Estimation of Ability in a Microcomputer Environment. Applied Psychological Measurement, 6(4), 431-444.

Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368. https://stefvanbuuren.name/publication/van-buuren-2014-gc/

Examples

# using all defaults and properly formatted data
ds <- dscore(milestones)
head(ds)

# step-by-step example
data <- data.frame(
  id = c(
    "Jane", "Martin", "ID-3", "No. 4", "Five", "6",
    NA_character_, as.character(8:10)
  ),
  age = rep(round(21 / 365.25, 4), 10),
  ddifmd001 = c(NA, NA, 0, 0, 0, 1, 0, 1, 1, 1),
  ddicmm029 = c(NA, NA, NA, 0, 1, 0, 1, 0, 1, 1),
  ddigmd053 = c(NA, 0, 0, 1, 0, 0, 1, 1, 0, 1)
)
items <- names(data)[3:5]

# third item is not part of the default key
get_tau(items, verbose = TRUE)

# calculate D-score
dscore(data)

# prepend id variable to output
dscore(data, prepend = "id")

# or prepend all data
# dscore(data, prepend = colnames(data))

# calculate full posterior
p <- dscore_posterior(data)

# check that rows sum to 1
rowSums(p)

# plot full posterior for measurement 7
barplot(as.matrix(p[7, 12:36]),
  names = 1:25,
  xlab = "D-score", ylab = "Density", col = "grey",
  main = "Full D-score posterior for measurement in row 7",
  sub = "D-score (EAP) = 11.58, SEM = 3.99")

# plot P10, P50 and P90 of D-score references
g <- expand.grid(age = seq(0.1, 4, 0.1), p = c(0.1, 0.5, 0.9))
d <- zad(z = qnorm(g$p), x = g$age, verbose = TRUE)
matplot(
  x = matrix(g$age, ncol = 3), y = matrix(d, ncol = 3), type = "l",
  lty = 1, col = "blue", xlab = "Age (years)", ylab = "D-score",
  main = "D-score preliminary standards: P10, P50 and P90")
abline(h = seq(10, 80, 10), v = seq(0, 4, 0.5), col = "gray", lty = 2)

# add measurements made on very preterms, ga < 32 weeks
ds <- dscore(milestones)
points(x = ds$a, y = ds$d, pch = 19, col = "red")

Get age equivalents of items that have a difficulty estimate

Description

This function calculates the ages at which a certain percent in the reference population passes the items.

Usage

get_age_equivalent(
  items,
  pct = c(10, 50, 90),
  key = NULL,
  population = NULL,
  transform = NULL,
  itembank = dscore::builtin_itembank,
  xunit = c("decimal", "days", "months"),
  verbose = FALSE
)

Arguments

items

pct

Numeric vector with requested percentiles (0-100). The default is pct = c(10, 50, 90).

key

population

transform

itembank

xunit

A string specifying the unit in which age is measured (either "decimal", "days" or "months"). The default "decimal" corresponds to decimal age in years.

verbose

Logical. Print settings.

Value

data.frame with four columns: item, d (D-score), pct (percentile), and a (age-equivalent, in xunit units).

Note

The function internally defines a scale factor given the key.
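
One way to read the calculation is sketched below. Both steps are assumptions, not confirmed by the text: the pass probability at D-score d is taken as plogis((d - tau) / scale), and the age is read off a median-by-age curve by linear interpolation. The function name age_equivalent_sketch, the scale value and the median curve are all hypothetical.

```r
# Age at which a given percentage passes an item (illustrative sketch)
age_equivalent_sketch <- function(tau, pct, scale, ref_age, ref_mu) {
  # D-score at which the pass probability equals pct
  d <- tau + scale * qlogis(pct / 100)
  # invert the median curve: find the age whose median D-score equals d
  a <- approx(x = ref_mu, y = ref_age, xout = d)$y
  data.frame(d = d, pct = pct, a = a)
}

# Hypothetical median curve and item difficulty
ref_age <- c(0.25, 0.5, 1, 1.5, 2)
ref_mu  <- c(20, 30, 45, 54, 61)
ae <- age_equivalent_sketch(tau = 45, pct = c(10, 50, 90), scale = 2,
                            ref_age = ref_age, ref_mu = ref_mu)
```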

Examples

get_age_equivalent(c("gpagmc018", "gtogmd026", "ddicmm050"))

Extract item names

Description

The get_itemnames() function matches names against the 9-code template. This is useful for quickly selecting names of items from a larger set of names.
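
Matching against a 9-position template can be pictured as a regular-expression filter. The pattern below is a guess made for illustration (three-character instrument, two-letter domain, one-letter mode, three-digit number). The pattern the package actually uses may differ.

```r
# Hypothetical 9-position template; not taken from the package
template <- "^[a-z][a-z0-9]{2}[a-z]{2}[a-z][0-9]{3}$"

candidates <- c("aqigmc028", "grihsd219", "", "age", "by1mdd157", "id")
matches <- candidates[grepl(template, candidates)]
```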

Usage

get_itemnames(
  x,
  instrument = NULL,
  domain = NULL,
  mode = NULL,
  number = NULL,
  strict = FALSE,
  itemtable = NULL,
  order = "idnm"
)

Arguments

x

A character vector, data.frame or an object of class lean. If not specified, the function will return all item names in itemtable.

instrument

A character vector with 3-position codes of instruments that should match. The default instrument = NULL allows for all instruments.

domain

A character vector with 2-position codes of domains that should match. The default domain = NULL allows for all domains.

mode

A character vector with 1-position codes of the mode of administration. The default mode = NULL allows for all modes.

number

A numeric or character vector with item numbers. The default number = NULL allows for all numbers.

strict

A logical specifying whether the resulting item names must conform to one of the built-in names. The default is strict = FALSE.

itemtable

A data.frame set up according to the same structure as builtin_itemtable(). If not specified, the builtin_itemtable is used.

order

A four-letter string specifying the sorting order. The four letters are: i for instrument, d for domain, m for mode and n for number. The default is "idnm".

Details

The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes the mode of administration (direct/caregiver/message), and positions 7-9 hold an item sequence number.
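The positional layout can be illustrated in base R. The helper split_itemname() below is hypothetical and only mirrors the convention described above. The package offers decompose_itemnames() for this purpose.

```r
# Split 9-character item names into their four components by position
split_itemname <- function(x) {
  data.frame(
    item       = x,
    instrument = substr(x, 1, 3),  # positions 1-3
    domain     = substr(x, 4, 5),  # positions 4-5
    mode       = substr(x, 6, 6),  # position 6
    number     = substr(x, 7, 9),  # positions 7-9
    stringsAsFactors = FALSE
  )
}

split_itemname(c("aqigmc028", "grihsd219"))
```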

Value

A vector with names of items

Author(s)

Stef van Buuren 2020

Examples

itemnames <- c("aqigmc028", "grihsd219", "", "age", "mdsgmd999")

# filter out impossible names
get_itemnames(itemnames)
get_itemnames(itemnames, strict = TRUE)

# only items from specific instruments
get_itemnames(itemnames, instrument = c("aqi", "mds"))
get_itemnames(itemnames, instrument = c("aqi", "mds"), strict = TRUE)

# get all items from the se domain of iyo instrument
get_itemnames(domain = "se", instrument = "iyo")

# get all items from the se domain with direct assessment mode
get_itemnames(domain = "se", mode = "d")

# get items numbered 70 and 73 from the gm domain
get_itemnames(number = c(70, 73), domain = "gm")

Get a subset of items from the itemtable

Description

The builtin_itemtable object in the dscore package contains basic meta-information about items: a name, the equate group, and the item label. The get_itemtable() function returns a subset of items in the itemtable.

Usage

get_itemtable(items = NULL, itemtable = NULL, decompose = FALSE)

Arguments

items

A logical or character vector of item names to return. The default (NULL) returns all items.

itemtable

A data.frame set up according to the same structure as builtin_itemtable(). If not specified, the builtin_itemtable is used. If itemtable = "", then a dynamic item table is created from any specified item names.

decompose

If TRUE, the function adds four columns: instrument, domain, mode and number.

Value

A data.frame with seven columns.

Examples

head(get_itemtable(), 3)
get_itemtable(LETTERS[1:3], "")

Get labels for items

Description

The get_labels() function obtains the item labels for a specified set of items.

Usage

get_labels(items = NULL, trim = NULL, itemtable = NULL)

Arguments

items

A character vector of item names to return. The default (NULL) returns the labels of all items.

trim

The maximum number of characters in the label. The default trim = NULL does not trim labels.

itemtable

A data.frame set up according to the same structure as builtin_itemtable(). If not specified, the builtin_itemtable is used.

Value

A named character vector with length(items) elements with item labels, in the same order as in items.

Examples

# get labels of the first two MacArthur items
get_labels(get_itemnames(instrument = "mac", number = 1:2), trim = 40)

Median D-score from the base population for a given key

Description

Returns the age-interpolated median of the D-score of the default reference for a given key.

Usage

get_mu(t, key, prior_mean_NA = NA_real_)

Arguments

t

Decimal age, numeric vector

key

Character, key of the reference population

prior_mean_NA

Numeric, prior mean when age is missing

Details

Use get_reference() for more options.

Value

A vector of length length(t) with the median of the default reference population for the key.
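A minimal sketch of age interpolation, assuming linear interpolation between tabulated ages. The reference table, the helper interp_median() and all values are hypothetical, and the method used inside the package may differ.

```r
# Hypothetical reference table: decimal ages and median D-scores
ref <- data.frame(age = c(0, 0.5, 1, 2), mu = c(10, 25, 38, 55))

# Linear interpolation of the median at the requested ages;
# missing ages receive the fallback value prior_mean_NA
interp_median <- function(t, ref, prior_mean_NA = NA_real_) {
  out <- approx(x = ref$age, y = ref$mu, xout = t)$y
  out[is.na(t)] <- prior_mean_NA
  out
}

res <- interp_median(c(0.25, 1.5, NA), ref, prior_mean_NA = 50)
res
```

In this sketch, ages outside the tabulated range return NA because approx() does not extrapolate by default.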

Get D-score reference

Description

The get_reference() function selects the D-score reference distribution.

Usage

get_reference(
  population = NULL,
  key = NULL,
  references = dscore::builtin_references,
  verbose = FALSE,
  ...
)

Arguments

population

key

references

A data.frame with the same structure as builtin_references. The default is to use builtin_references.

verbose

Logical. Print settings.

...

Used to test whether the call contained the deprecated argument references.

Value

A data.frame with the LMS reference values.
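For readers unfamiliar with LMS values, the sketch below shows the standard LMS transformation that turns a measurement into a Z-score, given L (Box-Cox power), M (median) and S (coefficient of variation). The helper lms_z() and the numbers are hypothetical and not taken from builtin_references. References stored under another distribution type would need a different transformation.

```r
# LMS transformation: Z-score of y given L, M and S
lms_z <- function(y, L, M, S) {
  L <- rep_len(L, length(y))
  ifelse(abs(L) < 1e-8,
         log(y / M) / S,               # limiting case L = 0
         ((y / M)^L - 1) / (L * S))    # general case
}

# Hypothetical LMS values at a single age
z <- lms_z(y = c(40, 50, 60), L = 1, M = 50, S = 0.1)
z
```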

Note

No references for population "gsed" exist. The function will silently rewrite population = "gsed" into population = "gsed".

The "dutch" reference was published in Van Buuren (2014). The "gcdg" reference was calculated from 15 cohorts with direct observations (Weber, 2019). The "phase1" references were calculated from the GSED Phase 1 validation data (GSED-BGD, GSED-PAK, GSED-TZA) and cover the age range 2 weeks to 3.5 years. The age range 3.5-5 years is linearly extrapolated and is only indicative. The "preliminary_standards" references were calculated from the GSED Phase 1 validation data using a subset of children with healthy development.

References

Van Buuren S (2014). Growth charts of human development. Stat Methods Med Res, 23(4), 346-368.

Examples

# see key-population combinations of builtin_references
table(builtin_references$key, builtin_references$population)

# get the default reference
reftab <- get_reference()
head(reftab, 2)

# get the default reference for the key "gsed2212"
reftab <- get_reference(key = "gsed2212", verbose = TRUE)

# get dutch reference for default key
reftab <- get_reference(population = "dutch", verbose = TRUE)

# loading a non-existing reference yields zero rows
reftab <- get_reference(population = "france", verbose = TRUE)
nrow(reftab)

Obtain difficulty parameters from item bank

Description

Searches the item bank for matching items, and returns the difficulty estimates. Matching is done by item name. Comparisons are done in lower case.

Usage

get_tau(
  items,
  key = NULL,
  itembank = dscore::builtin_itembank,
  verbose = FALSE
)

Arguments

items

key

itembank

verbose

Logical. Print settings.

Value

A named vector with the difficulty estimate per item with length(items) elements.

Author(s)

Stef van Buuren 2020

Examples

# difficulty levels in the GHAP lexicon
get_tau(items = c("ddifmd001", "DDigmd052", "xyz"))
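The case-insensitive name matching can be sketched in base R. The mini item bank, the helper lookup_tau() and the difficulty values are made up, and the choice to return NA for unmatched names is an assumption of this sketch.

```r
# Hypothetical mini item bank; names and difficulties are invented
bank <- data.frame(
  item = c("abcdef001", "abcdef002"),
  tau  = c(12.5, 20.1),
  stringsAsFactors = FALSE
)

# Match on lower-cased names and return one value per requested item
lookup_tau <- function(items, bank) {
  idx <- match(tolower(items), tolower(bank$item))
  stats::setNames(bank$tau[idx], items)
}

res <- lookup_tau(c("ABCdef001", "xyz"), bank)
res
```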

Sample of 10 children from the GSED Phase 1 study

Description

A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.

Usage

gsample

Format

A data.frame with 10 rows and 295 variables:

Name	Label
`id`	Integer, child ID
`agedays`	Integer, age in days
`gpalac001`	Integer, Cry when hungry...: 1 = yes, 0 = no, NA = not administered
`gpalac002`	Integer, Look at/focus...: 1 = yes, 0 = no, NA = not administered
`...`	and so on..

There are 138 gpa items (item gpamoc008 (clench fists) removed) from GSED SF and 155 gto items from GSED LF.

Examples

head(gsample)

Outcomes on developmental milestones for preterm-born children

Description

A demo dataset with developmental scores at the item level for a set of 27 preterm children.

Usage

milestones

Format

A data.frame with 100 rows and 62 variables:

Name	Label
`id`	Integer, child ID
`agedays`	Integer, age in days
`age`	Numeric, decimal age in years
`sex`	Character, "male", "female"
`gagebrth`	Integer, gestational age in days
`ddifmd001`	Integer, Fixates eyes: 1 = yes, 0 = no
`...`	and so on..

Examples

head(milestones)

Normalize distribution

Description

Normalizes the distribution so that the total mass equals 1.

Usage

normalize(d, qp)

Arguments

d

A vector with length(qp) elements representing the unscaled density at each quadrature point.

qp

Vector of equally spaced quadrature points.

Value

A vector of length(d) elements with the prior density estimate at each quadrature point.

Note

Internal function

Examples

dscore:::normalize(c(5, 10, 5), qp = c(0, 1, 2))

sum(dscore:::normalize(rnorm(5), qp = 1:5))
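One way to normalize a density on an equally spaced grid is sketched below. The helper normalize_sketch() is hypothetical and assumes that total mass means the sum of density times grid spacing. The internal function may differ in detail.

```r
# Rescale d so that sum(d * step) equals 1 on an equally spaced grid
normalize_sketch <- function(d, qp) {
  stopifnot(length(d) == length(qp), length(qp) >= 2)
  step <- qp[2] - qp[1]   # spacing between quadrature points
  d / (sum(d) * step)
}

dens <- normalize_sketch(c(5, 10, 5), qp = c(0, 1, 2))
sum(dens * 1)             # total mass on a grid with spacing 1
```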

Calculate posterior for one item given score, difficulty and prior

Description

Calculate posterior for one item given score, difficulty and prior

Usage

posterior(score, tau, prior, qp, scale)

Arguments

score

Integer, either 0 (fail) or 1 (pass)

tau

Numeric, difficulty parameter

prior

Vector of prior values on quadrature points qp

qp

Vector of equally spaced quadrature points

scale

Expansion relative to the logit scale

Details

This function assumes that the difficulties have been estimated by a binary Rasch model, e.g. by rasch.pairwise.itemcluster() of the sirt package.
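The update for a single item can be sketched as follows, assuming a binary Rasch model with a logistic pass probability on the scaled D-score metric. The helper posterior_sketch() and all numbers are illustrative assumptions, not the package's internal code.

```r
# Posterior on the quadrature grid after observing one item score
posterior_sketch <- function(score, tau, prior, qp, scale) {
  stopifnot(score %in% c(0, 1), length(prior) == length(qp))
  p_pass <- plogis((qp - tau) / scale)            # pass probability at each point
  lik <- if (score == 1) p_pass else 1 - p_pass   # likelihood of the observed score
  post <- prior * lik                             # Bayes rule, unnormalized
  post / (sum(post) * (qp[2] - qp[1]))            # rescale to unit mass on the grid
}

qp <- seq(0, 100, by = 1)
prior <- dnorm(qp, mean = 50, sd = 10)
post_pass <- posterior_sketch(1, tau = 55, prior = prior, qp = qp, scale = 2)
post_fail <- posterior_sketch(0, tau = 55, prior = prior, qp = qp, scale = 2)

# Posterior means: a pass shifts the mean up, a fail shifts it down
c(pass = sum(qp * post_pass), fail = sum(qp * post_fail))
```

Applying this update item by item, with each posterior serving as the prior for the next item, gives a posterior based on all observed scores.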

Value

A vector of length length(prior)

Note

Internal function

Author(s)

Stef van Buuren, Arjan Huizing, 2020

Rename items from gcdg into gsed lexicon

Description

Function rename_gcdg_gsed() translates item names in the gcdg lexicon to item names in the gsed lexicon.

Usage

rename_gcdg_gsed(x, copy = TRUE)

Arguments

x

A character vector containing item names in the gcdg lexicon

copy

A logical indicating whether any unmatched names should be copied (copy = TRUE) or set to an empty string.

Details

The gsed naming convention is as follows. Positions 1-3 code the instrument, positions 4-5 code the domain, position 6 codes the mode of administration (direct/caregiver/message), and positions 7-9 hold an item sequence number.

The function currently supports ASQ-I (aqi), Barrera-Moncada (bar), Battelle (bat), Bayley I (by1), Bayley II (by2), Bayley III (by3), Dutch Development Instrument (ddi), Denver (den), Griffiths (gri), MacArthur (mac), WHO milestones (mds), Mullen (mul), pegboard (peg), South African Griffiths (sgr), Stanford Binet (sbi), Tepsi (tep), Vineland (vin).

In cases where the domain of the items isn't clear (vin, bar), the domain is coded as 'xx'.

Value

A character vector of length length(x) with gcdg item names replaced by gsed item names.

Author(s)

Iris Eekhout, Stef van Buuren

References

https://docs.google.com/spreadsheets/d/1zLsSW9CzqshL8ubb7K5R9987jF4YGDVAW_NBw1hR2aQ/edit#gid=0

Examples

from <- c(
  "ag28", "gh2_19", "a14ps4", "b1m157", "mil6",
  "bm19", "a16fm4", "n22", "ag9", "gh6_5"
)
to <- rename_gcdg_gsed(from, copy = FALSE)
to

Sample of 10 children from GSED HF

Description

A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.

Usage

sample_hf

Format

A data.frame with 10 rows and 57 variables:

Name	Label
`subjid`	Integer, child ID
`agedays`	Integer, age in days
`hf001`	Integer, ...: 1 = yes, 0 = no, NA = not administered
`hf002`	Integer, ...: 1 = yes, 0 = no, NA = not administered
`...`	and so on..

Sample data for 55 gpa items forming GSED HF V1

Examples

head(sample_hf)

Sample of 10 children from gto (LF)

Description

A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.

Usage

sample_lf

Format

A data.frame with 10 rows and 157 variables:

Name	Label
`subjid`	Integer, child ID
`agedays`	Integer, age in days
`lf001`	Integer, ...: 1 = yes, 0 = no, NA = not administered
`lf002`	Integer, ...: 1 = yes, 0 = no, NA = not administered
`...`	and so on..

Sample data for 155 gto items from GSED LF

Examples

head(sample_lf)

Sample of 10 children from gpa (SF)

Description

A demo dataset with developmental scores at the item level for 10 random children from the GSED Phase 1 data.

Usage

sample_sf

Format

A data.frame with 10 rows and 141 variables:

Name	Label
`subjid`	Integer, child ID
`agedays`	Integer, age in days
`sf001`	Integer, Cry when hungry...: 1 = yes, 0 = no, NA = not administered
`sf002`	Integer, Look at/focus...: 1 = yes, 0 = no, NA = not administered
`...`	and so on..

Sample data for 139 gpa items from GSED SF

Examples

head(sample_sf)

Sorts item names according to user-specified priority

Description

This function sorts the item names according to instrument, domain, mode and number. The user can specify the sorting order.

Usage

sort_itemnames(x, order = "idnm")

order_itemnames(x, order = "idnm")

Arguments

x

A character vector containing item names (gsed lexicon)

order

A four-letter string specifying the sorting order. The four letters are: i for instrument, d for domain, m for mode and n for number. The default is "idnm".

Value

sort_itemnames() returns a character vector with length(x) sorted elements. order_itemnames() returns an integer vector of length length(x) with the positions of the sorted elements.

Author(s)

Stef van Buuren

Examples

itemnames <- c("aqigmc028", "grihsd219", "", "by1mdd157", "mdsgmd006")
sort_itemnames(itemnames)
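How the four-letter order string could drive the sort is sketched below in base R. The helper sort_sketch() is hypothetical and assumes names that follow the 9-position convention. It is not the package's implementation.

```r
# Sort item names by the components named in `order`, in that priority
sort_sketch <- function(x, order = "idnm") {
  parts <- list(
    i = substr(x, 1, 3),  # instrument
    d = substr(x, 4, 5),  # domain
    m = substr(x, 6, 6),  # mode
    n = substr(x, 7, 9)   # number
  )
  keys <- strsplit(order, "")[[1]]
  # method = "radix" gives a locale-independent ordering
  idx <- do.call(base::order, c(unname(parts[keys]), list(method = "radix")))
  x[idx]
}

x <- c("mdsgmd006", "aqigmc028", "by1mdd157")
sort_sketch(x)           # instrument has the highest priority
sort_sketch(x, "nidm")   # item number has the highest priority
```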
