For binary choice data from tasks not explicitly designed to titrate out indifference points (as in an adjusting amount procedure), there are a few widely used traditional scoring methods for quantifying discounting.
One such method is the scoring system designed for the Monetary Choice Questionnaire (Kirby, 1999):
data("td_bc_single_ptpt")
mod <- kirby_score(td_bc_single_ptpt)
print(mod)
#>
#> Temporal discounting indifference point model
#>
#> Discount function: hyperbolic, with coefficients:
#>
#> k
#> 0.02176563
#>
#> ED50: 45.9439876371218
#> AUC: 0.0551987542147013

Although this method computes \(k\) values according to the hyperbolic discount function, in principle it's possible to use other single-parameter discount functions (though this is not an established practice and should be considered an experimental feature of tempodisco):
mod_exp <- kirby_score(td_bc_single_ptpt, discount_function = 'exponential')
print(mod_exp)
#>
#> Temporal discounting indifference point model
#>
#> Discount function: exponential, with coefficients:
#>
#> k
#> 0.008170247
#>
#> ED50: 84.8379742026149
#> AUC: 0.0335100135964859
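The printed ED50 values can be checked by hand. ED50 is the delay at which the discount function falls to 0.5, which for the hyperbolic function \(\frac{1}{1 + kt}\) gives \(1/k\), and for the exponential \(e^{-kt}\) gives \(\log(2)/k\). A quick base-R check against the fitted \(k\) values above:

```r
# ED50 is the delay t at which f(t) = 0.5.
# Hyperbolic:  1 / (1 + k t) = 0.5  =>  t = 1 / k
# Exponential: exp(-k t) = 0.5      =>  t = log(2) / k
k_hyp <- 0.02176563   # k from the hyperbolic fit above
k_exp <- 0.008170247  # k from the exponential fit above
1 / k_hyp       # ~45.94, matching the ED50 printed for the hyperbolic fit
log(2) / k_exp  # ~84.84, matching the ED50 printed for the exponential fit
```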
mod_pow <- kirby_score(td_bc_single_ptpt, discount_function = 'power')
print(mod_pow)
#>
#> Temporal discounting indifference point model
#>
#> Discount function: power, with coefficients:
#>
#> k
#> 0.3052023
#>
#> ED50: 8.69012421478859
#> AUC: 0.1173431135372
mod_ari <- kirby_score(td_bc_single_ptpt, discount_function = 'arithmetic')
print(mod_ari)
#>
#> Temporal discounting indifference point model
#>
#> Discount function: arithmetic, with coefficients:
#>
#> k
#> 1.034244
#>
#> ED50: 95.8739950116839
#> AUC: 0.0262488714558682

It is also possible to use the logistic regression method of Wileyto et al. (2004), where we can solve for the \(k\) value of the hyperbolic discount function in terms of the regression coefficients:
mod <- wileyto_score(td_bc_single_ptpt)
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
print(mod)
#>
#> Temporal discounting binary choice linear model
#>
#> Discount function: hyperbolic from model hyperbolic.1, with coefficients:
#>
#> k
#> 0.04372626
#>
#> Call: glm(formula = fml, family = binomial(link = "logit"), data = data)
#>
#> Coefficients:
#> .B1 .B2
#> 0.49900 0.02182
#>
#> Degrees of Freedom: 70 Total (i.e. Null); 68 Residual
#> Null Deviance: 97.04
#> Residual Deviance: 37.47 AIC: 41.47

The Wileyto et al. (2004) approach turns out to be possible for other discount functions as well (Kinley, Oluwasola & Becker, 2025):
| Name | Discount function | Linear predictor | Parameters |
|---|---|---|---|
| hyperbolic.1 | Hyperbolic (Mazur, 1987): \(\frac{1}{1 + kt}\) | \(\beta_1 \left(1 - \frac{v_D}{v_I} \right) + \beta_2 t\) | \(k = \frac{\beta_2}{\beta_1}\) |
| hyperbolic.2 | Hyperbolic (Mazur, 1987): \(\frac{1}{1 + kt}\) | \(\beta_1\left( \sigma^{-1}\left[\frac{v_\mathcal{I}}{v_\mathcal{D}}\right] + \log t \right) + \beta_2\) | \(k = e^\frac{\beta_2}{\beta_1}\) |
| exponential.1 | Exponential (Samuelson, 1937): \(e^{-kt}\) | \(\beta_1 \log \frac{v_I}{v_D} + \beta_2 t\) | \(k = \frac{\beta_2}{\beta_1}\) |
| exponential.2 | Exponential (Samuelson, 1937): \(e^{-kt}\) | \(\beta_1\left( G^{-1}\left[\frac{v_\mathcal{I}}{v_\mathcal{D}}\right] + \log t \right) + \beta_2\) | \(k = e^\frac{\beta_2}{\beta_1}\) |
| power | Power (Harvey, 1986): \(\frac{1}{(1 + t)^k}\) | \(\beta_1 \log \frac{v_I}{v_D} + \beta_2 \log (1 + t)\) | \(k = \frac{\beta_2}{\beta_1}\) |
| scaled-exponential | Scaled exponential (beta-delta; Laibson, 1997): \(w e^{-kt}\) | \(\beta_1\log\frac{v_{I}}{v_{D}} + \beta_2 t + \beta_3\) | \(k = \frac{\beta_2}{\beta_1}\), \(w = e^{-\frac{\beta_3}{\beta_1}}\) |
| nonlinear-time-hyperbolic | Nonlinear-time hyperbolic (Rachlin, 2006): \(\frac{1}{1 + k t^s}\) | \(\beta_1 \sigma^{-1}\left[\frac{v_{I}}{v_{D}}\right] + \beta_2\log t + \beta_3\) | \(k = e^\frac{\beta_3}{\beta_1}\), \(s = \frac{\beta_2}{\beta_1}\) |
| nonlinear-time-exponential | Nonlinear-time exponential (Ebert & Prelec, 2007): \(e^{-kt^s}\) | \(\beta_1 G^{-1}\left[\frac{v_\mathcal{I}}{v_\mathcal{D}}\right] + \beta_2\log t + \beta_3\) | \(k = e^\frac{\beta_3}{\beta_1}\), \(s = \frac{\beta_2}{\beta_1}\) |
Here \(\sigma^{-1}[\cdot]\) is the logit function (the quantile function of a standard logistic distribution) and \(G^{-1}[\cdot]\) is the quantile function of a standard Gumbel distribution.
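To make the logic of these linear models concrete, here is a standalone base-R sketch (using simulated data with hypothetical values, not the td_bc_single_ptpt dataset or the tempodisco API): choices are generated from the hyperbolic.1 linear predictor \(\beta_1(1 - v_D/v_I) + \beta_2 t\), and \(k\) is then recovered from the fitted glm coefficients as \(\beta_2/\beta_1\).

```r
set.seed(1)
# Simulate binary choices from the hyperbolic.1 model with known coefficients.
b1 <- 4; b2 <- 0.02                 # implied k = b2 / b1 = 0.005
t_del <- rep(c(7, 30, 90, 180, 365), times = 60)  # delays in days
v_d   <- 200                                      # delayed reward amount
v_i   <- runif(length(t_del), 0.1 * v_d, v_d)     # immediate reward amounts
eta   <- b1 * (1 - v_d / v_i) + b2 * t_del        # linear predictor
imm   <- rbinom(length(t_del), 1, plogis(eta))    # 1 = chose immediate
# Refit with intercept-free logistic regression and recover k.
fit   <- glm(imm ~ 0 + I(1 - v_d / v_i) + t_del, family = binomial)
k_hat <- unname(coef(fit)[2] / coef(fit)[1])
k_hat  # should be near the true value b2 / b1 = 0.005
```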
We can test all of these and select the best according to the Bayesian Information Criterion as follows:
mod <- td_bclm(td_bc_single_ptpt, model = 'all')
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
print(mod)
#>
#> Temporal discounting binary choice linear model
#>
#> Discount function: exponential from model exponential.2, with coefficients:
#>
#> k
#> 0.01003216
#>
#> Call: glm(formula = fml, family = binomial(link = "logit"), data = data)
#>
#> Coefficients:
#> .B1 .B2
#> 3.597 -16.553
#>
#> Degrees of Freedom: 70 Total (i.e. Null); 68 Residual
#> Null Deviance: 97.04
#> Residual Deviance: 15.7 AIC: 19.7

To explore a wider range of discount functions, we can fit a nonlinear model by calling td_bcnm. The full list of built-in discount functions is as follows:
| Name | Functional form | Notes |
|---|---|---|
| exponential (Samuelson, 1937) | \(f(t; k) = e^{-k t}\) | |
| hyperbolic (Mazur, 1987) | \(f(t; k) = \frac{1}{1 + kt}\) | |
| scaled-exponential (Laibson, 1997) | \(f(t; k, w) = w e^{-k t}\) | Also known as quasi-hyperbolic or beta-delta and written as \(f(t; \beta, \delta) = \beta e^{-\delta t}\) |
| nonlinear-time-exponential (Ebert & Prelec, 2007) | \(f(t; k, s) = e^{-k t^s}\) | Also known as constant sensitivity |
| inverse-q-exponential (Green & Myerson, 2004) | \(f(t; k, s) = \frac{1}{(1 + k t)^s}\) | Also known as generalized hyperbolic (Loewenstein & Prelec), hyperboloid (Green & Myerson, 2004), or q-exponential (Han & Takahashi, 2012) |
| nonlinear-time-hyperbolic (Rachlin, 2006) | \(f(t; k, s) = \frac{1}{1 + k t^s}\) | Also known as power-function (Rachlin, 2006) |
| dual-systems-exponential (van den Bos & McClure, 2013) | \(f(t; k_1, k_2, w) = w e^{-k_1 t} + (1 - w) e^{-k_2 t}\) | |
| additive-utility (Killeen, 2009) | \(f(t; k, s, a) = \left( 1 - \frac{k}{V_D^a}t^s\right)^\frac{1}{a}\) | \(V_D\) is the value of the delayed reward. \(f(t; k, s, a) = 0\) for \(t > \left(V_D^a / k\right)^{1/s}\). |
| power (Harvey, 1986, eq. 2) | \(f(t; k) = \frac{1}{(1 + t)^k}\) | In equation 2 of the reference, the discount function is written as \(\frac{1}{t^k}\), but time begins at \(t = 1\). |
| arithmetic (Doyle & Chen, 2010) | \(f(t; k) = 1 - \frac{kt}{V_D}\) | \(V_D\) is the value of the delayed reward. \(f(t; k) = 0\) for \(kt > V_D\). |
| fixed-cost (Benhabib, Bisin, & Schotter, 2010) | \(f(t; k, w) = e^{-kt} - \frac{w}{V_D}\) | \(V_D\) is the value of the delayed reward. \(f(t; k, w) = 0\) for \(\frac{w}{V_D} > e^{-kt}\). |
| absolute-stationarity (Blavatskyy, 2024, eq. 3) | \(f(t; k, s) = \exp\left\{ -k\frac{ts}{ts + 1}\right\}\) | The original paper uses \(t\) rather than \(ts\). However, a scale factor appears necessary to account for different time units. |
| relative-stationarity (Blavatskyy, 2024, eq. 7) | \(f(t; k, s) = \left( \frac{ts + 1}{2ts + 1} \right)^k\) | The original paper uses \(t\) rather than \(ts\). However, a scale factor appears necessary to account for different time units. |
| constant (Franck et al., 2015) | \(f(t; k) = k\) | Null model; participants can be excluded if this model provides the best fit (Franck et al., 2015) |
| nonlinear-time-power | \(f(t; k, s) = \frac{1}{(1 + t^s)^k}\) | Experimental extension of the power discount function along the lines of the nonlinear-time-hyperbolic and nonlinear-time-exponential functions. |
| nonlinear-time-arithmetic | \(f(t; k, s) = 1 - \frac{kt^s}{V_D}\) | Experimental extension of the arithmetic discount function along the lines of the nonlinear-time-hyperbolic and nonlinear-time-exponential functions. |
| scaled-hyperbolic | \(f(t; k, w) = \frac{w}{1 + kt}\) | Experimental extension of the hyperbolic discount function along the lines of the scaled-exponential function. |
mod <- td_bcnm(td_bc_single_ptpt, discount_function = 'all')
print(mod)
#>
#> Temporal discounting binary choice model
#>
#> Discount function: arithmetic, with coefficients:
#>
#> k gamma
#> 1.05682221 0.08007441
#>
#> Config:
#> noise_dist: logis
#> gamma_scale: linear
#> transform: identity
#>
#> ED50: 93.8257558942613
#> AUC: 0.0256880960367851
#> BIC: 25.5896625868426

Several additional arguments can be used to customize the model. For example, we can use different choice rules: the "logistic" choice rule is the default, but the "probit" and "power" choice rules are also available (see this tutorial for more details).
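As a standalone base-R illustration of how the logistic and probit choice rules differ (this shows only the underlying link functions, not the tempodisco API): both map a linear predictor to a choice probability, the logistic via the standard logistic CDF and the probit via the standard normal CDF.

```r
eta <- c(-2, -1, 0, 1, 2)  # hypothetical linear predictor values
p_logistic <- plogis(eta)  # logistic choice rule: standard logistic CDF
p_probit   <- pnorm(eta)   # probit choice rule: standard normal CDF
round(cbind(eta, p_logistic, p_probit), 3)
# Both rules give P = 0.5 at eta = 0; the probit curve is steeper,
# with thinner tails at extreme values of eta.
```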
It is also possible to fit an error rate \(\epsilon\) that describes the probability of the participant making a response error (see Vincent, 2015). I.e.: \[P(\text{imm}) = \epsilon + (1 - 2\epsilon) g^{-1}[\eta]\] where \(P(\text{imm})\) is the probability of choosing the immediate reward, \(g\) is the link function, and \(\eta\) is the linear predictor.
data("td_bc_study")
# Select the second participant
second_ptpt_id <- unique(td_bc_study$id)[2]
df <- subset(td_bc_study, id == second_ptpt_id)
mod <- td_bcnm(df, discount_function = 'exponential', fit_err_rate = T)
plot(mod, type = 'endpoints', verbose = F)
lines(c(0, 1), c(0, 0), lty = 2)
lines(c(0, 1), c(1, 1), lty = 2)

We can see that the probability of choosing the immediate reward doesn't approach 0 or 1 but instead approaches a value of \(\epsilon \approx 0.11\).
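The error-rate equation above can be sketched directly in base R, using a logistic inverse link for \(g^{-1}\) (the function name here is illustrative, not part of tempodisco):

```r
# P(imm) = eps + (1 - 2 * eps) * g_inv(eta), with a logistic inverse link.
p_imm <- function(eta, eps) eps + (1 - 2 * eps) * plogis(eta)
p_imm(c(-10, 0, 10), eps = 0.11)
# For extreme eta the probability flattens near 0.11 and 0.89
# rather than approaching 0 and 1.
```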
Alternatively, we might expect that participants should never choose
an immediate reward worth 0 and should never choose a delayed reward
worth the same face amount as an immediate reward (Kinley, Oluwasola &
Becker, 2025; see here
for more details). We can control this by setting
fixed_ends = T, which “fixes” the endpoints of the
psychometric curve, where val_imm = 0 and
val_imm = val_del, at 0 and 1, respectively:
mod <- td_bcnm(df, discount_function = 'exponential', fixed_ends = T)
plot(mod, type = 'endpoints', verbose = F, del = 50, val_del = 200)