An example of market-clearing assessment

This short tutorial gives an example of statistically assessing whether a market is in an equilibrium state. The tutorial assumes some familiarity with the concepts and the functionality of the package. The basic_usage vignette can be helpful in acquiring this familiarity.

Setup the environment

Load the required libraries.

library(markets)

Prepare the data. Here, we simply simulate data using a data generating process for a market in equilibrium.

nobs <- 1000
tobs <- 5

alpha_d <- -3.9
beta_d0 <- 28.9
beta_d <- c(2.1, -0.7)
eta_d <- c(3.5, 6.25)

alpha_s <- 2.8
beta_s0 <- 26.2
beta_s <- c(2.65)
eta_s <- c(1.15, 4.2)

sigma_d <- 0.8
sigma_s <- 1.1
rho_ds <- 0.0

seed <- 42

eq_data <- simulate_data(
  "equilibrium_model", nobs, tobs,
  alpha_d, beta_d0, beta_d, eta_d,
  alpha_s, beta_s0, beta_s, eta_s,
  NA, NA, c(NA),
  sigma_d = sigma_d, sigma_s = sigma_s, rho_ds = rho_ds,
  seed = seed
)

Estimate the models

Prepare the basic parameters for model initialization.

verbose <- 2
correlated_shocks <- TRUE
formula <-   Q | P | id | date ~ P + Xd1 + Xd2 + X1 + X2 | P + Xs1 + X1 + X2

Set the estimation parameters.

optimization_method <- "BFGS"
optimization_options <- list(maxit = 10000, reltol = 1e-8)

Using the above parameterization, construct and estimate the model objects. Here we estimate two equilibrium models and four disequilibrium models. All the models are constructed using the simulated data from a model of market in equilibrium.

eq_reg <- equilibrium_model(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(method = "2SLS")
)
#> Info: This is Equilibrium model.
#> Warning: Removing unobserved '1' level(s).
eq_fit <- equilibrium_model(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info: This is Equilibrium model.
#> Warning: Removing unobserved '1' level(s).
bs_fit <- diseq_basic(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info: This is Basic model.
#> Warning: Removing unobserved '1' level(s).
da_fit <- diseq_deterministic_adjustment(
  formula, eq_data,
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info: This is Deterministic Adjustment model.
#> Info: Dropping 1000 rows to generate 'LAGGED_P'.
#> Info: Sample separated with 2000 rows in excess supply and 2000 rows in
#>   excess demand states.

Post estimation analysis

Summaries

All the models provide estimates for the simulated data. Even with simulated data, it is difficult to assess which model performs better by examining only the summaries in separation or collectively.

summary(eq_reg)
#> Equilibrium Model for Markets in Equilibrium:
#>   Demand RHS        :   D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        :   S_P + S_Xs1 + S_X1 + S_X2
#>   Market Clearing   : Q = D_Q = S_Q
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Least squares estimation:
#>   Method              : 2SLS
#> 
#> Shocks:
#>   D_VARIANCE          : 0.649543
#>   S_VARIANCE          : 1.28046
#>   RHO                 : -0.0191552
#> 
#> First Stage:
#> 
#> Call:
#> lm(formula = first_stage_formula, data = object@data)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.88388 -0.14215  0.00305  0.14088  0.81967 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  0.402566   0.003302  121.91   <2e-16 ***
#> Xd1          0.313982   0.003288   95.50   <2e-16 ***
#> Xd2         -0.103262   0.003274  -31.54   <2e-16 ***
#> X1           0.348932   0.003262  106.98   <2e-16 ***
#> X2           0.306357   0.003222   95.07   <2e-16 ***
#> Xs1         -0.398531   0.003284 -121.36   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.2087 on 3994 degrees of freedom
#> Multiple R-squared:  0.9194, Adjusted R-squared:  0.9193 
#> F-statistic:  9110 on 5 and 3994 DF,  p-value: < 2.2e-16
#> 
#> 
#> Demand Equation:
#> 
#> Call:
#> lm(formula = demand_formula, data = object@data)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -2.41586 -0.50003  0.00523  0.49533  2.71617 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 28.89089    0.01657 1744.02   <2e-16 ***
#> P_FITTED    -3.87656    0.02886 -134.32   <2e-16 ***
#> Xd1          2.10510    0.01475  142.74   <2e-16 ***
#> Xd2         -0.70778    0.01190  -59.49   <2e-16 ***
#> X1           3.50747    0.01514  231.61   <2e-16 ***
#> X2           6.24655    0.01436  434.93   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.7311 on 3994 degrees of freedom
#> Multiple R-squared:  0.9849, Adjusted R-squared:  0.9849 
#> F-statistic: 5.214e+04 on 5 and 3994 DF,  p-value: < 2.2e-16
#> 
#> 
#> Supply Equation:
#> 
#> Call:
#> lm(formula = supply_formula, data = object@data)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -2.4183 -0.4975  0.0040  0.4951  2.7379 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 26.18624    0.01803 1452.13   <2e-16 ***
#> P_FITTED     2.84269    0.03483   81.62   <2e-16 ***
#> Xs1          2.67766    0.01818  147.27   <2e-16 ***
#> X1           1.16279    0.01658   70.13   <2e-16 ***
#> X2           4.18829    0.01569  266.97   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.7312 on 3995 degrees of freedom
#> Multiple R-squared:  0.9849, Adjusted R-squared:  0.9849 
#> F-statistic: 6.517e+04 on 4 and 3995 DF,  p-value: < 2.2e-16
summary(eq_fit)
#> Equilibrium Model for Markets in Equilibrium:
#>   Demand RHS        :   D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        :   S_P + S_Xs1 + S_X1 + S_X2
#>   Market Clearing   : Q = D_Q = S_Q
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation:
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>   -3.87656   28.89089    2.10510   -0.70778    3.50747    6.24655    2.84269 
#>    S_CONST      S_Xs1       S_X1       S_X2 D_VARIANCE S_VARIANCE        RHO 
#>   26.18624    2.67766    1.16279    4.18829    0.64954    1.28046   -0.01916 
#> 
#> Coefficients:
#>              Estimate Std. Error     z value      Pr(>|z|) 
#>  D_P        -3.8765719 0.03186718 -121.647776  0.000000e+00 ***
#>  D_CONST    28.8908986 0.01829018 1579.584931  0.000000e+00 ***
#>  D_Xd1       2.1050283 0.01626942  129.385552  0.000000e+00 ***
#>  D_Xd2      -0.7080066 0.01315280  -53.829327  0.000000e+00 ***
#>  D_X1        3.5074834 0.01672158  209.757904  0.000000e+00 ***
#>  D_X2        6.2465545 0.01585720  393.925394  0.000000e+00 ***
#>  S_P         2.8428079 0.05398669   52.657571  0.000000e+00 ***
#>  S_CONST    26.1861986 0.02795397  936.761226  0.000000e+00 ***
#>  S_Xs1       2.6776937 0.02818529   95.003218  0.000000e+00 ***
#>  S_X1        1.1627557 0.02570360   45.237072  0.000000e+00 ***
#>  S_X2        4.1882570 0.02431923  172.219984  0.000000e+00 ***
#>  D_VARIANCE  0.6493577 0.01586064   40.941451  0.000000e+00 ***
#>  S_VARIANCE  1.2802231 0.03545676   36.106606 1.786441e-285 ***
#>  RHO        -0.0191747 0.01810478   -1.059096  2.895561e-01  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> -2 log L: 6722.675
summary(bs_fit)
#> Basic Model for Markets in Disequilibrium:
#>   Demand RHS        :   D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        :   S_P + S_Xs1 + S_X1 + S_X2
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation:
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    -3.3899    28.6909     1.9497    -0.6543     3.3398     6.0968     1.5864 
#>    S_CONST      S_Xs1       S_X1       S_X2 D_VARIANCE S_VARIANCE        RHO 
#>    26.6854     2.1697     1.5963     4.5814     0.6012     1.0378     0.0000 
#> 
#> Coefficients:
#>              Estimate Std. Error   z value      Pr(>|z|) 
#>  D_P        -3.5088088 0.04551884 -77.08476  0.000000e+00 ***
#>  D_CONST    29.1228978 0.03417797 852.09566  0.000000e+00 ***
#>  D_Xd1       2.0718884 0.02548591  81.29544  0.000000e+00 ***
#>  D_Xd2      -0.7088276 0.01664702 -42.57985  0.000000e+00 ***
#>  D_X1        3.3652961 0.02550096 131.96745  0.000000e+00 ***
#>  D_X2        6.1435192 0.02476660 248.05664  0.000000e+00 ***
#>  S_P         2.0706215 0.10815459  19.14502  1.065016e-81 ***
#>  S_CONST    27.8541734 0.05866940 474.76489  0.000000e+00 ***
#>  S_Xs1       2.5828274 0.06442783  40.08869  0.000000e+00 ***
#>  S_X1        1.4475854 0.05411164  26.75183 1.175668e-157 ***
#>  S_X2        4.3986442 0.05382927  81.71473  0.000000e+00 ***
#>  D_VARIANCE  0.6386261 0.02339828  27.29373 5.035179e-164 ***
#>  S_VARIANCE  1.1248997 0.07255514  15.50407  3.256172e-54 ***
#>  RHO        -0.4708618 0.04042810 -11.64689  2.379768e-31 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> -2 log L: 8338.912
summary(da_fit)
#> Deterministic Adjustment Model for Markets in Disequilibrium:
#>   Demand RHS        :   D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        :   S_P + S_Xs1 + S_X1 + S_X2
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Separation Rule   : P_DIFF analogous to (D_Q - S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Demand Obs = 2000, Supply Obs = 2000
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation:
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#> -3.3899420 28.6908902  1.9497497 -0.6543147  3.3398395  6.0967924  1.5864480 
#>    S_CONST      S_Xs1       S_X1       S_X2     P_DIFF D_VARIANCE S_VARIANCE 
#> 26.6853581  2.1696502  1.5962938  4.5814415 -0.0007971  0.6012371  1.0378385 
#>        RHO 
#>  0.0000000 
#> 
#> Coefficients:
#>                Estimate Std. Error       z value      Pr(>|z|) 
#>  D_P        -3.875716134 0.03401979 -113.92535355  0.000000e+00 ***
#>  D_CONST    28.891196214 0.01878435 1538.04621654  0.000000e+00 ***
#>  D_Xd1       2.105016163 0.01627352  129.35220630  0.000000e+00 ***
#>  D_Xd2      -0.707967193 0.01315450  -53.81940567  0.000000e+00 ***
#>  D_X1        3.507453596 0.01672789  209.67704234  0.000000e+00 ***
#>  D_X2        6.246539693 0.01585981  393.85974949  0.000000e+00 ***
#>  S_P         2.842225125 0.05500796   51.66933974  0.000000e+00 ***
#>  S_CONST    26.187074381 0.03101179  844.42312307  0.000000e+00 ***
#>  S_Xs1       2.677786510 0.02819124   94.98646042  0.000000e+00 ***
#>  S_X1        1.162670211 0.02570939   45.22355534  0.000000e+00 ***
#>  S_X2        4.188174719 0.02432559  172.17158032  0.000000e+00 ***
#>  P_DIFF      0.001552936 0.02179850    0.07124051  9.432063e-01  
#>  D_VARIANCE  0.649372950 0.01586510   40.93091422  0.000000e+00 ***
#>  S_VARIANCE  1.280272325 0.03546419   36.10042407 2.233524e-285 ***
#>  RHO        -0.019190550 0.01810735   -1.05982079  2.892261e-01  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> -2 log L: 6722.67

Model selection

The deterministic adjustment model has price dynamics that are analogous to excess demand and estimates one extra parameter. The directional model estimates one parameter less as the model does not have enough equations to identify prices in both demand and supply equations. The estimated parameters are summarized as follows.

da_coef <- coef(da_fit)
coef_names <- names(da_coef)

sim_coef <- c(
  alpha_d, beta_d0, beta_d, eta_d,
  alpha_s, beta_s0, beta_s, eta_s,
  NA,
  sigma_d, sigma_s,
  rho_ds
)

coef_tbl <- function(fit) {
    dt <- data.frame(names(coef(fit)), coef(fit))
    names(dt) <- c("coef", substitute(fit))
    dt
}

comp <- coef_tbl(da_fit) |> 
    dplyr::left_join(coef_tbl(bs_fit), by = "coef") |>
    dplyr::left_join(coef_tbl(eq_reg), by = "coef") |>
    dplyr::left_join(coef_tbl(eq_fit), by = "coef") |>
    dplyr::mutate(sim = sim_coef) |>
    dplyr::mutate(sim = sim_coef,
                  da_fit_err = abs(da_fit - sim),
                  bs_fit_err = abs(bs_fit - sim),
                  eq_fit_err = abs(eq_fit - sim)) 

comp
#>          coef       da_fit     bs_fit      eq_reg     eq_fit   sim  da_fit_err
#> 1         D_P -3.875716134 -3.5088088 -3.87655763 -3.8765719 -3.90 0.024283866
#> 2     D_CONST 28.891196214 29.1228978 28.89089359 28.8908986 28.90 0.008803786
#> 3       D_Xd1  2.105016163  2.0718884  2.10509733  2.1050283  2.10 0.005016163
#> 4       D_Xd2 -0.707967193 -0.7088276 -0.70777632 -0.7080066 -0.70 0.007967193
#> 5        D_X1  3.507453596  3.3652961  3.50747427  3.5074834  3.50 0.007453596
#> 6        D_X2  6.246539693  6.1435192  6.24655044  6.2465545  6.25 0.003460307
#> 7         S_P  2.842225125  2.0706215  2.84268683  2.8428079  2.80 0.042225125
#> 8     S_CONST 26.187074381 27.8541734 26.18623502 26.1861986 26.20 0.012925619
#> 9       S_Xs1  2.677786510  2.5828274  2.67765683  2.6776937  2.65 0.027786510
#> 10       S_X1  1.162670211  1.4475854  1.16278699  1.1627557  1.15 0.012670211
#> 11       S_X2  4.188174719  4.3986442  4.18828591  4.1882570  4.20 0.011825281
#> 12     P_DIFF  0.001552936         NA          NA         NA    NA          NA
#> 13 D_VARIANCE  0.649372950  0.6386261  0.64954289  0.6493577  0.80 0.150627050
#> 14 S_VARIANCE  1.280272325  1.1248997  1.28046017  1.2802231  1.10 0.180272325
#> 15        RHO -0.019190550 -0.4708618 -0.01915521 -0.0191747  0.00 0.019190550
#>     bs_fit_err  eq_fit_err
#> 1  0.391191222 0.023428107
#> 2  0.222897771 0.009101362
#> 3  0.028111597 0.005028251
#> 4  0.008827565 0.008006611
#> 5  0.134703926 0.007483361
#> 6  0.106480801 0.003445517
#> 7  0.729378471 0.042807896
#> 8  1.654173423 0.013801391
#> 9  0.067172625 0.027693701
#> 10 0.297585420 0.012755653
#> 11 0.198644190 0.011743025
#> 12          NA          NA
#> 13 0.161373855 0.150642286
#> 14 0.024899730 0.180223091
#> 15 0.470861756 0.019174705

Since we have used simulated data, we can calculate the average absolute error of the parameter estimation for each of the models. The population values are unknown in practice, and this calculation is impossible.

model_errors <- colMeans(comp[, grep("err", colnames(comp))], na.rm = TRUE) 
model_errors
#> da_fit_err bs_fit_err eq_fit_err 
#> 0.03675054 0.32116445 0.03680964

Moreover, the average absolute error cannot provide an overall estimation assessment as the market models have different parameter spaces. To assess the overall model performance, one can instead use an information criterion.

fits <- c(da_fit, bs_fit, eq_fit)
model_names <- sapply(fits, function(m) name(m))
model_obs <- sapply(fits, nobs)
aic <- sapply(fits, AIC)
df <- sapply(fits, function(m) attr(logLik(m), "df"))
seltbl <- data.frame(Model = model_names, AIC = aic,
                         D.F = df, Obs. = model_obs,
                         Abs.Error = model_errors) |>
  dplyr::arrange(AIC)
seltbl
#>                               Model      AIC D.F Obs.  Abs.Error
#> eq_fit_err              Equilibrium 6750.675  14 4000 0.03680964
#> da_fit_err Deterministic Adjustment 6752.670  15 4000 0.03675054
#> bs_fit_err                    Basic 8366.912  14 4000 0.32116445