---
title: "7. Miscellaneous models"
output: rmarkdown::html_vignette
bibliography: ../inst/REFERENCES.bib
vignette: >
  %\VignetteIndexEntry{7. Miscellaneous models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r label = setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE, widtht = 65)
options(width = 65)
```
## Paired combinatorial logit model

@KOPP:WEN:00 proposed the *paired combinatorial logit model*, which is
  a nested logit model with nests composed by every combination of two
  alternatives. This model is obtained by using the following $G$
  function :

$$
G(y_1, y_2, ...,
y_n)=\sum_{k=1}^{J-1}\sum_{l=k+1}^J\left(y_k^{1/\lambda_{kl}}+y_l^{1/\lambda_{kl}}
\right)^{\lambda_{kl}}
$$

The *pcl* model is consistent with random utility maximisation if
$0<\lambda_{kl}\leq 1$ and the multinomial logit results if
$\lambda_{kl}=1 \;\forall (k,l)$. The resulting probabilities are :

$$
P_l = \frac{\sum_{k\neq l}e^{V_l/\lambda_{lk}}\left(e^{V_k/\lambda_{lk}} + e^{V_l/\lambda_{lk}}\right)^{\lambda_{lk}-1}}
{\sum_{k=1}^{J-1}\sum_{l=k+1}^{J}\left(e^{V_k/\lambda_{lk}} + e^{V_l/\lambda_{lk}}\right)^{\lambda_{lk}}}
$$

which can be expressed as a sum of $J-1$ product of a conditional
probability of choosing the alternative and the marginal probability
of choosing the nest :

$$
P_l=\sum_{k\neq l}P_{l\mid lk} P_{lk}
$$

with :

$$
P_{l \mid lk} = \frac{e^{V_l/\lambda_{lk}}}{e^{V_k/\lambda_{lk}} + e^{V_l/\lambda_{lk}}}
$$
$$
P_{lk}= \frac{\left(e^{V_k/\lambda_{lk}} +
    e^{V_l/\lambda_{lk}}\right)^{\lambda_{lk}}}{\sum_{k=1}^{J-1}\sum_{l=k+1}^{J}\left(e^{V_k/\lambda_{lk}}
    + e^{V_l/\lambda_{lk}}\right)^{\lambda_{lk}}}
$$

We reproduce the example used by @KOPP:WEN:00 on the same subset of
the `ModeCanada` than the one used by @BHAT:95. Three modes are
considered and there are therefore three nests. The elasticity of the
train-air nest is set to one. To estimate this model, one has to set
the `nests` argument to `"pcl"`. All the nests of two alternatives are then
automatically created. The restriction on the nest elasticity for the
train-air nest is performed by using the `constPar` argument.


```{r }
library("mlogit")
data("ModeCanada", package = "mlogit")
busUsers <- with(ModeCanada, case[choice == 1 & alt == 'bus'])
Bhat <- subset(ModeCanada, ! case %in% busUsers & alt != 'bus' & noalt == 4)
Bhat$alt <- Bhat$alt[drop = TRUE]
Bhat <- dfidx(Bhat, idx = c("case", "alt"), choice = "choice", idnames = c("chid", "alt"))
pcl <- mlogit(choice ~ freq + cost + ivt + ovt, Bhat, reflevel = 'car',
              nests = 'pcl', constPar=c('iv:train.air'))
summary(pcl)
``` 

## The rank-ordered logit model

Sometimes, in stated-preference surveys, the respondents are asked to
give the full rank of their preference for all the alternative, and
not only the prefered alternative. The relevant model for this kind of
data is the rank-ordered logit model, which can be estimated as a
standard multinomial logit model if the data is reshaped
correctly\footnote{see for example [@BEGG:CARD:HAUS:81],
[@CHAP:STAE:82] and [@HAUS:RUUD:87].}

The ranking can be decomposed in a series of choices of the best
alternative within a decreasing set of available alternatives. For
example, with 4 alternatives, the probability that the ranking would
be 3-1-4-2 can be writen as follow :

- alternative 3 is in the first position, the probability is then
  $\frac{e^{\beta^{\top}x_3}}{e^{\beta^{\top}x_1}+e^{\beta^{\top}x_2}+e^{\beta^{\top}x_3}+e^{\beta^{\top}x_4}}$,
- alternative 1 is in second position, the relevant probability is
  the logit probability that 1 is the chosen alternative in the set of
  alternatives (1-2-4) :
  $\frac{e^{\beta^{\top}x_1}}{e^{\beta^{\top}x_1}+e^{\beta^{\top}x_2}+e^{\beta^{\top}x_4}}$,
- alternative 4 is in third position, the relevant probability is
  the logit probability that 4 is the chosen alternative in the set of
  alternatives (2-4) :
  $\frac{e^{\beta^{\top}x_4}}{e^{\beta^{\top}x_2}+e^{\beta^{\top}x_4}}$,
- the probability of the full ranking is then simply the product
  of these 3 probabilities.

This model can therefore simply be fitted as a multinomial logit model
; the ranking for one individual amoung J alternatives is writen as
$J-1$ choices among $J, J-1, ..., 2$ alternatives.

The estimation of the rank-ordered logit model is illustrated using
the `Game` data set [@FOK:PAAP:VAND:12]. Respondents are asked to rank
6 gaming platforms. The covariates are a dummy `own` which indicates
whether a specific platform is curently owned, the age of the
respondent (`age`) and the number of hours spent on gaming per week
(`hours`). The data set is available in wide (`game`) and long
(`game2`) format. In wide format, the consists on $J$ columns which
indicate the ranking of each alternative.

```{r }
data("Game", package = "mlogit")
data("Game2", package = "mlogit")
head(Game,2)
head(Game2, 7)
nrow(Game)
nrow(Game2)
``` 

Note that `Game` contains 91 rows (there are 91 individuals) and that
`Game2` contains 546 rows ($91$ individuals $times$ 6 alternatives)

To use `dfidx`, the `ranked` argument should `TRUE`:

```{r }
G <- dfidx(Game, varying = 1:12, choice = "ch", ranked = TRUE, idnames = c("chid", "alt"))
G <- dfidx(Game2, choice = "ch", ranked = TRUE, idx = c("chid", "platform"),
           idnames = c("chid", "alt"))
head(G)
nrow(G)
``` 

Note that the choice variable is now a logical variable and that the
number of row is now 1820 (91 individuals $\times (6+5+4+3+2)$
alternatives). 

Using `PC` as the reference level, we can then reproduce the
results of the original reference :

```{r }
summary(mlogit(ch ~ own | hours + age, G, reflevel = "PC"))
``` 

## Bibliography