\documentclass{article}
\usepackage{natbib}
\usepackage{amsmath}
\usepackage[margin=1in]{geometry}

\newcommand*\diff{\mathop{}\!\mathrm{d}}
\newcommand{\Lpar}[1]{\left( #1\right)}
\newcommand{\Lbra}[1]{\left\{ #1\right\}}
\newcommand{\Lver}[1]{\left| #1\right|}
\newcommand{\Lvert}[1]{\left\rVert #1\right\lVert}

%\VignetteEngine{knitr::knitr}
%\VignetteIndexEntry{Using Distance to Default}

\begin{document}

<<setup, echo = FALSE>>=
options(width = 60, digits = 4)
knitr::opts_chunk$set(
  echo = TRUE, dpi = 128, message = FALSE, error = FALSE, fig.height = 3.5,
  size = "small")

.def.chunk.hook  <- knitr::knit_hooks$get("chunk")
knitr::knit_hooks$set(chunk = function(x, options) {
  x <- .def.chunk.hook(x, options)
  ifelse(options$size != "normalsize",
         paste0("\\", options$size,"\n\n", x, "\n\n \\normalsize"),
         x)
})
@

\title{Distance to default package}
\author{Benjamin Christoffersen}

\maketitle

This package provides fast functions to work with the Merton's distance to default model. We will only briefly cover the model here. See e.g., \cite{Lando09} for a more complete coverage. Denote the observed market values by $S_t$ and unobserved asset values by $V_t$. We assume that $V_t$ follows a geometric Brownian motion%
%
$$dV_t = \mu V_t\diff t + \sigma V_t \diff W_t$$%
%
We observe the asset values over increments of $dt$ in time. Let $V_k$ denote the value at $t_0 + k \cdot dt$. Thus,%
%
$$V_{k + 1} = V_{k}\exp\Lpar{\Lpar{\mu - \frac{1}{2}\sigma^2}dt + \sigma W_t}$$%
%
We further let $r$ denote the risk free rate, $D_t$ denote debt due at time $t + T$. Then%
%
\begin{align}
C(V_t, D_t, T, \sigma, r) &= V_t N(d_1) - D_t \exp\Lpar{-rT} N(d_1 - \sigma\sqrt T) \nonumber \\
%
d_1 &= \frac{\log(V_t) - \log{D_t} + \Lpar{r + \frac{1}{2}\sigma^2}T}{\sigma\sqrt T} \label{eq:dd} \\
%
S_t &= C(V_t, D_t, T, \sigma, r) \label{eq:stock_relation}
\end{align}%
%
where $C$ is a European call option price, $T$ is the time to maturity, $D_t$ is the debt to due at time $T + t$, and $r$ is the risk free rate. Common choices tend to be $T = 1$ year and $D_t$ is the short term debt plus half of the long term debt.
The distance-to-default is%
%
$$
DD_t = \frac{\log(V_t) - \log(D_t) + (\mu - \sigma^2 / 2)T}{\sigma \sqrt T}
$$%
%
It is a very good predictor of default risk despite it's simplicity
\citep[see e.g.,][]{Bharath08}. However, to compute it we would need the value
of the underlying asset, the drift, and volatility. This package package provides
methods to estimate these.

Equation~\eqref{eq:stock_relation} can be computed with the \verb|BS_call| function. Further, the \verb|get_underlying| function can be used to invert call option price in equation~\eqref{eq:stock_relation}

<<show_call>>=
library(DtD)
(S <- BS_call(100, 90, 1, .1, .3))
get_underlying(S, 90, 1, .1, .3)
@

To illustrate the above then we can simulate the underlying and transform the data into the stock price as follows

<<first_sim>>=
# assign parameters
vol <- .1
mu  <- .05
dt  <- .05
V_0 <- 100
t.  <- (1:50 - 1) * dt
D   <- c(rep(80, 27), rep( 70, length(t.) - 27))
r   <- c(rep( 0, 13), rep(.02, length(t.) - 13))

# simulate underlying
set.seed(seed <- 83992673)
V <- V_0 * exp(
  (mu - vol^2/2) * t. + cumsum(c(
    0, rnorm(length(t.) - 1, sd = vol * sqrt(dt)))))

# compute stock price
S <- BS_call(V, D, T. = 1, r, vol)
plot(t., S, type = "l", xlab = "Time", ylab = expression(S[t]))
@

Despite that the model assume a constant risk free rate than we let it vary in this example. We end by plotting the stock price. Further, we can confirm that we the same underlying after transforming back

<<check_back>>=
all.equal(V, get_underlying(S, D, 1, r, vol))
@

We could also have used the simulation function in the package

<<show_sim_gives_same>>=
set.seed(seed) # use same seed
sims <- BS_sim(
  vol = vol, mu = mu, dt = dt, V_0 = V_0, D = D, r = r, T. = 1)

isTRUE(all.equal(sims$V, V))
isTRUE(all.equal(sims$S, S))
@


\section{Drift and volatility estimation}

There are a few ways to estimate the volatility, $\sigma$,  and drift, $\mu$. This package only includes the iterative procedure and maximum likelihood method covered in \cite{Duan94,Vassalou04,Duan04,Bharath08}. We denote the former as the ``iterative'' method and the latter as the MLE. We have not implemented the method where one solves two simultaneous equation as it will be based on two measurements and may be quite variable \citep[as mentioned in][]{Kmv03}.

The iterative methods is as follows. Start with an initial guess of the volatility and denote this $\hat{\sigma}^{(0)}$. Then for $i = 1,2,\dots$

\begin{enumerate}
  \item compute the underlying asset values $V_k = C^{-1}(S_k, \hat{\sigma}^{(i - 1)})$ where $C^{-1}$ is the inverse of the call option price in equation~\eqref{eq:stock_relation} and implicitly depend on $D_t$, $T$, and $r$. Then compute the $\log$ returns $x_k = \log V_k - \log V_{k-1}$. \label{enum:it_first_step}
%
  \item compute the maximum likelihood estimate as if we observed the $\log$ returns. I.e. compute
  \begin{align}
    \tilde\mu &= \frac{\sum_{k=1}^nx_k}{\sum_{k = 1}^n dt_k} =
         \frac{\log V_n - \log V_0}{\sum_{k = 1}^n dt_k} \nonumber\\
    \left(\hat\sigma^{(i)}\right)^2 &= \frac{1}{n}\sum_{k = 1}^n
        \left(\frac{x_k}{\sqrt{dt_k}} - \sqrt{dt_k}\tilde\mu \right)^2 \nonumber\\
    \hat\mu &= \tilde\mu + \frac{\left(\hat\sigma^{(i)}\right)^2}{2} \label{eqn:iteDrift}
  \end{align}
    where we have extended the model to unequal gaps each with length $dt_k$.
%
  \item Repeat step~\ref{enum:it_first_step} if $(\hat{\sigma}^{(i)}, \hat{\mu}^{(i)})$ is far from $(\hat{\sigma}^{(i - 1)}, \hat{\mu}^{(i - 1)})$. Otherwise stop.
\end{enumerate}

The parameters can be estimated with the \verb|BS_fit| function. The iterative method is used in the following call

<<use_iterative>>=
# simulate data
set.seed(52722174)
sims <- BS_sim(
  vol = .2, mu = .05, dt = 1/252, V_0 = 100, r = .01, T. = 1,
  # simulate firm that grows partly by lending
  D = 70 * (1 + .01 * (0:(252 * 4)) / 252))

# the sims data.frame has a time column. We need to pass this
head(sims$time, 6)

# estimate parameters
it_est <- BS_fit(
  S = sims$S, D = sims$D, T. = sims$T, r = sims$r, time = sims$time,
  method = "iterative")
it_est
@

The volatility is quite close the actual value while the drift is a bit off. This may be due to the fact that the likelihood is flat in the drift. The maximum likelihood estimator is obtained by maximizing the observed log likelihood%
%
\begin{equation}\label{eq:log_like}\begin{split}
L(\mu,\sigma,\vec{S}) & =
  - n \log \Lpar{\sigma^2}
  - \sum_{k = 1}^n
    \frac{\Lpar{\log \frac{C^{-1}(S_k, \sigma)}{C^{-1}(S_{k - 1}, \sigma)} - \Lpar{\mu - \sigma^2/2}dt_k}^2}{\sigma^2dt_k} \\
%
 & \hphantom{=}\hspace{12pt}
  - 2 \sum_{k = 1}^n\Lpar{\log C^{-1}\Lpar{S_k, \sigma}
 + \log \Lver{
  C'\Lpar{C^{-1}\Lpar{S_k, \sigma},\sigma }}} + \dots
\end{split}\end{equation}%
%
where $C^{-1}$ is the inverse of the call option price in equation~\eqref{eq:stock_relation} and implicitly depend on $D_t$, $T$, and $r$. Notice that we need to use $dt_k$ in \eqref{eq:log_like} and the time to maturity, $T$, in $C$ and $C^{-1}$. The last term in equation~\eqref{eq:log_like} follows from the change of variable%
%
\begin{equation}
\begin{split}
  X &= h^{-1}(Y) \\
  f_Y(y) &= f_X\Lpar{h^{-1}\Lpar{y}}\Lver{(h^{-1})'\Lpar{y}} \\
         &= f_X\Lpar{h^{-1}\Lpar{y}}\Lver{\frac{1}{h'\Lpar{h^{-1}\Lpar{y}}}}
\end{split}
\end{equation}%
%
where $f$ denotes a density and the subscript denotes which random variable the density is for.

The first order conditions for the drift, $\mu$, is%
%
\begin{align*}
&& \sum_{k = 1}^n
  x_k - (\mu - \sigma^2/2)dt_k &=
\sum_{k = 1}^n x_k - (\mu - \sigma^2 / 2)
  \sum_{k = 1}^n dt_k \\
&&&= 0 \\
\Leftrightarrow && \mu &= \frac
  {\sum_{k = 1}^n x_k}{\sum_{k = 1}^n dt_k} +  \frac{\sigma^2}{2} \\
&&&=  \frac{\log C^{-1}(S_n, \sigma) - \log C^{-1}(S_0, \sigma)}
  {\sum_{k = 1}^n dt_k} + \frac{\sigma^2}2
\end{align*}%
%
as in Equation \eqref{eqn:iteDrift}. Substituting into the log-likelihood
in Equation \eqref{eq:log_like} yields %
%
\begin{equation*}\begin{split}
L(\sigma,\vec{S}) & =
  - n \log \Lpar{\sigma^2}
  - \sum_{k = 1}^n
    \frac
    {\Lpar{\log \frac{C^{-1}(S_k, \sigma)}{C^{-1}(S_{k - 1}, \sigma)} -
    \frac
    {\log C^{-1}(S_n, \sigma)/C^{-1}(S_0, \sigma)}
    {\sum_{k = 1}^n dt_k}dt_k}^2}{\sigma^2dt_k} \\
%
 & \hphantom{=}\hspace{12pt}
  - 2 \sum_{k = 1}^n\Lpar{\log C^{-1}\Lpar{S_k, \sigma}
 + \log \Lver{
  C'\Lpar{C^{-1}\Lpar{S_k, \sigma},\sigma }}} + \dots
\end{split}\end{equation*}%
%
which we can optimize numerically.


We can estimate the parameters with the MLE method as follows

<<use_mle>>=
mle_est <- BS_fit(
  S = sims$S, D = sims$D, T. = sims$T, r = sims$r, time = sims$time,
  method = "mle")
mle_est
@

The result are usually very similar although they need not to as far as I gather

<<show_diff>>=
it_est$est - mle_est$est
@

The iterative method is faster though

<<show_time>>=
library(microbenchmark)
with(sims,
  microbenchmark(
    iter = BS_fit(
      S = S, D = D, T. = T, r = r, time = time, method = "iterative"),
    mle = BS_fit(
      S = S, D = D, T. = T, r = r, time = time, method = "mle"),
    times = 5))
@

We can also estimate the parameters when there unequal time gaps in the data set

<<include = FALSE, eval = FALSE>>=
out <- replicate(
  1000L, {
    sims <<- BS_sim(
      vol = .2, mu = .05, dt = 1/252, V_0 = 100, r = .01, T. = 1,
      # simulate firm that grows partly by lending
      D = 80 * (1 + .03 * (0:(252 * 1)) / 252))

    BS_fit(S = sims$S, D = sims$D, T. = sims$T, r = sims$r, time = sims$time)$ests
  })
@

<<show_works_with_uneq_gaps>>=
# drop random rows
sims <- sims[sort(sample.int(nrow(sims), 100L)), ]

# the gap lengths are not equal anymore
range(diff(sims$time))

# estimate parameters
BS_fit(
  S = sims$S, D = sims$D, T. = sims$T, r = sims$r, time = sims$time,
  method = "iterative")
@


\medskip

\bibliographystyle{plain}
\bibliography{Distance-to-default}


\end{document}