4 Exotic options and American options

4.1 Probability theory for the binomial model

In this chapter we look at the probability theory behind the ideas we used to price contingent claims in the binomial model. This will allow us to put the important financial concepts on a proper mathematical foundation. This is also good preparation for Epiphany term, when it will be harder to rely on our intuition, and we will need to approach the material using a rigorous mathematical theory.

We begin with the probability space on which the binomial model lives.

Coin Toss Space

Toss a coin \(T\) times and let \(\Omega_T\) denote the set of all possible outcomes. Each element \(\omega \in \Omega_T\) is expressed as \(\omega=\omega_1\omega_2\cdots \omega_T\), where \(\omega_i=\H\) or \(\T\) depending on the outcome of the \(i\)-th coin toss. The set \(\Omega_T\) has a total of \(2^T\) elements. Let \(\F=2^{\Omega_T}\) be the \(\sigma\)-algebra of all subsets of \(\Omega_T\). We can define a probability measure on \(\Omega_T\) as follows. We let \(\heads(\omega)\) be the number of heads in \(\omega= \omega_1\cdots\omega_T\), and \(\tails(\omega)\) be the number of tails. Then, \[ \P(\omega)=p^{\heads(\omega)}(1-p)^{\tails(\omega)}, \;\;\; \P(A)=\sum_{\omega \in A}\P(\omega), \] where \(0<p<1\) is a fixed real number and \(A\in \F\) is any set. For a different \(0<q<1\) we can define a different measure on \(\Omega_T\) as follows: \[ \Q(\omega)=q^{\heads(\omega)}(1-q)^{\tails(\omega)}, \;\;\; \Q(A)=\sum_{\omega \in A}\Q(\omega). \]

Two probability measures \(\P\) and \(\Q\) are said to be equivalent to each other if \[ \P(A)>0 \;\; \mbox{if and only if} \;\; \Q(A)>0. \] The two measures may assign different values to the same event \(A\); equivalence only requires that they agree on whether \(A\) is a possible event (i.e., \(\P(A)>0\) and \(\Q(A)>0\)) or an impossible event (i.e., \(\P(A)=0\) and \(\Q(A)=0\)). For example, the two measures \(\P\) and \(\Q\) above are equivalent to each other as long as \(0<p<1\) and \(0<q<1\), because both then assign strictly positive probability to every single outcome \(\omega\). Typically, we think of \(\P\) as being the objective measure (so that \(p\) is our estimate of how likely a head is to occur in reality), and \(\Q\) as the martingale measure that we must use for pricing (so \(q = q_u\) is the martingale probability of an up-move).
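For small \(T\) the coin toss space can be written out explicitly. The following Python sketch is only an illustration: the function names and the values \(p=\frac12\), \(q=\frac13\), \(T=3\) are our own choices, not part of the model. It builds \(\Omega_T\), the two measures, and checks equivalence by comparing which single outcomes have positive probability (on a finite space this is enough, since \(\P(A)\) is a sum over the outcomes in \(A\)).

```python
from fractions import Fraction
from itertools import product

def coin_toss_space(T):
    """All 2**T outcomes, each written as a string such as 'HTH'."""
    return ["".join(w) for w in product("HT", repeat=T)]

def measure(p, T):
    """Map each outcome w to p**heads(w) * (1-p)**tails(w)."""
    return {w: p ** w.count("H") * (1 - p) ** w.count("T")
            for w in coin_toss_space(T)}

def prob(mu, event):
    """Probability of an event (a set of outcomes) under the measure mu."""
    return sum(mu[w] for w in event)

T = 3
P = measure(Fraction(1, 2), T)   # illustrative choice of p
Q = measure(Fraction(1, 3), T)   # illustrative choice of q

omega = coin_toss_space(T)
A = {w for w in omega if w[0] == "H"}   # the event "first toss is a head"

print(len(omega), sum(P.values()), sum(Q.values()))   # 8 1 1
print(prob(P, A), prob(Q, A))                         # 1/2 1/3

# P and Q disagree on the value of A but agree on which outcomes are possible.
equivalent = all((P[w] > 0) == (Q[w] > 0) for w in omega)
print(equivalent)                                     # True
```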

Binomial Stock Price on \(\Omega_T\)

We define the stock prices \(S_0, S_1, \dots, S_T\) on \(\Omega_T\) as follows: \[ S_0(\omega)=s, \;\; \mbox{for all $\omega\in \Omega_T$}, \] and for each \(t= 1, 2, \dots, T\), \[ S_t(\omega)=Z_t(\omega)S_{t-1}(\omega). \] Here \(Z_1(\omega), Z_2(\omega), \dots, Z_T(\omega)\) are independent identically distributed random variables with \[ Z_t(\omega)=\left \{ \begin{array}{ll} u & \mbox{if $\omega_t=\H$},\\ d& \mbox{if $\omega_t=\T$}. \end{array} \right. \]

Remark. The \(T\) random variables \(Z_1,Z_2,\dots,Z_T\) encapsulate all the randomness available to us in the probability space \((\Omega_T ,\F, \P)\); the \(2^T\) possible real-valued sequences \((z_1,z_2,\dots,z_T)\) that the sequence \((Z_1,Z_2,\dots,Z_T)\) of random variables could take are in one-to-one correspondence with the state space \(\Omega_T\). (Knowing the values of all the random variables \(Z_1,Z_2,\dots,Z_T\) tells us the state \(\omega \in \Omega_T\).) In probabilistic language, we say that \(\F = 2^{\Omega_T}\) is equal to the \(\sigma\)-algebra generated by the random variables \(Z_1, Z_2, \dots, Z_T\), and we write \(\F = \sigma(Z_1,Z_2,\dots,Z_T)\).

Example 4.1 The market with \(s=4\), \(u=2\), \(d=\frac12\) and \(T=3\) has: \[ \begin{split} S_0(\omega_1\omega_2\omega_3)&=4, \quad \mbox{for all $\omega\in \Omega_T$};\\ S_1(\omega_1\omega_2\omega_3)&=\begin{cases} 8& \mbox{if $\omega_1=\H$},\\ 2& \mbox{if $\omega_1=\T$}; \end{cases} \\ S_2(\omega_1\omega_2\omega_3)&=\begin{cases} 16& \mbox{if $\omega_1=\omega_2=\H$},\\ 4& \mbox{if $\omega_1 \neq \omega_2$},\\ 1& \mbox{if $\omega_1=\omega_2=\T$}; \end{cases} \end{split} \] and \[ S_3(\omega_1\omega_2\omega_3)=\begin{cases} 32& \mbox{if $\omega_1=\omega_2=\omega_3=\H$},\\ 8& \mbox{if there are two heads and one tail,}\\ 2& \mbox{if there is one head and two tails,}\\ \frac{1}{2}& \mbox{if $\omega_1=\omega_2=\omega_3=\T$}. \end{cases} \]

It is clear that the stock price \(S_t\) depends on the outcome of the first \(t\) coin tosses. So we write \(S_t(\omega_1 \omega_2 \cdots \omega_t)\) instead of \(S_t(\omega_1\omega_2\cdots \omega_T)\) hereafter. A contingent claim is any random variable \(X(\omega_1\omega_2\cdots \omega_T)\). For example, a European call option is of the form \[ X(\omega_1\omega_2\cdots \omega_T)=(S_T(\omega_1\omega_2\cdots \omega_T)-K)^+. \]
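As a quick check of Example 4.1, here is a small Python sketch that computes \(S_t\) from the first \(t\) tosses. The function names and the strike \(K=5\) used for the call payoff are illustrative assumptions of ours.

```python
from fractions import Fraction

def stock_price(prefix, s, u, d):
    """S_t for the first t tosses, given as a string such as 'HT'."""
    price = s
    for toss in prefix:
        price *= u if toss == "H" else d   # multiply by Z_t
    return price

s, u, d = Fraction(4), Fraction(2), Fraction(1, 2)

for w in ["", "H", "T", "HH", "HT", "TT", "HHH", "HHT", "HTT", "TTT"]:
    print(repr(w), stock_price(w, s, u, d))

def call_payoff(w, K):
    """European call payoff (S_T - K)^+ for the full outcome w."""
    return max(stock_price(w, s, u, d) - K, 0)

print(call_payoff("HHT", 5))   # S_3 = 8, so the payoff is 3
```

Only the number of heads in the prefix matters for the price, which is why the tree in Example 4.1 recombines.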

4.2 Pricing general contingent claims

Let \(X(\omega)\) be any given contingent claim at time \(T\). We denote this claim by \(X_T(\omega)\) to stress that it is a contingent claim at time \(T\). We would like to find a self-financing portfolio \((x_t, y_t)\) with value process \(V_t=x_{t+1}B_t+y_{t+1}S_t\) such that \(V_T(\omega)=X_T(\omega)\) for all \(\omega \in \Omega_T\).

First we define recursively backward in time the random variables \(X_{T-1}, X_{T-2}, \cdots, X_0\) by, \[ X_t(\omega_1\omega_2\cdots \omega_t)=\frac{1}{1+r}[q_{u }X_{t+1}(\omega_1\omega_2\cdots \omega_t\H) + q_dX_{t+1}(\omega_1\omega_2\cdots \omega_t\T)], \] for \(t=T-1, T-2, \dots, 1, 0\), where \[ q_{u }=\frac{1+r-d}{u -d}, \;\;\; q_d=\frac{u -(1+r)}{u -d}. \]

We choose \[ y_t(\omega_1\omega_2\cdots \omega_{t-1})=\frac{X_t(\omega_1\omega_2\cdots \omega_{t-1}\H)-X_t(\omega_1\omega_2\cdots \omega_{t-1}\T)}{S_t(\omega_1\omega_2\cdots \omega_{t-1}\H)-S_t(\omega_1\omega_2\cdots \omega_{t-1}\T)}, \quad t=1, \dots, T. \]

The self-financing condition tells us that the value of the portfolio \((x_{t+1},y_{t+1})\) after the share prices change (as time goes from \(t\) to \(t+1\)) must be the same as the value \(V_{t+1}\) of the new portfolio \((x_{t+2},y_{t+2})\). In other words, \[ V_{t+1} = x_{t+1}B_{t+1} + y_{t+1}S_{t+1}. \] But we also know that \(V_t = x_{t+1}B_t + y_{t+1}S_t\) (by definition), so we can use this to eliminate \(x_{t+1}\), and using the fact that \(B_{t+1} = (1+r)B_t\) we get \[ \quad\quad V_{t+1}=y_{t+1}S_{t+1}+(1+r)(V_t-y_{t+1}S_t) \quad\quad\quad(\textbf{Wealth Equation}) \] If we let \(V_0=X_0\), then it follows by induction that \(V_T=X_T\) for all \(\omega\). We conclude that the multi-period binomial model is a complete market.

Let us prove the stronger statement that \(V_t(\omega_1\dotsm\omega_t) = X_t(\omega_1\dotsm\omega_t)\) for all \(t\) using induction. The base case \(V_0 = X_0\) is true by definition. For the inductive step, it is enough to fix \(\omega_1\dotsm\omega_t\) and use \(V_t(\omega_1\dotsm\omega_t) = X_t(\omega_1\dotsm\omega_t)\) to show that \[\begin{align*} V_{t+1}(\omega_1\dotsm\omega_t\H) &= X_{t+1}(\omega_1\dotsm\omega_t\H), \quad\text{and}\\ V_{t+1}(\omega_1\dotsm\omega_t\T) &= X_{t+1}(\omega_1\dotsm\omega_t\T). \end{align*}\]

We show \(V_{t+1}(\omega_1\dotsc\omega_t\H) = X_{t+1}(\omega_1\dotsm\omega_t\H)\). To simplify the following notation we suppress the appearance of \(\omega_1\dotsm\omega_t\) and just use \(V_t, X_t, S_t, y_{t+1}\) and \(V_{t+1}(\H), S_{t+1}(\H), X_{t+1}(\H)\), etc. for short. (But remember that these random variables do also depend on \(\omega_1\dotsc\omega_t\).)

Using the wealth equation, and the fact that \(S_{t+1}(\H) = uS_t\), we can write \[\begin{split} V_{t+1}(\H) &= y_{t+1} u S_t +(1+r)( V_t-y_{t+1}S_t )\\ &=y_{t+1}S_t\big(u - (1+r)\big) + (1+r)V_t. \end{split} \] Now, using our inductive hypothesis, the expression for \(y_{t+1}\), and \(S_{t+1}(\T)=dS_t\), we get \[ \begin{split} V_{t+1}(\H) &= \frac{X_{t+1}(\H) - X_{t+1}(\T)}{uS_t - dS_t}S_t \big(u - (1+r)\big) + (1+r)X_t\\ &=( X_{t+1}(\H) - X_{t+1}(\T) ) \frac{u - (1+r)}{u-d} + (1+r) X_t\\ &=( X_{t+1}(\H) - X_{t+1}(\T) ) q_d + [q_u X_{t+1}(\H) + q_d X_{t+1}(\T)]\\ &= X_{t+1}(\H), \end{split} \] as required.

Exercise: Check that we also have \(V_{t+1}(\T) = X_{t+1}(\T)\).
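The backward recursion for \(X_t\), the choice of \(y_{t+1}\) and the wealth equation can be tried out numerically. The sketch below is a minimal illustration under our own assumptions: the function names are hypothetical, the market data are those of Example 4.1 with \(r=0\), and the claim is a call with illustrative strike \(K=5\). Dictionaries are keyed by the toss prefix \(\omega_1\cdots\omega_t\), so `y[w]` plays the role of \(y_{t+1}(\omega_1\cdots\omega_t)\).

```python
from fractions import Fraction
from itertools import product

def prefixes(t):
    """All toss sequences of length t, as strings."""
    return ["".join(p) for p in product("HT", repeat=t)]

def make_stock(s, u, d):
    """Return S(w): the share price after the tosses in w."""
    return lambda w: s * u ** w.count("H") * d ** w.count("T")

def price_and_hedge(payoff, S, T, u, d, r):
    """Backward recursion for X_t together with the hedge ratios."""
    qu = (1 + r - d) / (u - d)
    qd = (u - (1 + r)) / (u - d)
    X = {w: payoff(w) for w in prefixes(T)}
    y = {}
    for t in range(T - 1, -1, -1):
        for w in prefixes(t):
            X[w] = (qu * X[w + "H"] + qd * X[w + "T"]) / (1 + r)
            y[w] = (X[w + "H"] - X[w + "T"]) / (S(w + "H") - S(w + "T"))
    return X, y

def replicate(path, X, y, S, r):
    """Run the wealth equation along a path, starting from V_0 = X_0."""
    V = X[""]
    for t in range(len(path)):
        w = path[:t]
        V = y[w] * S(path[:t + 1]) + (1 + r) * (V - y[w] * S(w))
    return V

s, u, d, r = Fraction(4), Fraction(2), Fraction(1, 2), Fraction(0)
T, K = 3, Fraction(5)
S = make_stock(s, u, d)
X, y = price_and_hedge(lambda w: max(S(w) - K, 0), S, T, u, d, r)

print(X[""])   # 5/3, the time-0 price of the claim
replicates = all(replicate(w, X, y, S, r) == X[w] for w in prefixes(T))
print(replicates)   # True: V_T = X_T on every path
```

Exact rational arithmetic is used so that the equality \(V_T = X_T\) can be tested without rounding error.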

4.3 Filtrations, conditional expectation and martingales

Definition 4.1 A filtration is a non-descending family of \(\sigma\)-algebras: \[ \mathcal{F}_0\subseteq \mathcal{F}_1\subseteq \cdots \subseteq \mathcal{F}_T. \]

For example, in our coin toss space we can define a filtration as follows:

  • Take \(\mathcal{F}_0=\{\emptyset, \Omega_T\}\). This \(\sigma\)-algebra corresponds to the stock price \(S_0\) at time \(0\).
  • Let \(A_\H=\{\omega: \omega_1=\H\}, A_\T=\{\omega: \omega_1=\T\}\) and let \[ \mathcal{F}_1=\{\emptyset, \Omega_T, A_\H, A_\T\}. \]
  • Let \[ A_{\H\H}=\{\omega: \omega_1=\H, \omega_2=\H\}, A_{\H\T}=\{\omega: \omega_1=\H, \omega_2=\T\}, \] \[ A_{\T\H}=\{\omega: \omega_1=\T, \omega_2=\H\}, A_{\T\T}=\{\omega: \omega_1=\T, \omega_2=\T\} \] and let \[ \mathcal{F}_2=\sigma( A_{\H\H}, A_{\T\H}, A_{\H\T}, A_{\T\T}). \] In other words, the elements of \(\mathcal{F}_2\) are obtained by applying set operations (intersection, union, complement, etc.) to the sets \(A_{\H\H}, A_{\H\T}, A_{\T\H}, A_{\T\T}\).
  • For \(\mathcal{F}_3\) we need to add events of the form \[ A_{\H\H\H}, A_{\H\H\T}, A_{\H\T\T}, A_{\H\T\H}, A_{\T\H\H}, A_{\T\H\T}, A_{\T\T\H}, A_{\T\T\T}, \] which are defined in a similar way.

If we continue up to \(t=T\), we obtain a filtration \(\mathcal{F}_0\subseteq \mathcal{F}_1\subseteq \mathcal{F}_2\subseteq \cdots \subseteq \mathcal{F}_T = 2^{\Omega_T}\).

Remark. Equivalently, we can set \(\F_t := \sigma(Z_1,\dots,Z_t)\) to be the \(\sigma\)-algebra generated by the random variables \(Z_1,\dots, Z_t\). Informally, \(\F_t\) contains all the events whose outcomes are determined by (and only by) the first \(t\) coin tosses.

Definition 4.2 Given a random variable \(X_T(\omega_1\omega_2\cdots \omega_T)\) on \(\Omega_T\), the conditional expectation of \(X_T\) given the \(\sigma\)-algebra \(\mathcal{F}_{T-1}\), denoted by \(\E_\Q[X_T \mid \F_{T-1}]\), is a random variable whose value depends on the first \(T-1\) coin tosses and is given by \[ \E_\Q[X_T\mid\F_{T-1}](\omega_1\omega_2\dotsc\omega_{T-1}) = q_{u }X_T(\omega_1\omega_2\cdots \omega_{T-1}\H)+q_dX_T(\omega_1\omega_2\cdots \omega_{T-1}\T). \] Here \(\E_\Q\) stresses that we are taking the conditional expectation under the measure \(\Q\).

We can generalize this definition as follows: let \(\heads(\omega_{t+1}\cdots \omega_T)\) denote the number of heads in \(\omega_{t+1}\cdots \omega_T\) and let \(\tails(\omega_{t+1}\cdots \omega_T)\) denote the number of tails in \(\omega_{t+1}\cdots \omega_T\). The conditional expectation of \(X_T\) given \(\F_t\) is a random variable that depends on the first \(t\) coin tosses, and is given by \[ \E_\Q[X_T\mid\F_t](\omega_1\cdots \omega_t)=\sum_{\omega_{t+1}\omega_{t+2}\cdots \omega_T}q_{u }^{\heads(\omega_{t+1}\cdots \omega_T)}q_d^{\tails(\omega_{t+1}\cdots \omega_T)}X_T(\omega_1\cdots \omega_t\omega_{t+1}\cdots \omega_T), \] for each \(\omega_1\dots\omega_t\).

If \(X_T = \Phi(S_1, \dots, S_T)\) then the conditional expectation \(\E_\Q[X_T | \F_t]\) should be thought of as the expected value of \(X_T\) if we fix the outcomes of \(S_1, \dots, S_t\) and average over the remaining randomness that determines \(S_{t+1}, \dots, S_T\), i.e., over \(Z_{t+1}, \dots, Z_T\).

Note that \(\E_\Q[X_T \mid \F_T] = X_T\) and \(\E_\Q[X_T \mid \F_0 ]=\E_\Q[X_T]\).

Example 4.2 We find the conditional expectations \(\E_\Q[S_2\mid\F_1]\) and \(\E_\Q[S_3\mid\F_1]\) for the data in Example 4.1 with \(r=0\). First note that \(q_{u }=\frac{1-\frac{1}{2}}{2-\frac{1}{2}}=\frac{1}{3}\) and \(q_d=\frac{2}{3}\). Then \[\begin{split} \E_\Q[S_2\mid\F_1](\H)&=16\times \frac{1}{3}+4\times \frac{2}{3}=8,\\ \E_\Q[S_2\mid\F_1](\T)&=4\times \frac{1}{3}+1\times \frac{2}{3}=2. \end{split} \] Observe that \(\E_\Q[S_2\mid\F_1]=S_1\) exactly.

Now, \(q_u^2 = \frac19\), \(2q_uq_d = \frac49\) and \(q_d^2 = \frac49\), so \[ \begin{split} \E_\Q[S_3\mid\F_1](\H)&=32\times \frac{1}{9}+8\times \frac{4}{9}+ 2\times \frac{4}{9} = 8,\\ \E_\Q[S_3\mid\F_1](\T)&=8\times \frac{1}{9} +2\times \frac{4}{9} +\frac{1}{2}\times \frac{4}{9}=2. \end{split} \] Observe that \(\E_\Q[S_3\mid\F_1]=S_1\) exactly too.
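The general formula for \(\E_\Q[X_T\mid\F_t]\) is a finite sum over the remaining tosses, so it is easy to evaluate by brute force. The following Python sketch (the function names are our own) reproduces the numbers of Example 4.2.

```python
from fractions import Fraction
from itertools import product

def cond_exp(X, prefix, T, qu):
    """E_Q[X | F_t] at the prefix w_1...w_t, summing over the remaining tosses."""
    qd = 1 - qu
    total = 0
    for tail in product("HT", repeat=T - len(prefix)):
        weight = qu ** tail.count("H") * qd ** tail.count("T")
        total += weight * X(prefix + "".join(tail))
    return total

s, u, d, r, T = Fraction(4), Fraction(2), Fraction(1, 2), Fraction(0), 3
qu = (1 + r - d) / (u - d)   # equals 1/3 for these data

def S(t):
    """S_t as a random variable on full outcomes w (only the first t tosses matter)."""
    return lambda w: s * u ** w[:t].count("H") * d ** w[:t].count("T")

for first in "HT":
    print(first, cond_exp(S(2), first, T, qu), cond_exp(S(3), first, T, qu))
# H 8 8
# T 2 2
```

Passing the empty prefix gives the unconditional expectation, for instance `cond_exp(S(3), "", T, qu)` returns \(4 = S_0\), in line with \(\E_\Q[X_T\mid\F_0]=\E_\Q[X_T]\).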

Important Properties of Conditional Expectations

  • Linearity. For constants \(a_1,a_2\), we have \[ \E[a_1X+a_2Y\mid\F_t]=a_1\E[X\mid\F_t]+a_2\E[Y\mid\F_t]. \]
  • Taking out what is known. If \(X\) depends only on the first \(t\) coin flips, then \[ \E[XY\mid\F_t]=X\cdot\E[Y\mid\F_t]. \]
  • Iterated conditioning. If \(s \leq t\) then \[ \E[\E[X\mid\F_t]\mid\F_s]=\E[X\mid\F_s]. \]
  • Independence. If \(X\) depends only on coin tosses \(t+1\) to \(T\), then \[ \E[X\mid\F_t]=\E[X]. \]

Definition 4.3 A sequence of random variables \(Y_0, Y_1, \cdots, Y_T\) is called a martingale under the measure \(\Q\) if for each \(t\), the value of \(Y_t\) depends on the outcome of the first \(t\) coin flips (we say the sequence is adapted to the filtration) and \[ \E_\Q[Y_{t+1}\mid\F_t]=Y_t, \quad t=0, 1, \dots, T-1. \]

Theorem 4.1 The sequence of discounted stock prices \[ \frac{S_t}{(1+r)^t}, \quad t=0, 1, 2, \dots, T, \] is a martingale under the risk-neutral measure \(\Q\).

Remark. The converse of this statement is also true in the multi-period binomial model. That is to say, the martingale measure \(\Q\) in an arbitrage-free and complete multi-period binomial model is determined by the property that \(\frac{S_t}{(1+r)^t}\) forms a martingale sequence under \(\Q\).

Proof. We have \[ \begin{split} \E_\Q\bigg[\frac{S_{t+1}}{(1+r)^{t+1}}\bmid\F_t\bigg](\omega_1\cdots \omega_t)&= q_{u }\frac{S_{t+1}(\omega_1\cdots \omega_t\H)}{(1+r)^{t+1}}+q_d\frac{S_{t+1}(\omega_1\cdots \omega_t\T)}{(1+r)^{t+1}} \\ &=\frac{S_{t}(\omega_1\cdots \omega_t)}{(1+r)^{t+1}}[q_{u }u +q_dd]=\frac{S_{t}(\omega_1\cdots \omega_t)}{(1+r)^t}, \end{split} \] where the last equality uses \(q_u u + q_d d = 1+r\).

Theorem 4.2 The discounted value process \[ \frac{V_t}{(1+r)^t},\quad t=0, 1, \dots, T, \] of any self-financing strategy is a martingale under the risk-neutral measure.

Proof. Recall that the self-financing condition implies the wealth equation, \[ V_{t+1}=y_{t+1}S_{t+1}+(1+r)(V_t-y_{t+1}S_t). \] We have \[ \begin{split} \E_\Q\bigg[\frac{V_{t+1}}{(1+r)^{t+1}}\bmid\F_t\bigg] &=\E_\Q \bigg[\frac{y_{t+1}S_{t+1}+(1+r)(V_t-y_{t+1}S_t)}{(1+r)^{t+1}}\bmid\F_t\bigg] \\ \text{(linearity)}\to\quad &=\E_\Q \bigg[\frac{y_{t+1}S_{t+1}}{(1+r)^{t+1}}\bmid\F_t\bigg]+\E_\Q \bigg[\frac{(1+r)(V_t-y_{t+1}S_t)}{(1+r)^{t+1}}\bmid\F_t\bigg] \\ \text{(taking out what is known)}\to\quad &=y_{t+1}\E_\Q \bigg[\frac{S_{t+1}}{(1+r)^{t+1}}\bmid\F_t\bigg]+\frac{V_t-y_{t+1}S_t}{(1+r)^t} \\ &=y_{t+1}\frac{S_{t}}{(1+r)^{t}}+\frac{V_t-y_{t+1}S_t}{(1+r)^t}\\ &=\frac{V_t}{(1+r)^t}, \end{split} \] showing that \(V_t/(1+r)^t\) is a martingale.

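Applying the martingale property repeatedly, together with iterated conditioning, gives \(\E_\Q\big[\frac{V_T}{(1+r)^T}\mid\F_t\big]=\frac{V_t}{(1+r)^t}\) for every \(t\). Multiplying through by \((1+r)^t\) proves the correctness of the risk-neutral valuation formula for pricing contingent claims: \[ V_t=\frac{1}{(1+r)^{T-t}}\E_\Q[V_T\mid\F_t]. \]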

We finish this section with a version of the First Fundamental Theorem for the multi-period binomial model. We require a version of Definition 2.2 for arbitrage on the multi-period binomial model.

Definition 4.4 A portfolio \(h \equiv \big( h_t = (x_t,y_t), t=0,1,\dots,T+1 \big)\) on the multi-period binomial model \(\mathcal{M} = (B_t, S_t)\) is an arbitrage portfolio if it is self-financing and its value process \(V^h_t = x_{t+1} B_t + y_{t+1} S_t\) satisfies: \[ V_0^h=0, \quad \P(V_T^h\geq 0)=1, \quad \P(V_T^h>0)>0. \]

Theorem 4.3 The following conditions are equivalent for a multi-period binomial model \(\mathcal{M} = (B_t, S_t)\), \(t=0,1,\ldots, T\), with interest rate \(r\).

  1. The model is arbitrage-free according to Definition 4.4.
  2. The condition \(d < 1+r < u\) holds, where \(d < u\) are the two possible values of \(Z_t = S_t/S_{t-1}\) at each time \(t\). (\(Z_t\) equals \(u\) with probability \(p\) and \(d\) with probability \(1-p\) for some \(0<p<1\)).
  3. There is a measure \(\Q\) defined by \[ \Q(\omega)=q_u^{\heads(\omega)}(1-q_u)^{\tails(\omega)}, \quad q_u = \frac{1+r - d}{u - d}, \] such that \(\frac{S_t}{(1+r)^t}\) is a martingale under \(\Q\).

Proof. We have done most of the work needed to prove this theorem. Let us show the implications \((1) \implies (2) \implies (3) \implies (1)\).

  • \((1) \implies (2)\): Consider the number of periods \(T\) in the model. If \(T=1\) then the implication holds by Theorem 2.1. For \(T > 1\), the 1-period model obtained by observing the market from \(t=0\) to \(t=1\) has no arbitrage, and so \(d < 1+r < u\) by Theorem 2.1.
  • \((2) \implies (3)\): The condition \(d < 1+r < u\) gives \(0 < q_u < 1\), so \(\Q\) is a probability measure, and the martingale property is Theorem 4.1 above.
  • \((3) \implies (1)\): Suppose \(h_t = (x_t,y_t)\) is a self-financing portfolio that satisfies the conditions \(\P(V^h_0 = 0) =1\) and \(\P(V^h_T \geq 0) = 1\). Since the measure \(\Q\) is equivalent to \(\P\), it follows that \(\Q(V^h_0 = 0) =1\) and \(\Q(V^h_T \geq 0) = 1\). Now, by Theorem 4.2, \(\frac{V^h_t}{(1+r)^t}\) is a martingale under \(\Q\) and so in particular, \[\E_{\Q} \left [\frac{V^h_T}{(1+r)^T}\right ] = \E_{\Q} [V^h_0] = 0.\] This shows that \(\E_{\Q}[V^h_T] = 0\) and thus \(V^h_T\) is a non-negative random variable with mean 0, which implies \(V^h_T\) is identically zero: \(\Q(V^h_T > 0) = 0\). Consequently, \(\P(V^h_T > 0) = 0\) as well, so \(h\) is not an arbitrage portfolio.

4.5 American Options

An American call or put option gives the right to buy or, respectively, to sell the underlying asset for the strike price \(K\) at any time between now and a specified future time \(T\), called the expiry time. In other words, an American option can be exercised at any time up to and including expiry. The holder of an American-type contingent claim with contract function \(\Phi(x)\) will receive a payoff \(\Phi(S_{\tau})\) at time \(\tau\), where \(\tau\) is a random variable chosen by the holder. The random variable \(\tau\) must take values in \(\{0,1,\dots,T\}\) and specifies the holder's choice of exercise time. This means that if the option is exercised at time \(\tau=t\) then the payoff will be \(\Phi(S_t)\) at time \(t\). Of course, it can be exercised only once. The holder does not have complete freedom to choose \(\tau\) arbitrarily; it must be a special type of random variable known as a stopping time. A stopping time \(\tau\) has the property that the event \(\{\tau \leq t\}\) is in \(\F_t\) for all \(t=0,\dots,T\), i.e., the decision to exercise the option at time \(t\) can only depend on what has happened up to time \(t\) and not on the future randomness. Some examples of stopping times are \(\tau \equiv T\) (always exercise at time \(T\)), and \(\tau = \inf\{t : S_t \geq L\} \wedge T\) (exercise at the first time that the share price is at least \(L\), or at time \(T\) if that never happens).

It is possible to show that the price of the American option at time 0 equals \(\sup_{\tau} \{ \E_\Q[ (1+r)^{-\tau} \Phi(S_\tau)] \}\), where the supremum is taken over all stopping times \(\tau\). We give a rough argument, as follows. Suppose the holder exercises the American option according to the stopping time \(\tau\), so that the payoff to the holder is the amount \(\Phi(S_\tau)\) at time \(\tau\). This is equivalent to a present value of \((1+r)^{-\tau} \Phi(S_\tau)\), so risk-neutral valuation tells us that the value at time 0 would be \(\E_\Q[ (1+r)^{-\tau} \Phi(S_\tau)]\). But since the holder is free to choose any stopping time \(\tau\), they will choose the \(\tau\) that maximises this value at time 0, hence the value must be \(\sup_{\tau} \{ \E_\Q[ (1+r)^{-\tau} \Phi(S_\tau)] \}\).
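For a small tree the supremum over stopping times can be computed by brute force. In the Python sketch below (our own illustration, with hypothetical names) a stopping time is encoded as a rule that says, at each toss prefix of length less than \(T\), whether to exercise there; \(\tau\) is then the first time along the path at which the rule says "exercise", or \(T\) if it never does. Every stopping time with values in \(\{0,\dots,T\}\) arises this way, because \(\{\tau = t\}\in\F_t\). The numbers are those of the American put in Example 4.4.

```python
from fractions import Fraction
from itertools import product

S0, K, T = Fraction(80), Fraction(80), 2
u, d, r = Fraction(11, 10), Fraction(19, 20), Fraction(1, 20)
qu = (1 + r - d) / (u - d)
qd = 1 - qu

def S(prefix):
    """Share price after the tosses in prefix."""
    return S0 * u ** prefix.count("H") * d ** prefix.count("T")

def phi(x):
    """Contract function of the put."""
    return max(K - x, 0)

# Decision nodes: all toss prefixes of length 0, ..., T-1.
nodes = ["".join(p) for t in range(T) for p in product("HT", repeat=t)]

def value(rule):
    """E_Q[(1+r)^(-tau) * Phi(S_tau)] for the stopping time encoded by rule."""
    total = 0
    for path in ("".join(p) for p in product("HT", repeat=T)):
        # tau = first time the rule says "exercise" along this path, else T
        tau = next((t for t in range(T) if rule[path[:t]]), T)
        weight = qu ** path.count("H") * qd ** path.count("T")
        total += weight * phi(S(path[:tau])) / (1 + r) ** tau
    return total

best = max(value(dict(zip(nodes, choice)))
           for choice in product([False, True], repeat=len(nodes)))
print(best, float(best))   # 80/63, approximately 1.27
```

The number of rules grows like \(2^{2^T-1}\), so this enumeration is only useful as a sanity check on very small trees.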

The following pricing algorithm allows us to compute the value of the American option at any time \(t = 0,1,\dots,T\). Let \(V^A_t\) denote the price of the American option at time \(t\) (that has not been exercised yet). Using the risk-neutral valuation formula, we can price an American option inductively, as follows:

At \(t= T\): \(V^A_T = \Phi(S_T)\), because if we hold an American option at time \(t = T\), the only choice is to exercise or not at the expiry time \(T\), so it has the same value as the European version of the option.

At \(t< T\): suppose we know the value of the American option at time \(t+1\) is \(V^A_{t+1}\), then \(V^A_t = \max \big\{\, \Phi(S_t)\, ,\, \frac{1}{1+r} \E_\Q[V^A_{t+1} \mid \F_t]\, \big\}\).

Why do we take a “max” here? It’s because if we hold an American option at time \(t\), we have the choice to either exercise early at time \(t\), or wait. The value of exercising early is \(\Phi(S_t)\), the contract function \(\Phi\) evaluated at the current share price \(S_t\); the value of waiting at time \(t\) is the risk-neutral price \(\frac{1}{1+r} \E_\Q[V^A_{t+1}\mid \F_t]\), and we will choose whichever gives us more.

Summarising, we have the following pricing algorithm for American options: \[ V^A_t(\omega_1\dotsm\omega_t) = \begin{cases} \Phi(S_T(\omega_1\dotsm\omega_T)) & \text{if $t=T$},\\ \max\big\{ \Phi(S_t(\omega_1\dotsm\omega_t)) , \frac{1}{1+r}[q_u V^A_{t+1}(\omega_1\dotsm\omega_t\H) + q_d V^A_{t+1}(\omega_1\dotsm\omega_t\T)] \big\} & \text{if $t< T$}. \end{cases} \]

To see the algorithm in more detail, let’s consider an American option expiring after \(2\) steps with the contract function \(\Phi(x)\). The value of this option at time \(2\) (if it is not exercised before time \(2\)) is clearly \(\Phi(S_2(\omega_1\omega_2))\). At time \(1\) the option holder will have the choice to exercise immediately, with payoff \(\Phi(S_1(\omega_1))\), or to wait until time \(2\), when the value of the option will become \(\Phi(S_2(\omega_1\omega_2))\). The value of waiting at time \(1\) is therefore given by \[ \frac{1}{1+r}[q_{u }\Phi(S_2(\omega_1\H))+q_d\Phi(S_2(\omega_1\T))]. \] In effect, the option holder has the choice between the “value of waiting” and the immediate payoff \(\Phi(S_1(\omega_1))\). The American option at time \(1\) will, therefore, be worth the higher of these two: \[ V^A_1(\omega_1) = \max\{\Phi(S_1(\omega_1)), \frac{1}{1+r}[q_{u }\Phi(S_2(\omega_1\H))+q_d\Phi(S_2(\omega_1\T))]\}. \]
The same reasoning applied at time 0 gives \[ V^A_0 = \max\{ \Phi(S_0), \frac{1}{1+r}[q_uV^A_1(\H) + q_dV^A_1(\T)]\}. \]

Example 4.4 Consider an American put option with strike price \(K=80\) pounds expiring at time \(2\) on a stock with initial price \(S_0=80\) pounds in a Binomial model with \(u =1.1, d=0.95\) and \(r=0.05\). The stock values are: \[ \begin{matrix} \begin{array}{l|lllll} t&0&&1&&2\\ \hline &&&&&96.80\\ &&&88.00&<&\\ S_t&80.00&<&&&83.60\\ &&&76.00&<&\\ &&&&&72.20\\ \end{array} \end{matrix} \] The price of the American put will be denoted by \(P^A_t\) for \(t=0, 1, 2\) and its price at time \(2\) is \((80-S_2)^+\) given in the following tree: \[ \begin{matrix} \begin{array}{l|lllll} t&0&&1&&2\\ \hline &&&&&0.00\\ &&&?&<&\\ P^A_t&?&<&&&0.00\\ &&&?&<&\\ &&&&&7.80\\ \end{array} \end{matrix} \] First observe that \(q_{u }=\frac{1+r-d}{u -d}=\frac{2}{3}\) and \(q_d=\frac{1}{3}\). At time \(1\) the option holder can choose between exercising the option immediately or waiting until time \(2\). In the up state at time \(1\) the immediate payoff is \((K-S_1)^+=(80-88)^+=0\) and the value of waiting is \(\frac{1}{1+r}[q_{u }\times 0+q_d\times 0]=0\). In the down state the immediate payoff is \(4\) pounds, while the value of waiting is \(1.05^{-1}\times \frac{1}{3}\times 7.8 \approx 2.48.\) The option holder will choose the higher value (i.e., to exercise the option in the down state at time \(1\)). This gives the time \(1\) value of the American put \[ \begin{matrix} \begin{array}{l|lllll} t&0&&1&&2\\ \hline &&&&&0.00\\ &&&0.00&<&\\ P^A_t&?&<&&&0.00\\ &&&4.00&<&\\ &&&&&7.80\\ \end{array} \end{matrix} \] At time \(0\) the choice is, once again, between the payoff \((80-S_0)^+\), which is zero, or the value of waiting, which is \(1.05^{-1}\times \frac{1}{3}\times 4 \approx 1.27\) pounds. Taking the higher of the two completes the tree of the option prices: \[ \begin{matrix} \begin{array}{l|lllll} t&0&&1&&2\\ \hline &&&&&0.00\\ &&&0.00&<&\\ P^A_t&1.27&<&&&0.00\\ &&&4.00&<&\\ &&&&&7.80\\ \end{array} \end{matrix} \]
Therefore the price of the American put is \(P^A_0=1.27\) pounds.

In comparison, the price of a European put is \(P_0^E=1.05^{-1}\times \frac{1}{3}\times 2.48 \approx 0.79.\) Here we use \(2.48\), the value of waiting in the down state at time \(1\), and not \(4\), because the European option can only be exercised at time \(2\).
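The pricing algorithm is short to implement on a recombining tree. The following Python sketch is one possible implementation (the function name and the `american` flag are our own); it reproduces the American and European put prices of Example 4.4. Node \((t,j)\) denotes the state at time \(t\) with \(j\) up-moves.

```python
from fractions import Fraction

def binomial_price(payoff, S0, u, d, r, T, american):
    """Backward induction; with american=True the max with early exercise is taken."""
    qu = (1 + r - d) / (u - d)
    qd = 1 - qu
    # Values at expiry, indexed by the number j of up-moves.
    values = [payoff(S0 * u**j * d**(T - j)) for j in range(T + 1)]
    for t in range(T - 1, -1, -1):
        new_values = []
        for j in range(t + 1):
            wait = (qu * values[j + 1] + qd * values[j]) / (1 + r)
            exercise = payoff(S0 * u**j * d**(t - j))
            new_values.append(max(exercise, wait) if american else wait)
        values = new_values
    return values[0]

S0, u, d, r = Fraction(80), Fraction(11, 10), Fraction(19, 20), Fraction(1, 20)
put = lambda x: max(80 - x, 0)

P_A = binomial_price(put, S0, u, d, r, 2, american=True)
P_E = binomial_price(put, S0, u, d, r, 2, american=False)
print(round(float(P_A), 2))   # 1.27
print(round(float(P_E), 2))   # 0.79
```

The work is proportional to the number of nodes, \((T+1)(T+2)/2\), because the payoff here depends only on the current share price.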

What is the price of an American call in the above example? Although in general an American option is at least as valuable as the equivalent European option (because of the additional choice in when to exercise the option), for call options (on a stock that does not pay dividends) the American and European options have the same price.

Theorem 4.4 The prices of American and European call options on a stock that pays no dividends are equal, \(C^A=C^E\), whenever the strike price \(K\) and expiry time \(T\) are the same for both options.

Proof. The relation \(C^A\geq C^E\) is clear, as the American call option gives at least as high a payoff as the European call (since you can exercise your right at any time). (It’s also possible to give an arbitrage argument to prove this.) Now if \(C^A>C^E\), then

  • write and sell an American call.
  • buy a European call.
  • invest the difference \(C^A-C^E\) risk free with interest rate \(r\).

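If the American call is exercised at time \(t\le T\), then borrow a share and deliver it to the holder in exchange for \(K\), which settles your obligation as the writer of the call option, and invest \(K\) at the rate \(r\). Then at time \(T\) you can use the European call to buy a share for \(K\) and close your short position in stock. (If the European call is out of the money at time \(T\), buy the share in the market for less than \(K\) instead.) Your arbitrage profit will be at least \[ (C^A-C^E)(1+r)^T+K(1+r)^{T-t}-K>0, \] assuming \(r \geq 0\). If the American option is not exercised at all, you will end up with the European option and an arbitrage profit of at least \((C^A-C^E)(1+r)^T\). In either case there is an arbitrage, so \(C^A>C^E\) is impossible. Together with \(C^A\geq C^E\), this proves that \(C^A=C^E\).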
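Theorem 4.4 can be checked numerically on a small tree. The sketch below is an illustration under our own assumptions: it reuses the market of Example 4.4 with a call of strike \(K=80\), the function name is hypothetical, and \(r \geq 0\) as in the proof. It prices the American and European calls in a single backward pass.

```python
from fractions import Fraction

def call_values(S0, K, u, d, r, T):
    """Return (American, European) call prices from one backward pass."""
    qu = (1 + r - d) / (u - d)
    qd = 1 - qu
    # Both options have the same value at expiry.
    amer = [max(S0 * u**j * d**(T - j) - K, 0) for j in range(T + 1)]
    euro = list(amer)
    for t in range(T - 1, -1, -1):
        euro = [(qu * euro[j + 1] + qd * euro[j]) / (1 + r)
                for j in range(t + 1)]
        amer = [max(S0 * u**j * d**(t - j) - K,                   # exercise now
                    (qu * amer[j + 1] + qd * amer[j]) / (1 + r))  # wait
                for j in range(t + 1)]
    return amer[0], euro[0]

C_A, C_E = call_values(Fraction(80), Fraction(80),
                       Fraction(11, 10), Fraction(19, 20), Fraction(1, 20), 2)
print(C_A == C_E, float(C_A))   # True, and the common price is about 8.22
```

In this example the value of waiting is at least the immediate payoff at every node, so early exercise never adds value, in agreement with the theorem.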