Steve Nuchia

In his textbook *Simulation*, Ross [1] gives an example and a related exercise on the problem of valuing (determining the value of) financial options contracts. Many students in a graduate course using the book found Ross' exposition of the problem and his suggested solution opaque. This paper is an attempt to shed some light on the problem.

In a 1998 paper, Longstaff and Schwartz[2] give an algorithm (LSM) for valuing options that they claim converges quite robustly to the optimal exercise threshold. In the sections that follow I will explain what an option is and why they are hard to value, explain Ross' algorithm and put his problem in context, and finally compare the result of Ross' algorithm with the result of LSM. Ultimately, we will try to quantify Ross' comment that the exercise strategy in his algorithm is "far from optimal".

Suppose you found a coupon in the newspaper that would permit you to buy a tank of gasoline for $1.25 per gallon, at a time when gasoline was selling for about $1.15 per gallon. Would you save it? Suppose further that you plan to make a trip during the Fourth of July holiday and you expect that prices will rise to about $1.40 per gallon at that time. Now how much would the coupon be worth?

During your trip you fill your tank several times. When do you use the coupon? Suppose the prices at each service station are $1.21, $1.29, $1.34, and $1.21. Obviously you would want to use the coupon at station number three, where it would be worth nine cents per gallon. But to have the coupon when you reach the third station, you must forego using it at station number two, where it would have been worth four cents per gallon. Since you cannot be sure in advance what the price at stations three and four will be, how do you make the decision to save the coupon at station two and then decide to exercise it at station three?

This simple example illustrates all the main features of the options valuation problem. The option itself, though perhaps complicated, does not change. The price of the underlying asset (here, gasoline) is determined by some stochastic process. The option is exercisable at some set of points in time, and the holder of the option must decide, at each of those points in time, whether to exercise. The value realized by the holder of the option is determined by the actual prices observed for the underlying asset and the holder's strategy for making exercise decisions. The strategy, in turn, must be based only on the history of the price to the point of decision and the assumed stochastic pricing model.

Financial options exist on many different assets as well as purely statistical "underlying" prices such as interest rates and stock market indices. In addition, features of some securities, like convertible bonds, and many contracts can be treated as options. Thus the problem of valuing options is one with many economically significant applications.

We will concern ourselves here only with a particular imaginary stock option. Stock options are standardized option contracts and are traded anonymously through the market much like stocks themselves. Of the tens of thousands of stocks trading in the U.S. markets, only about one thousand have options available. However, there are anywhere from about twenty to as many as several hundred distinct options available for each of those stocks. The arcane vocabulary of options allows one to concisely specify a particular option.

All standardized options cover a specified number of shares of a specific underlying stock. Normally this is 100 shares, but events like stock splits may cause some options to have a different multiplier. Since we are modelling a frictionless market (no transaction costs) we will ignore the multiplier and think in terms of an option on a single share.

All stock options specify a *strike price*, the price at which the holder may buy or sell the underlying stock when exercising the option. In options theory it is important that all participants in the market may also buy and sell the underlying at a time-varying *market price*.

A *call* option entitles the holder to buy the underlying stock at the strike price while a *put* option entitles the holder to sell the stock at the strike price. If the holder elects to use the option, we say the option is *exercised*. An option may be exercised at most once.

An unexercised option ceases to exist on the *expiration date*. Standardized stock options expire on the Saturday following the third Friday of the month, so the expiration date is specified by naming the month (and year) of expiration. In our model we will just assume the option expires after a fixed number of units of time.

In general, an option will specify when and how the holder may exercise. The so-called *American feature* permits the holder to exercise at any time prior to expiration. As we will see, American options are much more difficult to value than the "European" options, which are exercisable only on the expiration date.

A standard American stock option may be fully characterized then by specifying the underlying security, put or call, the strike price, and the expiration date.

Having clipped the American $1.25 gasoline call we found in the newspaper, two things about its value should be apparent. First, it can never cost us more than wallet space to own it. Second, the only way it can pay to own the coupon is if we exercise it when gasoline costs more than the strike price. So, we see that the value of the option is always non-negative (to the holder) and in general depends on the expected future prices of the underlying asset.

With financial assets, we assume that the underlying asset can be re-sold as soon as a call is exercised or re-purchased as soon as a put is exercised. The exercise or intrinsic value of an option is the difference between the strike price and the market price of the underlying. If the difference is negative, the option has no exercise value at that time.

Note that the present intrinsic value of the option is not a random variable -- it is known to the precision that the current market price is known. The problem is calculating (or estimating) the speculative component of the option value. Because this value comes from the idea that if we wait long enough the market price may move so as to create additional exercise value, this component is sometimes called the *time value* of the option. The shorter the remaining time to expiration, the less chance there is (all else being equal) for the price to move in favor of the holder. Time value therefore is monotonically increasing with increasing time to expiration.

In principle then, the value of a European option is just the expected value of the option on the expiration date, conditional on the price history up to the present. Much work has been done to understand the relationship between the parameters of the asset pricing process and the present value of European options: the best known is probably the Black-Scholes[3] formula.

Any approach based on computing the expectation integral suffers from two serious shortcomings. First, it restricts the class of pricing processes that can be studied. More importantly, when the option may be exercised early, the limits of integration in the expectation integral are themselves random variables. This makes it challenging to identify combinations of strategies and price behavior models for which the expectation can be calculated at all. Computing the expectation for an arbitrary strategy on an arbitrary model is clearly a job for simulation.

One important result from the analytical work holds up in many more general cases. For almost any pricing process where the expected price movement favors the option holder, the American feature is worthless. This is the result Ross refers to in his discussion of the threshold, which we dissect below.

The value we will compute for the option will be the estimated expected cash return realized by a holder of the option, conditional on the starting Markov state of the problem. Note that the cost of the option is not at issue: the simulation opens when the option comes into the possession of the hypothetical holder and the cost, if any, is already sunk. The simulation ends when the simulated holder exercises the option or allows it to expire worthless. Over many iterations of this process the average proceeds of exercise may be taken as an estimate of the expected return.

Since the simulated option holder must elect whether or not to exercise the option at each point in time where the option is exercisable, the simulation software must implement an exercise strategy. It should be clear that the expected returns on owning an option are a function of the exercise strategy and therefore it might seem that the strategy should be a parameter of the option valuing program. As we will see shortly, however, it turns out that there is a single optimal strategy.

On the other hand, the optimal exercise strategy (and hence the option value) *does* depend on the stochastic behavior of the underlying stock price. The simulation objective is to estimate the value of the option given a particular stochastic model of the underlying pricing process. What relationship that model might have to the behavior of stock prices in the real world is beyond the scope of this paper.

In general, we will use some kind of stochastic recurrence formula to generate a discrete-time stock price history realization. This Markov process may in turn be an approximation for some more general theoretical pricing process, but since simulation methods demand that we treat time in discrete increments, a Markov model will be our standard assumption. In any such model we can expect that the current price of the stock will be a component of the state vector.

The standard option valuation problem, then, is to estimate the value of an option given a particular Markov process that is assumed to generate the prices for the underlying and a starting point in the state space of that model. Simulation proceeds by generating successive states using the Markov model and applying, at each time step, an approximately optimal execution strategy, with the final result being the mean observed return. The two simulation methods studied in this paper differ principally in how they approximate the optimal strategy.

The exercise (or *intrinsic*) value of an option is easy to compute: it is just the difference between the strike price and the market price of the underlying, or zero if that difference is negative. (Note: one must be careful to set up the formulas correctly for both put and call options.) Since the option holder is never obliged to incur any new expenses, the value of the option can never be negative (to the holder). But this cannot be the whole story -- as we saw with the gasoline coupon, an option has some value even when it has no intrinsic value. This extra value is sometimes called the time value of the option.
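The care needed with the sign for puts and calls can be made concrete with a small Python sketch. The function name and argument order are illustrative assumptions, not part of the original program.

```python
def intrinsic_value(kind: str, strike: float, market: float) -> float:
    """Exercise value of an option on a single share; never negative."""
    if kind == "call":
        # A call holder buys at the strike and can resell at the market price.
        return max(market - strike, 0.0)
    if kind == "put":
        # A put holder sells at the strike and can repurchase at the market price.
        return max(strike - market, 0.0)
    raise ValueError("kind must be 'call' or 'put'")


# The gasoline coupon is a call with a $1.25 strike.
coupon_at_station_3 = intrinsic_value("call", 1.25, 1.34)  # about nine cents
coupon_at_station_1 = intrinsic_value("call", 1.25, 1.21)  # no exercise value
```

Clamping at zero reflects the fact that the holder is never obliged to exercise.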

It is interesting to note that the time value of a European option may be negative.

My corporate finance professor insisted that we learn, if nothing else, that the value of an instrument or project is the net present value of the expected future cash flows. The phrase "net present value", for the uninitiated, denotes the value today of money to be spent or received in the future, given some assumed interest rate. At a 7% annual interest rate, one dollar a year from now has a net present value of about 93 cents.
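The 93-cent figure can be checked with a one-line discounting function. This is a sketch assuming annual compounding; the function name is hypothetical.

```python
def present_value(amount: float, annual_rate: float, years: float) -> float:
    """Value today of `amount` received `years` from now, compounding annually."""
    return amount / (1.0 + annual_rate) ** years


# One dollar a year from now at a 7% annual rate.
one_dollar_next_year = present_value(1.0, 0.07, 1.0)  # roughly 0.93

# With a negligible interest rate, discounting changes nothing.
undiscounted = present_value(1.0, 0.0, 1.0)
```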

The reader should be alert to the role of the net present value computation in options valuation. In this paper we assume the interest rate is negligible over the life of the option and we do not include net present value (or "time value of money") computations.

It is also noteworthy that we do not invoke equilibrium exchange assumptions in options valuation theory. The fact that the value of the option to the writer is exactly the negative of its value to the holder results from the symmetry of their future cash flows alone. When an option exchange occurs it does so because participants in the market disagree about that value, perhaps because they hold differing beliefs about the underlying pricing process. The equilibrium price of the option will be determined by the distribution of these beliefs, but the value of the option, in the abstract, is purely a function of the "real" pricing process.

Recall that in the gasoline coupon example, optimal execution required that we forego exercising the coupon at station number two. Since it is "really" worth nine cents per gallon, we don't want to part with it for only four cents per gallon.

There are two claims about execution strategies that we need to examine. First, we claim that an optimal execution strategy exists, at least in a limiting sense. That is, for any positive ϵ there is a strategy that will yield an expected return larger than (1 − ϵ) times the optimal return. Second, for pricing processes with one-dimensional state spaces (memoryless processes), the optimal execution strategy is a simple threshold test.

I will not attempt to prove either proposition here, nor will I rigorously explore the nature of the optimal strategy with more complex state spaces. Note, however, that Longstaff and Schwartz implicitly assume that the optimal strategy is always a threshold. Because these assumptions play a key role in valuation by simulation, I offer the following informal justification:

As we saw above, the expected return on owning an option is a function of the dynamic stock pricing model, the initial state, and the execution strategy implemented. Suppose now that the expected return on an option is $1 and we are able to buy that option for less than $1. To the extent that we trust our model, this would be a good investment. Suppose further that, having bought the option, someone offers us more than $1 for it. In that case, we would sell it to them.

There is no economic difference between buying an option for $x and foregoing exercise proceeds of $x nor between selling the option for $y and exercising it for a benefit of $y. Logically, then, we would exercise an option whenever the immediately realizable gain from doing so exceeds the expected return on the option, given that we do not immediately exercise it.

For the American option in continuous time, there is a semantic difficulty with the idea that the value of the option might be less than the immediate exercise value, but in any discrete time approximation the idea is well defined so we won't worry about it.

It is therefore assumed that the optimal exercise strategy is to exercise the option whenever the immediately available benefit exceeds the net present expected value of the option given that we do not exercise it right now. Since estimating the expected value of the option is our whole task, the simulation can be expected to have a somewhat recursive flavor. We will see that this is the motivating idea behind the LSM algorithm, while Ross finesses the issue by providing an approximating function.

Now, suppose we have a function that purports to estimate the expected return from continuing to hold the option, as a function of the current state of the pricing process. If this function reported a value that was consistently too low, we would frequently exercise the option too early, realizing too low an average return. On the other hand, suppose it reports a value too high. In that case, we would often fail to exercise when we should, missing the best opportunities and settling for a smaller average return than we might have realized with the optimal estimator.

Based on these musings it seems at least plausible that there is an optimal estimator and that using the value it reports as an execution threshold is the optimal execution strategy. From there it is but a small step to supposing that the value the function reports for the initial state is the initial value of the option. In fact, neither Ross nor Longstaff and Schwartz go that far, preferring on both theoretical and practical grounds to use the simulation results to compute the option value estimate.

The particular problem we will study in this paper assumes a "lognormal random walk" price generating process. Ross gives a lucid account of the process itself, so we will just summarize it. This is a one-dimensional continuous Markov process where each step consists of multiplying the previous price by \(e^{X}\) to obtain the new price, where \(X\) is a normally distributed random variable. The mean and variance of the exponent are the only parameters of the state transition operator.

Since \(e^{a} e^{b} = e^{a+b}\) and the sum of \(n\) independent normal random variables with mean \(\mu\) and variance \(\sigma^2\) is a normal random variable with mean \(n\mu\) and variance \(n\sigma^2\), the logarithm of the price at each step is a normal random variable with a distribution conditional on any prior state that is an almost trivial function of that prior state. This makes the lognormal random walk a popular model for analytical modelling studies.
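Written out, the statement above reads as follows. The notation is an assumption made here for illustration: \(S_k\) is the price after \(k\) steps and the \(X_i\) are independent normal exponents with mean \(\mu\) and variance \(\sigma^2\).

```latex
S_n = S_0 \exp\Bigl(\sum_{i=1}^{n} X_i\Bigr),
\qquad
\ln S_n \,\big|\, S_0 \;\sim\; \mathcal{N}\bigl(\ln S_0 + n\mu,\; n\sigma^2\bigr).
```

The conditional distribution of the log price depends on the prior state only through a shift of the mean by \(\ln S_0\).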

Though many generalizations of this model have been proposed, we will not investigate any in this paper. Note, however, that Longstaff and Schwartz claim as an advantage for their method that it works well for arbitrary models while Ross' method is bound to the lognormal model by the derivation of his threshold function.

The particulars of the problem, exercise 6.15 in [1], are: Estimate the value of a $100 call exercisable once on each of the next 20 days for a stock obeying the lognormal random walk process with an initial price of $100 and the per-day parameters \(\mu\) and \(\sigma\) given in the exercise. Note that the *Mathematica* normal random number generator takes \(\sigma\) rather than \(\sigma^2\) as a parameter. Note also that Ross only implicitly specifies that time is to be discretized at one increment per day; I have made this assumption explicit.

It should do no violence to the reader's intuition that one would prefer to hold an option when the price of the underlying is likely to move in our favor and to exercise it when the price is more likely to move against us over time. Still it is not obvious why, as Ross states, \(\mu = -\sigma^2/2\) is the horizon dividing these two cases for the lognormal random walk process. Since this horizon implies that we would want to hold call options on stocks with prices that tend to decline (at a rate up to \(\sigma^2/2\)), it might even be called counter-intuitive.

The key to understanding how this can happen is to observe that when passing from one time increment to the next the p.d.f. for the stock price becomes broader and flatter. Even though the mode and mean may move in an adverse direction, the expectation integral may rise through such a step. Recall that the execution benefit is given by the so-called hockey stick function,

\[ \max(S - K,\ 0), \]

where \(S\) is the market price of the stock and \(K\) is the strike price, and the integrand of the expectation integral (for a fixed future execution time) is this function times the p.d.f. of the stock price. So any change in the p.d.f. that throws weight into the right-hand tail stands a good chance of increasing the expected execution value of the option at the corresponding point in time.

The horizon Ross gives for the existence of an early execution threshold can be found (I assume) by equating the derivative of weight lost due to the mean shifting leftward to that of the weight gained by increasing the variance.
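A different and fairly standard route to the same horizon, offered as a sketch rather than as Ross' own derivation, uses the mean of a lognormal variable and Jensen's inequality. The symbols \(S_n\) (price at step \(n\)), \(K\) (strike) and \(X\) (the normal exponent with mean \(\mu\) and variance \(\sigma^2\)) are notation assumed here.

```latex
\mathbb{E}\bigl[S_{n+1} \mid S_n\bigr]
  = S_n\,\mathbb{E}\bigl[e^{X}\bigr]
  = S_n\, e^{\mu + \sigma^2/2},
\qquad
\mathbb{E}\bigl[(S_{n+1}-K)^{+} \mid S_n\bigr]
  \;\ge\; \bigl(S_n e^{\mu+\sigma^2/2} - K\bigr)^{+}
  \;\ge\; (S_n - K)^{+}
  \quad\text{when } \mu + \tfrac{\sigma^2}{2} \ge 0 .
```

The first inequality holds because the hockey stick function is convex. So whenever \(\mu \ge -\sigma^2/2\), waiting one more step is never worse in expectation than exercising now, even though for negative \(\mu\) the median price is falling.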

I had hoped to develop some compelling visualization for this phenomenon, but I have not been successful.

The execution strategy Ross proposes is based on the fact that we can find the value for a European option analytically under the lognormal assumption. His formula, which looks very similar to the Black-Scholes formula, purports to compute the expected value of a one-time executable option. To approximate the American feature, he compares the immediate execution payoff to the expected payoff of the best alternative European option of the same strike price having an exercise date prior to the expiration of the American option. In essence, the exercise strategy is asked to decide at each step between exercising the option and exchanging it for one of several possible European options.

The simulation proceeds then by considering, for each price in a history, whether the payoff for exercising at that price exceeds the expected payoff of each of the possible European options that might replace the actual option. If the present payoff is the largest, it is recorded as the payoff for the current price history and the simulation iteration ends. Otherwise, the simulation steps forward one time unit and the process is repeated. Note that the option is still an American option -- replacement with a European option is a purely hypothetical device used in deciding to continue holding the American option.

A performance-oriented implementation of Ross' method should consider the question of where the analytical maximum of the value estimating function must lie for fixed current price and varying time to expiration. I have preferred here to use a direct implementation of the algorithm as stated in the text for transparency and to minimize the possibility of introducing bugs.
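The strategy described above can be sketched directly in Python. This is not the author's *Mathematica* program. The closed form for the one-time-exercise value follows from the lognormal assumption. All function names, the seed, and the parameter values in the usage note (a drift of -0.05 and a standard deviation of 0.3 per day) are illustrative assumptions, not values taken from the text.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def european_call_value(price, strike, mu, sigma, m):
    """Expected payoff of a call exercisable only m steps from now."""
    a = m * mu                    # mean of the m-step log return
    b = sigma * math.sqrt(m)      # standard deviation of the m-step log return
    d = (math.log(price / strike) + a) / b
    return (price * math.exp(a + 0.5 * b * b) * norm_cdf(d + b)
            - strike * norm_cdf(d))


def should_exercise(price, strike, mu, sigma, steps_left):
    """Exercise only if the payoff beats every hypothetical European option."""
    payoff = price - strike
    if payoff <= 0.0:
        return False
    return all(payoff > european_call_value(price, strike, mu, sigma, m)
               for m in range(1, steps_left + 1))


def simulate_once(s0, strike, mu, sigma, n_steps, rng):
    """Realized gain on one simulated price history (n_steps >= 1)."""
    price = s0
    for step in range(1, n_steps + 1):
        price *= math.exp(rng.gauss(mu, sigma))
        steps_left = n_steps - step
        if steps_left == 0:
            # Last chance: exercise if there is any gain, else expire worthless.
            return max(price - strike, 0.0)
        if should_exercise(price, strike, mu, sigma, steps_left):
            return price - strike
    return 0.0


def estimate_value(s0, strike, mu, sigma, n_steps, n_paths, seed=0):
    """Average realized gain over n_paths independent histories."""
    rng = random.Random(seed)
    total = sum(simulate_once(s0, strike, mu, sigma, n_steps, rng)
                for _ in range(n_paths))
    return total / n_paths
```

A call such as `estimate_value(100.0, 100.0, -0.05, 0.3, 20, 5000)` would run one experiment of the size discussed here. The sketch evaluates every candidate European option at every step, with no attempt to locate the maximum analytically.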

The program presented in the appendix implements Ross' method on Ross' problem.

Here is the empirical distribution function for the realized return on 5000 simulated price histories:

The 95% confidence interval for 1000 realizations was about $22 wide, so 5000 realizations for a CI width of $6 seems to be a reasonable choice. This simulation took over an hour on my poor old Pentium 100.

Recall that the execution strategy in Ross' method exploits an analytical estimate of the value of continuing to hold the option. In contrast, LSM exploits LSM recursively to estimate that value. Longstaff and Schwartz go to some lengths to explain and justify this; I will confine my remarks to an exposition of the mechanics.

Because it is forbidden to peer into the future of a particular price history in support of an exercise decision, LSM may at first appear to be breaking the rules. However, it uses information from the future of all price histories in a simulation run to compute a function that approximates the expected return on holding the option at each time step. It then uses this estimating function to make all the decisions for the next time step (working backwards) in each history. Once the present is reached, the algorithm has built up a record of the gains realized from all the executions it decided to make and it bases its reported valuation on the average of these gains.

In their paper, Longstaff and Schwartz report that the results obtained by generating independent sets of price history realizations for each step in the recursion are indistinguishable from those obtained by their recommended algorithm. Any theoretical impropriety in the use of future price information is empirically insignificant.

At the core of LSM then is the process that converts the information about realized gains in the tails of all histories into a compact and causality-preserving option value estimating function. Reading the LSM paper one gets the impression that just about any curve fitting procedure will do an acceptable job. For models with multidimensional state spaces they report that fitting against functions of the state variables and all the pair-wise products of the state variables is empirically sufficient.

The overall simulation begins, then, by generating a complete set of price histories. The returns realized by executing the option at the final price are then fitted against the state vectors at the penultimate time step, \(t = N-1\), where \(N\) denotes the final time step. The resulting function is used to guide the execution decision process at \(N-1\), resulting in a new set of realized gains. The new gains are fitted against the state vectors at time \(N-2\) to give an estimating function that is used to guide the execution decisions at that time, and so on.

A further complication is the assertion by Longstaff and Schwartz that best results are obtained when only those histories where the payoff is positive at the time of the decision are included in the curve fitting process. This is justified since these are the only paths for which the strategy will evaluate the estimating function. Since the fitting process is completely redone for each time increment, there is no obvious reason to object to this idea.
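The backward recursion just described can be sketched in Python as follows. This is not the author's *Mathematica* code. It makes several assumptions that should be read as such: a plain quadratic in price/strike stands in for the basis functions, the fit is done by solving the normal equations by hand, no discounting is applied, and a time step is skipped when fewer than ten histories are in the money. All names are hypothetical.

```python
import math
import random


def fit_quadratic(xs, ys):
    """Least-squares coefficients [c0, c1, c2] of y ~ c0 + c1*x + c2*x**2."""
    # Normal equations A c = b for the basis 1, x, x^2.
    A = [[sum(x ** (i + j) for x in xs) for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = s / A[i][i]
    return coef


def lsm_value(s0, strike, mu, sigma, n_steps, n_paths, seed=0):
    """Estimate a call's value by regression-guided backward induction."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        price, path = s0, []
        for _ in range(n_steps):
            price *= math.exp(rng.gauss(mu, sigma))
            path.append(price)
        paths.append(path)
    # Start from the gains realized if every history is held to expiration.
    gains = [max(path[-1] - strike, 0.0) for path in paths]
    # Work backwards from the penultimate step to the first.
    for t in range(n_steps - 2, -1, -1):
        # Only histories with a positive payoff at time t enter the fit.
        itm = [i for i in range(n_paths) if paths[i][t] > strike]
        if len(itm) < 10:
            continue
        xs = [paths[i][t] / strike for i in itm]
        c0, c1, c2 = fit_quadratic(xs, [gains[i] for i in itm])
        for i, x in zip(itm, xs):
            hold_estimate = c0 + c1 * x + c2 * x * x
            payoff = paths[i][t] - strike
            if payoff > hold_estimate:
                gains[i] = payoff  # exercising here replaces any later gain
    return sum(gains) / n_paths
```

Dividing the price by the strike before fitting is a conditioning choice made for this sketch. A call such as `lsm_value(100.0, 100.0, -0.05, 0.3, 20, 5000)` would run it on illustrative parameters that are assumptions, not the exercise's stated values.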

Needless to say, the chief practical difficulty in implementing the algorithm is to program the curve fitting process. Here we will use the weighted Laguerre polynomials recommended by Longstaff and Schwartz and *Mathematica*'s built-in least squares fitting function. For future reference, based on a conversation I had with Dr. Redner, I'd like to investigate fitting rational functions to the value curve minus the hockey stick.

The *Mathematica* code implementing LSM is presented in the appendix.

After some initial success with my LSM implementation I have had difficulty getting stable estimates out of it. Contrary to the assertions of Longstaff and Schwartz I have found the algorithm's performance on Ross' problem to be very sensitive to the choice of basis functions.

I have seen the algorithm converge as expected, but I did not save the configuration and have not been able to reproduce it. The program as reproduced in the appendix represents my stopping place, and it does not presently contain the Laguerre polynomial implementation mentioned above.

I suspect two problems underlie the convergence difficulty I am seeing. First, the unusually large variance in the pricing model seems to play a role. Second, I believe Longstaff and Schwartz are conditioning their data in a way that is not obvious from the paper. A methodical study of the basis function selection, as suggested by Dr. Redner, would no doubt help with the latter problem. The former would be ameliorated in the case of a true American option by choosing a smaller time increment. Note that this leaves the performance of the algorithm in question for periodically exercisable options on highly volatile assets.

Ross' problem is tractable as stated, though a large number of realizations is required to get a meaningful result. Simulation with modest horsepower is unlikely to become a factor in real-time securities trading.

The role of simulation ideas in the theory of options valuation is fundamental. The relationship between the simulated execution threshold and the value approximating function is one that should be better understood.

Based on my experience, the robustness of the LSM algorithm appears to have been overstated. I am confident, however, that with a more careful study of the fitting process it would be a valuable tool.

Due to the instability of the LSM algorithm, no conclusion is offered regarding Ross' assertion about the non-optimality of his estimator.

The following sections define the *Mathematica* code used in the simulations. Recall that we are modelling only call options here.

Generate a stock price realization from the lognormal Markov model. The starting price, number of periods, and the periodic mean and square root of the variance are parameters.
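A Python sketch of such a generator is shown here, since the original *Mathematica* cell is not reproduced. The function name and the optional `rng` argument are assumptions. As in the description, the fourth parameter is the square root of the variance.

```python
import math
import random


def price_history(s0, n_periods, mu, sigma, rng=None):
    """One realization of the lognormal random walk, starting price included.

    mu and sigma are the per-period mean and standard deviation of the
    normally distributed exponent.
    """
    rng = rng or random
    history = [s0]
    for _ in range(n_periods):
        # Each step multiplies the previous price by exp(X), X ~ N(mu, sigma^2).
        history.append(history[-1] * math.exp(rng.gauss(mu, sigma)))
    return history


# Twenty daily prices after a $100 start; the parameter values are illustrative.
example = price_history(100.0, 20, -0.05, 0.3, random.Random(0))
```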

Estimate the value of an option using Ross' method.

We are now prepared to give a solution to Ross' problem.

Based on the observed variance, the 95% confidence interval for the mean would be 48 plus or minus about 3. Since the distribution is clearly not normal, let's see what kind of confidence interval we get by resampling:
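Both kinds of interval can be sketched in Python as follows. This is an illustration rather than the original computation: it uses a percentile bootstrap of the mean, and the function names, the resample count, and the seed are assumptions.

```python
import math
import random
import statistics


def normal_ci(samples, z=1.96):
    """Conventional interval for the mean: mean +/- z * s / sqrt(n)."""
    mean = statistics.fmean(samples)
    half = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half, mean + half


def bootstrap_ci(samples, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(statistics.fmean(rng.choices(samples, k=n))
                   for _ in range(n_resamples))
    lo_index = int((1.0 - level) / 2.0 * n_resamples)
    hi_index = int((1.0 + level) / 2.0 * n_resamples) - 1
    return means[lo_index], means[hi_index]
```

Applied to the list of realized returns from the simulation, the two functions give intervals that can be compared directly. The bootstrap makes no normality assumption about the returns themselves.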

The Bootstrap 95% confidence interval is approximately:

This isn't much different from the conventional estimate. Let us proceed with the LSM computation and compare the results.

Since we want to work with prices across all histories at each time, we will work with the transpose of the history vector:

In the published algorithm, the discount rate is applied to future exercise gains to compute the payoff vector at each time step. Here we will just take the larger gain at each step, a considerable simplification. We will keep track of the payoffs by destructive assignment.

The output result of running the algorithm is a vector of average returns in time-reversed order. Note that it is generally increasing; as the quality of the estimating function improves, the return should approach the optimum value.

In fact what I'm seeing is that the estimating functions are quite unstable. This may be due to the exceptionally large day-to-day variance in Ross' problem. Alternatively, I may misunderstand the fitting process in some fundamental way. While I have seen some LSM runs that behaved as expected, I have not seen any recently.

What is happening is that the estimating functions vary wildly from one day to the next and often severely underestimate the option value on part of their domain. The strategy then executes options in that part of the domain inappropriately, reinforcing the underestimate.

Note that the value of the option declines as we approach the expiration horizon -- the front of the box. Also note that there is a roughly exponential curve in the price/time plane to the left of which the value of the option is negligible. Note as well the "hockey stick" pattern on the front wall, the way the t=1 curve approximates the hockey stick, and the shape of the intercept line on the right and left walls.

Now let's look at the time value alone:

[1] Ross, Sheldon M. *Simulation, 2nd Edition*. Academic Press, 1997. Section 6.7 beginning on page 101.

[2] Longstaff, Francis A. and Schwartz, Eduardo S. "Valuing American Options By Simulation: A Simple Least-Squares Approach", Working paper, University of California at Los Angeles.

[3] Black, F. and Scholes, M. "The Pricing of Options and Corporate Liabilities", *Journal of Political Economy* 81, 1973, pp. 637-654.
