Simulation Project Document

The material in this web page is not in your textbook and it will be represented in the upcoming Exam.

This document will continue to be updated as issues arise. At
this time, the *requirements*
for part 1 and the research component are final, but additional hints
may be added. Parts 2 and 3 I'm still working on. SWN 10/22

Added notes to the end of the theory section and finalized part 2. SWN 10/29

Simulation Project

We use simulation when there is no direct mathematical solution readily available for a problem. One important class of simulation problems is dynamic systems with random disruptions. Another class is discrete event systems, in which only the random events are important. In any case, we perform simulation in order to gain an understanding of the likely behavior of a system subject to random influences. We may be interested in statistical properties like the average cost of repairing an automobile when a random part fails or we may wish to investigate how likely a random error is to cause a spacecraft to lose control of its trajectory.

Simulations of the kind we will study depend on a source of random
numbers to model the random events, inputs, or errors. Most
computers lack any source of truly random numbers so algorithms known
as *pseudo-random number generators*
are used. In practice they work quite well for most applications.

When we assume that a value is
random we (implicitly or explicitly) assume a *distribution*
for that value. We will limit our study to two distributions,
*uniform* and *normal*
(or Gaussian). Uniform random numbers model situations in which any
value in some range is (assumed to be) equally likely. Normal random
numbers model errors of the kind typically encountered in
measurements. The distances of individual projectiles from the
center of a target is likely to be normally distributed, for
instance.

The built-in function rand() provides uniform random numbers in
C++. I've stolen and adapted a function I call frand() for
generating *standard normal*
random numbers; you can steal it from me here.

Uniform values over a range [a,b] can be generated by linearly
transforming the result of rand(): a + rand() *
(b-a)/double(RAND_MAX). Normally distributed values are
characterized by mean (average) and *standard deviation*
(how spread-out the values are). A similar transform, mean +
frand()*std_dev, gives appropriately distributed results.

The heart of a simulation program is a
*model* of the system being studied. You have to design a set
of variables that keep track of all the important features of the
system. For example, a projectile might have position, orientation,
and velocity variables. If your model does not involve the
orientation (say, you assume a spherical projectile) then you would
not need to keep track of orientation.

You then write code to model events
and/or the passage of time in your system. Each event or time unit
causes the *state* of your model to *evolve*.

When the simulation performs to your
satisfaction you *instrument* your model by adding code and
variables to keep tract of whatever it is you want to measure. For
instance, if you are modeling accuracy of a drill press, you need to
keep tract of the positions of the holes it drills, but you do not
need to store every variable that determined those positions.

You then must run the simulations many times using a loop in the driver program. The statistical properties of the instrumentation variables over many simulated runs provide an estimate of the statistical properties of corresponding variables in the real system.

Note: The Microsoft random number
generator you are using gives exactly the same sequence of numbers
each time you run your program. Executing the following line of code
*once* at the beginning of main()
should make each run give different numbers:

srand(time(0));

Note: You can compute the standard deviation of a sequence of numbers without storing them in an array. The formula

sqrt(n sum x^2 + (sum x)^2 )/n

is equivalent to the Standard Deviation formula given in the textbook. We went through the algebra in class.

Note: Several people have asked for clarification on the idea of a
simulation `run'. When you simulate a process from start to finish
using random numbers, that can be thought of as a *run*. Since
the result is at least somewhat random, you want to simulate that
process more than once and compute statistical measures each time you
*run* the overall simulation program. Which meaning the word
run has should be clear from context. If it isn't, ask.

Note: 1,000,000 runs if probably a bit too much. Work up from 100 simulated lifes until your program takes about 30 seconds to run. If your results haven't stabilized by then, something's wrong.

Part 1: A simple dynamic system

A blind man lives in a 21^{st}
floor condominium in downtown Tulsa. He has a 15x15 foot room with a
5 foot window in the center of the North wall. Each day, for
exercise, he enters the room, closes the door, and finds his pogo
stick where he left it the previous day. He then jumps on the pogo
stick until he hits a wall, at which time he stops for the day and
leaves the stick there. On each jump he moves exactly one foot
North, South, East or West, with each direction equally likely.

If our athletic blind man ever hits the window he will fall through it and plunge to his death. Concerned about this, he as come to your company for $100,000 worth of life insurance. In order to determine the premium (price) of the insurance, you decide to simulate his exercise regime to see how long he is likely to live.

Write a program to simulate the exercise regime described above.
Document any additional assumptions you need to make. Instrument the
simulation to output the *life expectancy*
of your customer (in days) as well as the standard deviation of the
simulated lifetimes and likelihood of dying on any given day.

There will be three more parts; by the time we are done, this will be a program of respectable size. The code you write for part one will be a component of the complete project, so do yourself a favor and design it cleanly.

This part should be completed in the lab as soon as possible. With the possible exception of the standard deviation you should be able to get this running this Thursday.

Your program should prompt for the number of times to run the simulation. Run the program several times each with 10, 1000, and 1,000,000 iterations. Note the degree to which the answers agree or disagree among the runs. How many iterations are needed to get a reasonably stable answer?

Submit your working program and convergence analysis to Parijat for Lab credit. Keep a copy of your solution to Part 1 to submit to me as part of your project package.

Research Component

The relationship between simulation statistics and the statistical properties of the real system is an interesting one. Mathematically it is connected to the idea of Monte Carlo integration, but it requires an understanding of probability theory to make the connection. I don't recommend that you go there.

For some aspect of M.C. Integration and/or simulation, find at least two published papers and write a 3-5 page (with illustrations and bibliography) report. This will be the theory paper to accompany your program on this project.

Essentially any coherent report on anything connected to random number in computer programming is acceptable. Examples include:

random number generation algorithms

a problem in your major area (or an area of interest to you) that has been studied using simulation

compare/contrast C++ and Excel (or some other language) as simulation programming tools (i.e. Find two solutions to one problem and compare the code)

These are just examples. I do not expect extensive research or
laborious writing for this. I want you to begin with concept of
simulation (or Monte Carlo integration) and find a couple of sources
that interest *you*. Then
write a simple report that conveys your interest in the topic and
what you learned from those sources. Two pages of text, a couple of
illustrative diagrams, and a 2-4 entry bibliography are all I expect
here.

The research component is due with your final submission package for this project.

Part 2: Financial Simulation

The system we simulated in part one has a simple analytical
solution *if we change the rules slightly*. You should have
found that the likelihood of our customer dying on any given day is
about 1/40 or 0.025. If you have him begin each day in a random
location you will find that probability is 1/12. This is because
exactly 1/12 of his wall is window. One must be careful reasoning in
this way, since the dynamics of an apparently simple system can cause
strange statistical effects. On the other hand, when a slight
modification makes it possible to predict the result mathematically,
it is a very useful test of the rest of your program!

Our finance department predicts that they will earn about 7.5% per year interest on invested funds in the months to come. Since your customer probably won't last long enough for that forecast to change we will assume the actual rate earned varies randomly from week to week around that value.

Assume the interest is earned
every day based on the weekly rate. You will need to model the
beginning and ending of weeks: document any assumptions you make.
The rate is 7.5% plus a standard normal random number (mean zero and
s.d. = 1%) *per year.*
Assuming 365 days/year the daily rate is current_rate/100/365.

If the policy is written (sold), the
customer will pay the *premium*
amount at the beginning of each week and your finance department will
immediately invest that money. On the day the customer dies, the
company pays his estate $100,000. The difference between the
accumulated premiums with interest earned and the benefit paid out is
the profit (or loss) to your company for that run.

This version of your program should accept both an iteration count and a proposed premium (price) for the insurance policy. It should output the expected financial gain or loss to the company (and its standard deviation) along with the life expectancy data as collected in Part 1.

Part 3

Suppose that insurance regulations require that unique policies like this one be priced to generate a certain expected return.

Assume that we are allowed to charge a premium that is expected to
generate 15% *expected *profit on the face value of the policy.

Convert your program from Part 2 into a function that takes the inputs as parameters rather than prompting for them. Choose a root-finding algorithm suitable for this problem and use it to identify a fair premium for this proposed insurance policy.

*Note: We will have a discussion in class about specializing an
algorithm to this problem.*