Eight to Late

The drunkard’s dartboard: an intuitive explanation of Monte Carlo methods

with 8 comments

(Note to the reader: An Excel sheet showing sample calculations and plots discussed in this post  can be downloaded here.)

Monte Carlo simulation techniques have been applied to areas ranging from physics to project management. In earlier posts, I discussed how these methods can be used to simulate project task durations (see this post and this one for example). In those articles, I described simulation procedures in enough detail for readers to be able to reproduce the calculations for themselves. However, as my friend Paul Culmsee mentioned, the mathematics tends to obscure the rationale behind the technique. Indeed, at first sight it seems somewhat paradoxical that one can get accurate answers via random numbers. In this post, I illustrate the basic idea behind Monte Carlo methods through an example that involves nothing more complicated than squares and circles. To begin with, however, I’ll start with something even simpler – a drunken darts player.

Consider a sozzled soul who is throwing darts at a board situated some distance from him.  To keep things simple, we’ll assume the following:

  1. The board is modeled by the circle shown in Figure 1, and our souse scores a point if the dart falls within the circle.
  2. The dart board is inscribed in a square with sides 1 unit long as shown in the figure, and we’ll assume for simplicity that the  dart always  falls somewhere  within the square (our protagonist is not that smashed).
  3. Given his state, our hero’s aim is not quite what it should be –  his darts fall anywhere within the square with equal probability. (Note added on 01 March 2011: See the comment by George Gkotsis below for a critique of this assumption)

Figure 1: "Dartboard" and enclosing square

We can simulate the results of our protagonist’s unsteady attempts by generating two sets of  uniformly distributed random numbers lying between 0 and 1 (This is easily done in Excel using the rand() function).  The pairs of random numbers thus generated – one from each set –  can be treated as the (x,y) coordinates of  the dart for a particular throw. The result of 1000 pairs of random numbers thus generated (representing the drunkard’s dart throwing attempts) is shown in Figure 2 (For those interested in seeing the details, an Excel sheet showing the calculations for 100o trials can be downloaded here).

Figure 2: Result for 1000 throws

A trial results in a “hit” if it lies within the circle.  That is, if it satisfies the following equation:

(x-0.5)^2 + (y-0.5)^2 < 0.25\ldots \ldots (1)

(Note:  if we replace “<”  by “=”  in the above expression, we get the equation for a circle of radius 0.5 units, centered at x=0.5 and y=0.5.)

Now, according to the frequency interpretation of probability, the probability of the plastered player scoring a point is approximated by the ratio of the number of hits in the circle to the total number of attempts. In this case, I get an average of 790/1000 which is 0.79 (generated from 10 sets of 1000 trials each). Your result will be different from because you will generate different sets of random numbers from the ones I did. However, it should be reasonably close to my result.

Further, the frequency interpretation of probability tells us that the approximation becomes more accurate as the number of trials increases. To see why this is so, let’s increase the number of trials and plot the results. I carried out simulations for 2000, 4000, 8000 and 16000 trials. The results of these simulations are shown in Figures 3 through 6.

Figure 3: Result for 2000 throws

Figure 4: Result for 4000 throws


Figure 5: Result for 8000 throws

Figure 6: Result for 16000 throws

Since a dart is equally likely to end up anywhere within the square, the exact probability of a hit is simply the area of the dartboard (i.e. the circle)  divided by the entire area over which the dart can land. In this case, since the area of the enclosure (where the dart must fall) is 1 square unit, the area of the dartboard is actually equal to the probability.  This is easily seen by calculating the area of the circle using the standard formula \pi r^2 where r is the radius of the circle (0.5 units in this case). This yields 0.785398 sq units, which is reasonably close to the number that we got for the 1000 trial case.  In the 16000 trial case, I  get a number that’s closer to the exact result: an average of 0.7860 from 10 sets of 16000 trials.

As we see from Figure 6, in the 16000 trial case, the entire square is peppered with closely-spaced “dart marks” – so much so, that it looks as though the square is a uniform blue.  Hence, it seems intuitively clear that as we increase, the number of throws, we should get a better approximation of the area and, hence, the probability.

There are a couple of points worth mentioning here. First, in principle this technique can be used to calculate areas of figures of any shape. However, the more irregular the figure, the worse the approximation – simply because it becomes less likely that the entire figure will be sampled correctly by “dart throws.” Second,  the reader may have noted that although the 16000 trial case gives a good enough result for the area of the circle, it isn’t particularly accurate considering the large number of trials. Indeed, it is known that the “dart approximation” is not a very good way of calculating areas – see this note for more on this point.

Finally, let’s look at connection between the general approach used in Monte Carlo techniques and the example discussed above  (I use the steps described in the Wikipedia article on Monte Carlo methods as representative of the general approach):

  1. Define a domain of possible inputs – in our case the domain of inputs is defined by the enclosing square of side 1 unit.
  2. Generate inputs randomly from the domain using a certain specified probability distribution – in our example the probability distribution is a pair of independent, uniformly distributed random numbers lying between 0 and 1.
  3. Perform a computation using the inputs – this is the calculation that determines whether or not a particular trial is a hit or not (i.e.  if the x,y coordinates obey inequality  (1) it is a hit, else  it’s a miss)
  4. Aggregate the results of the individual computations into the final result – This corresponds to the calculation of the probability (or equivalently, the area of the circle) by aggregating the number of hits for each set of trials.

To summarise: Monte Carlo algorithms generate random variables (such as probability) according to pre-specified distributions.  In most practical applications  one will use more efficient techniques to sample the distribution (rather than the naïve method I’ve used here.)  However, the basic idea is as simple as playing drunkard’s darts.

Acknowledgements

Thanks go out to Vlado Bokan for helpful conversations while this post was being written and to Paul Culmsee for getting me thinking about a simple way to explain Monte Carlo methods.

Written by K

February 25, 2011 at 4:51 am

8 Responses

Subscribe to comments with RSS.

  1. [...] This post was mentioned on Twitter by Dux Raymond Sy, PMP and Mack Sigman, Kailash Awati. Kailash Awati said: The drunkard's dartboard – an intuitive explanation of Monte Carlo methods: http://wp.me/p6MRd-1fA [...]

  2. Very good way to describe how probabilities can be used to verify theoretical assumptions.

    Having said the above, I think I have an objection with assumption #3, which makes your hypothesis somewhat biased.

    Assumption #3 says that the dart will fall anywhere into the square with equal probability. To follow your (formal) words, choosing “uniformly distributed random numbers lying between 0 and 1″ has yet to be proven.

    I would propose to introduce a ‘skill’ factor, which represents the circle/square ratio (maybe a normal-Gaussian distribution). Of course, this skill factor would be very low (low variance, but still not 0) for a drunken player, but would still take into account the fact that throwing darts into a square is not purely random.

    PS: you have a great blog and I enjoy every single post! Too bad I chose to come out with this “objection”;-)

    George Gkotsis

    February 28, 2011 at 11:32 pm

    • George,

      Thanks for reading and taking the time to comment.

      You make a very valid point: a Normal distribution would indeed be a better model for a drunken dart player (since there is a greater probability that a dart will land close to the center than away from it, however drunk the player may be). Perhaps the assumption becomes less problematic if we blindfold the player? Although then it becomes harder to justify the assumption that the dart always lands within the square.

      Thanks again for your the comment (and for the kind words about my blog!)

      Regards,

      Kailash.

      PS: Perhaps a random dart throwing machine would be a better model than a drunken dart player (if one wants to use a uniform distribution).

      K

      March 1, 2011 at 5:19 am

  3. [...] who’s eyes glaze over the minute you show me any form of algebra. Kailash recent wrote a post to explain Monte Carlo to the masses, but he went and used a mathematical formula (he couldn’t help himself), and thereby lost me [...]

  4. [...] equal probability. In other words, the hit locations were assumed to be uniformly distributed. In a comment on the piece, George Gkotsis challenged this assumption, arguing that that regardless of the level of [...]

  5. [...] equal probability. In other words, the hit locations were assumed to be uniformly distributed. In a comment on the piece, George Gkotsis challenged this assumption, arguing that that regardless of the level of [...]

    » Week in Review

    November 7, 2011 at 3:07 pm

  6. Do you recommend any book ( Statistics ? ) for undersetanding Monte Carlo ? Is there one for a beginner ?

    Mohan

    mohanradhakrishnan

    April 17, 2012 at 9:57 pm

    • Hi Mohan,

      Thanks for your question. The vast majority of books on Monte Carlo tend to be written for technical specialists with a fair background in maths / stats or, at the other extreme, for a general audience. The one book that I found very readable and useful, with enough practical advice on how to get started with simulations is Waltzing with Bears: Managing risk on Software Projects, by Tom DeMarco and Tim Lister. Also, if I may indulge in a shameless plug :-) you may want to have a look at my other introductory articles on Monte Carlo, starting with this one.

      Hope this helps.

      Regards,

      Kailash.

      K

      April 17, 2012 at 10:17 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 282 other followers

%d bloggers like this: