Eight to Late

Sensemaking and Analytics for Organizations

The drunkard’s dartboard revisited: yet another Excel-based example of Monte Carlo simulation

with 6 comments

(Note: An Excel sheet showing sample calculations and plots discussed in this post can be downloaded here.)

Introduction

Some months ago, I wrote a post explaining the basics of Monte Carlo simulation using the example of a drunkard throwing darts at a board. In that post I assumed that the darts could land anywhere on the dartboard with equal probability. In other words, the hit locations were assumed to be uniformly distributed. In a comment on the piece, George Gkotsis challenged this assumption, arguing that regardless of the level of inebriation of the thrower, a dart would be more likely to land near the centre of the board than away from it (provided the player is at least moderately skilled). He also suggested using the Normal distribution to model the spread of hits, with the variance of the distribution serving as a rough measure of the inaccuracy (or drunkenness!) of the drunkard. In George’s words:

I would propose to introduce a ‘skill’ factor, which represents the circle/square ratio (maybe a normal-Gaussian distribution). Of course, this skill factor would be very low (high variance) for a drunken player, but would still take into account the fact that throwing darts into a square is not purely random.

In this post I revisit the drunkard’s dartboard, taking into account George’s suggestions.

Setting the stage

To keep things simple, I’ll make the following assumptions:

Figure 1: The dartboard

  1. The dartboard is a circle of radius 0.5 units centred at the origin (see Figure 1)
  2. The chance of a hit is greatest at the centre of the dartboard and falls off as one moves away from it.
  3. The distribution of hits is a function of distance from the centre but does not depend on direction. In mathematical terms, for a given distance r from the centre of the dartboard, the dart can land at any angle \theta with equal probability, \theta being the angle between the x axis and the line joining the centre of the board to the dart. See Figure 2 for a graphical representation of a hit location in terms of r and \theta. Note that the x and y coordinates can be obtained using the formulas x = r\cos\theta and y = r\sin\theta, as shown in Figure 2.
  4. Hits are distributed according to the Normal distribution with maximum at the centre of the dartboard.
  5. The variance of the Normal distribution is a measure of inaccuracy/drunkenness of the drunkard: the more drunk the drunk, the greater the variation in his aim.

Figure 2: The coordinates of a hit location

These assumptions are consistent with George’s suggestions.

The simulation

[Note to the reader: you may want to download the demo before continuing.]

The steps of a simulation run are as follows:

  1. Generate a number that is normally distributed with a zero mean and a specified standard deviation. This gives the distance, r, of a randomly thrown dart from the centre of the board for a player with an “inaccuracy factor” represented by the standard deviation. Column A in the demo contains normally distributed random numbers with zero mean and a standard deviation of 0.2. Note that I selected the latter number for no other reason than that the results show up clearly on the fixed-axis plot shown in Figure 3.
  2. Generate a uniformly distributed random number lying between 0 and 2\pi. This represents the angle \theta. This is the content of column B of the demo.
  3. The numbers obtained from steps 1 and 2 completely specify the location of a hit. The location’s x and y coordinates can be worked out using the formulas x = r\cos\theta and y = r\sin\theta. These are listed in columns C and D in the Excel demo.
  4. Re-run steps 1 through 3 as many times as needed. Note that the demo is set up for 5000 runs. You can change this manually or, better yet, automate it. The latter is left as an exercise for you.
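For readers who prefer code to spreadsheets, the steps above can be sketched in Python. This is a minimal sketch rather than a transcription of the Excel demo: the function name `simulate_hits`, its parameters and the seed are illustrative assumptions, and only the standard library is used.

```python
import math
import random


def simulate_hits(n_throws=5000, std_dev=0.2, seed=None):
    """Return a list of (x, y) hit locations for n_throws darts.

    Hypothetical helper for illustration; not part of the Excel demo.
    """
    rng = random.Random(seed)
    hits = []
    for _ in range(n_throws):
        # Step 1: distance from the centre, Normal with zero mean.
        # r can be negative; (r, theta) with r < 0 is the same point as
        # (|r|, theta + pi), i.e. the hit is reflected through the origin.
        r = rng.gauss(0.0, std_dev)
        # Step 2: angle, uniform between 0 and 2*pi.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        # Step 3: convert to x and y coordinates.
        hits.append((r * math.cos(theta), r * math.sin(theta)))
    return hits


# Step 4: 5000 throws, matching the number of runs in the demo.
hits = simulate_hits(n_throws=5000, std_dev=0.2, seed=42)
on_board = sum(1 for x, y in hits if math.hypot(x, y) <= 0.5)
print(f"{on_board} of {len(hits)} darts landed on the board")
```

Passing a different `std_dev` plays the role of changing the standard deviation in column A of the demo; omitting `seed` gives a fresh set of throws on every call, much like hitting F9.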

It is instructive to visualize the resulting hits using a scatter plot. Among other things, this can tell you at a glance whether the results make sense. For example, we would expect hits to be symmetrically distributed about the origin because the drunkard’s throws are not biased in any particular direction around the centre. A non-symmetrical distribution is thus an indication that there is an error in the calculations.

Now, any finite collection of hits is unlikely to be perfectly symmetrical because of outliers. Nevertheless, the distributions should be symmetrical on average. To test this, run the demo a few times (hit F9 with the demo open). Notice how the position of outliers and the overall shape of the distribution of points changes randomly from simulation to simulation. In all cases, however, there is a clear maximum at the centre of the dartboard with the probability of a hit falling with distance from the centre.
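The symmetry check can also be done numerically: if the hits are symmetric about the origin, the averages of the x and y coordinates should both be close to zero and should wobble around zero from run to run. The following is a rough sketch under the same sampling assumptions as the steps listed earlier; `mean_position` is a hypothetical helper, and the seed and the choice of five reruns are arbitrary.

```python
import math
import random


def mean_position(n_throws, std_dev, rng):
    """Average x and y coordinates over one simulated run (hypothetical helper)."""
    sum_x = sum_y = 0.0
    for _ in range(n_throws):
        r = rng.gauss(0.0, std_dev)              # distance from the centre
        theta = rng.uniform(0.0, 2.0 * math.pi)  # direction of the hit
        sum_x += r * math.cos(theta)
        sum_y += r * math.sin(theta)
    return sum_x / n_throws, sum_y / n_throws


rng = random.Random(2011)  # arbitrary seed so the reruns are reproducible
means = [mean_position(5000, 0.2, rng) for _ in range(5)]
for run, (mx, my) in enumerate(means, start=1):
    print(f"run {run}: mean x = {mx:+.4f}, mean y = {my:+.4f}")
```

Averages that sit consistently away from zero, or that always share the same sign, would suggest a bias in the calculations rather than the effect of a few outliers.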

Figure 3: Scatter plot for standard deviation=0.2

Figure 3 shows the results of simulations for a standard deviation of 0.2. Figures 4 and 5 show the results of simulations for standard deviations of 0.1 and 0.4 respectively.

Figure 4: Scatter plot for standard deviation=0.1

Note that the plot has fixed axes, i.e. the area depicted is the 1×1 square that encloses the dartboard, regardless of the standard deviation. Consequently, for larger standard deviations (such as 0.4) many hits will be out of range and will not show up on the plot.
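To get a feel for how many hits go missing, one can estimate the fraction that falls inside the plotted square for each standard deviation. The sketch below does this by simulation and, for comparison, computes the fraction that lands on the board itself; under the assumption that r is Normal with zero mean, the latter is exactly erf(0.5 / (σ√2)). The function names, the seed and the number of throws are illustrative assumptions, not something taken from the demo.

```python
import math
import random


def fraction_in_plot(std_dev, n_throws=5000, seed=0):
    """Simulated fraction of hits inside the 1x1 square centred on the origin."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_throws):
        r = rng.gauss(0.0, std_dev)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x, y = r * math.cos(theta), r * math.sin(theta)
        if abs(x) <= 0.5 and abs(y) <= 0.5:
            inside += 1
    return inside / n_throws


def fraction_on_board(std_dev, radius=0.5):
    """Exact probability that |r| <= radius when r ~ Normal(0, std_dev)."""
    return math.erf(radius / (std_dev * math.sqrt(2.0)))


for std_dev in (0.1, 0.2, 0.4):
    print(f"std dev {std_dev}: "
          f"in plot ~ {fraction_in_plot(std_dev):.3f}, "
          f"on board = {fraction_on_board(std_dev):.3f}")
```

Since the board lies entirely within the square, the simulated in-plot fraction should come out at or above the on-board fraction, give or take sampling noise.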

Figure 5: Scatter plot for standard deviation=0.4

Closing remarks

As I have stressed in my previous posts on Monte Carlo simulation, the usefulness of a simulation depends on the choice of an appropriate distribution. If the selected distribution does not reflect reality, neither will the simulation. This is true regardless of whether one is simulating a drunkard’s wayward aim or the duration of a project task. You may have noted that the assumption of normally distributed hits has no justification whatsoever; it is just as arbitrary as my original assumption of uniformity. In fact, the hit locations of drunken dart throws are highly unlikely to be either uniform or Normal. Nevertheless, I hope that some of my readers will find the above example to be of pedagogical value.

Acknowledgement

Thanks to George Gkotsis for his comment which got me thinking about this post.

Written by K

November 3, 2011 at 4:59 am

6 Responses


  1. You might like http://tukhi.com. It does Monte Carlo simulation in Excel.


    Keith A. Lewis

    November 7, 2011 at 1:25 pm

  2. […] The Drunkard’s Dartboard Revisited […]


    » Week in Review

    November 7, 2011 at 3:14 pm

  3. Greetings,

    I am happy to see that my suggestion led to a new post. As far as I can tell, your new post fully complies with my suggestion, Kailash. I assume this makes the drunkard’s dartboard more realistic 🙂

    This discussion led me to think about a new experiment concerning throwing darts and …alcohol. But first, I would like to present a case in which probabilities can play strange games:

    Consider a human with an accuracy of 90% (i.e. he answers 9 out of 10 questions correctly). What is the combined accuracy of two persons with the same 90% accuracy (given that they don’t discuss the questions or interact in any way)? Is it higher, lower or the same?

    (Spoiler alert follows)

    The new accuracy can be calculated by examining a new sample space. The two persons might both give the correct answer with a 90% * 90% = 81% chance. They might contradict each other (i.e. one is correct, one is wrong) with a 10% * 90% + 90% * 10% = 18% chance, and they might both give the wrong answer with a 10% * 10% = 1% chance.

    What I propose is to attempt to compare the number of successful hits between a sober person and two drunkards. To put the same question in different words:
    What would you prefer?
    -One sober throwing N darts?
    -Two drunkards throwing 2*N darts?

    Of course, several assumptions must be made about the ratio of σ1 (the sober player’s variance) to σ2 (the drunkards’ variance). Since the drunkards make twice as many throws, I would start with σ2=2*σ1

    Cheers!
    George


    George Gkotsis

    November 8, 2011 at 4:11 am

  4. George, interesting idea. I see where it goes in the context of estimating.

    When I decided to work out the secrets of estimating I came to the end point that if you want accuracy use historical data, and if you don’t have that, go to an expert.

    Nice.


    Craig Brown

    November 21, 2011 at 10:31 pm

  5. George, Craig,

    Many thanks for your comments and my apologies for the delay in responding.

    Your comments got me thinking about group estimates and their accuracy as compared to individual estimates. I have some ideas on this which I’m yet to work through fully. I hope to write these up in a comment (or maybe even another post) soon.

    Regards,

    Kailash.


    K

    November 25, 2011 at 9:13 pm

  6. […] thanks go out to George Gkotsis and Craig Brown for their comments which inspired this post. […]


