Archive for the ‘Risk analysis’ Category
Enterprise risk management (ERM) refers to the process by which uncertainties are identified, analysed and managed from an organization-wide perspective. In principle such a perspective enables organisations to deal with risks in a holistic manner, avoiding the silo mentality that plagues much of risk management practice. This is the claim made of ERM at any rate, and most practitioners accept it as such. However, whether the claim really holds is another matter altogether. Unfortunately, most of the available critiques of ERM are written for academics or risk management experts. In this post I summarise a critique of ERM presented in a paper by Michael Power entitled, The Risk Management of Nothing.
I’ll begin with a brief overview of ERM frameworks and then summarise the main points of the paper along with some of my comments and annotations.
ERM Frameworks and Definitions
What is ERM?
The best way to answer this question is to look at a couple of well-known ERM frameworks, one from the Casualty Actuarial Society (CAS) and the other from the Committee of Sponsoring Organisations of the Treadway Commission (COSO).
CAS defines ERM as:
… the discipline by which an organization in any industry assesses, controls, exploits, finances, and monitors risks from all sources for the purpose of increasing the organization’s short- and long-term value to its stakeholders.
See this article for an overview of ERM from actuarial perspective.
COSO defines ERM as:
…a process, effected by an entity’s board of directors, management and other personnel, applied in strategy setting and across the enterprise, designed to identify potential events that may affect the entity, and manage risk to be within its risk appetite, to provide reasonable assurance regarding the achievement of entity objectives.
The term risk appetite in the above definition refers to the risk an organisation is willing to bear. See the first article in the June 2003 issue of Internal Auditor for more on the COSO perspective on ERM.
In both frameworks, the focus is very much on quantifying risks through (primarily) financial measures and on establishing accountability for managing these risks in a systematic way.
All this sounds very sensible and uncontroversial. So, where’s the problem?
The problems with ERM
The author of the paper begins with the observation that the basic aim of ERM is to identify risks that can affect an organisation’s objectives and then design controls and mitigation strategies that reduce these risks (collectively) to below a predetermined value that is specified by the organisation’s risk appetite. Operationally, identified risks are monitored and corrective action is taken when they go beyond limits specified by the controls, much like the operation of a thermostat.
In this view, risk management is a mechanistic process. Failures of risk management are seen more as being due to “not doing it right” (implementation failure) or politics getting in the way (organizational friction), rather than a problem with the framework itself. The basic design of the framework is rarely questioned.
Contrary to common wisdom, the author of the paper believes that the design of ERM is flawed in the following three ways:
- The idea of a single, organisation-wide risk appetite is simplistic.
- The assumption that risk can be dealt with by detailed, process-based rules (suitable for audit and control) is questionable.
- The undue focus on developing financial metrics and controls blind it to “bigger picture”, interconnected risks because these cannot be quantified or controlled by such mechanisms.
We’ll now take a look at each of the above in some detail
Appetite vs. appetisation
As mentioned earlier, risk appetite is defined as the risk the organisation is willing to bear. Although ERM frameworks allow for qualitative measures of risk appetite, most organisations implementing ERM tend to prefer quantitative ones. This is a problem because the definition of risk appetite can vary significantly across an organization. For example, the sales and audit functions within an organisation could (will!) have different appetites for risk. As another example, familiar to anyone who reads the news, is that there is usually a big significant gap between the risk appetites of financial institutions and regulatory authorities.
The difference in risk appetites of different stakeholder groups is a manifestation of the fact that risk is a social construct – different stakeholder groups view a given risk in different ways, and some may not even see certain risks as risks (witness the behaviour of certain financial “masters of the universe”)
Since a single, organisation-wide risk appetite is difficult to come up with, the author suggests a different approach – one which takes into account the multiplicity of viewpoints in an organisation; a process he calls “risk appetizing”. This involves getting diverse stakeholders to achieve a consensus / agreement on what constitutes risk appetite. Power argues that this process of reconciling different viewpoints of risk would lead to a more realistic view of the risk the organization is willing to bear. Quoting from the paper:
Conceptualising risk appetising as a process might better direct risk management attention to where it has likely been lacking, namely to the multiplicity of interactions which shape operational and ethical boundaries at the level of organizational practice. COSO-style ERM principles effectively limit the concept of risk appetite within a capital measurement discourse. Framing risk appetite as the process through which ethics and incentives are formed and reformed would not exclude this technical conception, but would bring it closer to the insights of several decades of organization theory.
Explicitly acknowledging the diversity of viewpoints on risk is likely to be closer to reality because:
…a conflictual and pluralistic model is more descriptive of how organizations actually work, and makes lower demands on organizational and political rationality to produce a single ‘appetite’ by explicitly recognising and institutionalising processes by which different appetites and values can be mediated.
Such a process is difficult because it involves getting people who have different viewpoints to agree on what constitutes a sensible definition of risk appetite.
A process bias
A bigger problem, in Power’s view, is that the ERM frameworks overemphasise financial / accounting measures and processes as a means of quantifying and controlling risk. As he puts it ERM:
… is fundamentally an accounting-driven blueprint which emphasises a controls-based approach to risk management. This design emphasis means that efforts at implementation will have an inherent tendency to elaborate detailed controls with corresponding documents trails.
This is a problem because it leads to a “rule-based compliance” mentality wherein risks are managed in a mechanical manner, using bureaucratic processes as a substitute for real thought about risks and how they should be managed. Such a process may work in a make-believe world where all risks are known, but is unlikely to work in one in which there is a great deal of ambiguity.
Power makes the important point that rule-based compliance chews up organizational resources. The tangible effort expended on compliance serves to reassure organizations that they are doing something to manage risks. This is dangerous because it lulls them into a false sense of security:
Rule-based compliance lays down regulations to be met, and requires extensive evidence, audit trails and box ‘checking’. All this demands considerable work and there is daily pressure on operational staff to process regulatory requirements. Yet, despite the workload volume pressure, this is also a cognitively comfortable world which focuses inwards on routine systems and controls. The auditability of this controls architecture can be theorized as a defence against anxiety and enables organizational agents to feel that their work conforms to legitimised principles.
In this comfortable, prescriptive world of process-based risk management, there is little time to imagine and explore what (else) could go wrong. Further, the latter is often avoided because it is a difficult and often uncomfortable process:
…the imagination of alternative futures is likely to involve the production of discomfort, as compared with formal ‘comfort’ of auditing. The approach can take the form of scenario analysis in which participants from different disciplines in an organization can collectively track the trajectory of potential decisions and events. The process begins as an ‘encounter’ with risk and leads to the confrontation of limitation and ambiguity.
Such a process necessarily involves debate and dialogue – it is essentially a deliberative process. And as Power puts it:
The challenge is to expand processes which support interaction and dialogue and de-emphasise due process – both within risk management practice and between regulator and regulated.
This is right of course, but that’s not all: a lot of other process-focused disciplines such as project management would also benefit by acknowledging and responding to this challenge.
A limited view of embeddedness
One of the imperatives of ERM is to “embed” risk management within organisations. Among other things, this entails incorporating risk management explicitly into job descriptions, and making senior managers responsible for managing risks. Although this is a step in the right direction, Power argues that the concept of embeddeness as articulated in ERM remains limited because it focuses on specific business entities, ignoring the wider environment and context in which they exist. The essential (but not always obvious) connections between entities are not necessarily accounted for. As Power puts it:
ERM systems cannot represent embeddedness in the sense of interconnectedness; its proponents seem only to demand an intensification of embedding at the individual entity level. Yet, this latter kind of embedding of a compliance driven risk management, epitomised by the Sarbanes-Oxley legislation, is arguably a disaster in itself, by tying up resources and, much worse, cognition and attention in ‘auditized’ representations of business processes.
In short: the focus on following a process-oriented approach to risk management – as mandated by frameworks – has the potential to de-focus attention from risks that are less obvious, but are potentially more significant.
Addressing the limitations
Power believes the flaws in ERM can be addressed by looking to the practice of business continuity management (BCM). BCM addresses the issue of disaster management – i.e. how to keep an organisation functioning in the event of a disaster. Consequently, there is a significant overlap between the aims of BCM and ERM. However, unlike ERM, BCM draws specialists from different fields and emphasizes collective action. Such an approach is therefore more likely to take a holistic view of risk, and that is the real point.
Regardless of the approach one takes, the point is to involve diverse stakeholders and work towards a shared (enterprise-wide) understanding of risks. Only then will it be possible to develop a risk management plan that incorporates the varying, even contradictory, perspectives that exist within an organisation. There are many techniques to work towards a shared understanding of risks, or any other issues for that matter. Some of these are discussed at length in my book.
Power suggests that ERM, as articulated by bodies such as CAS and COSO, flawed because:
- It attempts to quantify risk appetite at the organizational level – an essentially impossible task because different organizational stakeholders will have different views of risk. Risk is a social construct.
- It advocates a controls and rule-based approach to managing risks. Such a prescriptive “best” practice approach discourages debate and dialogue about risks. Consequently, many viewpoints are missed and quite possibly, so are many risks.
- Despite the rhetoric of ERM, implemented risk management controls and processes often overlook connections and dependencies between entities within organisations. So, although risk management appears to be embedded within the organisation, in reality it may not be so.
Power suggests that ERM practice could learn a few lessons from Business Continuity Management (BCM), in particular about the interconnected nature of business risks and the collective action needed to tackle them. Indeed, any approach that attempts to reconcile diverse risk viewpoints will be a huge improvement on current practice. Until then ERM will continue to be an illusion, offering false comfort to those who are responsible for managing risk.
Project estimates are generally based on assumptions about future events and their outcomes. As the future is uncertain, the concept of probability is sometimes invoked in the estimation process. There’s enough been written about how probabilities can be used in developing estimates; indeed there are a good number of articles on this blog – see this post or this one, for example. However, most of these writings focus on the practical applications of probability rather than on the concept itself – what it means and how it should be interpreted. In this article I address the latter point in a way that will (hopefully!) be of interest to those working in project management and related areas.
Uncertainty is a shape, not a number
Since the future can unfold in a number of different ways one can describe it only in terms of a range of possible outcomes. A good way to explore the implications of this statement is through a simple estimation-related example:
Assume you’ve been asked to do a particular task relating to your area of expertise. From experience you know that this task usually takes 4 days to complete. If things go right, however, it could take as little as 2 days. On the other hand, if things go wrong it could take as long as 8 days. Therefore, your range of possible finish times (outcomes) is anywhere between 2 to 8 days.
Clearly, each of these outcomes is not equally likely. The most likely outcome is that you will finish the task in 4 days. Moreover, the likelihood of finishing in less than 2 days or more than 8 days is zero. If we plot the likelihood of completion against completion time, it would look something like Figure 1.
Figure 1 begs a couple of questions:
- What are the relative likelihoods of completion for all intermediate times – i.e. those between 2 to 4 days and 4 to 8 days?
- How can one quantify the likelihood of intermediate times? In other words, how can one get a numerical value of the likelihood for all times between 2 to 8 days? Note that we know from the earlier discussion that this must be zero for any time less than 2 or greater than 8 days.
The two questions are actually related: as we shall soon see, once we know the relative likelihood of completion at all times (compared to the maximum), we can work out its numerical value.
Since we don’t know anything about intermediate times (I’m assuming there is no historical data available, and I’ll have more to say about this later…), the simplest thing to do is to assume that the likelihood increases linearly (as a straight line) from 2 to 4 days and decreases in the same way from 4 to 8 days as shown in Figure 2. This gives us the well-known triangular distribution.
Note: The term distribution is simply a fancy word for a plot of likelihood vs. time.
Of course, this isn’t the only possibility; there are an infinite number of others. Figure 3 is another (admittedly weird) example.
Further, it is quite possible that the upper limit (8 days) is not a hard one. It may be that in exceptional cases the task could take much longer (say, if you call in sick for two weeks) or even not be completed at all (say, if you leave for that mythical greener pasture). Catering for the latter possibility, the shape of the likelihood might resemble Figure 4.
From the figures above, we see that uncertainties are shapes rather than single numbers, a notion popularised by Sam Savage in his book, The Flaw of Averages. Moreover, the “shape of things to come” depends on a host of factors, some of which may not even be on the radar when a future event is being estimated.
Making likelihood precise
Thus far, I have used the word “likelihood” without bothering to define it. It’s time to make the notion more precise. I’ll begin by asking the question: what common sense properties do we expect a quantitative measure of likelihood to have?
Consider the following:
- If an event is impossible, its likelihood should be zero.
- The sum of likelihoods of all possible events should equal complete certainty. That is, it should be a constant. As this constant can be anything, let us define it to be 1.
In terms of the example above, if we denote time by and the likelihood by then:
Where denotes the sum of all non-zero likelihoods – i.e. those that lie between 2 and 8 days. In simple terms this is the area enclosed by the likelihood curves and the x axis in figures 2 to 4. (Technical Note: Since is a continuous variable, this should be denoted by an integral rather than a simple sum, but this is a technicality that need not concern us here)
is , in fact, what mathematicians call probability- which explains why I have used the symbol rather than . Now that I’ve explained what it is, I’ll use the word “probability” instead of ” likelihood” in the remainder of this article.
With these assumptions in hand, we can now obtain numerical values for the probability of completion for all times between 2 and 8 days. This can be figured out by noting that the area under the probability curve (the triangle in figure 2 and the weird shape in figure 3) must equal 1. I won’t go into any further details here, but those interested in the maths for the triangular case may want to take a look at this post where the details have been worked out.
The meaning of it all
(Note: parts of this section borrow from my post on the interpretation of probability in project management)
So now we understand how uncertainty is actually a shape corresponding to a range of possible outcomes, each with their own probability of occurrence. Moreover, we also know, in principle, how the probability can be calculated for any valid value of time (between 2 and 8 days). Nevertheless, we are still left with the question as to what a numerical probability really means.
As a concrete case from the example above, what do we mean when we say that there is 100% chance (probability=1) of finishing within 8 days? Some possible interpretations of such a statement include:
- If the task is done many times over, it will always finish within 8 days. This is called the frequency interpretation of probability, and is the one most commonly described in maths and physics textbooks.
- It is believed that the task will definitely finish within 8 days. This is called the belief interpretation. Note that this interpretation hinges on subjective personal beliefs.
- Based on a comparison to similar tasks, the task will finish within 8 days. This is called the support interpretation.
Note that these interpretations are based on a paper by Glen Shafer. Other papers and textbooks frame these differently.
The first thing to note is how different these interpretations are from each other. For example, the first one offers a seemingly objective interpretation whereas the second one is unabashedly subjective.
So, which is the best – or most correct – one?
A person trained in science or mathematics might claim that the frequency interpretation wins hands down because it lays out an objective, well -defined procedure for calculating probability: simply perform the same task many times and note the completion times.
Problem is, in real life situations it is impossible to carry out exactly the same task over and over again. Sure, it may be possible to almost the same task, but even straightforward tasks such as vacuuming a room or baking a cake can hold hidden surprise (vacuum cleaners do malfunction and a friend may call when one is mixing the batter for a cake). Moreover, tasks that are complex (as is often the case in the project work) tend to be unique and can never be performed in exactly the same way twice. Consequently, the frequency interpretation is great in theory but not much use in practice.
“That’s OK,” another estimator might say,” when drawing up an estimate, I compared it to other similar tasks that I have done before.”
This is essentially the support interpretation (interpretation 3 above). However, although this seems reasonable, there is a problem: tasks that are superficially similar will differ in the details, and these small differences may turn out to be significant when one is actually carrying out the task. One never knows beforehand which variables are important. For example, my ability to finish a particular task within a stated time depends not only on my skill but also on things such as my workload, stress levels and even my state of mind. There are many external factors that one might not even recognize as being significant. This is a manifestation of the reference class problem.
So where does that leave us? Is probability just a matter of subjective belief?
No, not quite: in reality, estimators will use some or all of three interpretations to arrive at “best guess” probabilities. For example, when estimating a project task, a person will likely use one or more of the following pieces of information:
- Experience with similar tasks.
- Subjective belief regarding task complexity and potential problems. Also, their “gut feeling” of how long they think it ought to take. These factors often drive excess time or padding that people work into their estimates.
- Any relevant historical data (if available)
Clearly, depending on the situation at hand, estimators may be forced to rely on one piece of information more than others. However, when called upon to defend their estimates, estimators may use other arguments to justify their conclusions depending on who they are talking to. For example, in discussions involving managers, they may use hard data presented in a way that supports their estimates, whereas when talking to their peers they may emphasise their gut feeling based on differences between the task at hand and similar ones they have done in the past. Such contradictory representations tend to obscure the means by which the estimates were actually made.
Estimates are invariably made in the face of uncertainty. One way to get a handle on this is by estimating the probabilities associated with possible outcomes. Probabilities can be reckoned in a number of different ways. Clearly, when using them in estimation, it is crucial to understand how probabilities have been derived and the assumptions underlying these. We have seen three ways in which probabilities are interpreted corresponding to three different ways in which they are arrived at. In reality, estimators may use a mix of the three approaches so it isn’t always clear how the numerical value should be interpreted. Nevertheless, an awareness of what probability is and its different interpretations may help managers ask the right questions to better understand the estimates made by their teams.
The essential idea behind group estimation is that an estimate made by a group is likely to be more accurate than one made by an individual in the group. This notion is the basis for the Delphi method and its variants. In this post, I use arguments involving probabilities to gain some insight into the conditions under which group estimates are more accurate than individual ones.
An insight from conditional probability
Let’s begin with a simple group estimation scenario.
Assume we have two individuals of similar skill who have been asked to provide independent estimates of some quantity, say a project task duration. Further, let us assume that each individual has a probability of making a correct estimate.
Based on the above, the probability that they both make a correct estimate, , is:
This is a consequence of our assumption that the individual estimates are independent of each other.
Similarly, the probability that they both get it wrong, , is:
Now we can ask the following question:
What is the probability that both individuals make the correct estimate if we know that they have both made the same estimate?
This can be figured out using Bayes’ Theorem, which in the context of the question can be stated as follows:
In the above equation, is the probability that both individuals get it right given that they have made the same estimate (which is what we want to figure out). This is an example of a conditional probability – i.e. the probability that an event occurs given that another, possibly related event has already occurred. See this post for a detailed discussion of conditional probabilities.
Similarly, is the conditional probability that both estimators make the same estimate given that they are both correct. This probability is 1.
Answer: If both estimators are correct then they must have made the same estimate (i.e. they must both within be an acceptable range of the right answer).
Finally, is the probability that both make the same estimate. This is simply the sum of the probabilities that both get it right and both get it wrong. Expressed in terms of this is, .
Now lets apply Bayes’ theorem to the following two cases:
- Both individuals are good estimators – i.e. they have a high probability of making a correct estimate. We’ll assume they both have a 90% chance of getting it right ().
- Both individuals are poor estimators – i.e. they have a low probability of making a correct estimate. We’ll assume they both have a 30% chance of getting it right ()
Consider the first case. The probability that both estimators get it right given that they make the same estimate is:
Thus we see that the group estimate has a significantly better chance of being right than the individual ones: a probability of 0.9878 as opposed to 0.9.
In the second case, the probability that both get it right is:
The situation is completely reversed: the group estimate has a much smaller chance of being right than an individual estimate!
In summary: estimates provided by a group consisting of individuals of similar ability working independently are more likely to be right (compared to individual estimates) if the group consists of competent estimators and more likely to be wrong (compared to individual estimates) if the group consists of poor estimators.
Assumptions and complications
I have made a number of simplifying assumptions in the above argument. I discuss these below with some commentary.
- The main assumption is that individuals work independently. This assumption is not valid for many situations. For example, project estimates are often made by a group of people working together. Although one can’t work out what will happen in such situations using the arguments of the previous section, it is reasonable to assume that given the right conditions, estimators will use their collective knowledge to work collaboratively. Other things being equal, such collaboration would lead a group of skilled estimators to reinforce each others’ estimates (which are likely to be quite similar) whereas less skilled ones may spend time arguing over their (possibly different and incorrect) guesses. Based on this, it seems reasonable to conjecture that groups consisting of good estimators will tend to make even better estimates than they would individually whereas those consisting of poor estimators have a significant chance of making worse ones.
- Another assumption is that an estimate is either good or bad. In reality there is a range that is neither good nor bad, but may be acceptable.
- Yet another assumption is that an estimator’s ability can be accurately quantified using a single numerical probability. This is fine providing the number actually represents the person’s estimation ability for the situation at hand. However, typically such probabilities are evaluated on the basis of past estimates. The problem is, every situation is unique and history may not be a good guide to the situation at hand. The best way to address this is to involve people with diverse experience in the estimation exercise. This will almost often lead to a significant spread of estimates which may then have to be refined by debate and negotiation.
Real-life estimation situations have a number of other complications. To begin with, the influence that specific individuals have on the estimation process may vary – a manager who is a poor estimator may, by virtue of his position, have a greater influence than others in a group. This will skew the group estimate by a factor that cannot be estimated. Moreover, strategic behaviour may influence estimates in a myriad other ways. Then there is the groupthink factor as well.
…and I’m sure there are many others.
Finally I should mention that group estimates can depend on the details of the estimation process. For example, research suggests that under certain conditions competition can lead to better estimates than cooperation.
In this post I have attempted to make some general inferences regarding the validity of group estimates based on arguments involving conditional probabilities. The arguments suggest that, all other things being equal, a collective estimate from a bunch of skilled estimators will generally be better than their individual estimates whereas an estimate from a group of less skilled estimators will tend to be worse than their individual estimates. Of course, in real life, there are a host of other factors that can come into play: power, politics and biases being just a few. Though these are often hidden, they can influence group estimates in inestimable ways.
Thanks go out to George Gkotsis and Craig Brown for their comments which inspired this post.
More often than not, managerial decisions are made on the basis of uncertain information. To lend some rigour to the process of decision making, it is sometimes assumed that uncertainties of interest can be quantified accurately using probabilities. As it turns out, this assumption can be incorrect in many situations because the probabilities themselves can be uncertain. In this post I discuss a couple of ways in which such uncertainty about uncertainty can manifest itself.
The problem of vagueness
In a paper entitled, “Is Probability the Only Coherent Approach to Uncertainty?”, Mark Colyvan made a distinction between two types of uncertainty:
- Uncertainty about some underlying fact. For example, we might be uncertain about the cost of a project – that there will be a cost is a fact, but we are uncertain about what exactly it will be.
- Uncertainty about situations where there is no underlying fact. For example, we might be uncertain about whether customers will be satisfied with the outcome of a project. The problem here is the definition of customer satisfaction. How do we measure it? What about customers who are neither satisfied nor dissatisfied? There is no clear-cut definition of what customer satisfaction actually is.
The first type of uncertainty refers to the lack of knowledge about something that we know exists. This is sometimes referred to as epistemic uncertainty – i.e. uncertainty pertaining to knowledge. Such uncertainty arises from imprecise measurements, changes in the object of interest etc. The key point is that we know for certain that the item of interest has well-defined properties, but we don’t know what they are and hence the uncertainty. Such uncertainty can be quantified accurately using probability.
Vagueness, on the other hand, arises from an imprecise use of language. Specifically, the term refers to the use of criteria that cannot distinguish between borderline cases. Let’s clarify this using the example discussed earlier. A popular way to measure customer satisfaction is through surveys. Such surveys may be able to tell us that customer A is more satisfied than customer B. However, they cannot distinguish between borderline cases because any boundary between satisfied and not satisfied customers is arbitrary. This problem becomes apparent when considering an indifferent customer. How should such a customer be classified – satisfied or not satisfied? Further, what about customers who choose not to respond? It is therefore clear that any numerical probability computed from such data cannot be considered accurate. In other words, the probability itself is uncertain.
Ambiguity in classification
Although the distinction made by Colyvan is important, there is a deeper issue that can afflict uncertainties that appear to be quantifiable at first sight. To understand how this happens, we’ll first need to take a brief look at how probabilities are usually computed.
An operational definition of probability is that it is the ratio of the number of times the event of interest occurs to the total number of events observed. For example, if my manager notes my arrival times at work over 100 days and finds that I arrive before 8:00 am on 62 days then he could infer that the probability my arriving before 8:00 am is 0.62. Since the probability is assumed to equal the frequency of occurrence of the event of interest, this is sometimes called the frequentist interpretation of probability.
The above seems straightforward enough, so you might be asking: where’s the problem?
The problem is that events can generally be classified in several different ways and the computed probability of an event occurring can depend on the way that it is classified. This is called the reference class problem. In a paper entitled, “The Reference Class Problem is Your Problem Too”, Alan Hajek described the reference class problem as follows:
“The reference class problem arises when we want to assign a probability to a proposition (or sentence, or event) X, which may be classified in various ways, yet its probability can change depending on how it is classified.”
Consider the situation I mentioned earlier. My manager’s approach seems reasonable, but there is a problem with it: all days are not the same as far as my arrival times are concerned. For example, it is quite possible that my arrival time is affected by the weather: I may arrive later on rainy days than on sunny ones. So, to get a better estimate my manager should also factor in the weather. He would then end up with two probabilities, one for fine weather and the other for foul. However, that is not all: there are a number of other criteria that could affect my arrival times – for example, my state of health (I may call in sick and not come in to work at all), whether I worked late the previous day etc.
What seemed like a straightforward problem is no longer so because of the uncertainty regarding which reference class is the right one to use.
Before closing this section, I should mention that the reference class problem has implications for many professional disciplines. I have discussed its relevance to project management in my post entitled, “The reference class problem and its implications for project management”.
In this post we have looked at a couple of forms of uncertainty about uncertainty that have practical implications for decision makers. In particular, we have seen that probabilities used in managerial decision making can be uncertain because of vague definitions of events and/or ambiguities in their classification. The bottom line for those who use probabilities to support decision-making is to ensure that the criteria used to determine events of interest refer to unambiguous facts that are appropriate to the situation at hand. To sum up: decisions made on the basis of probabilities are only as good as the assumptions that go into them, and the assumptions themselves may be prone to uncertainties such as the ones described in this article.
The discussion of risk in presented in most textbooks and project management courses follows the well-trodden path of risk identification, analysis, response planning and monitoring (see the PMBOK guide, for example). All good stuff, no doubt. However, much of the guidance offered is at a very high level. Among other things, there is little practical advice on what not to do. In this post I address this issue by outlining some of the common pitfalls in project risk analysis.
1. Reliance on subjective judgement: People see things differently: one person’s risk may even be another person’s opportunity. For example, using a new technology in a project can be seen as a risk (when focusing on the increased chance of failure) or opportunity (when focusing on the opportunities afforded by being an early adopter). This is a somewhat extreme example, but the fact remains that individual perceptions influence the way risks are evaluated. Another problem with subjective judgement is that it is subject to cognitive biases – errors in perception. Many high profile project failures can be attributed to such biases: see my post on cognitive bias and project failure for more on this. Given these points, potential risks should be discussed from different perspectives with the aim of reaching a common understanding of what they are and how they should be dealt with.
2. Using inappropriate historical data: Purveyors of risk analysis tools and methodologies exhort project managers to determine probabilities using relevant historical data. The word relevant is important: it emphasises that the data used to calculate probabilities (or distributions) should be from situations that are similar to the one at hand. Consider, for example, the probability of a particular risk – say, that a particular developer will not be able to deliver a module by a specified date. One might have historical data for the developer, but the question remains as to which data points should be used. Clearly, only those data points that are from projects that are similar to the one at hand should be used. But how is similarity defined? Although this is not an easy question to answer, it is critical as far as the relevance of the estimate is concerned. See my post on the reference class problem for more on this point.
3. Focusing on numerical measures exclusively: There is a widespread perception that quantitative measures of risk are better than qualitative ones. However, even where reliable and relevant data is available, the measures still need to based on sound methodologies. Unfortunately, ad-hoc techniques abound in risk analysis: see my posts on Cox’s risk matrix theorem and limitations of risk scoring methods for more on these. Risk metrics based on such techniques can be misleading. As Glen Alleman points out in this comment, in many situations qualitative measures may be more appropriate and accurate than quantitative ones.
4. Ignoring known risks: It is surprising how often known risks are ignored. The reasons for this have to do with politics and mismanagement. I won’t dwell on this as I have dealt with it at length in an earlier post.
5. Overlooking the fact that risks are distributions, not point values: Risks are inherently uncertain, and any uncertain quantity is represented by a range of values, (each with an associated probability) rather than a single number (see this post for more on this point). Because of the scarcity or unreliability of historical data, distributions are often assumed a priori: that is, analysts will assume that the risk distribution has a particular form (say, normal or lognormal) and then evaluate distribution parameters using historical data. Further, analysts often choose simple distributions that that are easy to work with mathematically. These distributions often do not reflect reality. For example, they may be vulnerable to “black swan” occurences because they do not account for outliers.
6. Failing to update risks in real time: Risks are rarely static – they evolve in time, influenced by circumstances and events both in and outside the project. For example, the acquisition of a key vendor by a mega-corporation is likely to affect the delivery of that module you are waiting on –and quite likely in an adverse way. Such a change in risk is obvious; there may be many that aren’t. Consequently, project managers need to reevaluate and update risks periodically. To be fair, this is a point that most textbooks make – but it is advice that is not followed as often as it should be.
This brings me to the end of my (subjective) list of risk analysis pitfalls. Regular readers of this blog will have noticed that some of the points made in this post are similar to the ones I made in my post on estimation errors. This is no surprise: risk analysis and project estimation are activities that deal with an uncertain future, so it is to be expected that they have common problems and pitfalls. One could generalize this point: any activity that involves gazing into a murky crystal ball will be plagued by similar problems.
(Note to the reader: An Excel sheet showing sample calculations and plots discussed in this post can be downloaded here.)
Monte Carlo simulation techniques have been applied to areas ranging from physics to project management. In earlier posts, I discussed how these methods can be used to simulate project task durations (see this post and this one for example). In those articles, I described simulation procedures in enough detail for readers to be able to reproduce the calculations for themselves. However, as my friend Paul Culmsee mentioned, the mathematics tends to obscure the rationale behind the technique. Indeed, at first sight it seems somewhat paradoxical that one can get accurate answers via random numbers. In this post, I illustrate the basic idea behind Monte Carlo methods through an example that involves nothing more complicated than squares and circles. To begin with, however, I’ll start with something even simpler – a drunken darts player.
Consider a sozzled soul who is throwing darts at a board situated some distance from him. To keep things simple, we’ll assume the following:
- The board is modeled by the circle shown in Figure 1, and our souse scores a point if the dart falls within the circle.
- The dart board is inscribed in a square with sides 1 unit long as shown in the figure, and we’ll assume for simplicity that the dart always falls somewhere within the square (our protagonist is not that smashed).
- Given his state, our hero’s aim is not quite what it should be - his darts fall anywhere within the square with equal probability. (Note added on 01 March 2011: See the comment by George Gkotsis below for a critique of this assumption)
We can simulate the results of our protagonist’s unsteady attempts by generating two sets of uniformly distributed random numbers lying between 0 and 1 (This is easily done in Excel using the rand() function). The pairs of random numbers thus generated – one from each set – can be treated as the (x,y) coordinates of the dart for a particular throw. The result of 1000 pairs of random numbers thus generated (representing the drunkard’s dart throwing attempts) is shown in Figure 2 (For those interested in seeing the details, an Excel sheet showing the calculations for 100o trials can be downloaded here).
A trial results in a “hit” if it lies within the circle. That is, if it satisfies the following equation:
(Note: if we replace “<” by “=” in the above expression, we get the equation for a circle of radius 0.5 units, centered at x=0.5 and y=0.5.)
Now, according to the frequency interpretation of probability, the probability of the plastered player scoring a point is approximated by the ratio of the number of hits in the circle to the total number of attempts. In this case, I get an average of 790/1000 which is 0.79 (generated from 10 sets of 1000 trials each). Your result will be different from because you will generate different sets of random numbers from the ones I did. However, it should be reasonably close to my result.
Further, the frequency interpretation of probability tells us that the approximation becomes more accurate as the number of trials increases. To see why this is so, let’s increase the number of trials and plot the results. I carried out simulations for 2000, 4000, 8000 and 16000 trials. The results of these simulations are shown in Figures 3 through 6.
Since a dart is equally likely to end up anywhere within the square, the exact probability of a hit is simply the area of the dartboard (i.e. the circle) divided by the entire area over which the dart can land. In this case, since the area of the enclosure (where the dart must fall) is 1 square unit, the area of the dartboard is actually equal to the probability. This is easily seen by calculating the area of the circle using the standard formula where is the radius of the circle (0.5 units in this case). This yields 0.785398 sq units, which is reasonably close to the number that we got for the 1000 trial case. In the 16000 trial case, I get a number that’s closer to the exact result: an average of 0.7860 from 10 sets of 16000 trials.
As we see from Figure 6, in the 16000 trial case, the entire square is peppered with closely-spaced “dart marks” – so much so, that it looks as though the square is a uniform blue. Hence, it seems intuitively clear that as we increase, the number of throws, we should get a better approximation of the area and, hence, the probability.
There are a couple of points worth mentioning here. First, in principle this technique can be used to calculate areas of figures of any shape. However, the more irregular the figure, the worse the approximation – simply because it becomes less likely that the entire figure will be sampled correctly by “dart throws.” Second, the reader may have noted that although the 16000 trial case gives a good enough result for the area of the circle, it isn’t particularly accurate considering the large number of trials. Indeed, it is known that the “dart approximation” is not a very good way of calculating areas – see this note for more on this point.
Finally, let’s look at connection between the general approach used in Monte Carlo techniques and the example discussed above (I use the steps described in the Wikipedia article on Monte Carlo methods as representative of the general approach):
- Define a domain of possible inputs – in our case the domain of inputs is defined by the enclosing square of side 1 unit.
- Generate inputs randomly from the domain using a certain specified probability distribution – in our example the probability distribution is a pair of independent, uniformly distributed random numbers lying between 0 and 1.
- Perform a computation using the inputs – this is the calculation that determines whether or not a particular trial is a hit or not (i.e. if the x,y coordinates obey inequality (1) it is a hit, else it’s a miss)
- Aggregate the results of the individual computations into the final result – This corresponds to the calculation of the probability (or equivalently, the area of the circle) by aggregating the number of hits for each set of trials.
To summarise: Monte Carlo algorithms generate random variables (such as probability) according to pre-specified distributions. In most practical applications one will use more efficient techniques to sample the distribution (rather than the naïve method I’ve used here.) However, the basic idea is as simple as playing drunkard’s darts.
Thanks go out to Vlado Bokan for helpful conversations while this post was being written and to Paul Culmsee for getting me thinking about a simple way to explain Monte Carlo methods.
When developing duration estimates for a project task, it is useful to make a distinction between the uncertainty inherent in the task and uncertainty due to known risks. The former is uncertainty due to factors that are not known whereas the latter corresponds uncertainty due to events that are known, but may or may not occur. In this post, I illustrate how the two types of uncertainty can be combined via Monte Carlo simulation. Readers may find it helpful to keep my introduction to Monte Carlo simulations of project tasks handy, as I refer to it extensively in the present piece.
Setting the stage
Let’s assume that there’s a task that needs doing, and the person who is going to do it reckons it will take between 2 and 8 hours to complete it, with a most likely completion time of 4 hours. How the estimator comes up with these numbers isn’t important here – maybe there’s some guesswork, maybe some padding or maybe it is really based on experience (as it should be). For simplicity we’ll assume the probability distribution for the task duration is triangular. It is not hard to show that, given the above mentioned estimates, the probability, , that the task will finish at time is given by the equations below (see my introductory post for a detailed derivation):
for 2 hours 4 hours
for 4 hours 8 hours
These two expressions are sometimes referred to as the probability distribution function (PDF). The PDF described by equations (1) and (2) is illustrated in Figure 1. (Note: Please click on the Figures to view full-size images)
Now, a PDF tells us the probability that the task will finish at a particular time . However, we are more interested in knowing whether or not the task will be completed by time . – i.e. at or before time . This quantity, which we’ll denote by (capital P), is sometimes known as the cumulative distribution function (CDF). The CDF is obtained by summing up the probabilities from hrs to time . It is not hard to show that the CDF for the task at hand is given by the following equations:
for 2 hours 4 hours
for 4 hours 8 hours
For a detailed derivation, please see my introductory post. The CDF for the distribution is shown in Figure 2.
Now for the complicating factor: let us assume there is a risk that has a bearing on this task. The risk could be any known factor that has a negative impact on task duration. For example, it could be that a required resource is delayed or that the deliverable will fails a quality check and needs rework. The consequence of the risk – should it eventuate – is that the task takes longer. How much longer the task takes depends on specifics of the risk. For the purpose of this example we’ll assume that the additional time taken is also described by a triangular distribution with a minimum, most likely and maximum time of 1, 2 and 3 hrs respectively. The PDF for the additional time taken due to the risk is:
for 1 hour 2 hours
for 2 hrs 3 hours
The figure for this distribution is shown in Figure 3.
The CDF for the additional time taken if the risk eventuates (which we’ll denote by ) is given by:
for 1 hour 2 hours
for 2 hours 3 hours
The CDF for the risk consequence is shown in Figure 4.
Before proceeding with the simulation it is worth clarifying what all this means, and what we want to do with it.
Firstly, equations 1-4 describe the inherent uncertainty associated with the task while equations 5 through 8 describe the consequences of the risk, if it eventuates.
Secondly, we have described the task and the risk separately. In reality, we need a unified description of the two – a combined distribution function for the uncertainty associated with the task and the risk taken together. This is what the simulation will give us.
Finally, one thing I have not yet specified is the probabilty that the risk will actually occur. Clearly, the higher the probability, the greater the potential delay. Below I carry out simulations for risk probabilities of varying from 0.1 to 0.5.
That completes the specification of the problem – let’s move on to the simulation.
The simulation procedure I used for the zero-risk case (i.e. the task described by equations 1 and 2 ) is as follows :
- Generate a random number between 0 and 1. Treat this number as the cumulative probability, for the simulation run. [You can generate random numbers in Excel using the rand() function]
- Find the time, , corresponding to by solving equations (3) or (4) for . The resulting value of is the time taken to complete the task.
- Repeat steps (1) and (2) for a sufficiently large number of trials.
The frequency distribution of completion times for the task, based on 30,000 trials is shown in Figure 5.
As we might expect, Figure 5 can be translated to the probability distribution shown in Figure 1 by a straightforward normalization – i.e. by dividing each bar by the total number of trials.
What remains to be done is to incorporate the risk (as modeled in equations 5-6) into the simulation. To simulate the task with the risk, we simply do the following for each trial:
- Simulate the task without the risk as described earlier.
- Generate another random number between 0 and 1.
- If the random number is less than the probability of the risk, then simulate the risk. Note that since the risk is described by a triangular function, the procedure to simulate it is the same as that for the task (albeit with different parameters).
- If the random number is greater than the probability of the risk, do nothing.
- Add the results of 1 and 4. This is the outcome of the trial.
- Repeat steps 1-5 for as many trials as required.
I performed simulations for the task with risk probabilities of 10%, 30% and 50%. The frequency distributions of completion times for these are displayed in Figures 6-8 (in increasing order of probability). As one would expect, the spread of times increases with increasing probability. Further, the distribution takes on a distinct second peak as the probability increases: the first peak is at , corresponding to the most likely completion time of the risk-free task and the second at corresponding to the most likely additional time of 2 hrs if the risk eventuates.
It is also instructive to compare average completion times for the four cases (zero-risk and 10%, 30% and 50%). The average can computed from the simulation by simply adding up the simulated completion times (for all trials) and dividing by the total number of simulation trials (30,000 in our case). On doing this, I get the following:
Average completion time for zero-risk case = 4.66 hr
Average completion time with 10% probability of risk = 4.89 hrs
Average completion time with 30% probability of risk = 5.36 hrs
Average completion time with 50% probability of risk= 5.83 hrs
No surprises here.
One point to note is that the result obtained from the simulation for the zero-risk case compares well with the exact formula for a triangular distribution (see the Wikipedia article for the triangular distribution):
This serves as a sanity check on the simulation procedure.
It is also interesting to compare the cumulative probabilities of completion in the zero-risk and high risk (50% probability) case. The CDFs for the two are shown in Figure 9. The co-plotted CDFs allow for a quick comparison of completion time predictions. For example, in the zero-risk case, there is about a 90% chance that the task will be completed in a little over 6 hrs whereas when the probability of the risk is 50%, the 90% completion time increases to 8 hrs (see Figure 9).
Next steps and wrap up
For those who want to learn more about simulating project uncertainty and risk, I can recommend the UK MOD paper – Three Point Estimates And Quantitative Risk Analysis A Process Guide For Risk Practitioners. The paper provides useful advice on how three point estimates for project variables should be constructed. It also has a good discussion of risk and how it should be combined with the inherent uncertainty associated with a variable. Indeed, the example I have described above was inspired by the paper’s discussion of uncertainty and risk.
Of course, as with any quantitative predictions of project variables, the numbers are only as reliable as the assumptions that go into them, the main assumption here being the three point estimates that were used to derive the distributions for the task uncertainty and risk (equations 1-2 and 5-6). Typically these are obtained from historical data. However, there are well known problems associated with history-based estimates. For one, as we can never be sure that the historical tasks are similar to the one at hand in ways that matter (this is the reference class problem). As Shim Marom warns us in this post, all our predictions depend on the credibility of our estimates. Quoting from his post:
Can you get credible three point estimates? Do you have access to credible historical data to support that? Do you have access to Subject Matter Experts (SMEs) who can assist in getting these credible estimates?
If not, don’t bother using Monte Carlo.
In closing, I hope my readers will find this simple example useful in understanding how uncertainty and risk can be accounted for using Monte Carlo simulations. In the end, though, one should always keep in mind that the use of sophisticated techniques does not magically render one immune to the GIGO principle.