Archive for the ‘Bias’ Category
On the limitations of scoring methods for risk analysis
Introduction
A couple of months ago I wrote an article highlighting some of the pitfalls of using risk matrices. Risk matrices are an example of scoring methods, techniques which use ordinal scales to assess risks. In these methods, risks are ranked by some predefined criteria such as impact or expected loss, and the ranking is then used as the basis for decisions on how the risks should be addressed. Scoring methods are popular because they are easy to use. However, as Douglas Hubbard points out in his critique of current risk management practices, many commonly used scoring techniques are flawed. This post – based on Hubbard’s critique and research papers quoted therein – is a brief look at some of the flaws of risk scoring techniques.
Commonly used risk scoring techniques and problems associated with them
Scoring techniques fall under two major categories:
- Weighted scores: These use several ordered scales which are weighted according to perceived importance. For example, one might be asked to rate financial risk, technical risk and organisational risk on a scale of 1 to 5 for each, and then weight them by factors of 0.6, 0.3 and 0.1 respectively (possibly because the CFO – who happens to be the project sponsor – is more concerned about financial risk than any other risks).
- Risk matrices: These rank risks along two dimensions – probability and impact – and assign them a qualitative ranking of high, medium or low depending on where they fall. Cox’s theorem shows such categorisations are internally inconsistent because the category boundaries are arbitrarily chosen.
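To make the weighted-score mechanics concrete, here is a minimal Python sketch. The weights (0.6, 0.3, 0.1) come from the example above; the two risk profiles and the function name `weighted_score` are hypothetical and chosen purely for illustration. It shows that two quite different risk profiles can collapse to the same number.

```python
# Weights taken from the example in the text; everything else is hypothetical.
WEIGHTS = {"financial": 0.6, "technical": 0.3, "organisational": 0.1}


def weighted_score(ratings):
    """Combine 1-5 ordinal ratings into a single weighted score."""
    total = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return round(total, 2)  # rounding hides floating-point noise


# Risk A is uniformly moderate; Risk B has the maximum technical rating.
risk_a = {"financial": 3, "technical": 3, "organisational": 3}
risk_b = {"financial": 2, "technical": 5, "organisational": 3}

print(weighted_score(risk_a), weighted_score(risk_b))
```

Both profiles receive the same score, so a decision based on the score alone cannot distinguish a uniformly moderate risk from one with a severe technical component.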
Hubbard makes the point that, although both the above methods are endorsed by many standards and methodologies (including those used in project management), they should be used with caution because they are flawed. To quote from his book:
Together these ordinal/scoring methods are the benchmark for the analysis of risks and/or decisions in at least some component of most large organizations. Thousands of people have been certified in methods based in part on computing risk scores like this. The major management consulting firms have influenced virtually all of these standards. Since what these standards all have in common is the use of various scoring schemes instead of actual quantitative risk analysis methods, I will call them collectively the “scoring methods.” And all of them, without exception, are borderline or worthless. In practice, they may make many decisions far worse than they would have been using merely unaided judgements.
What is the basis for this claim? Hubbard points to the following:
- Scoring methods do not make any allowance for flawed perceptions of analysts who assign scores – i.e. they do not consider the effect of cognitive bias. I won’t dwell on this as I have previously written about the effect of cognitive biases in project risk management – see this post and this one, for example.
- Qualitative descriptions assigned to each score are understood differently by different people. Further, there is rarely any objective guidance as to how an analyst is to distinguish between a high or medium risk. Such advice may not even help: research by Budescu, Broomell and Po shows that there can be huge variances in understanding of qualitative descriptions, even when people are given specific guidelines on what the descriptions or terms mean.
- Scoring methods add their own errors. Below are brief descriptions of some of these:
- In his paper on the risk matrix theorem, Cox mentions that “Typical risk matrices can correctly and unambiguously compare only a small fraction (e.g., less than 10%) of randomly selected pairs of hazards. They can assign identical ratings to quantitatively very different risks.” He calls this behaviour “range compression” – and it applies to any scoring technique that uses ranges.
- Assigned scores tend to cluster in the middle-to-upper part of the range. Analysis by Hubbard shows that, on a 5 point scale, 75% of all responses are 3 or 4. This implies that changing a score from 3 to 4 or vice-versa can have a disproportionate effect on classification of risks.
- Scores implicitly assume that the magnitude of the quantity being assessed is directly proportional to the scale. For example, a score of 2 implies that the criterion being measured is twice as large as it would be for a score of 1. However, in reality, criteria are rarely linear as implied by such a scale.
- Scoring techniques often presume that the factors being scored are independent of each other – i.e. there are no correlations between factors. This assumption is rarely tested or justified in any way.
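The range compression problem mentioned above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: a 3×3 matrix with equal-width bands on a normalised 0–1 scale, a rating rule based on the sum of the band indices, and three made-up hazards. It is not Cox’s construction, merely one simple matrix that exhibits the behaviour he describes.

```python
from bisect import bisect_right

# Hypothetical band edges: below 1/3 is low, below 2/3 is medium, else high.
BAND_EDGES = [1 / 3, 2 / 3]


def band(value):
    """Return the band index (0 = low, 1 = medium, 2 = high) for a 0-1 value."""
    return bisect_right(BAND_EDGES, value)


def matrix_rating(probability, impact):
    """Rate a hazard from its cell in an assumed 3x3 risk matrix."""
    total = band(probability) + band(impact)
    if total <= 1:
        return "low"
    if total == 2:
        return "medium"
    return "high"


def expected_loss(probability, impact):
    """Quantitative risk: probability times (normalised) impact."""
    return probability * impact


# Made-up hazards as (probability, impact) pairs.
hazard_a = (0.34, 0.34)
hazard_b = (0.65, 0.65)
hazard_c = (0.68, 0.34)

for name, hazard in [("A", hazard_a), ("B", hazard_b), ("C", hazard_c)]:
    print(name, matrix_rating(*hazard), round(expected_loss(*hazard), 4))
```

In this sketch hazards A and B share a rating even though B’s expected loss is more than three times that of A, and hazard C is rated above B despite having a smaller expected loss.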
Many project management standards advocate the use of scoring techniques. To be fair, in many situations they are adequate as long as they are used with an understanding of their limitations. Seen in this light, Hubbard’s book is an admonition to standards and textbook writers to be more critical of the methods they advocate, and a warning to practitioners that an uncritical adherence to standards and best practices is not the best way to manage project risks.
Scoring done right
Just to be clear, Hubbard’s criticism is directed against scoring methods that use arbitrary, qualitative scales which are not justified by independent analysis. There are other techniques which, though superficially similar to these flawed scoring methods, are actually quite robust because they are:
- Based on observations.
- Based on real measures (as opposed to arbitrary ones – such as “alignment with business objectives” on a scale of 1 to 5, without defining what “alignment” means).
- Validated after the fact (and hence refined with use).
As an example of a sound scoring technique, Hubbard quotes this paper by Dawes, which presents evidence that linear scoring models are superior to intuition in clinical judgements. Strangely, although the weights themselves can be obtained through intuition, the scoring model outperforms clinical intuition. This happens because human intuition is good at identifying important factors, but not so hot at evaluating the net effect of several, possibly competing factors. Hence simple linear scoring models can outperform intuition. The key here is that the models are validated by checking the predictions against reality.
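The following Python sketch illustrates the general idea of a simple linear scoring model that is checked against outcomes. It is not taken from Dawes’ paper: the factors, the data for five past projects, and the choice of unit weights (with signs set by intuition) are all hypothetical, and the helper names are my own. The point is the final step, where the scores are compared with what actually happened.

```python
from statistics import mean, pstdev


def standardise(values):
    """Convert raw values to z-scores so that factors are comparable."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    norm_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (norm_x * norm_y)


# Hypothetical history of five completed projects (made-up numbers).
team_experience_years = [1, 3, 5, 7, 9]
requirement_changes = [9, 6, 7, 3, 2]
schedule_overrun_pct = [40, 25, 30, 10, 5]  # observed outcomes

# Intuition supplies only the signs: experience lowers risk, changes raise it.
z_experience = standardise(team_experience_years)
z_changes = standardise(requirement_changes)
risk_scores = [-e + c for e, c in zip(z_experience, z_changes)]

# Validation step: do the scores track what actually happened?
validation_r = pearson(risk_scores, schedule_overrun_pct)
print(round(validation_r, 2))
```

A model of this kind earns trust only to the extent that its scores keep tracking observed outcomes on projects that were not used to build it.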
Another class of techniques uses axioms based on logic to reduce inconsistencies in decisions. An example of such a technique is multi-attribute utility theory. Since they are based on logic, these methods can also be considered to have a solid foundation, unlike those discussed in the previous section.
Conclusions
Many commonly used scoring methods in risk analysis are based on flaky theoretical foundations – or worse, none at all. To compound the problem, they are often used without any validation. A particularly ubiquitous example is the well-known and loved risk matrix. In his paper on risk matrices, Tony Cox shows how risk matrices can sometimes lead to decisions that are worse than those made on the basis of a coin toss. The fact that this is a possibility – even if only a small one – should worry anyone who uses risk matrices (or other flawed scoring techniques) without an understanding of their limitations.
Cognitive biases as project meta-risks – part 2
Introduction
Risk management is fundamentally about making decisions in the face of uncertainty. These decisions are based on perceptions of future events, supplemented by analyses of data relating to those events. As such, these decisions are subject to cognitive biases - human tendencies to base judgements on flawed perceptions of events and/or data. In an earlier post, I argued that cognitive biases are meta-risks, i.e. risks of risk analysis. An awareness of how these biases operate can pave the way towards reducing their effects on risk-related decisions. In this post I therefore look into the nature of cognitive biases. In particular:
- The role of intuition and rational thought in the expression of cognitive biases.
- The psychological process of attribute substitution which underlies judgement-related cognitive biases.
I then take a brief look at ways in which the effect of bias in decision-making can be reduced.
The role of intuition and rational thought in the expression of cognitive biases
Research in psychology has established that human cognition works through two distinct processes: System 1 which corresponds to intuitive thought and System 2 which corresponds to rational thought. In his Nobel Prize lecture, Daniel Kahneman had this to say about the two systems:
The operations of System 1 are fast, automatic, effortless, associative, and often emotionally charged; they are also governed by habit, and are therefore difficult to control or modify. The operations of System 2 are slower, serial, effortful, and deliberately controlled; they are also relatively flexible and potentially rule-governed.
The surprise is that judgements always involve System 2 processes. In Kahneman’s words:
…the perceptual system and the intuitive operations of System 1 generate impressions of the attributes of objects of perception and thought. These impressions are not voluntary and need not be verbally explicit. In contrast, judgments are always explicit and intentional, whether or not they are overtly expressed. Thus, System 2 is involved in all judgments, whether they originate in impressions or in deliberate reasoning.
So, all judgements, whether intuitive or rational, are monitored by System 2. Kahneman suggests that this monitoring can be very cursory thus allowing System 1 impressions to be expressed directly, whether they are right or not. Seen in this light, cognitive biases are unedited (or at best lightly edited) expressions of often incorrect impressions.
Attribute substitution: a common mechanism for judgement-related biases
In a paper entitled Representativeness Revisited, Kahneman and Frederick suggest that the psychological process of attribute substitution is the mechanism that underlies many cognitive biases. Attribute substitution is the tendency of people to answer a difficult decision-making question by interpreting it as a simpler (but related) one. In their paper, Kahneman and Frederick describe attribute substitution as occurring when:
…an individual assesses a specified target attribute of a judgment object by substituting a related heuristic attribute that comes more readily to mind…
An example might help decode this somewhat academic description. I pick one from Kahneman’s Edge master class where he related the following:
When I was living in Canada, we asked people how much money they would be willing to pay to clean lakes from acid rain in the Halliburton region of Ontario, which is a small region of Ontario. We asked other people how much they would be willing to pay to clean lakes in all of Ontario.
People are willing to pay the same amount for the two quantities because they are paying to participate in the activity of cleaning a lake, or of cleaning lakes. How many lakes there are to clean is not their problem. This is a mechanism I think people should be familiar with. The idea that when you’re asked a question, you don’t answer that question, you answer another question that comes more readily to mind. That question is typically simpler; it’s associated, it’s not random; and then you map the answer to that other question onto whatever scale there is—it could be a scale of centimeters, or it could be a scale of pain, or it could be a scale of dollars, but you can recognize what is going on by looking at the variation in these variables. I could give you a lot of examples because one of the major tricks of the trade is understanding this attribute substitution business. How people answer questions.
Attribute substitution boils down to making judgements based on specific, known instances of events or issues under consideration. For example, people often overrate their own abilities because they base their self-assessments on specific instances where they did well, ignoring situations in which their performance was below par. Taking another example from the Edge class,
COMMENT: So for example in the Save the Children—types of programs, they focus you on the individual.
KAHNEMAN: Absolutely. There is even research showing that when you show pictures of ten children, it is less effective than when you show the picture of a single child. When you describe their stories, the single instance is more emotional than the several instances and it translates into the size of contributions. People are almost completely insensitive to amount in system one. Once you involve system two and systematic thinking, then they’ll act differently. But emotionally we are geared to respond to images and to instances…
Kahneman sums it up in a line in his Nobel lecture: The essence of attribute substitution is that respondents offer a reasonable answer to a question that they have not been asked.
Several decision-making biases in risk analysis operate via attribute substitution - some of these include availability, representativeness, overconfidence and selective perception (see this post for specific examples drawn from high-profile failed projects). Armed with this understanding of how these meta-risks operate, let’s look at how their effect can be minimised.
System two to the rescue, but…
The discussion of the previous section suggests that people often base judgements on specific instances that come to mind, ignoring the range of all possible instances. They do this because specific instances – usually concrete instances that have been experienced – come to mind more easily than the abstract “universe of possibilities.”
Those who make erroneous judgements will correct them only if they become aware of factors that they did not take into account when making the judgement, or when they realise that their conclusions are not logical. This can only happen through deliberation: rational analysis, which is possible only through a deliberate invocation of System 2 thinking.
Some of the ways in which System 2 can be helped along are:
- By reframing the question or issue in terms that force analysts to consider the range of possible instances rather than specific instances. A common manifestation of the latter is when risk managers base their plans on the assumption that average conditions will occur – an assumption that Professor Sam Savage calls the flaw of averages (see Dr. Savage’s very entertaining and informative book for more on the flaw of averages and related statistical fallacies).
- By requiring analysts to come up with pros and cons for any decision they make. This forces them to consider possibilities they may not have taken into account when making the original decision.
- By basing decisions on relevant empirical or historical data instead of relying on intuitive impressions.
- By making the analysts aware of their propensity to be overconfident (or under-confident) by evaluating their probability calibration. One way to do this is by asking them to answer a series of trivia questions with confidence estimates for each of their answers (i.e. their self-estimated probability of being right). Their confidence estimates are then compared to the fraction of questions correctly answered. A well calibrated individual’s confidence estimates should be close to the percentage of correct answers. There is some evidence to suggest that analysts can be trained to improve their calibration through cycles of testing and feedback. Calibration training is discussed in Douglas Hubbard’s book, The Failure of Risk Management. However, as discussed here, improved calibration through feedback and repeated tests may not carry over to judgements in real-life situations.
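The calibration check described in the last point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the quiz results are hypothetical, and each answer is recorded as a pair of stated confidence and whether the answer was right.

```python
from collections import defaultdict


def calibration_table(answers):
    """Map each stated confidence level to the fraction answered correctly."""
    buckets = defaultdict(list)
    for confidence, correct in answers:
        buckets[confidence].append(correct)
    return {c: sum(hits) / len(hits) for c, hits in sorted(buckets.items())}


def overconfidence_gap(answers):
    """Mean stated confidence minus overall hit rate (positive = overconfident)."""
    mean_confidence = sum(c for c, _ in answers) / len(answers)
    hit_rate = sum(1 for _, correct in answers if correct) / len(answers)
    return mean_confidence - hit_rate


# Hypothetical quiz results: (stated confidence, was the answer correct?)
answers = (
    [(0.9, True)] * 6 + [(0.9, False)] * 4    # "90% sure" but right 6 times in 10
    + [(0.6, True)] * 3 + [(0.6, False)] * 2  # "60% sure" and right 3 times in 5
)

print(calibration_table(answers))
print(round(overconfidence_gap(answers), 2))
```

For a well calibrated analyst the hit rate in each row would sit close to the stated confidence and the gap would be near zero; in this made-up data the "90% sure" answers are the ones that reveal overconfidence.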
Each of the above options forces analysts to consider instances other than the ones that readily come to mind. That said, they aren’t a sure cure for the problem: System 2 thinking does not guarantee correctness. Kahneman discusses several reasons why this is so. First, it has been found that education and training in decision-related disciplines (like statistics) does not eliminate incorrect intuitions; it only reduces them in favourable circumstances (such as when the question is reframed to make statistical cues obvious). Second, he notes that System 2 thinking is easily derailed: research has shown that the efficiency of System 2 is impaired by time pressure and multi-tasking. (Managers who put their teams under time and multi-tasking pressures should take note!) Third, highly accessible values, which form the basis for initial intuitive judgements, serve as anchors for subsequent System 2-based corrections. These corrections are generally insufficient – i.e. too small. And finally, System 2 thinking is of no use if it is based on incorrect assumptions: as a colleague once said, “Logic doesn’t get you anywhere if your premise is wrong.”
Conclusion
Cognitive biases are meta-risks that are responsible for many incorrect judgements in project (or any other) risk analysis. An apposite example is the financial crisis of 2008, which can be traced back to several biases such as groupthink, selective perception and over-optimism (among many others). An understanding of how these meta-risks operate suggests ways in which their effects can be reduced, though not eliminated altogether. In the end, the message is simple and obvious: for judgements that matter, there’s no substitute for due diligence - careful observation and thought, seasoned with an awareness of one’s own fallibility.
Cognitive biases as project meta-risks
Introduction and background
A comment by John Rusk on this post got me thinking about the effects of cognitive biases on the perception and analysis of project risks. A cognitive bias is a human tendency to base a judgement or decision on a flawed perception or understanding of data or events. A recent paper suggests that cognitive biases may have played a role in some high profile project failures. The author of the paper, Barry Shore, contends that the failures were caused by poor decisions which could be traced back to specific biases. A direct implication is that cognitive biases can have a significant negative effect on how project risks are perceived and acted upon. If true, this has consequences for the practice of risk management in projects (and other areas, for that matter). This essay discusses the role of cognitive biases in risk analysis, with a focus on project environments.
Following the pioneering work of Daniel Kahneman and Amos Tversky, there has been a lot of applied research on the role of cognitive biases in various areas of social sciences (see Kahneman’s Nobel Prize lecture for a very readable account of his work on cognitive biases). A lot of this research highlights the fallibility of intuitive decision making. But even judgements ostensibly based on data are subject to cognitive biases. An example of this is when data is misinterpreted to suit the decision-maker’s preconceptions (the so-called confirmation bias). Project risk management is largely about making decisions regarding uncertain events that might impact a project. It involves, among other things, estimating the likelihood of these events occurring and the resulting impact on the project. These estimates and the decisions based on them can be erroneous for a host of reasons. Cognitive biases are an often overlooked, yet universal, cause of error.
Cognitive biases as project meta-risks
So, what role do cognitive biases play in project risk analysis? Many researchers have considered specific cognitive biases as project risks: for example, in this paper, Flyvbjerg describes how the risks posed by optimism bias can be addressed using reference class forecasting (see my post on improving project forecasts for more on this). However, as suggested in the introduction, one can go further. The first point to note is that biases are part and parcel of the mental make up of humans, so any aspect of risk management that involves human judgment is subject to bias. As such, then, cognitive biases may be thought of as meta-risks: risks that affect risk analyses. Second, because they are a part of the mental baggage of all humans, overcoming them involves an understanding of the thought processes that govern decision-making, rather than externally-directed analyses (as in the case of risks). The analyst has to understand how his or her perception of risks may be affected by these meta-risks.
The publicly available research and professional literature on meta-risks in business and organisational contexts is sparse. One relevant reference is a paper by Jack Gray on meta-risks in financial portfolio management. The first few lines of the paper state,
“Meta-risks are qualitative, implicit risks that pass beyond the scope of explicit risks. Most are born out the complex interaction between the behaviour pattern of individuals and those of organizational structures” (italics mine).
Although he doesn’t use the phrase, Gray seems to be referring to cognitive biases – at least in part. This is confirmed by a reading of the paper. It describes, among other things, hubris (which roughly corresponds to the illusion of control) and discounting evidence that conflicts with one’s views (which corresponds to confirmation bias) as meta-risks. From this (admittedly small) sampling of the literature, it seems that the notion of cognitive biases as meta-risks has some precedent.
Next, let’s look at how biases can manifest themselves as meta-risks in a project environment. To keep the discussion manageable, I’ll focus on a small set of biases:
Anchoring: This refers to the tendency of humans to rely on a single piece of information when making a decision. I have seen this manifest itself in task duration estimation – where “estimates plucked out of thin air” by management serve as an anchor for subsequent estimation by the project team. See this post for more on anchoring in project situations. Anchoring is a meta-risk because the over-reliance on a single piece of information about a risk can have an adverse effect on decisions relating to that risk.
Availability: This refers to the tendency of people to base decisions on information that can be easily recalled, neglecting potentially more important information. As an example, a project manager might give undue weight to his or her most recent professional experiences when analysing project risks. Here availability is a meta-risk because it is a barrier to an objective consideration of risks that are not immediately apparent to the analyst.
Representativeness: This refers to the tendency to make judgements based on seemingly representative, known samples. For example, a project team member might base a task estimate on another (seemingly) similar task, ignoring important differences between the two. Another manifestation of representativeness is when probabilities of events are estimated based on those of comparable, known events. An example of this is the gambler’s fallacy. This is clearly a meta-risk, especially where “expert judgement” is used as a technique to assess risk (Why? Because such judgements are invariably based on comparable tasks that the expert has encountered before.).
Selective perception: This refers to the tendency of individuals to give undue importance to data that supports their own views. Selective perception is a bias that we’re all subject to; we hear what we want to hear, see what we choose to see, and remain deaf and blind to the rest. This is a meta-risk because it results in a skewed (or incomplete) perception of risks.
Loss Aversion: This refers to the tendency of people to give preference to avoiding losses (even small losses) over making gains. In risk analysis this might manifest itself as overcautiousness. Loss aversion is a meta-risk because it might, for instance, result in the assignment of an unreasonably large probability of occurrence to a risk.
A particularly common manifestation of loss aversion in project environments is the sunk cost bias. In situations where significant investments have been made in projects, risk analysts might be biased towards downplaying risks.
Information bias: This is the tendency of some analysts to seek as much data as they can lay their hands on prior to making a decision. The danger here is of being swamped by too much irrelevant information. Data by itself does not improve the quality of decisions (see this post by Tim van Gelder for more on the dangers of data-centrism). Over-reliance on data – especially when there is no way to determine the quality and relevance of data as is often the case – can hinder risk analyses. Information bias is a meta-risk for two reasons already alluded to above; first, the data may not capture important qualitative factors and second, the data may not be relevant to the actual risk.
I could work my way through a few more of the biases listed here, but I think I’ve already made my point: projects encompass a spectrum of organisational and technical situations, so just about any cognitive bias is a potential meta-risk.
Conclusion
Cognitive biases are meta-risks because they can affect decisions pertaining to risks – i.e. they are risks of risk analysis. Shore’s research suggests that the risks posed by these meta-risks are very real; they can cause project failure. So, at a practical level, project managers need to understand how cognitive biases could affect their own risk-related judgements (or any other judgements for that matter). The previous section provides illustrations of how selected cognitive biases can affect risk analyses; there are, of course, many more. Listing examples is illustrative, and helps make the point that cognitive biases are meta-risks. However, it is more useful and interesting to understand how biases operate and what we can do to overcome them. As I have mentioned above, overcoming biases requires an understanding of the thought processes through which humans make decisions in the face of uncertainty. Of particular interest is the role of intuition and rational thought in forming judgements, and the common mechanisms that underlie judgement-related cognitive biases. A knowledge and awareness of these mechanisms might help project managers in consciously countering the operation of cognitive biases in their own decision making. I’m currently making some notes on these topics, with the intent of publishing them in a forthcoming essay – please stay tuned.
The role of cognitive biases in project failure
Introduction
There are two distinct views of project management practice: the rational view which focuses on management tools and techniques such as those espoused by frameworks and methodologies, and the social/behavioural view which looks at the social aspect of projects – i.e. how people behave and interact in the context of a project and the wider organisation. The difference between the two is significant: one looks at how projects should be managed, it prescribes tools, techniques and practices; the other at what actually happens on projects, how people interact and how managers make decisions. The gap between the two can sometimes spell the difference between project success and failure. In many failed projects, the failure can be traced back to poor decisions, and the decisions themselves to cognitive biases: i.e. errors in judgement based on perceptions. A paper entitled, Systematic Biases and Culture in Project Failure, by Barry Shore looks at the role played by selected cognitive biases in the failure of some high profile projects. The paper also draws some general conclusions on the relationship between organisational culture and cognitive bias. This post presents a summary and review of the paper.
The paper begins with a brief discussion of the difference between the rational and social/behavioural views of project management. The rational view is prescriptive – it describes management procedures and techniques which claim to increase the chances of success if followed. Further, it emphasises causal effects (if you follow X procedure then Y happens). The social/behavioural view is less well developed because it looks at human behaviour which is hard to study in controlled conditions, let alone in projects. Yet, developments in behavioural economics – mostly based on the pioneering work of Kahneman and Tversky – can be directly applied to project management (see my post on biases in project estimation, for instance). In the paper, Shore looks at eight case studies of failed projects and attempts to attribute their failure to selected cognitive biases. He also looks into the relationship between (project and organisational) culture and the prevalence of the selected biases. Following Hofstede, he defines organisational culture as shared perceptions of organisational work practices and, analogously, project culture as shared perceptions of project work practices. Since projects take place within organisations, project culture is obviously influenced by the organisational culture.
Scope and Methodology
In this section I present a brief discussion of the biases that the paper focuses on and the study methodology.
There are a large number of cognitive biases in the literature. The author selects the following for his study:
Available data: Restricting oneself to using data that is readily or conveniently available. Note that “Available data” is a non-standard term: it is normally referred to as a sampling bias, which in turn is a type of selection bias.
Conservatism (Semmelweis reflex): Failing to consider new information or negative feedback.
Escalation of commitment: Allocating additional resources to a project that is unlikely to succeed.
Groupthink: Members of a project group under pressure to think alike, ignoring evidence that may threaten their views.
Illusion of control: Management believing they have more control over a situation than an objective evaluation would suggest.
Overconfidence: Having a level of confidence that is unsupported by evidence or performance.
Recency (serial position effect): Undue emphasis being placed on most recent data (ignoring older data).
Selective perception: Viewing a situation subjectively; perceiving only certain (convenient) aspects of a situation.
Sunk cost: Not accepting that costs already incurred cannot be recovered and should not be considered as criteria for future decisions. This bias is closely related to loss aversion.
The author acknowledges that there is a significant overlap between some of these effects: for example, illusion of control has much in common with overconfidence. This implies a certain degree of subjectivity in assigning these as causes for project failures.
The failed projects studied in the paper are high profile efforts that failed in one or more ways. The author obtained data for the projects from public and government sources. He then presented the data and case studies to five independent groups of business professionals (constituted from a class he was teaching) and asked them to reach a consensus on which biases could have played a role in causing the failures. The groups presented their results to the entire class, then through discussions, reached agreement on which of the biases may have led to the failures.
The case studies
This section describes the failed projects studied and the biases that the group identified as being relevant.
Airbus 380: Airbus was founded as a consortium of independent aerospace companies. The A380 project, which was started in 2000, was aimed at creating the A380 superjumbo jet with a capacity of 800 passengers. The project involved coordination between many sites. Six years into the project, when the aircraft was being assembled in Toulouse, it was found that a wiring harness produced in Hamburg failed to fit the airframe.
The group identified the following biases as being relevant to the failure of the Airbus project:
Selective perception: Managers acted to guard their own interests and constituencies.
Groupthink: Each participating organisation worked in isolation from the others, creating an environment in which groupthink would thrive.
Illusion of control: Corporate management assumed they had control over participating organisations.
Availability bias: Management in each of the facilities did not have access to data in other facilities, and thus made decisions based on limited data.
Coast Guard Maritime Domain Awareness Project: This project, initiated in 2001, was aimed at creating the maritime equivalent of an air traffic control system. It was to use a range of technologies, and involved coordination between many US government agencies. The goal of the first phase of the project was to create a surveillance system that would be able to track boats as small as jet skis. The surveillance data was to be run through a software system that would flag potential threats. In 2006 – during the testing phase – the surveillance system failed to meet quality criteria. Further, the analysis software was not ready for testing.
The group identified the following biases as being relevant to the failure of the Maritime Awareness project:
Illusion of control: Coordinating several federal agencies is a complex task. This suggests that project managers may have thought they had more control than they actually did.
Selective perception: Separate agencies worked only on their portions of the project, failing to see the larger picture. This suggests that project groups may have unwittingly been victims of selective perception.
Columbia Shuttle: The Columbia Shuttle disaster was caused by a piece of foam insulation breaking off the propellant tank and damaging the wing. The problem with the foam sections was known, but management had assumed that it posed no risk.
In their analysis, the group found the following biases to be relevant to the failure of this project:
Conservatism: Management failed to take into account negative data.
Overconfidence: Management was confident there were no safety issues.
Recency: Although foam insulation had broken off on previous flights, it had not caused any problems.
Denver Airport Baggage Handling System: The Denver airport project, which was scheduled for completion in 1993, was to feature a completely automated baggage handling system. The technical challenges were enormous because the proposed system was an order of magnitude more complex than those that existed at the time. The system was completed in 1995, but was riddled with problems. After almost a decade of struggling to fix the problems, not to mention being billions over-budget, the project was abandoned in 2005.
The group identified the following biases as playing a role in the failure of this project:
Overconfidence: Although the project was technically very ambitious, the contractor (BAE systems) assumed that all technical obstacles could be overcome within the project timeframes.
Sunk cost: The customers (United Airlines) did not pull out of the project even when other customers pulled out, suggesting that they were reluctant to write off already incurred costs.
Illusion of control: Despite evidence to the contrary, management assumed that problems could be solved and that the project remained under control.
Mars Climate Orbiter and Mars Polar Lander: Telemetry signals from the Mars Climate Orbiter ceased when the spacecraft approached its destination. The root cause of the problem was found to be a failure to convert between metric and British units: apparently the contractor, Lockheed, had used British units in the engine design but NASA scientists who were responsible for operations and flight assumed the data was in metric units. A few months after the Climate Orbiter disaster, another spacecraft, the Mars Polar Lander, fell silent just short of landing on the surface of Mars. The failure was attributed to a software problem that caused the engines to shut down prematurely, thereby causing the spacecraft to crash.
The group attributed the above project failures to the following biases:
Conservatism: Project engineers failed to take action when they noticed that the spacecraft was off-trajectory early in the flight.
Sunk cost: Managers were under pressure to launch the spacecraft on time – waiting until the next launch window would have entailed a wait of many months thus “wasting” the effort up to that point. (Note: In my opinion this is an incorrect interpretation of sunk cost)
Selective perception: The spacecraft modules were constructed by several different teams. It is very likely that teams worked with a very limited view of the project (one which was relevant to their module).
Merck Vioxx: Vioxx was a very successful anti-inflammatory medication developed and marketed by Merck. An article published in 2000 suggested that Merck misrepresented clinical trial data, and another paper published in 2001 suggested that those who took Vioxx were subject to a significantly increased risk of assorted cardiac events. Under pressure, Merck put a warning label on the product in 2002. Finally, the drug was withdrawn from the market in 2004 after over 80 million people had taken it.
The group found the following biases to be relevant to the failure of this project:
Conservatism: The company ignored early warning signs about the toxicity of the drug.
Sunk cost: By the time concerns were raised, the company had already spent a large amount of money in developing the drug. It is therefore likely that there was a reluctance to write off the costs incurred to that point.
Microsoft Xbox 360: The Microsoft Xbox console was released to market in 2005, a year before comparable offerings from its competitors. The product was plagued with problems from the start; some of them include: internet connectivity issues, damage caused to game disks, faulty power cords and assorted operational issues. The volume of problems and complaints prompted Microsoft to extend the product warranty from one to three years at an expected cost of $1 billion.
The group thought that the following biases were significant in this case:
Conservatism: Despite the early negative feedback (complaints and product returns), the development group seemed unwilling to acknowledge that there were problems with the product.
Groupthink: It is possible that the project team ignored data that threatened their views on the product. The group reached this conclusion because Microsoft seemed reluctant to comment publicly on the causes of problems.
Sunk cost: By the time problems were identified, Microsoft had invested a considerable sum of money on product development. This suggests that the sunk cost trap may have played a role in this project failure.
NYC Police Communications System: (Note: I couldn’t find any pertinent links to this project). In brief: the project was aimed at developing a communications system that would enable officers working in the subway system to communicate with those on the streets. The project was initiated in 1999 and scheduled for completion in 2004 with a budgeted cost of $115 million. A potential interference problem was identified in 2001 but the contractors ignored it. The project was completed in 2007, but during trials it became apparent that interference was indeed a problem. Fixing the issue was expected to increase the cost by $95 million.
The group thought that the following biases may have contributed to the failure of this project:
Conservatism: Project managers failed to take early data on interference into account.
Illusion of control: The project team believed – until very late in the project – that the interference issue could be fixed.
Overconfidence: Project managers believed that the design was sound, despite evidence to the contrary.
Analysis and discussion
The following four biases appeared more often than others: Conservatism, illusion of control, selective perception and sunk cost.
The following biases appeared less often: groupthink and overconfidence.
Recency and availability were mentioned only once.
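As a rough cross-check, the attributions can be tallied directly from the case study summaries above. The sketch below is in Python; the dictionary is transcribed from this post rather than from the paper's own data, and the names `CASES` and `tally_biases` are purely illustrative.

```python
from collections import Counter

# Biases attributed to each case, transcribed from the summaries in this post
# (an assumption: the paper itself may record them differently).
CASES = {
    "Airbus 380": ["selective perception", "groupthink",
                   "illusion of control", "availability"],
    "Coast Guard MDA": ["illusion of control", "selective perception"],
    "Columbia Shuttle": ["conservatism", "overconfidence", "recency"],
    "Denver Baggage System": ["overconfidence", "sunk cost",
                              "illusion of control"],
    "Mars Orbiter and Lander": ["conservatism", "sunk cost",
                                "selective perception"],
    "Merck Vioxx": ["conservatism", "sunk cost"],
    "Microsoft Xbox 360": ["conservatism", "groupthink", "sunk cost"],
    "NYC Police Communications": ["conservatism", "illusion of control",
                                  "overconfidence"],
}


def tally_biases(cases):
    """Count the number of case studies each bias was attributed to."""
    return Counter(bias for biases in cases.values() for bias in biases)


if __name__ == "__main__":
    # Print biases from most to least frequently cited.
    for bias, count in tally_biases(CASES).most_common():
        print(f"{bias}: {count}")
```

On the attributions as summarised here, conservatism is cited in five cases, illusion of control and sunk cost in four each, selective perception and overconfidence in three each, groupthink in two, and recency and availability in one each. With only eight cases, differences of one or two mentions carry little weight.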
Based on the small data sample and the somewhat informal means of analysis, the author concludes that the first four biases may be dominant in project management. In my opinion this conclusion is shaky because the study has a few shortcomings, which I list below:
- The sample size is small.
- The sample covers a range of domains.
- No checks were done to verify the group members’ understanding of all the biases.
- The data on which the conclusions are based is incomplete – based only on publicly available data (perhaps this is an example of availability bias at work?).
- A limited set of biases is used – there could be other biases at work.
- The conclusions themselves are subject to group-level biases such as groupthink. This is a particular concern because the group was specifically instructed to look at the case studies through the lens of the selected cognitive biases.
- The analysis is far from exhaustive or objective; it was done as part of a classroom exercise.
For the above reasons, the analysis is at best suggestive: it indicates that biases may play a role in the decisions that lead to project failures.
The author also draws a link between organisational culture and environments in which biases might thrive. To do this, he maps the biases on to the competing values framework of organisational culture, which views organisations along two dimensions:
- The focus of the organisation – internal or external.
- The level of management control in the organisation – controlling (stable) or discretionary (flexible).
According to the author, all nine biases are more likely in a stability (or control) focused environment than a flexible one, and all barring sunk cost are more likely to thrive in an internally focused organisation than an externally focused one. This conclusion makes sense: project teams are more likely to avoid biases when empowered to make decisions, free from management and organisational pressures. Furthermore, biases are also less likely to play a role when external input – such as customer feedback – is taken seriously.
That said, the negative effects of internally focused, high control organisations can be countered. The author quotes two examples:
- When designing the 777 aircraft, Boeing introduced a new approach to project management wherein teams were required to include representatives from all groups of stakeholders. The team was encouraged to air differences in opinion and to deal with these in an open manner. This approach has been partly credited for the success of the 777 project.
- Since the Vioxx debacle, Merck rewards research scientists who terminate projects that do not look promising.
Conclusions
Despite my misgivings about the research sample and methodology, the study does suggest that standard project management practices could benefit by incorporating insights from behavioural studies. Further, the analysis indicates that cognitive biases may have indeed played a role in the failure of some high profile projects. My biggest concern here, as stated earlier, is that the groups were required to associate the decisions with specific biases – i.e. there was an assumption that one or more of the biases from the (arbitrarily chosen) list was responsible for the failure. In reality, however, there may have been other more important factors at work.
The connections with organisational culture are interesting too, but hardly surprising: people are more likely to do the right thing when management empowers them with responsibility and authority.
In closing: I found the paper interesting because it deals with an area that isn’t very well represented in the project management literature. Further, I believe these biases play a significant role in project decision making, especially in internally focussed / controlled organisations (project managers are human, and hence not immune…). However, although the paper supports this view, it doesn’t make a wholly convincing case for it.
Measuring the unmeasurable: a note on the pitfalls of performance metrics
Many organisations measure performance – of people, projects, processes or whatever – using quantitative metrics, or KPIs as they are often called. Some examples of these include: calls answered / hour (for a person working in a contact centre); % complete (for a project task) and orders processed / hour (for an order handling process). The rationale for measuring performance quantitatively is rooted in Taylorism or scientific management. The early successes of Taylorism in improving efficiencies on the shopfloor led to its adoption in other areas of management. The scientific approach to management underlies the assumption that metrics are a Good Thing, echoing the words of the 19th century master physicist, Lord Kelvin:
When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge of it is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced it to the stage of science.
This is a fine sentiment for science: precise measurement is a keystone of physics and other natural sciences. So much so, that some scientists spend a large part of their professional lives refining and perfecting certain measurements. However, it can be misleading and sometimes downright counterproductive to attempt such quantification in management. This post explains why I think so.
Firstly, there are basically two categories of things (indicators, characteristics or whatever) that management attempts to quantify when defining performance metrics – tangible (such as number of calls per unit time) and intangible (for example, employee performance on a five point scale). Although people attach numerical scores to both kinds of things, I’m sure most people would agree that any quantification of employee performance is way more subjective than number of calls per unit time. Now, it is possible to reduce this subjectivity by associating the intangible characteristic with a tangible one – for example, employee performance can be tied to sales (for a sales rep), number of projects successfully completed (for a project manager) or customer satisfaction as measured by surveys (for a customer service representative). However, all such attempts result in a limited view of the characteristic being measured. Such associated tangible metrics cannot measure all aspects of the intangible metric in question. In the case at hand – employee performance – factors such as enthusiasm, motivation, doing things beyond the call of duty etc., all of which are important aspects of employee performance, remain unmeasurable. So as a first point we have the following: attaching a numerical score to intangible quantities is fraught with subjectivity and ambiguity.
But even measures of tangible characteristics can have issues. An example that comes to mind is the infamous % complete metric for tasks in project management. Many project managers record progress by noting that a task – say data migration – is 70% complete. But what does this figure mean? Does it mean that 70% of the data has been migrated (and what does that mean anyway?), or is it that 70% of the total effort required (as measured against days allocated to the task) has been expended? Most often, the figure quoted has no explanation as to what it means – and everyone interprets it in a way that best suits their agenda. My point here is: a well designed metric should include an unambiguous statement as to what is being measured, how it is to be measured and how it is to be interpreted. Many seemingly well defined metrics do not satisfy this criterion – the % complete metric being a sterling example. These give the illusion of precision, which can be more harmful than having no measurement at all. My second point is thus summarised as follows: it is hard to design unambiguous metrics, even for tangible performance characteristics. Of course, speaking of the % complete metric, many project managers now understand its shortcomings and use an “all or nothing” approach – a task is either 0% complete (not started or in progress) or 100% complete (truly complete).
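To make the ambiguity concrete, here is a minimal Python sketch that computes % complete for one hypothetical data migration task in three ways: by volume of data migrated, by effort expended, and by the all-or-nothing rule. The `Task` fields, the function names and the figures are all assumptions made up for illustration; they do not come from any particular project management tool.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A hypothetical project task with just enough data for three readings."""
    name: str
    units_total: int       # e.g. records to be migrated
    units_done: int        # records migrated so far
    days_allocated: float  # effort budgeted for the task
    days_spent: float      # effort expended so far
    accepted: bool         # True only once the task is truly complete


def pct_by_volume(task: Task) -> float:
    """% complete read as the share of work items finished."""
    return 100 * task.units_done / task.units_total


def pct_by_effort(task: Task) -> float:
    """% complete read as the share of allocated effort used (capped at 100)."""
    return min(100.0, 100 * task.days_spent / task.days_allocated)


def pct_all_or_nothing(task: Task) -> float:
    """% complete under the all-or-nothing rule: 0 until truly complete."""
    return 100.0 if task.accepted else 0.0


if __name__ == "__main__":
    migration = Task("data migration", units_total=1000, units_done=700,
                     days_allocated=10, days_spent=9, accepted=False)
    print(pct_by_volume(migration))       # 70.0
    print(pct_by_effort(migration))       # 90.0
    print(pct_all_or_nothing(migration))  # 0.0
```

The same task is 70%, 90% or 0% complete depending on the definition, which is why the definition needs to be stated alongside the number.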
Another danger of quantification of performance is highlighted by Eliyahu Goldratt in his book The Haystack Syndrome. To quote from the book:
…Tell me how you measure me and I will tell you how I will behave. If you measure me in an illogical way…do not complain about illogical behaviour…
A case in point is the customer contact centre employee who is measured by calls handled per hour. The employee knows he has to maximise calls taken, so he ends up trying to keep conversations short – even if it means upsetting customers. By trying to improve call throughput, the company ends up reducing quality of service. Fortunately, some service companies are beginning to understand this – read about Repco’s experience in this article from MIS Australia, for example. The take-home point here is: performance measurements that focus on the wrong metric have the potential to distort employee behaviour to the detriment of the organisation.
Finally, metrics that rely on human judgements are subject to cognitive bias. Specifically, it is well known that biases such as anchoring and framing can play a big role in determining the response received to a question such as, “How would you rate X’s performance on a scale of 1 to 5 (best performance being 5)?” In earlier posts, I’ve written about the role of cognitive biases in project task estimation and project management research. The effect of these biases on performance metrics can be summarised as follows: since many performance metrics rely on subjective judgements made by humans, these metrics are subject to cognitive biases. It is difficult, if not impossible, to correct for these biases.
To conclude: it is difficult to design performance metrics that are unambiguous, unbiased and do not distort behaviour. Use them if you must – or are required to do so by your organisation – but design and interpret them with care because, if used unthinkingly, they can cause terminal damage to employee morale.