Archive for July 2009
In earlier posts I’ve described a notation called IBIS (Issue-based information system), and demonstrated its utility in visualising reasoning and resolving complex issues through dialogue mapping. The IBIS notation consists of just three elements (issues, ideas and arguments) that can be connected in a small number of ways. Yet, despite these limitations, IBIS has been found to enhance creativity when used in collaborative design discussions. Given the simplicity of the notation and grammar, this claim is surprising, even paradoxical. The present post resolves this paradox by viewing collaborative knowledge creation as an art, and considers the aesthetic competencies required to facilitate this art.
In a position paper entitled, The paradox of the “practice level” in collaborative design rationale, Al Selvin draws an analogy between design discussions using Compendium (an open source IBIS-based argument mapping tool) and art. He uses the example of the artist Piet Mondrian, highlighting the difference in style between Mondrian’s earlier and later work. To quote from the paper,
Whenever I think of surfacing design rationale as an intentional activity — something that people engaged in some effort decide to do (or have to do), I think of Piet Mondrian’s approach to painting in his later years. During this time, he departed from the naturalistic and impressionist (and more derivative, less original) work of his youth (view an image here) and produced the highly abstract geometric paintings (view an image here) most associated with his name…
Selvin points out that the difference between the first and the second paintings is essentially one of abstraction: the first one is almost instantly recognisable as a depiction of dunes on a beach whereas the second one, from Mondrian’s minimalist period, needs some effort to understand and appreciate, as it uses a very small number of elements to create a specific ambience. To quote from the paper again,
“One might think (as many in his day did) that he was betraying beauty, nature, and emotion by going in such an abstract direction. But for Mondrian it was the opposite. Each of his paintings in this vein was a fresh attempt to go as far as he could in the depiction of cosmic tensions and balances. Each mattered to him in a deeply personal way. Each was a unique foray into a depth of expression where nothing was given and everything had to be struggled for to bring into being without collapsing into imbalance and irrelevance. The depictions and the act of depicting were inseparable. We get to look at the seemingly effortless result, but there are storms behind the polished surfaces. Bringing about these perfected abstractions required emotion, expression, struggle, inspiration, failure and recovery — in short, creativity…”
In analogy, Selvin contends that a group of people who work through design issues using a minimalist notation such as IBIS can generate creative new ideas. In other words: IBIS, when used in a group setting such as dialogue mapping, can become a vehicle for collaborative creativity. The effectiveness of the tool, though, depends on those who wield it:
“…To my mind using tools and methods with groups is a matter of how effective, artistic, creative, etc. whoever is applying and organizing the approach can be with the situation, constraints, and people. Done effectively, even the force-fitting of rationale surfacing into a “free-flowing” design discussion can unleash creativity and imagination in the people engaged in the effort, getting people to “think different” and look at their situation through a different set of lenses. Done ineffectively, it can impede or smother creativity as so many normal methods, interventions, and attitudes do…”
Although Selvin’s discussion is framed in the context of design discussions using Compendium, this is but dialogue mapping by another name. So, in essence, he makes a case for viewing the collaborative generation of knowledge (through dialogue mapping or any other means) as an art. In fact, in another article, Selvin uses the term knowledge art to describe both the process and the product of creating knowledge as discussed above. Knowledge Art, as he sees it, is a marriage of the two forms of discourse that make up the term. On the one hand, we have knowledge which, “… in an organizational setting, can be thought of as what is needed to perform work; the tacit and explicit concepts, relationships, and rules that allow us to know how to do what we do.” On the other, we have art which “… is concerned with heightened expression, metaphor, crafting, emotion, nuance, creativity, meaning, purpose, beauty, rhythm, timbre, tone, immediacy, and connection.”
Facilitating collaborative knowledge creation
In the business world, there’s never enough time to deliberate or think through ideas (either individually or collectively): everything is done in a hurry and the result is never as good as it should or could be; the picture is never quite complete. However, as Selvin says,
“…each moment (spent discussing or thinking through ideas or designs) can yield a bit of the picture, if there is a way to capture the bits and relate them, piece them together over time. That capturing and piecing is the domain of Knowledge Art. Knowledge Art requires a spectrum of skills, regardless of how it’s practiced or what form it takes. It means listening and paying attention, determining the style and level of intervention, authenticity, engagement, providing conceptual frameworks and structures, improvisation, representational skill and fluidity, and skill in working with electronic information…”
So, knowledge art requires a wide range of technical and non-technical skills. In previous posts I’ve discussed some of the technical skills required – fluency with IBIS, for example. Let’s now look at some of the non-technical competencies.
What are the competencies needed for collaborative knowledge creation? Palus and Horth offer some suggestions in their paper entitled Leading Complexity: The Art of Making Sense. They define the concept of creative leadership as making shared sense out of complexity and chaos and the crafting of meaningful action. Creative leadership is akin to dialogue mapping, which Jeff Conklin describes as a means to achieve a shared understanding of wicked problems and a shared commitment to solving them. The connection between creative leadership and dialogue mapping is apparent once one notices the similarity between their definitions. So the competencies of creative leadership should apply to dialogue mapping (or collaborative knowledge creation) as well.
Palus and Horth describe six basic competencies of creative leadership. I outline these below, mentioning their relevance to dialogue mapping:
Paying Attention: This refers to the ability to slow down discourse with the aim of achieving a deep understanding of the issues at hand. A skilled dialogue mapper has to be able to listen; to pay attention to what’s being said.
Personalizing: This refers to the ability to draw upon personal experiences, interests and passions whilst engaged in work. Although the connection to dialogue mapping isn’t immediately evident, the point Palus and Horth make is that the ability to make connections between work and one’s interests and passions helps increase involvement, enthusiasm and motivation in tackling work challenges.
Imaging: This refers to the ability to visualise problems so as to understand them better, using metaphors, pictures, stories etc. to stimulate imagination, intuition and understanding. The connection to dialogue mapping is clear and needs no elaboration.
Serious play: This refers to the ability to experiment with new ideas; to learn by trying and doing in a non-threatening environment. This is something that software developers do when learning new technologies. A group engaged in dialogue mapping must have a sense of play; of trying out new ideas, even if they seem somewhat unusual.
Collaborative enquiry: This refers to the ability to sustain productive dialogue in a diverse group of stakeholders. Again, the connection to dialogue mapping is evident.
Crafting: This refers to the ability to synthesise issues, ideas, arguments and actions into coherent, meaningful wholes. Yet again, the connection to dialogue mapping is clear – the end product is ideally a shared understanding of the problem and a shared commitment to a meaningful solution.
Palus and Horth suggest that these competencies have been ignored in the business world because:
- They are seen as threatening the status quo (creativity is to be feared because it invariably leads to change).
- These competencies are aesthetic, and the current emphasis on scientific management devalues competencies that are not rational or analytical.
The irony is that creative scientists have these aesthetic competencies (or qualities) in spades. At the most fundamental level science is an art – it is about constructing theories or designing experiments that make sense of the world. Where do the ideas for these new theories or experiments come from? Well, they certainly aren’t out there in the objective world; they come from the imagination of the scientist. Science, in the real sense of the word, is knowledge art. If these competencies are useful in science, they should be more than good enough for the business world.
To sum up: knowledge creation in an organisational context is best viewed as an art – a collaborative art. Visual representations such as IBIS provide a medium to capture snippets of knowledge and relate them, or piece them together over time. They provide the canvas, brush and paint to express knowledge as art through the process of dialogue mapping.
Corporate developers spend the majority of their programming time doing maintenance work. My basis for this claim is two years’ worth of statistics that I have been gathering at my workplace. According to these figures, my group spends about 65 percent of their programming time on maintenance (with some developers spending considerably more, depending on the applications they support). I suspect these numbers are applicable to most corporate IT shops – and possibly, to a somewhat smaller extent, to software houses as well. Unfortunately, maintenance work is often looked upon as being “inferior to” development. This being the case, it is worth dispelling some myths about maintenance programming. As it happens, I’ve just finished reading Robert Glass’s wonderful book, Facts and Fallacies of Software Engineering, in which he presents some interesting facts about software maintenance (among lots of other interesting facts). This post looks at these facts which, I think, some readers may find surprising.
Let’s get right to it. Fact 41 in the book reads:
Maintenance typically consumes 40 to 80 percent (average 60 percent) of software costs. Therefore, it is probably the most important life cycle phase of software.
Surprised? Wait, there’s more: Fact 42 reads:
Enhancement is responsible for roughly 60 percent of software maintenance costs. Error correction is roughly 17 percent. Therefore software maintenance is largely about adding new capability to old software, not fixing it.
As a corollary to Fact 42, Glass unveils Fact 43, which simply states that:
Maintenance is a solution, not a problem.
Developers who haven’t done any maintenance work may be surprised by these facts. Most corporate IT developers have put in considerable maintenance time; so no one in my mob was surprised when I mentioned these facts during a coffee break conversation. Based on the number quoted in the first paragraph (65 percent maintenance) and Glass’s figure (60 percent of maintenance is modification work), my colleagues spend close to 40 percent of their time enhancing existing applications. All of them reckon this number is about right, and their thinking is supported by my data.
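The back-of-the-envelope arithmetic behind that 40 percent figure can be checked in a couple of lines (a sketch using only the percentages quoted above):

```python
# Fraction of programming time spent on maintenance
# (from the workplace statistics quoted above).
maintenance_share = 0.65

# Fraction of maintenance effort that is enhancement work (Glass's Fact 42).
enhancement_share_of_maintenance = 0.60

# Fraction of total programming time spent enhancing existing applications.
enhancement_share = maintenance_share * enhancement_share_of_maintenance
print(f"{enhancement_share:.0%}")  # → 39%
```

That is, roughly 39 percent of total programming time, which rounds to the "close to 40 percent" quoted above.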
A few weeks ago, I wrote a piece entitled the legacy of legacy software in which I pointed out that legacy code is a problem for historians and programmers alike. Both have to understand legacy code, albeit in different ways. The historian needs to understand how it developed over the years so that he can understand its history; why it is the way it is and what made it so. The programmer has a more pragmatic interest – she needs to understand how it works so that she can modify it. Now, Glass’ Fact 42 tells us that much of maintenance work is adding new functionality. New functionality implies new code, or at least substantial modifications of existing code. Software is therefore a palimpsest – written once, and then overwritten again and again.
The maintenance programmer whose job it is to modify legacy code has to first understand it. Like a historian or archaeologist decoding a palimpsest, she has to sort through layers of modifications made by different people at different times for different reasons. The task is often made harder by the fact that modifications are often under-documented (if not undocumented). In Fact 44 of the book, Glass states that this effort of understanding code – an effort that he calls undesign – makes up about 30 percent of the total time spent in maintenance. It is therefore the most significant maintenance activity.
But that’s not all. After completing “undesign” the maintenance programmer has to design the enhancement within the context of the existing code – design under constraints, so to speak. There are at least a couple of reasons why this is hard. First, as Brooks tells us in No Silver Bullet, design itself is hard work; it is one of the essential difficulties of software engineering. Second, the original design is created with a specific understanding of requirements. By the time modifications come around, the requirements may have changed substantially. These new requirements may conflict with the original design. If so, the maintenance task becomes that much harder.
Ideally, existing design documentation should ease the burden on the maintenance programmer. However it rarely does because such documentation is typically created in the design phase – and rarely modified to reflect design changes as the product is built. As a consequence, most design documentation is hopelessly out of date by the time the original product is released into production. To quote from the book:
Common sense would tell you that the design documentation, produced as the product is being built, would be an important basis for those undesign tasks. But common sense, in this case, would be wrong. As the product is built, the as-built program veers more and more away from the original design specifications. Ongoing maintenance drives the specs and product even further apart. The fact of the matter is, design documentation is almost completely untrustworthy when it comes to maintaining a software product. The result is, almost all of that undesign work involves reading of code (which is invariably up to date) and ignoring the documentation (which commonly is not).
So, one of the main reasons maintenance work is hard is that the programmer has to expend considerable effort in decoding someone else’s code (some might argue that this is the most time consuming part of undesign). Programmers know that it is hard to infer what a program does by reading it, so the word “code” in the previous sentence could well be used in the sense of code as an obfuscated or encrypted message. As Charles Simonyi said in response to an Edge question:
Programmers using today’s paradigm start from a problem statement, for example that a Boeing 767 requires a pilot, a copilot, and seven cabin crew with various certification requirements for each—and combine this with their knowledge of computer science and software engineering—that is how this rule can be encoded in computer language and turned into an algorithm. This act of combining is the programming process, the result of which is called the source code. Now, programming is well known to be a difficult-to-invert function, perhaps not to cryptography’s standards, but one can joke about the possibility of the airline being able to keep their proprietary scheduling rules secret by publishing the source code for the implementation since no one could figure out what the rules were—or really whether the code had to do with scheduling or spare parts inventory—by studying the source code, it can be that obscure.
Glass offers up one final maintenance-related fact in his book (Fact 45):
Better software engineering leads to more maintenance, not less.
Huh? How’s that possible?
The answer is actually implicit in the previous facts and Simonyi’s observation: in the absence of documentation, the ease with which modifications can be made is directly related to the ease with which the code can be understood. Well designed systems are easier to understand, and hence can be modified more quickly. So, in a given time interval, a well designed system will have more modifications done to it than one that is not so well designed. Glass mentions that this is an interesting manifestation of Fact 43: Maintenance as a solution, rather than a problem.
Towards the end of the book, Glass presents the following fallacy regarding maintenance:
The way to predict future maintenance costs and to make product replacement decisions is to look at past cost data.
The reason that prediction based on past data doesn’t work is that a plot of maintenance costs versus time has a bathtub shape. Initially, when a product is just released, there is considerable maintenance work (error fixing and enhancements) done on it. This decreases with time, until it plateaus out. This is the “stable” region, corresponding to the period when the product is being used with relatively few modifications or error fixes. Finally, towards the end of the product’s useful life, enhancements and error fixes become more expensive as technology moves on and/or the product begins to push the limits of its design. At this point costs increase again, often quite steeply. The point Glass makes is that, in general, one does not know where the product is on this bathtub curve. Hence, using past data to make predictions is fraught with risk – especially if one is near an inflection point, where the shape of the curve is changing. So what’s the solution? Glass suggests asking customers about their expectations regarding the future of the product, rather than trying to extrapolate from past data.
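One way to see why extrapolation fails is to model the bathtub curve as a toy function and extrapolate linearly from the flat region. The curve below is entirely hypothetical (invented for illustration, with made-up units); the point is only the shape:

```python
def maintenance_cost(year):
    """Toy bathtub curve: high burn-in costs, a flat middle,
    and steeply rising end-of-life costs (hypothetical units)."""
    if year <= 3:            # burn-in: early error fixes and enhancements
        return 100 - 25 * year
    elif year <= 8:          # stable plateau: few modifications
        return 25
    else:                    # wear-out: costs climb steeply
        return 25 + 40 * (year - 8)

# Linear extrapolation from the plateau (years 7 and 8) predicts
# flat costs forever, but the actual year-10 cost is far higher.
slope = maintenance_cost(8) - maintenance_cost(7)     # 0
predicted_year_10 = maintenance_cost(8) + 2 * slope   # 25
actual_year_10 = maintenance_cost(10)                 # 105
print(predicted_year_10, actual_year_10)
```

A forecaster sitting at year 8 with only past data would predict a cost of 25 for year 10 and be off by a factor of four, which is precisely Glass’s point about not knowing where on the curve you are.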
Finally, Glass has this to say about replacing software:
Most companies find that retiring an existing software product is nearly impossible. To build a replacement requires a source of the requirements that match the current version of the product, and those requirements probably don’t exist anywhere. They’re not in the documentation because it wasn’t kept up to date. They’re not to be found from the original customers or users or developers because those folks are long gone…They may be discernable from reverse engineering the existing product, but that’s an error-prone and undesirable task that hardly anyone wants to tackle. To paraphrase an old saying, “Old software never dies, it just tends to fade away.”
And it’s the maintenance programmer who extends its life, often way beyond original design and intent. So, maintenance matters because it adds complexity to the legacy of legacy software. But above all it matters because it is a solution, not a problem.
Over the last few months I’ve written a number of posts on IBIS (short for Issue Based Information System), an argument visualisation technique invented in the early 1970s by Horst Rittel and Werner Kunz. IBIS is best known for its use in dialogue mapping – a collaborative approach to tackling wicked problems – but it has a range of other applications as well (capturing project knowledge is a good example). All my prior posts on IBIS focused on its use in specific applications. Hence the present piece, in which I discuss the “what” and “whence” of IBIS: its practical aspects – notation, grammar etc. – along with its origins, advantages and limitations.
I’ll begin with a brief introduction to the technique (in its present form) and then move on to its origins and other aspects.
A brief introduction to IBIS
IBIS consists of three main elements:
- Issues (or questions): these are questions or problems that need to be addressed.
- Positions (or ideas): these are responses to questions. Typically the set of ideas that respond to an issue represents the spectrum of perspectives on the issue.
- Arguments: these can be Pros (arguments supporting) or Cons (arguments against) an idea. The complete set of arguments that respond to an idea represents the multiplicity of viewpoints on it.
The best IBIS mapping tool is Compendium – it can be downloaded here. In Compendium, the IBIS elements described above are represented as nodes as shown in Figure 1: issues are represented by green question nodes; positions by yellow light bulbs; pros by green + signs and cons by red – signs. Compendium supports a few other node types, but these are not part of the core IBIS notation. Nodes can be linked only in ways specified by the IBIS grammar as I discuss next.
The IBIS grammar can be summarized in a few simple rules:
- Issues can be raised anew or can arise from other issues, positions or arguments. In other words, any IBIS element can be questioned. In Compendium notation: a question node can connect to any other IBIS node.
- Ideas can only respond to questions – i.e. in Compendium “light bulb” nodes can only link to question nodes. The arrow pointing from the idea to the question depicts the “responds to” relationship.
- Arguments can only be associated with ideas – i.e. in Compendium + and – nodes can only link to “light bulb” nodes (with arrows pointing to the latter).
The legal links are summarized in Figure 2 below.
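The grammar amounts to a small set of legal link types, so it can be expressed as a link-validation rule. Here is a minimal sketch; the node-type names are my own shorthand, not Compendium’s, and the table simply restates the three rules above:

```python
# Legal IBIS links: source node type -> allowed target node types.
# Issues can question any element; ideas respond only to issues;
# arguments (pros and cons) attach only to ideas.
LEGAL_LINKS = {
    "issue": {"issue", "idea", "pro", "con"},
    "idea": {"issue"},
    "pro": {"idea"},
    "con": {"idea"},
}

def link_is_legal(source_type, target_type):
    """Return True if an arrow from source to target obeys the IBIS grammar."""
    return target_type in LEGAL_LINKS.get(source_type, set())

print(link_is_legal("idea", "issue"))  # an idea responding to an issue: True
print(link_is_legal("pro", "issue"))   # an argument attached to an issue: False
```

A mapping tool enforcing the grammar would consult a table like this before allowing a node to be connected; in Compendium itself this checking is built in.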
The rules are best illustrated by example – follow the links below to see some illustrations of IBIS in action:
- See this post for a simple example of dialogue mapping.
- See this post or this one for examples of argument visualisation.
- See this post for the use of IBIS in capturing project knowledge.
Now that we know how IBIS works and have seen a few examples of it in action, it’s time to trace its history from its origins to the present day.
A good place to start is where it all started. IBIS was first described in a paper entitled Issues as Elements of Information Systems, written by Horst Rittel (who coined the term “wicked problem”) and Werner Kunz in July 1970. They state the intent behind IBIS in the very first line of the abstract of their paper:
Issue-Based Information Systems (IBIS) are meant to support coordination and planning of political decision processes. IBIS guides the identification, structuring, and settling of issues raised by problem-solving groups, and provides information pertinent to the discourse.
Rittel’s preoccupation was the area of public policy and planning – which is also the context in which he defined wicked problems originally. He defined the term in his landmark paper of 1973 entitled, Dilemmas in a General Theory of Planning. A footnote to the paper states that it is based on an article that he presented at an AAAS meeting in 1969. So it is clear that he had already formulated his ideas on wickedness when he wrote his paper on IBIS in 1970.
Given the above background it is no surprise that Rittel and Kunz foresaw IBIS to be the:
…type of information system meant to support the work of cooperatives like governmental or administrative agencies or committees, planning groups, etc., that are confronted with a problem complex in order to arrive at a plan for decision…
The problems tackled by such cooperatives are paradigm-defining examples of wicked problems. From the start, then, IBIS was intended as a tool to facilitate a collaborative approach to solving such problems.
Operation of early systems
When Rittel and Kunz wrote their paper, there were three IBIS-type systems in operation: two in governmental agencies (in the US, one presumes) and one in a university environment (possibly Berkeley, where Rittel worked). Although it seems quaint and old-fashioned now, it is no surprise that they were all manual, paper-based systems: the effort and expense involved in computerizing such systems in the early 70s would have been prohibitive, and the pay-off questionable.
The paper also offers a short description of how these early IBIS systems operated:
An initially unstructured problem area or topic denotes the task named by a “trigger phrase” (“Urban Renewal in Baltimore,” “The War,” “Tax Reform”). About this topic and its subtopics a discourse develops. Issues are brought up and disputed because different positions (Rittel’s word for ideas or responses) are assumed. Arguments are constructed in defense of or against the different positions until the issue is settled by convincing the opponents or decided by a formal decision procedure. Frequently questions of fact are directed to experts or fed into a documentation system. Answers obtained can be questioned and turned into issues. Through this counterplay of questioning and arguing, the participants form and exert their judgments incessantly, developing more structured pictures of the problem and its solutions. It is not possible to separate “understanding the problem” as a phase from “information” or “solution” since every formulation of the problem is also a statement about a potential solution.
Even today, forty years later, this is an excellent description of how IBIS is used to facilitate a common understanding of complex (or wicked) problems. The paper contains an overview of the structure and operation of manual IBIS-type systems. However, I’ll omit these because they are of little relevance in the present-day world.
As an aside, there’s a term that’s conspicuous by its absence in the Rittel-Kunz paper: design rationale. Rittel must have been aware of the utility of IBIS in capturing design rationale: he was a professor of design science at Berkeley and design reasoning was one of his main interests. So it is somewhat odd that he does not mention this term even once in his IBIS paper.
Fast forward a couple decades (and more!)
In a paper published in 1988 entitled, gIBIS: A hypertext tool for exploratory policy discussion, Conklin and Begeman describe a prototype of a graphical, hypertext-based IBIS-type system (called gIBIS) and its use in capturing design rationale (yes, despite the title of the paper, it is more about capturing design rationale than policy discussions). The development of gIBIS represents a key step between the original Rittel-Kunz version of IBIS and its present-day version as implemented in Compendium. Amongst other things, IBIS was finally off paper and on to disk, opening up a new world of possibilities.
gIBIS aimed to offer users:
- The ability to capture design rationale – the options discussed (including the ones rejected) and the discussion around the pros and cons of each.
- A platform for promoting computer-mediated collaborative design work – ideally in situations where participants were located at sites remote from each other.
- The ability to store a large amount of information and to be able to navigate through it in an intuitive way.
Before moving on, one point needs to be emphasized: gIBIS was intended to be used in collaborative settings; to help groups achieve a shared understanding of central issues, by mapping out dialogues in real time. In present-day terms – one could say that it was intended as a tool for sense making.
The gIBIS prototype proved successful enough to catalyse the development of Questmap, a commercially available software tool that supported IBIS. However, although there were some notable early successes in the real-time use of IBIS in industry environments (see this paper, for example), these were not accompanied by widespread adoption of the technique. Other graphical, IBIS-like methods to capture design rationale were proposed (an example is Questions, Options and Criteria (QOC), proposed by MacLean et al. in 1991), but these too met with a general reluctance in adoption.
Making sense through IBIS
The reasons for the lack of traction of IBIS-type techniques in industry are discussed in an excellent paper by Shum et al. entitled Hypermedia Support for Argumentation-Based Rationale: 15 Years on from gIBIS and QOC. The reasons they give are:
- For acceptance, any system must offer immediate value to the person who is using it. Quoting from the paper, “No designer can be expected to altruistically enter quality design rationale solely for the possible benefit of a possibly unknown person at an unknown point in the future for an unknown task. There must be immediate value.” Such immediate value is not obvious to novice users of IBIS-type systems.
- There is some effort involved in gaining fluency in the use of IBIS-based software tools. It is only after this that users can gain an appreciation of the value of such tools in overcoming the limitations of mapping design arguments on paper, whiteboards etc.
The intellectual effort – or cognitive overhead, as it is called in academese – in using IBIS in real time involves:
- Teasing out issues, ideas and arguments from the dialogue.
- Classifying points raised into issues, ideas and arguments.
- Naming (or describing) the point succinctly.
- Relating (or linking) the point to an existing node.
This is a fair bit of work, so it is no surprise that beginners might find it hard to use IBIS to map dialogues. However, once learnt, a skilled practitioner can add value to design (and more generally, sense making) discussions in several ways including:
- Keeping the map (and discussion) coherent and focused on pertinent issues.
- Ensuring that all participants are engaged in contributing to the map (and hence the discussion).
- Facilitating useful maps (and dialogues) – usefulness being measured by the extent to which the objectives of the session are achieved.
See this paper by Selvin and Shum for more on these criteria. Incidentally, these criteria are a qualitative measure of how well a group achieves a shared understanding of the problem under discussion. Clearly, there is a good deal of effort involved in learning and becoming proficient at using IBIS-type systems, but the payoff is an ability to facilitate a shared understanding of wicked problems – whether in public planning or in technical design.
Why IBIS is better than conventional modes of documentation
IBIS has several advantages over conventional documentation systems. Rittel and Kunz’s 1970 paper contains a nice summary of the advantages, which I paraphrase below:
- IBIS can bridge the gap between discussions and records of discussions (minutes, audio/video transcriptions etc.). IBIS sits between the two, acting as a short-term memory. The paper thus foreshadows the use of issue-based systems as an aid to organizational or project memory.
- Many elements (issues, ideas or arguments) that come up in a discussion have contextual meanings that are different from any pre-existing definitions. In discussions, contextual meaning is more than formal meaning. IBIS captures the former in a very clear way – for example, a response to the question “What do we mean by X?” elicits the meaning of X in the context of the discussion, which is then captured as an idea (position).
- Related to the above, the commonality of an issue with other, similar issues might be more important than its precise meaning. To quote from the paper, “…the description of the subject matter in terms of librarians or documentalists (sic) may be less significant than the similarity of an issue with issues dealt with previously and the information used in their treatment…” With search technologies available, this is less of an issue now. However, search technologies are still limited in terms of finding matches between “similar” items (How is “similar” defined? Ans: it depends on context). A properly structured, context-searchable IBIS-based project archive may still be more useful than a conventional document archive based on a document management system.
- The reasoning used in discussions is made transparent, as is the supporting (or opposing) evidence. (see my post on visualizing argumentation for example)
- The state of the argument (discussion) at any time can be inferred at a glance (unlike the case in written records). See this post for more on the advantages of visual documentation over prose.
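To make the notation's small size concrete, here is a hypothetical sketch of an IBIS map as a typed graph. The node types and link vocabulary follow the standard IBIS grammar (ideas respond to issues; arguments support or object to ideas; issues can question any node), but the class and names below are my own invention, not Compendium's actual data model:

```python
from dataclasses import dataclass, field

NODE_TYPES = {"issue", "idea", "argument"}
# The small set of legal connections in the IBIS grammar.
LEGAL_LINKS = {
    ("idea", "responds-to", "issue"),
    ("argument", "supports", "idea"),
    ("argument", "objects-to", "idea"),
    ("issue", "questions", "issue"),
    ("issue", "questions", "idea"),
    ("issue", "questions", "argument"),
}

@dataclass
class IbisMap:
    nodes: dict = field(default_factory=dict)   # name -> node type
    links: list = field(default_factory=list)   # (source, link type, target)

    def add_node(self, name, ntype):
        assert ntype in NODE_TYPES, f"unknown node type: {ntype}"
        self.nodes[name] = ntype

    def link(self, src, ltype, dst):
        triple = (self.nodes[src], ltype, self.nodes[dst])
        assert triple in LEGAL_LINKS, f"the grammar forbids {triple}"
        self.links.append((src, ltype, dst))

m = IbisMap()
m.add_node("How should we document design discussions?", "issue")
m.add_node("Map them in IBIS", "idea")
m.add_node("Only three elements to learn", "argument")
m.link("Map them in IBIS", "responds-to", "How should we document design discussions?")
m.link("Only three elements to learn", "supports", "Map them in IBIS")
```

The point of the sketch is that the entire grammar fits in a six-line table; everything else in a map is content.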
Issues with issue-based information systems
Lest I leave readers with the impression that IBIS is a panacea, I should emphasise that it isn’t. According to Conklin, IBIS maps have the following limitations:
- They atomize streams of thought into unnaturally small chunks of information thereby breaking up any smooth rhetorical flow that creates larger, more meaningful chunks of narrative.
- They disperse rhetorically connected chunks throughout a large structure.
- They are not chronological in structure (the chronological sequence is normally factored out).
- Contributions are not attributed (who said what is normally factored out).
- They do not convey the maturity of the map – one cannot distinguish, from the map alone, whether one map is more “sound” than another.
- They do not offer a systematic way to decide if two questions are the same, or how the maps of two related questions relate.
Some of these issues (points 3, 4) can be addressed by annotating nodes; others are not so easy to solve.
My aim in this post has been to introduce readers to the IBIS notation, and also discuss its origins, development and limitations. On one hand, a knowledge of the origins and development is valuable because it gives insight into the rationale behind the technique, which leads to a better understanding of the different ways in which it can be used. On the other, it is also important to know a technique’s limitations, if for no other reason than to be aware of these so that one can work around them.
Before signing off, I’d like to mention an observation from my experience with IBIS. The real surprise for me has been that the technique can capture most written arguments and discussions, despite having only three distinct elements and a very simple grammar. Yes, it does require some thought to do this, particularly when mapping discussions in real time. However, this cognitive “overhead” is good because it forces the mapper to think about what’s being said instead of just writing it down blind. Thoughtful transcription is the aim of the game. When done right, this results in a map that truly reflects a shared understanding of the complex (and possibly wicked) problem under discussion.
There’s no better coda to this post on IBIS than the following quote from this paper by Conklin:
…Despite concerns over the years that IBIS is too simple and limited on the one hand or too hard to use on the other, there is a growing international community who are fluent enough in IBIS to facilitate and capture highly contentious debates using dialogue mapping, primarily in corporate and educational environments…
For me that’s reason enough to improve my understanding of IBIS and its applications, and to look for opportunities to use it in ever more challenging situations.
One of the standard ways of characterising risk on projects is to use matrices which categorise risks by impact and probability of occurrence. These matrices provide a qualitative risk ranking in categories such as high, medium and low (or colour: red, yellow and green). Such rankings are often used to prioritise and allocate resources to manage risks. There is a widespread belief that the qualitative ranking provided by matrices reflects an underlying quantitative ranking. In a paper entitled, What’s wrong with risk matrices?, Tony Cox shows that the qualitative risk ranking provided by a risk matrix will agree with the quantitative risk ranking only if the matrix is constructed according to certain general principles. This post is devoted to an exposition of these principles and their consequences.
Since the content of this post may seem overly academic to some of my readers, I think it is worth clarifying why I believe an understanding of Cox’s principles is important for project managers. First, 3×3 and 4×4 risk matrices are widely used in managing project risk. Typically these matrices are constructed in an intuitive (but arbitrary) manner. Cox shows – using very general assumptions – that there is only one sensible colouring scheme (or form) of these matrices. This conclusion was surprising to me, and I think that many readers may also find it so. Second, and possibly more important, is that the arguments presented in the paper show that it is impossible to maintain perfect congruence between qualitative (matrix) and quantitative rankings. As I discuss later, this is essentially due to the impossibility of representing quantitative rankings accurately on a rectangular grid. Developing an understanding of these points will enable project managers to use risk matrices in a more logically sound manner.
Background and preliminaries
Let’s begin with some terminology that’s well known to most project managers:
Probability: This is the likelihood that a risk will occur. It is quantified as a number between 0 (will definitely not occur) and 1 (will definitely occur).
Impact (termed “consequence” in the paper): This is the severity of the risk should it occur. It can also be quantified as a number between 0 (lowest severity) and 1 (highest severity).
Note that the above scales for probability and impact are arbitrary – other common choices are percentages or a scale of 0 to 10.
Risk: In many project risk management frameworks, risk is characterised by the formula: Risk = probability x impact. This formula looks reasonable, but is typically specified a priori, without any justification.
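As a quick illustration of the formula (a sketch; the function name and example numbers are mine), it treats risk as an expected-loss style score, so a frequent-but-mild risk can outrank a rare-but-severe one:

```python
def risk_score(probability: float, impact: float) -> float:
    """Risk = probability x impact, with both inputs on a 0-to-1 scale."""
    assert 0.0 <= probability <= 1.0 and 0.0 <= impact <= 1.0
    return probability * impact

print(risk_score(0.9, 0.2))  # frequent but mild: ~0.18
print(risk_score(0.1, 0.9))  # rare but severe: ~0.09
```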
A risk can be plotted on a two dimensional graph depicting impact (on the x-axis) and probability (on the y-axis). This is typically where the problems start: for most risks, neither the probability nor the impact can be accurately quantified. The standard solution is to use a qualitative scale, where instead of numbers one uses descriptive text – for example, the probability, impact and risk can take on one of three values: high, medium and low (as shown in Figure 1 below). In doing this, analysts make the implicit assumption that the categorisation provided by the qualitative assessment ranks the risks in correct quantitative order. Problem is, this isn’t true.
Let’s look at the simple case of two risks A and B ranked on a 2×2 risk matrix shown in Figure 2 below. Let’s assume that the probability and impact of each of the two risks are independent and uniformly distributed between 0 and 1. Clearly, if the two risks have the same qualitative ranking (high, say), there is no way to rank them correctly unless one has quantitative knowledge of probability and impact – which is usually not the case. In the absence of this information, there’s a 50% chance (all other factors being equal) of ranking them correctly – i.e. one is effectively “flipping a coin” to choose which one has the higher (or lower) rank. This situation highlights a shortcoming of risk matrices: poor resolution. It is not possible to rank risks that have the same qualitative ranking.
“That’s obvious,” I hear you say – and you’re right. But there’s more: if one of the ratings is medium and the other one is not (i.e. the other one is high or low), then there is a non-zero chance of making an incorrect ranking because some points in the cell with the higher qualitative rating have a lower quantitative value of risk than some points in the cell with the lower qualitative ranking. Look at that statement again: it implies that risk matrices can incorrectly assign higher qualitative rankings to quantitatively smaller risks – i.e. there is the possibility of making ranking errors. This point is seriously counter-intuitive (to me anyway) and merits a proof, which Cox provides and I discuss below. Before doing so, I should also point out that the discussion of this paragraph assumes that the probabilities and impacts of the two risks are independent and uniformly distributed. Cox points out that the chance of making the wrong ranking can be even higher if probability and impact are correlated. In particular, if the correlation is negative (i.e. probability decreases as impact increases), a random ranking is actually better than that provided by the risk matrix. In this situation the information provided by risk matrices is “worse than useless” (a random choice is better!). Negative correlations between probability and impact are actually quite common – many situations involve a mix of high-probability/low-impact and low-probability/high-impact risks. See the paper for more on this.
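Both effects – the coin-flip resolution within a cell and the non-zero chance of a ranking error across cells – are easy to check by simulation. The sketch below samples pairs of independent, uniformly distributed risks from the cells of a 2×2 matrix; the cell boundaries at 0.5 and the cell labels are my own illustrative choices:

```python
import random

random.seed(1)
N = 200_000
risk = lambda pt: pt[0] * pt[1]          # risk = impact * probability

def sample(x0, x1, y0, y1):
    """Uniform point (impact, probability) in the given cell."""
    return (random.uniform(x0, x1), random.uniform(y0, y1))

# Two risks landing in the same "high" cell: ranking them is a coin flip.
ties = sum(risk(sample(0.5, 1, 0.5, 1)) > risk(sample(0.5, 1, 0.5, 1))
           for _ in range(N)) / N

# A point in the "low" cell out-risking one in a "medium" cell: a ranking error.
errors = sum(risk(sample(0, 0.5, 0, 0.5)) > risk(sample(0.5, 1, 0, 0.5))
             for _ in range(N)) / N

print(f"first of two same-cell risks ranks higher {ties:.1%} of the time")
print(f"low-cell point out-risks medium-cell point {errors:.1%} of the time")
```

With these cell boundaries, the same-cell comparison comes out near 50%, and the cross-cell ranking error occurs a substantial fraction of the time – it is not a rare corner case.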
Weak consistency and its implications
With the issues of poor resolution and ranking errors established, Cox asks the question: What can be salvaged? The underlying problem is that the joint distribution of probability and impact is unknown. The standard approach to improving the utility of risk matrices is to attempt to characterise this distribution. This can be done using artificial intelligence tools – and Cox provides references to papers that use some of these techniques to characterise distributions. These techniques typically need plentiful data as they attempt to infer characteristics of the joint distribution from data points. Cox, instead, proposes an approach that is based on general properties of risk matrices – i.e. an approach that prescribes a set of rules that ensure consistency. This has the advantage of being general, and not depending on the availability of data points to characterise the probability distribution.
So what might a consistency criterion look like? Cox suggests that, at the very least, a risk matrix should be able to distinguish reliably between very high and very low risks. He formalises this requirement in his definition of weak consistency, which I quote from the paper:
A risk matrix with more than one “colour” (level of risk priority) for its cells satisfies weak consistency with a quantitative risk interpretation if points in its top risk category (red) represent higher quantitative risks than points in its bottom category (green)
The notion of weak consistency formalises the intuitive expectation that a risk matrix must, at the very least, distinguish between the lowest and highest (quantitative) risks. If it can’t, it is indeed “worse than useless”. Note that weak consistency doesn’t say anything about distinguishing between medium and lowest/highest risks – merely between the lowest and highest.
Having defined weak consistency, Cox derives some of its surprising consequences, which I describe next.
Cox’s First Lemma: If a risk matrix satisfies weak consistency, then no red cell (highest risk category) can share an edge with a green cell (lowest risk category).
Proof: To see how this is plausible, consider the different ways in which a red cell can adjoin a green one. Basically there are only two ways in which this can happen, which I’ve illustrated in Figure 3. Now assume that the quantitative risk of the midpoint of the common edge is a number n (n between 0 and 1). Then if x and y are the impact and probability, we have
xy = n, or equivalently, y = n/x
So, the locus of all points having the same risk (often called the iso-risk contour) as the midpoint is a rectangular hyperbola with negative slope (i.e. y decreases as x increases). The negative slope (see Figure 3) implies that the points above the iso-risk contour in the green cell have a higher quantitative risk than points below the contour in the red cell. This contradicts weak consistency. Hence – by reductio ad absurdum – it isn’t possible to have a green cell and a red cell with a common edge.
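Here is a concrete numerical check of the lemma's argument (the cell boundaries and sample points are my own illustrative choices): place a green and a red cell side by side sharing the edge x = 0.5, take the iso-risk contour through the edge's midpoint, and exhibit a green-cell point above the contour that out-risks a red-cell point below it:

```python
# Hypothetical adjacent cells sharing the vertical edge x = 0.5:
#   green: impact in [0, 0.5], probability in [0.5, 1]
#   red:   impact in [0.5, 1], probability in [0.5, 1]
risk = lambda pt: pt[0] * pt[1]        # risk = impact * probability

n = risk((0.5, 0.75))                  # contour through the edge midpoint: xy = 0.375
green_pt = (0.45, 0.9)                 # in the green cell, above the contour
red_pt = (0.55, 0.6)                   # in the red cell, below the contour

assert green_pt[1] > n / green_pt[0] and red_pt[1] < n / red_pt[0]
assert risk(green_pt) > risk(red_pt)   # a green point out-risks a red point
print(risk(green_pt), risk(red_pt))    # ~0.405 vs ~0.33
```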
Cox’s Second Lemma: If a risk matrix satisfies weak consistency and has at least two colours (green in lower left and red in upper right, if axes are oriented to depict increasing probability and impact), then no red cell can occur in the bottom row or left column of the matrix.
Proof: Assume it is possible to have a red cell in the bottom row or left column. Now consider an iso-risk contour for a sufficiently small risk (i.e. a contour that passes through the lower left-most green cell). By the properties of rectangular hyperbolas, this contour must pass through all cells in the bottom row and the left-most column, as shown in Figure 4. Thus, by an argument similar to the one used in the previous lemma, all points below the iso-risk contour in either of the red cells have a smaller quantitative risk than points above it in the green cell. This violates weak consistency, and hence the assumption is incorrect.
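The claim that a small enough contour passes through every bottom-row and left-column cell can also be checked numerically. The sketch below does this for a 3×3 grid with boundaries at thirds (my illustrative choice):

```python
b = [0, 1/3, 2/3, 1]                   # cell boundaries on both axes
eps = 0.05                             # a sufficiently small risk value

def contour_crosses(i, j):
    """Does the contour x*y = eps intersect cell (column i, row j)?"""
    # Over the cell's x-range, y = eps/x spans [eps/b[i+1], eps/b[i]];
    # the contour crosses iff that span overlaps the cell's y-range.
    y_lo = eps / b[i + 1]
    y_hi = eps / max(b[i], 1e-12)      # guard the x = 0 boundary
    return y_lo < b[j + 1] and y_hi > b[j]

edge_cells = [(i, 0) for i in range(3)] + [(0, j) for j in (1, 2)]
assert all(contour_crosses(i, j) for i, j in edge_cells)
```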
An implication that follows directly from the above lemmas is that any risk matrix that satisfies weak consistency must have at least three colours!
Surprised? I certainly was when I first read this.
Between-ness and its implications
If a risk matrix provides a qualitative representation of the actual quantitative risks, then small changes in the probability or impact should not cause discontinuous jumps in risk categorisation from lowest to highest category without going through the intermediate category. (Recall, from the previous section, that a weakly consistent matrix must have at least three colours).
This expectation is formalised in the axiom of between-ness:
A risk matrix satisfies the axiom of between-ness if every positively sloped line segment that lies in a green cell at its lower end and a red cell at its upper end must pass through at least one intermediate cell (i.e. one that is neither red nor green).
By definition, no 2×2 matrix can satisfy between-ness. Further, amongst 3×3 matrices, only one colour scheme satisfies both weak consistency and between-ness. This is the matrix shown in Figure 1: green in the leftmost column and bottom row, red in the upper right-most cell and yellow in all other cells. This, to me, is a truly amazing consequence of a couple of simple, intuitive axioms.
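The weak-consistency half of this claim can be verified directly. Since risk = xy increases in both variables, its minimum and maximum over a cell sit at the lower-left and upper-right corners, so it is enough to compare corners. A quick sketch (cell boundaries at thirds are my assumption):

```python
b = [0, 1/3, 2/3, 1]                  # cell boundaries on both axes

def risk_range(i, j):
    """Min and max of risk = x*y over cell (column i, row j)."""
    return b[i] * b[j], b[i + 1] * b[j + 1]

green = [(0, j) for j in range(3)] + [(i, 0) for i in (1, 2)]  # left column + bottom row
red = [(2, 2)]                                                 # top right cell

max_green = max(risk_range(i, j)[1] for i, j in green)         # 1/3
min_red = min(risk_range(i, j)[0] for i, j in red)             # 4/9
assert min_red > max_green   # every red point out-risks every green point
```

The margin is real but thin: the greenest red point carries risk 4/9 while the riskiest green point carries 1/3.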
Consistent colouring and its implications
The basic idea behind consistent colouring is that risks that have the identical quantitative values should have the same qualitative ratings. This is impossible to achieve in a discrete risk matrix because iso-risk contours cannot coincide with cell boundaries (Why? Because iso-risk contours have negative slopes whereas cell boundaries have zero or infinite slope – i.e. they are horizontal or vertical lines). So, Cox suggests the following: enforce consistent colouring for extreme categories only – red and green – allowing violations for intermediate categories. What this means is that cells that contain iso-risk contours which pass through other red cells (“red contours”) must be red and cells that contain iso-risk contours which pass through other green cells (“green contours”) must be green. Hence the following definition of consistent colouring:
- A cell is red if it contains points with quantitative risks at least as high as those in other red cells, and does not contain points with quantitative risks as small as those on any green cell.
- A cell is green if it contains points with risks at least as small as those in other green cells, and does not contain points with quantitative risks as high as those in any red cell.
- A cell has an intermediate colour only if it a) lies between a red cell and a green cell or b) it contains points with quantitative risks higher than those in some red cells and also points with quantitative risks lower than those in some green cells.
An iso-risk contour is green if it passes through one or more green cells but no red cells, and a red contour is one which passes through one or more red cells but no green cells. Consistent colouring then implies that cells with red contours and no green contours are red; and cells with green contours and no red contours are green (and, obviously, cells with contours of both colours are intermediate).
Implications of the three axioms – Cox’s Risk Matrix Theorem
So, after a longish journey, we have three axioms: weak consistency, between-ness and consistent colouring. With that done, Cox rolls out his theorem – which I dub Cox’s Risk Matrix Theorem (not to be confused with Cox’s Theorem from statistics!). It can be stated as follows:
In a risk matrix satisfying weak consistency, between-ness and consistent colouring:
a) All cells in the leftmost column and in the bottom row are green.
b) All cells in the second column from the left and the second row from the bottom are non-red.
The proof is a bit long, so I’ll omit it, making a couple of plausibility arguments instead:
- The lower leftmost cell is green (by definition), and consistent colouring implies that all contours that lie below the one passing through the upper right corner of this cell must also be green because a) they pass through the lower leftmost cell which is green and b) none of the other cells they pass through are red (by Cox’s second lemma). The other cells on the lowest or leftmost edge of the matrix can only be intermediate or green. That they cannot be intermediate is a consequence of between-ness.
- That the second row and second column must be non-red is also easy to see: assume one of these cells is red. Since the leftmost column and bottom row are green, we would then have a red cell sharing an edge with a green cell, which violates Cox’s first lemma.
I’ll leave it at that, referring the interested reader to the paper for a complete proof.
Cox’s theorem has an immediate corollary which is particularly interesting for project managers who use 3×3 and 4×4 risk matrices:
A tricoloured 3×3 or 4×4 matrix that satisfies weak consistency, between-ness and consistent colouring can have only the following (single!) colour scheme:
a) Leftmost column and bottom row coloured green.
b) Top right cell (for 3×3) or four top right cells (for 4×4) coloured red.
c) All other cells coloured yellow.
Proof: Cox’s theorem implies that the leftmost column and bottom row are green. The top right cell must be red (since it is a tricoloured matrix). Consistent colouring implies that the two cells adjoining this cell (in a 4×4 matrix) and the one diagonally adjacent must also be red (this cannot be so for a 3×3 matrix because these cells would adjoin a green cell which violates Cox’s first lemma). All other cells must be yellow by between-ness.
This result is quite amazing. From three very intuitive axioms Cox derives essentially the only possible colouring scheme for 3×3 and 4×4 risk matrices.
This brings me to the end of this post on Cox’s axiomatic approach to building logically consistent risk matrices. I highly recommend reading the original paper for more. Although it presents some fairly involved arguments, it is very well written. The arguments are presented with clarity and logical surefootedness, and the assumptions underlying each argument are clearly laid out. The three principles (or axioms) proposed are intuitively appealing – even obvious – but their consequences are quite unexpected (witness the unique colouring scheme for 3×3 and 4×4 matrices). Further, the arguments leading up to the lemmas and theorems bring up points that are worth bearing in mind when using risk matrices in practical situations.
In closing I should mention that the paper also discusses some other limitations of risk matrices that flow from these principles: in particular, spurious risk resolution and inappropriate resource allocation based on qualitative risk categorisation. For reasons of space, and the very high likelihood that I’ve already tested my readers’ patience to near (if not beyond) breaking point, I’ll defer a discussion of these to a future post.
Note added on 20 December, 2009:
See this post for a visual representation of the above discussion of Cox’s risk matrix theorem and the comments that follow.