Archive for March 2010
System design is a creative activity, but one that is subject to a variety of constraints. Many of these constraints are obvious: for example, when tasked with designing a new software product, a team might be asked to work within a budget or use a particular technology. These constraints place boundaries on the design activity; they force designers to work within parameters specified by the constraints. But there are other less obvious constraints too. In a paper entitled How Do Committees Invent?, published in 1968, Melvin Conway described a notion that is now called Conway’s Law: An organisation which designs a system will inevitably produce a design that mirrors the organisation’s communication structure. This post is a summary of the key points of the paper.
Conway begins the paper with the observation that system design is an activity that involves specifying how a system will be built using a number of diverse parts. Many elements of the act of design are similar, regardless of the nature of the system – be it software or a shopping mall. The objective of a design team or organisation is to produce a specification or blueprint based on which the system can be built.
Much of design work is about making choices. Conway points out that these choices may be more than design decisions:
Most design activity requires continually making choices. Many of these choices may be more than design decisions; they may also be personal decisions the designer makes about his own future. As we shall see later, the incentives which exist in a conventional management environment can motivate choices which subvert the intent of the sponsor.
The paper is essentially an elaboration and justification of this claim.
The preliminary stages of design work are more about organizing than design itself. First, the boundaries have to be understood so that the solution space can be defined. Second, the high-level structure of the system has to be explored so that work can be subdivided in a sensible way within the organisation that’s doing the design. This latter point is the crux of Conway’s argument:
…the very act of organizing a design team means that certain design decisions have already been made, explicitly or otherwise. Given any design team organization, there is a class of design alternatives which cannot be effectively pursued by such an organization because the necessary communication paths do not exist. Therefore, there is no such thing as a design group which is both organized and unbiased.
There are a couple of important points here:
- The act of delegating design tasks narrows the scope of design options that can be pursued.
- Once tasks are delegated to groups, coordination (via communication) between these groups is the only way that the work can be integrated.
Further, once established, it is very hard to change a design idea (or a project team, for that matter).
The system mirrors the organisation
Most systems of any significance are composed of several subsystems that communicate with each other via interfaces. According to Conway, these elements (systems, subsystems and interfaces) have a correspondence with the organisation that designs the system. How so? Well, every subsystem is designed by a group within the organisation (call it a design group). If two subsystems are to interact with each other, the two groups responsible for their design must communicate with each other (to negotiate the interface design). If subsystems don’t interact, no communication is necessary. What we see from this argument is that the communication between subsystems roughly mirrors the communication paths within the organisation.
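The correspondence can be made concrete with a small sketch. The subsystem names, team names and interfaces below are purely hypothetical (they are not from Conway's paper); the point is only that each interface crossing a group boundary implies a communication path between the two groups.

```python
# Hypothetical example: which design group owns which subsystem.
owner = {
    "billing": "team_a",
    "orders": "team_a",
    "inventory": "team_b",
    "reporting": "team_c",
}

# Hypothetical interfaces: pairs of subsystems that interact with each other.
interfaces = [
    ("billing", "orders"),
    ("orders", "inventory"),
    ("billing", "reporting"),
]


def required_team_links(owner, interfaces):
    """Return the set of group pairs that must communicate to negotiate interfaces."""
    links = set()
    for first, second in interfaces:
        group_1, group_2 = owner[first], owner[second]
        if group_1 != group_2:
            # An interface inside a single group needs no cross-group path.
            links.add(frozenset((group_1, group_2)))
    return links


links = required_team_links(owner, interfaces)
for pair in sorted(sorted(link) for link in links):
    print(" <-> ".join(pair))
```

In this made-up layout, team_b and team_c own no interacting subsystems, so no communication path between them is needed.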
As any system designer knows: given a set of requirements, there are a number of designs that can satisfy them. If the argument of the previous paragraph is true then the structure of the design organisation (or project team) influences the choice that is made.
Managing systems design
Conway points out that large system design efforts spin out of control more often than those for small systems. He surmises that this happens when the design becomes too complex for one person (or a small, tightly-knit group of people). A standard management reaction to such a situation is to delegate the design of components to sub-teams. Why? Well here’s what Conway says:
A manager knows that he will be vulnerable to the charge of mismanagement if he misses his schedule without having applied all his resources. This knowledge creates a strong pressure on the initial designer who might prefer to wrestle with the design rather than fragment it by delegation, but he is made to feel that the cost of risk is too high to take the chance. Therefore, he is forced to delegate in order to bring more resources to bear.
A major fallacy in this line of thinking is that more resources means that work gets done faster. It is well known that this isn’t so – at least as far as software systems development is concerned. Conway points out that politics also contributes to this effect. In most organisations, managerial status is tied to team size and project budgets. This provides an incentive to managers to expand their organisations (i.e. project teams), making design delegation almost inevitable.
Large teams have a large number of communication paths between their members. Specifically, in a team consisting of N people, there are N(N-1)/2 possible communication paths – each person can communicate with N-1 people making N(N-1), but this has to be halved because paths between every two individuals are counted twice. Organisations deal with this by restricting communication paths to hierarchical management structures. Because communication paths mirror organizational structures, it is almost inevitable that system designs will mirror them.
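To get a feel for how quickly the number of paths grows, here is a minimal sketch of the N(N-1)/2 count. The function name and the team sizes are illustrative choices of mine.

```python
def communication_paths(n: int) -> int:
    """Number of distinct pairwise communication paths in a team of n people."""
    if n < 0:
        raise ValueError("team size cannot be negative")
    # Each of the n people can talk to n - 1 others; halve to avoid double counting.
    return n * (n - 1) // 2


for size in (2, 5, 10, 50):
    print(f"team of {size:2d}: {communication_paths(size)} possible paths")
```

The count grows roughly as the square of the team size, which is why restricting communication to a hierarchy is such a common response.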
The main implication of Conway’s thesis is that a project team (or any organisation) charged with designing a system should be structured in a way that suits the communication needs of the system. For example, sub-teams involved in designing related subsystems should have many more communication channels than those that design independent components. Further, system design is inherently complex, and the first design is almost never the final one. A consequence that flows from this is that design organisations should be flexible because they’ll almost always need to be reorganized.
In the end it is less about the number of people on a team than the communication between them. As Conway mentions in the last two lines of his paper:
There is need for a philosophy of system design management which is not based on the assumption that adding manpower simply adds to productivity. The development of such a philosophy promises to unearth basic questions about value of resources and techniques of communication which will need to be answered before our system-building technology can proceed with confidence.
This is as true now as it was forty-odd years ago.
A large number of the posts on this blog do not get much attention – not too many hits and few if any comments. There could be several reasons for this, but I need to consider the possibility that readers find many of the things I write about uninteresting. Now, this isn’t for the want of effort from my side: I put a fair bit of work into research and writing, so it is a little disappointing. However, I take heart from the possibility that it might not be entirely my fault: there’s a statistical reason (excuse?) for the dearth of quality posts on this blog. This (possibly uninteresting) post discusses this probabilistic excuse.
The argument I present uses the concepts of conditional probability and Bayes Theorem. Those unfamiliar with these may want to have a look at my post on Bayes theorem before proceeding further.
Grist for my blogging mill comes from a variety of sources: work, others’ stories, books, research papers and the Internet. Because of time constraints, I can write up only a fraction of the ideas that come to my attention. Let’s put a number to this fraction – say I can write up only 10% of the ideas I come across. Assuming that my intent is to write interesting stuff, this number corresponds to the best (or most interesting) ideas I encounter. Of course, the term “interesting” is subjective – an idea that fascinates me might not have the same effect on you. However this is a problem for most qualitative judgements, so we’ll accept this and move on.
If we denote the event “I have an interesting idea” by $I$ and its probability by $P(I)$, we have:
$P(I) = 0.1$
Then, if we denote the event “I have an idea that is uninteresting” by $U$, we have:
$P(U) = 1 - P(I) = 0.9$
assuming that an idea must either be interesting or uninteresting (no other possibilities allowed).
Now, for me to write up an idea, I have to find it interesting (i.e. judge it as being in the top 10%). Let’s be generous and assume that I correctly recognise an interesting idea (as being interesting) 70% of the time. From this, the conditional probability of my writing a post given that I encounter an interesting idea, $P(W|I)$, is:
$P(W|I) = 0.7$
where $W$ is the event that I write up an idea.
On the flip side, let’s assume that I correctly recognise 80% of the uninteresting ideas that I encounter as being no good. This implies that I incorrectly identify 20% of the uninteresting stuff as being interesting. That is, 20% of the uninteresting stuff is wrongly identified as being blog-worthy. So, the conditional probability of my writing a post about an uninteresting idea, $P(W|U)$, is:
$P(W|U) = 0.2$
(If the above values for $P(W|I)$ and $P(W|U)$ are confusing remember that, by assumption, I write about all ideas that I find interesting – and this includes those ideas that I deem interesting but are actually uninteresting)
Now, we want to figure out the probability that a post that appears on my blog is interesting – i.e. that a post is interesting given that I have written it up. Using the notation of conditional probability, this can be written as $P(I|W)$. Bayes Theorem tells us that:
$P(I|W) = \frac{P(W|I) P(I)}{P(W)}$
$P(W)$, which is the probability that I write a post, can be expressed as follows:
$P(W)$ = probability that I write an interesting post + probability that I write an uninteresting post
This can be written as,
$P(W) = P(W|I) P(I) + P(W|U) P(U)$
Substituting this in the expression for Bayes Theorem, we get:
$P(I|W) = \frac{P(W|I) P(I)}{P(W|I) P(I) + P(W|U) P(U)}$
Using the numbers quoted above:
$P(I|W) = \frac{0.7 \times 0.1}{0.7 \times 0.1 + 0.2 \times 0.9} = \frac{0.07}{0.25} = 0.28$
So, only 28% of the ideas I write about are interesting. The main reason for this is my inability to filter out all the dross. These “false positives” – which are all the ideas that I identify as interesting but are actually not – are represented by the term $P(W|U) P(U)$ in the denominator. Since there are way more bad ideas than good ones floating around (pretty much everywhere!), the chance of false positives is significant.
So, there you go: it isn’t my fault really. 🙂
I should point out that the percentage of interesting ideas written up will be small whenever the false positive term is significant compared to the numerator. In this sense the result is insensitive to the values of the probabilities that I’ve used.
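For readers who would like to play with the numbers, the calculation can be sketched in a few lines of Python. The function and parameter names are my own, and the alternative false positive rates in the loop are assumptions chosen only to show how the result moves as the filter gets better or worse.

```python
def prob_interesting_given_written(p_interesting,
                                   p_write_given_interesting,
                                   p_write_given_uninteresting):
    """Bayes theorem: probability that a written-up idea is actually interesting."""
    p_uninteresting = 1.0 - p_interesting
    true_positives = p_write_given_interesting * p_interesting
    false_positives = p_write_given_uninteresting * p_uninteresting
    return true_positives / (true_positives + false_positives)


# The numbers used in this post: 10% interesting, 70% recognised, 20% false positives.
base = prob_interesting_given_written(0.1, 0.7, 0.2)
print(f"fraction of posts that are interesting: {base:.2f}")

# Assumed alternative false positive rates, for illustration only.
for false_positive_rate in (0.05, 0.1, 0.2, 0.3):
    result = prob_interesting_given_written(0.1, 0.7, false_positive_rate)
    print(f"false positive rate {false_positive_rate:.2f} -> {result:.2f}")
```

The smaller the false positive term relative to the numerator, the larger the fraction of interesting posts.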
Of course, the argument presented above is based on a number of assumptions. I assume that:
- Most readers of this blog share my interests.
- The ideas that I encounter are either interesting or uninteresting.
- There is an arbitrary cutoff point between interesting and uninteresting ideas (the 10% cutoff).
- There is an objective criterion for what’s interesting and what’s not, and that I can tell one from the other 70% of the time.
- The relevant probabilities are known.
…and so, to conclude
I have to accept that much of the stuff I write about will be uninteresting, but can take consolation in the possibility that it is a consequence of conditional probabilities. I’m trumped by conditionality, once more.
This post was inspired by Peter Rousseeuw’s brilliant and entertaining paper entitled, Why the Wrong Papers Get Published. Thanks also go out to Vlado Bokan for interesting conversations about conditional probabilities and Bayes theorem.
Projects are fraught with uncertainty, so it is no surprise that the language and tools of probability are making their way into project management practice. A good example of this is the use of Monte Carlo methods to estimate project variables. Such tools enable the project manager to present estimates in terms of probabilities (e.g. there’s a 90% chance that a project will finish on time) rather than illusory certainties. Now, it often happens that we want to find the probability of an event occurring given that another event has occurred. For example, one might want to find the probability that a project will finish on time given that a major scope change has already occurred. Such conditional probabilities, as they are referred to in statistics, can be evaluated using Bayes Theorem. This post is a discussion of Bayes Theorem using an example from project management.
Bayes theorem by example
All project managers want to know whether the projects they’re working on will finish on time. So, as our example, we’ll assume that a project manager asks the question: what’s the probability that my project will finish on time? There are only two possibilities here: either the project finishes on (or before) time or it doesn’t. Let’s express this formally. Denoting the event the project finishes on (or before) time by $T$, the event the project does not finish on (or before) time by $\tilde T$ and the probabilities of the two by $P(T)$ and $P(\tilde T)$ respectively, we have:
$P(T) + P(\tilde T) = 1$ (1)
Equation (1) is simply a statement of the fact that the sum of the probabilities of all possible outcomes must equal 1.
Fig 1. is a pictorial representation of the two events and how they relate to the entire universe of projects done by the organisation our project manager works in. The rectangular areas $T$ and $\tilde T$ represent the on time and not on time projects, and the sum of the two areas, $T + \tilde T$, represents all projects that have been carried out by the organisation.
In terms of areas, the probabilities quoted above can be expressed as:
$P(T) = \frac{T}{T + \tilde T}$
$P(\tilde T) = \frac{\tilde T}{T + \tilde T}$
This also makes explicit the fact that the sum of the two probabilities must add up to one.
Now, there are several variables that can affect project completion time. Let’s look at just one of them: scope change. Let’s denote the event “there is a major change of scope” by $C$ and the complementary event (that there is no major change of scope) by $\tilde C$.
Again, since the two possibilities cover the entire spectrum of outcomes, we have:
$P(C) + P(\tilde C) = 1$
Fig 2. is a pictorial representation of $C$ and $\tilde C$.
The rectangular areas $C$ and $\tilde C$ represent the projects that have undergone major scope changes and those that haven’t respectively.
Clearly we also have $T + \tilde T = C + \tilde C$ since the number of projects completed is a fixed number, regardless of how it is arrived at.
Now things get interesting. One could ask the question: What is the probability of finishing on time given that there has been a major scope change? This is a conditional probability because it represents the likelihood that something will happen (on-time completion) on the condition that something else has already happened (scope change).
As a first step to answering the question posed in the previous paragraph, let’s combine the two events graphically. Fig 3 is a combination of Figs 1 and 2. It shows four possible events:
- On Time with Major Change ($T$, $C$) – denoted by the rectangular area $TC$ in Fig 3.
- On Time with No Major Change ($T$, $\tilde C$) – denoted by the rectangular area $T\tilde C$ in Fig 3.
- Not On Time with Major Change ($\tilde T$, $C$) – denoted by the rectangular area $\tilde T C$ in Fig 3.
- Not On Time with No Major Change ($\tilde T$, $\tilde C$) – denoted by the rectangular area $\tilde T \tilde C$ in Fig 3.
We’re interested in the probability that the project finishes on time given that it has suffered a major change in scope. In the notation of conditional probability, this is denoted by $P(T|C)$. In terms of areas, this can be expressed as
$P(T|C) = \frac{TC}{TC + \tilde T C} = \frac{TC}{C}$
since $C$ (or equivalently $TC + \tilde T C$) represents all projects that have undergone a major scope change.
Similarly, the conditional probability that a project has undergone a major change given that it has come in on time, $P(C|T)$, can be written as:
$P(C|T) = \frac{TC}{TC + T\tilde C} = \frac{TC}{T}$
Now, what I’m about to do next may seem like pointless algebraic jugglery, but bear with me…
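Consider the ratio of the area $TC$ to the big outer rectangle (whose area is $T + \tilde T$). This ratio can be expressed as follows:
$\frac{TC}{T + \tilde T} = \frac{TC}{C} \cdot \frac{C}{T + \tilde T} = \frac{TC}{T} \cdot \frac{T}{T + \tilde T}$ (9)
This is simply multiplying and dividing by the same factor ($C$ in the second expression and $T$ in the third).
Written in the notation of conditional probabilities, the second and third expressions in (9) are:
$P(T|C) P(C) = P(C|T) P(T)$ (10)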
which is Bayes theorem.
From the above discussion, it should be clear that Bayes theorem follows from the definition of conditional probability.
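We can rewrite Bayes theorem in several equivalent ways:
$P(T|C) = \frac{P(C|T) P(T)}{P(C)}$ (11)
$P(T|C) = \frac{P(C|T) P(T)}{P(C|T) P(T) + P(C|\tilde T) P(\tilde T)}$ (12)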
where the denominator in (12) follows from the fact that a project that undergoes a major change will either be on time or will not be on time (there is no other possibility).
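The equivalence of these forms is easy to check by counting. The sketch below uses made-up project counts for the four rectangular areas of Fig 3 (they are not real data), and exact fractions so that the comparison does not depend on floating point rounding.

```python
from fractions import Fraction

# Hypothetical project counts for the four rectangles in Fig 3 (illustrative only).
on_time_change = 30         # on time, major change
on_time_no_change = 90      # on time, no major change
late_change = 50            # not on time, major change
late_no_change = 30         # not on time, no major change

total = on_time_change + on_time_no_change + late_change + late_no_change
on_time = on_time_change + on_time_no_change
late = late_change + late_no_change
changed = on_time_change + late_change

# Probabilities as ratios of counts (areas).
p_on_time = Fraction(on_time, total)
p_late = Fraction(late, total)
p_change = Fraction(changed, total)
p_change_given_on_time = Fraction(on_time_change, on_time)
p_change_given_late = Fraction(late_change, late)

# Conditional probability read directly off the counts.
direct = Fraction(on_time_change, changed)

# Bayes theorem with P(change) in the denominator.
via_simple_form = p_change_given_on_time * p_on_time / p_change

# Bayes theorem with the denominator split into on time and late projects.
via_expanded_form = (p_change_given_on_time * p_on_time) / (
    p_change_given_on_time * p_on_time + p_change_given_late * p_late
)

print(direct, via_simple_form, via_expanded_form)
```

All three values agree, as they must: the theorem is a restatement of how the areas are carved up.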
A numerical example
To complete the discussion, let’s look at a numerical example.
Assume our project manager has historical data on projects that have been carried out within the organisation. On analyzing the data, the PM finds that 60% of all projects finished on time. This implies:
$P(T) = 0.6$ (13)
$P(\tilde T) = 0.4$ (14)
Let us assume that our organisation also tracks major changes made to projects in progress. Say 50% of all historical projects are found to have major changes. This implies:
$P(C) = 0.5$ (15)
Finally, let us assume that our project manager has access to detailed data on successful projects, and that an analysis of this data shows that 30% of on-time projects have undergone at least one major scope change. This gives:
$P(C|T) = 0.3$ (16)
Equations (13) through (16) give us the numbers we need to calculate $P(T|C)$ using Bayes Theorem. Plugging the numbers in equation (11), we get:
$P(T|C) = \frac{0.3 \times 0.6}{0.5} = 0.36$
So, in this organisation, if a project undergoes a major change then there’s a 36% probability that it will finish on time. Compare this to the 60% (unconditional) probability of finishing on time. Bayes theorem enables the project manager to quantify the impact of change in scope on project completion time, providing the relevant historical data is available. The proviso at the end of the previous sentence is important; I’ll have more to say about it in the concluding section.
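The same arithmetic can be sketched in Python, using the historical figures assumed above (60% on time, 50% with a major change, 30% of on-time projects having a major change). The function and variable names are mine. As a small extension, the sketch also works out the probability of finishing on time when there is no major change, which follows from the same three numbers.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) from P(B|A), P(A) and P(B)."""
    return p_b_given_a * p_a / p_b


p_on_time = 0.6               # fraction of historical projects finishing on time
p_change = 0.5                # fraction of historical projects with a major change
p_change_given_on_time = 0.3  # fraction of on-time projects with a major change

# Probability of finishing on time given a major scope change.
p_on_time_given_change = bayes(p_change_given_on_time, p_on_time, p_change)

# Probability of finishing on time given no major change, using complements.
p_on_time_given_no_change = bayes(
    1 - p_change_given_on_time, p_on_time, 1 - p_change
)

print(f"P(on time | major change)    = {p_on_time_given_change:.2f}")
print(f"P(on time | no major change) = {p_on_time_given_no_change:.2f}")
```

With these assumed figures, projects without a major change fare considerably better than the 60% average, which is another way of seeing the impact of scope change.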
In closing this section I should emphasise that although my discussion of Bayes theorem is couched in terms of project completion times and scope changes, the arguments used are general. Bayes theorem holds for any pair of events.
It should be clear that the probability calculated in the previous section is an extrapolation based on past experience. In this sense, Bayes Theorem is a formal statement of the belief that one can predict the future based on past events. This goes beyond probability theory; it is an assumption that underlies much of science. It is important to emphasise that the prediction is based on enumeration, not analysis: it is solely based on ratios of the number of projects in one category versus the other; there is no attempt at finding a causal connection between the events of interest. In other words, Bayes theorem suggests there is a correlation between major changes in scope and delays, but it does not tell us why. The latter question can be answered only via a detailed study which might culminate in a theory that explains the causal connection between changes in scope and completion times.
It is also important to emphasise that data used in calculations should be based on events that are akin to the one at hand. In the case of the example, I have assumed that historical data is for projects that resemble the one the project manager is working on. This assumption must be validated because there could be situations in which a major change in scope actually reduces completion time (when the project is “scoped-down”, for instance). In such cases, one would need to ensure that the numbers that go into Bayes theorem are based on historical data for “scoped-down” projects only.
To sum up: Bayes theorem expresses a fundamental relationship between conditional probabilities of two events. Its main utility is that it enables us to make probabilistic predictions based on past events; something that a project manager needs to do quite often. In this post I’ve attempted to provide a straightforward explanation of Bayes theorem – how it comes about and what it’s good for. I hope I’ve succeeded in doing so. But if you’ve found my explanation confusing, I can do no better than to direct you to a couple of excellent references.
- An Intuitive (and short) explanation of Bayes Theorem – this is an excellent and concise explanation by Kalid Azad of Better Explained.
- An intuitive explanation of Bayes Theorem – this article by Eliezer Yudkowsky is the best explanation of Bayes theorem I’ve come across. However, it is very long, even by my verbose standards!
The relationship between software developers and project managers is often fraught with conflict. Yet they have to get along professionally: the success of a software project depends to a large extent on whether or not they can work together. A paper entitled Stop whining, start doing! Identity conflict in project-managed software production, published in the May 2009 issue of Ephemera, delves into this issue by looking at how the two parties view themselves in relation to each other. This post is a summary and review of the paper.
A quick word or two before diving into the paper. First, identity conflict occurs when a person or a group feels that their sense of self (i.e. their identity) is threatened or denied legitimacy and respect. Second, although the title of the paper suggests that it deals with identity conflict in project-managed software production, the authors’ focus is considerably narrower: the analysis presented in the paper is based on selected Slashdot discussion threads involving software developers and project managers. In effect, the authors draw their conclusions about identity conflict in software production based on opinions expressed in online forums. It is, I think, a stretch to read too much into such exchanges. I’ll have more to say about this in the final section of this post.
The authors of the paper, Peter Case and Erik Pineiro, begin their analysis by noting that although project managers are usually above programmers in the organizational hierarchy, the relationship is complicated by two factors:
- Project managers usually do not need (and often lack) the educational qualifications possessed by programmers.
- Project managers usually do not have the depth of technical knowledge that programmers have.
These two conditions often lead to the view amongst programmers that they (the programmers) have a higher organizational status (or matter more) than project managers do.
The authors end their introduction with the observation that the skills of programmers are necessary for the creation of software, but equally necessary is the need to direct software building efforts using some form of project management. This observation underlines the symbiotic relationship between developers and project managers. On the other hand, the differences between the two disciplines are also a source of conflict between the two parties. The main aim of the paper is to highlight how this conflict plays out in online discussions involving developers and project managers.
Case and Pineiro base their research on an analysis of contributions to online discussions on Slashdot. A large number of Slashdot contributors are programmers. The discussions cover a range of topics of interest to coders – from the merits of particular technologies, to outsourcing, aesthetics and personal philosophies of programming. The authors use two ideas (or, more accurately, lenses) to interpret the data, the data being individual contributions to discussions.
First, the authors contend that contributors to Slashdot are:
… bringing about social effects through their displays of technical bravado, expressions of aesthetic preference and espousals of dissent, resistance and subversion.
In particular, when discussing the relationship between programmers and project managers, contributors – through the discussion – are creating (or confirming) a certain view of the relationship between the two parties. In doing so, they are enacting their own identities (as members of a particular guild). They do so in opposition to the other party – i.e. by setting themselves apart by negating the skills, knowledge and work of the other.
Second, the work that programmers do has a certain value in the marketplace – i.e. it is created because of its (direct or indirect) commercial value. This has two consequences:
- Programmers feel that they have to compromise on quality (or aesthetics) because of time pressures and, on the other hand, project managers feel that programmers do not understand the commercial imperative to move the project forward.
- Both sides feel that their own specialized area of knowledge – technical or managerial – is the truly important one as far as the commercial aspects of the project are concerned.
The authors use excerpts from Slashdot discussions to highlight these two dynamics at work.
Data analysis and discussion
The programmers’ perspective
In Slashdot discussions, Case and Pineiro find that programmers articulate different identities depending on the audience. In some situations they express an interest in (or agree with the importance of) meeting deadlines, getting the product to market etc. In other situations, particularly when responding to (or comparing themselves to) project managers, they might express an interest in code aesthetics – i.e. the importance of writing good, even beautiful, code. Several examples of the latter can be found in a discussion thread on software aesthetics. Many of the contributions to the discussion assert that (good) programmers:
- Are passionate about their work.
- Care deeply about quality.
- Understand the importance of good code.
Several contributors make these points in opposition to the work-related traits of managers. That is, they set up a stereotypical manager – who doesn’t care about code quality etc. – as a foil for their rhetorical arguments.
Technical knowledge is another important way in which programmers set themselves apart from project managers. The contention here is that managers cannot really understand what a software system is because they don’t have the required knowledge; an implication being that they are not qualified to manage its construction. Case and Pineiro mention how this issue can really raise emotions when it comes to the issue of outsourcing (see this discussion thread for example).
The project manager’s perspective
Although the Slashdot community is dominated by programmers, the authors were able to find a few contributions that gave the project managers’ viewpoint. For example, a discussion on project management for programmers gave some project managers an opportunity to articulate their views. In the discussion, project managers tended to focus on two aspects – performance (i.e. completing the project on time) and project management knowledge – both of which (they implied) programmers do not understand. See this comment for an example of the former and this one or this one for examples of the latter.
Project managers thus set themselves apart (i.e. construct their identities) as distinct from programmers. They are concerned with driving the project to successful completion and they have specialized knowledge and training to do this; both of which programmers do not possess.
The authors’ conclusions
From their analysis of selected Slashdot discussions, Pineiro and Case conclude the following:
- Software developers and project managers tend to construct their identities in opposition to each other – i.e. by setting themselves (their skills, motivations and concerns) as being different from the other. On the other hand, business imperatives require that they collaborate, so there is a symbiotic aspect to their relationship.
- Despite project managers being marginally higher up than programmers in organizational hierarchy, there is little difference in the status of the two. The reasons for this are: 1) Project managers lack the specialized technical knowledge needed to understand programmers’ work and 2) Software project managers are dependent on programmers – perhaps more so than project managers in other disciplines. As a consequence, the difference in status might rankle with programmers. Thus the hierarchical proximity of the two parties might also be one of the reasons for the conflict between the two.
- Both parties generally claim to have the same goal – produce quality software in reasonable time. They differ, however, in the means to achieve the goal. Programmers believe the focus should be on producing high quality code whereas project managers tend to emphasise the optimization of resources and time, whilst meeting the system requirements.
The paper isn’t an easy read because of the wide use of sociological jargon, but then it is a research paper. The central idea – that online discussions can serve to construct or reinforce programmer and project manager identities – is an intriguing one. If true, it implies that one’s self (and projected) image influences, and is influenced by, how one describes oneself to others in speech or writing. For instance, does my claim that I’m a “seasoned database architect” and “experienced project manager” reinforce my professional identity? Also interesting is the idea that developer and project manager identities are often constructed in opposition to each other. That is, both parties appear to build their identities by casting the other in a negative light.
However, as interesting as all this is, I believe it misses a crucial point: that programmers and project managers, for the most part, play out roles which are defined and enforced by the organisations they work for. If organisational rules and norms require programmers (or project managers) to behave in dysfunctional ways, then that’s exactly what most of them will do. On the other hand, if an organisation encourages behaviour that fosters good working relationships between the two parties, things would be different: those working in such environments would have more positive experiences to relate (this comment taken from one of the threads analysed by the authors is a case in point). It is surprising that the authors chose not to include such (positive) comments in their analysis.
In brief: the paper presents an interesting perspective on the relationship between programmers and project managers. Even though the authors’ focus is somewhat limited, the paper is of potential interest to researchers and practitioners interested in work relationships in project environments.