Enumeration or analysis? A note on the use and abuse of statistics in project management research
In a detailed and insightful response to my post on bias in project management research, Alex Budzier wrote, “Good quantitative research relies on Theories and has a sound logical explanation before testing something. Bad research gets some data throws it to the wall (aka correlation analysis) and reports whatever sticks.” I believe this is a very important point: a lot of current research in project management uses statistics in an inappropriate manner; using the “throwing data on a wall” approach that Alex refers to in his comment. Often researchers construct models and theories based on data that isn’t sufficiently representative to support their generalisations.
This point is the subject of a paper entitled, On Probability as a Basis for Action, published by Edwards Deming in 1975. In the paper, Deming makes the important distinction between enumerative and analytical studies. The basic difference between the two is that analytical studies are aimed at establishing cause and effect based on data (i.e. building theories that explain why the data is what it is), whereas enumerative studies are concerned with classification (i.e categorising data). In this post I delve into the use (or abuse) of statistics in project management research, with particular reference to enumerative and analytical studies. The discussion presented below is based on Deming’s paper and a very readable note by David and Sarah Kerridge.
Some terminology before diving into the discussion: Deming uses the notion of a frame, which he defines as an aggregate of identifiable physical units of some kind, any or all of which may be selected and investigated. In short: the aggregate of potential samples.
So what’s an enumerative study? In his paper, Deming defines it as one in which, “…action will be taken on the material in the frame studied…The aim of a study in an enumerative problem is descriptive. How many farms or people belong to this or that category? What is the expected out-turn of wheat for this region? How many units in the lot are defective? The aim (in the last example) is not to find out why there are so many or so few units in this or that category: merely how many.”
In contrast, an analytic study is one “in which action will be taken on the process or cause-system that produced the frame studied, the aim being to improve practice in the future…Examples include, comparison of two industrial processes A and B. (Possible) actions: adopt method B over method A, or hold on to A, or continue the experiment (gather more data).“
Deming also provides a criterion by which to distinguish between enumerative and analytic studies. To quote from the paper, “A 100 percent sample of the frame provides the complete answer to the question posed for the enumerative problem, subject to the limitations of the method of investigation. In contrast a 100 percent sample of the frame is inconclusive in an analytic problem“
It may be helpful to illustrate the above via project management examples. A census of tools used by project managers is an enumerative problem: sampling the entire population of project managers provides a complete answer. In contrast, building (or validating) a model of project manager performance is an analytic study: it is not possible, even in principle, to verify the model under all circumstances. To paraphrase Deming: there is no statistical method by which to extrapolate the validity of the model to other project managers or environments. This is the key point. Statistical methods have to be complemented by knowledge of the subject matter – in the case of project manager performance this may include organisational factors, environmental effects, work history and experience of project managers etc. Such knowledge helps the investigator design studies that cover a wide range of circumstances, paving the way for generalisations necessary for theory building. Basically, the sample data must cover the entire range over which generalisations are made. What this means is that the choice of samples depends on the aim of the study. The Kerridges offer some examples in their note, which I reproduce below:
Aim: Discover problems and possibilities, to form a new theory.
Method: Look for interesting groups, where new ideas will be obvious. These
may be focus groups, rather than random samples. Accuracy and rigour aren’t required. But this assumes that the possibilities discovered will be tested by other means, before making any prediction.
Aim: Predict the future, to test a general theory.
Method: Study extreme and atypical samples, with great rigour and accuracy.
Aim: Predict the future, to help management.
Method: Get samples as close as possible to the foreseeable range of circumstances
in which the prediction will be used in practice.
Aim: Change the future, to make it more predictable.
Method: Use statistical process control to remove special causes, and experiment using the PDSA cycle to reduce common cause variation.
Unfortunately, many project management studies that purport to build theories do not exercise appropriate care in study design. The typical offence is that samples used in the studies do not support generalisations made. The resulting theories are thus built on flimsy empirical foundations. To be sure, most offenders label their studies as preliminary (other favoured adjectives include exploratory, tentative initial etc), thereby absolving themselves of responsibility for their irresponsible speculations. That would be OK if such work were followed up by a thorough empirical study, but it often isn’t. I’m loath to point fingers at specific offenders, but readers will find an example or two amongst papers reviewed on this blog. Lest I be accused of making gross and unfair generalisations, I should hasten to add that the reviews also include papers in which statistical analysis is done right (I’ll leave it to the reader to figure out which ones these are…).
To sum up: in this post I’ve discussed the difference between enumerative and analytic studies and its implications for the validity of some published project management research. Enumerative statistics deals with counting and categorisations whereas the analytical studies are concerned with clarifying cause-effect relationships. In analytical work, it is critical that samples are chosen that reflect the stated intent of the work, be it general theory-building or prediction in specific circumstances. Although this distinction should be well understood (having been articulated clearly over quarter a century ago!), it appears that it isn’t always given due consideration in project management research.