Fooled by conditionality
Conditional probability refers to the chance that an event will occur given that another (possibly related) event has occurred. Understanding how conditional probability works is important – and occasionally even a matter of life or death. For instance, a person may want to know the chance that she has a life-threatening disease given that she has tested positive for it.
Unfortunately, there is a good deal of confusion about conditional probability. I’m amongst the confused, so I thought I’d do some reading on the topic. I began with the Wikipedia article (which wasn’t too helpful) and then went on to other references. My search led me to some interesting research papers on the confusion surrounding conditional probabilities. This post is inspired by a couple of papers I came across.
In a paper on the use and misuse of conditional probabilities, Walter Kramer and Gerd Gigerenzer point out that many doctors cannot answer the “life or death” question that I posed in the first paragraph of this post. Here are a few pertinent lines from their article:
German medical doctors with an average of 14 years of professional experience were asked to imagine using a certain test to screen for colorectal cancer. The prevalence of this type of cancer was 0.3%, the sensitivity of the test (the conditional probability of detecting cancer when there is one) was 50% and the false positive rate was 3%. The doctors were asked: “What is the probability that someone who tests positive actually has colorectal cancer?”
Kramer and Gigerenzer found that the doctors’ answers ranged from 1% to 99%, with about half of them answering 50% (the sensitivity) or 47% (the sensitivity minus the false positive rate).
You may want to have a try at answering the question before proceeding further.
The question can be answered quite easily using Bayes’ rule, which tells us how to calculate the conditional probability of an event given that another (possibly related) event has occurred. If the two events are denoted by A and B, the conditional probability that A will occur given that B has occurred, denoted by P(A|B), is:
P(A|B) = P(B|A) * P(A) / P(B)
where P(B|A) is the conditional probability that B will occur given that A has occurred, and P(A) and P(B) are the probabilities of A and B occurring respectively. See the appendix at the end of this post for more on Bayes’ rule.
In terms of the problem stated above, Bayes’ rule is:
P(Has cancer|Tests positive) = P(Tests positive|Has cancer) * P(Has cancer) / P(Tests positive)
From the problem statement we have:
P(Tests positive|Has cancer) = 0.5
P(Has cancer) = 0.003
P(Tests positive) = (1-0.003)*0.03 + 0.003*0.5
Note that P(Tests positive) is obtained by noting that a person can test positive in two ways:
- Not having the disease and testing positive.
- Having the disease and testing positive.
Plugging the numbers in, we get:
P(Has cancer|Tests positive) = 0.5 * 0.003 / (0.997*0.03 + 0.003*0.5) = 0.047755
Or about 5%.
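For readers who want to check the arithmetic, here is a short Python sketch of the calculation (the variable names are mine, not from the paper):

```python
# Bayes' rule for the colorectal cancer screening example
prevalence = 0.003            # P(Has cancer)
sensitivity = 0.5             # P(Tests positive | Has cancer)
false_positive_rate = 0.03    # P(Tests positive | No cancer)

# Total probability of testing positive: with the disease or without it
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

# Bayes' rule: P(Has cancer | Tests positive)
p_cancer_given_positive = sensitivity * prevalence / p_positive

print(round(p_cancer_given_positive, 6))  # 0.047755
```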
Kramer and Gigerenzer contend that the root of the confusion lies in the problem statement: people find it unnatural to reason in terms of probabilities because the terminology of conditional probability is confusing (A given B, B given A – it’s enough to make one’s head spin). To resolve this they recommend stating the problem in terms of frequencies – i.e. number of instances – rather than ratios.
OK, so let’s restate the problem in terms of frequencies (note this is my restatement, not Gigerenzer’s):
Statistically, 3 people in every 1000 have colorectal cancer. We have a test with a sensitivity of 50%, so out of the 3 people who have the disease, 1.5 of them will test positive for it. The test has a false positive rate of 3%, so about 30 (29.91, to be exact) of the remaining 997 people who don’t have the disease will also test positive. What is the probability that someone who tests positive has the disease?
From the problem restatement we have:
Total number of people who have cancer and test positive in every 1000 = 1.5
Total number of people who test positive in every 1000 = 1.5+30=31.5
P(Have cancer|Test positive) = 1.5/31.5=0.047619
The small difference between the two numbers is due to rounding error (I’ve rounded 29.91 up to 30).
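The frequency version can be checked the same way. Here is a sketch that works directly with counts in a notional population of 1000, using the exact figure 29.91 rather than the rounded 30 (again, the naming is mine):

```python
# Frequency restatement: counts in a notional population of 1000 people
population = 1000
with_cancer = 3                                       # 0.3% prevalence
true_positives = with_cancer * 0.5                    # 50% sensitivity -> 1.5
false_positives = (population - with_cancer) * 0.03   # 3% of the 997 without cancer -> 29.91

total_positives = true_positives + false_positives    # 31.41
p_cancer_given_positive = true_positives / total_positives

print(round(p_cancer_given_positive, 6))  # 0.047755, same as the Bayes' rule answer
```

Working with the exact count (29.91 rather than 30) removes the rounding error and reproduces the Bayes’ rule answer exactly.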
There’s no question that this is much more straightforward.
But the story doesn’t end there. In a paper entitled The Non-use of Bayes Rule, Thomas Dohmen and his colleagues Armin Falk, David Huffman, Felix Marklein and Uwe Sunde measured the ability to use Bayesian reasoning (which is academese for “reasoning using Bayes’ rule”) in a representative sample of the German population. They did so by asking those sampled to answer a question that involved conditional probability. Being aware of Gigerenzer’s work, they stated their question in frequencies rather than probabilities. Here is the question they posed, taken directly from their paper:
Imagine you are on vacation in an area where the weather is mostly sunny and you ask yourself how tomorrow’s weather will be. Suppose that, in the area you are in, on average 90 out of 100 days are sunny, while it rains on 10 out of 100 days. The weather forecast for tomorrow predicts rain. On average, the weather forecast is correct on 80 out of 100 days. What do you think is the probability, in percent, that it is going to rain tomorrow?
Again, you may want to have a go at the problem before proceeding further.
The solution is obtained by a straightforward application of Bayes rule which, for the problem above, reads:
P(Rain|Rain forecast) = P(Rain forecast|Rain) * P(Rain) / P(Rain forecast)
P(Rain forecast|Rain) = 0.8 (since the forecast is correct 80% of the time)
P(Rain) = 0.1
P(Rain forecast) = P(Rain forecast|Rain) * P(Rain) + P(Rain forecast|Sun) * P(Sun) = 0.8*0.1 + 0.2*0.9
So, plugging the numbers in, we get P(Rain|Rain forecast) = 0.08 / (0.08 + 0.18) = 0.3077 – or approximately 31%.
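As a cross-check, here is the same calculation in Python (the variable names are mine):

```python
# Bayes' rule for the rain forecast example
p_rain = 0.1                   # P(Rain): 10 out of 100 days are rainy
p_forecast_given_rain = 0.8    # forecast is correct on rainy days
p_forecast_given_sun = 0.2     # forecast is wrong on sunny days (predicts rain)

# Total probability that rain is forecast
p_forecast = p_forecast_given_rain * p_rain + p_forecast_given_sun * (1 - p_rain)

# Bayes' rule: P(Rain | Rain forecast)
p_rain_given_forecast = p_forecast_given_rain * p_rain / p_forecast

print(round(p_rain_given_forecast, 4))  # 0.3077
```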
The surprising thing is that in the study no one got this right, and only 6% of those who were surveyed gave answers within 10% of the correct one.
Dohmen et al. go on to point out that those with higher education levels – in particular, those with higher degrees – were more likely to get the problem wrong! (So it is true: education causes more confusion than clarity.)
Anyway, it appears that stating the problem in terms of frequencies doesn’t help as much as Kramer and Gigerenzer suggest.
In my opinion, whether the problem is stated in terms of frequencies or ratios is neither here nor there. The key is to state the problem clearly. In the restatement of the cancer test problem, it isn’t so much the use of frequencies that helps as the fact that the relevant numbers are presented unambiguously. There is little interpretation required on the problem solver’s part. It is very clear what needs to be done; so clear that one does not need to use Bayes’ rule at all. In contrast, in the second problem the respondent still has to figure out the individual probabilities that need to be plugged into Bayes’ formula. This requires some interpretation and thought, which doesn’t always come easily. In fact, such reasoning seems to be harder for those with higher degrees than for those without. The last paragraph of Dohmen’s paper states:
In a cognitive task as complex as the one we use in this paper, one would expect deliberation cost to be relatively high for people with less formal education. In contrast, for highly educated people deliberation cost should be relatively low. Other things equal, this reasoning would imply that more educated people perform better in assessing conditional probabilities. Our results indicate the contrary, as education, in particular university education, increases the likelihood that respondents are led astray in the probability judgment task. An identification of the exact channels which are responsible for the detrimental effect of education and cognitive ability on Bayesian judgment constitutes a fascinating area for future research.
Fascinating or not, I now have a scapegoat to blame for my being fooled by conditionality.
Appendix: A “derivation” of Bayes Rule
Here’s a quick “derivation” of Bayes rule (the quotes denote that some of the steps in the derivation are a consequence of definitions rather than inferences).
To keep the discussion concrete, we’ll assume that A is the event that a patient has cancer and B the event that a patient tests positive.
The left hand side of Bayes rule, in terms of these events, is:
P(Has cancer|Tests positive) = P(Has cancer & Tests positive|Tests positive)
= P(Has cancer & Tests positive)/P(Tests positive) …..(1)
The second expression in (1) above is merely a restatement of the first. The third is obtained by noting that
P(Has cancer & Tests positive|Tests positive) = (Number of people who have cancer & test positive)/(Number of people who test positive) …..(2)
and that the probabilities in the numerator and denominator of the third statement in (1) are:
P(Has cancer & Tests positive) = (Number of people who have cancer & test positive)/(Total population sampled) …..(3)
P(Tests positive) = (Number of people who test positive)/(Total population sampled) …..(4)
The third expression in (1) follows from the fact that the denominators of (3) and (4) are identical.
We further note that
P(Has cancer & Tests positive) = [P(Tests positive & Has cancer)/P(Has cancer)] * P(Has cancer)
= P(Tests positive|Has cancer) * P(Has cancer) …..(5)
Here we have divided and multiplied by the same factor – P(Has cancer) – and used the fact that P(Tests positive & Has cancer)/P(Has cancer) is the same as P(Tests positive|Has cancer).
Substituting (5) in the right hand side of (1), we get:
P(Has cancer|Tests positive) = P(Tests positive|Has cancer) * P(Has cancer) / P(Tests positive)
which is Bayes’ rule for this case.
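The chain of equalities above can also be checked numerically. Here is a sketch that works with the counts implied by the cancer example in a notional population of 100,000 (the population size is my choice, picked so that all the counts come out whole):

```python
# Verify Bayes' rule against raw counts, as in the derivation above
population = 100_000
has_cancer = 300                  # 0.3% prevalence
cancer_and_positive = 150         # 50% of those with cancer test positive
no_cancer_and_positive = 2991     # 3% of the 99,700 without cancer

tests_positive = cancer_and_positive + no_cancer_and_positive

# Left-hand side: P(Has cancer | Tests positive) straight from counts, as in (1) and (2)
lhs = cancer_and_positive / tests_positive

# Right-hand side: P(Tests positive | Has cancer) * P(Has cancer) / P(Tests positive)
rhs = (cancer_and_positive / has_cancer) * (has_cancer / population) / (tests_positive / population)

assert abs(lhs - rhs) < 1e-12  # the two sides agree
```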
As further reading, I recommend Eliezer Yudkowsky’s brilliant essay, An Intuitive Explanation of Bayes’ Theorem.