Prosecutor's fallacy

The prosecutor's fallacy is any of several fallacies of statistical reasoning often used in legal arguments. Two of the most common errors are described below:


 * One form of the fallacy results from misunderstanding conditional probability, or from neglecting the prior odds of a defendant being guilty, i.e., the chance that an individual is guilty before the specific evidence is considered. When a prosecutor has collected some evidence (for instance a DNA match) and has an expert testify that the probability of finding this evidence if the accused were innocent is tiny, the fallacy occurs if it is concluded that the probability of the accused being innocent must be comparably tiny. The probability of innocence would be approximately that same small value only if the prior odds of guilt were 1:1 (and the evidence were near certain to be found if the accused were guilty). In reality the probability of guilt depends on the other circumstances. If the person is already suspected for other reasons, the prior probability of guilt is higher and the resulting probability of guilt may be very high; if he is otherwise totally unconnected to the case, a much lower prior probability should be used, such as the overall rate of offenders in the populace for the crime in question, and the resulting probability of guilt is much lower.


 * Another form of the fallacy results from misunderstanding multiple testing, such as when evidence is compared against a large database. The larger the database, the more likely it is that a match will be found by pure chance alone. DNA evidence is therefore soundest when a match is found after a single directed comparison; when a poor-quality test sample (common for recovered evidence) is compared against a large database, a match somewhere in the database is quite likely by mere chance.

The terms "prosecutor's fallacy" and "defense attorney's fallacy" were coined by William C. Thompson and Edward Schumann in their classic 1987 article "Interpretation of Statistical Evidence in Criminal Trials: The Prosecutor's Fallacy and the Defense Attorney's Fallacy".

Examples of prosecutor's fallacies
Concrete examples help to clarify the statistical reasoning behind these ideas:

1. Conditional Probability. Consider this case: you win the lottery jackpot. You are then charged with having cheated, for instance with having bribed lottery officials. At the trial, the prosecutor points out that winning the lottery without cheating is extremely unlikely, and that therefore your being innocent must be comparably unlikely. This reasoning is intuitively faulty: it could be applied to any lottery winner, even though we know that lotteries are regularly won by honest players. The flaw in the logic is that the prosecutor has ignored the prior probabilities: with many players, an honest win by somebody is to be expected, and the chance that a given winner cheated must be judged against how rare cheating is in the first place.

2. Multiple Testing. In another scenario, assume a rape has been committed and that a sample from the crime scene is compared against the profiles of 20,000 men who have their DNA on record in a database. A match is found, and at the matching man's trial it is testified that the probability that two DNA profiles match by chance is only 1 in 10,000. This does not mean the probability that the suspect is innocent is 1 in 10,000. Since 20,000 men were tested, there were 20,000 opportunities to find a match by chance; even if none of them is the culprit, the probability that there is at least one DNA match is


 * $$1 - \left(1-\frac{1}{10000}\right)^{20000} \approx 86\%$$

which is considerably more than 1 in 10,000. (The probability that exactly one of the 20,000 men has a match is about 27%, which is still rather high.)
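The two percentages above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes every comparison is independent and that nobody in the database is the culprit, and the function name `chance_match_stats` is made up for illustration.

```python
# Chance of coincidental matches when one sample is compared against a database.
# Assumption: every comparison is independent and nobody in the database is the culprit.

def chance_match_stats(p_match, n_profiles):
    """Return (P(at least one match), P(exactly one match), expected matches)."""
    p_none = (1 - p_match) ** n_profiles
    p_exactly_one = n_profiles * p_match * (1 - p_match) ** (n_profiles - 1)
    return 1 - p_none, p_exactly_one, n_profiles * p_match

# Figures from the example: 1-in-10,000 match probability, 20,000 profiles.
at_least_one, exactly_one, expected = chance_match_stats(1 / 10_000, 20_000)
print(f"P(at least one chance match) = {at_least_one:.3f}")
print(f"P(exactly one chance match)  = {exactly_one:.3f}")
print(f"expected chance matches      = {expected:.1f}")
```

The expected number of chance matches (database size times match probability) is a quick way to judge whether a database hit is surprising at all.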

Mathematical analysis
We can view finding a person innocent or guilty in mathematical terms as a form of binary classification.

We start with a thought experiment. I have a big bowl with one thousand balls, some of them made of wood, some of them made of plastic. I know that 100% of the wooden balls are white, and only 1% of the plastic balls are white, the others being red. Now I pull a ball out at random, and observe that it is white. Given this information, how likely is it that the ball I pulled out is made of wood? Is it 99%? Not necessarily! Maybe the bowl contains only 10 wooden and 990 plastic balls; then about 10 of the roughly 20 white balls are wooden, and the answer is close to 50%. Without knowing the composition of the bowl (the prior probability), we cannot make any statement. In this thought experiment, you should think of the wooden balls as "accused is guilty", the plastic balls as "accused is innocent", and the white balls as "the evidence is observed".
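To see how strongly the answer depends on the composition of the bowl, the thought experiment can be computed directly. The sketch below uses the percentages from the text; the function name `p_wood_given_white` and the 500/500 and 990/10 compositions are illustrative assumptions, not part of the original example.

```python
def p_wood_given_white(n_wood, n_plastic, p_white_wood=1.0, p_white_plastic=0.01):
    """Probability that a randomly drawn white ball is wooden (Bayes' theorem)."""
    white_wood = n_wood * p_white_wood            # expected white wooden balls
    white_plastic = n_plastic * p_white_plastic   # expected white plastic balls
    return white_wood / (white_wood + white_plastic)

# 10/990 is the composition mentioned in the text; the other two are hypothetical.
for n_wood in (10, 500, 990):
    n_plastic = 1000 - n_wood
    p = p_wood_given_white(n_wood, n_plastic)
    print(f"{n_wood:4d} wooden, {n_plastic:4d} plastic -> P(wood | white) = {p:.3f}")
```

Only the evenly split bowl gives an answer near 99%, which corresponds to prior odds of 1:1.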

The fallacy can be analyzed using conditional probability: suppose E is the observed evidence, and I stands for "accused is innocent". We know that P(E|I) (the probability that the evidence would be observed if the accused were innocent) is tiny. The prosecutor wrongly concludes that P(I|E) (the probability that the accused is innocent, given the evidence E) is comparably tiny. However, P(E|I) and P(I|E) are quite different; using Bayes' theorem we see


 * $$ P(I|E) = \frac{P(E|I) \cdot P(I) }{P(E)}. $$

So the prior probability of innocence P(I) and the overall probability of the observed evidence P(E) need to be taken into account. P(E) is the sum of the probability that the person is innocent but the evidence is nevertheless against him (P(I) times P(E|I)) and the probability that he is guilty and the evidence is against him. If P(E) is itself small, as it is when guilt is a priori unlikely, then P(I|E) can be large even though P(E|I) is tiny.
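Written out, with ~I standing for "accused is guilty", the decomposition of P(E) described above is

 * $$ P(E) = P(E|I) \cdot P(I) + P(E|\sim I) \cdot P(\sim I), $$

and substituting it into Bayes' theorem gives

 * $$ P(I|E) = \frac{P(E|I) \cdot P(I)}{P(E|I) \cdot P(I) + P(E|\sim I) \cdot P(\sim I)}. $$

When P(~I) is very small, the second term in the denominator is small too, so the fraction need not be close to zero however small P(E|I) is.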

We can also formulate Bayes' theorem with odds:


 * $$ \operatorname{Odds}(I|E) = \operatorname{Odds}(I)\cdot\frac{P(E|I)}{P(E|\sim I)} $$

Without knowledge of the prior odds of I, the small value of P(E|I) does not necessarily imply that Odds(I|E) is small. (P(E|~I), the probability that the evidence is observed given the accused is guilty, is assumed to be high.)

The fallacy lies in the fact that the prior probability of guilt is not taken into account. If this probability is small, then the effect of the presented evidence is to increase it dramatically (the odds of guilt are multiplied by P(E|~I)/P(E|I)), but this does not necessarily make it overwhelming. (In the example below of a city with 10 million people, the presented evidence raises the prior probability of guilt of 1 in 10 million to a posterior probability of guilt of roughly 1 in 10.)
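The odds form can be checked numerically for the city of 10 million people and a one-in-a-million chance match. The sketch below assumes that a guilty person matches with probability 1; the function names are hypothetical helpers, not part of any library.

```python
def posterior_odds_of_guilt(prior_odds, p_e_given_guilty, p_e_given_innocent):
    """Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * (p_e_given_guilty / p_e_given_innocent)

def odds_to_probability(odds):
    """Convert odds in favour of an event into a probability."""
    return odds / (1 + odds)

P_MATCH_IF_GUILTY = 1.0     # assumption: a guilty person always matches
P_MATCH_IF_INNOCENT = 1e-6  # one-in-a-million chance match

# Person picked out only by the match: one culprit among 10 million residents.
stranger = posterior_odds_of_guilt(1 / 9_999_999, P_MATCH_IF_GUILTY, P_MATCH_IF_INNOCENT)
# Person already suspected, with prior odds of guilt of 1:1.
suspect = posterior_odds_of_guilt(1.0, P_MATCH_IF_GUILTY, P_MATCH_IF_INNOCENT)

print(f"stranger: P(guilty | match) = {odds_to_probability(stranger):.4f}")
print(f"suspect:  P(guilty | match) = {odds_to_probability(suspect):.6f}")
```

The same evidence multiplies the odds by the same factor in both cases; only the starting point differs.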

The prosecutor's conclusion is therefore approximately correct only if the prior odds of guilt are 1:1 or higher. The prior odds in fact depend on the circumstances: was the person a suspect before the new evidence was found, or not?

In this picture, then, the fallacy consists in the prosecutor asserting a very low probability of innocence without mentioning that the information he omitted would have led to a different estimate.

In legal terms, the prosecutor argues from a presumption of guilt, as his role requires, which is contrary to the jurors' obligatory presumption of innocence, whereby a person is assumed to be innocent unless found guilty. If the person is suspected solely on the basis of this piece of evidence, then a more reasonable value for the prior odds of guilt might be a value estimated from the overall frequency of the given crime in the general population.

Defendant's fallacy
Suppose there is a one-in-a-million chance of a match given that the accused is innocent. The prosecutor says this means there is only a one-in-a-million chance of innocence. But if everyone in a community of 10 million people is tested, one expects 10 matches even if everyone is innocent.

The defendant's fallacy would be to say, "We would expect 10 matches in this city of 10 million people, so this particular piece of evidence suggests there is a 90% chance that the accused is innocent. So this evidence cannot be used to point to a conclusion of guilt, and should be excluded."

The problem with the defendant's argument is that there may be other available evidence which on its own is also not conclusive. For example, if CCTV cameras surrounding the scene of the crime spotted one hundred people there at the relevant time, one of whom was the accused, then the defendant could claim: "The video suggests a 99% chance that the defendant is innocent. The match suggested a 90% chance of innocence. So the conclusion should be a finding of innocence."

When the video evidence is combined with the match, the two together point strongly towards guilt, since (assuming the chances of being in the video and having the match are independent for an innocent person) the chance that the accused is innocent is about 0.0001. Although this is not conclusive proof, and only establishes a low probability of innocence in a simplified model that excludes other potential explanations such as a person being framed, it provides a much more compelling argument than either piece of evidence alone.

The calculation runs as follows. The prior probability that the man is innocent is 9,999,999/10,000,000. While the likelihood of having the match and being in the video may be 1 if guilty, the likelihood of the match if innocent is 1/1,000,000, and the likelihood of being in the video if innocent is about 1/100,000 (100 people out of 10 million), so (assuming independence) the likelihood of both happening if innocent is 1/100,000,000,000. That gives a posterior probability of being innocent of 9,999,999/100,009,999,999, which is 0.000099989991... or about 0.01%.
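The fraction above can be reproduced exactly with rational arithmetic. The following sketch uses the figures quoted in the text and the same simplifying assumptions (independence of the two pieces of evidence for an innocent person, and a likelihood of 1 for both if guilty); the variable names are illustrative only.

```python
from fractions import Fraction

# Figures from the text; independence and P(evidence | guilty) = 1 are assumptions.
prior_innocent = Fraction(9_999_999, 10_000_000)
prior_guilty = 1 - prior_innocent
p_match_if_innocent = Fraction(1, 1_000_000)
p_video_if_innocent = Fraction(1, 100_000)
p_both_if_innocent = p_match_if_innocent * p_video_if_innocent
p_both_if_guilty = Fraction(1)

# Bayes' theorem with P(E) expanded over "innocent" and "guilty".
numerator = p_both_if_innocent * prior_innocent
posterior_innocent = numerator / (numerator + p_both_if_guilty * prior_guilty)

print(posterior_innocent)         # exact fraction
print(float(posterior_innocent))  # decimal approximation
```

Using `Fraction` avoids floating-point rounding, so the result can be compared digit for digit with the value given in the text.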

The Sally Clark case
An interesting example of this concept is the case of Sally Clark, a British woman who was accused in 1998 of having killed her first child at 11 weeks of age, then conceived another child and allegedly killed that child at 8 weeks of age. The defense claimed that these were two cases of sudden infant death syndrome (SIDS or "cot death"); neither prosecution nor defense offered any other explanations for the deaths. The prosecution had expert witness Sir Roy Meadow testify that the probability of two children in the same family dying from SIDS is about 1 in 73 million. Some press reports at the time presented this figure as the probability that the deaths were accidental, or as the probability that Sally Clark was innocent. This is incorrect, because it does not take into consideration the prior probability that an arbitrarily chosen woman would murder two of her children. (The figure of 1 in 73 million has another flaw: it assumes that SIDS deaths within the same family are statistically independent, which they may not be.) Mrs Clark was convicted in 1999, prompting a press release by the Royal Statistical Society which pointed out the mistake.

To provide proper context for this number, the figure of 1 in 73 million (or whatever the correct value is) should have been compared to the probability of a mother killing one child, conceiving another and killing that one too. This probability is small, but not as small as the square of the probability of killing one child (because if a person has the motivation and capacity for doing it once, she may well have the motivation and capacity to do it a second time). Without further data, we can only speculate about the relative probabilities of the alternative theories.

A higher court later quashed Sally Clark's conviction, on other grounds, on 29 January 2003.