The misuse of statistics in court

 

The following are miscellaneous notes and comments from a series of articles and a presentation by Nick Thieme for the Royal Statistical Society (RSS). Thieme is a data journalist whose work has appeared in the RSS magazine “Significance” as well as Slate Magazine, BuzzFeed News, and Undark Magazine.

17 October 2018 (Washington, DC) –  An elderly woman is picking up litter when she is barrelled over and robbed by a younger white woman with a blond ponytail. A few moments later, his attention drawn by crying and screaming, an eyewitness sees a woman matching that description hop into a yellow car driven by a black man with a beard and a moustache. Four days later, Janet Collins, a white woman with a blond ponytail, and her husband Malcolm Collins, a black man without a beard, are questioned about the crime. Two weeks later they are charged with the robbery and four months later they are convicted.

A key piece of evidence in their trial was an “analysis” performed by a mathematics instructor at a state college. It went like this: the probability of a partly yellow automobile is 1/10, of a man with a moustache is 1/4, of a woman with a ponytail is 1/10, of a woman with blonde hair is 1/3, of a black man with a beard is 1/10, and of an interracial couple in a car is 1/1000. Multiplying these together, the prosecutor concluded that “there was but one chance in 12 million that any couple possessed the distinctive characteristics of the defendants”, and that “the chances of anyone besides these defendants being there, … having every similarity, … is something like one in a billion”.
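The prosecutor's arithmetic is easy to reproduce. A quick sketch in Python, using exactly the figures asserted at trial, shows where the famous "one in 12 million" comes from – and the comment flags the independence assumption the multiplication silently relies on:

```python
from fractions import Fraction

# The six probabilities asserted at trial (naive estimates that were
# never substantiated with evidence).
traits = {
    "partly yellow automobile": Fraction(1, 10),
    "man with moustache": Fraction(1, 4),
    "woman with ponytail": Fraction(1, 10),
    "woman with blond hair": Fraction(1, 3),
    "black man with beard": Fraction(1, 10),
    "interracial couple in car": Fraction(1, 1000),
}

# Multiplying probabilities is valid only if the events are
# statistically independent -- these are not (e.g. "black man with
# beard" overlaps "man with moustache").
product = Fraction(1, 1)
for p in traits.values():
    product *= p

print(product)  # 1/12000000
```

The arithmetic itself is correct; the "glaring defects" the court later identified lie in where the six numbers came from and in treating overlapping traits as independent.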

The intellectual bankruptcy of this argument is hard to overstate. According to The Double Helix and the Law of Evidence, by David Kaye:

the court saw two ‘glaring defects’ in the calculation – ‘an inadequate evidentiary foundation’ for the numbers that were multiplied and ‘an inadequate proof of statistical independence’ of the physical traits in question – pointing out that black men with beards and men with moustaches represent “overlapping categories”.

The California Supreme Court overturned the conviction on appeal. But mistakes of this sort are still made in courtrooms today. Why should that be, what can be done to limit them, and how do we guarantee defendants a fair shake when statistics are used in court?

The limits of education

It would be uncharitable and incorrect to imply that statistics are always misused or misunderstood in a legal setting. Indeed, the US legal system’s statistical highlights are as prescient as they are exemplary. One of the best writers to ever serve on the Supreme Court, Associate Justice (and Acting Chief Justice) Oliver Wendell Holmes, Jr, wrote in 1897 that: “For the rational study of the law the black letter man may be the man of the present, but the man of the future is the man of statistics and the master of economics.” He would be proven right time and time again.

In a case addressing the voting rights of black Alabamians under the Civil Rights Act of 1957, the Supreme Court was informed by statistical analysis and voted in favour of black citizens, saying: “In the problem of racial discrimination, statistics often tell much, and Courts listen.” In 1986, Bazemore v Friday established that regression analysis can be used in some cases to determine racial discrimination in the workplace. And in 2011, Matrixx v Siracusano set out that, roughly speaking, statistical significance is not a requirement for practical significance. Explaining the court’s opinion, Associate Justice Sonia Sotomayor stated that “courts frequently permit expert testimony on causation based on evidence other than statistical significance”. This understanding came to the court years before it dawned on parts of the scientific community.

However, while judges often get statistics, they and others do not always get them right. Jury trials, a hallmark of the American legal system, present difficulties of their own. We might expect judges or lawyers to be familiar with basic statistics, or with areas of statistics that are particularly relevant to their profession, such as forensic statistics. But forensic statistics can present challenges even to those well trained in statistics, says Paul Weiss, a faculty researcher at Emory University, who testifies as an expert witness. So, he asks: “How would you expect 12 people from any walk of life to get it?”

Education can help, but requiring that judges and jurors be statistically literate is no panacea for the misuse of statistics in the law. If only the statistically literate can be jurors, there would be no way to guarantee a defendant their 6th Amendment right to a jury of their peers. So that idea is a non‐starter, barring a wonderful, watershed change to universalise statistics education.

Constitutional issue aside, it should be noted that people with statistical education can and do use statistics incorrectly, as evidenced by the replication crisis looming over the sciences. The Royal Statistical Society’s publication, Fundamentals of Probability and Statistical Evidence in Criminal Proceedings, puts an even finer point on the issue:

twisted logic can seem enormously seductive and [is] frequently perpetrated by professionals who ought to know better, especially in pressured situations such as giving evidence in criminal trials

Errors and fallacies

Statistical errors often arise in court because a statistic is proffered as an answer to a question it cannot address. Consider how juries might interpret the following phrases: “evidence is consistent with”, “could have come from”, or “does not rule out”. When an expert testifies that “the evidence is consistent with the defendant having murdered the victim”, they mean the data does not rule out the defendant having committed the murder. Jurors and judges may sometimes take that to mean the defendant likely murdered the victim.

The well‐known prosecutor’s fallacy is another example of using a statistic to answer a question it cannot. Many famous cases have been decided by experts confusing the probability of damning evidence given a defendant’s innocence with the probability of innocence given damning evidence. Expert witnesses may testify to the first – “If the suspect is innocent, it would be unlikely to find gun residue on their hands” – but do so meaning the second – “Therefore, the probability this person is innocent given the evidence is very low”.
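The fallacy becomes concrete with Bayes’ theorem. The numbers below are hypothetical, chosen only to show how a tiny P(evidence | innocent) is entirely compatible with a large P(innocent | evidence) once we account for how many people could have been accused:

```python
# Hypothetical figures for illustration only.
# Suppose a forensic match occurs for 1 in 10,000 innocent people,
# and always occurs for the guilty person.
p_match_given_innocent = 1 / 10_000
p_match_given_guilty = 1.0

# In a city of 1,000,001 people, exactly one is guilty, so the prior
# probability of guilt for a randomly accused resident is tiny.
population = 1_000_001
p_guilty = 1 / population
p_innocent = 1 - p_guilty

# Bayes' theorem: P(innocent | match)
p_match = (p_match_given_guilty * p_guilty
           + p_match_given_innocent * p_innocent)
p_innocent_given_match = p_match_given_innocent * p_innocent / p_match

print(f"P(match | innocent) = {p_match_given_innocent:.4%}")
print(f"P(innocent | match) = {p_innocent_given_match:.1%}")  # about 99%
```

Here the expert’s statement – “a match this good would occur for only 0.01% of innocent people” – is true, yet the matched defendant is still overwhelmingly likely to be innocent, because roughly 100 innocent residents would also match.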

Dr Maria Cuellar, assistant professor of criminology at the University of Pennsylvania, has studied and testified on shaken baby syndrome, and she described the problem with testifying in her field like this:

We’re using statistical arguments within medicine to talk about legal questions. That makes it very confusing because the standards required are different in the different fields, and there’s very little communication between them.

That the standards are different can again be framed as statistics being used to answer the wrong question. Consider an example Cuellar suggested. A worker sues a chemical company after 10 years of employment because they contracted lung cancer just before retiring. A statistician determining the carcinogenic effects of the chemical produced by the company and inhaled by the worker is searching for the probability that exposure to the chemical causes cancer, and may testify on that basis if called as an expert witness once the case goes to trial. However, in deciding the case, the judge is interested in a subtly different question: whether the worker’s exposure to the chemical in this factory caused this particular cancer. The statistician’s question is one of causal effect, while the judge’s is one of causal attribution.
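One standard, though contested, bridge between the two questions is the epidemiological “probability of causation”, which converts a population-level relative risk into an upper bound on individual attribution. The sketch below is illustrative only and is not drawn from Cuellar’s testimony; it assumes the textbook formula PC = (RR − 1)/RR:

```python
def probability_of_causation(relative_risk: float) -> float:
    """Upper bound on the probability that an exposed individual's
    disease was caused by the exposure, under standard (and contested)
    epidemiological assumptions: PC = (RR - 1) / RR."""
    if relative_risk <= 1:
        return 0.0  # no excess risk, nothing to attribute
    return (relative_risk - 1) / relative_risk

# Illustrative numbers only: if exposed workers develop the cancer at
# three times the background rate (RR = 3), then at most 2/3 of exposed
# cases are attributable to the exposure.  Note the legal "more likely
# than not" threshold (PC > 0.5) is crossed exactly when RR > 2.
print(probability_of_causation(3.0))
```

This is why toxic-tort litigation often fixates on whether the relative risk exceeds 2: it is the point at which a population-level effect estimate can, under these assumptions, support an individual attribution by a preponderance of the evidence.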

There are, of course, statistical methods for this kind of causal reasoning. Bayesian methods, including causal belief networks, causal inference developed specifically for reasoning about legal questions, and randomised controlled trials are all active fields of study. If the problem were purely the unavailability of models, it would already be solved; it must lie elsewhere.

Conflicting notions

One reason why it can be so easy to demand the wrong legal answers of the wrong statistics is because the fields, as practised, represent competing views on how to reason about the truth. Dr Jim Greiner, a professor of statistics and law at Harvard, with experience as a trial attorney, explained:

There’s an epistemology within the United States legal profession that says what’s true is what I see day to day and what higher courts tell me is true. Statistical reasoning, on the other hand, largely discounts individual cases and assigns less value to hierarchy. Each philosophy was developed with a goal in mind and each has benefits, but their differences can cause problems in integration.

The clearest example of this is their attitudes towards precedent. In law, the longer a rule has been adhered to, the more correct it is considered to be, which is why the US Constitution is viewed as the ultimate tool for reasoning about the law of the land. On the other hand, science – and statistics as a part of it – is an iterative process that improves with time. The newest methods are typically those that allow us to reason most effectively. These two different notions conflict in practice.

In the 1986 case of Thornburg v Gingles, the Supreme Court accepted the results of two statistical models in determining “racially polarised voting” as a form of vote dilution. Because of the court’s influence, those two models have remained the standard for detecting racially polarised voting in the 30 years since, despite the existence of far better models. Courts’ reliance on these methods is so all‐consuming that a research project headed by Greiner was unable to discover any instance of a court finding racial bloc voting when the plaintiff did not present evidence using one of the two suggested models.

Improved statistical methods strive to make data analysis easier and inference more thorough, but they are unlikely to singlehandedly bridge the divide between law and statistical science. The more advanced statistical methods become, the more they may represent an unfamiliar way of reasoning, and the more difficult they may be for judges, lawyers and jurors to understand.

Perhaps the solution is to recognise that competing methods for reasoning can be made complementary and that a balance can be found between satisfying statistical and legal concerns. Where this balance is achieved, statisticians could help experts and attorneys to give explanations better suited to the aims of the law.

On the part of the statistician, the courts require methods that are interpretable and interrogable by lay people, as well as honesty about the limitations of those methods. On the legal side, the courts need lawyers who are willing to listen to and learn from their experts.

This is also where statistical education could have a greater effect on legal outcomes. Lawyers better trained in statistics and statistical reasoning will be better prepared to take advantage of the work done by their mathematical allies. But if attorneys shop around for “statistical experts” until they find someone willing to tweak sample sizes and sampling procedures to get the desired result, no reconciliation is possible.

Expert oversight

What if there were to be an Inspector General‐like body tasked with overseeing statistical arguments in legal settings? Such an authority would review the use of survey sampling, hypothesis testing, and model selection in courts, recommending, if necessary, that judges dismiss evidence or litigants appeal. Statistical oversight of this sort could help judges in their role as scientific gatekeepers – especially now that DNA tests and forensic science are commonly presented as evidence.

Almost all the experts Thieme interviewed for this series of articles emphasised the difficulty of spreading scientific knowledge through the legal system. Greiner, the Harvard professor of statistics and law introduced above, explained:

The diffusion of information is a problem about governing in a world of intensely specialized knowledge – but this is a role for which an oversight body is particularly well suited. Consider an analogy to medicine. Not all doctors are intimately familiar with the chemical structure of the medicines they prescribe because, frankly, chemistry is not their primary concern. For that reason, doctors rely on the chemists at the Food and Drug Administration to classify drugs and drug interactions as safe or not. A hypothetical statistical legal agency could do the same.

However, courts may be reluctant to cede this considerable power to an outside body. As Supreme Court Associate Justice Stephen Breyer put it in a wonderful piece, “Science in the Courtroom”:

Any effort to bring better science into the courtroom must respect the jury’s constitutionally specified role – even if doing so means that, from a scientific perspective, an incorrect result is sometimes produced.

It is unsettling, but Breyer’s argument – in essence – says that juries must be allowed to make mistakes in reasoning about statistics as they do in reasoning about medicine, physics, and any other science presented to them. Justice is not only for the experts to decide.

Political considerations

The freedom of juries need not come at the expense of better science and statistics in court. In many countries, including the UK, independent regulators set standards for forensic science, and those standards demand that methods and results be scientifically and statistically valid. These regulators also establish the boundaries of what expert witnesses can and cannot say about a piece of evidence.

In the US there is no national oversight of forensic labs and techniques. Instead, the Center for Statistics and Applications in Forensic Science (CSAFE) aims to partly fill that gap by developing and refining forensic techniques and providing educational opportunities for practitioners. But CSAFE’s director, Dr Alicia Carriquiry – distinguished professor of statistics at Iowa State University – worries about the inconsistency with which US administrations commit to addressing the misuse of statistics and science in court. She says that:

In the Obama administration, there was a great deal of cooperation between the Department of Justice and the research community, and a whole lot of support for basic sciences geared toward improving the scientific and statistical foundations of forensic tools. But the Trump administration came along and many of these efforts came to a screeching halt.
