The following article was originally published by Analytics at Wharton.

The American Civil Liberties Union of Kansas is crediting research from experts at Wharton and Princeton with helping to win its federal court case against the Kansas Highway Patrol (KHP) over unconstitutional treatment of out-of-state drivers.

Wharton operations, information, and decisions science professor Dean Knox and Princeton politics and public affairs professor Jonathan Mummolo led a team of researchers in a comprehensive data analysis that informed a court ruling finding that KHP troopers were illegally targeting drivers from Colorado and Missouri along Interstate 70 to search their cars for drugs. Marijuana is legal in those neighboring states, but not in Kansas.

In a maneuver known as the “Kansas two-step,” a trooper would finish a routine traffic stop, walk back toward their patrol car, then turn around and re-engage the driver in conversation, probing for a reason to search for drugs. A federal judge ruled in July that the practice violated the Fourth Amendment protection against unreasonable search and seizure, and she ordered the KHP to stop considering the origin of the vehicle when detaining motorists along the I-70 corridor.

Sharon Brett, legal director of the ACLU of Kansas, said the data analysis performed by Research on Policing Reform and Accountability (RoPRA), an organization co-founded by Knox and Mummolo, was key in proving the case because it showed patterns of unconstitutional policing.

“It played a really pivotal role in convincing the court that what had happened to our clients was more than an isolated incident and that it would continue to happen to others if the court didn’t step in and order relief,” she said. “That’s what this type of analysis can do in a major civil rights case like this. It can take an individual incident and show it’s actually part of something much bigger, much more problematic, much more widespread.”

A Forensic Data Journey

Brett said ACLU attorneys realized they needed to “dig into the data” to determine the facts of the case, so they searched for experts who could correctly analyze the existing data and go after other sources of information to build a complete picture of what the agency was doing. Brett reached out to former colleagues at the U.S. Department of Justice, and they put her in touch with Knox and Mummolo, who are well-known for their work using science to quantify bias and discrimination in policing.

Police reform has become a more pressing issue in recent years, especially in the wake of the 2020 police killings of George Floyd and Breonna Taylor. But analysts have a history of using police records to assess discrimination against minority civilians. The problem, according to Knox and Mummolo, is that the data is inherently flawed. Records typically leave out police interactions that do not result in a citation or other enforcement action, which makes it difficult to find the correct denominator: the full set of encounters in which enforcement could have happened. And they rarely capture the factors that skew the data from the start, such as officers detaining minorities more frequently than whites for the same behavior.

“One of the core messages that we keep coming back to is that if there’s bias in the way that the data set itself is constructed, that can lead to extremely misleading conclusions,” Knox said.

Knox and Mummolo rounded out their team with Rachel Mariman, director of data computing and research support at Analytics at Wharton, and Jacob Kaplan, a data scientist at Princeton who earned his doctorate in criminology at Penn. The first step was compiling the data to quantify four specific elements: the total number of cars driving through the corridor; where they were from; whether they were speeding; and whether they were pulled over.

Yet no single dataset captures all four factors at once. To gather as much information as possible, the team could not rely solely on records provided by the KHP. They pulled records from the Kansas Department of Transportation, including granular information from traffic cameras and sensors, and incorporated commercial cell phone location data to estimate the proportion of drivers on various road segments at various times. Each of these datasets provided only a partial view of the broader picture, so the researchers used tried-and-true statistical methods to triangulate the information.

The result showed a clear pattern: From January 2018 to November 2020, troopers stopped 70% more out-of-state drivers than would be expected if they were stopping in-state and out-of-state drivers at the same rate. That difference represented about 50,000 traffic stops, according to court documents. Once stopped, out-of-state drivers were disproportionately subjected to canine searches. In the court order, the judge noted that the defense provided no evidence to explain the disparity.
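To make the benchmark concrete, here is a minimal sketch of the kind of comparison behind a figure such as "70% more than expected." It is not the researchers' actual method, which fused several data sources and accounted for factors such as speeding. The counts, variable names, and function names below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Illustrative only: every number here is made up, not taken from the case.

def expected_stops(total_stops, group_vehicles, total_vehicles):
    """Stops a group would receive if all drivers were stopped at the same rate.

    Under equal treatment, a group's share of stops should match its
    share of traffic on the road.
    """
    return total_stops * group_vehicles / total_vehicles


def excess_share(observed, expected):
    """Fraction by which observed stops exceed the equal-rate benchmark."""
    return observed / expected - 1.0


# Hypothetical traffic counts for one stretch of highway.
total_vehicles = 100_000          # all vehicles observed (assumed)
out_of_state_vehicles = 30_000    # vehicles with out-of-state plates (assumed)

# Hypothetical stop counts for the same stretch and period.
total_stops = 1_000
observed_out_of_state_stops = 510

benchmark = expected_stops(total_stops, out_of_state_vehicles, total_vehicles)
disparity = excess_share(observed_out_of_state_stops, benchmark)
excess_count = observed_out_of_state_stops - benchmark

print(f"Expected out-of-state stops at equal rates: {benchmark:.0f}")
print(f"Observed out-of-state stops: {observed_out_of_state_stops}")
print(f"Excess over benchmark: {disparity:.0%} ({excess_count:.0f} stops)")
```

In this toy example, out-of-state vehicles make up 30% of traffic but 51% of stops, so they are stopped 70% more often than the equal-rate benchmark would predict. The hard part in practice is the denominator: the traffic counts are exactly what no single dataset supplied, which is why the team had to combine several sources.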

“The way we tackled this was to basically fuse together data sets that each provided different pieces of this puzzle that we were trying to solve,” said Mummolo, who testified about the findings.

Data’s Role in Social Justice

Knox and Mummolo have long advocated against using a one-size-fits-all approach to data analysis. While it’s tempting, they say, doing so can produce inaccurate results because data is typically imperfect in ways that must be carefully considered during analysis. And accuracy is paramount, particularly when dealing with matters of equality and racial justice.

“People often think about data in terms of, ‘Let me just get my toolkit of statistical methods off the shelf, and whack at it with the standard analytical hammer as hard as I can,’” Knox said. “But our point is that we need to be a little more thoughtful about how we go about it, especially in the real world with these important social questions.”

Knox is a computational social scientist who specializes in the analysis of unstructured and incomplete data. Before joining the faculty at Wharton, he was a professor at Princeton, where he began working with Mummolo after the two found a shared interest in using science to address intractable social problems. Both emphasize that their research isn’t about punishing police officers; it’s about promoting a legal justice system that is fair and equitable. That’s why the court case was so meaningful for them.

“This was a very gratifying project for us because it’s a direct application of what we’ve been working on for years,” Mummolo said.

Knox added, “This is exactly what motivates the kind of work we do. We’re trying to solve practical challenges that confront policing reform. It’s the perfect example of issues we are trying to address.”

Brett said she believes the case shows how crucial it is to bring data and social scientists into the debate over police reform.

“Data is a language we can use to convince people of the import and impact of real-life decisions happening every day by government entities,” she said. “Researchers who can come in, look at that data, ask a question of it, and then provide an answer to the public, that increases transparency in government functions and leads to a more just society.”