Could Big Data Unlock the Potential of Predictive Policing?

The Crux
By Vijay Shankar Balakrishnan
Oct 2, 2018 (updated Nov 20, 2019)

(Credit: Peter Kim/shutterstock)

It’s hard to imagine a nation without an organized police force, but in truth, it’s a fairly modern invention. Crime was once handled locally, often by volunteers and at the will of the ruling power, and it was only in 1829 that the first large-scale, professional force came to be — London’s Metropolitan Police Service, created by Home Secretary Sir Robert Peel. These police, nicknamed “peelers” or “bobbies” after their creator, wore uniforms chosen to make them look more like citizens than soldiers, followed clear guiding principles and strove not only to fight crime but also to prevent it. The US followed suit less than two decades later, when the nation’s first metropolitan police department was created in New York City in 1844, based on the London model.

Law enforcement has evolved a lot since then, of course, and in recent decades information technology has emerged as a significant player in policing. The September 11 attacks in 2001 led to a radical modernization of American policing that included the advent of so-called Big Data — the analysis of large datasets to discover hidden patterns.

Knowable asked criminologist and statistician Greg Ridgeway of the University of Pennsylvania how computers — and Big Data in particular — are changing policing in the US. Ridgeway is the author of a 2017 article on the topic in the Annual Review of Criminology. This conversation has been edited for length and clarity.

Criminologist and statistician Greg Ridgeway of the University of Pennsylvania. (Credit: James Provost)

How did you become interested in Big Data and policing?

I’m a statistician. I’ve always been working at the intersection of statistics, data analysis and computer science: I spent a couple of years at Microsoft Research back in the 1990s. When I started doing more work on criminology, I brought all that with me. When I work with a police department, the first thing I ask is, “Tell me about your data: What do you have, and what are your problems that I could potentially answer with data?” So it’s just been a natural fit for me.

What sort of data were police departments handling before computers?

Since 1929, the United States government has been collecting unified, standardized national statistics on crimes reported to the police — homicides, sexual assaults, burglaries — for the Uniform Crime Report. So it’s only been about 90 years since police in the US agreed on a common definition of how crimes are counted. That reporting still exists, and with the rise of computers since the 1960s, police departments could start doing a lot more.

So what has happened since the 1960s?

During the civil rights movement in the US, there was a push to reform our criminal justice system. President Lyndon Johnson appointed a commission to review it. The commission predicted that technology would play a core role in the next stages of the evolution of our criminal justice system — and policing in particular. Back then, the idea of computing was just emerging. For example, the St. Louis Police Department ran an algorithm against a database of crime data to allocate patrol officers — a model known as LEMRAS (Law Enforcement Manpower Resource Allocation System). At the time, surveillance technology — for instance, wireless bugs that could record conversations from remote locations — was just becoming common.

“Police can rely on computing power to make better decisions about where to position their limited resources.”

So the commission envisioned the use of such technologies, though the vast majority of those uses were simply for accounting: keeping track of people the police encounter, of suspects, and logging evidence, cases and emergency service calls. As storage became cheaper, things could be digitized and stored as data in vast amounts. From the 1980s, police started using those databases to map crime hotspots, and since the 1990s crime mapping has become a big thing.

In the 21st century, what does Big Data mean for policing? Could you give an example of how police are accessing and using it now?

Let’s start with predictive policing. The fact that a computer can store information and also display it as a map made it possible for police to develop new strategies for dealing with crime. You could now put the last thousand crime incidents up on a map, not to investigate just one of them but to look for patterns across your whole community. Now, if you’re a police captain, you’ve got to think about how you can use statistical models to proactively predict which neighborhoods are at greater risk of something happening. Police now want to anticipate where problems are going to arise and be there either to prevent the crime entirely or to be well positioned to respond to it, should it happen. For example, a group of researchers working with the Los Angeles Police Department found that the experimental algorithm ETAS (Epidemic-Type Aftershock Sequence) was much better than human police at determining where a crime is likely to happen so that a force can be deployed there. Crime was lower on days when ETAS was in charge of such predictive policing. So now police can rely on computing power to make better decisions about where to position their limited resources.
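To make that concrete, here is a minimal sketch of how an ETAS-style self-exciting model scores locations: each past incident temporarily raises the predicted risk nearby, and that boost fades with time and distance. The kernel shapes, parameter values and example coordinates below are illustrative assumptions, not the settings used in the LAPD experiment.

```python
import numpy as np

def etas_intensity(grid_xy, events, t_now,
                   mu=0.1,        # background rate per cell (assumed)
                   k=0.5,         # triggering strength (assumed)
                   omega=0.2,     # temporal decay per day (assumed)
                   sigma=300.0):  # spatial spread in meters (assumed)
    """Score each grid cell: background risk plus contributions from past events.

    grid_xy : (n_cells, 2) array of cell-center coordinates in meters
    events  : (n_events, 3) array of past incidents as (x, y, t_days)
    t_now   : current time in days
    """
    scores = np.full(len(grid_xy), mu, dtype=float)
    for x, y, t in events:
        dt = t_now - t
        if dt <= 0:
            continue
        # Each past incident raises nearby, recent risk; the effect decays
        # exponentially in time and as a Gaussian in distance.
        d2 = (grid_xy[:, 0] - x) ** 2 + (grid_xy[:, 1] - y) ** 2
        scores += k * omega * np.exp(-omega * dt) * np.exp(-d2 / (2 * sigma ** 2))
    return scores

# Example: rank cells of a small grid and flag the highest-risk ones for patrol.
xs, ys = np.meshgrid(np.arange(0, 2000, 250), np.arange(0, 2000, 250))
grid = np.column_stack([xs.ravel(), ys.ravel()])
past = np.array([[500, 500, 9.0], [520, 480, 9.5], [1500, 1200, 3.0]])
risk = etas_intensity(grid, past, t_now=10.0)
hot_cells = grid[np.argsort(risk)[-3:]]  # three highest-risk cells today
print(hot_cells)
```

In practice a department would re-rank its grid each shift and steer patrols toward the few cells with the highest scores.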

Which forms of Big Data dominate policing now, and is that changing the nature of law enforcement?

Back in the 1960s, police were able to count how many robberies happened in a day. Today, the same sort of case file can hold detailed information such as the address, CCTV footage and text reports that police have written about the incident. Multiply that over thousands of robberies and you have a massive store of video and text: a database that describes the whole collection of incidents. Moreover, there are advanced telematics systems on police cars that report on the vehicle as well as on the subjects the officers inside are tracking. Another example is from Bogotá, Colombia. The police put [GPS] trackers on individual officers so that they could collect data on them every six seconds. That data availability alone made it possible for them to realize later that their allocation of officers wasn’t lined up with where the crime problems were. In this case, it’s not even advanced statistics. It’s just using Big Data to make sure that there’s an alignment between where the crime problems are and where the resources are being spent.
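That alignment check is simple enough to sketch. The beat names and counts below are hypothetical; the idea is just to compare each beat’s share of reported crime with its share of patrol presence, estimated from GPS pings.

```python
import numpy as np

# Hypothetical data: incidents per beat, and 6-second GPS pings per beat.
beats  = ["A", "B", "C", "D"]
crimes = np.array([120, 40, 15, 25])
pings  = np.array([3000, 9000, 5000, 3000])

crime_share  = crimes / crimes.sum()   # where the crime problems are
patrol_share = pings / pings.sum()     # where officer time is actually spent

for beat, c, p in zip(beats, crime_share, patrol_share):
    gap = p - c
    flag = "over-patrolled" if gap > 0.05 else "under-patrolled" if gap < -0.05 else "aligned"
    print(f"Beat {beat}: {c:.0%} of crime, {p:.0%} of patrol time -> {flag}")
```

Beats where the two shares diverge sharply are candidates for reallocating officers.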

You also note that Big Data helps to measure police performance. How so?

A major issue, at least in the US, is the use of force by police, and there are extensive data being collected about it. Police now have body camera video documentation and detailed descriptions of these force incidents. So you can count them.

“We need to decide what kind of information the police should have and when they’re allowed to have and use it.”

By comparing the performance of an individual officer with that of peers who worked the same places at the same times, we can identify outlying officers — the ones who have inexplicably high rates of use of force, large numbers of stops of minorities, or high rates of injuries to either suspects or themselves. This is helpful to the police department in understanding where its potential problems and risks are, and in acting on them. That is really important.
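As a rough illustration of that kind of screening, the sketch below compares each officer’s use-of-force rate with the pooled rate of their peers using a crude binomial z-score. The officer IDs and counts are invented, and a real analysis of the sort described above would first restrict the comparison to peers working the same places and shifts.

```python
import numpy as np

# Hypothetical records: officer_id -> (stops, use_of_force_incidents)
officers = {
    "A1": (400, 6),
    "B2": (350, 5),
    "C3": (380, 21),   # noticeably higher rate than peers
    "D4": (420, 7),
}

total_stops = sum(s for s, _ in officers.values())
total_force = sum(f for _, f in officers.values())

for oid, (stops, force) in officers.items():
    # Baseline rate from everyone else, excluding the officer being screened.
    peer_stops = total_stops - stops
    peer_force = total_force - force
    p_peer = peer_force / peer_stops
    expected = stops * p_peer
    sd = np.sqrt(stops * p_peer * (1 - p_peer))
    z = (force - expected) / sd            # how far above the peer rate?
    if z > 3:
        print(f"Officer {oid}: {force}/{stops} force incidents (z = {z:.1f}) — flag for review")
```

A flag like this is only a prompt for closer review of an officer’s records, not a finding in itself.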

Some critics are concerned that surveillance technology and Big Data could exacerbate the problem of racial profiling. How do you see this?

Collecting and publishing data on police interactions with the public has created much greater transparency of police practice. We now know how often certain kinds of activities happen — such as traffic stops, searches, recovery of contraband, citations, use of force, shootings. And we know which members of the community are most affected — which racial groups, neighborhoods and age groups. Police know that data about their behaviors are documented, tracked and published for the community to see and evaluate. Numerous researchers, including me, use these data to develop and deploy methods to monitor for evidence of racial profiling. It won’t solve all problems, but it is a step toward reducing the risk of racial profiling in policing.

What about privacy issues? Are you concerned that Big Data combined with increasingly advanced surveillance technologies could give rise to a police state?

I have no specific opinion on what’s right or wrong here. I only know that as a society we haven’t quite decided what we’re comfortable with yet. In the past that wasn’t a problem because the police weren’t technically capable of doing much surveillance. The problem is only going to get worse if we don’t make a decision on what we do and don’t want. Police capabilities are just going to increase exponentially. The cost of collecting, matching and storing data is plummeting, and it will become easier and cheaper for the police to do so.

“Collecting and publishing data on police interactions with the public has created much greater transparency.”

So I think we need to decide what kind of information the police should have and when they’re allowed to have and use it. Police in New York City were recently called out because they were collecting information on everyone they stopped. What if they stopped the wrong person? Say you did nothing wrong, the NYPD stopped you by mistake, but nevertheless your name and ID stayed in NYPD records. I’m not sure if we’re OK with that.

What are the challenges and caveats in putting Big Data to use?

I think there’s still a big gap between the data collected and its true use. For example, police video takes up a lot of storage space but I don’t think we know what to do with it. Right now, if there is a complaint about the video or there’s a lawsuit associated with the incident, then someone will pull up the video and watch it. Is there some other way to analyze video that doesn’t involve watching all of it to figure out whether there is something that we need to know about policing? UCLA mathematician George Mohler’s group proposed PredPol, an algorithm to forecast where problems might occur. He was predicting particular places that are likely to be problematic. In Chicago, the police department has tried to predict people who might be problematic. Should we be predicting people or places, and what particular algorithms should be used?

So it sounds like you’re saying that gathering masses of data and analyzing it is one thing, but the much tougher problem is figuring out how to use it to respond in practical ways.

That’s right. For example, let’s say that a police agency collects the right data and builds a model that forecasts that a particular location is at high risk for a violent crime. What should the police do? Do they spend the resources and park an officer there indefinitely? What if they forecast that a particular business is a likely target for a robbery? What should they do? Simply tell the owner? Set up a camera? Park an officer? The issue is that even with good data and analysis, it is not always obvious how best to translate the findings into some public safety practice.

So what’s your verdict? Has the advent of Big Data improved policing in the US? And where do you see things going?

It’s changed how police departments deploy resources — for example, how many officers to put where, and when. It has provided new data sources to use in investigations: cameras, social media, mobile data records, phone-location data. So policing is different, but Big Data has not changed the rate at which crimes are solved. The crime clearance rate — that is, the percentage of crimes that the police solve — has been the same in the US since the 1960s. Only 45 percent of crimes are solved. Even though we have cameras, DNA and all the other things like social media and mobile applications — they help solve some crimes, but the overall rate of solving crimes remains essentially unchanged. And I can’t say that, because we have better data collection and predictive models, the crime rate is coming down nationally.

I think that’s sort of a disappointing aspect of the thinking that produced the 1967 President’s report. Those authors anticipated a radical change in policing. They thought that these new technologies were going to solve a lot of problems, that new cases were going to be resolved and our country would be much safer. But in fact, crime rates skyrocketed and then came back down — our current crime rate is actually back where it was in 1967. So that’s a little disappointing after all this.

We have to be able to do more, and we’re just not there. Maybe we’re not digitized enough to see the impact yet. We have a lot of cameras, but maybe we don’t have enough. We use DNA, but it’s still expensive, so perhaps we don’t use it enough. We’ve digitized a lot of records, but perhaps it’s still not enough — because there’s been no change in that rate of solving crimes.

10.1146/knowable-92818-4

Vijay Shankar Balakrishnan is a journalist based in Marburg, Germany. He writes about health, the environment and any other science topic that catches his eye. Follow him on Twitter @VijaySciWri.

This article originally appeared in Knowable Magazine, an independent journalistic endeavor from Annual Reviews.
