We work hard to improve measurement and quantitative research in policing, especially on the difficult problem of quantifying racial bias. The field of policing research is in need of rigorous statistical analysis: traditional research methods struggle to disentangle correlation from causation and often ignore well-known issues like selection bias. Sparse data has led to scattershot analytic approaches across multiple disciplines, making it difficult to reconcile seemingly conflicting results. We work to systematize policing research using modern statistical frameworks and techniques so the field can advance.
Journal of Political Institutions and Political Economy
A series of controversial police-involved killings and nationwide protests have recently reinvigorated the study of racial bias in policing. But a fractured interdisciplinary literature presents contradictory claims, and scholars have struggled to reconcile a dizzying array of seemingly incompatible analytic approaches that often rely on implausible and/or unstated assumptions. This confusion arose in part because data constraints have prompted researchers to examine only isolated aspects of the police–civilian encounters they seek to understand — focusing only on traffic stops in one study, or fatal shootings in another — while neglecting the complex, multi-stage nature of these interactions. The result is a conflicting and at times misleading body of evidence. To move toward a scientific consensus, scholars should converge on a common empirical framework that unites these disparate approaches under a shared conceptual umbrella, acknowledges the causal nature of the study of racial bias, accounts for the fundamental limitations of policing data, and yields substantively interpretable results that are useful to policymakers. We present such a framework and demonstrate its capacity to adjudicate conflicting claims, accumulate knowledge, and characterize the severity of one of the most pressing problems of institutional performance of our time.
A recent PNAS study, Johnson et al., investigates the role of race in fatal police shootings. Unlike previous studies, which focused on victim race alone, the paper features original data about the race of officers who use deadly force and offers a rare accounting of other shooting attributes that contextualize fatal encounters. Johnson et al. discuss possible “discrimination by White officers,” but conclude racial diversity in police agencies brings limited benefits, a claim cited by major news outlets and in US Congressional testimony, inflaming an already contentious policy debate. Despite the value of this much-needed research, its approach is mathematically incapable of supporting its central claims. In this letter, we clarify the gap between what Johnson et al.’s study asserts and what it actually estimates, as well as the implications of that difference for policymaking and future scholarship on race and policing.
Researchers often lack the necessary data to credibly estimate racial discrimination in policing. In particular, police administrative records lack information on civilians police observe but do not investigate. In this article, we show that if police racially discriminate when choosing whom to investigate, analyses using administrative records to estimate racial discrimination in police behavior are statistically biased, and many quantities of interest are unidentified—even among investigated individuals—absent strong and untestable assumptions. Using principal stratification in a causal mediation framework, we derive the exact form of the statistical bias that results from traditional estimation. We develop a bias-correction procedure and nonparametric sharp bounds for race effects, replicate published findings, and show the traditional estimator can severely underestimate levels of racially biased policing or mask discrimination entirely. We conclude by outlining a general and feasible design for future studies that is robust to this inferential snare.
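The selection problem described above can be illustrated with a toy calculation. The sketch below is not the article's estimator or data: the two encounter types, the stop rates, and the force rates are all made-up numbers chosen only to show how discrimination in who gets stopped can distort a comparison restricted to stopped civilians.

```python
from dataclasses import dataclass

# Hypothetical numbers for illustration only; nothing here comes from the article.
@dataclass(frozen=True)
class Stratum:
    name: str
    encounters: int    # civilians observed by police, per racial group
    stop_rate: dict    # P(stopped | observed), by group
    force_rate: dict   # P(force | stopped), by group

STRATA = [
    # Serious situations: everyone is stopped; force is 5 points higher for Black civilians.
    Stratum("serious", 100, {"white": 1.0, "black": 1.0}, {"white": 0.30, "black": 0.35}),
    # Minor situations: only Black civilians are stopped (discriminatory stops);
    # force is again 5 points higher for Black civilians.
    Stratum("minor", 100, {"white": 0.0, "black": 1.0}, {"white": 0.05, "black": 0.10}),
]

def force_rate_among_stopped(strata, group):
    """Force rate computed the way administrative records allow: stopped civilians only."""
    stops = sum(s.encounters * s.stop_rate[group] for s in strata)
    force = sum(s.encounters * s.stop_rate[group] * s.force_rate[group] for s in strata)
    return force / stops

def naive_gap(strata):
    """Black-minus-white difference in force rates among the stopped."""
    return force_rate_among_stopped(strata, "black") - force_rate_among_stopped(strata, "white")

def within_stratum_gaps(strata):
    """Black-minus-white force gap holding the type of encounter fixed."""
    return {s.name: s.force_rate["black"] - s.force_rate["white"] for s in strata}

NAIVE_GAP = naive_gap(STRATA)
TRUE_GAPS = within_stratum_gaps(STRATA)
```

In this toy setup, force is used more often against Black civilians in every type of encounter, yet the force rate among stopped Black civilians (0.225) is lower than among stopped white civilians (0.30). The stopped Black group includes many minor encounters that would never have produced a stop for a white civilian, so the naive comparison masks the discrimination entirely.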
Studies of racial bias in policing often rely on data contaminated by selection issues, e.g. using records of stops or arrests (which may themselves be a product of racial bias) to estimate discrimination in subsequent actions like use of force. This feature raises the threat of post-treatment-selection bias, which recent work shows can lead to severe underestimates of discrimination. However, prominent studies continue to ignore this issue, employing standard regression techniques with contaminated data. In this paper, we formally analyze the key identifying assumption undergirding these studies, “subset ignorability,” and show it corresponds to the measure-zero set of knife-edge conditions in which differing biases happen to sum to zero. Because there is no substantive reason to believe such accidental cancellation would occur, we conclude this approach is not reliable in applied research, and we emphasize the need for continued caution and increased rigor in high-stakes analyses of discriminatory policing with contaminated data.
During the last decade, while national homicide rates have remained flat, New York City has experienced a second great crime decline, with gun violence declining by more than 50 percent since 2011. In this paper, we investigate one potential explanation for this dramatic and unexpected improvement in public safety—the New York Police Department’s shift to a more surgical form of “precision policing,” in which law enforcement focuses resources on a small number of individuals who are thought to be the primary drivers of violence. We study New York City’s campaign of “gang takedowns” in which suspected members of criminal gangs were arrested in highly coordinated raids and prosecuted on conspiracy charges. We show that gun violence in and around public housing communities fell by approximately one third in the first year after a gang takedown. Our estimates imply that gang takedowns explain nearly one quarter of the decline in gun violence in New York City’s public housing communities over the last eight years.
The notion that the unjustified use of force by police officers is concentrated among a few “bad apples” is a popular descriptor that has gained traction in scholarly research and achieved considerable influence among policymakers. But is removing the bad apples likely to have an appreciable effect on police misconduct? Leveraging a simple policy simulation and data from the Chicago Police Department, we estimate that removing the top 10% of officers identified based on ex ante risk and replacing them with officers drawn from the middle of the risk distribution would have led to only a 4–6% reduction in use-of-force incidents in Chicago over a 10-year period. Our analysis suggests that surgically removing predictably problematic police officers is unlikely to have a large impact on citizen complaints. By assembling some of the first empirical evidence on the likely magnitude of incapacitation effects, we provide critical support for the idea that early warning systems must be designed, above all, to deter problematic behavior and promote accountability.
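The logic of such a replacement simulation can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function name, the 40th-to-60th percentile definition of "the middle of the risk distribution," and the 20-officer dataset are all assumptions made for the example, and the resulting number is not meant to reproduce the 4–6% estimate.

```python
def simulated_reduction(predicted_risk, realized_incidents, top_pct=10, mid_band_pct=(40, 60)):
    """Share of incidents avoided if the highest-risk officers are replaced.

    Officers in the top `top_pct` percent of predicted (ex ante) risk are removed
    and each is replaced by an officer with the average realized incident count
    of the middle band of the predicted-risk distribution.
    """
    n = len(predicted_risk)
    # Officer indices ordered from lowest to highest predicted risk.
    order = sorted(range(n), key=lambda i: predicted_risk[i])

    n_removed = max(1, n * top_pct // 100)
    removed = order[n - n_removed:]
    middle = order[n * mid_band_pct[0] // 100 : n * mid_band_pct[1] // 100]

    replacement_rate = sum(realized_incidents[i] for i in middle) / len(middle)
    baseline = sum(realized_incidents)
    counterfactual = (
        baseline
        - sum(realized_incidents[i] for i in removed)
        + replacement_rate * n_removed
    )
    return (baseline - counterfactual) / baseline


# Hypothetical data: 20 officers, predicted risk scores and realized incident counts.
PREDICTED = list(range(1, 21))
REALIZED = [0, 1, 0, 2, 1, 1, 2, 1, 2, 2, 2, 2, 3, 2, 3, 4, 3, 5, 6, 8]

REDUCTION = simulated_reduction(PREDICTED, REALIZED)
```

With these made-up counts, the two highest-risk officers account for 14 of 50 incidents, and replacing them with mid-risk officers averaging 2 incidents each removes 10 incidents, a 20% reduction. The size of the effect depends on how concentrated incidents are among predictably high-risk officers, which is the empirical question the study addresses.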
Diversification is a widely proposed policing reform, but its impact is difficult to assess. We used records of millions of daily patrol assignments, determined through fixed rules and preassigned rotations that mitigate self-selection, to compare the average behavior of officers of different demographic profiles working in comparable conditions. Relative to white officers, Black and Hispanic officers make far fewer stops and arrests, and they use force less often, especially against Black civilians. These effects are largest in majority-Black areas of Chicago and stem from reduced focus on enforcing low-level offenses, with greatest impact on Black civilians. Female officers also use less force than males, a result that holds within all racial groups. These results suggest that diversity reforms can improve police treatment of minority communities.
Politicians and law enforcement officials have advocated the militarization of local law enforcement on the grounds that it promotes public and officer safety, and some early research seemingly supported those claims. Two new studies reveal limitations in the data used in this prior work. When these issues are addressed, evidence for the benefits of militarization largely vanishes.
This study provides evidence of racial and gender disparities among police officers by examining a key metric of internal recognition: departmental award nominations. Using a novel dataset on Chicago police officers, we find that black (female) officers are significantly less likely to be nominated compared to their white (male) colleagues, even after controlling for cohort, age, experience, and key policing activity metrics such as arrests, uses of force, and complaints. Further, the discrepancy is likely not a result of statistical discrimination on the part of nominators, as the minority nominations gap grows among higher award percentiles.
The results reported in a large amount of the criminology literature reveal that hiring police officers leads to reductions in crime and that investments in police are an efficient means of crime control compared with investments in prisons. One concern, however, is that because police officers make arrests in the course of their duties, police hiring, albeit efficient, is an inevitable driver of “mass incarceration.” In this article, we consider the dynamics through which police hiring affects downstream incarceration rates. Using state-level panel data as well as county-level data from California, we uncover novel evidence in favor of a potentially unexpected and yet entirely intuitive result: that investments in law enforcement are unlikely to increase state prison populations markedly and may even lead to a modest decrease in the number of state prisoners. As such, investments in police may, in fact, yield a “double dividend” to society by reducing incarceration rates as well as crime rates.
High-profile incidents of police misconduct have led to widespread calls for law enforcement reform. But prior studies cast doubt on whether police commanders can control officers, and offer few policy remedies because of their focus on potentially immutable officer traits like personality. I advance an alternative, institutional perspective and demonstrate that police officers—sometimes characterized as autonomous—are highly responsive to managerial directives. Using millions of records of police-citizen interactions alongside officer interviews, I evaluate the impact of a change to the protocol for stopping criminal suspects on police performance. An interrupted time series analysis shows the directive produced an immediate increase in the rate of stops producing evidence of the suspected crime. Interviewed officers said the order signaled increased managerial scrutiny, leading them to adopt more conservative tactics. Procedural changes can quickly and dramatically alter officer behavior, suggesting a reform strategy sometimes forestalled by psychological and personality-driven accounts of police reform.
Policies to make police forces more representative of communities have centered on race. But race may crudely proxy views and lived experiences, undermining classic theories of representative bureaucracy. To conduct a multi-dimensional analysis, we merge personnel records, voter files and census data to examine roughly 220,000 officers from 97 of the 100 largest local U.S. agencies—over one third of local law enforcement agents nationwide. We show officers skew more White, Republican, politically active, male, and high-income than their jurisdictions; they also surround themselves with similarly unrepresentative neighbors. In a quasi-experimental analysis in Chicago, we find Democratic and minority officers initiate fewer stops, arrests, and uses of force than Republican and White counterparts facing common circumstances. The Black-White behavioral gap is often far larger than the Democratic-Republican gap, a pattern not observed among Hispanic officers. Our results complicate conventional understandings of descriptive representation, highlighting the importance of multi-dimensional perspectives of diversity.
Does policing the police increase crime? Previous studies estimating the effect of police oversight on crime rely on major policing scandals as shocks to examine the effect of oversight on crime. We argue that the simultaneous effect of public outrage on officer behavior and crime contaminates these results, and provide a conceptual framework which distinguishes between oversight and outrage. We identify two events, relating to unexpected court rulings in Chicago, that increased oversight and caused a decline in reported misconduct but had virtually no public reaction. Despite the decline in reported misconduct, crime and officer activity were unaffected. We contrast this with a major policing scandal, after which we find both a rise in crime rates without an equivalent increase in arrests and a decline in officer stops and use of force. Our results suggest that police oversight can reduce misconduct without increasing crime.
Political elites increasingly express interest in evidence-based policymaking, but transparent research collaborations necessary to generate relevant evidence pose political risks, including the discovery of sub-par performance and misconduct. If aversion to collaboration is non-random, collaborations may produce evidence that fails to generalize. We assess selection into research collaborations in the critical policy arena of policing by sending requests to discuss research partnerships to roughly 3,000 law enforcement agencies in 48 states. A host of agency and jurisdiction attributes fail to predict affirmative responses to generic requests, alleviating concerns over generalizability. However, across two experiments, mentions of agency performance in our correspondence depressed affirmative responses by roughly eight percentage points, even among top-performing agencies. Many agencies that initially indicate interest in transparent, evidence-based policymaking are likely engaging in cheap talk, and recoil once performance evaluations are made salient. These dynamics can inhibit valuable policy experimentation in many communities.
There is substantial evidence showing racial bias in firms’ hiring decisions, but less is known about bias in career progression. We construct a dataset of police award nominations to estimate the Black-white nomination gap. Exploiting institutional features of the police department to obtain plausibly causal estimates, we find that white supervisors are less likely to nominate Black officers relative to other officers conditional on their work performance. We also find suggestive evidence that supervisors are less likely to learn about and advocate for Black officers. These results, corroborated by an online experiment, suggest that the Black-white nomination gap may stem in part from a racial engagement and advocacy gap. Given the reliance on subjective evaluations for promotions in many organizations, our findings may have important implications for the Black-white promotion gap and the lack of diversity in upper-management positions.
We develop an empirical model of the mechanism used to assign police officers to Chicago districts and examine the efficiency and equity of alternative allocations. We document that the current bidding process, which grants priority based on seniority, results in the assignment of more experienced officers to less violent and high-income neighborhoods. Our empirical model combines estimates of heterogeneous officer preferences underlying the bidding process with causal estimates of the effects of officer experience on neighborhood crime. Equalizing officer seniority across districts would reduce the violent crime rate by 4.6 percent and significantly decrease inequality in crime, discretionary arrests, and officer use of force across neighborhoods. Moreover, this assignment can be achieved in a revenue-neutral way while resulting in small welfare gains for police officers, implying that it is more equitable and efficient.
We study the link between officer injuries-on-duty and their peers’ force-use using a network of officers who attended the police academy together through a random lottery. Peer injuries-on-duty increase the probability of using force by 7% in the subsequent week. Officers are also more likely to injure suspects and receive complaints about neglecting victims and violating suspects’ constitutional rights. The effect is concentrated in a narrow time window following the event and is not associated with significantly lower injury risk to the officer. Together, these findings suggest that officers’ emotional responses drive the increase in use of force and misconduct.
Elected leaders rely on “fire alarms” to promote good government: in the presence of bureaucratic malfeasance, citizens can cry foul, so appropriate remedies can be pursued. But what happens when bureaucrats tamper with fire alarms? I explore this in the context of police misconduct, leveraging the sudden relocation of a complaint center in Chicago to test how changing the cost of “pulling fire alarms” affects the supply of information on police wrongdoing. Using rare data on complaints against police, I use a difference-in-differences design to estimate civilians’ complaint valuation. I find opportunity cost deters civilians from reporting misconduct, especially for those seeking help from police. Using a structural model, I show this increased burden would decrease the rate of sustained allegations for failure to provide service but increase the rate for constitutional violations. These results shed light on the complicated interplay between the cost of civilian oversight and government performance.
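For readers unfamiliar with the design, the basic two-group, two-period difference-in-differences calculation can be sketched as follows. This is a generic illustration, not the paper's specification: the grouping into areas whose travel cost rose after the relocation ("treated") and areas whose cost did not change ("control"), and all complaint figures, are hypothetical.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences estimate.

    Subtracts the change in the control group from the change in the treated
    group, so that trends common to both groups cancel out.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical monthly complaints per 10,000 residents.
TREATED_PRE = [12, 14, 13]   # areas that end up farther from the complaint center
TREATED_POST = [9, 10, 8]
CONTROL_PRE = [11, 12, 13]   # areas whose distance is unchanged
CONTROL_POST = [10, 12, 11]

EFFECT = did_estimate(TREATED_PRE, TREATED_POST, CONTROL_PRE, CONTROL_POST)
```

Here complaints fall by 4 in the treated areas and by 1 in the control areas, giving an estimate of -3 complaints per 10,000 residents. The estimate is only credible under the parallel-trends assumption, namely that the two groups would have followed the same trend had the center not moved.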
We study the effect of temperature on police-involved civilian deaths in the U.S. from 2000 to 2016. We show that violent crimes and the number of officers assaulted or killed increase on warmer days (≥17°C), indicating an increased risk of personal harm on such days. Despite the higher threat level, temperatures have a precise null impact on the number of deaths via firearms, suggesting officers exercise judgment over their use of firearms independently of threat level. However, deaths from Tasers significantly increase during ‘extremely warm’ days (≥32°C), indicating a need to reevaluate Taser-use policies to prevent unintended deaths.
We engage in research on how the mass public perceives policing and crime, as well as in public-facing education so academics and the general public understand the issues involved and have the best information available. This includes setting the academic and public record straight in instances of flawed research and misinformation.
For decades, high-profile incidents of excessive force against minorities have fueled allegations of abusive policing in the United States and demands for reform. Yet one of the main drivers of today’s policing crisis remains unchanged: massive racial disparities in law enforcement. Making use of formal statistical frameworks for drawing causal inferences – that is, reliable conclusions about how and why events occur, given explicitly stated assumptions and observed data – we have shown the importance of measuring and accounting for the long chain of events from officer deployment to contact, detainment, and violence.
The efficiency of any police action depends on the relative magnitude of its crime-reducing benefits and legitimacy costs. Policing strategies that are socially efficient at the city level may be harmful at the local level, because the distribution of direct costs and benefits of police actions that reduce victimization is not the same as the distribution of indirect benefits of feeling safe. In the United States, the local misallocation of police resources is disproportionately borne by Black and Hispanic individuals. Despite the complexity of this particular problem, the incentives facing both police departments and police officers tend to be structured as if the goals of policing were simple—to reduce crime by as much as possible. Formal data collection on the crime-reducing benefits of policing, and not the legitimacy costs, produces further incentives to provide more engagement than may be efficient in any specific encounter, at both the officer and departmental level. There is currently little evidence as to what screening, training, or monitoring strategies are most effective at encouraging individual officers to balance the crime reducing benefits and legitimacy costs of their actions.
A deeply flawed study of police shootings, published in an influential journal, has been retracted by its authors. This is a positive step for the crucial debate on police violence, because pundits continue to use this baseless article to dismiss concerns over racial bias in policing.
Following George Floyd’s killing and ensuing social unrest, policing has ascended to the top of the nation’s political agenda. Yet many politicians and pundits dismiss concerns over systemic racial bias in policing, with some calling it a “myth.” These claims rest on deeply flawed science that has nonetheless been circulated widely and uncritically by major news outlets that span the political spectrum. Misleading statistics have been used to justify racial injustice in the past. They should not be used to do so now.
Minneapolis citizens had already put the city’s police department on notice that Derek Chauvin, the officer charged with second degree murder in the death of George Floyd, might be dangerous. Over his 18-year career, they lodged at least 17 complaints against him, but none resulted in meaningful discipline. Chauvin’s example mirrors a pattern seen in earlier prominent police killings such as the murder by Jason Van Dyke of 17-year-old Laquan McDonald, in Chicago – in which officers with a history of abusive behavior were nonetheless allowed to continue interacting with the public. Why don’t police take complaints seriously? And what can be done to make complaint systems work better?
Is the use of force by police racially biased, and if so, to what degree? Producing hard evidence on this question has posed an immense challenge for social scientists – including us – for decades. So when a peer-reviewed study was published last year in the prestigious Proceedings of the National Academy of Sciences that purported to have overcome these obstacles, we read it with great interest. But while the study was widely covered in the media, it was based on a logical fallacy and erroneous statistical reasoning, and sheds no light on whether police violence is racially biased. We demonstrated this mathematically in a letter published in the same journal last week, but only after months of public protest and controversy. The difficulty of fixing blatant mistakes in academic publications threatens not only the advancement of science but also the promise of evidence-based policymaking.
The increasingly visible presence of heavily armed police units in American communities has stoked widespread concern over the militarization of local law enforcement. Advocates claim militarized policing protects officers and deters violent crime, while critics allege these tactics are targeted at racial minorities and erode trust in law enforcement. Using a rare geocoded census of SWAT team deployments from Maryland, I show that militarized police units are more often deployed in communities with large shares of African American residents, even after controlling for local crime rates. Further, using nationwide panel data on local police militarization, I demonstrate that militarized policing fails to enhance officer safety or reduce local crime. Finally, using survey experiments—one of which includes a large oversample of African American respondents—I show that seeing militarized police in news reports may diminish police reputation in the mass public. In the case of militarized policing, the results suggest that the often-cited trade-off between public safety and civil liberties is a false choice.
Recent protests have brought criminal justice to the forefront of U.S. politics. Moving preferences on policies like mandatory minimums, however, remains a central challenge to widespread reform. Across six experiments (N > 11,000), we show that changing criminal justice policy preferences remains difficult. A common explanation for widespread support for punitive policy is that most Americans believe crime is rising even during periods of decline. However, we find while the public is willing to accept factual corrections about crime rates, this never prompts reconsideration of policy opinions. Additional experiments deploying common persuasive designs show co-partisan elite cues have no effect, but individuals update their opinions when factual corrections are combined with forced consideration of opposition views or when pressured by in-group members. These interventions are cognitively burdensome, logistically challenging to scale and produce only small effects. Policy preferences are movable, but simple information treatments are ineffective, complicating criminal justice reform.
Promoting public safety is a central mandate of government. But despite decades of dramatic improvements, most Americans believe crime is rising—a mysterious pattern that may pervert the criminal justice policymaking process. What explains this disconnect? We test five plausible explanations: survey mismeasurement, extrapolation from local crime conditions, lack of exposure to facts, partisan cues and the racialization of crime. Cross-referencing over a decade of crime records with geolocated polling data and original survey experiments, we show individuals readily update beliefs when presented with accurate crime statistics, but this effect is attenuated when statistics are embedded in a typical crime news article, and confidence in perceptions is diminished when a copartisan elite undermines official statistics. We conclude Americans misperceive crime because of the frequency and manner of encounters with relevant statistics. Our results suggest widespread misperceptions are likely to persist barring foundational changes in Americans’ information consumption habits, or elite assistance.