Dude, Where’s My Treatment Effect? Errors in Administrative Data Linking and the Destruction of Statistical Power in Randomized Experiments

Objective: The increasing availability of large administrative datasets has led to an exciting innovation in criminal justice research: using administrative data to measure experimental outcomes in lieu of costly primary data collection. We demonstrate that this type of randomized experiment can have an unfortunate consequence: the destruction of statistical power. Combining experimental data with administrative records to track outcomes of interest typically requires linking datasets without a common identifier. To minimize mistaken linkages, researchers often use stringent linking rules such as “exact matching” to ensure that speculative matches do not introduce errors into the analytic dataset. We show that this seemingly conservative approach leads to underpowered experiments, leaves real treatment effects undetected, and can therefore have profound implications for entire experimental literatures.

Methods: We derive an analytic result for the consequences of linking errors on statistical power and show how the problem varies across combinations of relevant inputs, including linking error rate, outcome density, and sample size.

Results: Because few experiments are overly well-powered, even small amounts of linking error can have a considerable impact on Type II error rates. In contrast to exact matching, machine learning-based probabilistic matching algorithms allow researchers to recover a considerable share of the statistical power lost under stringent data-linking rules.

Conclusion: Our results demonstrate that probabilistic linking substantially outperforms stringent linking criteria. Failure to implement linking procedures designed to reduce linking errors can have dire consequences for subsequent analyses and, more broadly, for the viability of this type of experimental research.
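The relationship between linking error and power can be illustrated with a minimal sketch. This is not the paper's analytic result, which the abstract does not state. It assumes a binary outcome (for example, rearrest), a two-sided two-sample z-test for proportions with an unpooled standard error, and linking errors that consist only of missed matches occurring at the same rate in both arms, so that each true outcome is observed with probability `1 - miss_rate`. The function names and the example rates are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_treat, n_per_arm, alpha=0.05):
    """Normal-approximation power of a two-sided z-test comparing two proportions."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Unpooled standard error of the difference in observed outcome rates.
    se = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treat * (1 - p_treat) / n_per_arm
    )
    shift = abs(p_control - p_treat) / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def power_with_missed_links(p_control, p_treat, n_per_arm, miss_rate, alpha=0.05):
    """Power when a share `miss_rate` of true outcomes is never linked.

    Assumption (not from the source): a missed link records the outcome as 0,
    and misses are independent of treatment status, so both observed rates
    are scaled by (1 - miss_rate).
    """
    keep = 1 - miss_rate
    return power_two_proportions(keep * p_control, keep * p_treat, n_per_arm, alpha)


if __name__ == "__main__":
    # Hypothetical experiment: 30% control rate, 24% treated rate, 500 per arm.
    for miss_rate in (0.0, 0.1, 0.2, 0.4):
        power = power_with_missed_links(0.30, 0.24, 500, miss_rate)
        print(f"missed-link rate {miss_rate:.0%}: approximate power {power:.3f}")
```

Under these assumptions the observed treatment effect shrinks in proportion to the missed-link rate while the sampling variance shrinks more slowly, so power falls as more true matches are missed. False-positive links, which a fuller treatment would also model, are ignored here.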
