Assessing Disparate Impacts of Personalized Interventions: Identifiability and Bounds

Personalized interventions in social services, education, and healthcare leverage individual-level causal effect predictions to give each individual the best treatment or to prioritize interventions for the individuals most likely to benefit. While the sensitivity of these domains compels us to evaluate the fairness of such policies, we show that auditing their disparate impacts with standard observational metrics, such as true positive rates, is impossible because the ground truth is unknown: whether the data are experimental or observational, an individual's outcome under an intervention different from the one actually received can never be observed, only predicted from features. We prove that these quantities are nonetheless point-identified under the additional assumption of monotone treatment response, which may be reasonable in many applications. We further provide a sensitivity analysis for this assumption via sharp partial-identification bounds under violations of monotonicity of varying strength. We show how to use these results to audit personalized interventions with partially identified ROC and xROC curves and demonstrate the approach in a case study of a French job-training dataset.
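
To make the identification argument concrete, below is a minimal plug-in sketch (not the authors' code) for the point-identified case: assuming binary outcomes, a fully randomized binary treatment, a deterministic policy flag Z, and monotone treatment response Y(1) >= Y(0), the policy's true positive rate among treatment responders within each protected group reduces to a ratio of contrasts that are all estimable from experimental data. The function name and variable names are illustrative only.

```python
import numpy as np

def tpr_under_monotonicity(y, t, z, group):
    """Plug-in per-group true positive rate of a policy z among treatment
    responders (Y(1)=1, Y(0)=0), assuming (i) binary outcomes, (ii) a fully
    randomized binary treatment t, and (iii) monotone treatment response
    Y(1) >= Y(0).  Under these assumptions,
        TPR_a = (E[Z*Y(1)|A=a] - E[Z*Y(0)|A=a]) / (E[Y(1)|A=a] - E[Y(0)|A=a]),
    and every term on the right is estimable from the experiment.
    """
    y, t, z, group = map(np.asarray, (y, t, z, group))
    tpr = {}
    for a in np.unique(group):
        g = group == a
        # Marginal mean potential outcomes in group a, from the two randomized arms.
        ey1 = y[g & (t == 1)].mean()
        ey0 = y[g & (t == 0)].mean()
        # E[Z*Y(d)|A=a] = E[Y | T=d, Z=1, A=a] * P(Z=1|A=a), using randomization of T.
        p_z1 = z[g].mean()
        treated_flagged = g & (t == 1) & (z == 1)
        control_flagged = g & (t == 0) & (z == 1)
        ezy1 = y[treated_flagged].mean() * p_z1 if treated_flagged.any() else 0.0
        ezy0 = y[control_flagged].mean() * p_z1 if control_flagged.any() else 0.0
        tpr[a] = (ezy1 - ezy0) / (ey1 - ey0)
    return tpr
```

When monotonicity is relaxed, this point estimate is replaced by the sharp interval bounds discussed above; that extension is not sketched here.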
