Fair Decisions Despite Imperfect Predictions

Consequential decisions are increasingly informed by sophisticated data-driven predictive models. However, consistently learning accurate predictive models requires access to ground-truth labels, and in practice labels may exist only conditional on certain decisions: if a loan is denied, the individual never has the opportunity to repay it, so the true label is never observed. In this paper, we show that, in this selective labels setting, learning to predict is suboptimal in terms of both fairness and utility. To avoid this undesirable behavior, we propose to directly learn stochastic decision policies that maximize utility under fairness constraints. In the context of fair machine learning, our results suggest the need for a paradigm shift from "learning to predict" to "learning to decide". Experiments on synthetic and real-world data illustrate the favorable properties of learning to decide, in terms of both utility and fairness.
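
The abstract summarizes the approach without implementation details. As a rough sketch of the idea only, the following assumes a logistic stochastic policy, an inverse-propensity-scored (IPS) estimate of expected utility computed from decisions logged by a prior policy, and a demographic-parity penalty as the fairness constraint; the function names, penalty form, and hyperparameters are hypothetical and not taken from the paper.

    import numpy as np

    # Minimal sketch: learn a stochastic decision policy pi_theta(accept | x)
    # from selectively labeled logged data. Outcomes y are observed only for
    # individuals the logging policy accepted (d == 1); rows with d == 0 may
    # hold any placeholder for y, since their IPS weight is zero.
    # All names and hyperparameters are illustrative, not from the paper.

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_policy(X, d, y, p0, s, c=0.5, lam=1.0, lr=0.1, iters=500):
        """X: features; d: logging-policy decisions (0/1); y: outcomes
        (valid only where d == 1); p0: logging-policy acceptance
        probabilities; s: binary sensitive attribute; c: cost of a
        positive decision; lam: fairness-penalty weight."""
        theta = np.zeros(X.shape[1])
        w = d / p0                                # inverse-propensity weights
        for _ in range(iters):
            pi = sigmoid(X @ theta)               # pi_theta(accept | x)
            dpi = (pi * (1.0 - pi))[:, None] * X  # d pi / d theta, per example
            # Gradient of the IPS estimate of expected utility
            # E[pi_theta(x) * (y - c)]:
            g_util = ((w * (y - c))[:, None] * dpi).mean(axis=0)
            # Squared demographic-parity gap between the groups' acceptance
            # rates, used as a soft fairness constraint:
            gap = pi[s == 1].mean() - pi[s == 0].mean()
            g_gap = dpi[s == 1].mean(axis=0) - dpi[s == 0].mean(axis=0)
            theta += lr * (g_util - lam * 2.0 * gap * g_gap)  # gradient ascent
        return theta

Note that the objective is defined directly on the decision probabilities rather than on predicted label accuracy: the policy is rewarded for accepting profitable cases it actually observes, reweighted by the logging policy's propensities, which is what distinguishes "learning to decide" from thresholding a learned predictor.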
