Ensuring Fairness under Prior Probability Shifts

In this paper, we study the problem of fair classification in the presence of prior probability shifts, where the class prior (the prevalence of each label) in the training set differs from that in the test set. Such shifts can be observed in the yearly records of several real-world datasets, including recidivism records and medical expenditure surveys. If unaccounted for, they can cause the predictions of a classifier to become unfair towards specific population subgroups. While the fairness notion called Proportional Equality (PE) accounts for such shifts, no procedure to ensure PE-fairness was previously known. In this work, we propose a method, called CAPE, which provides a comprehensive solution to this problem. CAPE makes novel use of prevalence estimation techniques, sampling, and an ensemble of classifiers to ensure fair predictions under prior probability shifts. We also introduce a metric, called prevalence difference (PD), which CAPE attempts to minimize in order to ensure PE-fairness, and we theoretically establish that this metric exhibits several desirable properties. We evaluate the efficacy of CAPE through extensive experiments on synthetic datasets, and compare its performance with that of several popular fair classifiers on real-world datasets such as COMPAS (criminal risk assessment) and MEPS (medical expenditure panel survey). The results indicate that CAPE ensures PE-fair predictions while remaining competitive on other performance metrics.
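The abstract only sketches CAPE at a high level, so below is a minimal, illustrative Python sketch of two well-known building blocks it names: quantification-style prevalence estimation (Adjusted Classify & Count, in the spirit of Forman, 2005) and re-weighting classifier posteriors toward a new class prior (Saerens et al., 2002). The function names, toy data, and the final prevalence-gap diagnostic are assumptions made for exposition; this is not the paper's actual CAPE implementation, nor its formal definition of the PD metric.

    # Illustrative sketch only (not the authors' CAPE implementation):
    # combines quantification-style prevalence estimation with a
    # prior-shift correction of classifier posteriors.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    def estimate_prevalence_acc(y_val, val_pred, test_pred):
        """Adjusted Classify & Count (Forman, 2005): correct the raw
        positive rate on the test set using the tpr/fpr measured on a
        validation split drawn from the training distribution."""
        tpr = val_pred[y_val == 1].mean()
        fpr = val_pred[y_val == 0].mean()
        raw_rate = test_pred.mean()
        if tpr - fpr == 0:
            return float(raw_rate)
        return float(np.clip((raw_rate - fpr) / (tpr - fpr), 0.0, 1.0))

    def adjust_posteriors(probs, train_prior, test_prior):
        """Re-weight posteriors toward a new class prior, following the
        adjustment formula of Saerens et al. (2002)."""
        pos = probs * (test_prior / train_prior)
        neg = (1 - probs) * ((1 - test_prior) / (1 - train_prior))
        return pos / (pos + neg)

    # Toy data exhibiting a prior probability shift: P(y) changes
    # between train and test while P(x | y) stays fixed.
    rng = np.random.default_rng(0)

    def sample(n, prior):
        y = (rng.random(n) < prior).astype(int)
        X = rng.normal(loc=y[:, None] * 1.5, scale=1.0, size=(n, 2))
        return X, y

    X_tr, y_tr = sample(4000, prior=0.5)   # training prior: 50% positives
    X_te, y_te = sample(4000, prior=0.2)   # test prior shifted to 20%

    X_fit, X_val, y_fit, y_val = train_test_split(X_tr, y_tr, random_state=0)
    clf = LogisticRegression().fit(X_fit, y_fit)

    val_pred = clf.predict(X_val)
    test_pred = clf.predict(X_te)
    p_hat = estimate_prevalence_acc(y_val, val_pred, test_pred)

    probs = clf.predict_proba(X_te)[:, 1]
    adj = adjust_posteriors(probs, train_prior=y_fit.mean(), test_prior=p_hat)
    y_adj = (adj >= 0.5).astype(int)

    # A PD-style diagnostic (assumed form): gap between the estimated
    # prevalence and the realized rate of positive predictions.
    print(f"estimated test prevalence: {p_hat:.3f} (true {y_te.mean():.3f})")
    print(f"prevalence gap before adjustment: {abs(test_pred.mean() - p_hat):.3f}")
    print(f"prevalence gap after adjustment:  {abs(y_adj.mean() - p_hat):.3f}")

Under the 50%-to-20% prior shift simulated here, thresholding the adjusted posteriors brings the realized positive-prediction rate much closer to the estimated test prevalence; this kind of prevalence gap, computed per population subgroup, is the sort of quantity a PD-style metric would penalize.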
