Parity-based cumulative fairness-aware boosting

Data-driven AI systems can lead to discrimination on the basis of protected attributes such as gender or race. One reason for this behavior is societal bias encoded in the training data (e.g., females are underrepresented), which is aggravated in the presence of unbalanced class distributions (e.g., “granted” is the minority class). State-of-the-art fairness-aware machine learning approaches focus on preserving overall classification accuracy while improving fairness. In the presence of class imbalance, such methods may further aggravate discrimination by denying an already underrepresented group (e.g., females) fundamental rights such as equal social privileges (e.g., equal credit opportunity). To this end, we propose AdaFair, a fairness-aware boosting ensemble that changes the data distribution at each round, taking into account not only the class errors but also the fairness-related performance of the model, defined cumulatively over the partial ensemble. Beyond this in-training boosting of the group discriminated against in each round, AdaFair directly tackles imbalance in a post-training phase by optimizing the number of ensemble learners with respect to the balanced error rate (BER). AdaFair can accommodate different parity-based fairness notions and effectively mitigates discriminatory outcomes. Our experiments show that our approach achieves parity in terms of statistical parity, equal opportunity, and disparate mistreatment while maintaining good predictive performance for all classes.
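The two mechanisms described above, cumulative fairness-aware reweighting during boosting and post-training selection of the ensemble size by balanced error rate, can be illustrated with a minimal sketch. This is not the paper's exact formulation: it assumes binary labels, a binary protected attribute, equal opportunity as the parity notion, scikit-learn decision stumps as weak learners, and a simplified multiplicative fairness cost; all function names are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adafair_sketch(X, y, protected, T=20, eps=1e-10):
    """Simplified fairness-aware boosting: standard AdaBoost reweighting,
    plus an extra boost for positive-class members of the group currently
    disadvantaged by the *partial ensemble* (cumulative fairness)."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    learners, alphas = [], []
    margin = np.zeros(n)  # cumulative weighted votes of the partial ensemble
    for _ in range(T):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)
        err = min(max(err, eps), 1 - eps)
        alpha = 0.5 * np.log((1 - err) / err)
        learners.append(stump)
        alphas.append(alpha)
        margin += alpha * np.where(pred == 1, 1, -1)
        ens_pred = np.where(margin >= 0, 1, 0)
        # Cumulative fairness cost: compare true positive rates of the two
        # groups under the partial ensemble (equal opportunity).
        tpr = {}
        for g in (0, 1):
            mask = (protected == g) & (y == 1)
            tpr[g] = ens_pred[mask].mean() if mask.any() else 1.0
        disc = 0 if tpr[0] < tpr[1] else 1  # currently disadvantaged group
        u = np.where((protected == disc) & (y == 1) & (ens_pred != y),
                     abs(tpr[0] - tpr[1]), 0.0)
        # AdaBoost reweighting, inflated by the fairness term.
        w *= np.exp(alpha * (pred != y)) * (1.0 + u)
        w /= w.sum()
    return learners, alphas

def predict_ensemble(learners, alphas, X, k=None):
    """Predict with the first k weak learners only."""
    k = k or len(learners)
    votes = sum(a * np.where(m.predict(X) == 1, 1, -1)
                for m, a in zip(learners[:k], alphas[:k]))
    return np.where(votes >= 0, 1, 0)

def balanced_error(y, pred):
    """BER: mean of false negative rate and false positive rate."""
    fnr = (pred[y == 1] == 0).mean()
    fpr = (pred[y == 0] == 1).mean()
    return 0.5 * (fnr + fpr)

def pick_ensemble_size(learners, alphas, X_val, y_val):
    """Post-training step: choose the learner prefix minimising BER."""
    bers = [balanced_error(y_val, predict_ensemble(learners, alphas, X_val, k))
            for k in range(1, len(learners) + 1)]
    return int(np.argmin(bers)) + 1
```

Selecting the ensemble size on a validation set rather than at training time is what lets the method trade a little overall accuracy for balanced per-class performance, which matters when the protected positive class is the minority.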
