Adaptive Sensitive Reweighting to Mitigate Bias in Fairness-aware Classification

Machine learning bias and fairness have recently emerged as key issues due to the pervasive deployment of data-driven decision making in a variety of sectors and services. It has often been argued that unfair classifications can be attributed to bias in training data, but previous attempts to 'repair' training data have had limited success. To circumvent the shortcomings of data-repair approaches, such as those that weight training samples of the sensitive group (e.g. gender, race, financial status) based on their misclassification error, we present a process that iteratively adapts training sample weights using a theoretically grounded model. This model addresses different kinds of bias to better achieve fairness objectives, such as trade-offs between accuracy and the elimination of disparate impact or disparate mistreatment. We show that, compared to previous fairness-aware approaches, our methodology achieves better or similar trade-offs between accuracy and unfairness mitigation on real-world and synthetic datasets.
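To make the idea of iteratively adapting training sample weights concrete, the following is a minimal sketch and not the model proposed in the paper. The abstract does not give the weighting rule, so the update used here (weights proportional to exp(alpha * error), with one hypothetical rate `alpha` per sensitive group) is purely an illustrative assumption. The helper names `fit_weighted_logreg`, `predict_proba`, `positive_rate_ratio` and `adaptive_reweight` are likewise invented for this example. A plain weighted logistic regression stands in for the base classifier.

```python
import numpy as np

def fit_weighted_logreg(X, y, w, lr=0.5, steps=300):
    """Weighted logistic regression fitted by plain gradient descent."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    theta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ theta))
        theta -= lr * Xb.T @ (w * (p - y)) / w.sum()
    return theta

def predict_proba(theta, X):
    """Predicted probability of the positive class."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ theta))

def positive_rate_ratio(y_pred, s):
    """Ratio of positive-prediction rates between the two groups.

    1.0 means both groups receive positive predictions at the same rate;
    values near 0 indicate strong disparate impact.
    """
    r0 = y_pred[s == 0].mean()
    r1 = y_pred[s == 1].mean()
    hi = max(r0, r1)
    return 1.0 if hi == 0 else min(r0, r1) / hi

def adaptive_reweight(X, y, s, alpha=(1.0, 1.0), rounds=10):
    """Alternate between fitting the classifier and updating sample weights.

    ASSUMPTION: the update w_i = exp(alpha[s_i] * |y_i - p_i|) is a
    hypothetical rule chosen for illustration, not the paper's model.
    """
    w = np.ones(len(y))
    for _ in range(rounds):
        theta = fit_weighted_logreg(X, y, w)
        err = np.abs(y - predict_proba(theta, X))   # per-sample error in [0, 1]
        rate = np.where(s == 1, alpha[1], alpha[0])  # group-specific rate
        w = np.exp(rate * err)
        w /= w.mean()                                # keep mean weight at 1
    theta = fit_weighted_logreg(X, y, w)             # final fit with last weights
    return theta, w
```

In a sketch like this, the per-group rates in `alpha` would be tuned by an outer search that scores each candidate by a chosen trade-off between accuracy and a fairness measure such as `positive_rate_ratio`; how the paper performs that tuning is not stated in the abstract.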
