Fairness in Credit Scoring: Assessment, Implementation and Profit Implications

The rise of algorithmic decision-making has spawned much research on fair machine learning (ML). Financial institutions use ML to build risk scorecards that support a range of credit-related decisions. Yet the literature on fair ML in credit scoring is scarce. This paper makes three contributions. First, we revisit statistical fairness criteria and examine their adequacy for credit scoring. Second, we catalog algorithmic options for incorporating fairness goals in the ML model development pipeline. Third, we empirically compare different fairness processors in a profit-oriented credit scoring context using real-world data. The empirical results substantiate the evaluation of fairness measures, identify suitable options for implementing fair credit scoring, and clarify the profit-fairness trade-off in lending decisions. We find that multiple fairness criteria can be approximately satisfied at once and recommend separation as a proper criterion for measuring the fairness of a scorecard. We also find that fair in-processors deliver a good balance between profit and fairness, and we show that algorithmic discrimination can be reduced to a reasonable level at a relatively low cost. The code accompanying the paper is available on GitHub.
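
To make the separation criterion concrete, here is a minimal sketch of one way it could be measured for binary accept/reject decisions. It is not taken from the paper's code: the function names, the 0/1 encoding of labels, decisions and the sensitive attribute, and the choice to average the absolute gaps in false positive and false negative rates between the two groups are all assumptions made for illustration. A value of 0 means both groups face the same error rates, which is what separation requires for a binary classifier.

```python
from typing import Sequence, Tuple


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (false positive rate, false negative rate) for 0/1 labels."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    # Guard against empty classes; 0.0 is an arbitrary illustrative fallback.
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def separation_gap(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> float:
    """Mean absolute gap in FPR and FNR between groups 0 and 1.

    This averaging is an illustrative assumption, not necessarily the
    exact formula used in the paper.
    """
    rates = {}
    for g in (0, 1):
        idx = [i for i, a in enumerate(group) if a == g]
        rates[g] = error_rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    fpr_gap = abs(rates[1][0] - rates[0][0])
    fnr_gap = abs(rates[1][1] - rates[0][1])
    return 0.5 * (fpr_gap + fnr_gap)


# Hypothetical toy data: four applicants in each group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
gap = separation_gap(y_true, y_pred, group)
```

In this toy data, group 0 has one false positive and one false negative while group 1 is classified perfectly, so the gap is well above zero. A scorecard that satisfied separation on the data would produce a gap close to 0.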
