Fair Decision Rules for Binary Classification

In recent years, machine learning has begun automating decision making in fields as varied as college admissions, credit lending, and criminal sentencing. The socially sensitive nature of some of these applications, together with increasing regulatory constraints, has created a need for algorithms that are both fair and interpretable. In this paper we consider the problem of building Boolean rule sets in disjunctive normal form (DNF), an interpretable model for binary classification, subject to fairness constraints. We formulate the problem as an integer program that maximizes classification accuracy with explicit constraints on two measures of classification parity: equality of opportunity and equalized odds. A column generation framework with a novel formulation is used to efficiently search over the exponentially many possible rules. When combined with faster heuristics, our method can handle large datasets. Compared to other fair and interpretable classifiers, our method finds rule sets that meet stricter notions of fairness with a modest trade-off in accuracy.
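For readers unfamiliar with the two parity criteria named above, the sketch below states them in their standard form (equality of opportunity and equalized odds, in the sense of Hardt, Price, and Srebro) and shows a generic accuracy-maximization objective constrained by them. The tolerance epsilon and the abstract objective are illustrative assumptions; the paper's actual integer program is formulated over rule-selection variables and is not reproduced here.

```latex
% Standard classification-parity conditions that the classifier is
% constrained to satisfy (approximately). A is the protected attribute,
% Y the true label, \hat{Y} the prediction of the DNF rule set.

% Equality of opportunity: equal true positive rates across groups.
\Pr(\hat{Y} = 1 \mid Y = 1, A = 0) = \Pr(\hat{Y} = 1 \mid Y = 1, A = 1)

% Equalized odds: equal true positive AND false positive rates across groups.
\Pr(\hat{Y} = 1 \mid Y = y, A = 0) = \Pr(\hat{Y} = 1 \mid Y = y, A = 1),
\qquad y \in \{0, 1\}

% A generic constrained-accuracy objective with an assumed tolerance \epsilon
% (a sketch only; not the paper's rule-selection integer program):
\max_{\hat{Y}} \; \Pr(\hat{Y} = Y)
\quad \text{s.t.} \quad
\bigl| \Pr(\hat{Y} = 1 \mid Y = y, A = 0) - \Pr(\hat{Y} = 1 \mid Y = y, A = 1) \bigr|
\le \epsilon, \quad y \in \{0, 1\}
```

In words, equality of opportunity requires equal true positive rates across the protected groups, while equalized odds additionally requires equal false positive rates; constraining the integer program by either criterion trades some accuracy for parity.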
