Themis-ml: A Fairness-Aware Machine Learning Interface for End-To-End Discrimination Discovery and Mitigation

ABSTRACT As more industries integrate machine learning into socially sensitive decision processes such as hiring, loan approval, and parole granting, we risk perpetuating historical and contemporary socioeconomic disparities. This is a critical problem: on the one hand, organizations that use such systems without understanding their discriminatory potential will widen social disparities under the assumption that algorithms are categorically objective. On the other hand, the responsible use of machine learning can help us measure, understand, and mitigate the implicit historical biases in socially sensitive data by expressing implicit decision-making mental models as explicit statistical models. In this article we specify, implement, and evaluate a “fairness-aware” machine learning interface called themis-ml, intended for use by individual data scientists and engineers, academic research teams, and larger product teams that use machine learning in production systems.
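To make the notion of "discrimination discovery" concrete, the sketch below computes a simple group-fairness metric, the mean difference: the gap in favorable-outcome rates between an advantaged and a disadvantaged group. This is a minimal illustration in plain numpy; the function name, the 0/1 encoding of the protected attribute, and the toy data are illustrative assumptions, not necessarily themis-ml's actual API.

```python
import numpy as np

def mean_difference(y, s):
    """Gap in favorable-outcome rates between two groups.

    y : binary outcomes (1 = favorable, e.g. loan approved)
    s : binary protected attribute (0 = advantaged, 1 = disadvantaged)

    Returns a value in [-1, 1]; 0 indicates parity, and positive values
    indicate the advantaged group receives the favorable outcome more
    often than the disadvantaged group.
    """
    y, s = np.asarray(y), np.asarray(s)
    return y[s == 0].mean() - y[s == 1].mean()

# Toy data: the advantaged group is approved 80% of the time,
# the disadvantaged group 40% of the time.
y = np.array([1, 1, 1, 1, 0, 1, 0, 1, 0, 0])
s = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
print(mean_difference(y, s))  # ~0.4 (advantaged minus disadvantaged rate)
```

A metric like this supports the "discovery" half of the pipeline; mitigation methods (e.g., relabelling the training data or post-processing predictions) can then be evaluated by whether they shrink this gap without unduly harming predictive performance.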
