Fairness-aware machine learning: a perspective

Algorithms learned from data are increasingly used for deciding many aspects in our life: from movies we see, to prices we pay, or medicine we get. Yet there is growing evidence that decision making by inappropriately trained algorithms may unintentionally discriminate people. For example, in automated matching of candidate CVs with job descriptions, algorithms may capture and propagate ethnicity related biases. Several repairs for selected algorithms have already been proposed, but the underlying mechanisms how such discrimination happens from the computational perspective are not yet scientifically understood. We need to develop theoretical understanding how algorithms may become discriminatory, and establish fundamental machine learning principles for prevention. We need to analyze machine learning process as a whole to systematically explain the roots of discrimination occurrence, which will allow to devise global machine learning optimization criteria for guaranteed prevention, as opposed to pushing empirical constraints into existing algorithms case-by-case. As a result, the state-of-the-art will advance from heuristic repairing, to proactive and theoretically supported prevention. This is needed not only because law requires to protect vulnerable people. Penetration of big data initiatives will only increase, and computer science needs to provide solid explanations and accountability to the public, before public concerns lead to unnecessarily restrictive regulations against machine learning.

[1]  Chris Clifton,et al.  Combating discrimination using Bayesian networks , 2014, Artificial Intelligence and Law.

[2]  Michael Carl Tschantz,et al.  Automated Experiments on Ad Privacy Settings , 2014, Proc. Priv. Enhancing Technol..

[3]  Michael Luca,et al.  Digital Discrimination: The Case of Airbnb.com , 2014 .

[4]  Michael Carl Tschantz,et al.  Automated Experiments on Ad Privacy Settings: A Tale of Opacity, Choice, and Discrimination , 2014, ArXiv.

[5]  Toon Calders,et al.  Discrimination Aware Decision Tree Learning , 2010, 2010 IEEE International Conference on Data Mining.

[6]  Faisal Kamiran,et al.  Quantifying explainable discrimination and removing illegal discrimination in automated decision making , 2012, Knowledge and Information Systems.

[7]  Carlos Eduardo Scheidegger,et al.  Certifying and Removing Disparate Impact , 2014, KDD.

[8]  Indre Zliobaite,et al.  Optimizing regression models for data streams with missing values , 2014, Machine Learning.

[9]  Franco Turini,et al.  Data mining for discrimination discovery , 2010, TKDD.

[10]  Indre Zliobaite,et al.  Using sensitive personal data may be necessary for avoiding discrimination in data-driven decision models , 2016, Artificial Intelligence and Law.

[11]  Franco Turini,et al.  k-NN as an implementation of situation testing for discrimination discovery and prevention , 2011, KDD.

[12]  Peter A. Flach,et al.  ROC curves in cost space , 2013, Machine Learning.

[13]  Sean A. Munson,et al.  Unequal Representation and Gender Stereotypes in Image Search Results for Occupations , 2015, CHI.

[14]  Toon Calders,et al.  Handling Conditional Discrimination , 2011, 2011 IEEE 11th International Conference on Data Mining.

[15]  Toon Calders,et al.  Why Unbiased Computational Processes Can Lead to Discriminative Decision Procedures , 2013, Discrimination and Privacy in the Information Society.

[16]  Indre Zliobaite,et al.  On the relation between accuracy and fairness in binary classification , 2015, ArXiv.

[17]  Andrew D. Selbst,et al.  Big Data's Disparate Impact , 2016 .

[18]  Jun Sakuma,et al.  Fairness-Aware Classifier with Prejudice Remover Regularizer , 2012, ECML/PKDD.

[19]  Toon Calders,et al.  Controlling Attribute Effect in Linear Regression , 2013, 2013 IEEE 13th International Conference on Data Mining.

[20]  Josep Domingo-Ferrer,et al.  A Methodology for Direct and Indirect Discrimination Prevention in Data Mining , 2013, IEEE Transactions on Knowledge and Data Engineering.

[21]  Salvatore Ruggieri,et al.  A multidisciplinary survey on discrimination analysis , 2013, The Knowledge Engineering Review.

[22]  Toniann Pitassi,et al.  Learning Fair Representations , 2013, ICML.

[23]  Frank A. Pasquale,et al.  [89WashLRev0001] The Scored Society: Due Process for Automated Predictions , 2014 .

[24]  T. Tinkham The Uses and Misuses of Statistical Proof in Age Discrimination Claims , 2010 .

[25]  David H. Kaye,et al.  Statistical Evidence of Discrimination , 1982 .

[26]  Indre Zliobaite,et al.  A survey on measuring indirect discrimination in machine learning , 2015, ArXiv.

[27]  Latanya Sweeney,et al.  Discrimination in online ad delivery , 2013, CACM.