FA*IR: A Fair Top-k Ranking Algorithm

In this work, we define and solve the Fair Top-k Ranking problem, in which we want to determine a subset of k candidates from a large pool of n ≫ k candidates, maximizing utility (i.e., selecting the "best" candidates) subject to group fairness criteria. Our ranked group fairness definition extends group fairness using the standard notion of protected groups and is based on ensuring that the proportion of protected candidates in every prefix of the top-k ranking remains statistically above, or indistinguishable from, a given minimum. Utility is operationalized in two ways: (i) every candidate included in the top-k should be more qualified than every candidate not included; and (ii) for every pair of candidates in the top-k, the more qualified candidate should be ranked above the less qualified one. An efficient algorithm is presented for producing the Fair Top-k Ranking, and tested experimentally on existing datasets as well as new datasets released with this paper, showing that our approach yields small distortions with respect to rankings that maximize utility without considering fairness criteria. To the best of our knowledge, this is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list.
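To illustrate the prefix-wise statistical test the abstract alludes to, the following is a minimal sketch (not the authors' reference implementation): for every prefix of length i, the number of protected candidates must not fall statistically below what a target proportion p would produce, as judged by a binomial test at significance level alpha. Function and variable names here are illustrative, and the multiple-testing correction of alpha described in the paper is omitted.

```python
from scipy.stats import binom


def minimum_protected(i, p, alpha):
    """Smallest count of protected candidates that a prefix of length i
    can contain without being statistically below proportion p."""
    for m in range(i + 1):
        # P(X <= m) for X ~ Binomial(i, p); if this probability exceeds
        # alpha, observing only m protected candidates is not statistically
        # surprising, so m is acceptable for this prefix.
        if binom.cdf(m, i, p) > alpha:
            return m
    return i


def satisfies_ranked_group_fairness(is_protected, p, alpha):
    """Check every prefix of a ranked list, given a boolean list marking
    which positions hold protected candidates (True = protected)."""
    protected_so_far = 0
    for i, flag in enumerate(is_protected, start=1):
        protected_so_far += int(flag)
        if protected_so_far < minimum_protected(i, p, alpha):
            return False
    return True


# Example: a top-6 ranking with protected candidates at positions 3 and 5,
# tested against a minimum proportion of 0.5 at alpha = 0.1.
print(satisfies_ranked_group_fairness(
    [False, False, True, False, True, False], p=0.5, alpha=0.1))
```

In this sketch, the required minimum number of protected candidates grows with the prefix length, which captures the idea that small prefixes may legitimately contain few protected candidates while longer prefixes may not.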
