Practical Compositional Fairness: Understanding Fairness in Multi-Task ML Systems

Most of the fairness literature has focused on improving fairness with respect to a single model or a single objective. Real-world machine learning systems, however, are usually composed of many different components. Unfortunately, recent research has shown that even if each component is ``fair,'' the overall system can still be ``unfair'' \cite{dwork2018fairness}. In this paper, we study how well fairness composes across the multiple components of \emph{real systems}. We consider two recently proposed fairness metrics for rankings: exposure and the pairwise ranking accuracy gap. We provide theory establishing a set of conditions under which the fairness of individual models does compose. We then present an analytical framework both for understanding whether a system's signals can achieve compositional fairness and for diagnosing which of those signals lowers the overall system's end-to-end fairness the most. Despite previously bleak theoretical results, on multiple datasets, including a large-scale real-world recommender system, we find that end-to-end system fairness is largely achievable by improving the fairness of individual components.
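
For concreteness, the two ranking metrics can be sketched as follows; the notation here is ours, and the exact conditioning used in the paper may differ. Following the exposure-based notion of [19], given position-bias weights $v_k$ for rank $k$ (e.g., $v_k = 1/\log_2(k+1)$), the exposure a ranking gives to a group $G$ is
\[
\mathrm{Exposure}(G) = \sum_{d \in G} v_{\mathrm{rank}(d)},
\]
typically compared across groups after normalizing by group size or merit. Following the pairwise notion of [16], the pairwise ranking accuracy of a group $G$ under scoring function $f$ is the probability that a preferred item from $G$ is scored above a less-preferred item,
\[
\mathrm{PairAcc}(G) = \Pr\big(f(d) > f(d') \mid d \succ d',\, d \in G\big),
\]
where $d \succ d'$ denotes that the user preferred $d$ (e.g., clicked $d$ but not $d'$); the pairwise ranking accuracy gap is then $\lvert \mathrm{PairAcc}(G_1) - \mathrm{PairAcc}(G_2) \rvert$, and compositional fairness asks that this gap remain small for the end-to-end system, not only for each component.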

[1] Zhe Zhao, et al. Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations, 2017, ArXiv.

[2] Krishna P. Gummadi, et al. Learning Fair Classifiers, 2015, arXiv:1507.05259.

[3] Jon M. Kleinberg, et al. On Fairness and Calibration, 2017, NIPS.

[4] Nathan Kallus, et al. The Fairness of Risk Scores Beyond Classification: Bipartite Ranking and the xAUC Metric, 2019, NeurIPS.

[5] Cynthia Dwork, et al. Fairness Under Composition, 2018, ITCS.

[6] Bernhard Schölkopf, et al. Kernel Mean Embedding of Distributions: A Review and Beyond, 2016, Found. Trends Mach. Learn.

[7] Harikrishna Narasimhan, et al. Pairwise Fairness for Ranking and Regression, 2019, AAAI.

[8] Nathan Srebro, et al. Equality of Opportunity in Supervised Learning, 2016, NIPS.

[9] Ricardo Baeza-Yates, et al. FA*IR: A Fair Top-k Ranking Algorithm, 2017, CIKM.

[10] Max Welling, et al. The Variational Fair Autoencoder, 2015, ICLR.

[11] James Y. Zou, et al. Multiaccuracy: Black-Box Post-Processing for Fairness in Classification, 2018, AIES.

[12] Michael D. Ekstrand, et al. Exploring author gender in book rating and recommendation, 2018, User Modeling and User-Adapted Interaction.

[13] Jaana Kekäläinen, et al. Cumulated gain-based evaluation of IR techniques, 2002, TOIS.

[14] Sahin Cem Geyik, et al. Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search, 2019, KDD.

[15] Sreenivas Gollapudi, et al. An axiomatic approach for result diversification, 2009, WWW '09.

[16] Ed H. Chi, et al. Fairness in Recommendation Ranking through Pairwise Comparisons, 2019, KDD.

[17] Jade Goldstein-Stewart, et al. The use of MMR, diversity-based reranking for reordering documents and producing summaries, 1998, SIGIR '98.

[18] Blake Lemoine, et al. Mitigating Unwanted Biases with Adversarial Learning, 2018, AIES.

[19] Thorsten Joachims, et al. Fairness of Exposure in Rankings, 2018, KDD.

[20] Jure Leskovec, et al. Learning Attitudes and Attributes from Multi-aspect Reviews, 2012, IEEE ICDM.

[21] Zhe Zhao, et al. Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts, 2018, KDD.

[22] Joaquin Quiñonero Candela, et al. Practical Lessons from Predicting Clicks on Ads at Facebook, 2014, ADKDD'14.

[23] Toniann Pitassi, et al. Learning Adversarially Fair and Transferable Representations, 2018, ICML.

[24] Filip Radlinski, et al. Learning diverse rankings with multi-armed bandits, 2008, ICML '08.

[25] H. B. Mann, et al. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other, 1947.

[26] Toon Calders, et al. Building Classifiers with Independency Constraints, 2009, IEEE ICDM Workshops.

[27] John Langford, et al. A Reductions Approach to Fair Classification, 2018, ICML.

[28] Sreenivas Gollapudi, et al. Diversifying search results, 2009, WSDM '09.

[29] Filip Radlinski, et al. Learning optimally diverse rankings over large document collections, 2010, ICML.

[30] Adam Tauman Kalai, et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, 2016, NIPS.

[31] Indre Zliobaite, et al. On the relation between accuracy and fairness in binary classification, 2015, ArXiv.

[32] Fabrizio Silvestri, et al. Efficient Diversification of Web Search Results, 2011, Proc. VLDB Endow.

[33] Jimmy J. Lin, et al. A cascade ranking model for efficient ranked retrieval, 2011, SIGIR.

[34] Maya R. Gupta, et al. Satisfying Real-world Goals with Dataset Constraints, 2016, NIPS.

[35] Jade Goldstein-Stewart, et al. The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries, 1998, SIGIR Forum.

[36] Krishna P. Gummadi, et al. Fairness Constraints: Mechanisms for Fair Classification, 2015, AISTATS.

[37] Suju Rajan, et al. Beyond clicks: dwell time for personalization, 2014, RecSys '14.

[38] Gediminas Adomavicius, et al. Toward the Next Generation of Recommender Systems, 2005.

[39] Maya R. Gupta, et al. Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints, 2018, ICML.

[40] Bernhard Schölkopf, et al. A Kernel Two-Sample Test, 2012, J. Mach. Learn. Res.

[41] C. Dwork, et al. Group Fairness Under Composition, 2018.

[42] Lucy Vasserman, et al. Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification, 2019, WWW.

[43] Robin D. Burke, et al. Hybrid Recommender Systems: Survey and Experiments, 2002, User Modeling and User-Adapted Interaction.

[44] Allison Woodruff, et al. Putting Fairness Principles into Practice: Challenges, Metrics, and Improvements, 2019, AIES.

[45] Vaibhava Goel, et al. McGan: Mean and Covariance Feature Matching GAN, 2017, ICML.

[46] Gediminas Adomavicius, et al. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions, 2005, IEEE Transactions on Knowledge and Data Engineering.