Group Fairness by Probabilistic Modeling with Latent Fair Decisions

Machine learning systems are increasingly used to make impactful decisions, for example in loan applications and criminal justice risk assessments, and as such, ensuring the fairness of these systems is critical. This is often challenging because the labels in the data are biased. This paper studies learning fair probability distributions from biased data by explicitly modeling a latent variable that represents a hidden, unbiased label. In particular, we aim to achieve demographic parity by enforcing certain independencies in the learned model. We also show that group fairness guarantees are meaningful only if the distribution used to provide those guarantees indeed captures the real-world data. To model the data distribution closely, we employ probabilistic circuits, an expressive and tractable probabilistic model, and propose an algorithm to learn them from incomplete data. We evaluate our approach on a synthetic dataset in which the observed labels are generated from fair labels with added bias, and demonstrate that the fair labels are successfully retrieved. Moreover, we show on real-world datasets that our approach not only models how the data was generated better than existing methods but also achieves competitive accuracy.
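Since the abstract describes demographic parity as an independence enforced in the learned model, the following sketch states that constraint in notation of our own choosing (sensitive attribute S, latent fair decision D_f, observed biased label D); it illustrates the idea the abstract refers to and is not necessarily the paper's exact formulation.

% Demographic parity as an independence constraint (our notation):
% the latent fair decision D_f is distributed identically across sensitive groups,
\Pr(D_f = 1 \mid S = s) \;=\; \Pr(D_f = 1 \mid S = s') \quad \forall\, s, s'
\qquad\Longleftrightarrow\qquad D_f \perp S,
% while the observed label D may still depend on both D_f and S,
% which is how label bias can enter the training data.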
