Fair Distributions from Biased Samples: A Maximum Entropy Optimization Framework

One reason for the emergence of bias in AI systems is biased data -- datasets that may not be true representations of the underlying distributions -- and may over or under-represent groups with respect to protected attributes such as gender or race. We consider the problem of correcting such biases and learning distributions that are "fair", with respect to measures such as proportional representation and statistical parity, from the given samples. Our approach is based on a novel formulation of the problem of learning a fair distribution as a maximum entropy optimization problem with a given expectation vector and a prior distribution. Technically, our main contributions are: (1) a new second-order method to compute the (dual of the) maximum entropy distribution over an exponentially-sized discrete domain that turns out to be faster than previous methods, and (2) methods to construct prior distributions and expectation vectors that provably guarantee that the learned distributions satisfy a wide class of fairness criteria. Our results also come with quantitative bounds on the total variation distance between the empirical distribution obtained from the samples and the learned fair distribution. Our experimental results include testing our approach on the COMPAS dataset and showing that the fair distributions not only improve disparate impact values but when used to train classifiers only incur a small loss of accuracy.

[1]  Nisheeth K. Vishnoi,et al.  Maximum Entropy Distributions: Bit Complexity and Stability , 2017, COLT.

[2]  Narendra Karmarkar,et al.  A new polynomial-time algorithm for linear programming , 1984, Comb..

[3]  Kush R. Varshney,et al.  Fairness GAN , 2018, IBM Journal of Research and Development.

[4]  Sean A. Munson,et al.  Unequal Representation and Gender Stereotypes in Image Search Results for Occupations , 2015, CHI.

[5]  Kush R. Varshney,et al.  Optimized Pre-Processing for Discrimination Prevention , 2017, NIPS.

[6]  Leslie G. Valiant,et al.  Random Generation of Combinatorial Structures from a Uniform Distribution , 1986, Theor. Comput. Sci..

[7]  Gary King,et al.  Logistic Regression in Rare Events Data , 2001, Political Analysis.

[8]  Avi Wigderson,et al.  Much Faster Algorithms for Matrix Scaling , 2017, 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS).

[9]  Aleksander Madry,et al.  Matrix Scaling and Balancing via Box Constrained Newton's Method and Interior Point Methods , 2017, 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS).

[10]  Jun Sakuma,et al.  Fairness-Aware Classifier with Prejudice Remover Regularizer , 2012, ECML/PKDD.

[11]  Mohit Singh,et al.  Entropy, optimization and counting , 2013, STOC.

[12]  Toon Calders,et al.  Why Unbiased Computational Processes Can Lead to Discriminative Decision Procedures , 2013, Discrimination and Privacy in the Information Society.

[13]  Lu Zhang,et al.  FairGAN: Fairness-aware Generative Adversarial Networks , 2018, 2018 IEEE International Conference on Big Data (Big Data).

[14]  E. Jaynes Information Theory and Statistical Mechanics , 1957 .

[15]  Toon Calders,et al.  Building Classifiers with Independency Constraints , 2009, 2009 IEEE International Conference on Data Mining Workshops.

[16]  Nisheeth K. Vishnoi,et al.  Classification with Fairness Constraints: A Meta-Algorithm with Provable Guarantees , 2018, FAT.

[17]  J. Gibbs Elementary Principles in Statistical Mechanics: Developed with Especial Reference to the Rational Foundation of Thermodynamics , 1902 .

[18]  M. H. Wright The interior-point revolution in optimization: History, recent developments, and lasting consequences , 2004 .

[19]  Apurv Jain Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy , 2017, Business Economics.

[20]  Blake Lemoine,et al.  Mitigating Unwanted Biases with Adversarial Learning , 2018, AIES.

[21]  Xiangliang Zhang,et al.  Decision Theory for Discrimination-Aware Classification , 2012, 2012 IEEE 12th International Conference on Data Mining.

[22]  Toniann Pitassi,et al.  Learning Fair Representations , 2013, ICML.

[23]  Nathan Srebro,et al.  Equality of Opportunity in Supervised Learning , 2016, NIPS.

[24]  Carlos Eduardo Scheidegger,et al.  Certifying and Removing Disparate Impact , 2014, KDD.

[25]  Miroslav Dudík Maximum entropy density estimation and modeling geographic distributions of species , 2007 .

[26]  Jimeng Sun,et al.  Generating Multi-label Discrete Patient Records using Generative Adversarial Networks , 2017, MLHC.

[27]  Dan A. Biddle Adverse Impact and Test Validation: A Practitioner's Guide to Valid and Defensible Employment Testing , 2005 .

[28]  Michael Veale,et al.  Slave to the Algorithm? Why a 'Right to an Explanation' Is Probably Not the Remedy You Are Looking For , 2017 .

[29]  Toon Calders,et al.  Classifying without discriminating , 2009, 2009 2nd International Conference on Computer, Control and Communication.

[30]  Constantine Bekas,et al.  BAGAN: Data Augmentation with Balancing GAN , 2018, ArXiv.