A Variational Approach to Privacy and Fairness

In this article, we propose a new variational approach to learning private and/or fair representations. The approach is based on the Lagrangians of a new formulation of the privacy and fairness optimization problems. In this formulation, we aim to generate representations of the data that keep a prescribed level of the relevant information that is not shared by the private or sensitive data, while minimizing the remaining information they keep. The proposed approach (i) exposes the similarities between the privacy and fairness problems, (ii) allows us to control the trade-off between utility and privacy or fairness through the Lagrange multiplier, and (iii) can be easily incorporated into common representation learning algorithms such as the VAE, the $\beta$-VAE, the VIB, or the nonlinear IB.
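
As a rough sketch of how the formulation described above might be written, the following uses notation that is assumed here and does not come from the text: $X$ is the data, $S$ the private or sensitive variable, $T$ the relevant (task) variable, $Y$ the representation, $r$ the prescribed information level, and $\lambda \geq 0$ the Lagrange multiplier. Assuming the representation depends on $S$ and $T$ only through $X$, the chain rule of mutual information splits what $Y$ retains about the data into three parts:

$$
I(X;Y) = I(S;Y) + I(T;Y \mid S) + I(X;Y \mid S,T).
$$

Under this reading, "the relevant information that is not shared by the private or sensitive data" corresponds to $I(T;Y \mid S)$, and "the remaining information" to the other two terms, which suggests a constrained problem and Lagrangian of the form

$$
\min_{P_{Y \mid X}} \; I(S;Y) + I(X;Y \mid S,T) \quad \text{subject to} \quad I(T;Y \mid S) \geq r,
$$

$$
\mathcal{L}(\lambda) = I(S;Y) + I(X;Y \mid S,T) - \lambda \, I(T;Y \mid S).
$$

In a privacy setting one would take $T = X$, so that $I(X;Y \mid S,T) = 0$ and only $I(S;Y)$ is minimized. This is an illustration of the abstract's description under the stated assumptions, and the exact objective used in the article may differ.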
