DP-SGD vs PATE: Which Has Less Disparate Impact on Model Accuracy?

Recent work on differentially private deep learning has shown that applying differential privacy, specifically the DP-SGD algorithm, has a disparate impact on different sub-groups of the population: model utility drops significantly more for under-represented subpopulations (minorities) than for well-represented ones. In this work, we compare PATE, another mechanism for training deep learning models with differential privacy, against DP-SGD in terms of fairness. We show that PATE also has a disparate impact, but that it is much less severe than that of DP-SGD. From this observation we draw insights into promising directions for achieving better fairness-privacy trade-offs.
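
Since the abstract contrasts the two mechanisms only at a high level, a minimal sketch of where each one injects noise may help. The function and parameter names below are illustrative assumptions, not code from the paper: the DP-SGD step follows Abadi et al.'s recipe (per-example gradient clipping plus Gaussian noise), and the PATE aggregator follows Papernot et al.'s noisy-max vote over a teacher ensemble.

```python
import numpy as np

def dpsgd_noisy_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD update direction: clip each per-example gradient to
    `clip_norm`, sum, add Gaussian noise scaled by the clipping norm,
    and average over the batch."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    summed = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)

def pate_noisy_vote(teacher_votes, num_classes, noise_scale=1.0, rng=None):
    """PATE label aggregation for one student query: add Laplace noise to
    the histogram of teacher votes and return the noisy argmax label."""
    rng = rng or np.random.default_rng()
    counts = np.bincount(np.asarray(teacher_votes), minlength=num_classes).astype(float)
    counts += rng.laplace(0.0, noise_scale, size=num_classes)
    return int(np.argmax(counts))
```

The design difference is the point of the comparison: DP-SGD perturbs every gradient step, so rare (minority) examples whose gradients are clipped and drowned in noise lose the most signal, whereas PATE perturbs only the aggregated teacher votes used to label the student's public data.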
