Can You Fake It Until You Make It?: Impacts of Differentially Private Synthetic Data on Downstream Classification Fairness

The recent adoption of machine learning models in high-risk settings such as medicine has increased demand for developments in privacy and fairness. Rebalancing skewed datasets with synthetic data generated by generative adversarial networks (GANs) has shown potential to mitigate disparate impact on minoritized subgroups. However, such generative models are vulnerable to privacy attacks that can expose sensitive records from the training dataset. Differential privacy (DP) is the current leading approach to privacy-preserving machine learning, and differentially private GANs (DP GANs) are often proposed as a way to improve model fairness while protecting sensitive training data. We investigate the impact of using synthetic images from DP GANs on the utility and fairness of downstream classification models. We demonstrate that existing DP GANs cannot simultaneously maintain utility, privacy, and fairness: images generated by GANs trained with DP suffer severe losses in quality and utility, which in turn degrade downstream classification performance. Our evaluation highlights the friction between privacy, fairness, and utility, and shows how it translates into real losses of performance and representation in common machine learning settings. Our results indicate that substantial improvements to the utility and fairness of DP generative models are needed before they can serve as a remedy for privacy and fairness issues stemming from a lack of diversity in the training dataset.
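
The downstream evaluation described above reduces to a short pipeline: train a classifier on a subgroup-skewed dataset, measure the gap in true-positive rates between subgroups (the equal-opportunity gap), then rebalance the minority subgroup with synthetic samples and re-measure. The sketch below illustrates that pipeline under heavy simplification and is not the paper's method: all names are hypothetical, Gaussian tabular features stand in for images, and draws from the true minority distribution stand in for GAN output (a best case that a DP GAN, with its degraded sample quality, would not reach).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
w_true = rng.normal(size=8)  # hypothetical ground-truth feature weights

def sample_group(n, shift):
    """Draw features/labels for one subgroup; `shift` models covariate skew."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 8))
    logits = X @ w_true - shift * w_true.sum() + rng.normal(scale=0.5, size=n)
    return X, (logits > 0).astype(int)

def tpr(clf, X, y):
    """True-positive rate (recall), the per-group quantity behind the
    equal-opportunity gap."""
    return (clf.predict(X[y == 1]) == 1).mean()

def tpr_gap(X, y, test_sets):
    """Fit a classifier and return the absolute TPR gap between subgroups."""
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    (Xa, ya), (Xb, yb) = test_sets
    return abs(tpr(clf, Xa, ya) - tpr(clf, Xb, yb))

# Skewed training data: majority group A dwarfs minority group B.
Xa, ya = sample_group(5000, shift=0.0)
Xb, yb = sample_group(250, shift=1.5)
test = [sample_group(2000, shift=0.0), sample_group(2000, shift=1.5)]

print("TPR gap, skewed:     %.3f" % tpr_gap(
    np.vstack([Xa, Xb]), np.concatenate([ya, yb]), test))

# Rebalance with "synthetic" minority samples. Here they come from the true
# group-B distribution (a best-case, non-private generator); in the paper's
# setting a DP GAN supplies them, and its quality loss erodes this benefit.
Xs, ys = sample_group(4750, shift=1.5)
print("TPR gap, rebalanced: %.3f" % tpr_gap(
    np.vstack([Xa, Xb, Xs]), np.concatenate([ya, yb, ys]), test))
```

The single linear classifier cannot fit both subgroups' shifted decision boundaries at once, so on the skewed data it favors the majority group; adding minority samples moves the boundary toward a compromise and shrinks the gap, which is exactly the benefit that DP-induced degradation of synthetic samples puts at risk.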
