Privacy for All: Demystify Vulnerability Disparity of Differential Privacy against Membership Inference Attack

Machine learning algorithms, when applied to sensitive data, pose a potential threat to privacy. A growing body of prior work has demonstrated that membership inference attacks (MIA) can disclose specific private information about the training data to an attacker. Meanwhile, the algorithmic fairness of machine learning has drawn increasing attention from both academia and industry. Algorithmic fairness ensures that machine learning models do not discriminate against particular demographic groups of individuals (e.g., Black people or women). Given that MIA is itself a learning model, a serious concern arises as to whether MIA treats all groups of individuals "fairly", or, in other words, whether a particular group is more vulnerable to MIA than the others. This paper examines the algorithmic fairness issue in the context of MIA and its defenses. First, for fairness evaluation, it formalizes the notion of vulnerability disparity (VD) to quantify the difference in MIA treatment across demographic groups. Second, it evaluates VD on four real-world datasets and shows that VD indeed exists in these datasets. Third, it examines the impact of differential privacy (DP), as a defense mechanism against MIA, on VD. The results show that although DP brings significant changes to VD, it cannot eliminate VD completely. Therefore, fourth, it designs a new mitigation algorithm named FAIRPICK to reduce VD. An extensive set of experimental results demonstrates that FAIRPICK effectively reduces VD both with and without DP deployment.
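To make the VD notion concrete, below is a minimal sketch of how per-group MIA vulnerability and the resulting disparity could be computed. It assumes VD is measured as the gap in attack accuracy between the most and least vulnerable demographic groups; this is an illustrative choice rather than the paper's exact definition, and the function names (group_vulnerability, vulnerability_disparity) and synthetic data are hypothetical.

```python
import numpy as np

def group_vulnerability(attack_preds, membership, groups):
    """Per-group MIA vulnerability, measured here as attack accuracy.

    attack_preds : 0/1 membership guesses produced by the attack model
    membership   : 0/1 ground-truth member / non-member labels
    groups       : demographic group label of each record
    """
    vuln = {}
    for g in np.unique(groups):
        mask = groups == g
        vuln[g] = np.mean(attack_preds[mask] == membership[mask])
    return vuln

def vulnerability_disparity(vuln):
    """VD sketch: gap between the most and least vulnerable groups."""
    scores = list(vuln.values())
    return max(scores) - min(scores)

# Toy usage with synthetic data (for illustration only)
rng = np.random.default_rng(0)
n = 1000
membership = rng.integers(0, 2, n)        # true member / non-member labels
groups = rng.choice(["A", "B"], n)        # demographic group of each record
# Simulate an attack that is more accurate on group B than on group A
flip_rate = np.where(groups == "B", 0.1, 0.3)
attack_preds = np.where(rng.random(n) < flip_rate, 1 - membership, membership)

vuln = group_vulnerability(attack_preds, membership, groups)
print(vuln, vulnerability_disparity(vuln))
```

Under this assumed definition, a VD of zero would mean the attack succeeds equally often on every group, and the mitigation goal (with or without DP) is to drive the printed disparity toward zero.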
