Data and Model Dependencies of Membership Inference Attack

Machine learning (ML) models have been shown to be vulnerable to Membership Inference Attacks (MIA), which infer whether a given data point belongs to the target training dataset by observing the prediction output of the ML model. While the key factors behind the success of MIA are not yet fully understood, existing defense mechanisms such as L2 regularization \cite{10shokri2017membership} and dropout layers \cite{salem2018ml} take only the model's overfitting into consideration. In this paper, we provide an empirical analysis of how both data and model properties affect the vulnerability of ML techniques to MIA. Our results reveal the relationship between MIA accuracy and the properties of the dataset and training model in use. In particular, we show that the size of the shadow dataset, the class and feature balance and the entropy of the target dataset, and the configuration and fairness of the training model are the most influential factors. Based on these experimental findings, we conclude that MIA success is not explained by any single property; rather, multiple properties, alongside model overfitting, contribute jointly. We therefore propose using these data and model properties as regularizers to protect ML models against MIA. Our results show that the proposed defense mechanisms can reduce MIA accuracy by up to 25\% without sacrificing the ML model's prediction utility.
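The abstract does not spell out the exact regularization terms, so the following is only a minimal sketch of the general idea, assuming PyTorch; the function entropy_regularized_loss, the lambda_reg weight, and the choice of prediction entropy as the penalized property are illustrative assumptions, not the authors' formulation. Overconfident (low-entropy) outputs on training points are a well-known membership signal, so one plausible property-based regularizer nudges the model toward higher-entropy predictions:

```python
# Hypothetical sketch: cross-entropy loss plus an entropy-based penalty,
# illustrating "data/model properties as regularizers". Not the paper's
# exact defense; the names and the 0.1 default are assumptions.
import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits, labels, lambda_reg=0.1):
    ce = F.cross_entropy(logits, labels)  # standard classification loss
    probs = F.softmax(logits, dim=1)      # predictive distribution per sample
    # Shannon entropy per sample, averaged over the batch; the small
    # constant guards against log(0).
    entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=1).mean()
    # Subtracting the entropy term means minimizing the loss *raises*
    # output entropy, blunting the confidence gap an MIA adversary exploits.
    return ce - lambda_reg * entropy

# Usage on a toy batch: 8 samples, 10 classes.
logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = entropy_regularized_loss(logits, labels)
```

In practice, lambda_reg would be tuned to trade off MIA resistance against prediction utility, mirroring the paper's claim that MIA accuracy can be reduced without hurting model utility.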

[1] Julia Rubin et al. Fairness Definitions Explained, 2018, 2018 IEEE/ACM International Workshop on Software Fairness (FairWare).

[2] Gerhard Nahler et al. Pearson Correlation Coefficient, 2020, Definitions.

[3] Ananthram Swami et al. The Limitations of Deep Learning in Adversarial Settings, 2015, 2016 IEEE European Symposium on Security and Privacy (EuroS&P).

[4] Somesh Jha et al. Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting, 2017, 2018 IEEE 31st Computer Security Foundations Symposium (CSF).

[5] Vitaly Shmatikov et al. Membership Inference Attacks Against Machine Learning Models, 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[6] Kai Chen et al. Understanding Membership Inferences on Well-Generalized Learning Models, 2018, arXiv.

[7] Toniann Pitassi et al. Fairness through awareness, 2011, ITCS '12.

[8] Douglas A. Reynolds et al. Gaussian Mixture Models, 2018, Encyclopedia of Biometrics.

[9] Robert Laganière et al. Membership Inference Attack against Differentially Private Deep Learning Model, 2018, Trans. Data Priv.

[10] Mario Fritz et al. ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models, 2018, NDSS.

[11] Reuben Binns et al. Fairness in Machine Learning: Lessons from Political Philosophy, 2017, FAT.

[12] Cynthia Dwork et al. Exposed! A Survey of Attacks on Private Data, 2017, Annual Review of Statistics and Its Application.

[13] Jinyuan Jia et al. AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning, 2018, USENIX Security Symposium.

[14] Paul Irolla et al. Demystifying the Membership Inference Attack, 2019, 2019 12th CMI Conference on Cybersecurity and Privacy (CMI).

[15] Ilya Mironov et al. Differentially private recommender systems: building privacy into the net, 2009, KDD.

[16] Pratik Gajane et al. On formalizing fairness in prediction with machine learning, 2017, arXiv.

[17] Emiliano De Cristofaro et al. Knock Knock, Who's There? Membership Inference on Aggregate Location Data, 2017, NDSS.

[18] Emiliano De Cristofaro et al. LOGAN: Membership Inference Attacks Against Generative Models, 2017, Proc. Priv. Enhancing Technol.

[19] Raghav Bhaskar et al. On Inferring Training Data Attributes in Machine Learning Models, 2019, arXiv.

[20] Raef Bassily et al. Differentially Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds, 2014, arXiv:1405.7085.

[21] Kevin Duh et al. Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System?, 2019, TACL.

[22] Dorothea Kolossa et al. Adversarial Attacks Against Automatic Speech Recognition Systems via Psychoacoustic Hiding, 2018, NDSS.

[23] Reza Shokri et al. Machine Learning with Membership Privacy using Adversarial Regularization, 2018, CCS.

[24] Nitish Srivastava et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.

[25] Luca Oneto et al. Fairness in Machine Learning, 2020, INNSBDDL.

[26] Fan Zhang et al. Stealing Machine Learning Models via Prediction APIs, 2016, USENIX Security Symposium.

[27] Matt Fredrikson et al. Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference, 2020, USENIX Security Symposium.

[28] Daniel Bernau et al. Monte Carlo and Reconstruction Membership Inference Attacks against Generative Models, 2019, Proc. Priv. Enhancing Technol.

[29] Ling Liu et al. Towards Demystifying Membership Inference Attacks, 2018, arXiv.

[30] Carmela Troncoso et al. Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning, 2019, arXiv.

[31] Michael Veale et al. Algorithms that remember: model inversion attacks and data protection law, 2018, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[32] Emiliano De Cristofaro et al. LOGAN: Membership Inference Attacks Against Generative Models, 2018.

[33] Mario Fritz et al. GAN-Leaks: A Taxonomy of Membership Inference Attacks against GANs, 2019, arXiv.

[34] Kai Peng et al. SocInf: Membership Inference Attacks on Social Media Health Data With Machine Learning, 2019, IEEE Transactions on Computational Social Systems.

[35] Mario Fritz et al. GAN-Leaks: A Taxonomy of Membership Inference Attacks against Generative Models, 2019, CCS.

[36] Michael Backes et al. Membership Privacy in MicroRNA-based Studies, 2016, CCS.

[37] Kazuyuki Shudo et al. A Framework for Searching a Predictive Model, 2018.

[38] Wenqi Wei et al. Effects of Differential Privacy and Data Skewness on Membership Inference Vulnerability, 2019, 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA).