When Machine Unlearning Jeopardizes Privacy

The right to be forgotten states that a data owner has the right to have her data erased from an entity storing it. In the context of machine learning (ML), it requires the ML model owner to remove the data owner's data from the training set used to build the model, a process known as machine unlearning. While machine unlearning is designed to protect the privacy of the data owner, we argue that it may leave an imprint of the deleted data in the ML model and thus create unintended privacy risks. In this paper, we perform the first study of the unintended information leakage caused by machine unlearning. We propose a novel membership inference attack that leverages the different outputs of the two versions of an ML model, before and after unlearning, to infer whether the deleted sample was part of the original training set. Our experiments over five different datasets demonstrate that the proposed attack achieves strong performance. More importantly, we show that in multiple cases our attack outperforms the classical membership inference attack on the original ML model, which indicates that machine unlearning can have counterproductive effects on privacy. We observe that the privacy degradation is especially significant for well-generalized ML models, where classical membership inference performs poorly. We further investigate two mechanisms to mitigate the newly discovered privacy risks and show that the only effective mechanism is to release the predicted label only. We believe our results can help improve privacy in practical implementations of machine unlearning.
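
To make the attack idea concrete, the sketch below shows one plausible way to realize it in Python with scikit-learn: the adversary queries both the original and the unlearned model on the target sample, aggregates the two posterior vectors into a single feature vector, and feeds it to an attack classifier trained on shadow data. This is a minimal sketch under our own assumptions; the function names (attack_features, run_attack), the random-forest attack classifier, and the concatenation-plus-difference aggregation are illustrative choices, not the paper's exact construction.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def attack_features(post_original, post_unlearned):
    """Combine the two posterior vectors into one attack feature vector.

    One simple aggregation (assumed here for illustration): sort each
    posterior in descending order, then concatenate both sorted vectors
    together with their element-wise difference.
    """
    p_o = np.sort(np.asarray(post_original))[::-1]
    p_u = np.sort(np.asarray(post_unlearned))[::-1]
    return np.concatenate([p_o, p_u, p_o - p_u])

def run_attack(shadow_features, shadow_labels, post_original, post_unlearned):
    """Train the attack model on shadow data, then score the target sample.

    shadow_features: 2D array of attack features built from shadow models.
    shadow_labels:   1 if the corresponding shadow sample was a deleted
                     member of the shadow training set, 0 otherwise.
    Returns an estimated probability that the target (deleted) sample was
    part of the original model's training set.
    """
    attack_clf = RandomForestClassifier(n_estimators=100, random_state=0)
    attack_clf.fit(shadow_features, shadow_labels)
    target = attack_features(post_original, post_unlearned).reshape(1, -1)
    return attack_clf.predict_proba(target)[0, 1]
```

A label-only defense, as discussed in the abstract, would collapse each posterior to a one-hot vector before this aggregation, removing most of the signal the attack relies on.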
