Machine Unlearning

Once users have shared their data online, it is generally difficult for them to revoke access and ask for the data to be deleted. Machine learning (ML) exacerbates this problem because any model trained on that data may have memorized it, putting users at risk of a successful privacy attack that exposes their information. Yet, having models unlearn is notoriously difficult: after a data point is removed from a training set, one often resorts to retraining downstream models entirely from scratch. We introduce SISA training, a framework that decreases the number of model parameters affected by an unlearning request and caches intermediate outputs of the training algorithm to limit the number of model updates that must be recomputed to have these parameters unlearn. This framework reduces the computational overhead associated with unlearning, even in the worst-case setting where unlearning requests are made uniformly across the training set. In some cases, we may have a prior on the distribution of unlearning requests that users will issue; taking this prior into account to partition and order the data accordingly further decreases the overhead of unlearning. Our evaluation spans two datasets from different application domains, each with a corresponding motivation for unlearning. Under no distributional assumptions, SISA training speeds up unlearning by 3.13x on the Purchase dataset and 1.658x on the SVHN dataset, relative to retraining from scratch. We also validate that knowledge of the unlearning distribution further improves retraining time by simulating a scenario in which unlearning requests come from users of a commercial product available in countries with varying sensitivity to privacy. Our work contributes to practical data governance in machine learning.
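
To make the caching idea concrete, below is a minimal sketch, not the authors' implementation, of how sharded, sliced training with cached checkpoints could support unlearning under the assumptions stated in the comments: data is split across isolated per-shard models trained slice by slice, and an unlearning request retrains only the affected shard, starting from the last checkpoint taken before the slice that contained the removed point. The Model class, its train_on_slice method, and the deep-copy checkpointing are hypothetical placeholders.

```python
# Illustrative sketch only: a stand-in "model" that merely records which
# examples influenced it, so the unlearning bookkeeping can be exercised.
import copy


class Model:
    """Stand-in for any incrementally trainable model."""
    def __init__(self):
        self.seen = []  # records which examples have influenced this model

    def train_on_slice(self, data_slice):
        # Placeholder for an incremental training step (e.g., a few epochs
        # of SGD on this slice, continuing from the current parameters).
        self.seen.extend(data_slice)


def sisa_train(dataset, num_shards=4, num_slices=5):
    """Split data into shards and slices; cache a checkpoint after each slice."""
    shards = [dataset[i::num_shards] for i in range(num_shards)]
    models, checkpoints, shard_slices = [], [], []
    for shard in shards:
        step = max(1, len(shard) // num_slices)
        slices = [shard[j:j + step] for j in range(0, len(shard), step)]
        model, ckpts = Model(), [Model()]        # ckpts[i] = state before slice i
        for s in slices:
            model.train_on_slice(s)
            ckpts.append(copy.deepcopy(model))   # cache intermediate state
        models.append(model)
        checkpoints.append(ckpts)
        shard_slices.append(slices)
    return models, checkpoints, shard_slices


def unlearn(point, models, checkpoints, shard_slices):
    """Remove `point`; retrain only the affected shard from its cached checkpoint."""
    for k, slices in enumerate(shard_slices):
        for i, s in enumerate(slices):
            if point in s:
                s.remove(point)
                model = copy.deepcopy(checkpoints[k][i])  # state before slice i
                new_ckpts = checkpoints[k][:i + 1]
                for later in slices[i:]:                  # redo only slices i..end
                    model.train_on_slice(later)
                    new_ckpts.append(copy.deepcopy(model))
                models[k], checkpoints[k] = model, new_ckpts
                return True
    return False                                          # point was not in the data


if __name__ == "__main__":
    data = list(range(100))
    models, checkpoints, shard_slices = sisa_train(data)
    unlearn(42, models, checkpoints, shard_slices)
    assert all(42 not in m.seen for m in models)
```

Because each per-shard model only ever sees its own partition of the data, the other shards' parameters are untouched by the request, and the cached checkpoints bound the retraining to the slices that followed the removed point.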
