Algorithms that remember: model inversion attacks and data protection law

Many individuals are concerned about the governance of machine learning systems and the prevention of algorithmic harms. The EU's recent General Data Protection Regulation (GDPR) has been seen as a core tool for achieving better governance of this area. While the GDPR does apply to the use of models in some limited situations, most of its provisions relate to the governance of personal data, while models have traditionally been seen as intellectual property. We present recent work from the information security literature on ‘model inversion’ and ‘membership inference’ attacks, which indicates that turning training data into machine-learned systems is not a one-way process, and demonstrate how this could lead some models to be legally classified as personal data. Taking this as a probing experiment, we explore the different rights and obligations this would trigger and their utility, and posit future directions for algorithmic governance and regulation. This article is part of the theme issue ‘Governing artificial intelligence: ethical, legal, and technical opportunities and challenges’.
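
To make the ‘not one-way’ claim concrete, the sketch below implements a bare-bones membership inference attack of the confidence-thresholding kind described in this literature (e.g. Shokri et al.): an adversary with only query access to a trained model guesses whether a given record was part of its training data. The dataset, target model, and fixed threshold here are illustrative assumptions for the sketch, not the setup of any particular paper.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Split records into "members" (used for training) and "non-members".
X, y = load_breast_cancer(return_X_y=True)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

# Train a target model that, like many deployed models, overfits its training set.
target = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_in, y_in)

def true_label_confidence(model, X, y):
    # Confidence the model assigns to each record's true label
    # (assumes model.classes_ == [0, 1], as holds for this dataset).
    probs = model.predict_proba(X)
    return probs[np.arange(len(y)), y]

in_conf = true_label_confidence(target, X_in, y_in)
out_conf = true_label_confidence(target, X_out, y_out)

# Attack rule: guess "member" whenever confidence exceeds a threshold.
# 0.9 is an assumed cut-off; published attacks instead learn the decision
# rule, e.g. via shadow models trained to mimic the target.
threshold = 0.9
guesses = np.concatenate([in_conf, out_conf]) > threshold
truth = np.concatenate([np.ones(len(in_conf)), np.zeros(len(out_conf))]).astype(bool)
accuracy = (guesses == truth).mean()
print(f"membership inference accuracy: {accuracy:.2f} (0.50 = chance)")

The gap this exploits is the one the security literature ties to overfitting: a model that has memorized its training records assigns them systematically higher confidence than fresh records, so even this crude threshold typically beats chance. It is in this sense that the model itself still ‘remembers’ information about identifiable individuals in its training data.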
