Privacy Side Channels in Machine Learning Systems

Most current approaches for protecting privacy in machine learning (ML) assume that models exist in a vacuum. In reality, ML models are part of larger systems that include components for training data filtering, output monitoring, and more. In this work, we introduce privacy side channels: attacks that exploit these system-level components to extract private information at far higher rates than is possible for standalone models. We propose four categories of side channels that span the entire ML lifecycle (training data filtering, input preprocessing, output post-processing, and query filtering) and enable both enhanced membership inference attacks and novel threats such as extracting users' test queries. For example, we show that deduplicating training data before applying differentially private training creates a side channel that completely invalidates any provable privacy guarantees. We further show that systems which block language models from regenerating training data can be exploited to reconstruct private keys contained in the training set, even if the model never memorized these keys. Taken together, our results demonstrate the need for a holistic, end-to-end privacy analysis of machine learning.
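To make the second example concrete, the sketch below illustrates, under strong simplifying assumptions, how a filter that blocks verbatim regeneration of training data can act as a membership oracle and be turned into a character-by-character secret extractor. The `query` callable, the `[BLOCKED]` sentinel, the base64-style key alphabet, and the assumption that the attacker already knows a prefix long enough to trigger the filter are hypothetical choices for illustration, not the exact setup studied in the paper.

```python
# Minimal sketch (assumptions noted above): a system that refuses to emit text
# matching its training set verbatim leaks membership, which an attacker can
# chain into full reconstruction of a secret string.
import string
from typing import Callable

BLOCK_TOKEN = "[BLOCKED]"  # assumed sentinel the system returns when the filter fires
ALPHABET = string.ascii_letters + string.digits + "+/="  # assumed key character set


def filter_fires(prompt: str, query: Callable[[str], str]) -> bool:
    """Membership oracle: the regeneration filter only triggers when the
    requested continuation would reproduce a training-set substring."""
    return BLOCK_TOKEN in query(f"Repeat exactly: {prompt}")


def extract_key(known_prefix: str, key_length: int,
                query: Callable[[str], str]) -> str:
    """Recover a secret one character at a time. Any extension that still
    trips the filter must itself appear in the training set, so the secret
    is reconstructed even if the model never memorized it."""
    secret = known_prefix
    while len(secret) < key_length:
        for c in ALPHABET:
            if filter_fires(secret + c, query):
                secret += c  # this candidate extends the training-set match
                break
        else:
            break  # no candidate extends the match; stop early
    return secret
```

The key design point is that the attack queries only the deployed system's input/output interface: the privacy leak comes from the output filter, not from the model weights, which is why model-centric defenses such as differential privacy do not prevent it.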
