SoK: Machine Learning Governance

The application of machine learning (ML) in computer systems brings society not only many benefits but also risks. In this paper, we develop the concept of ML governance to balance these benefits and risks, with the aim of achieving responsible applications of ML. Our approach first systematizes research on ascertaining ownership of data and models, thus fostering a notion of identity specific to ML systems. Building on this foundation, we use identities to hold principals accountable for failures of ML systems through both attribution and auditing. To increase trust in ML systems, we then survey techniques for developing assurance, i.e., confidence that the system meets its security requirements and does not exhibit certain known failures. This leads us to highlight the need for techniques that allow a model owner to manage the life cycle of their system, e.g., to patch or retire it. Taken together, our systematization of knowledge standardizes the interactions between principals involved in the deployment of ML throughout its life cycle. We highlight opportunities for future work, e.g., to formalize the resulting game between ML principals.
