Towards the Science of Security and Privacy in Machine Learning

Advances in machine learning (ML) in recent years have enabled a dizzying array of applications such as data analytics, autonomous systems, and security diagnostics. ML is now pervasive: new systems and models are being deployed in every domain imaginable, leading to rapid and widespread adoption of software-based inference and decision making. There is growing recognition that ML exposes new vulnerabilities in software systems, yet the technical community's understanding of the nature and extent of these vulnerabilities remains limited. We systematize recent findings on ML security and privacy, focusing on attacks identified against these systems and defenses crafted to date. We articulate a comprehensive threat model for ML and categorize attacks and defenses within an adversarial framework. Key insights from work in both the ML and security communities are identified, and the effectiveness of approaches is related to structural elements of ML algorithms and the data used to train them. We conclude by formally exploring the opposing relationship between model accuracy and resilience to adversarial manipulation. Through these explorations, we show that there are (possibly unavoidable) tensions between model complexity, accuracy, and resilience that must be calibrated for the environments in which they will be used.
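For concreteness, the adversarial manipulation referred to above can be illustrated with a test-time evasion attack. The snippet below is a minimal sketch of the fast gradient sign method, one family of attacks covered by the systematization, assuming a differentiable PyTorch classifier that outputs logits over inputs scaled to [0, 1]; the function and parameter names are illustrative and not taken from the paper.

```python
# Minimal FGSM sketch: perturb an input in the direction that increases the
# classifier's loss, subject to a small per-feature budget epsilon.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.1):
    """Return an adversarially perturbed copy of x (assumes inputs in [0, 1])."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step each feature by epsilon in the sign of the loss gradient,
    # then clip back to the valid input range.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()
```

One way to probe the accuracy/resilience tension discussed above is to compare a model's accuracy on clean inputs against its accuracy on `fgsm_perturb(model, x, y)` across a range of epsilon values.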
