Exploring Connections Between Active Learning and Model Extraction

Machine learning is increasingly used by individuals, research institutions, and corporations. This has led to a surge in Machine Learning-as-a-Service (MLaaS): cloud services that provide (a) tools and resources to learn a model, and (b) a user-friendly query interface to access it. However, MLaaS systems raise privacy concerns, among them model extraction. In a model extraction attack, a dishonest user who interacts with the server only through the query interface extracts (i.e., learns) a good approximation of a sensitive or proprietary model held by the server. This attack was introduced by Tramèr et al. at the 2016 USENIX Security Symposium, where practical attacks against various models were shown. We believe that a better understanding of the efficacy of model extraction attacks is paramount to designing secure MLaaS systems. To that end, we take the first step by (a) formalizing model extraction and discussing possible defense strategies, and (b) drawing parallels between model extraction and the established area of active learning. In particular, we show that recent advancements in active learning can be used to implement powerful model extraction attacks, and we investigate possible defense strategies.
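The parallel between model extraction and active learning can be illustrated with a toy sketch. This is not the paper's method: it assumes the secret model is a one-dimensional threshold classifier on [0, 1] and that the server returns only labels. The names `make_oracle`, `extract_active`, and `extract_passive` are hypothetical and introduced only for this illustration. An attacker who synthesizes queries by binary search halves the set of consistent thresholds with each query, while an attacker who labels random points typically needs many more queries for the same accuracy.

```python
import random


def make_oracle(threshold):
    """Hypothetical MLaaS query interface: returns only the predicted label."""
    stats = {"queries": 0}

    def query(x):
        stats["queries"] += 1
        return 1 if x >= threshold else 0

    return query, stats


def extract_active(query, budget):
    """Query synthesis by binary search: each label halves the interval
    of thresholds that are still consistent with the answers so far."""
    lo, hi = 0.0, 1.0
    for _ in range(budget):
        mid = (lo + hi) / 2
        if query(mid) == 1:
            hi = mid  # threshold is at or below mid
        else:
            lo = mid  # threshold is above mid
    return (lo + hi) / 2


def extract_passive(query, budget, rng):
    """Baseline: label uniformly random points and return the midpoint
    of the interval of thresholds consistent with those labels."""
    lo, hi = 0.0, 1.0
    for _ in range(budget):
        x = rng.random()
        if query(x) == 1:
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return (lo + hi) / 2


if __name__ == "__main__":
    secret = 0.37219  # assumed secret parameter held by the server
    budget = 20

    q_active, _ = make_oracle(secret)
    q_passive, _ = make_oracle(secret)

    est_active = extract_active(q_active, budget)
    est_passive = extract_passive(q_passive, budget, random.Random(0))

    print("active error: ", abs(est_active - secret))
    print("passive error:", abs(est_passive - secret))
```

With a budget of n queries, the binary-search attacker is guaranteed an error of at most 2^-(n+1) in this setting; the random-query baseline has no such guarantee, which is the gap that active learning exploits.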