Model Extraction Attacks against Recurrent Neural Networks

Model extraction attacks are attacks in which an adversary obtains a new model whose performance is equivalent to that of a target model via query access to the target model, and does so efficiently, i.e., with less data and fewer computational resources than were needed to build the target model. Existing works have dealt only with simple deep neural networks (DNNs), e.g., networks with only three layers, as targets of model extraction attacks, and hence have not considered recurrent neural networks (RNNs), which are effective for time-series data. In this work, we shed light on the threat of model extraction attacks against RNNs. We discuss whether a model with higher accuracy can be extracted with a simple RNN from a long short-term memory (LSTM), a more complicated and powerful RNN. Specifically, we tackle the following problems. First, for classification problems such as image recognition, we present the extraction of an RNN model from an LSTM model without using the final outputs, by utilizing outputs produced partway through the sequence. Next, for regression problems, such as weather forecasting, we present a new attack in which the loss function is newly configured. We conduct experiments on our model extraction attacks against an RNN and an LSTM trained on publicly available academic datasets. We then show that a model with higher accuracy can be extracted efficiently, especially by configuring the loss function and by using an architecture that is more complex than, and different from, that of the target model.
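
The sketch below illustrates, in PyTorch, the generic extraction setup the abstract describes: a simpler student RNN is trained to imitate the per-step outputs of a query-only LSTM target, with the loss configured directly on the target's responses. All class names, dimensions, the MSE loss choice, and the random placeholder queries are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical black-box target: an LSTM regressor the attacker can only query.
class TargetLSTM(nn.Module):
    def __init__(self, in_dim=8, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)          # hidden state at every time step
        return self.head(out)          # per-step predictions (regression)

# Simpler (or differently structured) student RNN the attacker trains to mimic the target.
class StudentRNN(nn.Module):
    def __init__(self, in_dim=8, hidden=32):
        super().__init__()
        self.rnn = nn.RNN(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out)

target = TargetLSTM().eval()           # stands in for query-only API access
student = StudentRNN()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()                 # loss configured on the target's outputs

for step in range(200):
    queries = torch.randn(32, 20, 8)   # placeholder query sequences (batch, time, features)
    with torch.no_grad():
        labels = target(queries)       # responses obtained via query access
    preds = student(queries)
    # Match the target's outputs at every time step, i.e., including outputs
    # produced partway through each sequence, not only the final one.
    loss = loss_fn(preds, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Matching per-step outputs corresponds to exploiting outputs produced partway through the sequence; swapping the MSE loss for a cross-entropy or distillation-style loss would correspond to the classification variant.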
