SEALing Neural Network Models in Secure Deep Learning Accelerators

Deep learning (DL) accelerators are increasingly deployed on edge devices to support fast local inference. However, they face a new security problem: vulnerability to attacks based on physical access. An adversary can easily obtain the entire neural network (NN) model by physically snooping the GDDR memory bus that connects the accelerator chip to DRAM. Memory encryption therefore becomes important for protecting NN models on edge DL accelerators. Nevertheless, we observe that traditional memory encryption solutions, which have been used efficiently in CPU systems, cause significant performance degradation when applied directly to DL accelerators. The main reason is the large bandwidth gap between the GDDR memory bus and the encryption engine. To address this problem, this paper proposes SEAL, a Secure and Efficient Accelerator scheme for deep Learning. SEAL improves the performance of an encrypted DL accelerator in two ways: by improving the data access bandwidth and by improving the efficiency of memory encryption. Specifically, to improve the data access bandwidth, SEAL uses a criticality-aware smart encryption scheme that identifies the portion of data that has no impact on the security of NN models and allows it to bypass the encryption engine, thus reducing the amount of data to be encrypted. To improve the efficiency of memory encryption, SEAL uses a colocation mode encryption scheme that co-locates data with the counters used for encryption, eliminating the separate memory accesses for those counters. Our experimental results demonstrate that, compared with traditional memory encryption solutions, SEAL achieves a 1.4× to 1.6× IPC improvement and reduces inference latency by 39% to 60%. Compared with a baseline accelerator without memory encryption, SEAL sacrifices only 5% to 7% of IPC in exchange for a significant security improvement.
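
The bandwidth-gap argument can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes a fully pipelined system in which sustained throughput is limited either by the memory bus or by the encryption engine, which only has to process the fraction of traffic that is actually encrypted. The function name, parameters, and all numbers are hypothetical and chosen only for illustration.

```python
def effective_bandwidth(bus_gbps, engine_gbps, encrypted_fraction):
    """Toy estimate of sustained data bandwidth (hypothetical model).

    bus_gbps           -- peak memory bus bandwidth (assumed value)
    engine_gbps        -- peak encryption engine throughput (assumed value)
    encrypted_fraction -- share of memory traffic routed through the engine
    """
    if not 0.0 <= encrypted_fraction <= 1.0:
        raise ValueError("encrypted_fraction must be between 0 and 1")
    if encrypted_fraction == 0.0:
        # Nothing is encrypted, so only the bus limits throughput.
        return bus_gbps
    # The engine sees only the encrypted share of the traffic, so it can
    # sustain engine_gbps / encrypted_fraction of total traffic.
    return min(bus_gbps, engine_gbps / encrypted_fraction)


# Made-up numbers: a bus four times faster than the encryption engine.
full = effective_bandwidth(320, 80, 1.0)      # every byte is encrypted
partial = effective_bandwidth(320, 80, 0.5)   # half the data bypasses the engine
```

Under this simplified model, encrypting every byte caps throughput at the engine rate, while letting part of the data bypass the engine raises the cap proportionally until the bus itself becomes the limit. Real accelerators add latency, counter traffic, and scheduling effects that the model ignores.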
