GuardNN: Secure DNN Accelerator for Privacy-Preserving Deep Learning

This paper proposes GuardNN, a secure deep neural network (DNN) accelerator, which provides strong hardware-based protection for user data and model parameters even in an untrusted environment. GuardNN shows that the architecture and protection can be customized for a specific application to provide strong confidentiality and integrity protection with negligible overhead. The design of the GuardNN instruction set reduces the TCB to just the accelerator and enables confidentiality protection without the overhead of integrity protection. GuardNN also introduces a new application-specific memory protection scheme to minimize the overhead of memory encryption and integrity verification. The scheme shows that most of the off-chip meta-data in today's state-of-the-art memory protection can be removed by exploiting the known memory access patterns of a DNN accelerator. GuardNN is implemented as an FPGA prototype, which demonstrates effective protection with less than 2% performance overhead for inference over a variety of modern DNN models.

[1]  Hongbin Zheng,et al.  Optimizing Memory-Access Patterns for Deep Learning Accelerators , 2020, ArXiv.

[2]  Benjamin C. Lee,et al.  PoisonIvy: Safe speculation for secure memory , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[3]  G. Edward Suh,et al.  AEGIS: architecture for tamper-evident and tamper-resistant processing , 2003, ICS.

[4]  Srinivas Devadas,et al.  A Low-Latency, Low-Area Hardware Oblivious RAM Controller , 2015, 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines.

[5]  Yao Chen,et al.  High Level Synthesis of Complex Applications: An H.264 Video Decoder , 2016, FPGA.

[6]  Brian Rogers,et al.  Using Address Independent Seed Encryption and Bonsai Merkle Trees to Make Secure Processors OS- and Performance-Friendly , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[7]  William J. Dally,et al.  Darwin: A Genomics Co-processor Provides up to 15,000X Acceleration on Long Read Assembly , 2018, USENIX Annual Technical Conference.

[8]  Charanjit S. Jutla,et al.  Parallelizable Authentication Trees , 2005, IACR Cryptol. ePrint Arch..

[9]  Raluca Ada Popa,et al.  Delphi: A Cryptographic Inference System for Neural Networks , 2020, IACR Cryptol. ePrint Arch..

[10]  Morris J. Dworkin,et al.  SP 800-38D. Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC , 2007 .

[11]  Zheng Zhang,et al.  MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.

[12]  Bruce Jacob,et al.  DRAMSim2: A Cycle Accurate Memory System Simulator , 2011, IEEE Computer Architecture Letters.

[13]  Carole-Jean Wu,et al.  The Architectural Implications of Facebook's DNN-Based Personalized Recommendation , 2019, 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[14]  Yao Lu,et al.  Oblivious Neural Network Predictions via MiniONN Transformations , 2017, IACR Cryptol. ePrint Arch..

[15]  Brian Rogers,et al.  SecureME: a hardware-software approach to full system security , 2011, ICS '11.

[16]  Carlos V. Rozas,et al.  Intel® Software Guard Extensions (Intel® SGX) Support for Dynamic Memory Management Inside an Enclave , 2016, HASP 2016.

[17]  Gu-Yeon Wei,et al.  Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[18]  Yu Wang,et al.  ForeGraph: Exploring Large-scale Graph Processing on Multi-FPGA Architecture , 2017, FPGA.

[19]  Hsien-Hsin S. Lee,et al.  High efficiency counter mode security architecture via prediction and precomputation , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[20]  Lionel Torres,et al.  TEC-Tree: A Low-Cost, Parallelizable Tree for Efficient Defense Against Memory Replay Attacks , 2007, CHES.

[21]  Rodrigo Bruno,et al.  Graviton: Trusted Execution Environments on GPUs , 2018, OSDI.

[22]  James C. Hoe,et al.  GraphGen: An FPGA Framework for Vertex-Centric Graph Computation , 2014, 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines.

[23]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[24]  Shweta Shinde,et al.  Privado: Practical and Secure DNN Inference , 2018, ArXiv.

[25]  Natalie D. Enright Jerger,et al.  Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[26]  Srinivas Devadas,et al.  A secure processor architecture for encrypted computation on untrusted programs , 2012, STC '12.

[27]  Dan Boneh,et al.  Architectural support for copy and tamper resistant software , 2000, SIGP.

[28]  David A. Patterson,et al.  In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).

[29]  Song Han,et al.  EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[30]  Jun Yang,et al.  Fast Secure Processor for Inhibiting Software Piracy and Tampering , 2003, MICRO.

[31]  Nael B. Abu-Ghazaleh,et al.  Iso-X: A Flexible Architecture for Hardware-Managed Isolated Execution , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.

[32]  Weiwei Shan,et al.  A 923 Gbps/W, 113-Cycle, 2-Sbox Energy-efficient AES Accelerator in 28nm CMOS , 2019, 2019 Symposium on VLSI Circuits.

[33]  Rafail Ostrovsky,et al.  Software protection and simulation on oblivious RAMs , 1996, JACM.

[34]  Zhiru Zhang,et al.  Reverse Engineering Convolutional Neural Networks Through Side-channel Information Leaks , 2018, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).

[35]  Bo Luo,et al.  I Know What You See: Power Side-Channel Attack on Convolutional Neural Network Accelerators , 2018, ACSAC.

[36]  Srinivas Devadas,et al.  MI6: Secure Enclaves in a Speculative Out-of-Order Processor , 2018, MICRO.

[37]  Shay Gueron,et al.  Memory Encryption for General-Purpose Processors , 2016, IEEE Security & Privacy.

[38]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[39]  Xiaosong Ma,et al.  Exploiting Locality in Graph Analytics through Hardware-Accelerated Traversal Scheduling , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[40]  T. Alves,et al.  TrustZone : Integrated Hardware and Software Security , 2004 .

[41]  Rajesh K. Gupta,et al.  SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).

[42]  Jaehyuk Huh,et al.  Reducing the Memory Bandwidth Overheads of Hardware Security Support for Multi-Core Processors , 2016, IEEE Transactions on Computers.

[43]  G. Edward Suh,et al.  Caches and hash trees for efficient memory integrity verification , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..

[44]  Brian Rogers,et al.  Improving Cost, Performance, and Security of Memory Encryption and Authentication , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).

[45]  Hari Angepat,et al.  Serving DNNs in Real Time at Datacenter Scale with Project Brainwave , 2018, IEEE Micro.

[46]  Stephen Taylor,et al.  Memory encryption , 2014, ACM Comput. Surv..

[47]  Lejla Batina,et al.  CSI NN: Reverse Engineering of Neural Network Architectures Through Electromagnetic Side Channel , 2019, USENIX Security Symposium.

[48]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[49]  Srinivas Devadas,et al.  Sanctum: Minimal Hardware Extensions for Strong Software Isolation , 2016, USENIX Security Symposium.

[50]  Anantha Chandrakasan,et al.  Gazelle: A Low Latency Framework for Secure Neural Network Inference , 2018, IACR Cryptol. ePrint Arch..

[51]  Kunle Olukotun,et al.  GraphOps: A Dataflow Library for Graph Analytics Acceleration , 2016, FPGA.

[52]  Ruby B. Lee,et al.  Architecture for protecting critical secrets in microprocessors , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[53]  Dan Boneh,et al.  Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware , 2018, ICLR.

[54]  Michael Naehrig,et al.  CryptoNets: applying neural networks to encrypted data with high throughput and accuracy , 2016, ICML 2016.

[55]  Jose Joao,et al.  Morphable Counters: Enabling Compact Integrity Trees For Low-Overhead Secure Memories , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[56]  Elaine Shi,et al.  Path ORAM: an extremely simple oblivious RAM protocol , 2012, CCS.

[57]  Peter G. Neumann,et al.  CHERI: A Hybrid Capability-System Architecture for Scalable Software Compartmentalization , 2015, 2015 IEEE Symposium on Security and Privacy.

[58]  Rajeev Balasubramonian,et al.  VAULT: Reducing Paging Overheads in SGX with Efficient Integrity Verification Structures , 2018, ASPLOS.

[59]  Dawn Song,et al.  Keystone: an open framework for architecting trusted execution environments , 2020, EuroSys.

[60]  Josep Torrellas,et al.  Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures , 2018, USENIX Security Symposium.

[61]  Natalie D. Enright Jerger,et al.  Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[62]  Vivienne Sze,et al.  Efficient Processing of Deep Neural Networks: A Tutorial and Survey , 2017, Proceedings of the IEEE.

[63]  G. Edward Suh,et al.  Efficient Memory Integrity Verification and Encryption for Secure Processors , 2003, MICRO.

[64]  Ruby B. Lee,et al.  Scalable architectural support for trusted software , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.

[65]  Zhiru Zhang,et al.  Channel Gating Neural Networks , 2018, NeurIPS.

[66]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[67]  Zhiru Zhang,et al.  Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating , 2019, MICRO.