Towards Homomorphic Inference Beyond the Edge

Beyond-edge devices can function off the power grid and without batteries, enabling them to operate in difficult-to-access regions. However, the energy-costly long-distance communication required for reporting results or offloading computation becomes a limitation. Here, we reduce this overhead by developing a beyond-edge device that can effectively act as a nearby server for offloading computation. For security reasons, this device must operate on encrypted data, which incurs a high overhead. We use energy-efficient and intermittent-safe in-memory computation to enable this encrypted computation, allowing it to provide a speedup for beyond-edge applications within a power budget of a few milliwatts.
