Low Overhead Online Data Flow Tracking for Intermittently Powered Non-Volatile FPGAs

Energy harvesting is an attractive way to power future Internet of Things (IoT) devices because it can eliminate the need for batteries or power cables. However, harvested energy is intrinsically unstable. Although field-programmable gate arrays (FPGAs) have been widely adopted in embedded systems, they struggle to survive unstable power because all of their on-chip memory components are based on volatile static random-access memory (SRAM). Emerging non-volatile memory-based FPGAs show promise for retaining configuration data on the chip during power outages, but few works have considered how to checkpoint runtime intermediate data efficiently on non-volatile FPGAs. To realize accumulative computation under intermittent power on FPGAs, this article proposes a low-cost design framework, Data-Flow-Tracking FPGA (DFT-FPGA), which uses binary counters to track the flow of intermediate data. Instead of keeping all on-chip intermediate data, DFT-FPGA targets only the necessary data, which is labeled by off-line analysis and identified by an online tracking system. The evaluation shows that, compared with state-of-the-art techniques, DFT-FPGA realizes accumulative computing with less off-line workload and significantly reduces online roll-back time and resource utilization.
