Generating Efficient Context-Switch Capable Circuits through Autonomous Design Flow

Commercial off-the-shelf (COTS) Field-Programmable Gate Arrays (FPGAs) are becoming increasingly powerful. In addition to offering huge hardware resources, they are now integrated into complete systems-on-chip (SoCs), e.g., in the latest Xilinx Zynq or Altera Stratix platforms. However, cooperation between FPGAs and their surroundings, as well as the flexibility of hardware task management, could still be improved. For instance, the mechanisms needed to support multi-user approaches have yet to be automated. A reconfigurable resource can be shared between applications or users only if it has a context-switch ability, which allows applications to be paused and resumed in response to system demands. Here, we present a high-level synthesis (HLS) design flow that produces a context-switch-capable circuit. The design flow manipulates the intermediate representation of an HLS tool to build the context extraction mechanism and to optimize the performance of the circuit produced. The method is based on efficient checkpoint selection and on the insertion of a powerful scan-chain into the initial circuit. This scan-chain can extract the content of flip-flops or memories. Experiments on the circuits produced show a low hardware overhead for many benchmark applications, and the added hardware has a negligible impact on application performance. Comparisons with current standard methods highlight the efficiency of our contributions.
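
As a rough illustration of the scan-chain idea mentioned above, the following Python sketch models a set of state bits linked into one serial shift path, so that the context can be shifted out when a task is paused and shifted back in when it resumes. It is a behavioural toy model only, not the circuit generated by the design flow; the names `ScanChain`, `extract_context` and `restore_context` are hypothetical and chosen for this example.

```python
from typing import List


class ScanChain:
    """Toy model (assumption): state bits linked into one serial shift path."""

    def __init__(self, values: List[int]) -> None:
        self.regs = list(values)

    def shift(self, scan_in: int) -> int:
        """One scan clock: the last bit leaves, scan_in enters at the front."""
        scan_out = self.regs[-1]
        self.regs = [scan_in] + self.regs[:-1]
        return scan_out


def extract_context(chain: ScanChain) -> List[int]:
    """Shift the whole state out serially (the chain is left holding zeros)."""
    return [chain.shift(0) for _ in range(len(chain.regs))]


def restore_context(chain: ScanChain, context: List[int]) -> None:
    """Shift a previously extracted context back in, in the same order."""
    for bit in context:
        chain.shift(bit)


# Pause a task, save its state, then resume it later.
chain = ScanChain([1, 0, 1, 1])
saved = extract_context(chain)   # state leaves the circuit bit by bit
restore_context(chain, saved)    # state is shifted back in on resume
```

In this model, extraction and restoration each take one scan clock per state bit, which is why the number of bits that must be saved at a switch point matters for context-switch latency.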
