Abstraction and simulation for strategic design-space exploration in reconfigurable computing

Recent trends in computing favor device technologies and applications that exploit explicit parallelism to achieve greater performance. As a result, reconfigurable computing (RC) systems, in which highly parallel applications execute on platforms featuring reconfigurable devices such as FPGAs, are becoming an increasingly important option for accelerating applications in high-performance and embedded computing. Unfortunately, the time and difficulty of developing applications for RC platforms are often prohibitive, making it hard to realize the performance gains and power savings that RC can offer. To improve RC productivity, better concepts and tools are needed that allow designers to plan and analyze their designs before coding a specific (and possibly fruitless) implementation, a process we call formulation. The research presented here defines a formal framework and presents a set of techniques to address the need for better RC formulation: (1) a script-based discrete-event simulation framework for rapid analysis of RC systems, (2) an abstract modeling language for conveniently representing RC systems that can be integrated with existing prediction and analysis methods, and (3) an algorithm for automated scheduling and partitioning of applications onto scalable RC platforms. Case studies show that the simulation framework predicts performance across multiple applications and platforms with errors of less than 10%, in a fraction of the time that traditional functional simulators require. The abstract modeling framework is demonstrated by modeling a number of RC systems and serves as an effective interface to the simulation framework. The automated scheduling algorithm for scalable RC systems efficiently provides users with near-optimal partitions and schedules of heterogeneous task graphs mapped to potentially large-scale distributed RC-based clusters.
Combined, these techniques allow RC designers to efficiently model, analyze, and document their designs early in the development process, which is projected to provide large improvements in overall productivity and thus expand the use of RC technologies.
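
To make the idea of script-based discrete-event simulation concrete, the following is a minimal sketch in Python. It is not the framework described in this work. The `Simulator` class, the `simulate_offload` function, and all parameter names are hypothetical and chosen for illustration only. The sketch assumes a simple single-buffered offload model in which each iteration transfers input data to the FPGA, computes, and transfers results back, with no overlap between the three phases.

```python
import heapq


class Simulator:
    """Tiny discrete-event engine: a time-ordered queue of callbacks (hypothetical)."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._seq = 0  # tie-breaker so events with equal timestamps stay ordered

    def schedule(self, delay, action):
        """Register `action` to fire `delay` seconds after the current time."""
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self):
        """Process events in timestamp order and return the final simulated time."""
        while self._queue:
            time, _, action = heapq.heappop(self._queue)
            self.now = time
            action()
        return self.now


def simulate_offload(iterations, bytes_in, bytes_out, bandwidth_bps, cycles, clock_hz):
    """Predict total time for a single-buffered FPGA offload (assumed model)."""
    sim = Simulator()
    t_in = bytes_in / bandwidth_bps    # host-to-FPGA transfer time
    t_comp = cycles / clock_hz         # FPGA compute time per iteration
    t_out = bytes_out / bandwidth_bps  # FPGA-to-host transfer time

    def start_iteration(i):
        if i < iterations:
            sim.schedule(t_in, lambda: compute(i))

    def compute(i):
        sim.schedule(t_comp, lambda: read_back(i))

    def read_back(i):
        sim.schedule(t_out, lambda: start_iteration(i + 1))

    sim.schedule(0.0, lambda: start_iteration(0))
    return sim.run()


if __name__ == "__main__":
    total = simulate_offload(
        iterations=4, bytes_in=1000, bytes_out=1000,
        bandwidth_bps=1000, cycles=200, clock_hz=100,
    )
    print(f"predicted execution time: {total} s")
```

Because the model advances simulated time only at events, a prediction like this costs a handful of queue operations per iteration instead of a cycle-by-cycle functional simulation, which is the general reason such abstractions can be evaluated quickly.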
