Performance Analysis Framework for High-Level Language Applications in Reconfigurable Computing

ACM Transactions on Reconfigurable Technology and Systems

High-Level Languages (HLLs) for Field-Programmable Gate Arrays (FPGAs) make reconfigurable computing resources more accessible to application developers by offering familiar, higher-level syntax, semantics, and abstractions, typically enabling faster development than traditional Hardware Description Languages (HDLs). However, programming at a higher level of abstraction is typically accompanied by some loss of performance and by reduced transparency of application behavior, making it difficult to understand and improve application performance. While runtime performance analysis tools are common in development with traditional HLLs for sequential and parallel programming, HLL-based development for FPGAs has an equal or greater need for such tools yet lacks them. This article presents a novel and portable framework for runtime performance analysis of HLL applications for FPGAs, including an automated tool for performance analysis of designs created with Impulse C, a commercial HLL for FPGAs. As a case study, this tool is used to successfully locate performance bottlenecks in a molecular dynamics kernel in order to gain speedup.
