The hArtes Tool Chain

This chapter describes the design steps needed to go from legacy code to a transformed application that can be mapped efficiently onto the hArtes platform.
