HERCULES: A Pattern Driven Code Transformation System

New parallel computers are emerging, but developing efficient scientific code for them remains difficult. A scientist must manage not only the complexity of the science domain but also the complexity of performance optimization. HERCULES is a code transformation system designed to help the scientist separate these two concerns, which improves code maintainability and facilitates performance optimization. The system combines three technologies (code patterns, transformation scripts, and compiler plugins) to provide an environment in which the scientist can quickly implement code transformations that suit their needs. Unlike existing code optimization tools, HERCULES focuses on user-level accessibility. In this paper, we discuss the design, implementation, and initial evaluation of HERCULES.
