HEAP: A Highly Efficient Adaptive Multi-processor Framework

Writing parallel code is difficult, especially when starting from a sequential reference implementation. Our research efforts, as demonstrated in this paper, face this challenge directly by providing an innovative toolset that helps software developers profile and parallelize an existing sequential implementation, by exploiting top-level pipeline-style parallelism. The innovation of our approach is based on the facts that a) we use both automatic and profiling-driven estimates of the available parallelism, b) we refine those estimates using metric-driven verification techniques, and c) we support dynamic recovery of excessively optimistic parallelization. The proposed toolset has been utilized to find an efficient parallel code organization for a number of real-world representative applications, and a version of the toolset is provided in an open-source manner.

[1]  T. von Eicken,et al.  Parallel programming in Split-C , 1993, Supercomputing '93.

[2]  Thomas J. Ashby,et al.  MPA: Parallelizing an Application onto a Multicore Platform Made Easy , 2009, IEEE Micro.

[3]  Steven W. K. Tjiang,et al.  SUIF: an infrastructure for research on parallelizing and optimizing compilers , 1994, SIGP.

[4]  Stefanos Kaxiras,et al.  SARC Coherence: Scaling Directory Cache Coherence in Performance and Power , 2010, IEEE Micro.

[5]  B. Ramakrishna Rau,et al.  PICO: Automatically Designing Custom Computers , 2002, Computer.

[6]  George C. Necula,et al.  CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs , 2002, CC.

[7]  Andrea C. Arpaci-Dusseau,et al.  Parallel programming in Split-C , 1993, Supercomputing '93. Proceedings.

[8]  Vicki H. Allan,et al.  Software pipelining , 1995, CSUR.

[9]  E. Rijpkema,et al.  Compaan: deriving process networks from Matlab for embedded signal processing architectures , 2000, Proceedings of the Eighth International Workshop on Hardware/Software Codesign. CODES 2000 (IEEE Cat. No.00TH8518).

[10]  Hsien-Hsin S. Lee,et al.  Smart Refresh: An Enhanced Memory Controller Design for Reducing Energy in Conventional and 3D Die-Stacked DRAMs , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[11]  Gilles Kahn,et al.  The Semantics of a Simple Language for Parallel Programming , 1974, IFIP Congress.

[12]  William Thies,et al.  A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[13]  David F. Bacon,et al.  Compiler transformations for high-performance computing , 1994, CSUR.

[14]  Michael L. Scott,et al.  False sharing and its effect on shared memory performance , 1993 .