The ParaPhrase Project: Parallel Patterns for Adaptive Heterogeneous Multicore Systems

This paper describes ParaPhrase, a new 3-year targeted research project funded under EU Framework 7 Objective 3.4 (Computer Systems), starting in October 2011. ParaPhrase aims to take a new approach to introducing parallelism, one that couples advanced refactoring techniques with high-level parallel design patterns. The refactoring tools will use these patterns to restructure programs, defined as networks of software components, into forms that are better suited to parallel execution. High-level cost information, integrated into the refactoring tools, will help the programmer choose among candidate restructurings. The patterns themselves will be implemented using the well-understood algorithmic skeleton approach in order to achieve good parallelism.
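
As an illustration only, the kind of restructuring described above can be sketched in Python. This is not ParaPhrase tooling: the names `compose`, `farm`, `run_sequential`, `run_farmed`, and `estimated_farm_time` are hypothetical, the "components" are plain functions, and the cost model is a deliberately crude assumption. The sketch shows a chain of components applied sequentially, the same chain rewritten to use a task-farm skeleton, and a toy cost estimate of the sort a refactoring tool might show the programmer.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def compose(*stages):
    """Chain components: the output of each stage feeds the next one."""
    return lambda x: reduce(lambda acc, stage: stage(acc), stages, x)


def run_sequential(stages, inputs):
    """Original form: every input passes through the chain one at a time."""
    worker = compose(*stages)
    return [worker(x) for x in inputs]


def farm(worker, inputs, n_workers=4):
    """Task-farm skeleton: apply `worker` to independent inputs concurrently.

    Executor.map returns results in input order, so the refactored
    program yields the same list as the sequential one.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, inputs))


def run_farmed(stages, inputs, n_workers=4):
    """Refactored form: the same chain, instantiated inside a farm."""
    return farm(compose(*stages), inputs, n_workers)


def estimated_farm_time(n_tasks, task_cost, n_workers, overhead=0.0):
    """Toy cost model (an assumption, not taken from the paper):
    tasks run in rounds of `n_workers`, plus a fixed overhead."""
    rounds = -(-n_tasks // n_workers)  # ceiling division
    return rounds * task_cost + overhead


if __name__ == "__main__":
    stages = [lambda x: x + 1, lambda x: x * 2]
    assert run_sequential(stages, range(5)) == run_farmed(stages, range(5))
    print(estimated_farm_time(n_tasks=10, task_cost=2.0, n_workers=4))
```

A thread pool is used here only to keep the sketch self-contained. In CPython, CPU-bound components would need a process pool or a native skeleton library to obtain a real speedup, and the refactoring is only valid when the inputs can be processed independently.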
