The Polyhedral Model Is More Widely Applicable Than You Think

The polyhedral model is a powerful framework for automatic optimization and parallelization. It is based on an algebraic representation of programs that makes it possible to construct and search for complex sequences of optimizations. The model is now mature and has reached production compilers. Its main limitation is known to be its restriction to statically predictable, loop-based program parts. This paper removes that limitation, enabling the model to operate on general, data-dependent control flow. We embed control and exit predicates as first-class citizens of the algebraic representation, from program analysis to code generation. Complementing previous (partial) attempts in this direction, our work concentrates on extending the code generation step and does not compromise the expressiveness of the model. We present experimental evidence that our extension is relevant for program optimization and parallelization, showing performance improvements on benchmarks that were thought to be out of reach of the polyhedral model.
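
To make the restriction and its removal concrete, here is a minimal, hypothetical C kernel (not taken from the paper; the function name and arrays are illustrative assumptions). The while loop's exit condition and the guard on B[j] depend on runtime data, so a classical polyhedral compiler, which requires affine loop bounds and conditionals, would exclude the whole nest from its scope. With exit and control predicates attached to the statements, as the abstract describes, such a nest can still be represented and transformed; the code below only sketches the kind of input the extension targets, not the paper's internal encoding.

    /* Illustrative example only: a loop nest with data-dependent control flow.
     * The while condition is a data-dependent exit predicate; the if on B[j]
     * is a data-dependent control predicate guarding the update statement. */
    void scale_rows(int n, int m, double A[n][m], const double B[m],
                    double threshold)
    {
        for (int i = 0; i < n; ++i) {
            int j = 0;
            /* Exit predicate: termination depends on runtime values of A. */
            while (j < m && A[i][j] > threshold) {
                /* Control predicate: statement executes only when B[j] != 0. */
                if (B[j] != 0.0)
                    A[i][j] /= B[j];
                ++j;
            }
        }
    }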
