Free Scheduling of Tiles Based on the Transitive Closure of Dependence Graphs

A novel approach to forming the free schedule of tiles, each comprising statement instances of a program loop nest, is presented. Both the formation of valid tiles and free scheduling are based on the transitive closure of loop nest dependence graphs. Under the free schedule, tiles are executed as soon as their operands are available. To describe and implement the approach, loop dependences are represented as tuple relations. The discussed algorithm is implemented in the open source TRACO compiler. Experimental results demonstrating the effectiveness of the introduced algorithm and the speed-up of the parallel programs it produces are discussed.
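
The idea of a free schedule can be illustrated on a small, explicitly enumerated tile dependence graph. The sketch below is not the TRACO algorithm: the paper works with parameterized tuple relations and their transitive closure, whereas this example assumes a finite, acyclic graph given as a list of edges. The function name `free_schedule`, the 2x2 tile grid, and its dependence pattern are all hypothetical and chosen only for illustration. Each tile is assigned the earliest time step at which every tile it depends on has already finished.

```python
from collections import defaultdict


def free_schedule(tiles, deps):
    """Return {tile: time step} under a free (as-soon-as-possible) schedule.

    tiles: iterable of hashable tile identifiers (illustrative assumption)
    deps:  iterable of (src, dst) pairs; dst consumes results of src
    """
    successors = defaultdict(list)
    pending = {t: 0 for t in tiles}  # number of unfinished predecessors
    for src, dst in deps:
        successors[src].append(dst)
        pending[dst] += 1

    schedule = {}
    # Tiles with no predecessors have all operands available at step 0.
    ready = [t for t, count in pending.items() if count == 0]
    step = 0
    while ready:
        next_ready = []
        for t in ready:
            schedule[t] = step
        for t in ready:
            for s in successors[t]:
                pending[s] -= 1
                if pending[s] == 0:  # last operand of s just became available
                    next_ready.append(s)
        ready = next_ready
        step += 1

    if len(schedule) != len(pending):
        raise ValueError("dependence graph contains a cycle")
    return schedule


# Hypothetical 2x2 grid of tiles; each tile depends on its upper and left neighbour.
tiles = [(i, j) for i in range(2) for j in range(2)]
deps = [((0, 0), (0, 1)), ((0, 0), (1, 0)), ((0, 1), (1, 1)), ((1, 0), (1, 1))]
schedule = free_schedule(tiles, deps)
```

In this example tiles `(0, 1)` and `(1, 0)` receive the same time step, so they could run in parallel, while `(1, 1)` must wait for both.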
