Tiling arbitrarily nested loops by means of the transitive
[1] Jacek Blaszczyk,et al. Object Library of Algorithms for Dynamic Optimization Problems: Benchmarking SQP and Nonlinear Interior Point Methods , 2007, Int. J. Appl. Math. Comput. Sci..
[2] Wlodzimierz Bielecki,et al. Using Basis Dependence Distance Vectors to Calculate the Transitive Closure of Dependence Relations by Means of the Floyd-Warshall Algorithm , 2013, COCOA.
[3] Albert Cohen,et al. Coarse-Grained Loop Parallelization: Iteration Space Slicing vs Affine Transformations , 2009, ISPDC.
[4] Jim Jeffers,et al. High Performance Parallelism Pearls Volume Two: Multicore and Many-core Programming Approaches , 2015 .
[5] Sanjay V. Rajopadhye,et al. Optimal semi-oblique tiling , 2001, SPAA '01.
[6] Jingling Xue,et al. Loop Tiling for Parallelism , 2000, Kluwer International Series in Engineering and Computer Science.
[7] P. Feautrier. Some Efficient Solutions to the Affine Scheduling Problem Part II Multidimensional Time , 1992 .
[8] Uday Bondhugula,et al. A practical automatic polyhedral parallelizer and locality optimizer , 2008, PLDI '08.
[9] Anna Beletska,et al. An Iterative Algorithm of Computing the Transitive Closure of a Union of Parameterized Affine Integer Tuple Relations , 2010, COCOA.
[10] Monica S. Lam,et al. Communication-Free Parallelization via Affine Transformations , 1994, LCPC.
[11] Guang R. Gao,et al. Tile Reduction: The First Step towards Tile Aware Parallelization in OpenMP , 2009, IWOMP.
[12] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[13] S. Campbell. Numerical analysis and systems theory , 2001 .
[14] Jingling Xue. Communication-Minimal Tiling of Uniform Dependence Loops , 1997, J. Parallel Distributed Comput..
[15] Sanjay V. Rajopadhye,et al. Parameterized Tiling for Imperfectly Nested Loops , 2009 .
[16] William Pugh,et al. Static analysis of upper and lower bounds on dependences and parallelism , 1994, TOPL.
[17] William Pugh,et al. Transitive Closure of Infinite Graphs and its Applications , 1995, Int. J. Parallel Program..
[18] Marek Palkowski,et al. Free scheduling for statement instances of parameterized arbitrarily nested affine loops , 2012, Parallel Comput..
[19] Uday Bondhugula,et al. Tiling for Dynamic Scheduling , 2014 .
[20] Marcin Maciazek,et al. Genetic and combinatorial algorithms for optimal sizing and placement of active power filters , 2015, Int. J. Appl. Math. Comput. Sci..
[21] J. Ramanujam,et al. Tiling Multidimensional Iteration Spaces for Multicomputers , 1992, J. Parallel Distributed Comput..
[22] Marek Palkowski,et al. Free Scheduling of Tiles Based on the Transitive Closure of Dependence Graphs , 2015, PPAM.
[23] William Pugh,et al. The Omega Library interface guide , 1995 .
[24] Peiyi Tang,et al. Generating efficient tiled code for distributed memory machines , 2000, Parallel Comput..
[25] Marek Palkowski,et al. Perfectly Nested Loop Tiling Transformations Based on the Transitive Closure of the Program Dependence Graph , 2014, ACS.
[26] Markus Kowarschik,et al. An Overview of Cache Optimization Techniques and Cache-Aware Numerical Algorithms , 2002, Algorithms for Memory Hierarchies.
[27] Cédric Bastoul,et al. Code generation in the polyhedral model is easier than you think , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[28] Monica S. Lam,et al. An affine partitioning algorithm to maximize parallelism and minimize communication , 1999, ICS '99.
[29] Jingling Xue,et al. Communication-Minimal Tiling of Uniform Dependence Loops , 1996, J. Parallel Distributed Comput..
[30] Paul Feautrier,et al. Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time , 1992, International Journal of Parallel Programming.
[31] Martin Griebl,et al. Index Set Splitting , 2000, International Journal of Parallel Programming.
[32] Paul Feautrier,et al. Some efficient solutions to the affine scheduling problem. I. One-dimensional time , 1992, International Journal of Parallel Programming.
[33] Anne Greenbaum,et al. NUMERICAL METHODS , 2017 .
[34] Larry Carter,et al. Sparse Tiling for Stationary Iterative Methods , 2004, Int. J. High Perform. Comput. Appl..
[35] Matt W. Mutka,et al. Enabling unimodular transformations , 1994, Proceedings of Supercomputing '94.
[36] William Pugh,et al. Iteration Space Slicing for Locality , 1999, LCPC.
[37] Wlodzimierz Bielecki,et al. Using basis dependence distance vectors in the modified Floyd–Warshall algorithm , 2015, J. Comb. Optim..
[38] Martin Griebl,et al. Automatic Parallelization of Loop Programs for Distributed Memory Architectures , 2004 .
[39] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[40] William Pugh,et al. Iteration space slicing and its application to communication optimization , 1997, ICS '97.
[41] Marek Palkowski,et al. TRACO: An automatic loop nest parallelizer for numerical applications , 2015, 2015 Federated Conference on Computer Science and Information Systems (FedCSIS).
[42] Uday Bondhugula,et al. Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model , 2008, CC.
[43] F. H. McMahon,et al. The Livermore Fortran Kernels: A Computer Test of the Numerical Performance Range , 1986 .
[44] Albert Cohen,et al. Transitive Closures of Affine Integer Tuple Relations and Their Overapproximations , 2011, SAS.
[45] Paul Feautrier,et al. Improving Data Locality by Chunking , 2003, CC.
[46] Keshav Pingali,et al. Tiling Imperfectly-nested Loop Nests , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[47] Jingling Xue,et al. On Tiling as a Loop Transformation , 1997, Parallel Process. Lett..
[48] William Pugh,et al. An Exact Method for Analysis of Value-based Array Data Dependences , 1993, LCPC.
[49] J. Leader. Numerical Analysis and Scientific Computation , 2022 .
[50] Rafal Zdunek,et al. Regularized nonnegative matrix factorization: Geometrical interpretation and application to spectral unmixing , 2014, Int. J. Appl. Math. Comput. Sci..
[51] Albert Cohen,et al. Split tiling for GPUs: automatic parallelization using trapezoidal tiles , 2013, GPGPU@ASPLOS.