Towards Optimal Multi-level Tiling for Stencil Computations
Sanjay V. Rajopadhye | Rinku Dewri | Lakshminarayanan Renganarayanan | Manjukumar Harthikote-Matha
[1] David G. Wonnacott, et al. Achieving Scalable Locality with Time Skewing, 2002, International Journal of Parallel Programming.
[2] Richard M. Karp, et al. The Organization of Computations for Uniform Recurrence Equations, 1967, JACM.
[3] William Gropp, et al. Solving PDEs on loosely-coupled parallel processors, 1987, Parallel Comput.
[4] Patrice Quinton, et al. The mapping of linear recurrence equations on regular arrays, 1989, J. VLSI Signal Process.
[5] Jack J. Dongarra, et al. Automatically Tuned Linear Algebra Software, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[6] Jingling Xue, et al. Loop Tiling for Parallelism, 2000, Kluwer International Series in Engineering and Computer Science.
[7] Sanjay V. Rajopadhye, et al. Synthesizing systolic arrays from recurrence equations, 1990, Parallel Comput.
[8] J. Löfberg, et al. YALMIP: a toolbox for modeling and optimization in MATLAB, 2004, 2004 IEEE International Conference on Robotics and Automation.
[9] Leonid Oliker, et al. Impact of modern memory subsystems on cache optimizations for stencil computations, 2005, MSP '05.
[10] David G. Wonnacott, et al. Using time skewing to eliminate idle time due to memory bandwidth and network limitations, 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[11] G. Roth, et al. Compiling Stencils in High Performance Fortran, 1997, ACM/IEEE SC 1997 Conference (SC'97).
[12] Michael Wolfe, et al. More iteration space tiling, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[13] Larry Carter, et al. Quantifying the Multi-Level Nature of Tiling Interactions, 1997, International Journal of Parallel Programming.
[14] Zhiyuan Li, et al. Automatic tiling of iterative stencil loops, 2004, TOPL.
[15] François Irigoin, et al. Supernode partitioning, 1988, POPL '88.
[16] Dan I. Moldovan, et al. Partitioning and Mapping Algorithms into Fixed Size Systolic Arrays, 1986, IEEE Transactions on Computers.
[17] Larry Carter, et al. Determining the idle time of a tiling, 1997, POPL '97.
[18] Johan Löfberg. YALMIP: A toolbox for modeling and optimization in MATLAB, 2004.
[19] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[20] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[21] Alain Darte. Regular partitioning for synthesizing fixed-size systolic arrays, 1991, Integr.
[22] Yinyu Ye, et al. An infeasible interior-point algorithm for solving primal and dual geometric programs, 1997, Math. Program.
[23] Sanjay V. Rajopadhye, et al. Optimal Semi-Oblique Tiling, 2003, IEEE Trans. Parallel Distributed Syst.
[24] Jingling Xue, et al. On Tiling as a Loop Transformation, 1997, Parallel Process. Lett.
[25] Michael A. Frumkin, et al. Tight bounds on cache use for stencil operations on rectangular grids, 2002, JACM.
[26] Sanjay V. Rajopadhye, et al. A Geometric Programming Framework for Optimal Multi-Level Tiling, 2004, Proceedings of the ACM/IEEE SC2004 Conference.
[27] Guy L. Steele, et al. Fortran at ten gigaflops: the connection machine convolution compiler, 1991, PLDI '91.
[28] Alok N. Choudhary, et al. Automatic optimization of communication in compiling out-of-core stencil codes, 1996, ICS '96.
[29] Chau-Wen Tseng, et al. Tiling Optimizations for 3D Scientific Computations, 2000, ACM/IEEE SC 2000 Conference (SC'00).