Automatic data and computation decomposition on distributed memory parallel computers
[1] Monica S. Lam,et al. Automatic computation and data decomposition for multiprocessors , 1997 .
[2] Monica S. Lam,et al. Communication-Free Parallelization via Affine Transformations , 1994, LCPC.
[3] Manish Gupta,et al. On privatization of variables for data-parallel execution , 1997, Proceedings 11th International Parallel Processing Symposium.
[4] Ken Kennedy,et al. Automatic data layout for distributed-memory machines , 1998, TOPL.
[5] Ching-Tien Ho,et al. Optimal communication primitives and graph embeddings on hypercubes , 1990 .
[6] Jingling Xue. Communication-Minimal Tiling of Uniform Dependence Loops , 1997, J. Parallel Distributed Comput..
[7] Jingling Xue,et al. Communication-Minimal Tiling of Uniform Dependence Loops , 1996, J. Parallel Distributed Comput..
[8] J. Ramanujam,et al. Compile-Time Techniques for Data Distribution in Distributed Memory Machines , 1991, IEEE Trans. Parallel Distributed Syst..
[9] Barbara M. Chapman,et al. Supercompilers for parallel and vector computers , 1990, ACM Press frontier series.
[10] Michael Wolfe,et al. High performance compilers for parallel computing , 1995 .
[11] PeiZong Lee. Efficient Algorithms for Data Distribution on Distributed Memory Parallel Computers , 1997, IEEE Trans. Parallel Distributed Syst..
[12] Marina C. Chen,et al. The Generation of a Class of Multipliers: Synthesizing Highly Parallel Algorithms in VLSI , 1988, IEEE Trans. Computers.
[13] Weijia Shang,et al. On Supernode Transformation with Minimized Total Running Time , 1998, IEEE Trans. Parallel Distributed Syst..
[14] Ulrich Kremer,et al. Fortran RED - A Retargetable Environment for Automatic Data Layout , 1998, LCPC.
[15] P. Sadayappan,et al. Communication-Free Hyperplane Partitioning of Nested Loops , 1993, J. Parallel Distributed Comput..
[16] Monica Sin-Ling Lam,et al. A Systolic Array Optimizing Compiler , 1989 .
[17] Harry Berryman,et al. Distributed Memory Compiler Design for Sparse Problems , 1995, IEEE Trans. Computers.
[18] Christian Lengauer,et al. The derivation of systolic implementations of programs , 2004, Acta Informatica.
[19] Constantine D. Polychronopoulos. Compiler Optimizations for Enhancing Parallelism and Their Impact on Architecture Design , 1988, IEEE Trans. Computers.
[20] Prithviraj Banerjee,et al. Compiler techniques for optimizing communication and data distribution for distributed-memory multicomputers , 1996 .
[21] Marina C. Chen,et al. Generating explicit communication from shared-memory program references , 1990, Proceedings SUPERCOMPUTING '90.
[22] Jang-Ping Sheu,et al. Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers , 2004, The Journal of Supercomputing.
[23] Ping-Sheng Tseng. A Systolic Array Parallelizing Compiler , 1990, J. Parallel Distributed Comput..
[24] Monica S. Lam,et al. An affine partitioning algorithm to maximize parallelism and minimize communication , 1999, ICS '99.
[25] W. Shang,et al. On Time Mapping of Uniform Dependence Algorithms into Lower Dimensional Processor Arrays , 1992, IEEE Trans. Parallel Distributed Syst..
[26] Jang-Ping Sheu,et al. Communication-Free Data Allocation Techniques for Parallelizing Compilers on Multicomputers , 1994, IEEE Trans. Parallel Distributed Syst..
[27] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[28] P. Lee,et al. Generating Global Name-Space Communication Sets for Array Assignment Statements , 1997 .
[29] Mahmut T. Kandemir,et al. Data Relation Vectors: A New Abstraction for Data Optimizations , 2001, IEEE Trans. Computers.
[30] Sandeep K. S. Gupta,et al. An Interprocedural Framework for Determining Efficient Array Data Redistributions , 1998, J. Inf. Sci. Eng..
[31] Marina C. Chen,et al. The Data Alignment Phase in Compiling Programs for Distributed-Memory Machines , 1991, J. Parallel Distributed Comput..
[32] PeiZong Lee,et al. Techniques for Compiling Programs on Distributed Memory Multicomputers , 1995, Parallel Comput..
[33] Marina C. Chen,et al. Compiling Communication-Efficient Programs for Massively Parallel Machines , 1991, IEEE Trans. Parallel Distributed Syst..
[34] Larry Carter,et al. Selecting tile shape for minimal execution time , 1999, SPAA '99.
[35] Charles Koelbel,et al. High Performance Fortran Handbook , 1993 .
[36] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst..
[37] Manish Gupta,et al. A methodology for high-level synthesis of communication on multicomputers , 1992, ICS '92.
[38] Isidoro Couvertier-Reyes,et al. Automatic Data and Computation Mapping for Distributed-Memory Machines , 1996 .
[39] Rajeev Barua,et al. Communication-Minimal Partitioning of Parallel Loops and Data Arrays for Cache-Coherent Distributed-Memory Multiprocessors , 1996, LCPC.
[40] Jang-Ping Sheu,et al. Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers , 1996, LCPC.
[41] J. Ramanujam,et al. Tiling Multidimensional Iteration Spaces for Multicomputers , 1992, J. Parallel Distributed Comput..
[42] John R. Gilbert,et al. Modeling Data-Parallel Programs with the Alignment-Distribution Graph , 1994 .
[43] Jang-Ping Sheu,et al. Communication-Free Data Allocation Techniques for Parallelizing Compilers on Multicomputers , 1993, 1993 International Conference on Parallel Processing - ICPP'93.
[44] Jan-Jan Wu. Optimization and transformation techniques for high performance Fortran , 1996 .
[45] G. C. Fox,et al. Solving Problems on Concurrent Processors , 1988 .
[46] John R. Gilbert,et al. Automatic array alignment in data-parallel programs , 1993, POPL '93.
[47] Zvi M. Kedem,et al. On high-speed computing with a programmable linear array , 1988, Supercomputing '88.
[48] Ken Kennedy,et al. Compiling programs for distributed-memory multiprocessors , 1988, The Journal of Supercomputing.
[49] M. Gupta,et al. Compile-Time Estimation of Communication Costs of Programs , 1994 .
[50] Anne Rogers,et al. Compiling for Distributed Memory Architectures , 1994, IEEE Trans. Parallel Distributed Syst..
[51] Ken Kennedy,et al. Automatic Data Layout for Distributed-Memory Machines in the D Programming Environment , 1994, Automatic Parallelization.
[52] William Gropp,et al. Users guide for mpich, a portable implementation of MPI , 1996 .
[53] Yves Robert,et al. Determining the idle time of a tiling: new results , 1997, Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques.
[54] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPL.
[55] Hans P. Zima,et al. Compiling for distributed-memory systems , 1993 .
[56] Mark A. Johnson,et al. Solving problems on concurrent processors. Vol. 1: General techniques and regular problems , 1988 .
[57] John R. Gilbert,et al. Array Distribution in Data-Parallel Programs , 1994, LCPC.
[58] Anant Agarwal,et al. Automatic Partitioning of Parallel Loops for Cache-Coherent Multiprocessors , 1993, 1993 International Conference on Parallel Processing - ICPP'93.
[59] Vikram S. Adve,et al. High Performance Fortran Compilation Techniques for Parallelizing Scientific Codes , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[60] Anant Agarwal,et al. Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-Memory Multiprocessors , 1995, IEEE Trans. Parallel Distributed Syst..
[61] Lynn Conway,et al. Introduction to VLSI systems , 1978 .
[62] Kai Hwang,et al. Advanced computer architecture - parallelism, scalability, programmability , 1992 .
[63] PeiZong Lee,et al. Synthesizing Linear Array Algorithms from Nested For Loop Algorithms , 2015, IEEE Trans. Computers.
[64] Monica S. Lam,et al. Maximizing Parallelism and Minimizing Synchronization with Affine Partitions , 1998, Parallel Comput..
[65] Guang R. Gao,et al. Automatic data and computation decomposition for distributed memory machines , 1995, Proceedings of the Twenty-Eighth Annual Hawaii International Conference on System Sciences.
[66] Ken Kennedy,et al. Compiling Fortran D for MIMD distributed-memory machines , 1992, CACM.
[67] Manish Gupta,et al. Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers , 1992, IEEE Trans. Parallel Distributed Syst..
[68] Dan I. Moldovan,et al. Partitioning and Mapping Algorithms into Fixed Size Systolic Arrays , 1986, IEEE Transactions on Computers.
[69] Hudson Benedito Ribas. Automatic generation of systolic programs from nested loops , 1990 .
[70] Ken Kennedy,et al. Automatic translation of FORTRAN programs to vector form , 1987, TOPL.
[71] Yves Robert,et al. (Pen)-ultimate tiling? , 1994, Integr..
[72] Guy L. Steele,et al. The High Performance Fortran Handbook , 1993 .
[73] Zvi M. Kedem,et al. Mapping Nested Loop Algorithms into Multidimensional Systolic Arrays , 2017, IEEE Trans. Parallel Distributed Syst..