An Advanced Compiler Framework for Non-Cache-Coherent Multiprocessors
Yunheung Paek | David A. Padua | Jay Hoeflinger | Emilio L. Zapata | Angeles G. Navarro
[1] Yunheung Paek,et al. Simplification of array access patterns for compiler optimizations , 1998, PLDI.
[2] Evangelos P. Markatos,et al. Shared memory vs. message passing in shared-memory multiprocessors , 1992, [1992] Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing.
[3] David A. Padua,et al. Access descriptor based locality analysis for Distributed-Shared Memory multiprocessors , 1999, Proceedings of the 1999 International Conference on Parallel Processing.
[4] Constantine D. Polychronopoulos,et al. The structure of Parafrase-2: an advanced parallelizing compiler for C and FORTRAN , 1990 .
[5] Piyush Mehrotra,et al. Dynamic data distributions in Vienna Fortran , 1993, Supercomputing '93.
[6] Ken Kennedy,et al. Evaluating Compiler Optimizations for Fortran D , 1994, J. Parallel Distributed Comput..
[7] Jaspal Subhlok,et al. A new model for integrated nested task and data parallel programming , 1997, PPOPP '97.
[8] Yunheung Paek,et al. Compiler Techniques for Effective Communication on Distributed-Memory Multiprocessors , 1997 .
[9] Yunheung Paek,et al. Parallel Programming with Polaris , 1996, Computer.
[10] A. Steen. EuroBen Experiences with the SGI Origin 2000 and the Cray T , 1998 .
[11] W. Daniel Hillis,et al. The connection machine , 1985 .
[12] Seth Copen Goldstein,et al. Active Messages: A Mechanism for Integrated Communication and Computation , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[13] Irad Yavneh,et al. Implementation and Performance of a Grand Challenge 3d Quasi-Geostrophic Multi-Grid Code on the Cray T3D and IBM SP2 ; CU-CS-771-95 , 1995 .
[14] Ken Kennedy,et al. Automatic Data Layout for High Performance Fortran , 1995, SC.
[15] Andrew A. Chien,et al. A comparison of architectural support for messaging in the TMC CM-5 and the Cray T3D , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[16] Yunheung Paek,et al. Parallelization of benchmarks for scalable shared-memory multiprocessors , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[17] Emilio L. Zapata,et al. An Automatic Iteration/Data Distribution Method Based on Access Descriptors for DSMM , 1999, LCPC.
[18] Robert J. Harrison,et al. Global Arrays: a portable "shared-memory" programming model for distributed memory computers , 1994, Proceedings of Supercomputing '94.
[19] Katherine A. Yelick,et al. Optimizing Parallel SPMD Programs , 1994, LCPC.
[20] E. Ayguade,et al. A Novel Approach Towards Automatic Data Distribution , 1995, Proceedings of the IEEE/ACM SC95 Conference.
[21] T. von Eicken,et al. Parallel programming in Split-C , 1993, Supercomputing '93.
[22] Monica S. Lam,et al. Global optimizations for parallelism and locality on scalable parallel machines , 1993, PLDI '93.
[23] Monica S. Lam,et al. Maximizing Multiprocessor Performance with the SUIF Compiler , 1996, Digit. Tech. J..
[24] Shahid H. Bokhari. Communication Overhead on the Intel Paragon, IBM SP2 and Meiko CS-2 , 1995 .
[25] Jay Hoeflinger,et al. Interprocedural parallelization using memory classification analysis , 1998 .
[26] John R. Gilbert,et al. The Alignment-Distribution Graph , 1993, LCPC.
[27] Bruno Raffin,et al. Comparing the Scalability of the Cray T3E-600 and the Cray Origin 2000 Using SHMEM Routines , 1998 .
[28] Seth Copen Goldstein,et al. Active messages: a mechanism for integrating communication and computation , 1998, ISCA '98.
[29] P. Sadayappan,et al. Communication-Free Hyperplane Partitioning of Nested Loops , 1991, LCPC.
[30] Paul Feautrier,et al. Direct parallelization of call statements , 1986, SIGPLAN '86.
[31] Steven L. Scott,et al. Synchronization and communication in the T3E multiprocessor , 1996, ASPLOS VII.
[32] Bradford L. Chamberlain,et al. A Compiler Abstraction for Machine Independent Parallel Communication Generation , 1997, LCPC.
[33] William Pugh,et al. A practical algorithm for exact array dependence analysis , 1992, CACM.
[34] Yunheung Paek,et al. Experimental study of compiler techniques for NUMA machines , 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.
[35] Bruno Raffin,et al. Comparing the communication performance and scalability of a Linux and a NT cluster of PCs, a Cray origin 2000, an IBM SP and a Cray T3E-600 , 1999, ICWC 99. IEEE Computer Society International Workshop on Cluster Computing.
[36] Glenn R. Lue,et al. Comparing the Communication Performance and Scalability of a SGI . . . , 1999 .
[37] Edith Schonberg,et al. Static analysis to reduce synchronization costs in data-parallel programs , 1996, POPL '96.
[38] Ken Kennedy,et al. An Implementation of Interprocedural Bounded Regular Section Analysis , 1991, IEEE Trans. Parallel Distributed Syst..
[39] Yunheung Paek,et al. The Access Region Test , 1999, LCPC.
[40] Yunheung Paek,et al. Compiling for Distributed Memory Multiprocessors Based on Access Region Analysis , 1997 .
[41] Yunheung Paek,et al. Unified Interprocedural Parallelism Detection , 2001, International Journal of Parallel Programming.
[42] Irad Yavneh,et al. Implementation and Performance of a Grand Challenge 3d Quasi-Geostrophic Multi-Grid code on the Cray T3D and IBM SP2 , 1995, SC.
[43] Remzi H. Arpaci-Dusseau,et al. Empirical evaluation of the CRAY-T3D: a compiler perspective , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[44] Andrea C. Arpaci-Dusseau,et al. Parallel programming in Split-C , 1993, Supercomputing '93. Proceedings.
[45] Rice University,et al. High performance Fortran language specification , 1993 .
[46] David A. Kendrick,et al. GAMS : a user's guide, Release 2.25 , 1992 .
[47] Saman Amarasinghe,et al. The SUIF compiler for scalable parallel machines , 1995 .