Compiler optimizations for parallel loops with fine-grained synchronization
[1] Youcef Saad,et al. A Basic Tool Kit for Sparse Matrix Computations , 1990 .
[2] P. Sadayappan,et al. Removal of redundant dependences in DOACROSS loops with constant dependences , 1991, PPOPP '91.
[3] Michael Ian Shamos,et al. Computational geometry: an introduction , 1985 .
[4] Joel H. Saltz,et al. Runtime compilation techniques for data partitioning and communication schedule reuse , 1993, Supercomputing '93. Proceedings.
[5] Ralph Grishman,et al. The NYU Ultracomputer—Designing an MIMD Shared Memory Parallel Computer , 1983, IEEE Transactions on Computers.
[6] David S. Johnson,et al. Computers and Intractability: A Guide to the Theory of NP-Completeness , 1979 .
[7] Robert H. Halstead,et al. MULTILISP: a language for concurrent symbolic computation , 1985, TOPL.
[8] Pen-Chung Yew,et al. On Data Synchronization For Multiprocessors , 1989, The 16th Annual International Symposium on Computer Architecture.
[9] Kevin P. McAuliffe,et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture , 1985, ICPP.
[10] Yoichi Muraoka,et al. Parallelism exposure and exploitation in programs , 1971 .
[11] David S. Johnson,et al. Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .
[12] Joel H. Saltz,et al. Run-time parallelization and scheduling of loops , 1989, SPAA '89.
[13] Pen-Chung Yew,et al. A Scheme to Enforce Data Dependence on Large Multiprocessor Systems , 1987, IEEE Trans. Software Eng..
[14] Ken Kennedy,et al. Analysis of event synchronization in a parallel programming tool , 1990, PPOPP '90.
[15] David L. Kuck,et al. The Structure of Computers and Computations , 1978 .
[16] Phillip L. Shaffer. Minimization of Interprocessor Synchronization In Multiprocessors with Shared and Private Memory , 1989, ICPP.
[17] A. Veidenbaum,et al. The cedar system and an initial performance study , 1993, ISCA '93.
[18] B J Smith,et al. A pipelined, shared resource MIMD computer , 1986 .
[19] Michael J. Flynn,et al. Some Computer Organizations and Their Effectiveness , 1972, IEEE Transactions on Computers.
[20] B. S. Garbow,et al. Matrix Eigensystem Routines — EISPACK Guide , 1974, Lecture Notes in Computer Science.
[21] Constantine D. Polychronopoulos,et al. Advanced Loop Optimizations for Parallel Computers , 1988, ICS.
[22] P. Sadayappan,et al. An approach to synchronization for parallel computing , 1988, ICS '88.
[23] Pen-Chung Yew,et al. A Synchronization Scheme and Its Applications for Large Multiprocessor Systems , 1984, ICDCS.
[24] Zhiyu Shen,et al. An Empirical Study of Fortran Programs for Parallelizing Compilers , 1990, IEEE Trans. Parallel Distributed Syst..
[25] Zhiyuan Li,et al. A technique for reducing synchronization overhead in large scale multiprocessors , 1985, ISCA '85.
[26] Robert E. Tarjan,et al. Depth-First Search and Linear Graph Algorithms , 1972, SIAM J. Comput..
[27] E. Reingold,et al. Combinatorial Algorithms: Theory and Practice , 1977 .
[28] Anoop Gupta,et al. Performance evaluation of memory consistency models for shared-memory multiprocessors , 1991, ASPLOS IV.
[29] Leslie Lamport,et al. The parallel execution of DO loops , 1974, CACM.
[30] H. F. Jordan. A Special Purpose Architecture for Finite Element Analysis , 1978 .
[31] Utpal Banerjee,et al. Dependence analysis for supercomputing , 1988, The Kluwer international series in engineering and computer science.
[32] David A. Padua,et al. Compiler Algorithms for Synchronization , 1987, IEEE Transactions on Computers.
[33] G. Amdahl,et al. Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).
[34] Anoop Gupta,et al. The directory-based cache coherence protocol for the DASH multiprocessor , 1990, ISCA '90.
[35] S. P. Midkiff. Automatic generation of synchronization instructions for parallel processors , 1986 .
[36] Ding-Kai Chen,et al. MaxPar: An execution driven simulator for studying parallel systems , 1989 .
[37] Ken Kennedy,et al. Automatic translation of FORTRAN programs to vector form , 1987, TOPL.
[38] Zhiyuan Li,et al. On Reducing Data Synchronization in Multiprocessed Loops , 1987, IEEE Transactions on Computers.
[39] Geoffrey C. Fox,et al. The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers , 1989, Int. J. High Perform. Comput. Appl..
[40] Pen-Chung Yew,et al. Efficient Doacross execution on distributed shared-memory multiprocessors , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[41] Richard J. Anderson,et al. A Scheduling Problem Arising From Loop Parallelization on MIMD Machines , 1988, AWOC.
[42] Z. Chen,et al. On uniformization of affine dependence algorithms , 1992, [1992] Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing.
[43] David A. Padua,et al. Compiler Generated Synchronization for Do Loops , 1986, ICPP.
[44] David A. Padua,et al. A Comparison of Four Synchronization Optimization Techniques , 1991, ICPP.
[45] Pen-Chung Yew,et al. Execution-driven tools for parallel simulation of parallel architectures and applications , 1993, Supercomputing '93. Proceedings.
[46] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst..
[47] Joel H. Saltz,et al. The Preprocessed Doacross Loop , 1991, ICPP.
[48] Pen-Chung Yew,et al. Cedar architecture and its software , 1989, [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track.
[49] Lionel M. Ni,et al. Dependence Uniformization: A Loop Parallelization Technique , 1993, IEEE Trans. Parallel Distributed Syst..
[50] Ronald Gary Cytron. Compile-time scheduling and optimization for asynchronous machines (multiprocessor, compiler, parallel processing) , 1984 .
[51] David Alejandro Padua Haiek. Multiprocessors: discussion of some theoretical and practical problems , 1980 .
[52] John Zahorjan,et al. Improving the performance of runtime parallelization , 1993, PPOPP '93.