On the implementation and effectiveness of autoscheduling for shared-memory multiprocessors
[1] Ken Kennedy,et al. Compiling Fortran D for MIMD distributed-memory machines , 1992, CACM.
[2] Vivek Sarkar. PTRAN—the IBM parallel translation system , 1991 .
[3] Edith Schonberg,et al. Low-overhead scheduling of nested parallelism , 1991, IBM J. Res. Dev..
[4] Sivarama P. Dandamudi,et al. A Hierarchical Task Queue Organization for Shared-Memory Multiprocessor Systems , 1995, IEEE Trans. Parallel Distributed Syst..
[5] Rudolf Eigenmann,et al. Performance Analysis of Parallelizing Compilers on the Perfect Benchmarks Programs , 1992, IEEE Trans. Parallel Distributed Syst..
[6] Jyh-Herng Chow,et al. Switch-stacks: A scheme for microtasking nested parallel loops , 1990, Proceedings SUPERCOMPUTING '90.
[7] Benjamin G. Zorn,et al. Memory allocation costs in large C and C++ programs , 1994, Softw. Pract. Exp..
[8] Alexandru Nicolau,et al. Parallelizing Programs with Recursive Data Structures , 1989, IEEE Trans. Parallel Distributed Syst..
[9] Manish Gupta,et al. Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers , 1992, IEEE Trans. Parallel Distributed Syst..
[10] James R. Larus. C**: A Large-Grain, Object-Oriented, Data-Parallel Programming Language , 1992, LCPC.
[11] Ray Trimble. Storage Management in IBM APL Systems , 1991, IBM Syst. J..
[12] Thomas R. Gross,et al. Exploiting task and data parallelism on a multicomputer , 1993, PPOPP '93.
[13] Evangelos P. Markatos. Scheduling for locality in shared-memory multiprocessors , 1993 .
[14] Ron Y. Pinter,et al. The parallel C (pC) programming language , 1991, IBM J. Res. Dev..
[15] David E. Culler,et al. Monsoon: an explicit token-store architecture , 1990, Proceedings of the 17th Annual International Symposium on Computer Architecture.
[16] Constantine D. Polychronopoulos. Multiprocessing versus Multiprogramming , 1989, ICPP.
[17] Bill Nitzberg,et al. Distributed shared memory: a survey of issues and algorithms , 1991, Computer.
[18] Barbara M. Chapman,et al. Handling Distributed Data in Vienna Fortran Procedures , 1992, LCPC.
[19] Kenneth R. Traub,et al. Multithreading: a revisionist view of dataflow architectures , 1991, ISCA '91.
[20] Carl J. Beckmann,et al. Hardware and software for functional and fine grain parallelism , 1993 .
[21] G. N. Srinivasa Prasanna,et al. Compile-time Techniques for Processor Allocation in Macro Dataflow Graphs for Multiprocessors , 1992, ICPP.
[22] Ralph Duncan,et al. A survey of parallel computer architectures , 1990, Computer.
[23] Tao Yang,et al. Clustering task graphs for message passing architectures , 1990, ICS '90.
[24] John P. Hayes,et al. Computer architecture and organization (2nd ed.) , 1988 .
[25] Tao Yang,et al. A fast static scheduling algorithm for DAGs on an unbounded number of processors , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[26] David E. Culler,et al. Fine-grain parallelism with minimal hardware support: a compiler-controlled threaded abstract machine , 1991, ASPLOS IV.
[27] Ken Kennedy,et al. Compiling Fortran 77D and 90D for MIMD distributed-memory machines , 1992, Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation.
[28] L. Verlet. Computer "Experiments" on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules , 1967 .
[29] B. Quentrec,et al. New method for searching for neighbors in molecular dynamics computations , 1973 .
[30] Gordon Bell,et al. Ultracomputers: a teraflop before its time , 1992, CACM.
[31] Constantine D. Polychronopoulos,et al. Symbolic analysis for parallelizing compilers , 1996, TOPLAS.
[32] R. S. Nikhil. Can dataflow subsume von Neumann computing? , 1989, ISCA '89.
[33] Ken Kennedy,et al. Interprocedural compilation of Fortran D for MIMD distributed-memory machines , 1992, Proceedings Supercomputing '92.
[34] Stephen R. Goldschmidt,et al. Simulation of multiprocessors: accuracy and performance , 1993 .
[35] Dennis Gannon,et al. Distributed pC++ Basic Ideas for an Object Parallel Language , 1993, Sci. Program..
[36] Sachin S. Sapatnekar,et al. A Convex Programming Approach for Exploiting Data and Functional Parallelism on Distributed Memory Multicomputers , 1994, 1994 International Conference on Parallel Processing Vol. 2.
[37] Alfred V. Aho,et al. The Transitive Reduction of a Directed Graph , 1972, SIAM J. Comput..
[38] Kenji Nishida,et al. Evaluation of a Prototype Data Flow Processor of the SIGMA-1 for Scientific Computations , 1986, ISCA.
[39] Hans P. Zima,et al. Compiling for distributed-memory systems , 1993 .
[40] Krishna M. Kavi,et al. Parallelism in object-oriented languages: a survey , 1992, IEEE Software.
[41] Thomas E. Anderson,et al. The performance implications of thread management alternatives for shared-memory multiprocessors , 1989, SIGMETRICS '89.
[42] David B. Loveman. High performance Fortran , 1993, IEEE Parallel & Distributed Technology: Systems & Applications.
[43] Vivek Sarkar,et al. A Concurrent Execution Semantics for Parallel Program Graphs and Program Dependence Graphs , 1992, LCPC.
[44] Sandeep K. S. Gupta,et al. On the Synthesis of Parallel Programs from Tensor Product Formulas for Block Recursive Algorithms , 1992, LCPC.
[45] Eric Williams,et al. Performance optimizations, implementation, and verification of the SGI Challenge multiprocessor , 1994, 1994 Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences.
[46] Vivek Sarkar,et al. Parallel Program Graphs and their Classification , 1993, LCPC.
[47] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[48] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1990 .
[49] Barbara M. Chapman,et al. A Software Architecture for Multidisciplinary Applications: Integrating Task and Data Parallelism , 1994, CONPAR.
[50] Robert A. Iannucci,et al. A dataflow/von Neumann hybrid architecture , 1988 .
[51] Thomas G. Macdonald,et al. MPP Fortran Programming Model , 1992 .
[52] Prithviraj Banerjee,et al. Processor Allocation and Scheduling of Macro Dataflow Graphs on Distributed Memory Multicomputers by the PARADIGM Compiler , 1993, 1993 International Conference on Parallel Processing - ICPP'93.
[53] Michael J. Flynn,et al. Very high-speed computing systems , 1966 .
[54] Jaspal Subhlok. Automatic Mapping of Task and Data Parallel Programs for Efficient Execution on Multicomputers , 1993 .
[55] Ken Kennedy,et al. Computer support for machine-independent parallel programming in Fortran D , 1992 .
[56] Anoop Gupta,et al. Making effective use of shared-memory multiprocessors: the process control approach , 1991 .
[57] Janak H. Patel,et al. A low-overhead coherence solution for multiprocessors with private cache memories , 1984, ISCA '84.
[58] Geoffrey C. Fox,et al. A Compilation Approach for Fortran 90D/HPF Compilers on Distributed Memory MIMD Computers , 1993 .
[59] David K. Poulsen. Memory latency reduction via data prefetching and data forwarding in shared memory multiprocessors , 1994 .
[60] J. Ramanujam,et al. Compile-Time Techniques for Data Distribution in Distributed Memory Machines , 1991, IEEE Trans. Parallel Distributed Syst..
[61] V. Sarkar,et al. Automatic partitioning of a program dependence graph into parallel tasks , 1991, IBM J. Res. Dev..
[62] Jack J. Dongarra,et al. Performance of various computers using standard linear equations software in a FORTRAN environment , 1988, CARN.
[63] Anne Rogers,et al. Compiling for Distributed Memory Architectures , 1994, IEEE Trans. Parallel Distributed Syst..
[64] Satoshi Sekiguchi,et al. Efficient vector processing on a dataflow supercomputer SIGMA-1 , 1988, Proceedings. SUPERCOMPUTING '88.
[65] Milind Girkar. Functional parallelism: theoretical foundations and implementation , 1992 .
[66] Anoop Gupta,et al. COOL: a language for parallel programming , 1990 .
[67] Vivek Sarkar,et al. Partitioning and Scheduling Parallel Programs for Multiprocessing , 1989 .
[68] Niklaus Wirth,et al. Algorithms + Data Structures = Programs , 1976 .
[69] Sachin S. Sapatnekar,et al. A Framework for Exploiting Data and Functional Parallelism on Distributed Memory Multicomputers , 1994 .
[70] William J. Dally,et al. Experiences Implementing Dataflow on a General-Purpose Parallel Computer , 1991, ICPP.
[71] Kenji Nishida,et al. A hardware design of the SIGMA-1, a data flow computer for scientific computations , 1986 .
[72] Barbara M. Chapman,et al. Automatic Support for Data Distribution on Distributed Memory Multiprocessor Systems , 1993, LCPC.
[73] Constantine D. Polychronopoulos,et al. Microarchitecture support for dynamic scheduling of acyclic task graphs , 1992, MICRO.
[74] Constantine D. Polychronopoulos,et al. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers , 1987, IEEE Transactions on Computers.
[75] Andrew A. Chien,et al. Concurrent aggregates (CA) , 1990, PPOPP '90.
[76] Peter A. Dinda,et al. Communication and memory requirements as the basis for mapping task and data parallel programs , 1994, Proceedings of Supercomputing '94.
[77] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.