An introduction to compilation issues for parallel machines
[1] Ron Cytron,et al. Interprocedural dependence analysis and parallelization , 1986, SIGP.
[2] James R. Larus,et al. Restructuring symbolic programs for concurrent execution on multiprocessors , 1989 .
[3] Bowen Alpern,et al. Detecting equality of variables in programs , 1988, POPL '88.
[4] Janusz S. Kowalik,et al. Parallel MIMD computation : the HEP supercomputer and its applications , 1985 .
[5] David J. Kuck,et al. The Burroughs Scientific Processor (BSP) , 1982, IEEE Transactions on Computers.
[6] Jack J. Dongarra,et al. Vectorizing compilers: a test suite and results , 1988, Proceedings. SUPERCOMPUTING '88.
[7] Alan Norton,et al. A Class of Boolean Linear Transformations for Conflict-Free Power-of-Two Stride Access , 1987, ICPP.
[8] Edith Schonberg,et al. Low-overhead scheduling of nested parallelism , 1991, IBM J. Res. Dev.
[9] Utpal Banerjee,et al. Dependence analysis for supercomputing , 1988, The Kluwer international series in engineering and computer science.
[10] Michael Gerndt,et al. SUPERB: A tool for semi-automatic MIMD/SIMD parallelization , 1988, Parallel Comput.
[11] Guy L. Steele,et al. Fortran at ten gigaflops: the connection machine convolution compiler , 1991, PLDI '91.
[12] Marina C. Chen,et al. Compiling Communication-Efficient Programs for Massively Parallel Machines , 1991, IEEE Trans. Parallel Distributed Syst.
[13] David J. Lilja,et al. Combining hardware and software cache coherence strategies , 1991, ICS '91.
[14] Ken Kennedy,et al. A technique for summarizing data access and its use in parallelism enhancing transformations , 1989, PLDI '89.
[15] J A Fisher,et al. Instruction-Level Parallel Processing , 1991, Science.
[16] Kevin Smith,et al. PAT : An Interactive Fortran Parallelizing Assistant Tool , 1988, ICPP.
[17] Mary E. Mace. Memory storage patterns in parallel processing , 1987, The Kluwer international series in engineering and computer science.
[18] Apostolos Dollas,et al. The evolution of instruction sequencing , 1991, Computer.
[19] Ken Kennedy,et al. Computer support for machine-independent parallel programming in Fortran D , 1992 .
[20] Henry G. Dietz,et al. Static scheduling for barrier MIMD architectures , 1992, The Journal of Supercomputing.
[21] Duncan H. Lawrie,et al. The Prime Memory System for Array Access , 1982, IEEE Transactions on Computers.
[22] Mark N. Wegman,et al. Efficiently computing static single assignment form and the control dependence graph , 1991, TOPL.
[23] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst.
[24] Vincent A. Guarna,et al. A Technique for Analyzing Pointer and Structure References In Parallel Restructuring Compilers , 1988, ICPP.
[25] Peiyi Tang,et al. Dynamic Processor Self-Scheduling for General Parallel Nested Loops , 1987, IEEE Trans. Computers.
[26] Manish Gupta,et al. Demonstration of Automatic Data Partitioning Techniques for Parallelizing Compilers on Multicomputers , 1992, IEEE Trans. Parallel Distributed Syst.
[27] Edsger W. Dijkstra,et al. Cooperating sequential processes , 2002 .
[28] Ahmed Sameh,et al. The Illiac IV system , 1972 .
[29] Michael Wolfe,et al. More iteration space tiling , 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[30] Guy L. Steele,et al. Data Optimization: Allocation of Arrays to Reduce Communication on SIMD Machines , 1990, J. Parallel Distributed Comput.
[31] John R. Ellis,et al. Bulldog: A Compiler for VLIW Architectures , 1986 .
[32] H. Wijshoff. Data organization in parallel computers , 1987 .
[33] Mark N. Wegman,et al. Analysis of pointers and structures , 1990, SIGP.
[34] Alexandru Nicolau,et al. Parallelizing Programs with Recursive Data Structures , 1989, IEEE Trans. Parallel Distributed Syst.
[35] Phil Pfeiffer,et al. Dependence analysis for pointer variables , 1989, PLDI '89.
[36] Alexander Aiken,et al. Optimal loop parallelization , 1988, PLDI '88.
[37] Steve Johnson,et al. Compiling C for vectorization, parallelization, and inline expansion , 1988, PLDI '88.
[38] Robert G. Babb II. Programming parallel processors , 1987 .
[39] Harold Stuart Stone. High-performance computer architecture (2nd ed.) , 1990 .
[40] Thomas R. Gross,et al. Postpass Code Optimization of Pipeline Constraints , 1983, TOPL.
[41] Anne Rogers,et al. Process decomposition through locality of reference , 1989, PLDI '89.
[42] Roy F. Touzeau. A Fortran compiler for the FPS-164 scientific computer , 1984, SIGPLAN '84.
[43] Robert G. Babb. SARA: A Cray Assembly Language Speedup Tool , 1990 .
[44] Williams Ludwell Harrison,et al. The interprocedural analysis and automatic parallelization of Scheme programs , 1990, LISP Symb. Comput.
[45] Richard M. Russell,et al. The CRAY-1 computer system , 1978, CACM.
[46] Keith D. Cooper,et al. An experiment with inline substitution , 1991, Softw. Pract. Exp.
[47] Ping-Sheng Tseng. Compiling programs for a linear systolic array , 1990, PLDI '90.
[48] Allen D. Malony,et al. Faust: an integrated environment for parallel programming , 1989, IEEE Software.
[49] Arvind,et al. *T: a multithreaded massively parallel architecture , 1992, ISCA '92.
[50] Ken Kennedy,et al. The ParaScope parallel programming environment , 1993, Proc. IEEE.
[51] Thomas P. Murtagh,et al. Lifetime analysis of dynamically allocated objects , 1988, POPL '88.
[52] Arvind,et al. *T: A Multithreaded Massively Parallel Architecture , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[53] Charles Koelbel,et al. Supporting shared data structures on distributed memory architectures , 1990, PPOPP '90.
[54] Daniel Gajski,et al. CEDAR: a large scale multiprocessor , 1983, CARN.
[55] Joel H. Saltz,et al. Languages, compilers and run-time environments for distributed memory machines , 1992 .
[56] Howard Jay Siegel,et al. Interconnection networks for large-scale parallel processing: theory and case studies (2nd ed.) , 1985 .
[57] Ken Kennedy,et al. Compiling programs for distributed-memory multiprocessors , 2004, The Journal of Supercomputing.
[58] Ralph Grishman,et al. The NYU Ultracomputer—Designing an MIMD Shared Memory Parallel Computer , 1983, IEEE Transactions on Computers.
[59] Ken Kennedy,et al. Automatic translation of FORTRAN programs to vector form , 1987, TOPL.
[60] I. Watson,et al. A practical data flow computer , 1982 .
[61] David E. Culler,et al. Monsoon: an explicit token-store architecture , 1998, ISCA '98.
[62] Michael J. Flynn,et al. Very high-speed computing systems , 1966 .
[63] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[64] Ken Kennedy,et al. An Implementation of Interprocedural Bounded Regular Section Analysis , 1991, IEEE Trans. Parallel Distributed Syst.
[65] Tadashi Watanabe. Architecture and performance of NEC supercomputer SX system , 1987, Parallel Comput.
[66] Constantine D. Polychronopoulos,et al. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers , 1987, IEEE Transactions on Computers.
[67] Kevin P. McAuliffe,et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture , 1985, ICPP.
[68] K. McKinley,et al. Interactive Parallel Programming Using the Parascope Editor , 1991 .
[69] S. Lennart Johnsson. The connection machine systems CM-5 , 1993, SPAA '93.
[70] Alan E. Charlesworth,et al. An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family , 1981, Computer.
[71] Santosh G. Abraham,et al. Compile-Time Partitioning of Iterative Parallel Loops to Reduce Cache Coherency Traffic , 1991, IEEE Trans. Parallel Distributed Syst.
[72] Allan Porterfield,et al. Exploiting heterogeneous parallelism on a multithreaded multiprocessor , 1992, ICS '92.
[73] Jack B. Dennis,et al. Data Flow Supercomputers , 1980, Computer.
[74] James R. Larus,et al. Detecting conflicts between structure accesses , 1988, PLDI '88.
[75] Siamak Arya. An Optimal Instruction-Scheduling Model for a Class of Vector Processors , 1985, IEEE Transactions on Computers.
[76] Paul Feautrier,et al. Direct parallelization of call statements , 1986, SIGPLAN '86.
[77] Barbara M. Chapman,et al. Supercompilers for parallel and vector computers , 1990, ACM Press frontier series.
[78] Lauren L. Smith. Vectorizing C compilers: how good are they? , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[79] Weijia Shang,et al. Time Optimal Linear Schedules for Algorithms with Uniform Dependencies , 1991, IEEE Trans. Computers.
[80] William J. Dally,et al. Deadlock-Free Message Routing in Multiprocessor Interconnection Networks , 1987, IEEE Transactions on Computers.
[81] H. T. Kung,et al. The Warp Computer: Architecture, Implementation, and Performance , 1987, IEEE Transactions on Computers.
[82] Zhiyuan Li,et al. Program parallelization with interprocedural analysis , 2004, The Journal of Supercomputing.
[83] David A. Padua,et al. Advanced compiler optimizations for supercomputers , 1986, CACM.
[84] David Bernstein,et al. Scheduling expressions on a pipelined processor with a maximal delay of one cycle , 1989, TOPL.