Optimal software pipelining of nested loops
[1] Alexandru Nicolau,et al. Loop Quantization: A Generalized Loop Unwinding Technique , 1988, J. Parallel Distributed Comput..
[2] Kemal Ebcioglu,et al. A global resource-constrained parallelization technique , 1989 .
[3] Shlomo Weiss,et al. A study of scalar compilation techniques for pipelined supercomputers , 1987, ASPLOS 1987.
[4] Christine Eisenbeis. Optimization of horizontal microcode generation for loop structures , 1988, ICS '88.
[5] J. Ramanujam,et al. Tiling multidimensional iteration spaces for nonshared memory machines , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[6] Alexander Aiken,et al. Optimal loop parallelization , 1988, PLDI '88.
[7] H. P. Williams. Theory of Linear and Integer Programming (Wiley-Interscience Series in Discrete Mathematics and Optimization) , 1989 .
[8] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1990 .
[9] Roy F. Touzeau. A Fortran compiler for the FPS-164 scientific computer , 1984, SIGPLAN '84.
[10] Utpal Banerjee,et al. Dependence analysis for supercomputing , 1988, The Kluwer international series in engineering and computer science.
[11] Joseph A. Fisher,et al. Trace Scheduling: A Technique for Global Microcode Compaction , 1981, IEEE Transactions on Computers.
[12] David A. Padua,et al. Advanced compiler optimizations for supercomputers , 1986, CACM.
[13] Barbara B. Simons,et al. Scheduling Sequential Loops on Parallel Processors , 1987, ICS.
[14] Monica Sin-Ling Lam,et al. A Systolic Array Optimizing Compiler , 1989 .
[15] Alex Aiken,et al. Compaction-Based Parallelization , 1988 .
[16] Leslie Lamport,et al. The parallel execution of DO loops , 1974, CACM.
[17] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[18] Robert P. Colwell,et al. A VLIW architecture for a trace scheduling compiler , 1987, ASPLOS 1987.
[19] Ronald Gary Cytron. Compile-time scheduling and optimization for asynchronous machines (multiprocessor, compiler, parallel processing) , 1984 .
[20] B. Ramakrishna Rau. Cydra 5 directed dataflow architecture , 1988, Digest of Papers. COMPCON Spring 88 Thirty-Third IEEE Computer Society International Conference.
[21] Guang R. Gao,et al. Extending Software Pipelining Techniques for Scheduling Nested Loops , 1993, LCPC.
[22] J. Ramanujam,et al. Tiling Multidimensional Iteration Spaces for Multicomputers , 1992, J. Parallel Distributed Comput..
[23] Jian Wang,et al. GURPR—a method for global software pipelining , 1987, MICRO 20.
[24] Guang R. Gao,et al. A timed Petri-net model for fine-grain loop scheduling , 1991, PLDI '91.
[25] Vicki H. Allan,et al. Software pipelining: a comparison and improvement , 1990, Proceedings of the 23rd Annual Workshop and Symposium, MICRO 23: Microprogramming and Microarchitecture.
[26] J. Ramanujam. Software Pipelining of Nested Loops , 1994 .
[27] Steven Vajda,et al. Linear Programming. Methods and Applications , 1964 .
[28] Ken Kennedy,et al. Automatic translation of FORTRAN programs to vector form , 1987, TOPL.
[29] Robert E. Tarjan,et al. Depth-First Search and Linear Graph Algorithms , 1972, SIAM J. Comput..
[30] Alan E. Charlesworth,et al. An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family , 1981, Computer.
[31] D. Bartholomew,et al. Linear Programming: Methods and Applications , 1970 .
[32] Kemal Ebcioglu,et al. A compilation technique for software pipelining of loops with conditional jumps , 1987, MICRO 20.
[33] Ron Cytron,et al. Doacross: Beyond Vectorization for Multiprocessors , 1986, ICPP.
[34] B. Ramakrishna Rau,et al. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing , 1981, MICRO 14.
[35] Kazuo Iwano,et al. An Efficient Algorithm for Optimal Loop Parallelization , 1990, SIGAL International Symposium on Algorithms.