A Compiler Optimization Algorithm for Shared-Memory Multiprocessors
[1] A. Veidenbaum,et al. The cedar system and an initial performance study , 1993, ISCA '93.
[2] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPL.
[3] David A. Padua,et al. Dependence graphs and compiler optimizations , 1981, POPL '81.
[4] Ken Kennedy,et al. A technique for summarizing data access and its use in parallelism enhancing transformations , 1989, PLDI '89.
[5] Chau-Wen Tseng,et al. Compiler optimizations for eliminating barrier synchronization , 1995, PPOPP '95.
[6] Michael F. P. O'Boyle,et al. Compiler reduction of synchronisation in shared virtual memory systems , 1995, ICS '95.
[7] Mary W. Hall,et al. The ParaScope Parallel Programming , 1992 .
[8] Stephen J. Wright. Stable Parallel Algorithms for Two-Point Boundary Value Problems , 1992, SIAM J. Sci. Comput..
[9] Ken Kennedy,et al. Optimizing for parallelism and data locality , 1992 .
[10] Jaspal Subhlok,et al. Analysis of synchronization in a parallel programming environment , 1992 .
[11] Ken Kennedy,et al. Automatic loop interchange , 2004, SIGP.
[12] Ken Kennedy,et al. Automatic decomposition of scientific programs for parallel execution , 1987, POPL '87.
[13] Michael E. Wolf,et al. Improving locality and parallelism in nested loops , 1992 .
[14] David A. Padua,et al. Restructuring Fortran programs for Cedar , 1993, Concurr. Pract. Exp..
[15] Monica S. Lam,et al. Data and computation transformations for multiprocessors , 1995, PPOPP '95.
[16] Wilson C. Hsieh,et al. A framework for determining useful parallelism , 1988, ICS '88.
[17] Ken Kennedy,et al. Procedure cloning , 1992, Proceedings of the 1992 International Conference on Computer Languages.
[18] Guangye Li,et al. An Implementation of a Parallel Primal-Dual Interior Point Method for Multicommodity Flow Problems , 1992 .
[19] Pen-Chung Yew,et al. Efficient interprocedural analysis for program parallelization and restructuring , 1988, PPoPP 1988.
[20] Kathryn S. McKinley,et al. Automatic and interactive parallelization , 1992 .
[21] Susan J. Eggers,et al. Reducing false sharing on shared memory multiprocessors through compile time data transformations , 1995, PPOPP '95.
[22] Ken Kennedy,et al. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution , 1993, LCPC.
[23] Mary W. Hall,et al. Interprocedural Transformations for Parallel Code Generation , 1991 .
[24] Anita Osterhaug. Guide to parallel programming on Sequent computer systems , 1989 .
[25] Ken Kennedy,et al. Automatic translation of FORTRAN programs to vector form , 1987, TOPL.
[26] Monica S. Lam,et al. Detecting Coarse-Grain Parallelism Using an Interprocedural Parallelizing Compiler , 1995, Proceedings of the IEEE/ACM SC95 Conference.
[27] Utpal Banerjee,et al. A theory of loop permutations , 1990 .
[28] V. Klema. LINPACK user's guide , 1980 .
[29] David A. Padua,et al. On the Automatic Parallelization of the Perfect Benchmarks , 1998, IEEE Trans. Parallel Distributed Syst..
[30] Stephen G. Nash,et al. A General-Purpose Parallel Algorithm for Unconstrained Optimization , 1991, SIAM J. Optim..
[31] David A. Padua,et al. The Cedar System And An Initial Performance Study , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[32] Paul Feautrier,et al. Direct parallelization of call statements , 1986, SIGPLAN '86.
[33] John L. Hennessy,et al. Finding and Exploiting Parallelism in an Ocean Simulation Program: Experience, Results, and Implications , 1992, J. Parallel Distributed Comput..
[34] Michael Wolfe,et al. Advanced Loop Interchanging , 1986, ICPP.
[35] V. Sarkar,et al. Automatic partitioning of a program dependence graph into parallel tasks , 1991, IBM J. Res. Dev..
[36] Ken Kennedy,et al. An Implementation of Interprocedural Bounded Regular Section Analysis , 1991, IEEE Trans. Parallel Distributed Syst..
[37] Monica S. Lam,et al. A Loop Transformation Theory and an Algorithm to Maximize Parallelism , 1991, IEEE Trans. Parallel Distributed Syst..
[38] Lawrence Rauchwerger,et al. Effective Automatic Parallelization with Polaris , 1995 .
[39] Ken Kennedy,et al. Typed Fusion with Applications to Parallel and Sequential Code Generation , 1994 .
[40] Ken Kennedy,et al. Practical dependence testing , 1991, PLDI '91.
[41] J. Dennis,et al. Direct Search Methods on Parallel Machines , 1991 .
[42] Vivek Sarkar,et al. A general framework for iteration-reordering loop transformations , 1992, PLDI '92.
[43] Ken Kennedy,et al. Analysis and transformation in an interactive parallel programming tool , 1993, Concurr. Pract. Exp..
[44] Stephen J. Wright,et al. Parallel Algorithms for Banded Linear Systems , 1991, SIAM J. Sci. Comput..
[45] William F. Appelbe,et al. A New Algorithm for Global Optimization for Parallelism and Locality , 1994, LCPC.
[46] Stephen G. Nash,et al. Algorithm 711: BTN: software for parallel unconstrained optimization , 1992, TOMS.
[47] William F. Appelbe,et al. Program Transformation for Locality Using Affinity Regions , 1993, LCPC.
[48] Olivier Temam,et al. A quantitative analysis of loop nest locality , 1996, ASPLOS VII.