Enhancing Spatial Locality using Data Layout Optimizations
J. Ramanujam | N. Shenoy | M. Kandemir | P. Banerjee | A. Choudhary
[1] Mahmut T. Kandemir, et al. Compiler algorithms for optimizing locality and parallelism on shared and distributed memory machines, 1997, Proceedings 1997 International Conference on Parallel Architectures and Compilation Techniques.
[2] John Zahorjan, et al. Optimizing Data Locality by Array Restructuring, 1995.
[3] Mahmut Kandemir, et al. A Data Layout Optimization Technique Based on Hyperplanes, 1997.
[4] Wei Li, et al. Compiler cache optimizations for banded matrix problems, 1995, ICS '95.
[5] Anoop Gupta, et al. The DASH prototype: implementation and performance, 1992, ISCA '92.
[6] Chau-Wen Tseng, et al. Compiler optimizations for improving data locality, 1994, ASPLOS VI.
[7] Henry G. Dietz, et al. Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation, 1991, LCPC.
[8] Mahmut T. Kandemir, et al. A compiler algorithm for optimizing locality in loop nests, 1997, ICS '97.
[9] Monica S. Lam, et al. Data and computation transformations for multiprocessors, 1995, PPOPP '95.
[10] Wei Li, et al. Unifying data and control transformations for distributed shared-memory machines, 1995.
[11] Chau-Wen Tseng, et al. Compiler optimizations for improving data locality, 1994.
[12] Barbara M. Chapman, et al. Supercompilers for parallel and vector computers, 1990, ACM Press frontier series.
[13] J. Ramanujam, et al. Compile-Time Techniques for Data Distribution in Distributed Memory Machines, 1991, IEEE Trans. Parallel Distributed Syst.
[14] Susan J. Eggers, et al. Eliminating False Sharing, 1991, ICPP.
[15] Monica S. Lam, et al. A data locality optimizing algorithm, 1991, PLDI '91.
[16] Chau-Wen Tseng, et al. Improving data locality with loop transformations, 1996, TOPL.
[17] Wei Li, et al. Compiling for NUMA Parallel Machines, 1993.
[18] Wei Li, et al. Unifying data and control transformations for distributed shared-memory machines, 1995, PLDI '95.
[19] Michael Wolfe, et al. High performance compilers for parallel computing, 1995.
[20] Dennis Gannon, et al. Strategies for cache and local memory management by global program transformation, 1988, J. Parallel Distributed Comput.
[21] Susan J. Eggers, et al. Reducing false sharing on shared memory multiprocessors through compile time data transformations, 1995, PPOPP '95.
[22] Jack J. Dongarra, et al. A set of level 3 basic linear algebra subprograms, 1990, TOMS.
[23] Michael F. P. O'Boyle, et al. Non-singular data transformations: definition, validity and applications, 1997, ICS '97.