Improving workload balance and code optimization in processor-in-memory systems
[1] M. Castells. Multilevel tiling for non-rectangular iteration spaces, 1999.
[2] Steve Carr, et al. Combining optimization for cache and instruction-level parallelism, 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques.
[3] M. Oskin, et al. Active Pages: a computation model for intelligent memory, 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).
[4] William H. Press, et al. Numerical Recipes: FORTRAN, 1988.
[5] Frederic T. Chong, et al. Active pages: a computation model for intelligent memory, 1998, ISCA.
[6] William H. Press, et al. Numerical Recipes in Fortran 77, 1992.
[7] Tsung-Chuan Huang, et al. A new analyzing approach for intelligent memory systems, 2001, Computers and Their Applications.
[8] Csaba Andras Moritz, et al. FlexCache: A Framework for Flexible Compiler Generated Data Caching, 2000, Intelligent Memory Systems.
[9] Ko-Yang Wang. Precise compile-time performance prediction for superscalar-based computers, 1994, PLDI '94.
[10] Tsung-Chuan Huang, et al. SAGE: A New Analysis and Optimization System for FlexRAM Architecture, 2000, Intelligent Memory Systems.
[11] Robert J. Fowler, et al. MINT: a front end for efficient simulation of shared-memory multiprocessors, 1994, Proceedings of International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems.
[12] David J. Kuck, et al. A Survey of Parallel Machine Organization and Programming, 1977, CSUR.
[13] Rajesh K. Gupta, et al. Adapting cache line size to application behavior, 1999, ICS '99.
[14] Christoforos E. Kozyrakis, et al. Exploiting On-Chip Memory Bandwidth in the VIRAM Compiler, 2000, Intelligent Memory Systems.
[15] Tarek S. Abdelrahman, et al. Fusion of Loops for Parallelism and Locality, 1997, IEEE Trans. Parallel Distributed Syst.
[16] Michael C. Huang, et al. FlexRAM Architecture Design Parameters, 2002.