A compiler-based approach for dynamically managing scratch-pad memories in embedded systems
Mahmut T. Kandemir | Narayanan Vijaykrishnan | Mary Jane Irwin | Ismail Kadayif | J. Ramanujam | Amisha Parikh
[1] Jennifer Eyre,et al. DSP Processors Hit the Mainstream , 1998, Computer.
[2] Ken Kennedy,et al. Improving cache performance in dynamic applications through data and computation reorganization at run time , 1999, PLDI '99.
[3] John Zahorjan,et al. Optimizing Data Locality by Array Restructuring , 1995 .
[4] Donald Yeung,et al. Evaluating the impact of memory system performance on software prefetching and locality optimizations , 2001, ICS '01.
[5] W. Jalby,et al. To copy or not to copy: a compile-time technique for assessing when data copying should be used to eliminate cache conflicts , 1993, Supercomputing '93.
[6] Steven K. Reinhardt,et al. A fully associative software-managed cache design , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[7] Wei Li,et al. Unifying data and control transformations for distributed shared-memory machines , 1995, PLDI '95.
[8] François Irigoin,et al. Supernode partitioning , 1988, POPL '88.
[9] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[10] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1990 .
[11] Michael F. P. O'Boyle,et al. Integrating Loop and Data Transformations for Global Optimization , 2002, J. Parallel Distributed Comput..
[12] H. De Man,et al. Global communication and memory optimizing transformations for low power signal processing systems , 1994, Proceedings of 1994 IEEE Workshop on VLSI Signal Processing.
[13] Mahmut T. Kandemir,et al. Improving locality using loop and data transformations in an integrated framework , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[14] Miodrag Potkonjak,et al. MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[15] Sally A. McKee,et al. Access ordering and memory-conscious cache utilization , 1995, Proceedings of 1995 1st IEEE Symposium on High Performance Computer Architecture.
[16] Santosh Pande,et al. Optimizing On-Chip Memory Usage Through Loop Restructuring for Embedded Processors , 2000 .
[17] Michael F. P. O'Boyle,et al. Integrating loop and data transformations for global optimisation , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[18] David A. Patterson,et al. Computer Architecture - A Quantitative Approach, 5th Edition , 1996 .
[19] Chau-Wen Tseng,et al. Improving Locality for Adaptive Irregular Scientific Codes , 2000, LCPC.
[20] Kathryn S. McKinley,et al. Tile size selection using cache organization and data layout , 1995, PLDI '95.
[21] Monica S. Lam,et al. The cache performance and optimizations of blocked algorithms , 1991, ASPLOS IV.
[22] Chaitali Chakrabarti,et al. Memory exploration for low power embedded systems , 1999, ISCAS'99. Proceedings of the 1999 IEEE International Symposium on Circuits and Systems VLSI (Cat. No.99CH36349).
[23] Keshav Pingali,et al. Data-centric multi-level blocking , 1997, PLDI '97.
[24] Monica S. Lam,et al. Automatic computation and data decomposition for multiprocessors , 1997 .
[25] Sanjay Ranka,et al. Memory hierarchy management for iterative graph structures , 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.
[26] Ken Kennedy,et al. Improving memory hierarchy performance for irregular applications , 1999, ICS '99.
[27] Francky Catthoor,et al. Custom Memory Management Methodology: Exploration of Memory Organisation for Embedded Multimedia System Design , 1998 .
[28] Mahmut T. Kandemir,et al. Energy-driven integrated hardware-software optimizations using SimplePower , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[29] Michael Wolfe,et al. High performance compilers for parallel computing , 1995 .
[30] Keith D. Cooper,et al. Compiler-controlled memory , 1998, ASPLOS VIII.
[31] Luca Benini,et al. Increasing Energy Efficiency of Embedded Systems by Application-Specific Memory Hierarchy Generation , 2000, IEEE Des. Test Comput..
[32] Saman Amarasinghe,et al. The SUIF compiler for scalable parallel machines , 1995 .
[33] Mahmut T. Kandemir,et al. Influence of compiler optimizations on system power , 2000, Proceedings 37th Design Automation Conference.
[34] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[35] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools (2nd Edition) , 2006 .
[36] Anoop Gupta,et al. Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.
[37] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPLAS.
[38] J. Eyre,et al. The evolution of DSP processors , 2000, IEEE Signal Process. Mag..
[39] Nikil D. Dutt,et al. Architectural exploration and optimization of local memory in embedded systems , 1997, Proceedings. Tenth International Symposium on System Synthesis (Cat. No.97TB100114).
[40] Anant Agarwal,et al. Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-Memory Multiprocessors , 1995, IEEE Trans. Parallel Distributed Syst..
[41] Nikil D. Dutt,et al. Efficient utilization of scratch-pad memory in embedded processor applications , 1997, Proceedings European Design and Test Conference. ED & TC 97.
[42] Dennis Gannon,et al. Strategies for cache and local memory management by global program transformation , 1988, J. Parallel Distributed Comput..
[43] Larry Carter,et al. Localizing non-affine array references , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).
[44] Utpal Banerjee,et al. Dependence analysis for supercomputing , 1988, The Kluwer international series in engineering and computer science.