Software-assisted cache mechanisms for embedded systems
[1] S. Kim,et al. Fair cache sharing and partitioning in a chip multiprocessor architecture , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[2] Israel Koren,et al. The minimax cache: an energy-efficient framework for media processors , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[3] Mateo Valero,et al. Software management of selective and dual data caches , 1997 .
[4] Dean M. Tullsen,et al. Hardware identification of cache conflict misses , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[5] Arnold L. Rosenberg,et al. Using the compiler to improve cache replacement decisions , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[6] David J. Lilja,et al. A compiler-assisted data prefetch controller , 1999, Proceedings 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.99CB37040).
[7] Steven K. Reinhardt,et al. A fully associative software-managed cache design , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[8] Miodrag Potkonjak,et al. Application-driven synthesis of core-based systems , 1997, 1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD).
[9] Steven Przybylski. The performance impact of block sizes and fetch strategies , 1990, ISCA '90.
[10] Brad Calder,et al. Reducing cache misses using hardware and software page placement , 1999, ICS '99.
[11] William J. Dally,et al. Smart Memories: a modular reconfigurable architecture , 2000, ISCA '00.
[12] Todd C. Mowry,et al. Taming the memory hogs: using compiler-inserted releases to manage physical memory intelligently , 2000, OSDI.
[13] Jun Yang,et al. Lightweight set buffer: low power data cache for multimedia application , 2003, ISLPED '03.
[14] David A. Padua,et al. Estimating cache misses and locality using stack distances , 2003, ICS '03.
[15] Kenneth C. Yeager. The MIPS R10000 superscalar microprocessor , 1996, IEEE Micro.
[16] R. E. Kessler,et al. The Alpha 21264 Microprocessor: Out-of-Order Execution at 600 MHz , 1998 .
[17] Brad Calder,et al. Quantifying load stream behavior , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[18] David J. Sager,et al. The microarchitecture of the Pentium 4 processor , 2001 .
[19] Peter Petrov,et al. Performance and power effectiveness in embedded processors customizable partitioned caches , 2001, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[20] N. Maki,et al. A data-replace-controlled cache memory system and its performance evaluations , 1999, Proceedings of IEEE. IEEE Region 10 Conference. TENCON 99. 'Multimedia Technology for Asia-Pacific Information Infrastructure' (Cat. No.99CH37030).
[21] Anant Agarwal,et al. Column-associative caches: a technique for reducing the miss rate of direct-mapped caches , 1993, ISCA '93.
[22] James R. Larus,et al. Cache-conscious structure layout , 1999, PLDI '99.
[23] Israel Koren,et al. Cool-Cache: A compiler-enabled energy efficient data caching framework for embedded/multimedia processors , 2003, TECS.
[24] David J. Lilja,et al. When Caches Aren't Enough: Data Prefetching Techniques , 1997, Computer.
[25] Gilles Pokam,et al. Energy-efficiency potential of a phase-based cache resizing scheme for embedded systems , 2004, Eighth Workshop on Interaction between Compilers and Computer Architectures, 2004. INTERACT-8 2004..
[26] Gary S. Tyson,et al. Utilizing reuse information in data cache management , 1998, ICS '98.
[27] Chen Ding,et al. Miss rate prediction across all program inputs , 2003, 2003 12th International Conference on Parallel Architectures and Compilation Techniques.
[28] Yale N. Patt,et al. An effective programmable prefetch engine for on-chip caches , 1995, MICRO 1995.
[29] Yutao Zhong,et al. Predicting whole-program locality through reuse distance analysis , 2003, PLDI.
[30] James R. Larus,et al. Cache-conscious structure definition , 1999, PLDI '99.
[31] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[32] Henry M. Levy,et al. An Architecture for Software-Controlled Data Prefetching , 1991, ISCA.
[33] Mark D. Hill,et al. A case for direct-mapped caches , 1988, Computer.
[34] Krste Asanovic,et al. Direct addressed caches for reduced power consumption , 2001, MICRO.
[35] K. Kavi. Cache Memories Cache Memories in Uniprocessors. Reading versus Writing. Improving Performance , 2022 .
[36] Michel Dubois,et al. Self-correcting LRU replacement policies , 2004, CF '04.
[37] Jih-Kwon Peir,et al. Capturing dynamic memory reference behavior with adaptive cache topology , 1998, ASPLOS VIII.
[38] James R. Goodman,et al. Hardware techniques to improve the performance of the processor/memory interface , 1998 .
[39] Alexander V. Veidenbaum,et al. Reducing power consumption for high-associativity data caches in embedded processors , 2003, 2003 Design, Automation and Test in Europe Conference and Exhibition.
[40] Kathryn S. McKinley,et al. Guided region prefetching: a cooperative hardware/software approach , 2003, ISCA '03.
[41] Srinivas Devadas,et al. A Code Reordering Transformation for Improved Cache Performance , 2001 .
[42] Wen-mei W. Hwu,et al. Run-time Adaptive Cache Hierarchy Via Reference Analysis , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[43] John Turek,et al. Optimal Partitioning of Cache Memory , 1992, IEEE Trans. Computers.
[44] Kathryn S. McKinley,et al. Cooperative caching with keep-me and evict-me , 2005, 9th Annual Workshop on Interaction between Compilers and Computer Architectures (INTERACT'05).
[45] Sang Lyul Min,et al. On the existence of a spectrum of policies that subsumes the least recently used (LRU) and least frequently used (LFU) policies , 1999, SIGMETRICS '99.
[46] Margaret Martonosi,et al. TCP: tag correlating prefetchers , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..
[47] George Murillo,et al. Enhancing Data Cache Performance via Dynamic Allocation , 2003 .
[48] Cathy May,et al. The PowerPC Architecture: A Specification for a New Family of RISC Processors , 1994 .
[49] Hiroyuki Tomiyama,et al. Code placement techniques for cache miss rate reduction , 1997, TODE.
[50] Ravi R. Iyer,et al. CQoS: a framework for enabling QoS in shared caches of CMP platforms , 2004, ICS '04.
[51] Csaba Andras Moritz,et al. Cool-Mem: combining statically speculative memory accessing with selective address translation for energy efficiency , 2002, ASPLOS X.
[52] Yannis Smaragdakis,et al. EELRU: simple and effective adaptive page replacement , 1999, SIGMETRICS '99.
[53] Harish Patil,et al. Profile-guided post-link stride prefetching , 2002, ICS '02.
[54] D. Burger,et al. Memory Bandwidth Limitations of Future Microprocessors , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[55] Mahmut T. Kandemir,et al. Partitioned instruction cache architecture for energy efficiency , 2003, TECS.
[56] Todd C. Mowry,et al. Compiler-directed page coloring for multiprocessors , 1996, ASPLOS VII.
[57] Gary S. Tyson,et al. Region-based caching: an energy-delay efficient memory architecture for embedded processors , 2000, CASES '00.
[58] Tien-Fu Chen,et al. Alternative implementations of hybrid branch predictors , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[59] M. Schulz,et al. Identifying and Exploiting Spatial Regularity in Data Memory References , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[60] Per Stenström,et al. A prefetching technique for irregular accesses to linked data structures , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[61] Rajesh K. Gupta,et al. Adapting cache line size to application behavior , 1999, ICS '99.
[62] Seh-Woong Jeong,et al. Reducing cache pollution of prefetching in a small data cache , 2001, Proceedings 2001 IEEE International Conference on Computer Design: VLSI in Computers and Processors. ICCD 2001.
[63] T. Ozawa,et al. Cache miss heuristics and preloading techniques for general-purpose programs , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[64] Sanjeev Kumar,et al. Exploiting spatial locality in data caches using spatial footprints , 1998, ISCA.
[65] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[66] Jean-Loup Baer,et al. Modified LRU policies for improving second-level cache behavior , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[67] Srinivas Devadas,et al. Application-specific memory management for embedded systems using software-controlled caches , 2000, Proceedings 37th Design Automation Conference.
[68] Frank Vahid,et al. A highly configurable cache architecture for embedded systems , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..
[69] Richard E. Kessler,et al. Evaluating stream buffers as a secondary cache replacement , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[70] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, ISCA 1990.
[71] Steve Carr,et al. Reuse-distance-based miss-rate prediction on a per instruction basis , 2004, MSP '04.
[72] Anna R. Karlin,et al. A study of integrated prefetching and caching strategies , 1995, SIGMETRICS '95/PERFORMANCE '95.
[73] Dean M. Tullsen,et al. Runtime identification of cache conflict misses: The adaptive miss buffer , 2001, TOCS.
[74] Yale N. Patt,et al. The V-Way Cache: Demand Based Associativity via Global Replacement , 2005, ISCA 2005.
[75] Alexander V. Veidenbaum,et al. An Integrated Hardware/Software Data Prefetching Scheme for Shared-Memory Multiprocessors , 1994, 1994 International Conference on Parallel Processing Vol. 2.
[76] Jean-Loup Baer,et al. Effective Hardware Based Data Prefetching for High-Performance Processors , 1995, IEEE Trans. Computers.
[77] Anoop Gupta,et al. Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.
[78] Wei Zhang,et al. A compiler approach for reducing data cache energy , 2003, ICS '03.
[79] Laszlo A. Belady,et al. A Study of Replacement Algorithms for Virtual-Storage Computer , 1966, IBM Syst. J..
[80] Sally A. McKee,et al. Design and evaluation of dynamic access ordering hardware , 1996, ICS '96.
[81] S. M. Shahrier,et al. On predictability and optimization of multiprogrammed caches for real-time applications , 1997, 1997 IEEE International Performance, Computing and Communications Conference.
[82] Olivier Temam,et al. An Algorithm for Optimally Exploiting Spatial and Temporal Locality in Upper Memory Levels , 1999, IEEE Trans. Computers.
[83] Norman P. Jouppi,et al. Reconfigurable caches and their application to media processing , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[84] Mikko H. Lipasti,et al. Cache miss heuristics and preloading techniques for general-purpose programs , 1995, MICRO 28.
[86] G. Edward Suh,et al. Analytical cache models with applications to cache partitioning , 2001, ICS '01.
[87] Wen-mei W. Hwu,et al. Run-time spatial locality detection and optimization , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[88] Alan Jay Smith,et al. Evaluating Associativity in CPU Caches , 1989, IEEE Trans. Computers.
[89] Israel Koren,et al. Cool-cache for hot multimedia , 2001, MICRO.
[90] Michel Dubois,et al. Optimal replacements in caches with two miss costs , 1999, SPAA '99.
[91] Mahmut T. Kandemir,et al. Power-aware partitioned cache architectures , 2001, ISLPED '01.
[92] Sharad Malik,et al. Precise miss analysis for program transformations with caches of arbitrary associativity , 1998, ASPLOS VIII.
[93] Srinivas Devadas,et al. Software-assisted cache replacement mechanisms for embedded systems , 2001, IEEE/ACM International Conference on Computer Aided Design. ICCAD 2001. IEEE/ACM Digest of Technical Papers (Cat. No.01CH37281).
[94] Ana Pont,et al. The filter cache: a run-time cache management approach , 1999, Proceedings 25th EUROMICRO Conference. Informatics: Theory and Practice for the New Millennium.
[95] Carole Dulong,et al. The IA-64 Architecture at Work , 1998, Computer.
[96] Francky Catthoor,et al. Fast and extensive system-level memory exploration for ATM applications , 1997, Proceedings. Tenth International Symposium on System Synthesis (Cat. No.97TB100114).
[97] Babak Falsafi,et al. Exploiting choice in resizable cache design to optimize deep-submicron processor energy-delay , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[98] Margaret Martonosi,et al. Timekeeping in the memory system: predicting and optimizing memory behavior , 2002, ISCA.
[99] Wei-Chung Hsu,et al. Data Prefetching On The HP PA-8000 , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[100] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[101] Michel Dubois,et al. Cost-sensitive cache replacement algorithms , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..
[102] Aamer Jaleel,et al. Adaptive insertion policies for high performance caching , 2007, ISCA '07.
[103] Gary S. Tyson,et al. Active Management of Data Caches by Exploiting Reuse Information , 1999, IEEE Trans. Computers.
[104] Babak Falsafi,et al. Selective, accurate, and timely self-invalidation using last-touch prediction , 2000, ISCA '00.
[105] Jun Yang,et al. Low cost instruction cache designs for tag comparison elimination , 2003, ISLPED '03.
[106] Siddhartha Chatterjee,et al. Exact analysis of the cache behavior of nested loops , 2001, PLDI '01.
[107] Doug Burger,et al. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches , 2002, ASPLOS X.
[108] Antonio González,et al. A locality sensitive multi-module cache with explicit management , 1999, ICS '99.
[109] Harold S. Stone,et al. Improving Disk Cache Hit-Ratios Through Cache Partitioning , 1992, IEEE Trans. Computers.
[110] Sally A. McKee,et al. Smarter Memory: Improving Bandwidth for Streamed References , 1998, Computer.
[111] Jignesh M. Patel,et al. Data prefetching by dependence graph precomputation , 2001, ISCA 2001.
[112] Gary S. Tyson,et al. A modified approach to data cache management , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[113] Brian N. Bershad,et al. Avoiding conflict misses dynamically in large direct-mapped caches , 1994, ASPLOS VI.
[114] Wayne H. Wolf,et al. A task-level hierarchical memory model for system synthesis of multiprocessors , 1997, DAC.
[115] Srinivas Devadas,et al. Controlling Cache Pollution in Prefetching With Software-assisted Cache Replacement , 2005 .
[116] Yale N. Patt,et al. Partitioned first-level cache design for clustered microarchitectures , 2003, ICS '03.
[117] Babak Falsafi,et al. Dead-block prediction & dead-block correlating prefetchers , 2001, ISCA 2001.
[118] Alexandru Nicolau,et al. Memory Issues in Embedded Systems-on-Chip: Optimizations and Exploration , 1998 .
[119] David J. Lilja,et al. Data prefetch mechanisms , 2000, CSUR.
[120] Mahmut T. Kandemir,et al. A matrix-based approach to the global locality optimization problem , 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[121] Wen-mei W. Hwu,et al. Run-Time Cache Bypassing , 1999, IEEE Trans. Computers.
[122] Babak Falsafi,et al. Accurate and complexity-effective spatial pattern prediction , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[123] Mikko H. Lipasti,et al. Partial resolution in branch target buffers , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[124] Yale N. Patt,et al. The V-Way cache: demand-based associativity via global replacement , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[125] Björn Lisper,et al. Data cache locking for higher program predictability , 2003, SIGMETRICS '03.
[126] Richard E. Kessler,et al. The Alpha 21264 microprocessor , 1999, IEEE Micro.
[127] Scott McFarling,et al. Program optimization for instruction caches , 1989, ASPLOS III.
[128] Santosh G. Abraham,et al. Efficient simulation of caches under optimal replacement with applications to miss characterization , 1993, SIGMETRICS '93.
[129] James R. Larus,et al. Using generational garbage collection to implement cache-conscious data placement , 1998, ISMM '98.
[130] Chandra Krintz,et al. Cache-conscious data placement , 1998, ASPLOS VIII.
[131] Emmett Witchel. The Span Cache: Software Controlled Tag Checks and Cache Line Size , 2001 .
[132] Jim Zelenka,et al. Informed prefetching and caching , 1995, SOSP.
[133] Todd C. Mowry,et al. Compiler-based prefetching for recursive data structures , 1996, ASPLOS VII.
[134] C. M. Krishna,et al. Cool-cache for hot multimedia , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[135] Kathryn S. McKinley,et al. Combining Cooperative Software / Hardware Prefetching and Cache Replacement , 2004 .
[136] Peter Petrov,et al. Towards effective embedded processors in codesigns: customizable partitioned caches , 2001, Ninth International Symposium on Hardware/Software Codesign. CODES 2001 (IEEE Cat. No.01TH8571).