The evicted-address filter: A unified mechanism to address both cache pollution and thrashing
Onur Mutlu | Todd C. Mowry | Michael A. Kozuch | Vivek Seshadri
[1] Ken Kennedy,et al. Improving effective bandwidth through compiler enhancement of global cache reuse , 2004, J. Parallel Distributed Comput..
[2] Jichuan Chang,et al. Cooperative cache partitioning for chip multiprocessors , 2007, ICS '07.
[3] Abhishek Kumar,et al. A New Design of Bloom Filter for Packet Inspection Speedup , 2007, IEEE GLOBECOM 2007 - IEEE Global Telecommunications Conference.
[4] A. Snavely,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[5] Mor Harchol-Balter,et al. Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[6] Aamer Jaleel,et al. High performance cache replacement using re-reference interval prediction (RRIP) , 2010, ISCA.
[7] John Turek,et al. Optimal Partitioning of Cache Memory , 1992, IEEE Trans. Computers.
[8] Balaram Sinharoy,et al. POWER7: IBM's next generation server processor , 2010, 2009 IEEE Hot Chips 21 Symposium (HCS).
[9] Ashish Goel,et al. Small subset queries and bloom filters using ternary associative memories, with applications , 2010, SIGMETRICS '10.
[10] Yale N. Patt,et al. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[11] Gabriel H. Loh,et al. PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches , 2009, ISCA '09.
[12] Wen-mei W. Hwu,et al. Run-Time Cache Bypassing , 1999, IEEE Trans. Computers.
[13] Yale N. Patt,et al. The V-Way cache: demand-based associativity via global replacement , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[14] Burton H. Bloom,et al. Space/time trade-offs in hash coding with allowable errors , 1970, CACM.
[15] Ken Kennedy,et al. Improving effective bandwidth through compiler enhancement of global cache reuse , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[16] Stijn Eyerman,et al. System-Level Performance Metrics for Multiprogram Workloads , 2008, IEEE Micro.
[17] Dennis Shasha,et al. 2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm , 1994, VLDB.
[18] S. Kim,et al. Fair cache sharing and partitioning in a chip multiprocessor architecture , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[19] Carole-Jean Wu,et al. SHiP: Signature-based Hit Predictor for high performance caching , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[20] André Seznec,et al. Exploiting Single-Usage for Effective Memory Management , 2007, Asia-Pacific Computer Systems Architecture Conference.
[21] Jaehyuk Huh,et al. Cache bursts: A new approach for eliminating dead blocks and increasing cache efficiency , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[22] Mor Harchol-Balter,et al. ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[23] R. Govindarajan,et al. Emulating Optimal Replacement with a Shepherd Cache , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[24] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, ISCA 1990.
[25] Brad Calder,et al. Automatically characterizing large scale program behavior , 2002, ASPLOS X.
[26] Basilio B. Fraguela,et al. Adaptive line placement with the set balancing cache , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[27] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[28] David M. Brooks,et al. The design of a bloom filter hardware accelerator for ultra low power systems , 2009, ISLPED.
[29] Aamer Jaleel,et al. Adaptive insertion policies for high performance caching , 2007, ISCA '07.
[30] Dean M. Tullsen,et al. Hardware identification of cache conflict misses , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[31] Arnold L. Rosenberg,et al. Using the compiler to improve cache replacement decisions , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[32] Gary S. Tyson,et al. A modified approach to data cache management , 1995, MICRO 1995.
[33] Steven K. Reinhardt,et al. A fully associative software-managed cache design , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[34] Onur Mutlu,et al. A Case for MLP-Aware Cache Replacement , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[35] Song Jiang,et al. LIRS: an efficient low inter-reference recency set replacement policy to improve buffer cache performance , 2002, SIGMETRICS '02.
[36] Dharmendra S. Modha,et al. CAR: Clock with Adaptive Replacement , 2004, FAST.
[37] Chau-Wen Tseng,et al. Compiler optimizations for eliminating cache conflict misses , 1997 .
[38] Gerhard Weikum,et al. The LRU-K page replacement algorithm for database disk buffering , 1993, SIGMOD Conference.
[39] André Seznec,et al. A case for two-way skewed-associative caches , 1993, ISCA '93.
[40] Christoforos E. Kozyrakis,et al. The ZCache: Decoupling Ways and Associativity , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[41] Aamer Jaleel,et al. Adaptive insertion policies for managing shared caches , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[42] Maurice Herlihy,et al. Virtualizing transactional memory , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[43] Edward S. Davidson,et al. Reducing conflicts in direct-mapped caches with a temporality-based design , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.
[44] Manoj Franklin,et al. Balancing throughput and fairness in SMT processors , 2001, 2001 IEEE International Symposium on Performance Analysis of Systems and Software. ISPASS..
[45] Ravi R. Iyer,et al. CQoS: a framework for enabling QoS in shared caches of CMP platforms , 2004, ICS '04.
[46] James E. Smith,et al. Virtual private caches , 2007, ISCA '07.
[47] Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors , 1970 .
[48] Basilio B. Fraguela,et al. Reducing capacity and conflict misses using Set Saturation Levels , 2010, 2010 International Conference on High Performance Computing.
[49] Nimrod Megiddo,et al. ARC: A Self-Tuning, Low Overhead Replacement Cache , 2003, FAST.
[50] Shih-Lien Lu,et al. Bloom filtering cache misses for accurate data speculation and prefetching , 2014, ICS 25th Anniversary.
[51] M. V. Ramakrishna,et al. Efficient Hardware Hashing Functions for High Performance Computers , 1997, IEEE Trans. Computers.
[52] G. Edward Suh,et al. A new memory monitoring scheme for memory-aware scheduling and partitioning , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[53] Hong Jiang,et al. STEM: Spatiotemporal Management of Capacity for Intra-core Last Level Caches , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[54] Stefanos Kaxiras,et al. Cache replacement based on reuse-distance prediction , 2007, 2007 25th International Conference on Computer Design.