High performing cache hierarchies for server workloads: Relaxing inclusion to capture the latency benefits of exclusive caches
Adrian Moga | Aamer Jaleel | Joel S. Emer | Simon C. Steely | Joseph Nuzman
[1] Babak Falsafi,et al. Proactive instruction fetch , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[2] Wen-Hann Wang,et al. On the inclusion properties for multi-level cache hierarchies , 1988, ISCA '88.
[3] Babak Falsafi,et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware , 2012, ASPLOS XVII.
[4] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, Proceedings. The 17th Annual International Symposium on Computer Architecture.
[5] Aamer Jaleel,et al. Adaptive insertion policies for high performance caching , 2007, ISCA '07.
[6] Mainak Chaudhuri,et al. Bypass and insertion algorithms for exclusive last-level caches , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[7] Moinuddin K. Qureshi. Adaptive Spill-Receive for robust high-performance caching in CMPs , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[8] Thomas F. Wenisch,et al. Practical off-chip meta-data for temporal memory streaming , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[9] Mohamed M. Zahran,et al. Non-Inclusion Property in Multi-Level Caches Revisited , 2007, Int. J. Comput. Their Appl..
[10] Norman P. Jouppi,et al. Tradeoffs in two-level on-chip caching , 1994, ISCA '94.
[11] Aamer Jaleel,et al. Achieving Non-Inclusive Cache Performance with Inclusive Caches: Temporal Locality Aware (TLA) Cache Management Policies , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[12] Aamer Jaleel,et al. High performance cache replacement using re-reference interval prediction (RRIP) , 2010, ISCA.
[13] Pejman Lotfi Kamran. Scale-Out Processors , 2013 .
[14] Babak Falsafi,et al. Database Servers on Chip Multiprocessors: Limitations and Opportunities , 2007, CIDR.
[15] Ying Zheng,et al. Performance evaluation of exclusive cache hierarchies , 2004, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2004.
[16] Babak Falsafi,et al. Scale-out processors , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[17] Luiz André Barroso,et al. Memory system characterization of commercial workloads , 1998, ISCA.
[18] Aamer Jaleel,et al. Adaptive insertion policies for managing shared caches , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[19] Thomas F. Wenisch,et al. Temporal instruction fetch streaming , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[20] B. Jacob,et al. CMP$im: A Pin-Based On-The-Fly Multi-Core Cache Simulator , 2008 .
[21] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, ISCA 1990.
[22] A. Jaleel. Memory Characterization of Workloads Using Instrumentation-Driven Simulation: A Pin-based Memory Characterization of the SPEC CPU 2000 and SPEC CPU 2006 Benchmark Suites , 2022 .
[23] Carole-Jean Wu,et al. PACMan: Prefetch-Aware Cache Management for high performance caching , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[24] Scott McFarling. Cache replacement with dynamic exclusion , 1992, ISCA '92.
[25] Hyesoon Kim,et al. FLEXclusion: Balancing cache capacity and on-chip bandwidth via Flexible Exclusion , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[26] M. Zahran. Cache Replacement Policy Revisited , 2022 .
[27] Jichuan Chang,et al. Cooperative Caching for Chip Multiprocessors , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[28] Glenn Reinman,et al. Optimizations Enabled by a Decoupled Front-End Architecture , 2001, IEEE Trans. Computers.
[29] Thomas F. Wenisch,et al. Temporal streams in commercial server applications , 2008, 2008 IEEE International Symposium on Workload Characterization.
[30] Rajiv Kapoor,et al. Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[31] David A. Wood,et al. ASR: Adaptive Selective Replication for CMP Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[32] Carole-Jean Wu,et al. SHiP: Signature-based Hit Predictor for high performance caching , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[33] Kevin M. Lepak,et al. Cache Hierarchy and Memory Subsystem of the AMD Opteron Processor , 2010, IEEE Micro.
[34] Katherine E. Fletcher,et al. Techniques For Reducing the Impact of Inclusion in Shared Network Cache Multiprocessors , 1994 .