Combining Process-Based Cache Partitioning and Pollute Region Isolation to Improve Shared Last Level Cache Utilization on Multicore Systems
Tao Huang | Jing Wang | Qi Zhong | Xuetao Guan | Keyi Wang
[1] Yutao Zhong, et al. Predicting whole-program locality through reuse distance analysis, 2003, PLDI.
[2] Gary S. Tyson, et al. Region-based caching: an energy-delay efficient memory architecture for embedded processors, 2000, CASES '00.
[3] Zhao Zhang, et al. Enabling software management for multicore caches with a lightweight hardware support, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[4] Tao Huang, et al. Reducing last level cache pollution through OS-level software-controlled region-based partitioning, 2012, SAC '12.
[5] Emery D. Berger, et al. CRAMM: virtual memory support for garbage-collected applications, 2006, OSDI '06.
[6] Yan Solihin, et al. Predicting inter-thread cache contention on a chip multi-processor architecture, 2005, 11th International Symposium on High-Performance Computer Architecture.
[7] Harish Patil, et al. Pin: building customized program analysis tools with dynamic instrumentation, 2005, PLDI '05.
[8] Aamer Jaleel, et al. Adaptive insertion policies for high performance caching, 2007, ISCA '07.
[9] S. Kim, et al. Fair cache sharing and partitioning in a chip multiprocessor architecture, 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.
[10] Michael Stumm, et al. RapidMRC: approximating L2 miss rate curves on commodity systems for online optimizations, 2009, ASPLOS.
[11] Yuanyuan Zhou, et al. The Multi-Queue Replacement Algorithm for Second Level Buffer Caches, 2001, USENIX Annual Technical Conference, General Track.
[12] Michael Stumm, et al. Enhancing operating system support for multicore processors by using hardware performance monitoring, 2009, OPSR.
[13] Yan Solihin, et al. Counter-based cache replacement algorithms, 2005, 2005 International Conference on Computer Design.
[14] Sang Lyul Min, et al. A low-overhead high-performance unified buffer management scheme that exploits sequential and looping references, 2000, OSDI.
[15] Michael Stumm, et al. Path: page access tracking to improve memory management, 2007, ISMM '07.
[16] Aamer Jaleel, et al. Adaptive insertion policies for managing shared caches, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[17] John L. Henning. SPEC CPU2000: Measuring CPU Performance in the New Millennium, 2000, Computer.
[18] Yale N. Patt, et al. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches, 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[19] Laszlo A. Belady, et al. A Study of Replacement Algorithms for a Virtual-Storage Computer, 1966, IBM Syst. J.
[20] Irving L. Traiger, et al. Evaluation Techniques for Storage Hierarchies, 1970, IBM Syst. J.
[21] Zhang Min-xuan. A Parallel Stream Memory Architecture for Heterogeneous Multi-core Processor, 2009.
[22] Xiaoning Ding, et al. ULCC: a user-level facility for optimizing shared cache performance on multicores, 2011, PPoPP '11.
[23] David K. Tam, et al. Managing Shared L2 Caches on Multicore Systems in Software, 2007.
[24] Chen Ding, et al. On the theory and potential of LRU-MRU collaborative cache management, 2011, ISMM '11.
[25] Michael Stumm, et al. Reducing the harmful effects of last-level cache polluters with an OS-level, software-only pollute buffer, 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[26] Zhao Zhang, et al. Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning, 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[27] Richard E. Kessler, et al. Page placement algorithms for large real-indexed caches, 1992, TOCS.