Hierarchical private/shared classification: The key to simple and efficient coherence for clustered cache hierarchies
[1] Babak Falsafi,et al. Reactive NUCA: near-optimal block placement and replication in distributed caches , 2009, ISCA '09.
[2] Sarita V. Adve,et al. DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[3] Thomas J. Ashby,et al. Software-Based Cache Coherence with Hardware-Assisted Selective Self-Invalidations Using Bloom Filters , 2011, IEEE Transactions on Computers.
[4] Charles E. Leiserson,et al. A consistency architecture for hierarchical shared caches , 2008, SPAA '08.
[5] Mohammad Alisafaee. Spatiotemporal Coherence Tracking , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[6] David A. Wood,et al. Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[7] Niraj K. Jha,et al. GARNET: A detailed on-chip network model inside a full-system simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[8] Michael C. Huang,et al. POPS: Coherence Protocol Optimization for Both Private and Shared Data , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[9] David A. Wood,et al. Variability in architectural simulations of multi-threaded workloads , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[10] Milo M. K. Martin,et al. Why on-chip cache coherence is here to stay , 2012, Commun. ACM.
[11] James R. Larus,et al. Mechanisms for cooperative shared memory , 1993, ISCA '93.
[12] David B. Gustavson. The Scalable Coherent Interface and related standards projects , 1992, IEEE Micro.
[13] Mark D. Hill,et al. Virtual Hierarchies , 2008, IEEE Micro.
[14] Michael Butler,et al. Bulldozer: An Approach to Multithreaded Compute Performance , 2011, IEEE Micro.
[15] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[16] Alan J. Hu,et al. Improving multiple-CMP systems using token coherence , 2005, 11th International Symposium on High-Performance Computer Architecture.
[17] Antonio Robles,et al. Temporal-Aware Mechanism to Detect Private Data in Chip Multiprocessors , 2013, 2013 42nd International Conference on Parallel Processing.
[18] Antonio Robles,et al. Increasing the effectiveness of directory caches by deactivating coherence for private memory blocks , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[19] Andrew W. Wilson,et al. Hierarchical cache/bus architecture for shared memory multiprocessors , 1987, ISCA '87.
[20] M. Hill,et al. Weak ordering-a new definition , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[21] Babak Falsafi,et al. Multi-grain coherence directories , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[22] David A. Wood,et al. A Primer on Memory Consistency and Cache Coherence , 2012, Synthesis Lectures on Computer Architecture.
[23] Li Shang,et al. In-Network Cache Coherence , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[24] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[25] Jaehyuk Huh,et al. Subspace snooping: Filtering snoops with operating system support , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[26] Stefanos Kaxiras,et al. SARC Coherence: Scaling Directory Cache Coherence in Performance and Power , 2010, IEEE Micro.
[27] William J. Dally,et al. The GPU Computing Era , 2010, IEEE Micro.
[28] N. Gura,et al. UltraSPARC T2: A highly-treaded, power-efficient, SPARC SOC , 2007, 2007 IEEE Asian Solid-State Circuits Conference.
[29] Kai Li,et al. The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[30] Natalie D. Enright Jerger,et al. Virtual tree coherence: Leveraging regions and in-network multicast trees for scalable cache coherence , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[31] Stefanos Kaxiras,et al. Complexity-effective multicore coherence , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[32] David A. Wood,et al. Heterogeneous-race-free memory models , 2014, ASPLOS.
[33] Dhiraj K. Pradhan,et al. Two economical directory schemes for large-scale cache coherent multiprocessors , 1991, CARN.
[34] Seth H. Pugsley,et al. SWEL: Hardware cache coherence protocols to map shared data onto shared caches , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[35] Babak Falsafi,et al. Cuckoo directory: A scalable directory for many-core systems , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[36] Meng Zhang,et al. Fractal Coherence: Scalably Verifiable Cache Coherence , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[37] Stefanos Kaxiras,et al. A new perspective for efficient virtual-cache coherence , 2013, ISCA.
[38] Sanjay J. Patel,et al. Rigel: an architecture and scalable programming interface for a 1000-core accelerator , 2009, ISCA '09.
[39] Milo M. K. Martin,et al. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset , 2005, CARN.
[40] Mark D. Hill,et al. Virtual hierarchies to support server consolidation , 2007, ISCA '07.
[41] Per Stenström,et al. The Scalable Tree Protocol-a cache coherence approach for large-scale multiprocessors , 1992, [1992] Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing.
[42] Stefanos Kaxiras,et al. The GLOW cache coherence protocol extensions for widely shared data , 1996, ICS '96.
[43] Milo M. K. Martin,et al. Token Coherence: decoupling performance and correctness , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings.
[44] Sarita V. Adve,et al. DeNovoND: efficient hardware support for disciplined non-determinism , 2013, ASPLOS '13.