TC-Release++: An Efficient Timestamp-Based Coherence Protocol for Many-Core Architectures
[1] Josep Torrellas,et al. Bulk Disambiguation of Speculative Threads in Multiprocessors , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[2] Stefanos Kaxiras,et al. Callback: Efficient synchronization without invalidation with a directory just for spin-waiting , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[3] Christoforos E. Kozyrakis,et al. SCD: A scalable coherence directory with flexible sharer set encoding , 2012, IEEE International Symposium on High-Performance Computer Architecture.
[4] Paul Gastin,et al. Avoiding State Explosion for Distributed Systems with Timestamps , 2001, FME.
[5] David A. Wood,et al. QuickRelease: A throughput-oriented approach to release consistency on GPUs , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[6] Srinivas Devadas,et al. Tardis: Time Traveling Coherence Algorithm for Distributed Shared Memory , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[7] S. K. Nandy,et al. An Incessantly Coherent Cache Scheme for Shared Memory Multithreaded , 1994 .
[8] Alan L. Cox,et al. Lazy release consistency for software distributed shared memory , 1992, ISCA '92.
[9] Sarita V. Adve,et al. DeNovoSync: Efficient Support for Arbitrary Synchronization without Writer-Initiated Invalidations , 2015, ASPLOS.
[10] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[11] David L. Dill,et al. The Murphi Verification System , 1996, CAV.
[12] Anoop Gupta,et al. Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[13] Kai Li,et al. The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[14] Mike O'Connor,et al. Cache coherence for GPU architectures , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[15] Tatsuhiro Tsuchiya,et al. Model Checking of Consensus Algorithms , 2007, 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007).
[16] Bratin Saha,et al. McRT-STM: a high performance software transactional memory system for a multi-core runtime , 2006, PPoPP '06.
[17] Vijay Nagarajan,et al. TSO-CC: Consistency directed cache coherence for TSO , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[18] Thomas J. Ashby,et al. Software-Based Cache Coherence with Hardware-Assisted Selective Self-Invalidations Using Bloom Filters , 2011, IEEE Transactions on Computers.
[19] David A. Wood,et al. Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[20] Stefanos Kaxiras,et al. Complexity-effective multicore coherence , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[21] Kenneth L. McMillan,et al. Parameterized Verification of the FLASH Cache Coherence Protocol by Compositional Model Checking , 2001, CHARME.
[22] Babak Falsafi,et al. Cuckoo directory: A scalable directory for many-core systems , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[23] Vijay Nagarajan,et al. RC3: Consistency Directed Cache Coherence for x86-64 with RC Extensions , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[24] Sang Lyul Min,et al. Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps , 1992, IEEE Trans. Parallel Distributed Syst..
[25] Rami G. Melhem,et al. A timestamp-based selective invalidation scheme for multiprocessor cache coherence , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.
[26] Srinivas Devadas,et al. Memory coherence in the age of multicores , 2011, 2011 IEEE 29th International Conference on Computer Design (ICCD).
[27] M. Martonosi,et al. Timekeeping in the memory system: predicting and optimizing memory behavior , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[28] Mark D. Hill,et al. Weak ordering—a new definition , 1998, ISCA '98.
[29] Somayeh Sardashti,et al. The gem5 simulator , 2011, CARN.
[30] John Goodacre,et al. Parallelism and the ARM instruction set architecture , 2005, Computer.
[31] Helmut Veith,et al. Progress on the State Explosion Problem in Model Checking , 2001, Informatics.
[32] Wojciech Penczek,et al. Verifying Security Protocols with Timestamps via Translation to Timed Automata , 2005 .
[33] Jaehyuk Huh,et al. Coherence decoupling: making use of incoherence , 2004, ASPLOS XI.
[34] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[35] Srinivas Devadas,et al. Tardis 2.0: Optimized time traveling coherence for relaxed consistency models , 2016, 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT).
[36] Tatsuhiro Tsuchiya,et al. Model Checking of Consensus Algorithms , 2007 .
[37] Sarita V. Adve,et al. DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism , 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[38] David A. Wood,et al. Lazy release consistency for GPUs , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[39] Balaram Sinharoy,et al. The implementation of POWER7™: A highly parallel and scalable multi-core high-end server processor , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).
[40] David A. Wood,et al. A Primer on Memory Consistency and Cache Coherence , 2012, Synthesis Lectures on Computer Architecture.
[41] Sarita V. Adve,et al. DeNovoND: efficient hardware support for disciplined non-determinism , 2013, ASPLOS '13.
[42] Milo M. K. Martin,et al. Why on-chip cache coherence is here to stay , 2012, Commun. ACM.