Non-speculative load-load reordering in TSO
Stefanos Kaxiras | Alberto Ros | Trevor E. Carlson | Mehdi Alipour
[1] Stefanos Kaxiras,et al. Racer: TSO consistency via race detection , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[2] Alberto Ros,et al. To be silent or not: on the impact of evictions of clean data in cache-coherent multicores , 2017, The Journal of Supercomputing.
[3] Stijn Eyerman,et al. An Evaluation of High-Level Mechanistic Core Models , 2014, ACM Trans. Archit. Code Optim.
[4] Josep Torrellas,et al. BulkSC: bulk enforcement of sequential consistency , 2007, ISCA '07.
[5] Sarita V. Adve,et al. Using speculative retirement and larger instruction windows to narrow the performance gap between memory consistency models , 1997, SPAA '97.
[6] Mikko H. Lipasti,et al. Memory ordering: a value-based approach , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[7] David A. Wood,et al. Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[8] Michael C. Huang,et al. Cherry: checkpointed early resource recycling in out-of-order microprocessors , 2002, MICRO.
[9] Margaret Martonosi,et al. DeSC: Decoupled supply-compute communication management for heterogeneous architectures , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[10] Erik Hagersten,et al. Gigaplane: A High Performance Bus for Large SMPs , 2003 .
[11] N. Binkert,et al. Atomic Coherence: Leveraging nanophotonics to build race-free cache coherence protocols , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[12] Mikko H. Lipasti,et al. Atomic SC for simple in-order processors , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[13] Milo M. K. Martin,et al. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset , 2005, CARN.
[14] Corporate. SPARC architecture manual - version 8 , 1992 .
[15] David A. Wood,et al. A Primer on Memory Consistency and Cache Coherence , 2012, Synthesis Lectures on Computer Architecture.
[16] Pedro López,et al. The impact of out-of-order commit in coarse-grain, fine-grain and simultaneous multithreaded architectures , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[17] Kai Li,et al. The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[18] Marc Tremblay,et al. Rock: A High-Performance Sparc CMT Processor , 2009, IEEE Micro.
[19] Niraj K. Jha,et al. GARNET: A detailed on-chip network model inside a full-system simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[20] Alexander V. Veidenbaum,et al. Compiler-assisted, selective out-of-order commit , 2013, IEEE Computer Architecture Letters.
[21] Pedro López,et al. VB-MT: Design Issues and Performance of the Validation Buffer Microarchitecture for Multithreaded Processors , 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[22] Srinivas Devadas,et al. Tardis 2.0: Optimized time traveling coherence for relaxed consistency models , 2016, 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT).
[23] Thomas F. Wenisch,et al. InvisiFence: performance-transparent memory ordering in conventional multiprocessors , 2009, ISCA '09.
[24] Josep Llosa,et al. Out-of-order commit processors , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[25] Hui Zeng,et al. A group-commit mechanism for ROB-based processors implementing the X86 ISA , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[26] Stefanos Kaxiras,et al. Splash-3: A properly synchronized benchmark suite for contemporary research , 2016, 2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[27] Stefanos Kaxiras,et al. Exploring the Performance Limits of Out-of-order Commit , 2017, Conf. Computing Frontiers.
[28] Josep Torrellas,et al. SCsafe: Logging sequential consistency violations continuously and precisely , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[29] David Wentzlaff,et al. OpenPiton: An Open Source Manycore Research Framework , 2016, ASPLOS.
[30] Mateo Valero,et al. Toward kilo-instruction processors , 2004, TACO.
[31] Amir Roth,et al. Store vulnerability window (SVW): re-execution filtering for enhanced load optimization , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[32] Francesco Zappa Nardelli,et al. x86-TSO: A Rigorous and Usable Programmer's Model for x86 Multiprocessors , 2010 .
[33] S.P. Marti,et al. A Complexity-Effective Out-of-Order Retirement Microarchitecture , 2009, IEEE Transactions on Computers.
[34] T. N. Vijaykumar,et al. Reducing design complexity of the load/store queue , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.
[35] Mikko H. Lipasti,et al. Deconstructing commit , 2004, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2004.
[36] Rajiv Gupta,et al. Efficient sequential consistency via conflict ordering , 2012, ASPLOS XVII.
[37] Anoop Gupta,et al. Two Techniques to Enhance the Performance of Memory Consistency Models , 1991, ICPP.
[38] Amir Roth,et al. BOLT: Energy-efficient Out-of-Order Latency-Tolerant execution , 2010, The Sixteenth International Symposium on High-Performance Computer Architecture (HPCA-16).
[39] Vijay Nagarajan,et al. TSO-CC: Consistency directed cache coherence for TSO , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).