Efficient Fine-Grain Synchronization on a Multi-Core Chip Architecture: A Fresh Look