MiSAR: Minimalistic synchronization accelerator with resource overflow management
[1] D. Lenoski,et al. The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[2] Donald Yeung,et al. The MIT Alewife machine: architecture and performance , 1995, ISCA '95.
[3] Constantine D. Polychronopoulos,et al. Fast barrier synchronization hardware , 1990, Proceedings SUPERCOMPUTING '90.
[4] William Gropp,et al. Design and implementation of message-passing services for the Blue Gene/L supercomputer , 2005, IBM J. Res. Dev.
[5] Norman P. Jouppi,et al. Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[6] José L. Abellán,et al. A G-Line-Based Network for Fast and Efficient Barrier Synchronization in Many-Core CMPs , 2010, 2010 39th International Conference on Parallel Processing.
[7] Ralph Grishman,et al. The NYU Ultracomputer—designing a MIMD, shared-memory parallel machine (Extended Abstract) , 1982, ISCA '82.
[8] Milos Prvulovic,et al. TLSync: Support for multiple fast barriers using on-chip transmission lines , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[9] William J. Dally,et al. Principles and Practices of Interconnection Networks , 2004.
[10] Ralph Grishman,et al. The NYU Ultracomputer—designing a MIMD, shared-memory parallel machine (Extended Abstract) , 1982, ISCA '82.
[11] Guang R. Gao,et al. Synchronization state buffer: supporting efficient fine-grain synchronization on many-core architectures , 2007, ISCA '07.
[12] William J. Dally,et al. Exploiting fine-grain thread level parallelism on the MIT multi-ALU processor , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).
[13] John T. Robinson. A fast general-purpose hardware synchronization mechanism , 1985, SIGMOD '85.
[14] Men-Chow Chiang,et al. Memory system design for bus-based multiprocessors , 1992.
[15] Michael L. Scott,et al. Algorithms for scalable synchronization on shared-memory multiprocessors , 1991, TOCS.
[16] Christian Bienia,et al. Benchmarking modern multiprocessors , 2011.
[17] Mateo Valero,et al. Architectural Support for Fair Reader-Writer Locking , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[18] Zhen Fang,et al. Highly efficient synchronization based on active memory operations , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[19] James R. Goodman,et al. Efficient Synchronization: Let Them Eat QOLB , 1997, International Symposium on Computer Architecture.
[20] Fabrizio Petrini,et al. Scalable collective communication on the ASCI Q machine , 2003, 11th Symposium on High Performance Interconnects, 2003. Proceedings..
[21] José L. Abellán,et al. GLocks: Efficient Support for Highly-Contended Locks in Many-Core CMPs , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[22] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[23] Allan Porterfield,et al. The Tera computer system , 1990, ICS '90.
[24] Steven L. Scott,et al. Synchronization and communication in the T3E multiprocessor , 1996, ASPLOS VII.
[25] W. Daniel Hillis,et al. The network architecture of the Connection Machine CM-5 (extended abstract) , 1992, SPAA '92.
[26] Jaehwan Lee,et al. A system-on-a-chip lock cache with task preemption support , 2001, CASES '01.