Design of an efficient communication infrastructure for highly contended locks in many-core CMPs
[1] Sanjeev Kumar,et al. Evaluating synchronization on shared address space multiprocessors: methodology and performance , 1999, SIGMETRICS '99.
[2] Pat Conway,et al. Blade computing with the AMD Opteron™ processor ("magny-cours") , 2009, 2009 IEEE Hot Chips 21 Symposium (HCS).
[3] Nathan R. Tallent,et al. Analyzing lock contention in multithreaded applications , 2010, PPoPP '10.
[4] M. Erez,et al. Express Virtual Channels with Capacitively Driven Global Links , 2009, IEEE Micro.
[5] Christoforos E. Kozyrakis,et al. Comparing memory systems for chip multiprocessors , 2007, ISCA '07.
[6] Frank Mueller,et al. Token-Based Read/Write-Locks for Distributed Mutual Exclusion , 2000, Euro-Par.
[7] Guang R. Gao,et al. Synchronization state buffer: supporting efficient fine-grain synchronization on many-core architectures , 2007, ISCA '07.
[8] K.L. Shepard,et al. Distributed Loss-Compensation Techniques for Energy-Efficient Low-Latency On-Chip Communication , 2007, IEEE Journal of Solid-State Circuits.
[9] José L. Abellán,et al. A G-Line-Based Network for Fast and Efficient Barrier Synchronization in Many-Core CMPs , 2010, 2010 39th International Conference on Parallel Processing.
[10] Gianluca Palermo,et al. An efficient synchronization technique for multiprocessor systems on-chip , 2006, MEDEA '05.
[11] Christopher J. Hughes,et al. RSIM: Simulating Shared-Memory Multiprocessors with ILP Processors , 2002, Computer.
[12] S. Wong,et al. Near speed-of-light signaling over on-chip electrical interconnects , 2003.
[13] K. Okada,et al. A Bidirectional- and Multi-Drop-Transmission-Line Interconnect for Multipoint-to-Multipoint On-Chip Communications , 2008, IEEE Journal of Solid-State Circuits.
[14] James R. Goodman,et al. Transactional lock-free execution of lock-based programs , 2002, ASPLOS X.
[15] Beng-Hong Lim,et al. Reactive synchronization algorithms for multiprocessors , 1994, ASPLOS VI.
[16] Anant Agarwal,et al. Smartlocks: lock acquisition scheduling for self-aware synchronization , 2010, ICAC '10.
[17] José L. Abellán,et al. GLocks: Efficient Support for Highly-Contended Locks in Many-Core CMPs , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[18] John B. Carter,et al. MP-LOCKs: replacing H/W synchronization primitives with message passing , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[19] William N. Scherer,et al. Scalable queue-based spin locks with timeout , 2001, PPoPP '01.
[20] Manuel E. Acacio,et al. Sim-PowerCMP: A Detailed Simulator for Energy Consumption Analysis in Future Embedded CMP Architectures , 2007, 21st International Conference on Advanced Information Networking and Applications Workshops (AINAW'07).
[21] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[22] Milos Prvulovic,et al. TLSync: Support for multiple fast barriers using on-chip transmission lines , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[23] Richard McDougall,et al. Solaris internals: core kernel components , 2001.
[24] John Sartori,et al. Low-Overhead, High-Speed Multi-core Barrier Synchronization , 2010, HiPEAC.
[25] Dawei Huang,et al. A 40 nm 16-Core 128-Thread SPARC SoC Processor , 2011, IEEE Journal of Solid-State Circuits.
[26] Thomas E. Anderson,et al. The Performance Implications of Spin-Waiting Alternatives for Shared-Memory Multiprocessors , 1989, ICPP.
[27] James R. Goodman,et al. Efficient Synchronization: Let Them Eat QOLB , 1997, International Symposium on Computer Architecture.
[28] Justin Schauer,et al. High Speed and Low Energy Capacitively Driven On-Chip Wires , 2008, IEEE Journal of Solid-State Circuits.
[29] Michael L. Scott,et al. Algorithms for scalable synchronization on shared-memory multiprocessors , 1991, TOCS.
[30] Hugh Garraway. Parallel Computer Architecture: A Hardware/Software Approach , 1999, IEEE Concurrency.